Opened 15 years ago
Closed 15 years ago
#302 closed defect (fixed)
Improvements for graph maps
Reported by: | Peter Kovacs | Owned by: | Peter Kovacs |
---|---|---|---|
Priority: | major | Milestone: | LEMON 1.2 release |
Component: | core | Version: | release branch 1.1 |
Keywords: | Cc: | ||
Revision id: |
Description
Attachments (9)
Change History (40)
Changed 15 years ago by
Attachment: | 302-maps-709cfcba8777.patch added |
---|
comment:1 Changed 15 years ago by
Milestone: | → LEMON 1.2 release |
---|---|
Owner: | changed from Alpar Juttner to Peter Kovacs |
Status: | new → assigned |
comment:2 follow-up: 4 Changed 15 years ago by
There is a major problem with CrossRefMap
: it cannot handle the multiple values correctly. Consider the following code:
Graph gr; CrossRefMap<Graph, Node, char> map(gr); map.set(n1, 'A'); map.set(n2, 'B'); map.set(n1, 'B'); map.set(n2, 'A'); std::cout << map[n1] << ',' << map[n2] << '\n'; std::cout << gr.id(map('A')) << ',' << gr.id(map('B')) << '\n';
It prints out:
B,A 1,-1
instead of
B,A 1,0
The reason for such problems is that the class uses std::map
for storing the inverse map. std::multimap
would be much better, but it would make the value iteration more difficult (multiple values should be iterated only once).
What do you think? Should we switch to std::multimap
?
comment:3 Changed 15 years ago by
Now standrad graph maps are reference maps, so one could expect CrossRefMap
to be reference map, too. Should we make it so?
comment:4 follow-up: 8 Changed 15 years ago by
Replying to kpeter:
There is a major problem with
CrossRefMap
: it cannot handle the multiple values correctly. Consider the following code:Graph gr; CrossRefMap<Graph, Node, char> map(gr); map.set(n1, 'A'); map.set(n2, 'B'); map.set(n1, 'B'); map.set(n2, 'A'); std::cout << map[n1] << ',' << map[n2] << '\n'; std::cout << gr.id(map('A')) << ',' << gr.id(map('B')) << '\n';It prints out:
B,A 1,-1instead of
B,A 1,0The reason for such problems is that the class uses
std::map
for storing the inverse map.std::multimap
would be much better, but it would make the value iteration more difficult (multiple values should be iterated only once).
Alternatively, we can simply state that the iterators work only when the map is indeed a one-to-one mapping. Otherwise it is considered to be in transitional state.
What do you think? Should we switch to
std::multimap
?
What is the alternative solution to this problem?
Changed 15 years ago by
Attachment: | 302-crossrefmap-fix-e8254acfab7d.patch added |
---|
comment:5 follow-up: 6 Changed 15 years ago by
[e8254acfab7d] contains a solution for the problem. std::multimap
is used and the value iterators traverse the values with multiplicity. A count()
function is also added, which gives back the number of items with the given associated value.
What do you think about it?
Should CrossRefMap
be reference map?
comment:6 follow-ups: 7 9 Changed 15 years ago by
Replying to kpeter:
Should
CrossRefMap
be reference map?
I have a look at the implementation of the iterable maps that are planned to be ported, see #73. According to them, I think this class couldn't be implemented as a reference map correctly. Am I right?
Another question is the relation between IterableValueMap
and CrossRefMap
. The fixed version of CrossRefMap
(with std::multimap
, see [e8254acfab7d]) can be easily extended to provide exactly the same functionality as IterableValueMap
.
So what should we do? Should we have one or two tools for such functionality? Which implementation seems to be better/faster? Storing an std::map
for the values and linked lists for each value (IterableValueMap
) or store an std::multimap
(which supports iterating on items that have the same value) and maybe an std::set
if we would like to omit the multiplicity for value iterators.
Note that the solution suggested in [e8254acfab7d] is not compatible with IterableValueMap
, since the value iterator of the later one traverses each value only once.
comment:7 follow-up: 12 Changed 15 years ago by
Replying to kpeter:
Replying to kpeter:
Should
CrossRefMap
be reference map?I have a look at the implementation of the iterable maps that are planned to be ported, see #73. According to them, I think this class couldn't be implemented as a reference map correctly. Am I right?
It could be implemented, but it would not be efficient. I know two at least two different solutions.
Another question is the relation between
IterableValueMap
andCrossRefMap
. The fixed version ofCrossRefMap
(withstd::multimap
, see [e8254acfab7d]) can be easily extended to provide exactly the same functionality asIterableValueMap
.So what should we do? Should we have one or two tools for such functionality? Which implementation seems to be better/faster? Storing an
std::map
for the values and linked lists for each value (IterableValueMap
) or store anstd::multimap
(which supports iterating on items that have the same value) and maybe anstd::set
if we would like to omit the multiplicity for value iterators.Note that the solution suggested in [e8254acfab7d] is not compatible with
IterableValueMap
, since the value iterator of the later one traverses each value only once.
I don't know which is the better solution. However, it should be checked the similarity between BucketHeaps? and IterableIntMaps?.
comment:8 follow-up: 10 Changed 15 years ago by
Peter, could you comment this:
Alternatively, we can simply state that the iterators work only when the map is indeed a one-to-one mapping. Otherwise it is considered to be in transitional state.
How does it relates to the [e8254acfab7d]?
comment:9 follow-up: 11 Changed 15 years ago by
Replying to kpeter:
I have a look at the implementation of the iterable maps that are planned to be ported, see #73. According to them, I think this class couldn't be implemented as a reference map correctly. Am I right?
The usual way of doing it is that operator[] returns a proxy class and the reading/writing functionalities are implemented in the conversion and assignment operators. What is the problem with this approach here? I can't see efficiency problems either.
(Though I'm still concerned about the importance of this feature.)
comment:10 Changed 15 years ago by
Replying to alpar:
Alternatively, we can simply state that the iterators work only when the map is indeed a one-to-one mapping. Otherwise it is considered to be in transitional state.
How does it relates to the [e8254acfab7d]?
If each value is used only once, then the value iterator of CrossRefMap
in [e8254acfab7d] does exactly what you want, so you can state "that the iterators work only when the map is indeed a one-to-one mapping", but I think it is beter to state "the iterators work with multiplicity, so each value is traversed for each item it is assigned to", since it is more precise. Your interpretation is just a special case of it.
Alternatively we can extend the implementation (using std::multimap
) with an std::set
for the used values in order to support "natural" value iteration (each value is traversed once).
comment:11 follow-up: 14 Changed 15 years ago by
Replying to alpar:
The usual way of doing it is that operator[] returns a proxy class and the reading/writing functionalities are implemented in the conversion and assignment operators. What is the problem with this approach here? I can't see efficiency problems either.
(Though I'm still concerned about the importance of this feature.)
In IterableIntMap
and IterableBoolMap
these proxy classes have value-specific operations, such as operator+=
, operator*=
, operator&=
etc., which could not be implmented in a generic way. IterableValueMap
is not reference map. I thought it is because this solution could not be implemented correctly in a generic way, but maybe I was wrong.
In fact, it is not so important question. It is much more critical to make clear the relation between the iterable maps and CrossRefMap
.
What about just dropping IterableValueMap
and keeping only CrossRefMap
extended with value iteration (with one of the mentioned implementations) and rename the other iterable maps to CrossRefIntMap
, CrossRefBoolMap
or IntCrossRefMap
, BoolCrossRefMap
. Actually, the bool map did not have to be separated, it could be a template specialization for CrossRefMap
(we can omit TrueIt
and FalseIt
and keep only ItmeIt
), but the int version does not provide the same functionality and characteristics as the int
version of the general map, so it should be a separate class. Both int maps seem to be important, since one of them is much better for "dense" value sets, while the other is better for "sparse" value sets.
comment:12 Changed 15 years ago by
Replying to deba:
I don't know which is the better solution. However, it should be checked the similarity between BucketHeaps? and IterableIntMaps?.
They have indeed similar internal structure and implementation, but the purposes they are intended for and their interfaces are rather different, so I don't find it problematic.
comment:13 Changed 15 years ago by
I went through [709cfcba8777] and [e8254acfab7d] and I basically like them.
comment:14 follow-up: 16 Changed 15 years ago by
Replying to kpeter:
Replying to alpar:
The usual way of doing it is that operator[] returns a proxy class and the reading/writing functionalities are implemented in the conversion and assignment operators. What is the problem with this approach here? I can't see efficiency problems either.
(Though I'm still concerned about the importance of this feature.)
In
IterableIntMap
andIterableBoolMap
these proxy classes have value-specific operations, such asoperator+=
,operator*=
,operator&=
etc., which could not be implmented in a generic way.IterableValueMap
is not reference map. I thought it is because this solution could not be implemented correctly in a generic way, but maybe I was wrong.In fact, it is not so important question. It is much more critical to make clear the relation between the iterable maps and
CrossRefMap
.What about just dropping
IterableValueMap
and keeping onlyCrossRefMap
extended with value iteration (with one of the mentioned implementations)
Yes, one class should be enough here, except for some possible the efficiency issues. See the next comment.
and rename the other iterable maps to
CrossRefIntMap
,CrossRefBoolMap
orIntCrossRefMap
,BoolCrossRefMap
.
I slightly prefer the Iterable*
names here because they reflects a bit better to the main usage. (Though notice the diminutives in this sentence.) The CrossRefMap
is different for the "iteration" is a trivial operation.
Actually, the bool map did not have to be separated, it could be a template specialization for CrossRefMap
(we can omit TrueIt
and FalseIt
and keep only ItmeIt
)
I prefer to keep a TrueIt
and FalseIt
and I'm not a fan of the template specialization business anyway.
comment:15 follow-up: 19 Changed 15 years ago by
My feeling is that std::multimap
approach is more efficient when most values belong to only a very few keys (i.e. in cases the CrossRefMap
was designed to), while the iteration over the keys are more efficient in case of the linked list implementation (i.e. in the use-cases of Iterable*
).
Could someone check this statement? (i.e. analyze the memory footprint of the two solutions plus perform some very basic runtime tests comparing the iteration in an std::multimap
vs. in a linked list)
comment:16 follow-up: 17 Changed 15 years ago by
Replying to alpar:
I slightly prefer the
Iterable*
names here because they reflects a bit better to the main usage. (Though notice the diminutives in this sentence.) TheCrossRefMap
is different for the "iteration" is a trivial operation.
I also prefer the Iterable*
names, but CrossRefMap
is a part of a stable release (with a buggy implementation, actually), so it cannnot be removed.
I prefer to keep a
TrueIt
andFalseIt
and I'm not a fan of the template specialization business anyway.
Okay, let's keep it as a separate class.
comment:17 follow-up: 18 Changed 15 years ago by
Version: | hg main → release branch 1.1 |
---|
Replying to kpeter:
I also prefer the
Iterable*
names, butCrossRefMap
is a part of a stable release (with a buggy implementation, actually), so it cannnot be removed.
Good point. Could you separate this fix from the other changes/improvements and create a chgset applicable to both branches? The chgset might also include a simple test for this.
Changed 15 years ago by
Attachment: | 302-bugfix-both-branches-7b1a6e963018.patch added |
---|
The bug fix on the top of [257e91516e09] to be merged into both branches
Changed 15 years ago by
Attachment: | 302-improvements-main-6e8c27ee9079.patch added |
---|
Improvements on the top of [7b1a6e963018] to be merged only into the main branch
comment:18 Changed 15 years ago by
Replying to alpar:
Good point. Could you separate this fix from the other changes/improvements and create a chgset applicable to both branches? The chgset might also include a simple test for this.
See [7b1a6e963018] for this patch. [6e8c27ee9079] contains improvements for the main branch only.
comment:19 Changed 15 years ago by
Replying to alpar:
My feeling is that
std::multimap
approach is more efficient when most values belong to only a very few keys (i.e. in cases theCrossRefMap
was designed to), while the iteration over the keys are more efficient in case of the linked list implementation (i.e. in the use-cases ofIterable*
).Could someone check this statement? (i.e. analyze the memory footprint of the two solutions plus perform some very basic runtime tests comparing the iteration in an
std::multimap
vs. in a linked list)
I made simple tests comparing these two solutions. A graph is constructed with n
nodes and k
different values are set to them. Node i
gets the value i%k
, so each value is used for n/k
items.
The test results show the running times for the n
set()
operations and for iterating over all elements with ItemIt
.
Maps test - n = 1000000, k = 1000000, n/k = 1 CrossRefMap - set 1.15375 IterableItemMap - set 0.759874 CrossRefMap - ItemIt 0.391914 IterableItemMap - ItemIt 0.224154 Maps test - n = 1000000, k = 100000, n/k = 10 CrossRefMap - set 1.87104 IterableItemMap - set 0.217853 CrossRefMap - ItemIt 0.0536051 IterableItemMap - ItemIt 0.0243449 Maps test - n = 1000000, k = 10000, n/k = 100 CrossRefMap - set 7.09427 IterableItemMap - set 0.11754 CrossRefMap - ItemIt 0.0631342 IterableItemMap - ItemIt 0.0200191 Maps test - n = 1000000, k = 1000 CrossRefMap - set 94.8407 IterableItemMap - set 0.071444 CrossRefMap - ItemIt 0.071887 IterableItemMap - ItemIt 0.0276492
The conclusions are absolutely clear. The map + linked lists solution is faster in all cases, even if all values are different. And the more multiple values there are, the bigger this difference is, and it can be really huge. However note that the item iteration is "only" 2-3 times slower for the multimap solution, surprisingly the set()
function becomes really slow when n/k
is bigger.
Maybe std::multimap
should be implemented like IterableValueMap
... :)
comment:20 follow-up: 21 Changed 15 years ago by
Two other questions about these maps:
- In [6e8c27ee9079] I added a
count()
function forCrossRefMap
, but maybe it is not so logical as I thought before. IfCrossRefMap
andIterableValueMap
will also kept with different interfaces, then it would be more logical not to add such a function, sinceCrossRefMap
is intended mainly for invertable maps. However a function calledinvertable()
would worth to have, which would returntrue
if all values are used only once (i.e. ifcount(x)==1
for all used valuesx
). - There are
ItemIt
andValueIterator
forIterableValueMap
, which is a bit strange. MaybeValueIt
would be better, but inCrossRefMap
ValueIterator
is used in release 1.1, thus it should be used for the iterable maps, too.
comment:21 Changed 15 years ago by
Replying to kpeter:
Two other questions about these maps:
- In [6e8c27ee9079] I added a
count()
function forCrossRefMap
, but maybe it is not so logical as I thought before. IfCrossRefMap
andIterableValueMap
will also kept with different interfaces, then it would be more logical not to add such a function, sinceCrossRefMap
is intended mainly for invertable maps. However a function calledinvertable()
would worth to have, which would returntrue
if all values are used only once (i.e. ifcount(x)==1
for all used valuesx
).
In the long run the interface can be the same. Is it not a problem is some functions has little meanings in one of these classes.
By "in the long run" I mean that it is not necessary to implement all the functions immediately in both classes but we should not have contradicting interfaces (except for the different behavior of ItemIt
)
- There are
ItemIt
andValueIterator
forIterableValueMap
, which is a bit strange. MaybeValueIt
would be better, but inCrossRefMap
ValueIterator
is used in release 1.1, thus it should be used for the iterable maps, too.
ValueIterator
is obviously a mistake. We should keep it for backward compatibility, but it should be deprecated in favor of ValueIt
.
Changed 15 years ago by
Attachment: | 302+73-maps-all-in-one.bundle added |
---|
comment:22 Changed 15 years ago by
The attached bundle file contains all the 9 changesets that I suggest to be merged into the main branch about graph maps (from tickets #302, #73).
The first four changesets are discussed in the tickets, and seemed to be acceptable:
- [7bda7860e0a8] deba: Port iterable maps from SVN 3509 (#73)
- [71939d63ae77] kpeter: Improvements for iterable maps (#73)
- [7b1a6e963018] kpeter: Fix the implementation and doc of
CrossRefMap
(#302) - [6e8c27ee9079] kpeter: Improvements for graph maps (#302)
The new five changesets are the followings:
- [99124ea4f048] kpeter: Merge
It merges the two branches aboutCrossRefMap
(#302) and the iterable maps (#73). - [b52189c479fb] kpeter: Doc improvements for several graph maps (#302)
- [628519c21586] kpeter: Rearrange the doc of graph maps (#302)
In order to achieve a better order of the graph maps in the module doc. - [0aba23bf2a85] kpeter: Extend maps_test.cc (#302)
Add tests forSourceMap
,TargetMap
,ForwardMap
,BackwardMap
,InDegMap
,OutDegMap
and extend the tests forLoggerBoolMap
. - [a6705908fde4] kpeter: Rename
ValueIterator
toValueIt
in graph maps (#302)
but keepValueIterator
as an alias inCrossRefMap
(only for reverse compatibility).
Note that [7b1a6e963018] is a bug fix patch that should also be merged into release branch 1.1.
comment:23 follow-up: 24 Changed 15 years ago by
[628519c21586] - What this change actually do? Why is it important? From the diff, it looks like the whole lemon/maps.h
were changed to a new file.
Changed 15 years ago by
Attachment: | 302-628519c21586.patch added |
---|
comment:24 follow-up: 25 Changed 15 years ago by
Replying to alpar:
[628519c21586] - What this change actually do? Why is it important? From the diff, it looks like the whole
lemon/maps.h
were changed to a new file.
This changeset does what it is said to do, i.e. rearranges the doc of graph maps in order to achieve a better order.
More precisely it modifies the following strange order
class IdMap< GR, K > class CrossRefMap< GR, K, V > class RangeIdMap< GR, K > class IterableBoolMap< GR, K > class IterableIntMap< GR, K > class IterableValueMap< GR, K, V > class SourceMap< GR > class TargetMap< GR > class ForwardMap< GR > class BackwardMap< GR > class InDegMap< GR > class OutDegMap< GR > class PotentialDifferenceMap< GR, POT >
to this one, which I find much more logical:
class IdMap< GR, K > class RangeIdMap< GR, K > class SourceMap< GR > class TargetMap< GR > class ForwardMap< GR > class BackwardMap< GR > class CrossRefMap< GR, K, V > class IterableBoolMap< GR, K > class IterableIntMap< GR, K > class IterableValueMap< GR, K, V > class InDegMap< GR > class OutDegMap< GR > class PotentialDifferenceMap< GR, POT >
Actually, it moves CrossRefMap
and the iterable maps backwards and next to each other, since they are similar.
Unfortunatelly this change, despite it is only for rearranging the classes in the doc module, can only be achieved by modifying many lines in the source file. However nothing else was done, only rearranging (and many other classes in the file are untouched).
Why is it important?
It is a matter of taste. I don't claim it is important (just like other doc improvements), but I think it makes the doc better. I hardly do such changes, but in this case I found it useful.
comment:25 follow-up: 28 Changed 15 years ago by
Why is it important?
It is a matter of taste. I don't claim it is important (just like other doc improvements), but I think it makes the doc better. I hardly do such changes, but in this case I found it useful.
I do not support pushing this changeset to the main branch. It improves the doc a little bit, but makes the merge of anything crossing this changeset a nightmare.
comment:28 follow-up: 29 Changed 15 years ago by
Replying to alpar:
I do not support pushing this changeset to the main branch.
Okay.
What about the pending changesets [0aba23bf2a85] and [a6705908fde4]? If they are accepted, I rebase them.
comment:29 follow-up: 30 Changed 15 years ago by
Replying to kpeter:
Replying to alpar:
I do not support pushing this changeset to the main branch.
Okay.
What about the pending changesets [0aba23bf2a85] and [a6705908fde4]? If they are accepted, I rebase them.
Do it, please.
Changed 15 years ago by
Attachment: | 302-acdd0bd75a55.patch added |
---|
Changed 15 years ago by
Attachment: | 302-d8073df341f6.patch added |
---|
Changed 15 years ago by
Attachment: | 302-11404088d1a5.patch added |
---|
comment:30 Changed 15 years ago by
Replying to alpar:
What about the pending changesets [0aba23bf2a85] and [a6705908fde4]? If they are accepted, I rebase them.
Do it, please.
Here they are: [acdd0bd75a55], [d8073df341f6].
[11404088d1a5] is a new changeset that adds two creator functions (for those only two map types that did not had it before).
comment:31 Changed 15 years ago by
Resolution: | → fixed |
---|---|
Status: | assigned → closed |
Everything is in the main branch now.
[709cfcba8777] adds tests for
IdMap
,RangeIdMap
,CrossRefMap
and contains some doc improvements for them.