COIN-OR::LEMON - Graph Library

Opened 9 years ago

Closed 9 years ago

#302 closed defect (fixed)

Improvements for graph maps

Reported by: kpeter Owned by: kpeter
Priority: major Milestone: LEMON 1.2 release
Component: core Version: release branch 1.1
Keywords: Cc:
Revision id:

Description


Attachments (9)

302-maps-709cfcba8777.patch (7.6 KB) - added by kpeter 9 years ago.
302-crossrefmap-fix-e8254acfab7d.patch (6.9 KB) - added by kpeter 9 years ago.
302-bugfix-both-branches-7b1a6e963018.patch (6.3 KB) - added by kpeter 9 years ago.
The bug fix on the top of [257e91516e09] to be merged into both branches
302-improvements-main-6e8c27ee9079.patch (7.9 KB) - added by kpeter 9 years ago.
Improvements on the top of [7b1a6e963018] to be merged only into the main branch
302+73-maps-all-in-one.bundle (15.5 KB) - added by kpeter 9 years ago.
A bundle file of 9 changesets related to tickets #302 and #73
302-628519c21586.patch (23.0 KB) - added by kpeter 9 years ago.
302-acdd0bd75a55.patch (3.7 KB) - added by kpeter 9 years ago.
302-d8073df341f6.patch (8.0 KB) - added by kpeter 9 years ago.
302-11404088d1a5.patch (1.1 KB) - added by kpeter 9 years ago.

Download all attachments as: .zip

Change History (40)

Changed 9 years ago by kpeter

comment:1 Changed 9 years ago by kpeter

  • Milestone set to LEMON 1.2 release
  • Owner changed from alpar to kpeter
  • Status changed from new to assigned

[709cfcba8777] adds tests for IdMap, RangeIdMap, CrossRefMap and contains some doc improvements for them.

comment:2 follow-up: Changed 9 years ago by kpeter

There is a major problem with CrossRefMap: it cannot handle the multiple values correctly. Consider the following code:

    Graph gr;
    CrossRefMap<Graph, Node, char> map(gr);
    
    map.set(n1, 'A');
    map.set(n2, 'B');

    map.set(n1, 'B');
    map.set(n2, 'A');

    std::cout << map[n1] << ',' << map[n2] << '\n';
    std::cout << gr.id(map('A')) << ',' << gr.id(map('B')) << '\n';

It prints out:

    B,A
    1,-1

instead of

    B,A
    1,0

The reason for such problems is that the class uses std::map for storing the inverse map. std::multimap would be much better, but it would make the value iteration more difficult (multiple values should be iterated only once).

What do you think? Should we switch to std::multimap?

comment:3 Changed 9 years ago by kpeter

Now standrad graph maps are reference maps, so one could expect CrossRefMap to be reference map, too. Should we make it so?

comment:4 in reply to: ↑ 2 ; follow-up: Changed 9 years ago by alpar

Replying to kpeter:

There is a major problem with CrossRefMap: it cannot handle the multiple values correctly. Consider the following code:

    Graph gr;
    CrossRefMap<Graph, Node, char> map(gr);
    
    map.set(n1, 'A');
    map.set(n2, 'B');

    map.set(n1, 'B');
    map.set(n2, 'A');

    std::cout << map[n1] << ',' << map[n2] << '\n';
    std::cout << gr.id(map('A')) << ',' << gr.id(map('B')) << '\n';

It prints out:

    B,A
    1,-1

instead of

    B,A
    1,0

The reason for such problems is that the class uses std::map for storing the inverse map. std::multimap would be much better, but it would make the value iteration more difficult (multiple values should be iterated only once).

Alternatively, we can simply state that the iterators work only when the map is indeed a one-to-one mapping. Otherwise it is considered to be in transitional state.

What do you think? Should we switch to std::multimap?

What is the alternative solution to this problem?

Changed 9 years ago by kpeter

comment:5 follow-up: Changed 9 years ago by kpeter

[e8254acfab7d] contains a solution for the problem. std::multimap is used and the value iterators traverse the values with multiplicity. A count() function is also added, which gives back the number of items with the given associated value.
What do you think about it?

Should CrossRefMap be reference map?

comment:6 in reply to: ↑ 5 ; follow-ups: Changed 9 years ago by kpeter

Replying to kpeter:

Should CrossRefMap be reference map?

I have a look at the implementation of the iterable maps that are planned to be ported, see #73. According to them, I think this class couldn't be implemented as a reference map correctly. Am I right?

Another question is the relation between IterableValueMap and CrossRefMap. The fixed version of CrossRefMap (with std::multimap, see [e8254acfab7d]) can be easily extended to provide exactly the same functionality as IterableValueMap.

So what should we do? Should we have one or two tools for such functionality? Which implementation seems to be better/faster? Storing an std::map for the values and linked lists for each value (IterableValueMap) or store an std::multimap (which supports iterating on items that have the same value) and maybe an std::set if we would like to omit the multiplicity for value iterators.

Note that the solution suggested in [e8254acfab7d] is not compatible with IterableValueMap, since the value iterator of the later one traverses each value only once.

comment:7 in reply to: ↑ 6 ; follow-up: Changed 9 years ago by deba

Replying to kpeter:

Replying to kpeter:

Should CrossRefMap be reference map?

I have a look at the implementation of the iterable maps that are planned to be ported, see #73. According to them, I think this class couldn't be implemented as a reference map correctly. Am I right?

It could be implemented, but it would not be efficient. I know two at least two different solutions.

Another question is the relation between IterableValueMap and CrossRefMap. The fixed version of CrossRefMap (with std::multimap, see [e8254acfab7d]) can be easily extended to provide exactly the same functionality as IterableValueMap.

So what should we do? Should we have one or two tools for such functionality? Which implementation seems to be better/faster? Storing an std::map for the values and linked lists for each value (IterableValueMap) or store an std::multimap (which supports iterating on items that have the same value) and maybe an std::set if we would like to omit the multiplicity for value iterators.

Note that the solution suggested in [e8254acfab7d] is not compatible with IterableValueMap, since the value iterator of the later one traverses each value only once.

I don't know which is the better solution. However, it should be checked the similarity between BucketHeaps? and IterableIntMaps?.

comment:8 in reply to: ↑ 4 ; follow-up: Changed 9 years ago by alpar

Peter, could you comment this:

Alternatively, we can simply state that the iterators work only when the map is indeed a one-to-one mapping. Otherwise it is considered to be in transitional state.

How does it relates to the [e8254acfab7d]?

comment:9 in reply to: ↑ 6 ; follow-up: Changed 9 years ago by alpar

Replying to kpeter:

I have a look at the implementation of the iterable maps that are planned to be ported, see #73. According to them, I think this class couldn't be implemented as a reference map correctly. Am I right?

The usual way of doing it is that operator[] returns a proxy class and the reading/writing functionalities are implemented in the conversion and assignment operators. What is the problem with this approach here? I can't see efficiency problems either.

(Though I'm still concerned about the importance of this feature.)

comment:10 in reply to: ↑ 8 Changed 9 years ago by kpeter

Replying to alpar:

Alternatively, we can simply state that the iterators work only when the map is indeed a one-to-one mapping. Otherwise it is considered to be in transitional state.

How does it relates to the [e8254acfab7d]?

If each value is used only once, then the value iterator of CrossRefMap in [e8254acfab7d] does exactly what you want, so you can state "that the iterators work only when the map is indeed a one-to-one mapping", but I think it is beter to state "the iterators work with multiplicity, so each value is traversed for each item it is assigned to", since it is more precise. Your interpretation is just a special case of it.

Alternatively we can extend the implementation (using std::multimap) with an std::set for the used values in order to support "natural" value iteration (each value is traversed once).

comment:11 in reply to: ↑ 9 ; follow-up: Changed 9 years ago by kpeter

Replying to alpar:

The usual way of doing it is that operator[] returns a proxy class and the reading/writing functionalities are implemented in the conversion and assignment operators. What is the problem with this approach here? I can't see efficiency problems either.

(Though I'm still concerned about the importance of this feature.)

In IterableIntMap and IterableBoolMap these proxy classes have value-specific operations, such as operator+=, operator*=, operator&= etc., which could not be implmented in a generic way. IterableValueMap is not reference map. I thought it is because this solution could not be implemented correctly in a generic way, but maybe I was wrong.

In fact, it is not so important question. It is much more critical to make clear the relation between the iterable maps and CrossRefMap.

What about just dropping IterableValueMap and keeping only CrossRefMap extended with value iteration (with one of the mentioned implementations) and rename the other iterable maps to CrossRefIntMap, CrossRefBoolMap or IntCrossRefMap, BoolCrossRefMap. Actually, the bool map did not have to be separated, it could be a template specialization for CrossRefMap (we can omit TrueIt and FalseIt and keep only ItmeIt), but the int version does not provide the same functionality and characteristics as the int version of the general map, so it should be a separate class. Both int maps seem to be important, since one of them is much better for "dense" value sets, while the other is better for "sparse" value sets.

comment:12 in reply to: ↑ 7 Changed 9 years ago by kpeter

Replying to deba:

I don't know which is the better solution. However, it should be checked the similarity between BucketHeaps? and IterableIntMaps?.

They have indeed similar internal structure and implementation, but the purposes they are intended for and their interfaces are rather different, so I don't find it problematic.

comment:13 Changed 9 years ago by alpar

I went through [709cfcba8777] and [e8254acfab7d] and I basically like them.

comment:14 in reply to: ↑ 11 ; follow-up: Changed 9 years ago by alpar

Replying to kpeter:

Replying to alpar:

The usual way of doing it is that operator[] returns a proxy class and the reading/writing functionalities are implemented in the conversion and assignment operators. What is the problem with this approach here? I can't see efficiency problems either.

(Though I'm still concerned about the importance of this feature.)

In IterableIntMap and IterableBoolMap these proxy classes have value-specific operations, such as operator+=, operator*=, operator&= etc., which could not be implmented in a generic way. IterableValueMap is not reference map. I thought it is because this solution could not be implemented correctly in a generic way, but maybe I was wrong.

In fact, it is not so important question. It is much more critical to make clear the relation between the iterable maps and CrossRefMap.

What about just dropping IterableValueMap and keeping only CrossRefMap extended with value iteration (with one of the mentioned implementations)

Yes, one class should be enough here, except for some possible the efficiency issues. See the next comment.

and rename the other iterable maps to CrossRefIntMap, CrossRefBoolMap or IntCrossRefMap, BoolCrossRefMap.

I slightly prefer the Iterable* names here because they reflects a bit better to the main usage. (Though notice the diminutives in this sentence.) The CrossRefMap is different for the "iteration" is a trivial operation.

Actually, the bool map did not have to be separated, it could be a template specialization for CrossRefMap (we can omit TrueIt and FalseIt and keep only ItmeIt)

I prefer to keep a TrueIt and FalseIt and I'm not a fan of the template specialization business anyway.

comment:15 follow-up: Changed 9 years ago by alpar

My feeling is that std::multimap approach is more efficient when most values belong to only a very few keys (i.e. in cases the CrossRefMap was designed to), while the iteration over the keys are more efficient in case of the linked list implementation (i.e. in the use-cases of Iterable*).

Could someone check this statement? (i.e. analyze the memory footprint of the two solutions plus perform some very basic runtime tests comparing the iteration in an std::multimap vs. in a linked list)

comment:16 in reply to: ↑ 14 ; follow-up: Changed 9 years ago by kpeter

Replying to alpar:

I slightly prefer the Iterable* names here because they reflects a bit better to the main usage. (Though notice the diminutives in this sentence.) The CrossRefMap is different for the "iteration" is a trivial operation.

I also prefer the Iterable* names, but CrossRefMap is a part of a stable release (with a buggy implementation, actually), so it cannnot be removed.

I prefer to keep a TrueIt and FalseIt and I'm not a fan of the template specialization business anyway.

Okay, let's keep it as a separate class.

comment:17 in reply to: ↑ 16 ; follow-up: Changed 9 years ago by alpar

  • Version changed from hg main to release branch 1.1

Replying to kpeter:

I also prefer the Iterable* names, but CrossRefMap is a part of a stable release (with a buggy implementation, actually), so it cannnot be removed.

Good point. Could you separate this fix from the other changes/improvements and create a chgset applicable to both branches? The chgset might also include a simple test for this.

Changed 9 years ago by kpeter

The bug fix on the top of [257e91516e09] to be merged into both branches

Changed 9 years ago by kpeter

Improvements on the top of [7b1a6e963018] to be merged only into the main branch

comment:18 in reply to: ↑ 17 Changed 9 years ago by kpeter

Replying to alpar:

Good point. Could you separate this fix from the other changes/improvements and create a chgset applicable to both branches? The chgset might also include a simple test for this.

See [7b1a6e963018] for this patch. [6e8c27ee9079] contains improvements for the main branch only.

comment:19 in reply to: ↑ 15 Changed 9 years ago by kpeter

Replying to alpar:

My feeling is that std::multimap approach is more efficient when most values belong to only a very few keys (i.e. in cases the CrossRefMap was designed to), while the iteration over the keys are more efficient in case of the linked list implementation (i.e. in the use-cases of Iterable*).

Could someone check this statement? (i.e. analyze the memory footprint of the two solutions plus perform some very basic runtime tests comparing the iteration in an std::multimap vs. in a linked list)

I made simple tests comparing these two solutions. A graph is constructed with n nodes and k different values are set to them. Node i gets the value i%k, so each value is used for n/k items.

The test results show the running times for the n set() operations and for iterating over all elements with ItemIt.

Maps test - n = 1000000, k = 1000000, n/k = 1
CrossRefMap - set         1.15375
IterableItemMap - set     0.759874
CrossRefMap - ItemIt      0.391914
IterableItemMap - ItemIt  0.224154

Maps test - n = 1000000, k = 100000, n/k = 10
CrossRefMap - set         1.87104
IterableItemMap - set     0.217853
CrossRefMap - ItemIt      0.0536051
IterableItemMap - ItemIt  0.0243449

Maps test - n = 1000000, k = 10000, n/k = 100
CrossRefMap - set         7.09427
IterableItemMap - set     0.11754
CrossRefMap - ItemIt      0.0631342
IterableItemMap - ItemIt  0.0200191

Maps test - n = 1000000, k = 1000
CrossRefMap - set         94.8407
IterableItemMap - set     0.071444
CrossRefMap - ItemIt      0.071887
IterableItemMap - ItemIt  0.0276492

The conclusions are absolutely clear. The map + linked lists solution is faster in all cases, even if all values are different. And the more multiple values there are, the bigger this difference is, and it can be really huge. However note that the item iteration is "only" 2-3 times slower for the multimap solution, surprisingly the set() function becomes really slow when n/k is bigger.

Maybe std::multimap should be implemented like IterableValueMap... :)

comment:20 follow-up: Changed 9 years ago by kpeter

Two other questions about these maps:

  • In [6e8c27ee9079] I added a count() function for CrossRefMap, but maybe it is not so logical as I thought before. If CrossRefMap and IterableValueMap will also kept with different interfaces, then it would be more logical not to add such a function, since CrossRefMap is intended mainly for invertable maps. However a function called invertable() would worth to have, which would return true if all values are used only once (i.e. if count(x)==1 for all used values x).
  • There are ItemIt and ValueIterator for IterableValueMap, which is a bit strange. Maybe ValueIt would be better, but in CrossRefMap ValueIterator is used in release 1.1, thus it should be used for the iterable maps, too.

comment:21 in reply to: ↑ 20 Changed 9 years ago by alpar

Replying to kpeter:

Two other questions about these maps:

  • In [6e8c27ee9079] I added a count() function for CrossRefMap, but maybe it is not so logical as I thought before. If CrossRefMap and IterableValueMap will also kept with different interfaces, then it would be more logical not to add such a function, since CrossRefMap is intended mainly for invertable maps. However a function called invertable() would worth to have, which would return true if all values are used only once (i.e. if count(x)==1 for all used values x).

In the long run the interface can be the same. Is it not a problem is some functions has little meanings in one of these classes.

By "in the long run" I mean that it is not necessary to implement all the functions immediately in both classes but we should not have contradicting interfaces (except for the different behavior of ItemIt)

  • There are ItemIt and ValueIterator for IterableValueMap, which is a bit strange. Maybe ValueIt would be better, but in CrossRefMap ValueIterator is used in release 1.1, thus it should be used for the iterable maps, too.

ValueIterator is obviously a mistake. We should keep it for backward compatibility, but it should be deprecated in favor of ValueIt.

Changed 9 years ago by kpeter

A bundle file of 9 changesets related to tickets #302 and #73

comment:22 Changed 9 years ago by kpeter

The attached bundle file contains all the 9 changesets that I suggest to be merged into the main branch about graph maps (from tickets #302, #73).

The first four changesets are discussed in the tickets, and seemed to be acceptable:

The new five changesets are the followings:

  • [99124ea4f048] kpeter: Merge
    It merges the two branches about CrossRefMap (#302) and the iterable maps (#73).
  • [b52189c479fb] kpeter: Doc improvements for several graph maps (#302)
  • [628519c21586] kpeter: Rearrange the doc of graph maps (#302)
    In order to achieve a better order of the graph maps in the module doc.
  • [0aba23bf2a85] kpeter: Extend maps_test.cc (#302)
    Add tests for SourceMap, TargetMap, ForwardMap, BackwardMap, InDegMap, OutDegMap and extend the tests for LoggerBoolMap.
  • [a6705908fde4] kpeter: Rename ValueIterator to ValueIt in graph maps (#302)
    but keep ValueIterator as an alias in CrossRefMap (only for reverse compatibility).

Note that [7b1a6e963018] is a bug fix patch that should also be merged into release branch 1.1.

comment:23 follow-up: Changed 9 years ago by alpar

[628519c21586] - What this change actually do? Why is it important? From the diff, it looks like the whole lemon/maps.h were changed to a new file.

Changed 9 years ago by kpeter

comment:24 in reply to: ↑ 23 ; follow-up: Changed 9 years ago by kpeter

Replying to alpar:

[628519c21586] - What this change actually do? Why is it important? From the diff, it looks like the whole lemon/maps.h were changed to a new file.

This changeset does what it is said to do, i.e. rearranges the doc of graph maps in order to achieve a better order.

More precisely it modifies the following strange order

class  	IdMap< GR, K >
class  	CrossRefMap< GR, K, V >
class  	RangeIdMap< GR, K >
class  	IterableBoolMap< GR, K >
class  	IterableIntMap< GR, K >
class  	IterableValueMap< GR, K, V >
class  	SourceMap< GR >
class  	TargetMap< GR >
class  	ForwardMap< GR >
class  	BackwardMap< GR >
class  	InDegMap< GR >
class  	OutDegMap< GR >
class  	PotentialDifferenceMap< GR, POT >

to this one, which I find much more logical:

class  	IdMap< GR, K >
class  	RangeIdMap< GR, K >
class  	SourceMap< GR >
class  	TargetMap< GR >
class  	ForwardMap< GR >
class  	BackwardMap< GR >
class  	CrossRefMap< GR, K, V >
class  	IterableBoolMap< GR, K >
class  	IterableIntMap< GR, K >
class  	IterableValueMap< GR, K, V >
class  	InDegMap< GR >
class  	OutDegMap< GR >
class  	PotentialDifferenceMap< GR, POT >

Actually, it moves CrossRefMap and the iterable maps backwards and next to each other, since they are similar.

Unfortunatelly this change, despite it is only for rearranging the classes in the doc module, can only be achieved by modifying many lines in the source file. However nothing else was done, only rearranging (and many other classes in the file are untouched).

Why is it important?

It is a matter of taste. I don't claim it is important (just like other doc improvements), but I think it makes the doc better. I hardly do such changes, but in this case I found it useful.

comment:25 in reply to: ↑ 24 ; follow-up: Changed 9 years ago by alpar

Why is it important?

It is a matter of taste. I don't claim it is important (just like other doc improvements), but I think it makes the doc better. I hardly do such changes, but in this case I found it useful.

I do not support pushing this changeset to the main branch. It improves the doc a little bit, but makes the merge of anything crossing this changeset a nightmare.

comment:26 Changed 9 years ago by alpar

I got lost. Are [709cfcba8777] and [e8254acfab7d] deprecated?

comment:27 Changed 9 years ago by kpeter

Yes, they are.

comment:28 in reply to: ↑ 25 ; follow-up: Changed 9 years ago by kpeter

Replying to alpar:

I do not support pushing this changeset to the main branch.

Okay.

What about the pending changesets [0aba23bf2a85] and [a6705908fde4]? If they are accepted, I rebase them.

comment:29 in reply to: ↑ 28 ; follow-up: Changed 9 years ago by alpar

Replying to kpeter:

Replying to alpar:

I do not support pushing this changeset to the main branch.

Okay.

What about the pending changesets [0aba23bf2a85] and [a6705908fde4]? If they are accepted, I rebase them.

Do it, please.

Changed 9 years ago by kpeter

Changed 9 years ago by kpeter

Changed 9 years ago by kpeter

comment:30 in reply to: ↑ 29 Changed 9 years ago by kpeter

Replying to alpar:

What about the pending changesets [0aba23bf2a85] and [a6705908fde4]? If they are accepted, I rebase them.

Do it, please.

Here they are: [acdd0bd75a55], [d8073df341f6].

[11404088d1a5] is a new changeset that adds two creator functions (for those only two map types that did not had it before).

comment:31 Changed 9 years ago by alpar

  • Resolution set to fixed
  • Status changed from assigned to closed

Everything is in the main branch now.

Note: See TracTickets for help on using tickets.