Opened 8 years ago
Last modified 2 years ago
#381 new enhancement
Simplified heaps without priority update
Reported by: | kpeter | Owned by: | alpar |
---|---|---|---|
Priority: | major | Milestone: | LEMON 1.4 release |
Component: | core | Version: | hg main |
Keywords: | Cc: | ||
Revision id: |
Description (last modified by kpeter)
The existing heap implementations in LEMON cointain an Item->int map to indicate the current location of each item. It is required to implement increase(), decrease(), erase(), etc. functions.
However, simplified heaps could be implemented with a limited functionality (push(), pop(), top(), prio(), size(), empty(), clear(), etc.) without this cross reference map. For such heaps, the basic push() and pop() operations could be implemented more efficiently, but the duplications of items could not be avoided.
A Dijkstra or Prim algorithm could be implemented with such heaps, but it would require slight modifications. A node should be pushed each time its distance label is updated (i.e. more than once in some cases), and the duplicate nodes should be skipped after each pop() operation.
It would be nice to introduce such implementations in LEMON. I think, they would lead to better performance in many practical cases, because not too many duplications would be expected on typical graphs. However, there are some problems with this proposal. First, such heaps would not conform to the current heap concept. Second, using them would require different implementation of the algorithms.
Change History (10)
comment:1 Changed 8 years ago by kpeter
comment:2 Changed 8 years ago by kpeter
- Description modified (diff)
comment:3 follow-up: ↓ 4 Changed 8 years ago by alpar
This thesis might be interesting to check out:
comment:4 in reply to: ↑ 3 Changed 8 years ago by kpeter
Replying to alpar:
This thesis might be interesting to check out:
Thak you for finding this thesis. That is a remarkable study of this idea. However surprising it is, they conclude that Dijkstra is always faster with such a simplified heap:
"We ran tests on graphs of varying types and edge densities, and in all of these tests, we found that Dijsktra-nodec consistently outperformed Dijkstra-dec. When the edge density of the graph became very high, the performance diff erence became small. The running time and memory usage overheads in supporting Decrease-Key could not be justi fied in any of the graphs we tested. Therefore, until there is data indicating otherwise, we recommend using Dijkstra-nodec with an efficient priority queue that does not support Decrease-Key over using Dijkstra-dec for solving SSSP problems on any graph type."
As far as I see, they did not check specific worst-case graphs (full graphs for which Decrease-Key is needed on almost all arcs). For such instances, the standard implementation could be faster, but these graphs are far from typical.
comment:5 Changed 8 years ago by alpar
Let us decide what to do. Shall we
- implement this as a separate class, bin head hard coded?
- Easier to implement but causes quite a bit code duplication.
- integrate it with the existing Dijkstra as a separate run() function?
- Easy, as well, but the question is how the avoid duplicated heap allocation. (A positive side effect of the "don't-update" approach is that it uses less memory)
- implement is using a SimplifiedHeap concept plus enable_if hacking, as usual.
- Can be a bit cumbersome to implement.
Finally, should this approach be the default?
comment:6 Changed 6 years ago by alpar
I would like to see this feature being implemented.
But I'm pretty unsure how to do it efficiently. Do you think we can incorporate this feature along with the current implementation?
comment:7 Changed 6 years ago by kpeter
I think we should avoid duplicated run() functions and duplicated classes of the algorithms (Dijkstra, Prim).
IMHO the best way for implementing this feature would be to:
- introduce a SimplifiedHeap concept so that the current Heap concept would become an extension of that (other name alternatives: SimpleHeap, BasicHeap);
- add "simplified" implementations for the heap structures, at least for BinHeap (I would also interested in QuadHeap, DHeap, etc.);
- apply some template technique to select the appropriate implementation for processNextNode() in Dijkstra's algorithm.
Maybe we could introduce this feature within the current Heap concept in such a way that these simplified implementations:
- do not use the IN_HEAP state (PRE_HEAP would be used for these cases as well);
- throw an exception when an unsupported method is called.
It would be much easier to do so, but it would not be so clear, I think.
comment:8 Changed 5 years ago by alpar
- Milestone changed from LEMON 1.3 release to LEMON 1.4 release
comment:9 Changed 5 years ago by kpeter
- Description modified (diff)
comment:10 Changed 5 years ago by alpar
Another question here. I only consider the binary heap below.
Currently, the heap datastructure consists of (Node, dist) pairs and whenever a node is finalized, its distance is written into the write-only dist map. Another option would be that that the dist map is read/write and holds the current dist value of the nodes. Then, the dist values in the heap could be omitted. It has however some important consequences:
- It reduces the memory usage, for the heap having to store less data.
- The running time gain or loss is unclear. The smaller size may improve the caching performance, while the additional cross referencing to a different memory segment may worsen it.
- If we want to make Dijkstra able to solve shortest path problem with negative edges (but not negative cycle), then we must allow re-evaluating nodes. Therefore dist map must be read-write anyways.
- Lastly and most importantly, assume that we apply the idea of the ticket (reinserting nodes to the heap instead of updating its key) together with the idea of only storing the nodes but not the keys in heap. Then the keys of the items will change (decrease) during the execution without moving them in the heap. Thus the bin-heap structure becomes corrupt. I have a feeling that Dijkstra algorithm will still run correctly, i.e. the valid nodes will still be taken out of the heap in the right order. But this is just a feeling right now, one should formally check it.
We could introduce a SimplifiedHeap concept. The main question is whether we could generalize the Dijkstra and Prim implementations to accept both types of heaps or not. I would really dislike duplicate algorithm classes.