Opened 11 years ago
Closed 9 years ago
#179 closed task (done)
Port the min mean cycle algorithms
Reported by: | Alpar Juttner | Owned by: | Peter Kovacs |
---|---|---|---|
Priority: | critical | Milestone: | LEMON 1.2 release |
Component: | core | Version: | hg main |
Keywords: | Cc: | ||
Revision id: |
Description (last modified by )
Attachments (18)
Change History (63)
comment:1 Changed 11 years ago by
Description: | modified (diff) |
---|
comment:2 Changed 11 years ago by
Type: | defect → enhancement |
---|
comment:3 Changed 11 years ago by
Description: | modified (diff) |
---|
comment:4 Changed 11 years ago by
Milestone: | LEMON 1.1 release |
---|
comment:5 Changed 11 years ago by
Status: | new → assigned |
---|---|
Type: | enhancement → task |
comment:6 Changed 10 years ago by
Milestone: | → LEMON 1.2 release |
---|
comment:7 Changed 10 years ago by
The current implementation of MinMeanCycle
is due to Howard. I would like to port it, but also add two other implementations: the basic algorithm of Karp and the algorithm of Hartmann-Orlin, which is a more efficient variant of the Karp algorithm (with an early termination heuristic).
How should these classes be called? Using Karp
, HartmannOrlin
and Howard
seems to be straightforward. However Howard's algorithm is a more general method, but I implemented it only for this problem. Won't it cause problems to use the class name Howard
for the special version?
I found these notes about Howard's algorithm: "It is a well known algorithm to the stochastic control community but a relatively unknown algorithm to the graph theory community."
According to the experimental papers of A. Dasdan, S. S. Irani and G. K. Gupta and my benchmark results, Howard's algorithm is by far the fastest, although all known bounds for its running time are exponential. Hartman-Orlin is the second.
comment:8 follow-up: 13 Changed 10 years ago by
Most of the min. mean cycle algorithms can be generalized to the minimum cost-to-time ratio problem, in which a directed cycle C have to be found with minimum c(C)/t(C) ratio (where c(C) denotes the total cost, t(C) denotes the total delay time on the cycle C). The min. mean cycle problem is clearly a special case of this problem, in which t=1 for all arcs.
Do you think this problem is important for LEMON? If we would like to have algorithms for this problem (in the future), it could cause naming conflicts. My above comment is also about this problem with the addition that Howard's algorithm seems to be a more general method than the others.
How can we call the min. mean cycle classes?
Changed 10 years ago by
Attachment: | 179-1-port-b31e130db13d.patch added |
---|
Changed 10 years ago by
Attachment: | 179-2-interface-d66ff32624e2.patch added |
---|
Changed 10 years ago by
Attachment: | 179-3-implementation-83ce7ce39f21.patch added |
---|
Changed 10 years ago by
Attachment: | 179-4-traits-5795860737f5.patch added |
---|
Changed 10 years ago by
Attachment: | 179-5-rename-03887b5e0f6f.patch added |
---|
comment:9 Changed 10 years ago by
I attached five patches:
- [b31e130db13d] ports the
MinMeanCycle
class. - [d66ff32624e2] simplifies the interface of the execution control: removes
init()
andreset()
, and moves their content intofindMinMean()
. - [83ce7ce39f21] contains implementation improvements and fixes (see the commit log for details). The new version is about two times faster on average and an order of magnitude faster if there are a lot of strongly connected components.
- [5795860737f5] adds a Traits class and named parameters. The most important types are
LargeValue
andPath
. The first one is a value type that is used for internal computations. It must be typically larger than the type of the arc lengths, since the lengths are summed and multiplied by the numer of arcs on the currently found cycle (in order to avoid all divisions in the computations, thus improve the numerical stability). Therefore the default traits class defines this type tolong long
if the arc length type is integer and todouble
otherwise. It can be modified with a named parameter. - [03887b5e0f6f] renames
cyclePath()
tocycle()
, since the latter name is used for the query function.
I would be glad to hear any feedback about these changesets, especially about the new interface of the class.
Important questions:
- Is this Traits class solution good? What do you think about introducing the
LargeValue
type and switching to a more stable computation method? - Is this
LargeValue
name acceptable? Do you have any other suggestion for that? - The query functions are called
cycleLength()
,cycleArcNum()
,cycleMean()
andcycle()
. (Conversion todouble
and division are needed only for thecycleMean()
function now.) Are these names good? E.g.cycleArcNum()
could be renamed tocycleSize()
, which would be simpler but maybe misleading. What do you think? - The name of the class (see the above comments).
Changed 10 years ago by
Attachment: | 179-6-test-e5b1b18cf064.patch added |
---|
Changed 10 years ago by
Attachment: | 179-7-rename-f2f32caf7d14.patch added |
---|
Changed 10 years ago by
Attachment: | 179-8-add-karp-f7ddbf131179.patch added |
---|
Changed 10 years ago by
Attachment: | 179-9-add-ho-00dd2f3ccfa9.patch added |
---|
Changed 10 years ago by
Attachment: | 179-10-impl-impr-8642452f583c.patch added |
---|
Changed 10 years ago by
Attachment: | 179-11-doc-group-631e955e3b94.patch added |
---|
Changed 10 years ago by
Attachment: | 179-b31e130db13d--631e955e3b94.bundle added |
---|
comment:11 follow-up: 12 Changed 10 years ago by
I attached five more patches:
- [f2f32caf7d14] Rename MinMeanCycle to Howard (#179)
- [f7ddbf131179] Add Karp algorithm class (#179) based on the old MinMeanCycle implementation in SVN -r3436. The interface is reworked to be the same as Howard's interface.
- [00dd2f3ccfa9] Add HartmannOrlin algorithm class (#179). This algorithm is an improved version of Karp's original method, it applies an efficient early termination scheme. The interface is the same as Karp's and Howard's interface.
- [8642452f583c] Simplify comparisons in min mean cycle classes (#179) using extreme INF values instead of bool flags.
- [631e955e3b94] Separate group for the min mean cycle classes (#179)
The attached bundle file contains all the 11 changesets from [b31e130db13d] to [631e955e3b94].
Could you please review these changesets?
The main questions are the ones that are asked before, moreover the followings:
- The three classes have exactly the same default traits class but with different names. Should we use only one default traits class (
MinMeanCycleDefaultTraits
) instead? - Now the
LargeValue
andPath
types can be changed for each algorithm using named parameters. Apart from them, it would be nice to be able to change theTolerance
class or rather theepsilon
value (using a real type). How should we implement it? Is there any similar solution in other classes? (Not with an operation traits class like in Dijkstra).
comment:12 Changed 10 years ago by
Summary: | Port the min. mean cycle alg. → Port the min mean cycle algorithms |
---|
Replying to kpeter:
Apart from them, it would be nice to be able to change the
Tolerance
class or rather theepsilon
value (using a real type). How should we implement it? Is there any similar solution in other classes? (Not with an operation traits class like in Dijkstra).
Of course, there are similar solutions (I should have searched for them). E.g. in Preflow
, Circulation
and HaoOrlin
, however they are not consistent (see #306).
I applied the solution of Preflow
and Circulation
for the MMC classes in [5d1795170bb6].
Changed 10 years ago by
Attachment: | 179-12-tolerance-5d1795170bb6.patch added |
---|
comment:13 Changed 10 years ago by
Replying to kpeter:
Most of the min. mean cycle algorithms can be generalized to the minimum cost-to-time ratio problem, in which a directed cycle C have to be found with minimum c(C)/t(C) ratio (where c(C) denotes the total cost, t(C) denotes the total delay time on the cycle C). The min. mean cycle problem is clearly a special case of this problem, in which t=1 for all arcs.
Do you think this problem is important for LEMON?
Yes, it is important.
comment:14 follow-up: 15 Changed 10 years ago by
The third template parameter of the classes (TR) are not documented and they seems to be mandatory from the doc, which - I hope - false.
comment:15 follow-up: 16 Changed 10 years ago by
Replying to alpar:
The third template parameter of the classes (TR) are not documented and they seems to be mandatory from the doc, which - I hope - false.
Just like for many other classes: Bfs
, Dfs
, Dijkstra
, BellmanFord
, Preflow
, Circulation
etc. That was a decision that was suggested or at least approved by you, as far as I remember.
comment:16 follow-up: 17 Changed 10 years ago by
Replying to kpeter:
Replying to alpar:
The third template parameter of the classes (TR) are not documented and they seems to be mandatory from the doc, which - I hope - false.
Just like for many other classes:
Bfs
,Dfs
,Dijkstra
,BellmanFord
,Preflow
,Circulation
etc. That was a decision that was suggested or at least approved by you, as far as I remember.
Are you sure I have ever suggested
- showing a parameter in the doc but not document it and
- documenting an optional parameter as if it were mandatory?
comment:17 Changed 10 years ago by
Replying to alpar:
Just like for many other classes:
Bfs
,Dfs
,Dijkstra
,BellmanFord
,Preflow
,Circulation
etc. That was a decision that was suggested or at least approved by you, as far as I remember.Are you sure I have ever suggested
- showing a parameter in the doc but not document it and
- documenting an optional parameter as if it were mandatory?
No, I'm not sure, that's why I wrote: at least approved by you, as far as I remember. We had a few letters about it on 28/11/2008 ("Nehany kerdes"). In these letters Balazs wrote:
Szerintem is hasznos, ha toroljuk, akar ez a Traits osztaly kikerulhetne a doksibol. (template parameterek DOXYGEN szerinti ifdef-elese, es a doskibol kiszedejuk)
He suggested to hide these parameters entirely. As the letters show I liked this idea, so there must have been a dominant reason to change my mind. I can't remember exactly, but I'm almost sure that either Balazs or you suggested this "middle course" solution. After that I created #185 with a patch that hides these \tparam
entries, but keeps them in the list of template parameters.
Finally, I performed the analogous modifications for all classes that have traits classes. All these patches were accepted up to now.
comment:18 follow-up: 19 Changed 10 years ago by
However not this "historical" question is the most important. Tell me how to modify the doc in all of these classes, and I will do it.
comment:19 follow-up: 20 Changed 10 years ago by
Replying to kpeter:
However not this "historical" question is the most important. Tell me how to modify the doc in all of these classes, and I will do it.
As a rule of thumb, we can hide features in the doc, but cannot state false things.
comment:20 Changed 10 years ago by
comment:21 follow-up: 22 Changed 10 years ago by
What to do then? Are there any open questions here? Can we merge this line of changes to the main branch?
comment:22 follow-up: 24 Changed 10 years ago by
Replying to alpar:
What to do then? Are there any open questions here? Can we merge this line of changes to the main branch?
Yes, there are some important questions about the names and the interfaces of these classes, see comment1 9 and 11:
- Is this
Traits
class solution good? What do you think about introducing theLargeValue
type and switching to a more stable computation method? - Is this
LargeValue
name acceptable? Do you have any other suggestion for that? - The query functions are called
cycleLength()
,cycleArcNum()
,cycleMean()
andcycle()
. (Conversion to double and division are needed only for thecycleMean()
function now.) Are these names good? E.g.cycleArcNum()
could be renamed tocycleSize()
, which would be simpler but maybe misleading. What do you think? - The names of the classes considering the minimum cost-to-time ratio problem in the future (see the above comments).
- The three classes have exactly the same default traits class but with different names. Should we use only one default traits class (
MinMeanCycleDefaultTraits
) instead?
Changed 10 years ago by
Attachment: | 179-184-mmc-bib-4ac26e328df3.patch added |
---|
comment:23 Changed 10 years ago by
The attached patch [4ac26e328df3] adds citations to the min mean cycle algorithms. It should be merged with the changesets in #184.
Changed 10 years ago by
Attachment: | 179-doc-fix-623660793d90.patch added |
---|
comment:24 Changed 10 years ago by
Replying to kpeter:
- The three classes have exactly the same default traits class but with different names. Should we use only one default traits class (
MinMeanCycleDefaultTraits
) instead?
Not exactly, Karp
and HartmannOrlin
use addFront()
function, while Howard
uses addBack()
function for the Path
type. See the attached [623660793d90] doc fix.
So separate traits class seem better, only four questions are left here.
comment:25 follow-up: 26 Changed 10 years ago by
Resolution: | → done |
---|---|
Status: | assigned → closed |
A couple of these changesets here didn't compile due to a missing include. I added these to the changesets. I also augmented test_tools.h
in order to avoid an "ambiguous else{}" warning. The new changeset ids are [b31e130db13d], [d66ff32624e2], [83ce7ce39f21], [5795860737f5], [03887b5e0f6f], [93cd93e82f9b], [1fac515a59c1], [3b544a9c92db], [97744b6dabf8], [11c946fa8d13], [0a42883c8221], [e746fb14e680], [8452ca46e29a] and [f964a00b9068].
So I hope that every changeset in this ticket is in the main branch now.
comment:26 Changed 10 years ago by
Replying to alpar:
So I hope that every changeset in this ticket is in the main branch now.
Yes, thank you.
comment:27 Changed 10 years ago by
Resolution: | done |
---|---|
Status: | closed → reopened |
Don't hurry to close this ticket. We shouldn't evade these questions:
- Is this
LargeValue
name acceptable? Do you have any other suggestion for that? - The query functions are called
cycleLength()
,cycleArcNum()
,cycleMean()
andcycle()
. (Conversion to double and division are needed only for thecycleMean()
function now.) Are these names good? E.g.cycleArcNum()
is not so nice. It could be renamed tocycleSize()
, which would be simpler but maybe misleading.Length
could be replaced withCost
(not to confuse with the number of arcs), but all shortest path algorithms useLength
. What do you think? - The names of the classes considering the minimum cost-to-time ratio problem in the future (see the above comments).
comment:29 Changed 9 years ago by
The attached patch [aa8c9008b3de] modifies the return value type of cycleLength()
functions, which caused the warnings reported in #350.
The original Value
type is used instead of the LargeValue
type which is introduced for internal computations. Similarly to the shortest path algorithms, we suppose that this value type can store the total cost on the found cycle/path without overflow. (Of course, the Value
type can also be set to a larger type.) In the internal computations, these total costs are multiplied by the number of arcs on the cycle, which could be significantly larger, that's why we use LargeValue
type.
Changed 9 years ago by
Attachment: | 179-valuetype-aa8c9008b3de.patch added |
---|
comment:30 follow-ups: 31 32 33 Changed 9 years ago by
I have just noticed that BGL also implements the Howard policy iteration algorithm for the "minimum ratio cycle" problem, which is the generalization of the min. mean cycle problem, as you can see at the beginning of this ticket. I think, we should check that interface and decide how we can extend the interface of these classes in LEMON, or how we could name the general classes if they are implemented separately.
Could you please help me in this question?
comment:31 Changed 9 years ago by
Priority: | major → critical |
---|
Replying to kpeter:
I have just noticed that BGL also implements the Howard policy iteration algorithm for the "minimum ratio cycle" problem, which is the generalization of the min. mean cycle problem, as you can see at the beginning of this ticket. I think, we should check that interface
I do not like the BGL interface, since it focuses on the cycle mean/ratio, and it does not provide convenient way to obtain the found cycle itself and the numerator and the denominator of the cycle ratio. However, that interface supports the general problem: it can solve the min/max ratio problem for any pairs of edge weight maps.
Could you please help me in this question?
The main questions:
- How should we call the classes that solve the general problem, i.e. "minimum cycle ratio" or "optimum cycle ratio" if it supports both min and max. For example,
HowardRatio
,MrcHoward
,McrHoward
(from "min ratio cycle" or "min cycle ratio"). A possible solution would be to have only the general interface, but the generalized versions of the algorithms are not implemented yet, and I cannot see a possible extension of the current interface in a backward compatible way.
- Suppose we have an algorithm class X for the ratio problem. How should we call the two weight functions?
cost
andtime
(this is typical in applications)cost
andlength
(this reflects to the fact, that the ratio problem is the generalization of the mean problem, in which the length is the number of arcs on the cycle)cost1
andcost2
weight1
andweight2
- Are the current interfaces "compatible" with these plans? E.g. I suggest the following renamings:
length
-->cost
(since the latter one seems to be more typical for this problem and cannot be confused with the number of arcs)findMinMean
-->findMean
(since we will probably support max. cycle ratio search, too)- What about
cycleArcNum()
? Should it becycleSize()
orcycleLength()
?
Could you please help me? I find these questions important.
comment:32 follow-up: 35 Changed 9 years ago by
I've always had a bad feeling when we named the algorithms after the authors in case when it is not generally accepted. There is nothing wrong with calling the shortest path algorithm as Dijkstra
, but for example it is a bad idea to call a min mean cycle alg. simply as Karp
. This guy has a lot of contribution to combinatorial optimization, thus this kind of choice will certainly lead to confusion sooner or later.
Another example for the possible problems with this habit is well demonstrated by the Howard
, because it solves the more general problem minimum ratio cycle problem, but can also be specialized to the minimum mean cycle problem.
The solution can be to include both the problem name and the author's name.
As far as the minimum ratio cycle problem (or the more general fractional optimization frameworks) concerned, they are important on their own right, and there are various nice technique for solving them (including e.g. strongly polynomial time algorithm for min. mean cycle). So, IMHO there is nothing wrong with having two implementation for the Howard algorithm, a general one for the minimum ratio cycle problem and a specialized version for finding minimum mean cycles
comment:33 follow-up: 34 Changed 9 years ago by
Replying to kpeter:
Could you please help me in this question?
Could you give an online reference explaining this Howard algorithm?
comment:34 follow-up: 36 Changed 9 years ago by
Replying to alpar:
Replying to kpeter:
Could you please help me in this question?
Could you give an online reference explaining this Howard algorithm?
E.g. you can find a pseudo-code and explanation here for Howard's algorithm applied to this problem:
http://www.boost.org/doc/libs/1_42_0/libs/graph/doc/dasdan-dac99.pdf
As far as I know, Howard's algorithm is a general method that can be used for solving several different problems. In the above paper the authors wrote: "One of the most surprising results of this paper is that Howard's algorithm, known primarily in the stochastic control community, is by far the fastest algorithm on our test suite".
Maybe we should introduce a new naming convention here? E.g. adding some prefix (or suffix) indicating the problem to the current names? For example, MmcHoward
, MmcKarp
, MmcHartmanOrlin
for the min mean cycle problem and MrcHoward
etc. for the min/max ratio cycle problem (later). I could get familiar with such names. It would also solve the problem that "it is a bad idea to call a min mean cycle alg. simply as Karp. This guy has a lot of contribution to combinatorial optimization, thus this kind of choice will certainly lead to confusion sooner or later."
What do you think?
comment:35 Changed 9 years ago by
Replying to alpar:
So, IMHO there is nothing wrong with having two implementation for the Howard algorithm, a general one for the minimum ratio cycle problem and a specialized version for finding minimum mean cycles.
Just like we have Suurbale
, wich is hardly more than a specialized implementation of CapacityScaling
. (In fact, it also includes a trivial path decomposition method.)
comment:36 follow-up: 37 Changed 9 years ago by
Replying to kpeter:
Maybe we should introduce a new naming convention here? E.g. adding some prefix (or suffix) indicating the problem to the current names? For example,
MmcHoward
,MmcKarp
,MmcHartmanOrlin
for the min mean cycle problem andMrcHoward
etc. for the min/max ratio cycle problem (later). I could get familiar with such names.
Yes, it is a good idea. I slightly prefer the reversed order (e.g. KarpMmc
etc.), but I don't mind using that one either. The important thing is to use NameProblem
vs. ProblemName
uniformly in the lib.
comment:37 Changed 9 years ago by
Replying to alpar:
Yes, it is a good idea. I slightly prefer the reversed order (e.g.
KarpMmc
etc.), but I don't mind using that one either.
I don't have preference in this question. However, we already have GlpkLp
, CplexLp
etc., which is a bit different but similar situation. Thus KarpMmc
etc. could be slightly better. (As far as I remember it was you who suggested this order for the Lp and Mip classes.)
Btw. BGL uses names like problem_author
, e.g. ocr_howard (optimum cycle ratio).
The important thing is to use
NameProblem
vs.ProblemName
uniformly in the lib.
Definitely.
comment:38 follow-up: 39 Changed 9 years ago by
what about the class names? Should they be karp_mmc.h
etc.?
comment:39 follow-up: 41 Changed 9 years ago by
Replying to kpeter:
what about the class names? Should they be
karp_mmc.h
etc.?
I'm sorry, I mean the file names.
Changed 9 years ago by
Attachment: | 179-rename-9aa0784c0c94.patch added |
---|
comment:40 Changed 9 years ago by
The attached patch [9aa0784c0c94] does a lot of renamings related to these files. It is based on the bugfix [a93f1a27d831] form #354.
The renamed classes:
- Karp --> KarpMmc
- HartmannOrlin --> HartmannOrlinMmc
- Howard --> HowardMmc
(The header files are renamed accordingly.)
The renamed members:
- cycleLength() --> cycleCost()
- findMinMean() --> findCycleMean()
- Value --> Cost
- LargeValue --> LargeCost
- SetLargeValue --> SetLargeCost
All useage of 'length' is changed to 'cost' in these files, because 'cost' is typically used in the min ratio cycle problems.
What about cycleArcNum()
? Should it be renamed to cycleSize()
or cycleLength()
?
comment:41 Changed 9 years ago by
comment:42 follow-up: 43 Changed 9 years ago by
How do you like the above renamings? What to do with cycleArcNum()
?
comment:43 follow-up: 44 Changed 9 years ago by
Replying to kpeter:
How do you like the above renamings? What to do with
cycleArcNum()
?
I prefer cycleSize()
, but the current one isn't bad either.
Changed 9 years ago by
Attachment: | 179-rename-new-d3ea191c3412.patch added |
---|
comment:44 follow-up: 45 Changed 9 years ago by
Replying to alpar:
I prefer
cycleSize()
, but the current one isn't bad either.
I also prefer this name. The new patch [d3ea191c3412] extends the former renaming patch [9aa0784c0c94] with this change.
I think, we can close this ticket once [d3ea191c3412] is applied.
comment:45 Changed 9 years ago by
Resolution: | → done |
---|---|
Status: | reopened → closed |
This ticket is more or less related to #51 and #180. Maybe it would be better in release 1.2.