COIN-OR::LEMON - Graph Library

Opened 8 years ago

Closed 8 years ago

Last modified 8 years ago

#605 closed defect (invalid)

Overflow in Optimal Cost

Reported by: mgara
Owned by: Alpar Juttner
Priority: major
Milestone: LEMON 1.4 release
Component: core
Version: hg main
Keywords:
Cc:
Revision id:

Description

Motivation
==========

Please consider the following simple class of problems (which I will refer to as the primal MRF problem [1]):

max_{x,r} c' r

wrt

r = |Ax - b|

where A is a network/graph matrix (the rows define the oriented edges, the matrix is Totally Unimodular), and c >= 0, b are integral vectors. The dual of this problem is an instance of an MCF problem defined on a graph represented by the network matrix A with b as the costs of the edges and c as the capacities. Strong duality holds here.

In our work, we are interested in examining and evaluating different approaches to solving either the MRF or the MCF problem using primal, dual, or primal-dual methods. The context we have in mind is image processing; in particular, we are interested in the phase unwrapping problem in InSAR.

Properties of MCF Instances
===========================

  1. The MCF dual problem of the MRF problem is defined on the same graph as defined by A, with the addition of an opposite arc with negative cost for every oriented arc in A.
  2. The MCF dual problem has no node imbalances by definition. The network flow condition must still be enforced.
  3. By corollary of 1 & 2, the optimal values for the MCF dual problem must be negative.

Current Issue with Lemon:
=========================

Please consider the attached code, which was tested with Lemon version 1.3.1. It should be self-contained and compilable with the make command if the LEMON_HOME directory is correctly set in the Makefile.

Once compiled it can be run as follows (assuming a linux environment with a BASH shell) to reproduce the issue.

for scale in $(seq .1 .1 .4); do ./lemon_mcf_solver netgen_8_08a.txt $scale; done > output.txt

In output.txt we should see that as we scale the costs b by factors from 0.1 to 0.4 in steps of 0.1, the optimal value is negative at first, but at 0.4 it becomes positive:

grep "simplex cost:\|scaling costs" output.txt

INFO: scaling costs by 0.100000
INFO: network simplex cost: -575280232
INFO: scaling costs by 0.200000
INFO: network simplex cost: -1154018674
INFO: scaling costs by 0.300000
INFO: network simplex cost: -1732004546
INFO: scaling costs by 0.400000
INFO: network simplex cost: 1984124216

Given property 3 of these MCF instances, we know that a positive-cost optimal solution is impossible. In fact, we've verified that the solution is correct up to scale 0.3 (by comparing with other solvers); the issue appears around scale 0.4.

We came across this issue by modifying some NETGEN instances to have the properties of the MCF dual problem we expect.

Please let us know if this is an issue with how we are using the Lemon library, or perhaps if this is a bug in the Lemon library that can be addressed.

Thank you in advance, Matt

[1] Kolmogorov, Vladimir. "Primal-dual algorithm for convex Markov random fields." Microsoft Research MSR-TR-2005-117 (2005).

Attachments (1)

lemon_issue.tar.gz (16.3 KB) - added by mgara 8 years ago.
Reproducible Code Example


Change History (5)

Changed 8 years ago by mgara

Attachment: lemon_issue.tar.gz added

Reproducible Code Example

comment:1 Changed 8 years ago by Alpar Juttner

Having just a short look at the code, I believe it is not a bug, but indeed just a simple integer overflow.
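The reported numbers are consistent with this diagnosis. The sketch below is a plain arithmetic check, not the ticket's code: it assumes the true optimum at scale 0.4 is the reported value minus 2^32 (an inference, not a measured result; it is roughly four times the 0.1 result, which makes it plausible). The helper `wrap_to_int32` is a hypothetical name introduced here to mimic two's-complement wraparound. It needs C++14 or later, and the `static_assert`s run the check at compile time.

```cpp
#include <cstdint>

// Hypothetical helper: reduce a 64-bit value modulo 2^32 into the signed
// 32-bit range, mimicking two's-complement wraparound without relying on
// implementation-defined narrowing conversions.
constexpr std::int32_t wrap_to_int32(std::int64_t v) {
    const std::int64_t two32 = std::int64_t{1} << 32;
    std::int64_t r = v % two32;        // now in (-2^32, 2^32)
    if (r < INT32_MIN) r += two32;     // too negative: shift up by 2^32
    if (r > INT32_MAX) r -= two32;     // too positive: shift down by 2^32
    return static_cast<std::int32_t>(r);
}

// Value printed in the ticket at scale 0.4.
constexpr std::int64_t kReported = 1984124216LL;
// Assumed true optimum: reported value minus 2^32 (inferred, not measured).
constexpr std::int64_t kCandidate = kReported - (std::int64_t{1} << 32);

static_assert(kCandidate == -2310843080LL, "candidate optimum");
static_assert(kCandidate < INT32_MIN, "candidate does not fit in 32 bits");
static_assert(wrap_to_int32(kCandidate) == 1984124216,
              "wrapping the candidate reproduces the reported value");
// The scale 0.3 result still fits, so it is reported unchanged.
static_assert(wrap_to_int32(-1732004546LL) == -1732004546, "0.3 fits");
```

The scale 0.3 result of -1732004546 is still above the 32-bit minimum of -2147483648, which would explain why the sign flips only at the next step.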

You use long long int weights and costs, but the solvers default to int for their value and cost types. Try lemon::CapacityScaling<lemon::ListDigraph, long long int>, lemon::CostScaling<lemon::ListDigraph, long long int> and lemon::NetworkSimplex<lemon::ListDigraph, long long int>.

Note that

  1. The above MCF implementations even allow using different data types for the capacity and for the cost calculation (see the doc).
  2. Using long long int for the capacity and (even more importantly) for the costs makes sense even if the input consists of 32-bit integers only. During the calculations, capacity values are added together, and cost values are multiplied by capacity values and added together, which can easily cause integer overflow.
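To illustrate the second point, here is a minimal standalone sketch (not LEMON code) in which every input fits in 32 bits but the accumulated cost does not. The data and the names `total_cost`, `fits_in_int32`, `kCost` and `kFlow` are made up for illustration. The accumulator type is a template parameter, separate from the input type, in the spirit of the first point. It needs C++14 or later.

```cpp
#include <cstddef>
#include <cstdint>

// Sum cost[i] * flow[i], widening each factor to the accumulator type
// before multiplying so the product is computed in that wider type.
template <typename Acc, std::size_t N>
constexpr Acc total_cost(const std::int32_t (&cost)[N],
                         const std::int32_t (&flow)[N]) {
    Acc sum = 0;
    for (std::size_t i = 0; i < N; ++i)
        sum += static_cast<Acc>(cost[i]) * static_cast<Acc>(flow[i]);
    return sum;
}

constexpr bool fits_in_int32(std::int64_t v) {
    return v >= INT32_MIN && v <= INT32_MAX;
}

// Made-up instance: three arcs, each input value far below the 32-bit limit.
constexpr std::int32_t kCost[] = {-50000, -50000, -50000};
constexpr std::int32_t kFlow[] = {50000, 50000, 50000};

// A 64-bit accumulator holds the exact total.
static_assert(total_cost<std::int64_t>(kCost, kFlow) == -7500000000LL,
              "exact total in 64 bits");
// The same total cannot be represented in 32 bits.
static_assert(!fits_in_int32(total_cost<std::int64_t>(kCost, kFlow)),
              "total exceeds the 32-bit range");
// A floating-point accumulator also holds this particular total exactly.
static_assert(total_cost<double>(kCost, kFlow) == -7.5e9, "total as double");
```

Even a single product here (-2500000000) is already outside the 32-bit range, so an int accumulator would overflow on the first arc.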

comment:2 Changed 8 years ago by Peter Kovacs

I agree. You should use long long int type as described above.

Furthermore, note that the total flow cost may not fit in the data type used for internal calculations of the algorithm. That's why the totalCost() method has its own template argument. You can use it like this:

long long int totalCost1 = ns.totalCost();
double totalCost2 = ns.totalCost<double>();

See the documentation here: http://lemon.cs.elte.hu/pub/doc/1.3.1/a00276.html#a4e1efd04a6b234645d1ca18d2635d57e

comment:3 Changed 8 years ago by Alpar Juttner

Resolution: invalid
Status: new → closed

comment:4 Changed 8 years ago by mgara

Thank you for the information; we've tested that this actually fixes things and so we are continuing our benchmarking and testing. Thanks again and sorry for the mix-up.
