COIN-OR::LEMON - Graph Library

Opened 16 months ago

Closed 15 months ago

Last modified 15 months ago

#605 closed defect (invalid)

Overflow in Optimal Cost

Reported by: mgara
Owned by: alpar
Priority: major
Milestone: LEMON 1.4 release
Component: core
Version: hg main
Keywords:
Cc:
Revision id:

Description

Motivation
==========

Please consider the following simple class of problems (which I will refer to
as the primal MRF problem [1]):

min_{x,r} c' r

subject to

r = |Ax - b|

where A is a network/graph matrix (the rows define the oriented edges, the
matrix is Totally Unimodular), and c >= 0, b are integral vectors. The dual of
this problem is an instance of an MCF problem defined on a graph represented
by the network matrix A with b as the costs of the edges and c as the
capacities. Strong duality holds here.

In our work, we are interested in examining and evaluating different approaches
to solving either the MRF or MCF problem using either primal, dual or
primal-dual methods. The context we have in mind is image processing, in
particular the phase unwrapping problem in InSAR.

Properties of MCF Instances
===========================

  1. The MCF dual problem of the MRF problem is defined on the same graph as
     defined by A, with the addition of an opposite arc with negative cost for
     every oriented arc in A.
  2. The MCF dual problem has no node imbalances by definition. The network flow
     condition must still be enforced.
  3. By corollary of 1 & 2, the optimal values for the MCF dual problem must be
     negative.
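
The three properties above can be read off from a short duality sketch. It assumes the primal is read as a minimization of c' r over x, and it introduces symbols that do not appear in the report: y (one dual variable per arc) and f^+, f^- (the flow on an arc and on its opposite arc).

```latex
\begin{aligned}
\min_x \; c^\top |Ax - b|
  &= \min_x \max_{-c \le y \le c} \; y^\top (Ax - b) \\
  &= \max_{y} \left\{ -b^\top y \;:\; A^\top y = 0,\; -c \le y \le c \right\} \\
  &= -\min_{f^+,\, f^-} \left\{ b^\top f^+ - b^\top f^- \;:\;
       A^\top (f^+ - f^-) = 0,\; 0 \le f^\pm \le c \right\}
\end{aligned}
```

Under these assumptions, f^- travels on the opposite arc with cost -b (property 1), the conservation constraint has a zero right-hand side (property 2), and f^+ = f^- = 0 is feasible with cost 0, so the minimum cannot be positive (property 3).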

Current Issue with Lemon:
=========================

Please consider the attached code, which was tested with Lemon version 1.3.1. It
should be self-contained and compilable with the make command if the LEMON_HOME
directory is correctly set in the Makefile.

Once compiled it can be run as follows (assuming a linux environment with a
BASH shell) to reproduce the issue.

for scale in $(seq .1 .1 .4); do ./lemon_mcf_solver netgen_8_08a.txt $scale; done > output.txt

In output.txt we should see that as the costs b are scaled by factors from .1
to .4, the optimal value is negative at first, but at .4 we obtain a positive
optimal value:

grep "simplex cost:\|scaling costs" output.txt

INFO: scaling costs by 0.100000
INFO: network simplex cost: -575280232
INFO: scaling costs by 0.200000
INFO: network simplex cost: -1154018674
INFO: scaling costs by 0.300000
INFO: network simplex cost: -1732004546
INFO: scaling costs by 0.400000
INFO: network simplex cost: 1984124216

Given property 3, we know that a positive-cost solution is impossible for these
MCF instances. In fact, we've verified (by comparing with other solvers) that
the solution is correct up to scale .3; the issue appears at a scale of
around .4.

We came across this issue by modifying some NETGEN instances to have the
properties of the MCF dual problem we expect.

Please let us know if this is an issue with how we are using the Lemon library,
or perhaps if this is a bug in the Lemon library that can be addressed.

Thank you in advance,
Matt

[1] Kolmogorov, Vladimir. "Primal-dual algorithm for convex Markov random
fields." Microsoft Research MSR-TR-2005-117 (2005).

Attachments (1)

lemon_issue.tar.gz (16.3 KB) - added by mgara 16 months ago.
Reproducible Code Example


Change History (5)

Changed 16 months ago by mgara

Reproducible Code Example

comment:1 Changed 16 months ago by alpar

Having just a short look at the code, I believe it is not a bug, but indeed just a simple integer overflow.

You use long long int weights and costs, but you use the solvers' default value and cost types, which are int.
Try to use
lemon::CapacityScaling<lemon::ListDigraph, long long int>,
lemon::CostScaling<lemon::ListDigraph, long long int> and
lemon::NetworkSimplex<lemon::ListDigraph, long long int>

Note that

  1. The above MCF implementations even allow using different data types for the capacity and for the cost calculation (see the doc).
  2. Using long long int type for the capacity and (even more importantly) for the costs makes sense even if the input consists of 32-bit integers only. During the calculations capacity values are added together, and the cost values are multiplied by capacity values and added together, which can easily cause integer overflow.

comment:2 Changed 16 months ago by kpeter

I agree. You should use long long int type as described above.

Furthermore, note that the total flow cost may not fit in the data type used for internal calculations of the algorithm. That's why the totalCost() method has its own template argument. You can use it like this:

long long int totalCost1 = ns.totalCost();
double totalCost2 = ns.totalCost<double>();

See the documentation here:
http://lemon.cs.elte.hu/pub/doc/1.3.1/a00276.html#a4e1efd04a6b234645d1ca18d2635d57e

comment:3 Changed 15 months ago by alpar

  • Resolution set to invalid
  • Status changed from new to closed

comment:4 Changed 15 months ago by mgara

Thank you for the information; we've tested that this actually fixes things and so we are continuing our benchmarking and testing. Thanks again and sorry for the mix-up.
