[Lemon-user] bug in ListDigraph::erase(Node &) ?

Wed Mar 30 07:09:56 CEST 2011

Hi Duraid,

> 	I think there is a bug in list_graph.h's erase(Node) function, which
> in my version is this:

Thank you for the observation. I believe it is not a bug (see below),
but the rational behind is certainly not described anywhere.

> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>     void erase(const Node& node) {
>       int n = node.id;
>       
>       if(nodes[n].next != -1) {
>         nodes[nodes[n].next].prev = nodes[n].prev;
>       }
>        
>       if(nodes[n].prev != -1) {
>         nodes[nodes[n].prev].next = nodes[n].next;
>       } else {
>         first_node = nodes[n].next;
>       }
>        
>       nodes[n].next = first_free_node;
>       first_free_node = n;
>       nodes[n].prev = -2;
>     
>     }
> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> 
> 	Suppose nodes[n].prev is -2. Then the check against 1 will succeed,
> and nodes[-2].next will be accessed, which will _usually_ not lead to a
> crash, just silent corruption.

nodes[n].prev==-1 means n is the first node in the list. 
nodes[n].prev==-2 means n has been deleted.

The point is that you are oly allowed to delete _existing_ items. If you
want to delete a node with prev==-2 then your code is certainly buggy.
We may want to try to hide the bug in your code for some time, but this
will not solve the problem.

You may ask, why is prev set to -2 if it is never accessed by a valid
code. The reason is to make it possible to write language bindings.
Assume you write a Python interface to LEMON. You cannot prevent your
user from writing a Python code that double deletes an item, but you
still have to keep the memory in a consistent state so that the Python
interpreter itself will not crash. Thus you have to check if an item is
valid with this function:

    bool ListDigraph::valid(Node n) const {
      return n.id >= 0 && n.id < static_cast<int>(nodes.size()) &&
        nodes[n.id].prev != -2;
    }

Nota bene, this will still _not_ safely localize double erase of a node.
If you erase a node, then add new ones, then the id of an erased node
may become valid again. If you erase again the old one later, it will be
unnoticed, but you will delete a different node that you intended.

But, the graph data structure will at least remain in a consistent
state.

> 	If the intended interpretation of "x != -1" here is "x is valid",

Not exactly. x==-1 means "x in INVALID". (where INVALID is just a node
constant, thus n!=INVALID does not mean n is valid). Validity of a node
is defined a bit differently (see above).

> maybe it would make sense to change all these comparisons to "x >= 0" ?

I think it would just hide the bugs in the user code even more. For
example, when you double delete an item, Valgind (http://valgrind.org/ ,
the most useful debugging tool for C/C++) will point out the array
underindexing with the current code, but not with your modification.

I think it would really make more sense (and more work) to check --- in
DEBUG mode only --- the validity of the items and ASSERT if they aren't
valid. In fact, this could be done at each and every graph operation.

Regards,
Alpar