COIN-OR::LEMON - Graph Library

Opened 3 years ago

Last modified 7 weeks ago

#597 new enhancement

VF2 (sub)graph isomoprism algorithm

Reported by: alpar Owned by: alpar
Priority: major Milestone: LEMON 1.4 release
Component: core Version: hg main
Keywords: Cc:
Revision id:

Description

The changesets [a037254714b3], [2f479109a71d] and [f85ee41c84bc] implement a slightly modified version of the VF2 algorithm for finding isomorphic copy of a graph as subgraph of another. It can find either normal or induced subgraph, or an isomorphism between the two graphs. It is also capable of enumerating all possible embedding. Finally, nodes can be labeled in order to model different (i.e. unmatchable) node types.

The function type interface can be considered final, the base class may be refined later.

This development was sponsored by QuantumBio Inc.

Attachments (10)

597-minor-fixes.patch (2.1 KB) - added by kpeter 3 years ago.
3feba0ea1bda.patch (76.6 KB) - added by alpar 2 months ago.
597-1-120b9031eada-merge-tests.patch (22.7 KB) - added by kpeter 7 weeks ago.
597-2-76349d8212d3-docs.patch (15.9 KB) - added by kpeter 7 weeks ago.
597-3-b9fad0f9f8ab-rename-fields.patch (20.7 KB) - added by kpeter 7 weeks ago.
597-4-c89884c1737b-rename-methods.patch (2.0 KB) - added by kpeter 7 weeks ago.
597-5-d9f79b81ef6c-graph-type.patch (2.2 KB) - added by kpeter 7 weeks ago.
597-6-b79ff94e27d9-unused-classes.patch (1.4 KB) - added by kpeter 7 weeks ago.
597-7-3ca508482e4c-wrong-method-name.patch (1.2 KB) - added by kpeter 7 weeks ago.
597-8-e68f0ef37e77-unused-class.patch (2.2 KB) - added by kpeter 7 weeks ago.

Download all attachments as: .zip

Change History (14)

Changed 3 years ago by kpeter

comment:1 Changed 3 years ago by kpeter

I checked these commits as I happened to work with the VF2 algorithm several times (unrelated to LEMON).

First, I made some trivial improvements in the doc, see the changesets [abc24245d276] and [233ad6942a26] in the attached patch file.

Furthermore, I would like to add some notes regarding the implementation:

  • In each iteration of the main loop of the extMatch() function, we search for a neighbor of the current node (order[depth]) that is already mapped to node in g2. That is, a neighbor that precedes the current node according to order. Note that the result will be the same for any given node, but it may be calculated multiple times if the backtracking algorithm visits the corresponding depth value more than once. So it could be more efficient to calculate fstMatchedE only once for each depth and store it somehow.
  • If a node i in g1 has multiple neighbors that precedes i according to order, than we could arbitrarily select one of them when fstMatchedE is determined. Currently, the code always selects the first one, but selecting the one having the smallest degree (or having the less neighbors that follow i according to order) may yield better overall performance.
  • A more sophisticated way of doing this could be to check the resulting currPNode (with respect to e.g. its degree or the number of unmatched neighbors) for each candidate fstMatchedE, but this approach would mean that the selection would depend not only on the input graphs and order, but also on the current mapping, so this approach would invalidate the improvement in the first suggestion. (The difference is only important in case of subgraph isomorphism, but even in this case, I suspect that it is better to check only in g1 and use the first improvement as well, because that seems to be more important than the differences between the degrees in g1 and g2.)
  • In this loop, order[depth] is used many times (and isn't changed), so I would suggest extracting it to a local variable that reflects to the actual meaning of this expression (i.e. the current node in g1). It would improve readablity.
  • The current formatting of the code does not fully conform to other source files in LEMON. Actually, this is a matter of taste, and the file itself is consistent, but I wanted to note this difference.
    // The code uses this style
    if (condition)
      {
        command1;
        command2;
      }
    
    // instead of this
    if (condition) {
      command1;
      command2;
    }
    

comment:2 Changed 2 years ago by alpar

Thanks for the improvements and comments.
Changesets [abc24245d276] and [233ad6942a26] are in the main branch now.

I keep the ticket open to be a reminder to your other suggestions.

Changed 2 months ago by alpar

comment:3 Changed 2 months ago by alpar

The attached changeset [3feba0ea1bda] contains many of your suggestions above and also implements a new, improved graph isomorphism algorithm called Vf2pp. In addition, the changeset refactors a some parts of the code to avoid duplication.

A publication about Vf2pp has been submitted to a special issue of the journal Discrete Applied Mathematics. A reference will be given once it is accepted.

Changed 7 weeks ago by kpeter

Changed 7 weeks ago by kpeter

Changed 7 weeks ago by kpeter

Changed 7 weeks ago by kpeter

Changed 7 weeks ago by kpeter

Changed 7 weeks ago by kpeter

Changed 7 weeks ago by kpeter

Changed 7 weeks ago by kpeter

comment:4 Changed 7 weeks ago by kpeter

I reviewed these codes and made a series of minor patches:

  • [120b9031eada] Merge tests of VF2 and VF2++ to avoid code duplication
  • [76349d8212d3] Improve and unify comments and API docs of VF2 algorithms
  • [b9fad0f9f8ab] Unify naming scheme of fields in Vf2 and Vf2pp. Names of private fields start with underscore.
  • [c89884c1737b] Rename private methods in Vf2 and Vf2pp
  • [d9f79b81ef6c] Change the default graph type of Vf2 and Vf2pp. These algorithms work with undirected graphs, so it is more resonable to select ListGraph instead of ListDigraph as the default.
  • [b79ff94e27d9] Remove unused auxiliary classes: DfsLeaveOrder and BfsLeaveOrder are not used at all in Vf2pp.
  • [3ca508482e4c] Change misleading method name in Vf2pp. It processes an entire connected component of the graph _g1 using BFS, so processBfsTree() is more appropriate name than processBFSLevel().
  • [e68f0ef37e77] Remove unused auxiliary class in Vf2: as BfsLeaveOrder turned out to be more efficient in practice, I don't think that we should keep DfsLeaveOrder as unused or commented code.

Besides these modifications, I have a few additional questions:

  • "VF2 Plus Plus" or "VF2++"? I would prefer the latter form, but the former one is used in the API docs. It seems that you also used VF2++ previously, e.g. here.
  • Could you attach test graphs for benchmarking? Some interesting suggestions I wrote above are not implemented in the algorithms, but they would be pretty easy to implement and I am interested in their effects.
  • Why do we need the getPtrToVf2Object() method on the API of the Wizard classes?
  • There are some code duplication between Vf2 and Vf2pp (as well as between Vf2Wizard and Vf2ppWizard). Wouldn't it be better to reduce this? Could we extract common codes e.g. to a common base class?
Note: See TracTickets for help on using tickets.