diff -r bc6edeef8717 -r 76dc74f824ba damecco.tex --- a/damecco.tex Fri Nov 18 16:18:02 2016 +0100 +++ b/damecco.tex Fri Nov 18 16:45:03 2016 +0100 @@ -47,7 +47,7 @@ %% for the whole article with \linenumbers. %% \usepackage{lineno} -\journal{Nuclear Physics B} +\journal{Discrete Applied Mathematics} \begin{document} @@ -73,20 +73,63 @@ %% \address{Address\fnref{label3}} %% \fntext[label3]{} -\title{} +\title{Improved Algorithms for Matching Biological Graphs} %% use optional labels to link authors explicitly to addresses: %% \author[label1,label2]{} %% \address[label1]{} %% \address[label2]{} -\author{} +\author{Alp{\'a}r J{\"u}ttner and P{\'e}ter Madarasi} -\address{} +\address{Dept of Operations Research, ELTE} \begin{abstract} -%% Text of abstract +Subgraph isomorphism is a well-known NP-Complete problem, while its +special case, the graph isomorphism problem is one of the few problems +in NP neither known to be in P nor NP-Complete. Their appearance in +many fields of application such as pattern analysis, computer vision +questions and the analysis of chemical and biological systems has +fostered the design of various algorithms for handling special graph +structures. +The idea of using state space representation and checking some +conditions in each state to prune the search tree has made the VF2 +algorithm one of the state of the art graph matching algorithms for +more than a decade. Recently, biological questions of ever increasing +importance have required more efficient, specialized algorithms. + +This paper presents VF2++, a new algorithm based on the original VF2, +which runs significantly faster on most test cases and performs +especially well on special graph classes stemming from biological +questions. VF2++ handles graphs of thousands of nodes in practically +near linear time including preprocessing. Not only is it an improved +version of VF2, but in fact, it is by far the fastest existing +algorithm regarding biological graphs. + +The reason for VF2++' superiority over VF2 is twofold. Firstly, taking +into account the structure and the node labeling of the graph, VF2++ +determines a state order in which most of the unfruitful branches of +the search space can be pruned immediately. Secondly, introducing more +efficient - nevertheless still easier to compute - cutting rules +reduces the chance of going astray even further. + +In addition to the usual subgraph isomorphism, specialized versions +for induced subgraph isomorphism and for graph isomorphism are +presented. VF2++ has gained a runtime improvement of one order of +magnitude respecting induced subgraph isomorphism and a better +asymptotical behaviour in the case of graph isomorphism problem. + +After having provided the description of VF2++, in order to evaluate +its effectiveness, an extensive comparison to the contemporary other +algorithms is shown, using a wide range of inputs, including both real +life biological and chemical datasets and standard randomly generated +graph series. + +The work was motivated and sponsored by QuantumBio Inc., and all the +developed algorithms are available as the part of the open source +LEMON graph and network optimization library +(http://lemon.cs.elte.hu). \end{abstract} \begin{keyword}