lemon-vf2: comparison damecco.tex

-:0050d7927c3a
+:b0be2b06ea29
 twofold. Firstly, it is based on a new approach for determining the
 matching order of the nodes, and secondly, more efficient -
 nevertheless easier to compute - cutting rules significantly
 reducing the search space are applied.
-In addition to the usual subgraph isomorphism, the paper also
+In addition to the usual \emph{Subgraph Isomorphism Problem}, the paper also
-presents specialized versions for the \emph{Induced Subgraph
+presents specialized algorithms for the \emph{Induced Subgraph
 Isomorphism} and for the \emph{Graph Isomorphism Problems}.
 Finally, an extensive experimental evaluation is provided using a
-wide range of inputs, including both real life biological and
+wide range of inputs, including both real-life biological and
 chemical datasets and standard randomly generated graph series. The
 results show major and consistent running time improvements over the
 other known methods.
-The C++ implementations of the algorithms are available open source as
+The C++ implementations of the algorithms are available open-source as
-the part of the LEMON graph and network optimization library.
+part of the LEMON graph and network optimization library.
 \end{abstract}
 \begin{keyword}
 Computational Biology, Subgraph Isomorphism Problem
 Complex biological systems arise from the interaction and cooperation
 of plenty of molecular components. Getting acquainted with such
 systems at the molecular level is of primary importance, since
 protein-protein interaction, DNA-protein interaction, metabolic
 interaction, transcription factor binding, neuronal networks, and
-hormone signaling networks can be understood this way.
+hormone signalling networks can be understood this way.
-Many chemical and biological structures can easily be modeled
+Many chemical and biological structures can easily be modelled
 as graphs; for instance, a molecular structure can be
 considered as a graph, whose nodes correspond to atoms and whose
 edges to chemical bonds. The similarity and dissimilarity of
 objects corresponding to nodes are incorporated into the model
 by \emph{node labels}. Understanding such networks basically
 requires finding specific subgraphs, thus it calls for efficient
 graph matching algorithms.
 Other real-world fields related to some
 variants of graph matching include pattern recognition
-and machine vision \cite{HorstBunkeApplications}, symbol recognition
+and machine vision~\cite{HorstBunkeApplications}, symbol recognition~\cite{CordellaVentoSymbolRecognition}, and face identification~\cite{JianzhuangYongFaceIdentification}.  \\
-\cite{CordellaVentoSymbolRecognition}, face identification
-\cite{JianzhuangYongFaceIdentification}.  \\
 Subgraph and induced subgraph matching problems are known to be
-NP-Complete\cite{SubgraphNPC}, while the graph isomorphism problem is
+NP-Complete~\cite{SubgraphNPC}, while the graph isomorphism problem is
 one of the few problems in NP neither known to be in P nor
-NP-Complete. Although polynomial time isomorphism algorithms are known
+NP-Complete. Polynomial-time isomorphism algorithms are known
 for various graph classes, like trees and planar
-graphs\cite{PlanarGraphIso}, bounded valence
+graphs~\cite{PlanarGraphIso}, bounded valence
-graphs\cite{BondedDegGraphIso}, interval graphs\cite{IntervalGraphIso}
+graphs~\cite{BondedDegGraphIso}, interval graphs~\cite{IntervalGraphIso}
-or permutation graphs\cite{PermGraphIso}, and recently, an FPT algorithm has been presented for the coloured hypergraph isomorphism problem in \cite{ColoredHiperGraphIso}.
+or permutation graphs~\cite{PermGraphIso}. Furthermore, an FPT algorithm has also been presented for the coloured hypergraph isomorphism problem in~\cite{ColoredHiperGraphIso}.
 In the following, some algorithms based on other approaches are
 summarized, which do not need any restrictions on the graphs. Even though
 overall polynomial behaviour cannot be expected from such an
 alternative, it may often have good practical performance; in fact,
 it might be the best choice even on a graph class for which a polynomial
 algorithm is known.
 The first practically usable approach was due to
-\emph{Ullmann}\cite{Ullmann} which is a commonly used depth-first
+\emph{Ullmann}~\cite{Ullmann}, which is a commonly used algorithm based on depth-first
-search based algorithm with a complex heuristic for reducing the
+search with a complex heuristic for reducing the
 number of visited states. A major problem is its $\Theta(n^3)$ space
 complexity, which makes it impractical in the case of big sparse
 graphs.
-In a recent paper, Ullmann\cite{UllmannBit} presents an
+In a recent paper, Ullmann~\cite{UllmannBit} presents an
 improved version of this algorithm based on a bit-vector solution for
 the binary Constraint Satisfaction Problem.
-The \emph{Nauty} algorithm\cite{Nauty} transforms the two graphs to
+The \emph{Nauty} algorithm~\cite{Nauty} transforms the two graphs to
-a canonical form before starting to check for the isomorphism. It has
+a canonical form before starting to check for isomorphism. It has
 been considered one of the fastest graph isomorphism algorithms,
 although graph classes have been shown on which it takes exponentially
 many steps. This algorithm handles only the graph isomorphism problem.
-The \emph{LAD} algorithm\cite{Lad} uses a depth-first search
+The \emph{LAD} algorithm~\cite{Lad} uses a depth-first search
 strategy and formulates the matching as a Constraint Satisfaction
 Problem to prune the search tree. The constraints are that the mapping
 has to be injective and edge-preserving, hence it is possible to
 handle new matching types as well.
-The \emph{RI} algorithm\cite{RI} and its variations are based on a
+The \emph{RI} algorithm~\cite{RI} and its variations are based on a
 state space representation. After reordering the nodes of the graphs,
 it uses some fast executable heuristic checks without using any
 complex pruning rules. It seems to run really efficiently on graphs
 coming from biology, and won the International Contest on Pattern
-Search in Biological Databases\cite{Content}.
+Search in Biological Databases~\cite{Content}.
-The currently most commonly used algorithm is the
+Currently, the most commonly used algorithm is the
-\emph{VF2}\cite{VF2}, the improved version of \emph{VF}\cite{VF}, which was
+\emph{VF2}~\cite{VF2}, the improved version of \emph{VF}~\cite{VF}, which was
 designed for solving pattern matching and computer vision problems,
 and has been one of the best overall algorithms for more than a
-decade. Although, it can't be up to new specialized algorithms, it is
+decade. Although it is not as fast as some of the new specialized algorithms, it is still widely used due to its simplicity and space efficiency. VF2 uses
-still widely used due to its simplicity and space efficiency. VF2 uses
 a state space representation and checks some conditions in each state
 to prune the search tree.
-Meanwhile, another variant called \emph{VF2 Plus}\cite{VF2Plus} has
+Meanwhile, another variant called \emph{VF2~Plus}~\cite{VF2Plus} has
 been published. It is considered to be as efficient as the RI
-algorithm and has a strictly better behavior on large graphs.  The
+algorithm and has a strictly better behaviour on large graphs.  The
-main idea of VF2 Plus is to precompute a heuristic node order of the
+main idea of VF2~Plus is to precompute a heuristic node order of the graph to be embedded, in which VF2 works more efficiently.
-small graph, in which the VF2 works more efficiently.
 This paper introduces \emph{VF2++}, a new further improved algorithm
-for the graph and (induced)subgraph isomorphism problem, which uses
+for the graph and (induced) subgraph isomorphism problems, which uses
 efficient cutting rules and determines a node order in which VF2 runs
 significantly faster on practical inputs.
 The rest of the paper is structured as
 follows. Section~\ref{sec:ProbStat} defines the exact problems to be
 solved, and Section~\ref{sec:VF2Alg} provides a description of VF2. Based
 on that, Section~\ref{sec:VF2ppAlg} introduces VF2++. Some technical
 details necessary for an efficient implementation are discussed in
 Section~\ref{sec:VF2ppImpl}.  Finally, Section~\ref{sec:ExpRes}
-provide a detailed experimental evaluation of VF2++ and its comparison
+provides a detailed experimental evaluation of VF2++ and its comparison
 to the state-of-the-art algorithm.
 It must also be mentioned that the C++ implementations of the
 algorithms have been made available for evaluation and use under an
-open source license as a part of LEMON\cite{LEMON} open source graph
+open-source license as a part of the LEMON~\cite{LEMON} graph
 library.
 \section{Problem Statement}\label{sec:ProbStat}
 This section provides a formal description of the problems to be
 solved.
 \end{definition}
 \subsection{Common problems}\label{sec:CommProb}
-The focus of this paper is on two extensively studied topics, the
+The focus of this paper is on the following problems appearing in many applications.
-subgraph isomorphism and its variations. However, the following
-problems also appear in many applications.
+The \textbf{subgraph isomorphism problem} is the following: is
-The \textbf{subgraph matching problem} is the following: is
 $G_{1}$ isomorphic to any subgraph of $G_{2}$ with respect to a given node
 labelling?
-The \textbf{induced subgraph matching problem} asks the same about the
+The \textbf{induced subgraph isomorphism problem} asks the same about the
 existence of an induced subgraph.
 The \textbf{graph isomorphism problem} can be defined as the induced
 subgraph isomorphism problem where the sizes of the two graphs are equal.
-In addition, one may want to find a \textbf{single} mapping or \textbf{enumerate} all of them.
+In addition, one may want to find a \textbf{single} embedding or \textbf{enumerate} all of them.
-Note that some authors refer to the term
-\emph{subgraph isomorphism problem} as an \emph{induced subgraph
-isomorphism problem}.
 \section{The VF2 Algorithm}\label{sec:VF2Alg}
-This algorithm is the basis of both the VF2++ and the VF2 Plus.  VF2
+This algorithm is the basis of both VF2++ and VF2~Plus.  VF2
-is able to handle all the variations mentioned in Section
+is able to handle all the variations mentioned in Section~\ref{sec:CommProb}.  Although it can also handle directed graphs,
-\ref{sec:CommProb}.  Although it can also handle directed graphs,
 for the sake of simplicity, only the undirected case will be
 discussed.
 \subsection{Common notations}
 \indent Assume $G_{1}$ is searched in $G_{2}$.  The following
-definitions and notations will be used throughout the whole paper.
+definitions and notations will be used throughout this paper.
 \begin{definition}
 An injection $\mathfrak{m} : D \longrightarrow V_2$ is called a (partial) \textbf{mapping}, where $D\subseteq V_1$.
 \end{definition}
 \begin{notation}
 A mapping $\mathfrak{m}$ is a $\mathbf{whole\ mapping}$ if $\mathfrak{m}$ covers all the
 nodes of $V_{1}$, i.e. $\mathfrak{D}(\mathfrak{m})=V_1$.
 \end{definition}
 \begin{definition}
-Let \textbf{extend}$(\mathfrak{m},(u,v))$ denote the function $f : \mathfrak{D}(\mathfrak{m})\cup\{u\}\longrightarrow\mathfrak{R}(\mathfrak{m})\cup\{v\}$, for which $\forall w\in \mathfrak{D}(\mathfrak{m}) : \mathfrak{m}(w)=f(w)$ and $f(u)=v$ holds. Where $u\in V_1\setminus\mathfrak{D}(\mathfrak{m})$ and $v\in V_2\setminus\mathfrak{R}(\mathfrak{m})$, otherwise $extend(\mathfrak{m},(u,v))$ is undefined.
+Let \textbf{extend}$(\mathfrak{m},(u,v))$ denote the function $f : \mathfrak{D}(\mathfrak{m})\cup\{u\}\longrightarrow\mathfrak{R}(\mathfrak{m})\cup\{v\}$, for which $\forall w\in \mathfrak{D}(\mathfrak{m}) : f(w)=\mathfrak{m}(w)$ and $f(u)=v$ holds, where $u\in V_1\setminus\mathfrak{D}(\mathfrak{m})$ and $v\in V_2\setminus\mathfrak{R}(\mathfrak{m})$; otherwise $extend(\mathfrak{m},(u,v))$ is undefined.
 \end{definition}
 \begin{notation}
 Throughout the paper, $\mathbf{PT}$ denotes a generic problem type
-which can be substituted by any of the $\mathbf{ISO}$, $\mathbf{SUB}$
+which can be substituted by any of the $\mathbf{SUB}$, $\mathbf{IND}$
-and $\mathbf{IND}$ problems.
+and $\mathbf{ISO}$ problems, which stand for the problems mentioned in Section~\ref{sec:CommProb}, respectively.
 \end{notation}
 \begin{definition}
-Let $\mathfrak{m}$ be a mapping. A logical function $\mathbf{Cons_{PT}}$ is a
+Let $\mathfrak{m}$ be a mapping. The \textbf{consistency function for } $\mathbf{PT}$ is a logical function, for which $\mathbf{Cons_{PT}}(\mathfrak{m})$ is true if and only if $\mathfrak{m}$ satisfies the requirements of $\mathbf{PT}$ considering the subgraphs of $G_{1}$ and $G_{2}$ induced by $\mathfrak{D}(\mathfrak{m})$ and $\mathfrak{R}(\mathfrak{m})$, respectively.
-\textbf{consistency function by } $\mathbf{PT}$ if the following
-holds. If there exists a whole mapping $w$ satisfying the requirements of $PT$, for which $\mathfrak{m}$ is exactly $w$ restricted to $\mathfrak{D}(\mathfrak{m})$.
+%$\mathbf{Cons_{PT}}(\mathfrak{m})$ is true if there exists a whole mapping $w$ satisfying the requirements of $PT$, for which $\mathfrak{m}$ is exactly $w$ restricted to $\mathfrak{D}(\mathfrak{m})$.
 \end{definition}
 \begin{definition}
 Let $\mathfrak{m}$ be a mapping. A logical function $\mathbf{Cut_{PT}}$ is a
-\textbf{cutting function by } $\mathbf{PT}$ if the following
+\textbf{cutting function for } $\mathbf{PT}$ if the following
 holds. $\mathbf{Cut_{PT}(\mathfrak{m})}$ is false if there exists a sequence of extend operations, which results in a whole mapping satisfying the requirements of $PT$.
 \end{definition}
 \begin{definition}
-$\mathfrak{m}$ is said to be \textbf{consistent mapping by} $\mathbf{PT}$ if
+A mapping $\mathfrak{m}$ is said to be a \textbf{consistent mapping by} $\mathbf{PT}$ if
 $Cons_{PT}(\mathfrak{m})$ is true.
 \end{definition}
 $Cons_{PT}$ and $Cut_{PT}$ will often be used in the following form.
 \begin{notation}
 $Cons_{PT}$ will be used to check the consistency of the already
 covered nodes, while $Cut_{PT}$ is for looking ahead to recognize if
 no whole consistent mapping can contain the current mapping.
 \subsection{Overview of the algorithm}
-VF2 uses a state space representation of mappings, $Cons_{PT}$ for
-excluding inconsistency with the problem type and $Cut_{PT}$ for
+VF2 begins with an empty mapping and gradually extends it with respect to the consistency and cutting functions until a whole mapping is reached.
-pruning the search tree.
+Algorithm~\ref{alg:VF2Pseu} is a high-level description of
-Algorithm~\ref{alg:VF2Pseu} is a high level description of
+the VF2 algorithm. Each state of the matching process can
-the VF2 matching algorithm. Each state of the matching process can
 be associated with a mapping $\mathfrak{m}$. The initial state
 is associated with a mapping $\mathfrak{m}$, for which
 $\mathfrak{D}(\mathfrak{m})=\emptyset$, i.e. it starts with an empty mapping.
 \Procedure{VF2}{Mapping $\mathfrak{m}$, ProblemType $PT$}
 \If{$\mathfrak{m}$ covers
 $V_{1}$} \State Output($\mathfrak{m}$)
 \Else
-\State Compute the set $P_\mathfrak{m}$ of the pairs candidate for inclusion
+\State Compute the set $P_\mathfrak{m}$ of the candidate pairs for extending $\mathfrak{m}$ \ForAll{$p\in{P_\mathfrak{m}}$} \If{Cons$_{PT}$($p,\mathfrak{m}$) $\wedge$
-in $\mathfrak{m}$ \ForAll{$p\in{P_\mathfrak{m}}$} \If{Cons$_{PT}$($p,\mathfrak{m}$) $\wedge$
 $\neg$Cut$_{PT}$($p,\mathfrak{m}$)}
 \State \textbf{call}
 VF2($extend(\mathfrak{m},p)$, $PT$) \EndIf \EndFor \EndIf \EndProcedure
 \end{algorithmic}
 \end{algorithm}
 For the current mapping $\mathfrak{m}$, the algorithm computes $P_\mathfrak{m}$, the set of
-candidate node pairs for adding to the current mapping $\mathfrak{m}_s$.
+candidate node pairs for extending the current mapping $\mathfrak{m}$.
 For each pair $p$ in $P_\mathfrak{m}$, $Cons_{PT}(p,\mathfrak{m})$ and
 $Cut_{PT}(p,\mathfrak{m})$ are evaluated. If the former is true and
 the latter is false, the whole process is recursively applied to
 $extend(\mathfrak{m},p)$. Otherwise, $extend(\mathfrak{m},p)$ is not consistent by $PT$, or it
 can be proved that it cannot be extended to a whole mapping.
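The recursion described above can be sketched in a few lines of Python. This is only an illustrative sketch under simplifying assumptions, not the LEMON implementation: the names \texttt{vf2}, \texttt{cons}, \texttt{cut} and \texttt{edge\_preserving} are hypothetical, graphs are plain adjacency dictionaries, and the candidate set is simplified to pairing the next node of a fixed order of $V_{1}$ with every uncovered node of $V_{2}$.

```python
def vf2(order, nodes2, cons, cut, m=None, out=None):
    """Enumerate whole mappings by depth-first extension of partial mappings.

    order  -- nodes of G1 in the order in which they are covered (assumption)
    nodes2 -- nodes of G2
    cons   -- cons(u, v, m): True if extending m by (u, v) stays consistent
    cut    -- cut(u, v, m): True only if the extension cannot reach a whole mapping
    """
    m = {} if m is None else m
    out = [] if out is None else out
    if len(m) == len(order):            # m covers V1: report a copy of it
        out.append(dict(m))
        return out
    u = order[len(m)]                   # next node of G1 to cover
    used = set(m.values())
    for v in nodes2:
        if v in used:                   # keep the mapping injective
            continue
        if cons(u, v, m) and not cut(u, v, m):
            m[u] = v                    # extend(m, (u, v))
            vf2(order, nodes2, cons, cut, m, out)
            del m[u]                    # backtrack
    return out


# Toy instance: embed a single edge {0, 1} into a triangle, ignoring labels.
adj1 = {0: {1}, 1: {0}}
adj2 = {"a": {"b", "c"}, "b": {"a", "c"}, "c": {"a", "b"}}


def edge_preserving(u, v, m):
    # Every covered neighbour of u must be mapped to a neighbour of v.
    return all(m[w] in adj2[v] for w in adj1[u] if w in m)


# No look-ahead is used here: the cutting function is constantly False.
embeddings = vf2([0, 1], sorted(adj2), edge_preserving, lambda u, v, m: False)
```

With a constantly false cutting function the search is still correct, only slower; any valid cutting function merely removes branches that cannot lead to a whole mapping.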
-In order to make sure of the correctness, see
+The correctness of the procedure follows from Claim~\ref{claim:consMapps}.
-\begin{claim}
+\begin{claim}\label{claim:consMapps}
 Through consistent mappings, only consistent whole mappings can be
 reached, and all the consistent whole mappings are reachable through
 consistent mappings.
 \end{claim}
 Note that a mapping may be reached in exponentially many different ways, since the
 order of extensions does not influence the nascent mapping.
-However, one may observe
+However, one may make the following observations.
+%\begin{claim}
+%\label{claim:claimTotOrd}
+%Let $\prec$ be an arbitrary total ordering relation on $V_{1}$.  If
+%the algorithm ignores each $p=(u,v) \in P_\mathfrak{m}$, for which
+%\begin{center}
+%$\exists (\tilde{u},\tilde{v})\in P_\mathfrak{m}: \tilde{u} \prec u$,
+%\end{center}
+%then no mapping can be reached more than once, and each whole mapping %remains reachable.
+%\end{claim}
+\begin{definition}
+A total order $(u_{\sigma(1)},u_{\sigma(2)},..,u_{\sigma(|V_{1}|)})$ of
+$V_{1}$ is a \textbf{matching order} if VF2 can cover $u_{\sigma(d)}$ on the $d$-th level for all $d\in\{1,..,|V_{1}|\}$.
+\end{definition}
 \begin{claim}
 \label{claim:claimTotOrd}
-Let $\prec$ be an arbitrary total ordering relation on $V_{1}$.  If
+If VF2 is prescribed to cover the nodes of $G_1$ according to a matching order, then no mapping can be reached more than once and each whole mapping remains reachable.
-the algorithm ignores each $p=(u,v) \in P_\mathfrak{m}$, for which
-\begin{center}
-$\exists (\tilde{u},\tilde{v})\in P_\mathfrak{m}: \tilde{u} \prec u$,
-\end{center}
-then no mapping can be reached more than once, and each whole mapping remains reachable.
 \end{claim}
-Note that the cornerstone of the improvements to VF2 is a proper
+Note that the cornerstone of the improvements to VF2 is to choose a proper
-choice of a total ordering.
+matching order.
 \subsection{The candidate set}
 \label{candidateComputingVF2}
 Let $P_\mathfrak{m}$ be the set of the candidate pairs for inclusion in $\mathfrak{m}$.
 \begin{notation}
 Let $\mathbf{T_{1}(\mathfrak{m})}:=\{u \in V_{1}\backslash\mathfrak{D}(\mathfrak{m}) : \exists \tilde{u}\in{\mathfrak{D}(\mathfrak{m}): (u,\tilde{u})\in E_{1}}\}$, and
 $\mathbf{T_{2}(\mathfrak{m})} := \{v \in V_{2}\backslash\mathfrak{R}(\mathfrak{m}) : \exists\tilde{v}\in{\mathfrak{R}(\mathfrak{m}):(v,\tilde{v})\in E_{2}}\}$.
 \end{notation}
-The set $P_\mathfrak{m}$ includes the pairs of uncovered neighbours of covered
+The set $P_\mathfrak{m}$ contains the pairs of uncovered neighbours of covered
 nodes, and if there is no such node pair, all the pairs containing
 two uncovered nodes are added. Formally, let
 \[
 P_\mathfrak{m}\!=\!
 \begin{cases}
 &\hspace{-0.15cm}\text{otherwise}.
 \end{cases}
 \]
 \subsection{Consistency}
-Suppose $p=(u,v)$, where $u\in V_{1}$ and $v\in V_{2}$, $\mathfrak{m}$ is a consistent mapping by
+Let $p=(u,v)\in V_{1}\times V_{2}$, and suppose $\mathfrak{m}$ is a consistent mapping by
 $PT$. $Cons_{PT}(p,\mathfrak{m})$ checks whether
-including pair $p$ into $\mathfrak{m}$ leads to a consistent mapping by $PT$.
+adding pair $p$ to $\mathfrak{m}$ leads to a consistent mapping by $PT$.
-For example, the consistency function of induced subgraph isomorphism is as follows.
+For example, the consistency function of the induced subgraph isomorphism problem is the following.
 \begin{notation}
 Let $\mathbf{\Gamma_{1} (u)}:=\{\tilde{u}\in V_{1} :
 (u,\tilde{u})\in E_{1}\}$, and $\mathbf{\Gamma_{2}
-(v)}:=\{\tilde{v}\in V_{2} : (v,\tilde{v})\in E_{2}\}$, where $u\in V_{1}$ and $v\in V_{2}$.
+(v)}:=\{\tilde{v}\in V_{2} : (v,\tilde{v})\in E_{2}\}$, where $u\in V_{1}$ and $v\in V_{2}$. That is, $\mathbf{\Gamma_{i} (w)}$ denotes the set of neighbours of node $w$ in $G_i$ $(i=1,2)$.
 \end{notation}
-$extend(\mathfrak{m},(u,v))$ is a consistent mapping by $IND$ $\Leftrightarrow
-(\forall \tilde{u}\in \mathfrak{D}(\mathfrak{m}): (u,\tilde{u})\in E_{1}
-\Leftrightarrow (v,\mathfrak{m}(\tilde{u}))\in E_{2})$. The
-following formulation gives an efficient way of calculating
-$Cons_{IND}$.
 \begin{claim}
-$Cons_{IND}((u,v),\mathfrak{m}):=\mathcal{L}(u)\!\!=\!\!\mathcal{L}(v)\wedge(\forall \tilde{v}\in \Gamma_{2}(v)\cap\mathfrak{R}(\mathfrak{m}):(u,\mathfrak{m}^{-1}(\tilde{v}))\in E_{1})\wedge
+$extend(\mathfrak{m},(u,v))$ is a consistent mapping by $IND$ if and only if $\mathfrak{m}$ is consistent and $(\forall \tilde{u}\in \mathfrak{D}(\mathfrak{m}): (u,\tilde{u})\in E_{1}
+\Leftrightarrow (v,\mathfrak{m}(\tilde{u}))\in E_{2})$.
+\end{claim}
+The following formulation gives an efficient way of calculating $Cons_{IND}$.
+\begin{claim}
+$Cons_{IND}((u,v),\mathfrak{m}):=Cons_{IND}(\mathfrak{m})\wedge\mathcal{L}(u)\!\!=\!\!\mathcal{L}(v)\wedge(\forall \tilde{v}\in \Gamma_{2}(v)\cap\mathfrak{R}(\mathfrak{m}):(u,\mathfrak{m}^{-1}(\tilde{v}))\in E_{1})\wedge
 (\forall \tilde{u}\in \Gamma_{1}(u)
-\cap \mathfrak{D}(\mathfrak{m}):(v,\mathfrak{m}(\tilde{u}))\in E_{2})$ is a
+\cap \mathfrak{D}(\mathfrak{m}):(v,\mathfrak{m}(\tilde{u}))\in E_{2})$ is the consistency function for $IND$.
-consistency function in the case of $IND$.
 \end{claim}
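As an illustration, the local part of the check above can be written as follows. This is a minimal sketch assuming that $\mathfrak{m}$ is already consistent and that graphs are stored as dictionaries mapping each node to the set of its neighbours; the function name \texttt{cons\_ind} and the argument layout are hypothetical.

```python
def cons_ind(u, v, m, adj1, adj2, label1, label2):
    """Return True if extending the consistent mapping m by (u, v) stays consistent for IND."""
    if label1[u] != label2[v]:
        return False
    inv = {y: x for x, y in m.items()}          # inverse mapping m^{-1}
    # Every covered neighbour of v must be the image of a neighbour of u.
    if any(inv[w] not in adj1[u] for w in adj2[v] if w in inv):
        return False
    # Every covered neighbour of u must be mapped to a neighbour of v.
    return all(m[w] in adj2[v] for w in adj1[u] if w in m)


# G1 is the path 0-1-2; it is an induced subgraph of a path but not of a triangle.
path1 = {0: {1}, 1: {0, 2}, 2: {1}}
path2 = {"a": {"b"}, "b": {"a", "c"}, "c": {"b"}}
triangle = {"a": {"b", "c"}, "b": {"a", "c"}, "c": {"a", "b"}}
lab1 = {x: "red" for x in path1}
lab2 = {x: "red" for x in path2}

ok_on_path = cons_ind(2, "c", {0: "a", 1: "b"}, path1, path2, lab1, lab2)
ok_on_triangle = cons_ind(2, "c", {0: "a", 1: "b"}, path1, triangle, lab1, lab2)
```

Only the neighbourhoods of $u$ and $v$ are inspected, so the cost of one check is proportional to $deg(u)+deg(v)$ plus the cost of building the inverse mapping, which a real implementation would maintain incrementally.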
 \subsection{Cutting rules}
 $Cut_{PT}(p,\mathfrak{m})$ is defined by a collection of efficiently
 verifiable conditions. The requirement is that $Cut_{PT}(p,\mathfrak{m})$ can
 be true only if it is impossible to extend $extend(\mathfrak{m},p)$ to a
 whole mapping.
-As an example, the cutting function of induced subgraph isomorphism is presented.
+As an example, a cutting function of the induced subgraph isomorphism problem is presented.
 \begin{notation}
 Let $\mathbf{\tilde{T}_{1}}(\mathfrak{m}):=(V_{1}\backslash
 \mathfrak{D}(\mathfrak{m}))\backslash T_{1}(\mathfrak{m})$, and
 \\ $\mathbf{\tilde{T}_{2}}(\mathfrak{m}):=(V_{2}\backslash
 \mathfrak{R}(\mathfrak{m}))\backslash T_{2}(\mathfrak{m})$.
 \begin{claim}
 $Cut_{IND}((u,v),\mathfrak{m}):= |\Gamma_{2} (v)\ \cap\ T_{2}(\mathfrak{m})| <
 |\Gamma_{1} (u)\ \cap\ T_{1}(\mathfrak{m})| \vee |\Gamma_{2}(v)\cap
 \tilde{T}_{2}(\mathfrak{m})| < |\Gamma_{1}(u)\cap
-\tilde{T}_{1}(\mathfrak{m})|$ is a cutting function by $IND$.
+\tilde{T}_{1}(\mathfrak{m})|$ is a cutting function for $IND$.
 \end{claim}
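The two cardinality comparisons can be sketched as below. This is an illustrative sketch only: \texttt{cut\_ind} is a hypothetical name, graphs are assumed to be dictionaries mapping each node to the set of its neighbours, and the sets $T_{i}$ and $\tilde{T}_{i}$ are recomputed from scratch for clarity, whereas an efficient implementation would maintain them incrementally.

```python
def cut_ind(u, v, m, adj1, adj2):
    """Return True if extending m by (u, v) provably cannot lead to a whole IND mapping."""
    dom, rng = set(m), set(m.values())
    # T_1(m), T_2(m): uncovered neighbours of covered nodes
    t1 = {x for w in dom for x in adj1[w]} - dom
    t2 = {y for w in rng for y in adj2[w]} - rng
    # \tilde{T}_1(m), \tilde{T}_2(m): the remaining uncovered nodes
    rest1 = set(adj1) - dom - t1
    rest2 = set(adj2) - rng - t2
    return (len(adj2[v] & t2) < len(adj1[u] & t1)
            or len(adj2[v] & rest2) < len(adj1[u] & rest1))


# The centre of a 3-star cannot be mapped to the middle node of a path of three nodes.
star = {0: {1, 2, 3}, 1: {0}, 2: {0}, 3: {0}}
path = {"a": {"b"}, "b": {"a", "c"}, "c": {"b"}}
pruned = cut_ind(0, "b", {}, star, path)

# Mapping an endpoint of an edge to an endpoint of the path is not pruned.
edge = {0: {1}, 1: {0}}
kept = cut_ind(0, "a", {}, edge, path)
```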
 \section{The VF2++ Algorithm}\label{sec:VF2ppAlg}
-Although any total ordering relation makes the search space of VF2 a
+Although any matching order makes the search space of VF2 a
 tree, its choice turns out to dramatically influence the number of
 visited states. The goal is to determine an efficient one as quickly
 as possible.
-The main reason for VF2++' superiority over VF2 is twofold. Firstly,
+The main reason for the superiority of VF2++ over VF2 is twofold. Firstly,
-taking into account the structure and the node labeling of the graph,
+taking into account the structure and the node labelling of the graph,
-VF2++ determines a state order in which most of the unfruitful
+VF2++ determines a matching order in which most of the unfruitful
 branches of the search space can be pruned immediately. Secondly,
 introducing more efficient --- nevertheless still easier to compute
 --- cutting rules reduces the chance of going astray even further.
-In addition to the usual subgraph isomorphism, specialized versions
+In addition to the usual subgraph isomorphism problem, specialized versions
-for induced subgraph isomorphism and for graph isomorphism have been
+for induced subgraph and graph isomorphism problems have been
 designed.
-Note that a weaker version of the cutting rules and an efficient
+Note that a weaker version of the cutting rules of VF2++ and an efficient
-candidate set calculating were described in \cite{VF2Plus}.
+candidate set calculation method were described in~\cite{VF2Plus}.
 It should be noted that all the methods described in this section are
 extendable to handle directed graphs and edge labels as well.
 The basic ideas and the detailed description of VF2++ are provided in
 the following.\newline
 The goal is to find a matching order in which the algorithm is able to
 recognize inconsistency or prune the infeasible branches on the
 highest levels and goes deep only if it is needed.
 \begin{notation}
-Let $\mathbf{Conn_{H}(u)}:=|\Gamma_{1}(u)\cap H\}|$, that is the
+Let $\mathbf{Conn_{H}(u)}:=|\Gamma_{1}(u)\cap H|$, that is, the
 number of neighbours of $u$ which are in $H$, where $u\in V_{1} $ and
 $H\subseteq V_{1}$.
 \end{notation}
 The principal question is the following. Suppose a mapping $\mathfrak{m}$ is
 consistent pair in $G_{2}$? The more covered neighbours a node in
 $T_{1}(\mathfrak{m})$ has --- i.e. the larger $Conn_{\mathfrak{D}(\mathfrak{m})}$ it has
 ---, the more rarely satisfiable consistency constraints for its pair
 are given.
-In biology, most of the graphs are sparse, thus several nodes in
+Most of the graphs of biological and chemical structures are sparse, thus several nodes in
 $T_{1}(\mathfrak{m})$ may have the same $Conn_{\mathfrak{D}(\mathfrak{m})}$, which makes
 it reasonable to define a secondary and a tertiary order between them.
 The observation above is so decisive that the
 secondary ordering prefers nodes with the most uncovered neighbours
 among those having the same $Conn_{\mathfrak{D}(\mathfrak{m})}$, in order to increase
-$Conn_{\mathfrak{D}(\mathfrak{m})}$ of uncovered nodes so much, as possible.  The
+$Conn_{\mathfrak{D}(\mathfrak{m})}$ of uncovered nodes as much as possible.  The tertiary ordering prefers nodes having the rarest uncovered labels in $G_2$.
-tertiary ordering prefers nodes having the rarest uncovered labels.
+Note that the secondary ordering is the same as ordering by degrees,
-Note that the secondary ordering is the same as the ordering by $deg$,
 which is static data, in contrast to the measures used above.
 These rules can easily result in a matching order which contains the
 nodes of a long path successively, whose nodes may have low $Conn$ and
-is easily matchable into $G_{2}$. To avoid that, a BFS order is
+is easily matchable into $G_{2}$. To mitigate this, a breadth-first search (BFS) order is used, and on each of its levels, the ordering procedure described above is applied.
-used, which provides the shortest possible paths.
 \newline
 In the following, some examples on which VF2 may be slow are
 described, although they are easily solvable by using a proper
 matching order.
 $\mathcal{L}(\tilde{u}):=red \ \forall \tilde{u}\in V_{1}\backslash
 \{u\}$
 \newline
 $\mathcal{L}(\tilde{v}):=red \ \forall \tilde{v}\in V_{2}\backslash
 \{v\}$
-\newline
 Now, any mapping by $\mathcal{L}$ must contain $(u,v)$, since
 $u$ is black and no node in $V_{2}$ has a black label except
 $v$. If, unfortunately, $u$ were the last node to be covered,
 VF2 would check only in the last steps, whether $u$ can be matched to
 $v$.
-\newline
 However, had $u$ been the first matched node, $u$ would have been
 matched immediately to $v$, so all the mappings would have been
 precluded in which node labels cannot correspond.
 \end{example}
 \begin{example}
-Suppose there is no node label given, $G_{1}$ is a small graph and
+Suppose no node labels are given, $G_{1}$ is a small graph that cannot be mapped into $G_{2}$, and $u\in V_{1}$.
-can not be mapped into $G_{2}$ and $u\in V_{1}$.
 \newline
 Let $G'_{1}:=(V_{1}\cup
 \{u'_{1},u'_{2},..,u'_{k}\},E_{1}\cup
 \{(u,u'_{1}),(u'_{1},u'_{2}),..,(u'_{k-1},u'_{k})\})$, that is,
 $G'_{1}$ is $G_{1}$ extended by a path on $k$ new nodes, which is disjoint
 from $G_{1}$ and one of whose endpoints is connected to $u\in
 V_{1}$.
-\newline
-Is there a subgraph of $G_{2}$, which is isomorph with
-$G'_{1}$?
-\newline
 If, unfortunately, the nodes of the path were the first $k$ nodes in the
 matching order, the algorithm would iterate through all the possible
 paths of $k$ nodes in $G_{2}$, and it would recognize that no path can be
 extended to $G'_{1}$.
 \newline
 \end{example}
 These examples may look artificial, but the same problems also appear
 in real-world instances, even though in a less obvious way.
-\subsection{Preparations}
+%\subsection{Preparations}
-\begin{claim}
+%\begin{claim}
-\label{claim:claimCoverFromLeft}
+%\label{claim:claimCoverFromLeft}
-The total ordering relation uniquely determines a node order, in which
+%The total ordering relation uniquely determines a node order, in which
-the nodes of $V_{1}$ will be covered by VF2. From the point of
+%the nodes of $V_{1}$ will be covered by VF2. From the point of
-view of the matching procedure, this means, that always the same node
+%view of the matching procedure, this means, that always the same node
-of $G_{1}$ will be covered on the d-th level.
+%of $G_{1}$ will be covered on the $d$-th level.
-\end{claim}
+%\end{claim}
-\begin{definition}
+%\begin{definition}
-An order $(u_{\sigma(1)},u_{\sigma(2)},..,u_{\sigma(|V_{1}|)})$ of
+%An order $(u_{\sigma(1)},u_{\sigma(2)},..,u_{\sigma(|V_{1}|)})$ of
-$V_{1}$ is \textbf{matching order} if exists $\prec$ total
+%$V_{1}$ is \textbf{matching order} if there exists $\prec$ total
-ordering relation, s.t. the VF2 with $\prec$ on the d-th level finds
+%ordering relation, s.t. the VF2 with $\prec$ on the d-th level finds
-pair for $u_{\sigma(d)}$ for all $d\in\{1,..,|V_{1}|\}$.
+%pair for $u_{\sigma(d)}$ for all $d\in\{1,..,|V_{1}|\}$.
-\end{definition}
+%\end{definition}
-\begin{claim}\label{claim:MOclaim}
+%\begin{claim}\label{claim:MOclaim}
-A total ordering is matching order iff the nodes of every component
+%A total ordering is matching order iff the nodes of every component
-form an interval in the node sequence, and every node connects to a
+%form an interval in the node sequence, and every node connects to a
-previous node in its component except the first node of each component.
+%previous node in its component except the first node of each component.
-\end{claim}
+%\end{claim}
-To summing up, a total ordering always uniquely determines a matching
+%In summary, a total ordering always uniquely determines a matching
-order, and every matching order can be determined by a total ordering,
+%order, and every matching order can be determined by a total ordering,
-however, more than one different total orderings may determine the
+%however, more than one different total orderings may determine the
-same matching order.
+%same matching order.
-\subsection{Total ordering}
+\subsection{Matching order}
-The matching order will be searched directly.
 \begin{notation}
 Let \textbf{F$_\mathcal{M}$(l)}$:=|\{v\in V_{2} :
-l=\mathcal{L}(v)\}|-|\{u\in V_{1}\backslash \mathcal{M} : l=\mathcal{L}(u)\}|$ ,
+l=\mathcal{L}(v)\}|-|\{u\in \mathcal{M} : l=\mathcal{L}(u)\}|$,
 where $l$ is a label and $\mathcal{M}\subseteq V_{1}$.
 \end{notation}
-\begin{definition}Let $\mathbf{arg\ max}_{f}(S) :=\{u\in S : f(u)=max_{v\in S}\{f(v)\}\}$ and $\mathbf{arg\ min}_{f}(S) := arg\ max_{-f}(S)$, where $S$ is a finite set and $f:S\longrightarrow \mathbb{R}$.
+\begin{definition}Let $\mathbf{arg\ max}_{f}(S) :=\{u\in S : f(u)=max_{v\in S}\{f(v)\}\}$ and $\mathbf{arg\ min}_{f}(S) := arg\ max_{(-f)}(S)$, where $S$ is a finite set and $f:S\longrightarrow \mathbb{R}$.
 \end{definition}
-\begin{algorithm}
+\begin{notation}
+Let $deg(v)$ denote the degree of node $v$.
+\end{notation}
+\begin{algorithm}[H]
 \algtext*{EndIf}
 \algtext*{EndProcedure}
 \algtext*{EndWhile}
 \algtext*{EndFor}
 \caption{\hspace{0.5cm}$The\ method\ of\ VF2++\ for\ determining\ the\ node\ order$}\label{alg:VF2PPPseu}
 \Comment{matching order} \While{$V_{1}\backslash \mathcal{M}
 \neq\emptyset$} \State $r\in$ arg max$_{deg}$ (arg
 min$_{F_\mathcal{M}\circ \mathcal{L}}(V_{1}\backslash
 \mathcal{M})$)\label{alg:findMin} \State Compute $T$, a BFS tree with
 root node $r$.  \For{$d=0,1,...,depth(T)$} \State $V_d$:=nodes of the
-$d$-th level \State Process $V_d$ \Comment{See Algorithm
+$d$-th level \State Process $V_d$ \Comment{See Algorithm~\ref{alg:VF2PPProcess1}} \EndFor
-\ref{alg:VF2PPProcess1}} \EndFor
 \EndWhile \EndProcedure
 \end{algorithmic}
 \end{algorithm}
 \begin{algorithm}
 \algtext*{EndProcedure}%do not print ..
 \algtext*{EndWhile}
 \caption{\hspace{.5cm}$The\ method\ for\ processing\ a\ level\ of\ the\ BFS\ tree$}\label{alg:VF2PPProcess1}
 \begin{algorithmic}[1]
 \Procedure{VF2++ProcessLevel}{$V_{d}$} \While{$V_d\neq\emptyset$}
-\State $m\in$ arg min$_{F_{\mathcal{M}\circ\ \mathcal{L}}}($ arg max$_{deg}($arg
+\State $m\in$ arg min$_{F_\mathcal{M}\circ\ \mathcal{L}}($ arg max$_{deg}($arg
 max$_{Conn_{\mathcal{M}}}(V_{d})))$ \State $V_d:=V_d\backslash m$
 \State Append node $m$ to the end of $\mathcal{M}$ \State Refresh
 $F_\mathcal{M}$ \EndWhile \EndProcedure
 \end{algorithmic}
 \end{algorithm}
-Algorithm~\ref{alg:VF2PPPseu} is a high level description of the
+Algorithm~\ref{alg:VF2PPPseu} is a high-level description of the
 matching order procedure of VF2++. It computes a BFS tree for each
 component in ascending order of their rarest node labels and largest $deg$,
-whose root vertex is the component's minimal
+whose root vertex is the minimal node of its component. Algorithm~\ref{alg:VF2PPProcess1} is a method to process a level of the BFS tree, which appends the nodes of the current level in descending
-node. Algorithm~\ref{alg:VF2PPProcess1} is a method to process a level of the BFS tree, which appends the nodes of the current level in descending
 lexicographic order by $(Conn_{\mathcal{M}},deg,-F_\mathcal{M})$ separately
 to $\mathcal{M}$, and refreshes $F_\mathcal{M}$ immediately.
-Claim~\ref{claim:MOclaim} shows that Algorithm~\ref{alg:VF2PPPseu}
+\begin{claim}
-provides a matching order.
+Algorithm~\ref{alg:VF2PPPseu} provides a matching order.
+\end{claim}
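The two procedures above can be illustrated by a minimal C++ sketch. It is not the LEMON implementation: the function name \texttt{matchingOrder}, the adjacency-list type \texttt{Graph} and the integer label encoding are assumptions made for this example, and $Conn_{\mathcal{M}}(u)$ is assumed to be the number of neighbours of $u$ already in $\mathcal{M}$. Ties are broken by taking the first candidate.

```cpp
#include <algorithm>
#include <tuple>
#include <utility>
#include <vector>

using Graph = std::vector<std::vector<int>>;  // adjacency lists (assumed representation)

// Sketch of the node order computation: returns the nodes of G_1 in matching order.
// label1/label2 hold the labels of V_1/V_2 encoded as integers 0..labelCount-1.
std::vector<int> matchingOrder(const Graph& g1, const std::vector<int>& label1,
                               const std::vector<int>& label2, int labelCount) {
    const int n = static_cast<int>(g1.size());
    std::vector<int> F(labelCount, 0);  // F_M: initially the label counts in V_2
    for (int l : label2) ++F[l];
    std::vector<int> conn(n, 0);        // assumed Conn_M: neighbours already in M
    std::vector<bool> reached(n, false);
    std::vector<int> order;             // the order M being built
    auto deg = [&](int u) { return static_cast<int>(g1[u].size()); };

    while (static_cast<int>(order.size()) < n) {
        // Root: rarest label (smallest F_M), ties broken by largest degree.
        int r = -1;
        for (int u = 0; u < n; ++u) {
            if (reached[u]) continue;
            if (r == -1 || std::make_tuple(F[label1[u]], -deg(u)) <
                               std::make_tuple(F[label1[r]], -deg(r)))
                r = u;
        }
        std::vector<int> level{r};  // V_0
        reached[r] = true;
        while (!level.empty()) {
            // Collect the next BFS level before the current one is consumed.
            std::vector<int> next;
            for (int u : level)
                for (int w : g1[u])
                    if (!reached[w]) { reached[w] = true; next.push_back(w); }
            // Process V_d: repeatedly take the lexicographic maximum of
            // (Conn_M, deg, -F_M) and refresh F_M and Conn_M immediately.
            while (!level.empty()) {
                auto best = std::max_element(
                    level.begin(), level.end(), [&](int a, int b) {
                        return std::make_tuple(conn[a], deg(a), -F[label1[a]]) <
                               std::make_tuple(conn[b], deg(b), -F[label1[b]]);
                    });
                const int m = *best;
                level.erase(best);
                order.push_back(m);
                --F[label1[m]];
                for (int w : g1[m]) ++conn[w];
            }
            level = std::move(next);
        }
    }
    return order;
}
```

For clarity the sketch keeps each level in a separate vector; the in-place variant based on swapping nodes is not shown.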
 \subsection{Cutting rules}
 \label{VF2PPCuttingRules}
 This section presents the cutting rules of VF2++, which are improved by using extra information coming from the node labels.
 $\mathbf{\Gamma_{2}^{l}(v)}:=\{\tilde{v} : \mathcal{L}(\tilde{v})=l \wedge
 \tilde{v}\in \Gamma_{2} (v)\}$, where $u\in V_{1}$, $v\in
 V_{2}$ and $l$ is a label.
 \end{notation}
-\subsubsection{Induced subgraph isomorphism}
+\begin{claim}[Cutting function for ISO]
-\begin{claim}
+\[LabCut_{ISO}((u,v),\mathfrak{m}):=\bigvee_{l\ is\ label}|\Gamma_{2}^{l} (v) \cap T_{2}(\mathfrak{m})|\!\neq\!|\Gamma_{1}^{l}(u)\cap T_{1}(\mathfrak{m})|\  \vee\]\[\bigvee_{l\ is\ label} \newline |\Gamma_{2}^{l}(v)\cap \tilde{T}_{2}(\mathfrak{m})| \neq |\Gamma_{1}^{l}(u)\cap \tilde{T}_{1}(\mathfrak{m})|\] is a cutting function for ISO.
-\[LabCut_{IND}((u,v),\mathfrak{m}):=\bigvee_{l\ is\ label}|\Gamma_{2}^{l} (v) \cap T_{2}(\mathfrak{m})|\!<\!|\Gamma_{1}^{l}(u)\cap T_{1}(\mathfrak{m})|\ \vee\]\[\bigvee_{l\ is\ label} \newline |\Gamma_{2}^{l}(v)\cap \tilde{T}_{2}(\mathfrak{m})| < |\Gamma_{1}^{l}(u)\cap \tilde{T}_{1}(\mathfrak{m})|\] is a cutting function by IND.
 \end{claim}
-\subsubsection{Graph isomorphism}
-\begin{claim}
+\begin{claim}[Cutting function for IND]
-\[LabCut_{ISO}((u,v),\mathfrak{m}):=\bigvee_{l\ is\ label}|\Gamma_{2}^{l} (v) \cap T_{2}(\mathfrak{m})|\!\neq\!|\Gamma_{1}^{l}(u)\cap T_{1}(\mathfrak{m})|\  \vee\]\[\bigvee_{l\ is\ label} \newline |\Gamma_{2}^{l}(v)\cap \tilde{T}_{2}(\mathfrak{m})| \neq |\Gamma_{1}^{l}(u)\cap \tilde{T}_{1}(\mathfrak{m})|\] is a cutting function by ISO.
+\[LabCut_{IND}((u,v),\mathfrak{m}):=\bigvee_{l\ is\ label}|\Gamma_{2}^{l} (v) \cap T_{2}(\mathfrak{m})|\!<\!|\Gamma_{1}^{l}(u)\cap T_{1}(\mathfrak{m})|\ \vee\]\[\bigvee_{l\ is\ label} \newline |\Gamma_{2}^{l}(v)\cap \tilde{T}_{2}(\mathfrak{m})| < |\Gamma_{1}^{l}(u)\cap \tilde{T}_{1}(\mathfrak{m})|\] is a cutting function for IND.
 \end{claim}
-\subsubsection{Subgraph isomorphism}
+\begin{claim}[Cutting function for SUB]
-\begin{claim}
+\[LabCut_{SU\!B}((u,v),\mathfrak{m}):=\bigvee_{l\ is\ label}|\Gamma_{2}^{l} (v) \cap T_{2}(\mathfrak{m})|\!<\!|\Gamma_{1}^{l}(u)\cap T_{1}(\mathfrak{m})|\] is a cutting function for SUB.
-\[LabCut_{SU\!B}((u,v),\mathfrak{m}):=\bigvee_{l\ is\ label}|\Gamma_{2}^{l} (v) \cap T_{2}(\mathfrak{m})|\!<\!|\Gamma_{1}^{l}(u)\cap T_{1}(\mathfrak{m})|\] is a cutting function by SUB.
 \end{claim}
 \section{Implementation details}\label{sec:VF2ppImpl}
 This section provides a detailed summary of an efficient
 implementation of VF2++.
+\begin{notation}
+Let $\Delta_1$ and $\Delta_2$ denote the largest degree in $G_1$ and $G_2$, respectively, and let $\Delta=\max\{\Delta_1,\Delta_2\}$.
+\end{notation}
 \subsection{Storing a mapping}
 After fixing an arbitrary node order ($u_0, u_1, ..,
-u_{|G_{1}|-1}$) of $G_{1}$, an array $M$ is usable to store
+u_{|V_{1}|-1}$) of $G_{1}$, an array $M$ can be used to store
 the current mapping in the following way.
 \[
 M[i] =
 \begin{cases}
 v & if\ (u_i,v)\ is\ in\ the\ mapping\\ INV\!ALI\!D &
 if\ no\ node\ has\ been\ mapped\ to\ u_i,
 \end{cases}
 \]
-where $i\in\{0,1, ..,|G_{1}|-1\}$, $v\in V_{2}$ and $INV\!ALI\!D$
+where $i\in\{0,1, ..,|V_{1}|-1\}$, $v\in V_{2}$ and $INV\!ALI\!D$
 means ``no node''.
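As an illustration only, the array $M$ could be realized as follows. The sentinel value $-1$ for $INV\!ALI\!D$, the integer node identifiers and the type name \texttt{Mapping} are assumptions of this sketch, not part of the described implementation.

```cpp
#include <cstddef>
#include <vector>

// Assumed sentinel for "no node"; the nodes of G_2 are assumed to be 0..|V_2|-1.
constexpr int INVALID = -1;

// The mapping is an array indexed by the position i of u_i in the fixed node order of G_1.
struct Mapping {
    std::vector<int> M;

    explicit Mapping(std::size_t n1) : M(n1, INVALID) {}

    void add(std::size_t i, int v) { M[i] = v; }        // put (u_i, v) into the mapping
    void remove(std::size_t i) { M[i] = INVALID; }      // u_i becomes uncovered again
    bool covered(std::size_t i) const { return M[i] != INVALID; }
};
```

Both adding and removing a pair take constant time in this representation.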
 \subsection{Avoiding the recursion}
 The recursion of Algorithm~\ref{alg:VF2Pseu} can be realized
 as a \textit{while loop}, which has a loop counter $depth$ denoting the
-all-time depth of the recursion. Fixing a matching order, let $M$
+current depth of the recursion. Fixing a matching order, let $M$
-denote the array storing the all-time mapping. Based on Claim~\ref{claim:claimCoverFromLeft},
+denote the array storing the current mapping. Observe that
 $M$ is $INV\!ALI\!D$ from index $depth$+1 and not $INV\!ALI\!D$ before
 $depth$. $M[depth]$ changes
 while the state is being processed, but the property holds before
 both stepping back to a predecessor state and exploring a successor
 state.
 The necessary part of the candidate set is easily maintainable or
 computable by following
-Section~\ref{candidateComputingVF2}. A much faster method
+the steps described in Section~\ref{candidateComputingVF2}. A much faster method
-has been designed for biological- and sparse graphs, see the next
+has been designed for biological and sparse graphs, see the next
 section for details.
 \subsection{Calculating the candidates for a node}
-Being aware of Claim~\ref{claim:claimCoverFromLeft}, the
+The task is not to maintain the candidate set, but to generate the
-task is not to maintain the candidate set, but to generate the
 candidate nodes in $G_{2}$ for a given node $u\in V_{1}$.  In
 case of any of the three problem types and a mapping $\mathfrak{m}$, if a node $v\in
 V_{2}$ is a potential pair of $u\in V_{1}$, then $\forall
 u'\in \mathfrak{D}(\mathfrak{m}) : (u,u')\in
 E_{1}\Rightarrow (v,\mathfrak{m}(u'))\in
 E_{2}$. That is, each covered neighbour of $u$ has to be mapped to
-a covered neighbour of $v$.
+a covered neighbour of $v$, i.e. selecting arbitrarily a covered neighbour $u'$ of $u$, all of the admissible candidates for $u$ are among the neighbours of $\mathfrak{m}(u')$.
-Having said that, an algorithm running in $\Theta(deg)$ time is
+Having said that, an algorithm running in $\Theta(\Delta_2)$ time is
 describable if there exists a covered node in the component containing
 $u$, and a linear one otherwise.
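The candidate generation described above may be sketched as follows. The names \texttt{candidates}, \texttt{Graph} and \texttt{covered2} are hypothetical, graphs are assumed to be given by adjacency lists, and the label and consistency checks are left out. If $u$ has a covered neighbour $u'$, only the neighbours of $\mathfrak{m}(u')$ are scanned; otherwise all uncovered nodes of $G_{2}$ are listed.

```cpp
#include <vector>

constexpr int INVALID = -1;                   // assumed sentinel for "no node"
using Graph = std::vector<std::vector<int>>;  // adjacency lists (assumed representation)

// M[u'] is the image of u' in G_2 or INVALID; covered2[v] tells whether v is covered.
std::vector<int> candidates(const Graph& g1, const Graph& g2,
                            const std::vector<int>& M,
                            const std::vector<bool>& covered2, int u) {
    std::vector<int> result;
    for (int un : g1[u]) {
        if (M[un] != INVALID) {
            // Any covered neighbour suffices: every admissible candidate
            // is an uncovered neighbour of its image.
            for (int v : g2[M[un]])
                if (!covered2[v]) result.push_back(v);
            return result;
        }
    }
    // No covered neighbour: fall back to a linear scan of G_2.
    for (int v = 0; v < static_cast<int>(g2.size()); ++v)
        if (!covered2[v]) result.push_back(v);
    return result;
}
```

The returned nodes are only potential pairs; each of them still has to pass the consistency and cutting checks.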
 \subsection{Determining the node order}
 For using lookup tables, the node labels are associated with the
 numbers $\{0,1,..,|K|-1\}$, where $K$ is the set of the labels. It
 enables $F_\mathcal{M}$ to be stored in an array. At first, the node order
 $\mathcal{M}=\emptyset$, so $F_\mathcal{M}[i]$ is the number of nodes
-in $V_{1}$ having label $i$, which is easy to compute in
+in $V_{2}$ having label $i$, which is easy to compute in
-$\Theta(|V_{1}|)$ steps.
+$\Theta(|V_{2}|)$ steps.
 Representing $\mathcal{M}\subseteq V_{1}$ as an array of
-size $|V_{1}|$, both the computation of the BFS tree, and processing its levels by Algorithm~\ref{alg:VF2PPProcess1} can be done inplace by swapping nodes.
+size $|V_{1}|$, both the computation of the BFS tree, and processing its levels by Algorithm~\ref{alg:VF2PPProcess1} can be done in-place by swapping nodes.
 \subsection{Cutting rules}
 In Section~\ref{VF2PPCuttingRules}, the cutting rules were
 described using the sets $T_{1}$, $T_{2}$, $\tilde T_{1}$
-and $\tilde T_{2}$, which are dependent on the all-time mapping
+and $\tilde T_{2}$, which are dependent on the current mapping. The aim is to check the labelled cutting
-(i.e. on the all-time state). The aim is to check the labeled cutting
+rules of VF2++ in $\Theta(\Delta)$ time.
-rules of VF2++ in $\Theta(deg)$ time.
 Firstly, suppose that these four sets are given in such a way, that
 checking whether a node is in a certain set takes constant time,
 e.g. they are given by their 0-1 characteristic vectors. Let $L$ be an
 initially zero integer lookup table of size $|K|$. After incrementing
 $L[\mathcal{L}(u')]$ for all $u'\in \Gamma_{1}(u) \cap T_{1}(\mathfrak{m})$ and
 decrementing $L[\mathcal{L}(v')]$ for all $v'\in\Gamma_{2} (v) \cap
-T_{2}(s)$, the first part of the cutting rules is checkable in
+T_{2}(\mathfrak{m})$, the first part of the cutting rules is checkable in
-$\Theta(deg)$ time by considering the proper signs of $L$. Setting $L$
+$\Theta(\Delta)$ time by considering the proper signs of $L$. Setting $L$
-to zero takes $\Theta(deg)$ time again, which makes it possible to use
+to zero takes $\Theta(\Delta)$ time again, which makes it possible to use
 the same table through the whole algorithm. The second part of the
 cutting rules can be verified using the same method with $\tilde
 T_{1}$ and $\tilde T_{2}$ instead of $T_{1}$ and
-$T_{2}$. Thus, the overall complexity is $\Theta(deg)$.
+$T_{2}$. Thus, the overall time complexity is $\Theta(\Delta)$.
-Another integer lookup table storing the number of covered neighbours
+To maintain the sets $T_{1}$, $T_{2}$, $\tilde T_{1}$
-of each node in $G_{2}$ gives all the information about the sets
+and $\tilde T_{2}$, two other integer lookup tables storing the number of covered neighbours of the nodes of the two graphs can be used. This representation allows constant-time membership checking; furthermore, it is maintainable in $\Theta(\Delta)$ time whenever a node pair is added or removed by incrementing
-$T_{2}$ and $\tilde T_{2}$, which is maintainable in
-$\Theta(deg)$ time when a pair is added or substracted by incrementing
 or decrementing the proper indices. A further improvement is that the
 values of $L[\mathcal{L}(u')]$ in case of checking $u$ are dependent only on
-$u$, i.e. on the size of the mapping, so for each $u\in V_{1}$ an
+$u$, i.e. on the current depth of the recursion, so for each $u\in V_{1}$, an array of pairs \textit{(label, number of such labels)} can store $L$. Note that these arrays are at most of size
-array of pairs (label, number of such labels) can be stored to skip
+$\Delta_1$ if pairs with non-appearing node labels are discarded.
-the maintaining operations. Note that these arrays are at most of size
-$deg$.
 Using similar techniques, the consistency function can be evaluated in
-$\Theta(deg)$ steps, as well.
+$\Theta(\Delta)$ steps, as well.
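The lookup-table technique of this subsection can be sketched for the first part of the cutting rules as follows. The function name \texttt{labelCutFirstPart}, the adjacency-list type and the characteristic vectors \texttt{inT1} and \texttt{inT2} are assumptions of this example; the flag \texttt{iso} selects the $\neq$ test of ISO instead of the $<$ test of IND and SUB.

```cpp
#include <vector>

using Graph = std::vector<std::vector<int>>;  // adjacency lists (assumed representation)

// Returns true if the pair (u, v) is cut by the first part of the labelled rule.
// L is an all-zero table of size |K| on entry and is reset to zero before returning.
bool labelCutFirstPart(const Graph& g1, const Graph& g2,
                       const std::vector<int>& label1, const std::vector<int>& label2,
                       const std::vector<bool>& inT1, const std::vector<bool>& inT2,
                       int u, int v, bool iso, std::vector<int>& L) {
    // Afterwards L[l] = |Gamma_1^l(u) cap T_1| - |Gamma_2^l(v) cap T_2|.
    for (int un : g1[u]) if (inT1[un]) ++L[label1[un]];
    for (int vn : g2[v]) if (inT2[vn]) --L[label2[vn]];

    bool cut = false;
    // A positive entry can only belong to a label seen around u.
    for (int un : g1[u])
        if (inT1[un] && L[label1[un]] > 0) cut = true;
    // For ISO, a negative entry (only possible for labels seen around v) also cuts.
    if (iso)
        for (int vn : g2[v])
            if (inT2[vn] && L[label2[vn]] < 0) cut = true;

    // Reset only the touched entries, so the same table can be reused.
    for (int un : g1[u]) if (inT1[un]) L[label1[un]] = 0;
    for (int vn : g2[v]) if (inT2[vn]) L[label2[vn]] = 0;
    return cut;
}
```

Only the neighbours of $u$ and $v$ are visited, so the number of steps is proportional to $deg(u)+deg(v)$. The second part of the rules would be checked the same way with the characteristic vectors of $\tilde T_{1}$ and $\tilde T_{2}$.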
 \section{Experimental results}\label{sec:ExpRes}
-This section compares the performance of VF2++ and VF2 Plus. According to
+This section compares the performance of VF2++ and VF2~Plus. According to
 our experience, both algorithms run faster than VF2 by orders of
 magnitude, thus its inclusion was not reasonable.
-The algorithms were implemented in C++ using the open source
+The algorithms were implemented in C++ using the open-source
-LEMON graph and network optimization library\cite{LEMON}. The test were carried out on a linux based system with an Intel i7 X980 3.33 GHz CPU and 6 GB of RAM.
+LEMON graph and network optimization library~\cite{LEMON}. The tests were carried out on a Linux-based system with an Intel i7 X980 3.33 GHz CPU and 6 GB of RAM.
 \subsection{Biological graphs}
 The tests have been executed on a recent biological dataset created
 for the International Contest on Pattern Search in Biological
-Databases\cite{Content}, which has been constructed of molecule,
+Databases~\cite{Content}, which has been constructed of molecule,
 protein and contact map graphs extracted from the Protein Data
-Bank\cite{ProteinDataBank}.
+Bank~\cite{ProteinDataBank}.
 The molecule dataset contains small graphs with fewer than 100 nodes
 and an average degree of less than 3. The protein dataset contains
 graphs having 500--10 000 nodes and an average degree of 4, while the
 contact map dataset contains graphs with 150--800 nodes and an average
 degree of 20.  \\
-In the following, both the induced subgraph isomorphism and the graph
+In the following, both the induced subgraph and the graph
-isomorphism will be examined.
+isomorphism problems will be examined.
 This dataset provides graph pairs, between which all the induced subgraph isomorphisms have to be found. For runtime results, please see Figure~\ref{fig:bioIND}.
 In another experiment, the nodes of each graph in the database had been
 shuffled, and an isomorphism between the shuffled and the original
-graph was searched. The solution times are shown on Figure~\ref{fig:bioISO}.
+graph was searched for. The running times are shown in Figure~\ref{fig:bioISO}.
 \begin{figure}[H]
 \vspace*{-2cm}
 \hspace*{-1.5cm}
 \begin{subfigure}[b]{0.55\textwidth}
 \begin{figure}[H]
 \begin{tikzpicture}[trim axis left, trim axis right]
-\begin{axis}[title=Molecules IND,xlabel={target size},ylabel={time (ms)},legend entries={VF2 Plus,VF2++},grid
+\begin{axis}[title=Molecules IND,xlabel={$|V_2|$},ylabel={time (ms)},legend entries={VF2 Plus,VF2++},grid
 =major,mark size=1.2pt, legend style={at={(0,1)},anchor=north
 west},scaled x ticks = false,x tick label style={/pgf/number
-format/1000 sep = \thinspace}]
+format/1000 sep = \kern 0.08em},y tick label style={/pgf/number
+format/1000 sep = \kern 0.08em}]
 %\addplot+[only marks] table {proteinsOrig.txt};
 \addplot table {Orig/Molecules.32.txt}; \addplot[mark=triangle*,mark
 size=1.8pt,color=red] table {VF2PPLabel/Molecules.32.txt};
 \end{axis}
 \end{tikzpicture}
 \end{subfigure}
 \hspace*{1.5cm}
 \begin{subfigure}[b]{0.55\textwidth}
 \begin{figure}[H]
 \begin{tikzpicture}[trim axis left, trim axis right]
-\begin{axis}[title=Contact maps IND,xlabel={target size},ylabel={time (ms)},legend entries={VF2 Plus,VF2++},grid
+\begin{axis}[title=Contact maps IND,xlabel={$|V_2|$},ylabel={time (ms)},legend entries={VF2 Plus,VF2++},grid
 =major,mark size=1.2pt, legend style={at={(0,1)},anchor=north
 west},scaled x ticks = false,x tick label style={/pgf/number
-format/1000 sep = \thinspace}]
+format/1000 sep = \kern 0.08em},y tick label style={/pgf/number
+format/1000 sep = \kern 0.08em}]
 %\addplot+[only marks] table {proteinsOrig.txt};
 \addplot table {Orig/ContactMaps.128.txt};
 \addplot[mark=triangle*,mark size=1.8pt,color=red] table
 {VF2PPLabel/ContactMaps.128.txt};
 \end{axis}
 \end{tikzpicture}
 \caption{On contact maps, VF2++ runs almost in constant time, while VF2
-Plus has a near linear behaviour.} \label{fig:INDContact}
+Plus has a near-linear behaviour.} \label{fig:INDContact}
 \end{figure}
 \end{subfigure}
 \begin{center}
 \vspace*{-0.5cm}
 \begin{subfigure}[b]{0.55\textwidth}
 \begin{figure}[H]
 \begin{tikzpicture}[trim axis left, trim axis right]
-\begin{axis}[title=Proteins IND,xlabel={target size},ylabel={time (ms)},legend entries={VF2 Plus,VF2++},grid
+\begin{axis}[title=Proteins IND,xlabel={$|V_2|$},ylabel={time (ms)},legend entries={VF2 Plus,VF2++},grid
 =major,mark size=1.2pt, legend style={at={(0,1)},anchor=north
 west},scaled x ticks = false,x tick label style={/pgf/number
-format/1000 sep = \thinspace}] %\addplot+[only marks] table
+format/1000 sep = \kern 0.08em},y tick label style={/pgf/number
+format/1000 sep = \kern 0.08em}] %\addplot+[only marks] table
 {proteinsOrig.txt}; \addplot[mark=*,mark size=1.2pt,color=blue]
 table {Orig/Proteins.256.txt}; \addplot[mark=triangle*,mark
 size=1.8pt,color=red] table {VF2PPLabel/Proteins.256.txt};
 \end{axis}
 \end{tikzpicture}
-\caption{Both the algorithms have linear behaviour on protein
+\caption{Both of the algorithms have linear behaviour on protein
 graphs. VF2++ is more than 10 times faster than VF2
 Plus.} \label{fig:INDProt}
 \end{figure}
 \end{subfigure}
 \end{center}
 \vspace*{-0.5cm}
-\caption{\normalsize{Induced subgraph isomorphism on biological graphs}}\label{fig:bioIND}
+\caption{\normalsize{Induced subgraph isomorphism problem on biological graphs}}\label{fig:bioIND}
 \end{figure}
 \begin{figure}[H]
 \vspace*{-2cm}
 \hspace*{-1.5cm}
 \begin{subfigure}[b]{0.55\textwidth}
 \begin{figure}[H]
 \begin{tikzpicture}[trim axis left, trim axis right]
-\begin{axis}[title=Molecules ISO,xlabel={target size},ylabel={time (ms)},legend entries={VF2 Plus,VF2++},grid
+\begin{axis}[title=Molecules ISO,xlabel={$|V_2|$},ylabel={time (ms)},legend entries={VF2 Plus,VF2++},grid
 =major,mark size=1.2pt, legend style={at={(0,1)},anchor=north
 west},scaled x ticks = false,x tick label style={/pgf/number
-format/1000 sep = \thinspace}]
+format/1000 sep = \kern 0.08em},y tick label style={/pgf/number
+format/1000 sep = \kern 0.08em}]
 %\addplot+[only marks] table {proteinsOrig.txt};
 \addplot table {Orig/moleculesIso.txt}; \addplot[mark=triangle*,mark
 size=1.8pt,color=red] table {VF2PPLabel/moleculesIso.txt};
 \end{axis}
 \end{tikzpicture}
-\caption{In the case of molecules, there is not such a significant
+\caption{The results are close to each other on molecules, but VF2++ seems to be slightly faster as the number of nodes increases.
-difference, but VF2++ seems to be faster as the number of nodes
+}\label{fig:ISOMolecule}
-increases.}\label{fig:ISOMolecule}
 \end{figure}
 \end{subfigure}
 \hspace*{1.5cm}
 \begin{subfigure}[b]{0.55\textwidth}
 \begin{figure}[H]
 \begin{tikzpicture}[trim axis left, trim axis right]
-\begin{axis}[title=Contact maps ISO,xlabel={target size},ylabel={time (ms)},legend entries={VF2 Plus,VF2++},grid
+\begin{axis}[title=Contact maps ISO,xlabel={$|V_2|$},ylabel={time (ms)},legend entries={VF2 Plus,VF2++},grid
 =major,mark size=1.2pt, legend style={at={(0,1)},anchor=north
 west},scaled x ticks = false,x tick label style={/pgf/number
-format/1000 sep = \thinspace}]
+format/1000 sep = \kern 0.08em},y tick label style={/pgf/number
+format/1000 sep = \kern 0.08em}]
 %\addplot+[only marks] table {proteinsOrig.txt};
 \addplot table {Orig/contactMapsIso.txt}; \addplot[mark=triangle*,mark
 size=1.8pt,color=red] table {VF2PPLabel/contactMapsIso.txt};
 \end{axis}
 \end{tikzpicture}
-\caption{The results are closer to each other on contact maps, but
+\caption{In the case of contact maps, there is no significant
-VF2++ still performs consistently better.}\label{fig:ISOContact}
+difference, but VF2++ performs consistently better.}\label{fig:ISOContact}
 \end{figure}
 \end{subfigure}
 \begin{center}
 \vspace*{-0.5cm}
 \begin{subfigure}[b]{0.55\textwidth}
 \begin{figure}[H]
 \begin{tikzpicture}[trim axis left, trim axis right]
-\begin{axis}[title=Proteins ISO,xlabel={target size},ylabel={time (ms)},legend entries={VF2 Plus,VF2++},grid
+\begin{axis}[title=Proteins ISO,xlabel={$|V_2|$},ylabel={time (ms)},legend entries={VF2 Plus,VF2++},grid
 =major,mark size=1.2pt, legend style={at={(0,1)},anchor=north
 west},scaled x ticks = false,x tick label style={/pgf/number
-format/1000 sep = \thinspace}]
+format/1000 sep = \kern 0.08em},y tick label style={/pgf/number
+format/1000 sep = \kern 0.08em}]
 %\addplot+[only marks] table {proteinsOrig.txt};
 \addplot table {Orig/proteinsIso.txt}; \addplot[mark=triangle*,mark
 size=1.8pt,color=red] table {VF2PPLabel/proteinsIso.txt};
 \end{axis}
 \end{tikzpicture}
-\caption{On protein graphs, VF2 Plus has a super linear time
+\caption{On protein graphs, VF2~Plus has a superlinear time
 complexity, while VF2++ runs in near-constant time. The difference
-is about two order of magnitude on large graphs.}\label{fig:ISOProt}
+is about two orders of magnitude on large graphs.}\label{fig:ISOProt}
 \end{figure}
 \end{subfigure}
 \end{center}
 \vspace*{-0.6cm}
-\caption{\normalsize{Graph isomorphism on biological graphs}}\label{fig:bioISO}
+\caption{\normalsize{Graph isomorphism problem on biological graphs}}\label{fig:bioISO}
 \end{figure}
 \subsection{Random graphs}
-This section compares VF2++ with VF2 Plus on random graphs of a large
+This section compares VF2++ with VF2~Plus on random graphs of large
 size. The node labels are uniformly distributed.  Let $\delta$ denote
 the average degree.  For the parameters of problems solved in the
 experiments, please see the top of each chart.
-\subsubsection{Graph isomorphism}
+\subsubsection{Graph isomorphism problem}
 To evaluate the efficiency of the algorithms in the case of the graph
-isomorphism, random connected graphs of less than 20 000 nodes have been
+isomorphism problem, random connected graphs of fewer than 20 000 nodes have been
 considered. Generating a random graph and shuffling its nodes, an
-isomorphism had to be found. Figure \ref{fig:randISO} shows the runtime results
+isomorphism had to be found. Figure~\ref{fig:randISO} shows the runtime results
 on graph sets of various densities.
 \vspace*{-1.5cm}
 \hspace*{-1.5cm}
 \begin{subfigure}[b]{0.55\textwidth}
 \begin{center}
 \begin{tikzpicture}
-\begin{axis}[title={Random ISO, $\delta = 5$},width=7.2cm,height=6cm,xlabel={target size},ylabel={time (ms)},ylabel near ticks,legend entries={VF2 Plus,VF2++},grid
+\begin{axis}[title={Random ISO, $\delta = 5$},width=7.2cm,height=6cm,xlabel={$|V_2|$},ylabel={time (ms)},ylabel near ticks,legend entries={VF2 Plus,VF2++},grid
 =major,mark size=1.2pt, legend style={at={(0,1)},anchor=north
 west},scaled x ticks = false,x tick label style={/pgf/number
-format/1000 sep = \space}]
+format/1000 sep = \kern 0.08em},y tick label style={/pgf/number
+format/1000 sep = \kern 0.08em}]
 %\addplot+[only marks] table {proteinsOrig.txt};
 \addplot table {randGraph/iso/vf2pIso5_1.txt};
 \addplot[mark=triangle*,mark size=1.8pt,color=red] table
 {randGraph/iso/vf2ppIso5_1.txt};
 \end{axis}
 \end{subfigure}
 %\hspace{1cm}
 \begin{subfigure}[b]{0.55\textwidth}
 \begin{center}
 \begin{tikzpicture}
-\begin{axis}[title={Random ISO, $\delta = 10$},width=7.2cm,height=6cm,xlabel={target size},ylabel={time (ms)},ylabel near ticks,legend entries={VF2 Plus,VF2++},grid
+\begin{axis}[title={Random ISO, $\delta = 10$},width=7.2cm,height=6cm,xlabel={$|V_2|$},ylabel={time (ms)},ylabel near ticks,legend entries={VF2 Plus,VF2++},grid
 =major,mark size=1.2pt, legend style={at={(0,1)},anchor=north
 west},scaled x ticks = false,x tick label style={/pgf/number
-format/1000 sep = \space}]
+format/1000 sep = \kern 0.08em},y tick label style={/pgf/number
+format/1000 sep = \kern 0.08em}]
 %\addplot+[only marks] table {proteinsOrig.txt};
 \addplot table {randGraph/iso/vf2pIso10_1.txt};
 \addplot[mark=triangle*,mark size=1.8pt,color=red] table
 {randGraph/iso/vf2ppIso10_1.txt};
 \end{axis}
 %%\hspace{1cm}
 \hspace*{-1.5cm}
 \begin{subfigure}[b]{0.55\textwidth}
 \begin{center}
 \begin{tikzpicture}
-\begin{axis}[title={Random ISO, $\delta = 15$},width=7.2cm,height=6cm,xlabel={target size},ylabel={time (ms)},ylabel near ticks,legend entries={VF2 Plus,VF2++},grid
+\begin{axis}[title={Random ISO, $\delta = 15$},width=7.2cm,height=6cm,xlabel={$|V_2|$},ylabel={time (ms)},ylabel near ticks,legend entries={VF2 Plus,VF2++},grid
 =major,mark size=1.2pt, legend style={at={(0,1)},anchor=north
 west},scaled x ticks = false,x tick label style={/pgf/number
-format/1000 sep = \space}]
+format/1000 sep = \kern 0.08em},y tick label style={/pgf/number
+format/1000 sep = \kern 0.08em}]
 %\addplot+[only marks] table {proteinsOrig.txt};
 \addplot table {randGraph/iso/vf2pIso15_1.txt};
 \addplot[mark=triangle*,mark size=1.8pt,color=red] table
 {randGraph/iso/vf2ppIso15_1.txt};
 \end{axis}
 \end{center}
 \end{subfigure}
 \begin{subfigure}[b]{0.55\textwidth}
 \begin{center}
 \begin{tikzpicture}
-\begin{axis}[title={Random ISO, $\delta = 100$},width=7.2cm,height=6cm,xlabel={target size},ylabel={time (ms)},ylabel near ticks,legend entries={VF2 Plus,VF2++},grid
+\begin{axis}[title={Random ISO, $\delta = 100$},width=7.2cm,height=6cm,xlabel={$|V_2|$},ylabel={time (ms)},ylabel near ticks,legend entries={VF2 Plus,VF2++},grid
 =major,mark size=1.2pt, legend style={at={(0,1)},anchor=north
 west},scaled x ticks = false,x tick label style={/pgf/number
-format/1000 sep = \thinspace}]
+format/1000 sep = \kern 0.08em},y tick label style={/pgf/number
+format/1000 sep = \kern 0.08em}]
 %\addplot+[only marks] table {proteinsOrig.txt};
 \addplot table {randGraph/iso/vf2pIso100_1.txt};
 \addplot[mark=triangle*,mark size=1.8pt,color=red] table
 {randGraph/iso/vf2ppIso100_1.txt};
 \end{axis}
 \end{tikzpicture}
 \end{center}
 \end{subfigure}
 \vspace*{-0.8cm}
-\caption{ISO on random graphs.
+\caption{Graph isomorphism problem on random graphs
 }\label{fig:randISO}
 \end{figure}
-\subsubsection{Induced subgraph isomorphism}
+\subsubsection{Induced subgraph isomorphism problem}
-This section presents a comparison of VF2++ and VF2 Plus in the case
+This section presents a comparison of VF2++ and VF2~Plus in the case
-of induced subgraph isomorphism. In addition to the size of the large
+of the induced subgraph isomorphism problem. In addition to the size of graph $G_2$, that of $G_1$ dramatically influences the hardness of
-graph, that of the small graph dramatically influences the hardness of
+a given problem too, so the overall picture is provided by examining graphs to be embedded of various sizes.
-a given problem too, so the overall picture is provided by examining
-small graphs of various size.
 For each chart, a number $0<\rho< 1$ has been fixed, and the following
 has been executed 150 times. Generating a large graph $G_{2}$ of an average degree of $\delta$,
-choose 10 of its induced subgraphs having $\rho\ |V_{2}|$ nodes,
+choose 10 of its induced subgraphs having $\rho|V_{2}|$ nodes,
-and for all the 10 subgraphs find a mapping by using both the graph
+and for all the 10 subgraphs find a mapping by using both graph
 matching algorithms.  The $\delta = 5, 10, 35$ and $\rho = 0.05, 0.1,
 0.3, 0.8$ cases have been examined, see
-Figure~\ref{fig:randIND5}, \ref{fig:randIND10} and
+Figures~\ref{fig:randIND5},~\ref{fig:randIND10}~and~\ref{fig:randIND35}.
-\ref{fig:randIND35}.
 \vspace*{-1.5cm}
 \hspace*{-1.5cm}
 \begin{subfigure}[b]{0.55\textwidth}
 \begin{center}
 \begin{tikzpicture}
-\begin{axis}[title={Random IND, $\delta = 5$, $\rho = 0.05$},width=7.2cm,height=6cm,xlabel={target size},ylabel={time (ms)},ylabel near ticks,legend entries={VF2 Plus,VF2++},grid
+\begin{axis}[title={Random IND, $\delta = 5$, $\rho = 0.05$},width=7.2cm,height=6cm,xlabel={$|V_2|$},ylabel={time (ms)},ylabel near ticks,legend entries={VF2 Plus,VF2++},grid
 =major,mark size=1.2pt, legend style={at={(0,1)},anchor=north
 west},scaled x ticks = false,x tick label style={/pgf/number
-format/1000 sep = \space}]
+format/1000 sep = \kern 0.08em},y tick label style={/pgf/number
+format/1000 sep = \kern 0.08em}]
 %\addplot+[only marks] table {proteinsOrig.txt};
 \addplot table {randGraph/ind/vf2pInd5_0.05.txt};
 \addplot[mark=triangle*,mark size=1.8pt,color=red] table
 {randGraph/ind/vf2ppInd5_0.05.txt};
 \end{axis}
 \end{center}
 \end{subfigure}
 \begin{subfigure}[b]{0.55\textwidth}
 \begin{center}
 \begin{tikzpicture}
-\begin{axis}[title={Random IND, $\delta = 5$, $\rho = 0.1$},width=7.2cm,height=6cm,xlabel={target size},ylabel={time (ms)},ylabel near ticks,legend entries={VF2 Plus,VF2++},grid
+\begin{axis}[title={Random IND, $\delta = 5$, $\rho = 0.1$},width=7.2cm,height=6cm,xlabel={$|V_2|$},ylabel={time (ms)},ylabel near ticks,legend entries={VF2 Plus,VF2++},grid
 =major,mark size=1.2pt, legend style={at={(0,1)},anchor=north
 west},scaled x ticks = false,x tick label style={/pgf/number
-format/1000 sep = \space}]
+format/1000 sep = \kern 0.08em},y tick label style={/pgf/number
+format/1000 sep = \kern 0.08em}]
 %\addplot+[only marks] table {proteinsOrig.txt};
 \addplot table {randGraph/ind/vf2pInd5_0.1.txt};
 \addplot[mark=triangle*,mark size=1.8pt,color=red] table
 {randGraph/ind/vf2ppInd5_0.1.txt};
 \end{axis}
 \end{subfigure}
 \hspace*{-1.5cm}
 \begin{subfigure}[b]{0.55\textwidth}
 \begin{center}
 \begin{tikzpicture}
-\begin{axis}[title={Random IND, $\delta = 5$, $\rho = 0.3$},width=7.2cm,height=6cm,xlabel={target size},ylabel={time (ms)},ylabel near ticks,legend entries={VF2 Plus,VF2++},grid
+\begin{axis}[title={Random IND, $\delta = 5$, $\rho = 0.3$},width=7.2cm,height=6cm,xlabel={$|V_2|$},ylabel={time (ms)},ylabel near ticks,legend entries={VF2 Plus,VF2++},grid
 =major,mark size=1.2pt, legend style={at={(0,1)},anchor=north
 west},scaled x ticks = false,x tick label style={/pgf/number
-format/1000 sep = \space}]
+format/1000 sep = \kern 0.08em},y tick label style={/pgf/number
+format/1000 sep = \kern 0.08em}]
 %\addplot+[only marks] table {proteinsOrig.txt};
 \addplot table {randGraph/ind/vf2pInd5_0.3.txt};
 \addplot[mark=triangle*,mark size=1.8pt,color=red] table
 {randGraph/ind/vf2ppInd5_0.3.txt};
 \end{axis}
 \end{center}
 \end{subfigure}
 \begin{subfigure}[b]{0.55\textwidth}
 \begin{center}
 \begin{tikzpicture}
-\begin{axis}[title={Random IND, $\delta = 5$, $\rho = 0.8$},width=7.2cm,height=6cm,xlabel={target size},ylabel={time (ms)},ylabel near ticks,legend entries={VF2 Plus,VF2++},grid
+\begin{axis}[title={Random IND, $\delta = 5$, $\rho = 0.8$},width=7.2cm,height=6cm,xlabel={$|V_2|$},ylabel={time (ms)},ylabel near ticks,legend entries={VF2 Plus,VF2++},grid
 =major,mark size=1.2pt, legend style={at={(0,1)},anchor=north
 west},scaled x ticks = false,x tick label style={/pgf/number
-format/1000 sep = \space}]
+format/1000 sep = \kern 0.08em},y tick label style={/pgf/number
+format/1000 sep = \kern 0.08em}]
 %\addplot+[only marks] table {proteinsOrig.txt};
 \addplot table {randGraph/ind/vf2pInd5_0.8.txt};
 \addplot[mark=triangle*,mark size=1.8pt,color=red] table
 {randGraph/ind/vf2ppInd5_0.8.txt};
 \end{axis}
 \end{tikzpicture}
 \end{center}
 \end{subfigure}
 \vspace*{-0.8cm}
-\caption{IND on graphs having an average degree of
+\caption{Induced subgraph isomorphism problem on random graphs having an average degree of
-5.}\label{fig:randIND5}
+5}\label{fig:randIND5}
 \end{figure}
 \begin{figure}
 \vspace*{-1.5cm}
 \hspace*{-1.5cm}
 \begin{subfigure}[b]{0.55\textwidth}
 \begin{center}
 \hspace*{-0.5cm}
 \begin{tikzpicture}
-\begin{axis}[title={Random IND, $\delta = 10$, $\rho = 0.05$},width=7.2cm,height=6cm,xlabel={target size},ylabel={time (ms)},ylabel near ticks,legend entries={VF2 Plus,VF2++},grid
+\begin{axis}[title={Random IND, $\delta = 10$, $\rho = 0.05$},width=7.2cm,height=6cm,xlabel={$|V_2|$},ylabel={time (ms)},ylabel near ticks,legend entries={VF2 Plus,VF2++},grid
 =major,mark size=1.2pt, legend style={at={(0,1)},anchor=north
 west},scaled x ticks = false,x tick label style={/pgf/number
-format/1000 sep = \space}]
+format/1000 sep = \kern 0.08em},y tick label style={/pgf/number
+format/1000 sep = \kern 0.08em}]
 %\addplot+[only marks] table {proteinsOrig.txt};
 \addplot table {randGraph/ind/vf2pInd10_0.05.txt};
 \addplot[mark=triangle*,mark size=1.8pt,color=red] table
 {randGraph/ind/vf2ppInd10_0.05.txt};
 \end{axis}
 \end{subfigure}
 \begin{subfigure}[b]{0.55\textwidth}
 \begin{center}
 \hspace*{-0.5cm}
 \begin{tikzpicture}
-\begin{axis}[title={Random IND, $\delta = 10$, $\rho = 0.1$},width=7.2cm,height=6cm,xlabel={target size},ylabel={time (ms)},ylabel near ticks,legend entries={VF2 Plus,VF2++},grid
+\begin{axis}[title={Random IND, $\delta = 10$, $\rho = 0.1$},width=7.2cm,height=6cm,xlabel={$|V_2|$},ylabel={time (ms)},ylabel near ticks,legend entries={VF2 Plus,VF2++},grid
 =major,mark size=1.2pt, legend style={at={(0,1)},anchor=north
 west},scaled x ticks = false,x tick label style={/pgf/number
-format/1000 sep = \space}]
+format/1000 sep = \kern 0.08em},y tick label style={/pgf/number
+format/1000 sep = \kern 0.08em}]
 %\addplot+[only marks] table {proteinsOrig.txt};
 \addplot table {randGraph/ind/vf2pInd10_0.1.txt};
 \addplot[mark=triangle*,mark size=1.8pt,color=red] table
 {randGraph/ind/vf2ppInd10_0.1.txt};
 \end{axis}
 \end{subfigure}
 \hspace*{-1.5cm}
 \begin{subfigure}[b]{0.55\textwidth}
 \begin{center}
 \begin{tikzpicture}
-\begin{axis}[title={Random IND, $\delta = 10$, $\rho = 0.3$},width=7.2cm,height=6cm,xlabel={target size},ylabel={time (ms)},ylabel near ticks,legend entries={VF2 Plus,VF2++},grid
+\begin{axis}[title={Random IND, $\delta = 10$, $\rho = 0.3$},width=7.2cm,height=6cm,xlabel={$|V_2|$},ylabel={time (ms)},ylabel near ticks,legend entries={VF2 Plus,VF2++},grid
 =major,mark size=1.2pt, legend style={at={(0,1)},anchor=north
 west},scaled x ticks = false,x tick label style={/pgf/number
-format/1000 sep = \space}]
+format/1000 sep = \kern 0.08em},y tick label style={/pgf/number
+format/1000 sep = \kern 0.08em}]
 %\addplot+[only marks] table {proteinsOrig.txt};
 \addplot table {randGraph/ind/vf2pInd10_0.3.txt};
 \addplot[mark=triangle*,mark size=1.8pt,color=red] table
 {randGraph/ind/vf2ppInd10_0.3.txt};
 \end{axis}
 \end{center}
 \end{subfigure}
 \begin{subfigure}[b]{0.55\textwidth}
 \begin{center}
 \begin{tikzpicture}
-\begin{axis}[title={Random IND, $\delta = 10$, $\rho = 0.8$},width=7.2cm,height=6cm,xlabel={target size},ylabel={time (ms)},ylabel near ticks,legend entries={VF2 Plus,VF2++},grid
+\begin{axis}[title={Random IND, $\delta = 10$, $\rho = 0.8$},width=7.2cm,height=6cm,xlabel={$|V_2|$},ylabel={time (ms)},ylabel near ticks,legend entries={VF2 Plus,VF2++},grid
 =major,mark size=1.2pt, legend style={at={(0,1)},anchor=north
 west},scaled x ticks = false,x tick label style={/pgf/number
-format/1000 sep = \space}]
+format/1000 sep = \kern 0.08em},y tick label style={/pgf/number
+format/1000 sep = \kern 0.08em}]
 %\addplot+[only marks] table {proteinsOrig.txt};
 \addplot table {randGraph/ind/vf2pInd10_0.8.txt};
 \addplot[mark=triangle*,mark size=1.8pt,color=red] table
 {randGraph/ind/vf2ppInd10_0.8.txt};
 \end{axis}
 \end{tikzpicture}
 \end{center}
 \end{subfigure}
 \vspace*{-0.8cm}
-\caption{IND on graphs having an average degree of
+\caption{Induced subgraph isomorphism problem on random graphs having an average degree of
-10.}\label{fig:randIND10}
+10}\label{fig:randIND10}
 \end{figure}
 \begin{figure}
 \vspace*{-1.5cm}
 \hspace*{-1.5cm}
 \begin{subfigure}[b]{0.55\textwidth}
 \begin{center}
 \begin{tikzpicture}
-\begin{axis}[title={Random IND, $\delta = 35$, $\rho = 0.05$},width=7.2cm,height=6cm,xlabel={target size},ylabel={time (ms)},ylabel near ticks,legend entries={VF2 Plus,VF2++},grid
+\begin{axis}[title={Random IND, $\delta = 35$, $\rho = 0.05$},width=7.2cm,height=6cm,xlabel={$|V_2|$},ylabel={time (ms)},ylabel near ticks,legend entries={VF2 Plus,VF2++},grid
 =major,mark size=1.2pt, legend style={at={(0,1)},anchor=north
 west},scaled x ticks = false,x tick label style={/pgf/number
-format/1000 sep = \space}]
+format/1000 sep = \kern 0.08em},y tick label style={/pgf/number
+format/1000 sep = \kern 0.08em}]
 %\addplot+[only marks] table {proteinsOrig.txt};
 \addplot table {randGraph/ind/vf2pInd35_0.05.txt};
 \addplot[mark=triangle*,mark size=1.8pt,color=red] table
 {randGraph/ind/vf2ppInd35_0.05.txt};
 \end{axis}
 \end{center}
 \end{subfigure}
 \begin{subfigure}[b]{0.55\textwidth}
 \begin{center}
 \begin{tikzpicture}
-\begin{axis}[title={Random IND, $\delta = 35$, $\rho = 0.1$},width=7.2cm,height=6cm,xlabel={target size},ylabel={time (ms)},ylabel near ticks,legend entries={VF2 Plus,VF2++},grid
+\begin{axis}[title={Random IND, $\delta = 35$, $\rho = 0.1$},width=7.2cm,height=6cm,xlabel={$|V_2|$},ylabel={time (ms)},ylabel near ticks,legend entries={VF2 Plus,VF2++},grid
 =major,mark size=1.2pt, legend style={at={(0,1)},anchor=north
 west},scaled x ticks = false,x tick label style={/pgf/number
-format/1000 sep = \space}]
+format/1000 sep = \kern 0.08em},y tick label style={/pgf/number
+format/1000 sep = \kern 0.08em}]
 %\addplot+[only marks] table {proteinsOrig.txt};
 \addplot table {randGraph/ind/vf2pInd35_0.1.txt};
 \addplot[mark=triangle*,mark size=1.8pt,color=red] table
 {randGraph/ind/vf2ppInd35_0.1.txt};
 \end{axis}
 \end{subfigure}
 \hspace*{-1.5cm}
 \begin{subfigure}[b]{0.55\textwidth}
 \begin{center}
 \begin{tikzpicture}
-\begin{axis}[title={Random IND, $\delta = 35$, $\rho = 0.3$},width=7.2cm,height=6cm,xlabel={target size},ylabel={time (ms)},ylabel near ticks,legend entries={VF2 Plus,VF2++},grid
+\begin{axis}[title={Random IND, $\delta = 35$, $\rho = 0.3$},width=7.2cm,height=6cm,xlabel={$|V_2|$},ylabel={time (ms)},ylabel near ticks,legend entries={VF2 Plus,VF2++},grid
 =major,mark size=1.2pt, legend style={at={(0,1)},anchor=north
 west},scaled x ticks = false,x tick label style={/pgf/number
-format/1000 sep = \space}]
+format/1000 sep = \kern 0.08em},y tick label style={/pgf/number
+format/1000 sep = \kern 0.08em}]
 %\addplot+[only marks] table {proteinsOrig.txt};
 \addplot table {randGraph/ind/vf2pInd35_0.3.txt};
 \addplot[mark=triangle*,mark size=1.8pt,color=red] table
 {randGraph/ind/vf2ppInd35_0.3.txt};
 \end{axis}
 \end{center}
 \end{subfigure}
 \begin{subfigure}[b]{0.55\textwidth}
 \begin{center}
 \begin{tikzpicture}
-\begin{axis}[title={Random IND, $\delta = 35$, $\rho = 0.8$},width=7.2cm,height=6cm,xlabel={target size},ylabel={time (ms)},ylabel near ticks,legend entries={VF2 Plus,VF2++},grid
+\begin{axis}[title={Random IND, $\delta = 35$, $\rho = 0.8$},width=7.2cm,height=6cm,xlabel={$|V_2|$},ylabel={time (ms)},ylabel near ticks,legend entries={VF2 Plus,VF2++},grid
 =major,mark size=1.2pt, legend style={at={(0,1)},anchor=north
 west},scaled x ticks = false,x tick label style={/pgf/number
-format/1000 sep = \space}]
+format/1000 sep = \kern 0.08em},y tick label style={/pgf/number
+format/1000 sep = \kern 0.08em}]
 %\addplot+[only marks] table {proteinsOrig.txt};
 \addplot table {randGraph/ind/vf2pInd35_0.8.txt};
 \addplot[mark=triangle*,mark size=1.8pt,color=red] table
 {randGraph/ind/vf2ppInd35_0.8.txt};
 \end{axis}
 \end{tikzpicture}
 \end{center}
 \end{subfigure}
 \vspace*{-0.8cm}
-\caption{IND on graphs having an average degree of
+\caption{Induced subgraph isomorphism problem on random graphs having an average degree of
-35.}\label{fig:randIND35}
+35}\label{fig:randIND35}
 \end{figure}
-Based on these experiments, VF2++ is faster than VF2 Plus and able to
+Based on these experiments, VF2++ is faster than VF2~Plus and able to
 handle really large graphs in milliseconds. Note that when $IND$ was
-considered and the small graphs had proportionally few nodes ($\rho =
+considered and the graph to be embedded had proportionally few nodes ($\rho =
-0.05$, or $\rho = 0.1$), then VF2 Plus produced some inefficient node
+0.05$, or $\rho = 0.1$), then VF2~Plus produced some inefficient node
 orders (e.g. see the $\delta=10$ case in
 Figure~\ref{fig:randIND10}). If these instances had been excluded, the
-charts would have seemed to be similar to the other ones.
+charts would have looked similar to the other ones.
 Unsurprisingly, as denser graphs are considered, both VF2++ and VF2
 Plus slow down slightly, but remain practically usable even on graphs
 having 10 000 nodes.
 \section{Conclusion}
-This paper presented VF2++, a new graph matching algorithm based on VF2, called VF2++, and analyzed it from a practical viewpoint.
+This paper presented VF2++, a new graph matching algorithm based on VF2, and analysed it from a practical viewpoint.
 Recognizing the importance of the node order and determining an
 efficient one, VF2++ is able to match graphs of thousands of nodes in
 practically linear time, including preprocessing. In addition to
-the proper order, VF2++ uses more efficient consistency and cutting
+the proper order, VF2++ uses more efficient cutting
 rules which are easy to compute and make the algorithm able to prune
 most of the unfruitful branches without going astray.
 In order to show the efficiency of the new method, it has been
-compared to VF2 Plus\cite{VF2Plus}, which is the best contemporary algorithm.
+compared to VF2~Plus~\cite{VF2Plus}, which is the best contemporary algorithm.
-.
+The experiments show that VF2++ consistently outperforms VF2~Plus on
-The experiments show that VF2++ consistently outperforms VF2 Plus on
 biological graphs. It seems to be asymptotically faster on protein and
-on contact map graphs in the case of induced subgraph isomorphism,
+on contact map graphs in the case of the induced subgraph isomorphism problem,
-while in the case of graph isomorphism, it has definitely better
+while in the case of the graph isomorphism problem, it has definitely better
 asymptotic behaviour on protein graphs.
 Regarding random sparse graphs, not only has VF2++ proved itself to be
-faster than VF2 Plus, but it also has a practically linear behaviour both
+faster than VF2~Plus, but it also has a practically linear behaviour both
-in the case of induced subgraph- and graph isomorphism.
+in the case of the induced subgraph and graph isomorphism problems.
 %%%%%%%%%%%%%%%%
 \section*{Acknowledgement} \label{sec:ack}
 %%%%%%%%%%%%%%%%
 This research project was initiated and sponsored by QuantumBio
-Inc.\cite{QUANTUMBIO}.
+Inc.~\cite{QUANTUMBIO}.
 The authors were supported by the Hungarian Scientific Research Fund -
 OTKA, K109240 and by the J\'anos Bolyai Research Fellowship program of
 the Hungarian Academy of Sciences.

changeset 28	523fddfd7a01
parent 27	497868c58d36
child 29	0ff72a828b16