damecco.tex
changeset 9 2ad9aa6d7f63
parent 8 49574b75404b
child 10 1e4a79cd8332
equal deleted inserted replaced
7:eace827f74a9 8:1942b1d71a41
  1057   \end{cases}
  1057   \end{cases}
  1058 \]
  1058 \]
  1059 Where $i\in\{0,1, ..,|G_{small}|-1\}$, $v\in V_{large}$ and $INVALID$
  1059 Where $i\in\{0,1, ..,|G_{small}|-1\}$, $v\in V_{large}$ and $INVALID$
  1060 means "no node".
  1060 means "no node".
  1061 \subsubsection{Avoiding the recurrence}
  1061 \subsubsection{Avoiding the recurrence}
  1062 Exploring the state space was described in a recursive fashion using
       
  1063 sets (see Algorithm~\ref{alg:VF2Pseu}), which makes the
       
  1064 algorithm easy to understand but does not show directly an efficient
       
  1065 way of implementation. The following approach avoiding recursion and
       
  1066 using lookup tables instead of sets is not only fast but has linear
       
  1067 space complexity as well.
       
  1068 
       
  1069 The recursion of Algorithm~\ref{alg:VF2Pseu} can be realized
  1062 The recursion of Algorithm~\ref{alg:VF2Pseu} can be realized
  1070 as a while loop, which has a loop counter $depth$ denoting the
  1063 as a \textit{while loop}, which has a loop counter $depth$ denoting the
  1071 all-time depth of the recursion.  Fixing a matching order, let $M$
  1064 all-time depth of the recursion. Fixing a matching order, let $M$
  1072 denote the array storing the all-time mapping.  The initial state is
  1065 denote the array storing the all-time mapping. Based on Claim~\ref{claim:claimCoverFromLeft},
  1073 associated with the empty mapping, which means that $\forall i:
       
  1074 M[i]=INVALID$ and $depth=0$.  In case of a recursive call, $depth$ has
       
  1075 to be incremented, while in case of a return, it has to be
       
  1076 decremented.  Based on Claim~\ref{claim:claimCoverFromLeft},
       
  1077 $M$ is $INVALID$ from index $depth$+1 and not $INVALID$ before
  1066 $M$ is $INVALID$ from index $depth$+1 and not $INVALID$ before
  1078 $depth$, i.e. $\forall i: i < depth \Rightarrow M[i]\neq INVALID$ and
  1067 $depth$. $M[depth]$ changes
  1079 $\forall i: i > depth \Rightarrow M[i]= INVALID$. $M[depth]$ changes
       
  1080 while the state is being processed, but the property is held before
  1068 while the state is being processed, but the property is held before
  1081 both stepping back to a predecessor state and exploring a successor
  1069 both stepping back to a predecessor state and exploring a successor
  1082 state.
  1070 state.
  1083 
  1071 
  1084 The necessary part of the candidate set is easily maintainable or
  1072 The necessary part of the candidate set is easily maintainable or
  1098 E_{large}$. That is, each covered neighbour of $u$ has to be mapped to
  1086 E_{large}$. That is, each covered neighbour of $u$ has to be mapped to
  1099 a covered neighbour of $v$.
  1087 a covered neighbour of $v$.
  1100 
  1088 
  1101 Having said that, an algorithm running in $\Theta(deg)$ time is
  1089 Having said that, an algorithm running in $\Theta(deg)$ time is
  1102 describable if there exists a covered node in the component containing
  1090 describable if there exists a covered node in the component containing
  1103 $u$. In this case choose a covered neighbour $u'$ of $u$ arbitrarily
  1091 $u$, and a linear one other wise.
  1104 --- such a node exists based on
       
  1105 Claim~\ref{claim:MOclaim}. With all the candidates of $u$
       
  1106 being among the uncovered neighbours of $Pair(M,u')$, there are solely
       
  1107 $deg(Pair(M,u'))$ nodes to check.
       
  1108 
       
  1109 An easy trick is to choose an $u'$, for which
       
  1110 $|\{uncovered\ neighbours\ $ $of\ Pair(M,u')\}|$ is the smallest
       
  1111 possible.
       
  1112 
       
  1113 Note that if $u$ is the first node of its component, then all the
       
  1114 uncovered nodes of $G_{large}$ are candidates, so giving a sublinear
       
  1115 method is impossible.
       
  1116 
  1092 
  1117 
  1093 
  1118 \subsubsection{Determining the node order}
  1094 \subsubsection{Determining the node order}
  1119 This section describes how the node order preprocessing method of
  1095 This section describes how the node order preprocessing method of
  1120 VF2++ can efficiently be implemented.
  1096 VF2++ can efficiently be implemented.
  1121 
  1097 
  1122 For using lookup tables, the node labels are associated with the
  1098 For using lookup tables, the node labels are associated with the
  1123 numbers $\{0,1,..,|K|-1\}$, where $K$ is the set of the labels. It
  1099 numbers $\{0,1,..,|K|-1\}$, where $K$ is the set of the labels. It
  1124 enables $F_\mathcal{M}$ to be stored in an array, for which
  1100 enables $F_\mathcal{M}$ to be stored in an array. At first, the node order
  1125 $F_\mathcal{M}[i]=F_\mathcal{M}(i)$ where $i=0,1,..,|K|-1$. At first,
       
  1126 $\mathcal{M}=\emptyset$, so $F_\mathcal{M}[i]$ is the number of nodes
  1101 $\mathcal{M}=\emptyset$, so $F_\mathcal{M}[i]$ is the number of nodes
  1127 in $V_{small}$ having label i, which is easy to compute in
  1102 in $V_{small}$ having label i, which is easy to compute in
  1128 $\Theta(|V_{small}|)$ steps.
  1103 $\Theta(|V_{small}|)$ steps.
  1129 
  1104 
  1130 $\mathcal{M}\subseteq V_{small}$ can be represented as an array of
  1105 Representing $\mathcal{M}\subseteq V_{small}$ as an array of
  1131 size $|V_{small}|$.
  1106 size $|V_{small}|$, both the computation of the BFS tree, and processing its levels by Algorithm~\ref{alg:VF2PPProcess1} can be done inplace by swapping nodes.
  1132 
       
  1133 The BFS tree is computed by using a FIFO data structure which is
       
  1134 usually implemented as a linked list, but one can avoid it by using
       
  1135 the array $\mathcal{M}$ itself. $\mathcal{M}$ contains all the nodes
       
  1136 seen before, a pointer shows where the first node of the FIFO is, and
       
  1137 another one shows where the next explored node has to be inserted. So
       
  1138 the nodes of each level of the BFS tree can be processed by
       
  1139 Algorithms~\ref{alg:VF2PPProcess1} in place by swapping nodes.
       
  1140 
  1107 
  1141 After a node $u$ gets to the next place of the node order,
  1108 After a node $u$ gets to the next place of the node order,
  1142 $F_\mathcal{M}[lab[u]]$ has to be decreased by one, because there is
  1109 $F_\mathcal{M}[lab[u]]$ has to be decreased by one, because there is
  1143 one less covered node in $V_{large}$ with label $lab(u)$, that is why
  1110 one less covered node in $V_{large}$ with label $lab(u)$, that is why
  1144 min selection sort is preferred which gives the elements from left to
  1111 min selection sort is preferred which gives the elements from left to
  1145 right in descending order, see Algorithm~\ref{alg:VF2PPProcess1}.
  1112 right in descending order, see Algorithm~\ref{alg:VF2PPProcess1}.
  1146 
       
  1147 Note that the \textit{while loop} of Algorithm~\ref{alg:VF2PPPseu}
       
  1148 takes one iteration per graph component and the graphs in biology are
       
  1149 mostly connected.
       
  1150 
  1113 
  1151 \subsubsection{Cutting rules}
  1114 \subsubsection{Cutting rules}
  1152 In Section~\ref{VF2PPCuttingRules}, the cutting rules were
  1115 In Section~\ref{VF2PPCuttingRules}, the cutting rules were
  1153 described using the sets $T_{small}$, $T_{large}$, $\tilde T_{small}$
  1116 described using the sets $T_{small}$, $T_{large}$, $\tilde T_{small}$
  1154 and $\tilde T_{large}$, which are dependent on the all-time mapping
  1117 and $\tilde T_{large}$, which are dependent on the all-time mapping
  1162 $L[lab(u')]$ for all $u'\in \Gamma_{small}(u) \cap T_{small}(s)$ and
  1125 $L[lab(u')]$ for all $u'\in \Gamma_{small}(u) \cap T_{small}(s)$ and
  1163 decrementing $L[lab(v')]$ for all $v'\in\Gamma_{large} (v) \cap
  1126 decrementing $L[lab(v')]$ for all $v'\in\Gamma_{large} (v) \cap
  1164 T_{large}(s)$, the first part of the cutting rules is checkable in
  1127 T_{large}(s)$, the first part of the cutting rules is checkable in
  1165 $\Theta(deg)$ time by considering the proper signs of $L$. Setting $L$
  1128 $\Theta(deg)$ time by considering the proper signs of $L$. Setting $L$
  1166 to zero takes $\Theta(deg)$ time again, which makes it possible to use
  1129 to zero takes $\Theta(deg)$ time again, which makes it possible to use
  1167 the same table through the whole algorithm.  The second part of the
  1130 the same table through the whole algorithm. The second part of the
  1168 cutting rules can be verified using the same method with $\tilde
  1131 cutting rules can be verified using the same method with $\tilde
  1169 T_{small}$ and $\tilde T_{large}$ instead of $T_{small}$ and
  1132 T_{small}$ and $\tilde T_{large}$ instead of $T_{small}$ and
  1170 $T_{large}$. Thus, the overall complexity is $\Theta(deg)$.
  1133 $T_{large}$. Thus, the overall complexity is $\Theta(deg)$.
  1171 
  1134 
  1172 An other integer lookup table storing the number of covered neighbours
  1135 An other integer lookup table storing the number of covered neighbours