1057 \end{cases} |
1057 \end{cases} |
1058 \] |
1058 \] |
1059 Where $i\in\{0,1, ..,|G_{small}|-1\}$, $v\in V_{large}$ and $INVALID$ |
1059 Where $i\in\{0,1, ..,|G_{small}|-1\}$, $v\in V_{large}$ and $INVALID$ |
1060 means "no node". |
1060 means "no node". |
1061 \subsubsection{Avoiding the recurrence} |
1061 \subsubsection{Avoiding the recurrence} |
1062 Exploring the state space was described in a recursive fashion using |
|
1063 sets (see Algorithm~\ref{alg:VF2Pseu}), which makes the |
|
1064 algorithm easy to understand but does not show directly an efficient |
|
1065 way of implementation. The following approach avoiding recursion and |
|
1066 using lookup tables instead of sets is not only fast but has linear |
|
1067 space complexity as well. |
|
1068 |
|
1069 The recursion of Algorithm~\ref{alg:VF2Pseu} can be realized |
1062 The recursion of Algorithm~\ref{alg:VF2Pseu} can be realized |
1070 as a while loop, which has a loop counter $depth$ denoting the |
1063 as a \textit{while loop}, which has a loop counter $depth$ denoting the |
1071 all-time depth of the recursion. Fixing a matching order, let $M$ |
1064 all-time depth of the recursion. Fixing a matching order, let $M$ |
1072 denote the array storing the all-time mapping. The initial state is |
1065 denote the array storing the all-time mapping. Based on Claim~\ref{claim:claimCoverFromLeft}, |
1073 associated with the empty mapping, which means that $\forall i: |
|
1074 M[i]=INVALID$ and $depth=0$. In case of a recursive call, $depth$ has |
|
1075 to be incremented, while in case of a return, it has to be |
|
1076 decremented. Based on Claim~\ref{claim:claimCoverFromLeft}, |
|
1077 $M$ is $INVALID$ from index $depth$+1 and not $INVALID$ before |
1066 $M$ is $INVALID$ from index $depth$+1 and not $INVALID$ before |
1078 $depth$, i.e. $\forall i: i < depth \Rightarrow M[i]\neq INVALID$ and |
1067 $depth$. $M[depth]$ changes |
1079 $\forall i: i > depth \Rightarrow M[i]= INVALID$. $M[depth]$ changes |
|
1080 while the state is being processed, but the property is held before |
1068 while the state is being processed, but the property is held before |
1081 both stepping back to a predecessor state and exploring a successor |
1069 both stepping back to a predecessor state and exploring a successor |
1082 state. |
1070 state. |
1083 |
1071 |
1084 The necessary part of the candidate set is easily maintainable or |
1072 The necessary part of the candidate set is easily maintainable or |
1098 E_{large}$. That is, each covered neighbour of $u$ has to be mapped to |
1086 E_{large}$. That is, each covered neighbour of $u$ has to be mapped to |
1099 a covered neighbour of $v$. |
1087 a covered neighbour of $v$. |
1100 |
1088 |
1101 Having said that, an algorithm running in $\Theta(deg)$ time is |
1089 Having said that, an algorithm running in $\Theta(deg)$ time is |
1102 describable if there exists a covered node in the component containing |
1090 describable if there exists a covered node in the component containing |
1103 $u$. In this case choose a covered neighbour $u'$ of $u$ arbitrarily |
1091 $u$, and a linear one other wise. |
1104 --- such a node exists based on |
|
1105 Claim~\ref{claim:MOclaim}. With all the candidates of $u$ |
|
1106 being among the uncovered neighbours of $Pair(M,u')$, there are solely |
|
1107 $deg(Pair(M,u'))$ nodes to check. |
|
1108 |
|
1109 An easy trick is to choose an $u'$, for which |
|
1110 $|\{uncovered\ neighbours\ $ $of\ Pair(M,u')\}|$ is the smallest |
|
1111 possible. |
|
1112 |
|
1113 Note that if $u$ is the first node of its component, then all the |
|
1114 uncovered nodes of $G_{large}$ are candidates, so giving a sublinear |
|
1115 method is impossible. |
|
1116 |
1092 |
1117 |
1093 |
1118 \subsubsection{Determining the node order} |
1094 \subsubsection{Determining the node order} |
1119 This section describes how the node order preprocessing method of |
1095 This section describes how the node order preprocessing method of |
1120 VF2++ can efficiently be implemented. |
1096 VF2++ can efficiently be implemented. |
1121 |
1097 |
1122 For using lookup tables, the node labels are associated with the |
1098 For using lookup tables, the node labels are associated with the |
1123 numbers $\{0,1,..,|K|-1\}$, where $K$ is the set of the labels. It |
1099 numbers $\{0,1,..,|K|-1\}$, where $K$ is the set of the labels. It |
1124 enables $F_\mathcal{M}$ to be stored in an array, for which |
1100 enables $F_\mathcal{M}$ to be stored in an array. At first, the node order |
1125 $F_\mathcal{M}[i]=F_\mathcal{M}(i)$ where $i=0,1,..,|K|-1$. At first, |
|
1126 $\mathcal{M}=\emptyset$, so $F_\mathcal{M}[i]$ is the number of nodes |
1101 $\mathcal{M}=\emptyset$, so $F_\mathcal{M}[i]$ is the number of nodes |
1127 in $V_{small}$ having label i, which is easy to compute in |
1102 in $V_{small}$ having label i, which is easy to compute in |
1128 $\Theta(|V_{small}|)$ steps. |
1103 $\Theta(|V_{small}|)$ steps. |
1129 |
1104 |
1130 $\mathcal{M}\subseteq V_{small}$ can be represented as an array of |
1105 Representing $\mathcal{M}\subseteq V_{small}$ as an array of |
1131 size $|V_{small}|$. |
1106 size $|V_{small}|$, both the computation of the BFS tree, and processing its levels by Algorithm~\ref{alg:VF2PPProcess1} can be done inplace by swapping nodes. |
1132 |
|
1133 The BFS tree is computed by using a FIFO data structure which is |
|
1134 usually implemented as a linked list, but one can avoid it by using |
|
1135 the array $\mathcal{M}$ itself. $\mathcal{M}$ contains all the nodes |
|
1136 seen before, a pointer shows where the first node of the FIFO is, and |
|
1137 another one shows where the next explored node has to be inserted. So |
|
1138 the nodes of each level of the BFS tree can be processed by |
|
1139 Algorithms~\ref{alg:VF2PPProcess1} in place by swapping nodes. |
|
1140 |
1107 |
1141 After a node $u$ gets to the next place of the node order, |
1108 After a node $u$ gets to the next place of the node order, |
1142 $F_\mathcal{M}[lab[u]]$ has to be decreased by one, because there is |
1109 $F_\mathcal{M}[lab[u]]$ has to be decreased by one, because there is |
1143 one less covered node in $V_{large}$ with label $lab(u)$, that is why |
1110 one less covered node in $V_{large}$ with label $lab(u)$, that is why |
1144 min selection sort is preferred which gives the elements from left to |
1111 min selection sort is preferred which gives the elements from left to |
1145 right in descending order, see Algorithm~\ref{alg:VF2PPProcess1}. |
1112 right in descending order, see Algorithm~\ref{alg:VF2PPProcess1}. |
1146 |
|
1147 Note that the \textit{while loop} of Algorithm~\ref{alg:VF2PPPseu} |
|
1148 takes one iteration per graph component and the graphs in biology are |
|
1149 mostly connected. |
|
1150 |
1113 |
1151 \subsubsection{Cutting rules} |
1114 \subsubsection{Cutting rules} |
1152 In Section~\ref{VF2PPCuttingRules}, the cutting rules were |
1115 In Section~\ref{VF2PPCuttingRules}, the cutting rules were |
1153 described using the sets $T_{small}$, $T_{large}$, $\tilde T_{small}$ |
1116 described using the sets $T_{small}$, $T_{large}$, $\tilde T_{small}$ |
1154 and $\tilde T_{large}$, which are dependent on the all-time mapping |
1117 and $\tilde T_{large}$, which are dependent on the all-time mapping |
1162 $L[lab(u')]$ for all $u'\in \Gamma_{small}(u) \cap T_{small}(s)$ and |
1125 $L[lab(u')]$ for all $u'\in \Gamma_{small}(u) \cap T_{small}(s)$ and |
1163 decrementing $L[lab(v')]$ for all $v'\in\Gamma_{large} (v) \cap |
1126 decrementing $L[lab(v')]$ for all $v'\in\Gamma_{large} (v) \cap |
1164 T_{large}(s)$, the first part of the cutting rules is checkable in |
1127 T_{large}(s)$, the first part of the cutting rules is checkable in |
1165 $\Theta(deg)$ time by considering the proper signs of $L$. Setting $L$ |
1128 $\Theta(deg)$ time by considering the proper signs of $L$. Setting $L$ |
1166 to zero takes $\Theta(deg)$ time again, which makes it possible to use |
1129 to zero takes $\Theta(deg)$ time again, which makes it possible to use |
1167 the same table through the whole algorithm. The second part of the |
1130 the same table through the whole algorithm. The second part of the |
1168 cutting rules can be verified using the same method with $\tilde |
1131 cutting rules can be verified using the same method with $\tilde |
1169 T_{small}$ and $\tilde T_{large}$ instead of $T_{small}$ and |
1132 T_{small}$ and $\tilde T_{large}$ instead of $T_{small}$ and |
1170 $T_{large}$. Thus, the overall complexity is $\Theta(deg)$. |
1133 $T_{large}$. Thus, the overall complexity is $\Theta(deg)$. |
1171 |
1134 |
1172 An other integer lookup table storing the number of covered neighbours |
1135 An other integer lookup table storing the number of covered neighbours |