Name
• String
• Substring
• Search
Edit

The character comparisons are done using a specific order given by a table h.

For each integer i such that  i  m we define two disjoint sets:
 Pos(i)={k :  0  k  i and x[i] = x[i-k]}
 Neg(i)={k :  0  k  i and x[i]  x[i-k]}

For  k  m, let hmin[k] be the minimum integer  such that   k-1 and k not in Neg(i) for all i such that  < i  m-1.

For    m-1, let kmin[] be the minimum integer k such that hmin[k]=  k if any such k exists and kmin[]=0 otherwise.

For    m-1, let rmin[] be the minimum integer k such that r >  and hmin[r]=r-1.

The value of h[0] is set to m-1. After that we choose in increasing order of kmin[], all the indexes h[1], ... , h[d] such that kmin[h[i]]  0 and we set rcGs[i] to kmin[h[i]] for  i  d. Then we choose the indexes h[d+1], ... , h[m-1] in increasing order and we set rcGs[i] to rmin[h[i]] for d < i < m.

The value of rcGs[m] is set to the period of x.

The table rcBc is defined as follows: rcBc[as]=min{k :  (k=m or x[m-k-1]=a) and (k > m-s-1 or x[m-k-s-1]=x[m-s-1])} To compute the table rcBc we define: for each c in locc[c] is the index of the rightmost occurrence of c in x[0 .. m-2] (locc[c] is set to -1 if c does not occur in x[0 .. m-2]).

A table link is used to link downward all the occurrences of each pattern character.

The preprocessing phase can be performed in O(m2) time and O(m) space complexity. The searching phase is in O(n) time complexity.

Main features:

• refinement of the Boyer-Moore algorithm;
• partitions the set of pattern positions into two disjoint subsets;
• preprocessing phase in O(m2) time and O(m) space;
• searching phase in O(n) time complexity;
• 2n text character comparisons in the worst case.

Example:

Preprocessing phase

Tables used by Reverse Colussi algorithm

Searching phase

 G C A T C G C A G A G A G T A T A C A G T A C G 1 G C A G A G A G

Shift by: 1 (rcBc[A][8])

 G C A T C G C A G A G A G T A T A C A G T A C G 2 1 G C A G A G A G

Shift by: 2 (rcGs[2])

 G C A T C G C A G A G A G T A T A C A G T A C G 2 1 G C A G A G A G

Shift by: 2 (rcGs[2])

 G C A T C G C A G A G A G T A T A C A G T A C G 5 6 7 2 8 3 4 1 G C A G A G A G

Shift by: 7 (rcGs[9])

 G C A T C G C A G A G A G T A T A C A G T A C G 2 1 G C A G A G A G

Shift by: 2 (rcGs[2])

 G C A T C G C A G A G A G T A T A C A G T A C G 1 G C A G A G A G

Shift by: 5 (rcBc[A][2])

The Reverse Colussi algorithm performs 16 character comparisons on the example.

C