
Raita designed an algorithm that, at each attempt, first compares the last character of the pattern with the rightmost text character of the window. If they match, it compares the first character of the pattern with the leftmost text character of the window; if those match too, it compares the middle character of the pattern with the middle text character of the window. Only when all three match does it compare the remaining characters, from the second to the last but one, possibly comparing the middle character a second time.
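The comparison order described above can be sketched in C as follows. This is a minimal illustration, not Raita's original code: the function name `raita_window_matches` and its parameters are hypothetical, and the sketch assumes a non-empty pattern aligned with a text window of the same length.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: returns 1 if the pattern x (length m) matches the
   text window w (also length m), testing characters in Raita's order. */
int raita_window_matches(const char *x, size_t m, const char *w)
{
    assert(m >= 1);                      /* assumption: non-empty pattern */

    if (x[m - 1] != w[m - 1]) return 0;  /* 1. last character   */
    if (x[0] != w[0])         return 0;  /* 2. first character  */
    if (x[m / 2] != w[m / 2]) return 0;  /* 3. middle character */

    if (m <= 2) return 1;  /* first and last already cover the whole pattern */

    /* 4. characters from the second to the last but one; for m >= 3 this
          range contains the middle character, which is compared again */
    return memcmp(x + 1, w + 1, m - 2) == 0;
}
```

Most windows are rejected by one of the first three tests, so the `memcmp` call is reached only for promising candidates.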

Raita observed that his algorithm behaved well in practice when searching for patterns in English texts, and attributed this performance to the existence of character dependencies.
Smith ran further experiments and concluded that this phenomenon may instead be due to compiler effects.

The preprocessing phase of the Raita algorithm consists in computing the bad-character shift function. It can be done in O(m+sigma) time and O(sigma) space, where m is the length of the pattern and sigma the size of the alphabet.
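A minimal sketch of this preprocessing in C is given below. The function name `pre_bmbc` and the choice of `ASIZE` = 256 (one table entry per byte value) are assumptions made for illustration; the table name `bmBc` follows the example later in this section.

```c
#include <assert.h>
#include <stddef.h>

#define ASIZE 256   /* assumed alphabet size (sigma): one entry per byte value */

/* Fills bmBc with the bad-character shifts for pattern x of length m.
   Runs in O(m + ASIZE) time and uses O(ASIZE) space. */
void pre_bmbc(const char *x, size_t m, size_t bmBc[ASIZE])
{
    size_t i;

    assert(m >= 1);                 /* assumption: non-empty pattern */

    /* A character absent from x[0..m-2] shifts the window by its full length. */
    for (i = 0; i < ASIZE; ++i)
        bmBc[i] = m;

    /* Otherwise shift so that the rightmost occurrence of the character in
       x[0..m-2] lines up with the text character that was under the last
       position of the window. */
    for (i = 0; i + 1 < m; ++i)
        bmBc[(unsigned char)x[i]] = m - 1 - i;
}
```

For the pattern GCAGAGAG used in the example, this sketch produces the shifts 1 for A, 2 for G and 8 for T that appear in the searching phase.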

The searching phase of the Raita algorithm has quadratic worst-case time complexity.


Main features:

  • first compares the last pattern character, then the first and finally the middle one before actually comparing the others;
  • performs the shifts like the Horspool algorithm;
  • preprocessing phase in O(m+sigma) time and O(sigma) space complexity;
  • searching phase in O(mn) time complexity.
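Putting these features together, a complete search might look like the following sketch. The interface (`raita_search`, the output array `out`, and its capacity `max_out`) is hypothetical and chosen only for illustration, and `ASIZE` = 256 is an assumed alphabet size.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define ASIZE 256   /* assumed alphabet size: one entry per byte value */

/* Hypothetical interface: stores up to max_out starting positions of the
   pattern x (length m) in the text y (length n) into out, and returns the
   total number of occurrences found. */
size_t raita_search(const char *x, size_t m, const char *y, size_t n,
                    size_t *out, size_t max_out)
{
    size_t bmBc[ASIZE], i, j, count = 0;

    assert(m >= 1);                 /* assumption: non-empty pattern */
    if (m > n) return 0;

    /* Preprocessing: Horspool bad-character shifts. */
    for (i = 0; i < ASIZE; ++i) bmBc[i] = m;
    for (i = 0; i + 1 < m; ++i) bmBc[(unsigned char)x[i]] = m - 1 - i;

    /* Searching: j is the left end of the current window. */
    for (j = 0; j <= n - m; ) {
        unsigned char c = (unsigned char)y[j + m - 1];  /* rightmost window char */

        if ((unsigned char)x[m - 1] == c &&             /* last character   */
            x[0] == y[j] &&                             /* first character  */
            x[m / 2] == y[j + m / 2] &&                 /* middle character */
            (m <= 2 || memcmp(x + 1, y + j + 1, m - 2) == 0)) {
            if (count < max_out) out[count] = j;
            ++count;
        }
        j += bmBc[c];   /* the shift depends only on the rightmost window char */
    }
    return count;
}
```

Because every shift is at least 1 and a window can cost up to about m comparisons, a text and pattern made of a single repeated character drive this sketch to the O(mn) worst case.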

Example:

Preprocessing phase

a        A  C  G  T
bmBc[a]  1  6  2  8

bmBc table used by the Raita algorithm (the same table as for the Horspool algorithm).

Searching phase

First attempt
GCATCGCAGAGAGTATACAGTACG
       1
GCAGAGAG

Shift by: 1 (bmBc[A])

Second attempt
GCATCGCAGAGAGTATACAGTACG
 2      1 
 GCAGAGAG 

Shift by: 2 (bmBc[G])

Third attempt
GCATCGCAGAGAGTATACAGTACG
   2      1
   GCAGAGAG

Shift by: 2 (bmBc[G])

Fourth attempt
GCATCGCAGAGAGTATACAGTACG
     24563891
     GCAGAGAG

(The middle character is compared twice, as comparisons 3 and 7; only 3 is shown.)

Shift by: 2 (bmBc[G])

Fifth attempt
GCATCGCAGAGAGTATACAGTACG
              1
       GCAGAGAG

Shift by: 1 (bmBc[A])

Sixth attempt
GCATCGCAGAGAGTATACAGTACG
               1
        GCAGAGAG

Shift by: 8 (bmBc[T])

Seventh attempt
GCATCGCAGAGAGTATACAGTACG
                2      1
                GCAGAGAG

Shift by: 2 (bmBc[G])

The Raita algorithm performs 18 character comparisons on the example.
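The comparison count can be checked with a small instrumented sketch. The function `raita_count_comparisons` is hypothetical; it follows the order described at the top of this section (last, first, middle, then the characters from the second to the last but one) and counts every character comparison, including the repeated comparison of the middle character.

```c
#include <assert.h>
#include <stddef.h>

#define ASIZE 256   /* assumed alphabet size: one entry per byte value */

/* Hypothetical helper: returns the number of character comparisons made
   while searching for x (length m) in y (length n). The three initial
   probes are counted separately even when they coincide for very short
   patterns. */
size_t raita_count_comparisons(const char *x, size_t m,
                               const char *y, size_t n)
{
    size_t bmBc[ASIZE], i, j, cmp = 0;

    assert(m >= 1);                 /* assumption: non-empty pattern */
    if (m > n) return 0;

    /* Preprocessing: Horspool bad-character shifts. */
    for (i = 0; i < ASIZE; ++i) bmBc[i] = m;
    for (i = 0; i + 1 < m; ++i) bmBc[(unsigned char)x[i]] = m - 1 - i;

    /* The shift in the loop header also runs after each `continue`. */
    for (j = 0; j <= n - m; j += bmBc[(unsigned char)y[j + m - 1]]) {
        const char *w = y + j;      /* current window */

        ++cmp; if (x[m - 1] != w[m - 1]) continue;   /* last   */
        ++cmp; if (x[0] != w[0])         continue;   /* first  */
        ++cmp; if (x[m / 2] != w[m / 2]) continue;   /* middle */

        /* Second to last-but-one character, stopping at the first mismatch. */
        for (i = 1; i + 1 < m; ++i) {
            ++cmp;
            if (x[i] != w[i]) break;
        }
    }
    return cmp;
}
```

Called with the pattern GCAGAGAG and the text GCATCGCAGAGAGTATACAGTACG, the sketch counts 1, 2, 2, 9, 1, 1 and 2 comparisons over the seven attempts, for a total of 18.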

C