Name

Edit

The Boyer-Moore type algorithms match some suffixes of the pattern but it is possible to match some prefixes of the pattern by scanning the character of the window from right to left and then improve the length of the shifts. This is make possible by the use of the suffix oracle of the reverse pattern. This data structure is a very compact automaton which recognizes at least all the suffixes of a word and slightly more other words The string-matching algorithm using the * oracle* of the reverse pattern is called the Backward Oracle Matching algorithm.

The suffix oracle of a word *w* is a Deterministic Finite Automaton * O*(

The language accepted by * O*(

The preprocessing phase of the Backward Oracle Matching algorithm consists in computing the suffix oracle for the reverse pattern *x*^{R}. Despite the fact that it is able to recognize words that are not factor of the pattern, the suffix oracle can be used to do string-matching since the only word of length greater or equal *m* which is recognized by the oracle is the reverse pattern itself. The computation of the oracle is linear in time and space in the length of the pattern.

During the searching phase the Backward Oracle Matching algorithm parses the characters of the window from right to left with the automaton * O*(

The Backward Oracle Matching algorithm has a quadratic worst case time complexity but it is optimal in average. On the average it performs * O*(

**Main features:**

version of the Reverse Factor algorithm using the suffix oracle of *x*^{R} instead of the suffix automaton of *x*^{R};

fast in practice for very long patterns and small alphabets;

preprocessing phase in O(m) time and space complexity;

searching phase in O(mn) time complexity;

optimal in the average.

**Example:**

Preprocessing phase

Oracle used by Backward Oracle Matching algorithm.

Searching phase

G | C | A | T | C | G | C | A | G | A | G | A | G | T | A | T | A | C | A | G | T | A | C | G |

* | 8 | 7 | 2 | ||||||||||||||||||||

G | C | A | G | A | G | A | G |

Shift by: 5 (8 - 3)

G | C | A | T | C | G | C | A | G | A | G | A | G | T | A | T | A | C | A | G | T | A | C | G |

* | 8 | 7 | 6 | 5 | 4 | 3 | 2 | 1 | |||||||||||||||

- | G | C | A | G | A | G | A | G |

Shift by: 7 (8 - 1)

G | C | A | T | C | G | C | A | G | A | G | A | G | T | A | T | A | C | A | G | T | A | C | G |

* | 7 | 2 | 1 | ||||||||||||||||||||

G | C | A | G | A | G | A | G |

Shift by: 7 (8 - 1)

The Backward Oracle Matching algorithm algorithm performs 17 character comparisons on the example.

C