Name

Edit

The Boyer-Moore type algorithms match some suffixes of the pattern but it is possible to match some prefixes of the pattern by scanning the character of the window from right to left and then improve the length of the shifts. This is made possible by the use of the smallest suffix automaton (also called DAWG for Directed Acyclic Word Graph) of the reverse pattern. The resulting algorithm is called the Reverse Factor algorithm.

The smallest suffix automaton of a word *w* is a Deterministic Finite Automaton * S*(

During the searching phase, the Reverse Factor algorithm parses the characters of the window from right to left with the automaton * S*(

The Reverse Factor algorithm has a quadratic worst case time complexity but it is optimal in average. It performs * O*(

**Main features:**

- uses the suffix automaton of
*x*^{R}; - fast on practice for long pattens and small alphabets;
- preprocessing phase in O(m) time and space complexity;
- searching phase in O(mn) time complexity;
- optimal in the average.

**Example:**

Preprocessing phase

Suffix automaton used by Reverse Factor algorithm.

Searching phase

G | C | A | T | C | G | C | A | G | A | G | A | G | T | A | T | A | C | A | G | T | A | C | G |

* | 8 | 7 | 2 | ||||||||||||||||||||

G | C | A | G | A | G | A | G |

Shift by: 5 (8 - 3)

G | C | A | T | C | G | C | A | G | A | G | A | G | T | A | T | A | C | A | G | T | A | C | G |

* | 8 | 7 | 6 | 5 | 4 | 3 | 2 | 1 | |||||||||||||||

- | G | C | A | G | A | G | A | G |

Shift by: 7 (8 - 1)

G | C | A | T | C | G | C | A | G | A | G | A | G | T | A | T | A | C | A | G | T | A | C | G |

* | 7 | 2 | 1 | ||||||||||||||||||||

G | C | A | G | A | G | A | G |

Shift by: 7 (8 - 1)

The Reverse Factor algorithm performs 17 character comparisons on the example.

C