Name

Edit

The pattern *x* is factorized in two parts *x*_{} and *x*_{r} such that *x*=*x*_{}x_{r}. Then the search phase of the Two Way algorithm consists in comparing the characters of *x*_{r} from left to right and then, if no mismatch occurs during that first stage, in comparing the characters of *x*_{} from right to left in a second stage.

The preprocessing phase of the algorithm consists then in choosing a good **factorization***x*_{}x_{r}.

**Definition**

Let (*u*, *v*) be a factorization of *x*. A * repetition* in (

*w*is a suffix of*u*or*u*is a suffix of*w*;*w*is a prefix of*v*of*v*is a prefix of*w*.

In other words *w* occurs at both sides of the cut between *u* and *v* with a possible overflow on either side. The length of a repetition in (*u*,*v*) is called a * local period* and the length of the smallest repetition in (

Each factorization (*u*,*v*) of has at least one repetition. It can be easily seen that 1 *r*(*u*,*v*) |*x*|

A factorization (*u*,*v*) of *x* such that *r*(*u*,*v*)=*per*(*x*) is called a * critical factorization* of

If (*u*,*v*) is a critical factorization of *x* then at the position |*u*| in *x* the global and the local periods are the same. The Two Way algorithm chooses the critical factorization (*x*_{},*x*_{r}) such that |*x*_{}| < *per*(*x*) and |*x*_{}| is minimal.

To compute the critical factorization *x*_{},*x*_{r} of *x* we first compute the maximal suffix *z* of *x* for the order and the maximal suffix for the reverse order . Then (*x*_{},*x*_{r}) is chosen such that. |*x*_{}|=*max*{|*z*|,||}

The preprocessing phase can be done in * O*(

The searching phase of the Two Way algorithm consists in first comparing the character of *x*_{r} from left to right, then the character of *x*_{} from right to left.

When a mismatch occurs when scanning the *k*-th character of *x*_{r}, then a shift of length *k* is performed.

When a mismatch occurs when scanning *x*_{}, or when an occurrence of the pattern is found, then a shift of length *per*(*x*) is performed.

Such a scheme leads to a quadratic worst case algorithm, this can be avoided by a prefix memorization: when a shift of length *per*(*x*) is performed the length of the matching prefix of the pattern at the beginning of the window (namely *m*-*per*(*x*)) after the shift is memorized to avoid to scan it again during the next attempt.

The searching phase of the Two Way algorithm can be done in * O*(

The Two Way algorithm performs 2*n*-*m* text character comparisons in the worst case. Breslauer designed a variation on the Two Way algorithm which performs less than 2*n*-*m* comparisons using constant space.

**Main features:**

- requires an ordered alphabet;
- preprocessing phase in O(m) time and constant space complexity;
- constant space complexity for the preprocessing phase;
- searching phase in O(n) time;
- performs 2n-m text character comparisons in the worst case.

**Example:**

Preprocessing phase

Factorisation used by Two Way algorithm.

Searching phase

G | C | A | T | C | G | C | A | G | A | G | A | G | T | A | T | A | C | A | G | T | A | C | G |

1 | 2 | ||||||||||||||||||||||

G | C | A | G | A | G | A | G |

Shift by: 2

G | C | A | T | C | G | C | A | G | A | G | A | G | T | A | T | A | C | A | G | T | A | C | G |

1 | |||||||||||||||||||||||

G | C | A | G | A | G | A | G |

Shift by: 1

G | C | A | T | C | G | C | A | G | A | G | A | G | T | A | T | A | C | A | G | T | A | C | G |

1 | |||||||||||||||||||||||

G | C | A | G | A | G | A | G |

Shift by: 1

G | C | A | T | C | G | C | A | G | A | G | A | G | T | A | T | A | C | A | G | T | A | C | G |

1 | |||||||||||||||||||||||

G | C | A | G | A | G | A | G |

Shift by: 1

G | C | A | T | C | G | C | A | G | A | G | A | G | T | A | T | A | C | A | G | T | A | C | G |

7 | 8 | 1 | 2 | 3 | 4 | 5 | 6 | ||||||||||||||||

G | C | A | G | A | G | A | G |

Shift by: 7

G | C | A | T | C | G | C | A | G | A | G | A | G | T | A | T | A | C | A | G | T | A | C | G |

1 | 2 | ||||||||||||||||||||||

G | C | A | G | A | G | A | G |

Shift by: 2

G | C | A | T | C | G | C | A | G | A | G | A | G | T | A | T | A | C | A | G | T | A | C | G |

1 | 2 | ||||||||||||||||||||||

G | C | A | G | A | G | A | G |

Shift by: 2

G | C | A | T | C | G | C | A | G | A | G | A | G | T | A | T | A | C | A | G | T | A | C | G |

1 | 2 | 3 | |||||||||||||||||||||

G | C | A | G | A | G | A | G |

Shift by: 3

The Two Way algorithm performs 20 character comparisons on the example.

C