Name
  • String
  • Substring
  • Search
Edit
    Searching a word x with an automaton consists first in building the minimal Deterministic Finite Automaton (DFA)  A(x) recognizing the language *x.
The DFA  A(x) =(Q, q0, T, E) recognizing the language *x is defined as follows:
  • Q is the set of all the prefixes of x: Q={, x[0], x[0 .. 1], ... , x[0 .. m-2], x};
  • q0=E; 
  • T={x}; 
  • for q in Q (q is a prefix of x) and a in , (q, a, qa) is in E if and only if qa is also a prefix of x, otherwise (q, a, p) is in E such that p is the longest suffix of qa which is a prefix of x.

    The DFA  A(x) can be constructed in O(m+) time and O(m) space.

    Once the DFA  A(x) is build, searching for a word x in a text y consists in parsing the text y with the DFA  A(x) beginning with the initial state q0. Each time the terminal state is encountered an occurrence of x is reported.

    The searching phase can be performed in O(n) time if the automaton is stored in a direct access table, in O(nlog()) otherwise.


Main features:

  • builds the minimal deterministic automaton recognizing the language *x;
  • extra space in O(m) if the automaton is stored in a direct access table;
  • preprocessing phase in O(m) time complexity;
  • searching phase in O(n) time complexity if the automaton is stored in a direct access table, O(nlog()) otherwise.
C