In information theory and computer science, the Levenshtein distance is a string metric for measuring the difference between two sequences. Informally, the Levenshtein distance between two words is the minimum number of single-character edits (insertion, deletion, substitution) required to change one word into the other. The phrase edit distance is often used to refer specifically to Levenshtein distance.
The Levenshtein distance between two strings is defined as the minimum number of edits needed to transform one string into the other, with the allowable edit operations being insertion, deletion, or substitution of a single character. It is named after Vladimir Levenshtein, who considered this distance in 1965. It is closely related to pairwise string alignments.
Mathematically, the Levenshtein distance between two strings is given by where
Note that the first element in the minimum corresponds to deletion (from to ), the second to insertion and the third to match or mismatch, depending on whether the respective symbols are the same.
For example, the Levenshtein distance between "kitten" and "sitting" is 3, since the following three edits change one into the other, and there is no way to do it with fewer than three edits:
The Levenshtein distance has several simple upper and lower bounds that are useful in applications which are applied with many of them and compare them. These include: