Approximate string matching
Finding strings that approximately match a pattern.

Overview
The closeness of a match is measured by the number of primitive operations needed to convert the string into an exact match. This number is called the edit distance between the string and the pattern. The usual primitive operations are:

insertion: cot → coat
deletion: coat → cot
substitution: coat → cost

These three operations may be generalized as forms of substitution by adding a NULL character (here symbolized by *) wherever a character has been deleted or inserted:

insertion: co*t → coat
deletion: coat → co*t
substitution: coat → cost

Some approximate matchers also treat transposition, in which the positions of two letters in the string are swapped, as a primitive operation.

transposition: cost → cots

Different approximate matchers impose different constraints.
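The edit distance described above can be sketched with a standard dynamic-programming table. This is a minimal illustration, not a specific matcher's implementation: the function name `edit_distance` and the `allow_transposition` flag are assumptions introduced here, every operation is assumed to cost 1, and the transposition case only handles swaps of two adjacent characters.

```python
def edit_distance(source: str, target: str, allow_transposition: bool = False) -> int:
    """Return the minimum number of primitive operations turning source into target.

    Hypothetical helper for illustration: unit cost per operation, and
    transposition (if enabled) is limited to two adjacent characters.
    """
    rows, cols = len(source) + 1, len(target) + 1
    # dist[i][j] holds the distance between source[:i] and target[:j].
    dist = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        dist[i][0] = i  # delete all i characters of the source prefix
    for j in range(cols):
        dist[0][j] = j  # insert all j characters of the target prefix

    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if source[i - 1] == target[j - 1] else 1
            dist[i][j] = min(
                dist[i - 1][j] + 1,         # deletion
                dist[i][j - 1] + 1,         # insertion
                dist[i - 1][j - 1] + cost,  # substitution (or exact match)
            )
            if (
                allow_transposition
                and i > 1
                and j > 1
                and source[i - 1] == target[j - 2]
                and source[i - 2] == target[j - 1]
            ):
                # Swap of two adjacent characters counted as one operation.
                dist[i][j] = min(dist[i][j], dist[i - 2][j - 2] + 1)
    return dist[-1][-1]


print(edit_distance("cot", "coat"))    # 1 (insertion)
print(edit_distance("coat", "cot"))    # 1 (deletion)
print(edit_distance("coat", "cost"))   # 1 (substitution)
print(edit_distance("cost", "cots"))   # 2 (no transposition allowed)
print(edit_distance("cost", "cots", allow_transposition=True))  # 1
```

The last two calls show how the choice of primitive operations changes the result: without transposition, cost → cots needs two operations, while a matcher that counts a swap as primitive reports a distance of one.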
Problem formulation and algorithms
One possible definition of the approximate string matching problem is the following: Given a pattern string.

Best Fuzzy Matching Algorithm.
Efficient Top-k Algorithms for Fuzzy Search in String Collections.
Robust and Efficient Fuzzy Match for Online Data Cleaning.
Fuzzy matching scoring algorithm.