
Reference


Approximate string matching. A fuzzy MediaWiki search for "angry emoticon" returns the suggestion "Did you mean: andré emotions". The closeness of a match is measured in terms of the number of primitive operations necessary to convert the string into an exact match.

Approximate string matching

This number is called the edit distance between the string and the pattern. The usual primitive operations are:[1]

Dynamic programming. In mathematics, computer science, economics, and bioinformatics, dynamic programming is a method for solving complex problems by breaking them down into simpler subproblems.
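The edit distance described above is a classic problem for dynamic programming, so one sketch can illustrate both ideas. The Python example below is a minimal illustration, not a reference implementation. It assumes the primitive operations are single-character insertion, deletion, and substitution, each with cost 1; the excerpt's own list is cut off, so that choice and the function name `edit_distance` are assumptions.

```python
def edit_distance(source: str, target: str) -> int:
    """Edit distance between two strings.

    Assumption: the allowed operations are insertion, deletion, and
    substitution of one character, each costing 1.
    """
    # previous[j] holds the distance between the first i-1 characters of
    # source and the first j characters of target.
    previous = list(range(len(target) + 1))
    for i, s_char in enumerate(source, start=1):
        # Turning the first i characters of source into "" takes i deletions.
        current = [i]
        for j, t_char in enumerate(target, start=1):
            substitution = previous[j - 1] + (s_char != t_char)
            deletion = previous[j] + 1
            insertion = current[j - 1] + 1
            current.append(min(substitution, deletion, insertion))
        previous = current
    return previous[-1]


if __name__ == "__main__":
    print(edit_distance("kitten", "sitting"))
```

Each table cell is a subproblem (the distance between two prefixes) that is computed once and reused by its neighbours. Keeping only two rows at a time holds memory proportional to the length of `target`.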

Dynamic programming

It is applicable to problems exhibiting the properties of overlapping subproblems[1] and optimal substructure (described below). When applicable, the method takes far less time than naive methods that don't take advantage of the subproblem overlap (like depth-first search).

Full text search. In text retrieval, full-text search refers to techniques for searching a single computer-stored document or a collection in a full text database.

Full text search

Full-text search is distinguished from searches based on metadata or on parts of the original texts represented in databases (such as titles, abstracts, selected sections, or bibliographical references). In a full-text search, a search engine examines all of the words in every stored document as it tries to match search criteria (text specified by a user).
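The contrast between full-text and metadata-based search can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not how any particular search engine works: the documents in `SAMPLE_DOCUMENTS`, the helper names, and the "every query word must appear" matching rule are all hypothetical.

```python
import re

# Hypothetical sample collection, used only for illustration.
SAMPLE_DOCUMENTS = {
    "d1": {
        "title": "Edit distance",
        "body": "Dynamic programming computes the distance between two strings.",
    },
    "d2": {
        "title": "Search engines",
        "body": "A search engine examines every word of each stored document.",
    },
}


def tokenize(text):
    """Split text into lowercase words."""
    return re.findall(r"\w+", text.lower())


def full_text_search(documents, query):
    """Return ids of documents whose title or body contains every query word."""
    terms = set(tokenize(query))
    return [
        doc_id
        for doc_id, doc in documents.items()
        if terms <= set(tokenize(doc["title"] + " " + doc["body"]))
    ]


def title_search(documents, query):
    """Metadata-only search: look at titles and ignore the body text."""
    terms = set(tokenize(query))
    return [
        doc_id
        for doc_id, doc in documents.items()
        if terms <= set(tokenize(doc["title"]))
    ]


if __name__ == "__main__":
    print(full_text_search(SAMPLE_DOCUMENTS, "dynamic programming"))
    print(title_search(SAMPLE_DOCUMENTS, "dynamic programming"))
```

The query "dynamic programming" matches a document under `full_text_search` because the words occur in its body, while `title_search` finds nothing because no title contains them. Scanning every document on each query is only practical for small collections.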

Full-text-searching techniques became common in online bibliographic databases in the 1990s[verification needed]. Many websites and application programs (such as word processing software) provide full-text-search capabilities. Some web search engines, such as AltaVista, employ full-text-search techniques, while others index only a portion of the web pages examined by their indexing systems.[1]