
Ling_soft


Lingro: multilingual dictionary and language learning site.

How to Write a Spelling Corrector. In the past week, two friends (Dean and Bill) independently told me they were amazed at how Google does spelling correction so well and quickly. Type in a search like [speling] and Google comes back in 0.1 seconds or so with "Did you mean: spelling". (Yahoo and Microsoft are similar.) What surprised me is that I thought Dean and Bill, being highly accomplished engineers and mathematicians, would have good intuitions about statistical language processing problems such as spelling correction. But they didn't, and come to think of it, there's no reason they should: it was my expectations that were faulty, not their knowledge. I figured they and many others could benefit from an explanation. So here, in 21 lines of Python 2.5 code, is the complete spelling corrector:

    import re, collections

    def words(text):
        return re.findall('[a-z]+', text.lower())

    def train(features):
        model = collections.defaultdict(lambda: 1)
        for f in features:
            model[f] += 1
        return model

    alphabet = 'abcdefghijklmnopqrstuvwxyz'
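The excerpt stops before the corrector itself, so the following is only a minimal sketch of what the two quoted functions do, written for Python 3 rather than the Python 2.5 mentioned above. `words` and `train` come from the excerpt; the sample text and the `most_likely` helper are assumptions added for illustration and are not part of the original program.

```python
import re
import collections

def words(text):
    # Lowercase the text and split it into runs of the letters a-z.
    return re.findall('[a-z]+', text.lower())

def train(features):
    # Count occurrences, starting every word at 1 so that
    # unseen words get a small nonzero count instead of 0.
    model = collections.defaultdict(lambda: 1)
    for f in features:
        model[f] += 1
    return model

# Hypothetical sample text (an assumption; the excerpt does not show its corpus).
SAMPLE_TEXT = "Spelling is hard. Good spelling takes practice; spelling matters."
model = train(words(SAMPLE_TEXT))

def most_likely(candidates, model):
    # Hypothetical helper: pick the candidate with the highest count.
    # Looking up an unseen word in the defaultdict yields the default of 1.
    return max(candidates, key=lambda w: model[w])

print(model['spelling'])                             # seen three times, stored as 4
print(most_likely(['spelling', 'spewing'], model))   # spelling
```

Because every count starts at 1, a word seen three times is stored as 4 and a word never seen is treated as 1, which keeps unknown candidates from being ruled out entirely.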

Language Identifier – Polyglot 3000 :: Likasoft.

English grammar writing software with English grammar checker wr.