OCR

OCR for Linux: Teaching Linux to Read. Rod Smith covers the optical character recognition (OCR) options for Linux, their limitations, and how to install and use Tesseract for your OCR needs on Linux. Computers are excellent number-crunching machines, but they’ve traditionally been very poor at dealing with the “fuzzier” everyday world at which humans excel. Ask a computer to add a thousand numbers and it wouldn’t blink an eye if it had one; however, ask a computer to read those thousand numbers from a sheet of paper and you’ll run into problems.

Even with a scanner attached, a computer will have a hard time recognizing printed numbers (or, generalizing a bit, letters and punctuation) for what they are, a task that even kindergarten children can master. The software that attempts to teach computers about the printed alphabet and words is known as Optical Character Recognition (OCR) software. In some cases the OCR software can use a scanner directly, bypassing the need to store a file on disk.

Ocropus - Google Code. Tesseract-ocr - Google Code. Tesseract. GOCR.
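
Tesseract, one of the tools named above, is normally driven from the command line in the form `tesseract IMAGE OUTPUTBASE -l LANG`. As a rough sketch of how that could be scripted, the Python below wraps the call; the helper names `build_tesseract_command` and `ocr_image`, the default output base `ocr_output`, and the file name `scan.png` are illustrative assumptions, not anything taken from the article, and the code assumes the `tesseract` executable is already installed and on the PATH.

```python
import shutil
import subprocess
from pathlib import Path


def build_tesseract_command(image_path, output_base, lang="eng"):
    """Return the argument list for: tesseract IMAGE OUTPUTBASE -l LANG."""
    return ["tesseract", str(image_path), str(output_base), "-l", lang]


def ocr_image(image_path, output_base="ocr_output", lang="eng"):
    """Run Tesseract on an image file and return the recognized text.

    Hypothetical helper: raises RuntimeError if tesseract is not on PATH.
    """
    if shutil.which("tesseract") is None:
        raise RuntimeError("tesseract executable not found on PATH")
    # check=True raises CalledProcessError if Tesseract exits with an error.
    subprocess.run(build_tesseract_command(image_path, output_base, lang), check=True)
    # Tesseract appends ".txt" to the output base name it is given.
    return Path(f"{output_base}.txt").read_text(encoding="utf-8")
```

With a scanned page saved as `scan.png` (a placeholder name), `print(ocr_image("scan.png"))` would print whatever text Tesseract recognized. Recognition quality depends heavily on the scan, so the result should be treated as a draft to proofread.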