For everyone, anywhere.
Prime minister Gordon Brown and e-commerce businesswoman Martha Lane Fox, left, listen to web pioneer Tim Berners-Lee, as he addresses a Downing Street seminar on smarter government.
There is an infographic boom going on out there - with the internet flooded with data visualisations of the way we fight war, how we use Twitter, what music we like and how we use the, er, internet. But new as these are, there's a long tradition of telling stories using graphics. Guardian definitive atlas of UK government spending
tesseract-ocr - An OCR Engine that was developed at HP Labs between 1985 and 1995... and now at Google. Tesseract is probably the most accurate open source OCR engine available. Combined with the Leptonica Image Processing Library it can read a wide variety of image formats and convert them to text in over 60 languages. It was one of the top 3 engines in the 1995 UNLV Accuracy test. Between 1995 and 2006 it had little work done on it, but since then it has been improved extensively by Google. It is released under the Apache License 2.0.
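As a rough illustration of the image-to-text conversion described above, here is a minimal Python sketch that calls the `tesseract` command-line program. It assumes a recent Tesseract release is installed and on the PATH, and that passing `stdout` as the output base prints the recognized text. The function names `build_tesseract_command` and `image_to_text` are hypothetical helpers invented for this example, not part of Tesseract.

```python
import shutil
import subprocess


def build_tesseract_command(image_path, lang="eng"):
    """Build the argument list for one Tesseract run (hypothetical helper)."""
    # Assumption: "stdout" as the output base makes the CLI print the text
    # instead of writing it to a .txt file; "-l" selects the language data.
    return ["tesseract", image_path, "stdout", "-l", lang]


def image_to_text(image_path, lang="eng"):
    """Run Tesseract on a single image and return the recognized text."""
    if shutil.which("tesseract") is None:
        raise RuntimeError("tesseract executable not found on PATH")
    result = subprocess.run(
        build_tesseract_command(image_path, lang),
        capture_output=True,  # collect the recognized text from stdout
        text=True,            # decode bytes to str
        check=True,           # raise CalledProcessError if Tesseract fails
    )
    return result.stdout
```

With Tesseract installed, `print(image_to_text("scan.png"))` would print the text found in a file named `scan.png` (a placeholder file name). Recognition in another language needs the matching language data to be installed, for example `image_to_text("page.tif", lang="fra")`.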
OCRopus™ is an OCR system written in Python, NumPy, and SciPy focusing on the use of large scale machine learning for addressing problems in document analysis. OCRopus 0.6 is being released. It features much simpler installation, fewer dependencies, and improved character recognition rates. This is the first all-Python release. Installation: To install, use:
$ hg clone -r ocropus-0.6 https://code.google.com/p/ocropus
$ cd ocropus/ocropy
$ sudo apt-get install $(cat PACKAGES)
$ python setup.py download_models
$ sudo python setup.py install
$ ./run-test
scanR is a free service that lets you transform camera phone pictures into PDF documents. You can take a picture of a document, send it to scanR by email and in less than a minute you'll get a PDF file.
Google sponsors the development of open-source OCR software at the IUPR research group. "OCRopus is a state-of-the-art document analysis and OCR system, featuring pluggable layout analysis, pluggable character recognition, statistical natural language modeling, and multi-lingual capabilities." "The goal of the project is to advance the state of the art in optical character recognition and related technologies, and to deliver a high quality OCR system suitable for document conversions, electronic libraries, vision impaired users, historical document analysis, and general desktop use," explains Thomas Breuel, who leads the project. The software is partly based on Tesseract, the best open source OCR engine available for now.
Quite frankly, I wish I had known about this simple way to use freely available OCR software back in my school days. Of course, we didn’t have camera phones or inexpensive digicams, but wouldn’t it have saved hours of copying notes! Ah, modern technology is wonderful: take a scanned image (or take a snap using a mobile camera or digicam) and presto, OCR software extracts all the information from the image into an easily editable text format. Optical character recognition (OCR) is the process of converting scanned images of printed or handwritten text into machine-readable text.
Data Publica provides a custom dataset development service: it builds datasets to its clients' specifications and delivers them by subscription.
Cyclist at a Barclays bike scheme docking station.
Which US state shows how they use their money the best?
Is more public sharing of experiences of violent crime a good thing? Photograph: Arthur Turner/Alamy. Today a new map of crime called Street Violence is launched by the charity Witness Confident.
The FTC has warned that mobile app developers are making it too easy for technology companies to collect and store personal data from child users. Photograph: Rex Features/Eye Candy
Android: can 'free' apps access user information?