Garbage

R packages and webapps using OpenCPU | OpenCPU. Ocr open source. OS mapping data: a new landscape unfolds | Technology. The Free Our Data campaign has scored a major victory, with the announcement by the government that it intends to make Ordnance Survey maps free for use online by any organisation – including commercial ones – at resolutions more detailed than commercial 1:25,000 Landranger maps from April next year. The announcement of the opening of a consultation on the plan by Gordon Brown at Downing Street on Tuesday, as part of a seminar on making public data public – set in the wider context of public service reform, under the "Smarter Government" umbrella – indicates that the ideas underpinning the campaign have now been taken on board at the highest levels of government.

"Mid-range" maps, with resolutions from 1:10,000 upwards, will be made available for re-use, under the plans announced by the prime minister, along with information on postcode areas and electoral and council boundaries. The issue appears to have gone to the top of government to be resolved. Infographics old and new: top data visualisations, in pictures | News. There is an infographic boom going on out there - with the internet flooded with data visualisations of the way we fight war, how we use Twitter, what music we like and how we use the, er, internet.

But new as these are, there's a long tradition of telling stories using graphics. Information Graphics from Taschen, which includes work from the Guardian such as our public spending chart, tells the story of how information graphics came to rule our world. We've selected some of those from the past - to contrast with some key infographics from the past few years - albeit missing out hundreds of examples we could have chosen from the years in between.

These are some of the images. Information Graphics from the Guardian bookstore at a discount price of £35.99. 1858, Histoire Universelle: This Table of Universal History was published in Paris. 1926, Kahn: In his educational books on health and anatomy, German physician Fritz Kahn repeatedly drew on the old analogy of human body and machines. 1940, Geis. Data journalism and data visualization | News. Datasets by subject: A-F | News. Project Hosting. Tesseract-ocr - An OCR Engine that was developed at HP Labs between 1985 and 1995... and now at Google. Ocropus - The OCRopus(tm) open source document analysis and OCR system. ScanR - use camera phones for OCR. ScanR is a free service that lets you transform camera phone pictures into PDF documents. You can take a picture of a document, send it to scanR by email and in less than a minute you'll get a PDF file.

If you save the file as text in Acrobat Reader, you'll have the text contained in the document. You can also use it for whiteboard images. scanR requires 1 megapixel cameras for whiteboard scanning and 2 megapixel cameras for document scanning. For best results, take pictures about 12" from the document, in a well-lit area. I tested this online OCR service with Sony Ericsson K750i and the results were pretty good. It missed some characters, so it can't be used as a replacement for OCR software such as Abbyy FineReader, but the PDF from scanR can improve the results of OCR processing with commercial software. Related: Use Gmail to break PDF DRM. Free online OCR. Open-Source OCR Software, Sponsored by Google. Google sponsors the development of open-source OCR software at the IUPR research group. "OCRopus is a state-of-the-art document analysis and OCR system, featuring pluggable layout analysis, pluggable character recognition, statistical natural language modeling, and multi-lingual capabilities."

"The goal of the project is to advance the state of the art in optical character recognition and related technologies, and to deliver a high quality OCR system suitable for document conversions, electronic libraries, vision impaired users, historical document analysis, and general desktop use," explains Thomas Breuel, who leads the project. The software is partly based on Tesseract, the best open source OCR engine available for now. While the project is expected to be released at the end of next year and will be used for Google's book scanning project, the team has some interesting applications in mind: Top 5 Free OCR Software Tools To Convert Images Into Text. Quite frankly, I wish I knew about this simple way to use freely available OCR software back in my school days. Of course, we didn’t have camera mobile phones or inexpensive Digicams, but wouldn’t it have saved hours of copying notes!

Ah, modern technology is wonderful; take a scanned image (or take a snap using a mobile camera/Digicam) and presto – OCR software extracts all the information from the image into easily editable text format. Optical character recognition (OCR) is a system for converting scanned images of printed or handwritten text into machine-readable text. OCR software works by analyzing a document and comparing it with fonts stored in its database and/or by noting features typical to characters.
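The "comparing it with fonts stored in its database" idea can be illustrated with a toy template matcher. This is only a minimal sketch: the 3x5 glyph bitmaps, the `TEMPLATES` table and the function names are invented for illustration, and real OCR engines use far richer models.

```python
# Hypothetical 3x5 glyph bitmaps ('#' = ink, '.' = background).
# These stand in for the "fonts stored in a database"; they are made up.
TEMPLATES = {
    "I": ("###", ".#.", ".#.", ".#.", "###"),
    "L": ("#..", "#..", "#..", "#..", "###"),
    "T": ("###", ".#.", ".#.", ".#.", ".#."),
}


def similarity(glyph, template):
    """Return the fraction of pixels on which glyph and template agree."""
    total = 0
    same = 0
    for glyph_row, template_row in zip(glyph, template):
        for g, t in zip(glyph_row, template_row):
            total += 1
            if g == t:
                same += 1
    return same / total


def recognize(glyph):
    """Pick the stored character whose template best matches the glyph."""
    return max(TEMPLATES, key=lambda ch: similarity(glyph, TEMPLATES[ch]))


if __name__ == "__main__":
    # A "T" with one pixel of the stem missing, as a scanner might produce.
    noisy_t = ("###", ".#.", ".#.", "...", ".#.")
    print(recognize(noisy_t))
```

Scoring every stored template and keeping the best one is what lets the matcher tolerate a few wrong pixels instead of demanding an exact match.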

Some OCR software also puts it through a spell checker to “guess” unrecognized words. 100% accuracy is difficult to achieve, but a close approximation is what most software strives for. OCR Using Microsoft OneNote 2007 Drag a scan or a saved picture into OneNote. SimpleOCR TopOCR. Tesserac ou l'OCR Open source. Hello. OCR programs under a free licence are not plentiful, and those that give good results are rarer still (*). I did find one, though (which Google is also interested in). The sources are available, and I was able to install it on my Ubuntu Edgy by following the instructions on this page. NB: the first time I ran the make command, I was missing a package needed to compile the sources correctly. Reading the logs, I understood that c++ was missing, so I installed it and the installation continued to completion.
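Once Tesseract is installed, converting an image can be scripted. The following is a minimal sketch, assuming a reasonably recent `tesseract` executable on the PATH that accepts the `tesseract IMAGE OUTPUTBASE [-l LANG]` form; the helper names `build_command` and `image_to_text` are hypothetical, and older builds such as the one described above may behave differently.

```python
import shutil
import subprocess
from pathlib import Path


def build_command(image_path, lang=None):
    """Build the argument list for `tesseract IMAGE OUTPUTBASE [-l LANG]`."""
    image = Path(image_path)
    # Tesseract appends ".txt" to the output base by itself.
    output_base = image.with_suffix("")
    cmd = ["tesseract", str(image), str(output_base)]
    if lang:
        cmd += ["-l", lang]  # select a language model, e.g. "fra"
    return cmd


def image_to_text(image_path, lang=None):
    """Run tesseract on one image and return the recognized text."""
    if shutil.which("tesseract") is None:
        raise RuntimeError("tesseract executable not found on PATH")
    cmd = build_command(image_path, lang)
    subprocess.run(cmd, check=True)
    return Path(cmd[2] + ".txt").read_text(encoding="utf-8")


if __name__ == "__main__":
    # Only prints the command; call image_to_text() to actually run it.
    print(" ".join(build_command("scan.png", lang="fra")))
```

Keeping command construction separate from execution makes the wrapper easy to check without having Tesseract installed.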

The results are good for a simple page layout, but accents are lost during recognition. My questions: 1) Has anyone already written a Nautilus Actions script that automatically launches the conversion of an image file (right click > convert to text)? Free OCR Software - FreeOCR.net the free OCR list - Optical character recognition software. Maude praises UK progress on open data but private sector sees work to do | Technology. An "open data" revolution kicked off by a Guardian campaign is gathering pace in the UK. The Cabinet Office minister, Francis Maude, is trumpeting the UK's success in making government data freely available - and pointing to examples of companies that have sprung up to create commercial businesses around free data from public bodies.

Maude says that "companies including SMEs [small and medium-sized enterprises] and startups are using open data to improve public services and create innovative products." But, he adds, he wants both "data holders" in government and new data-driven businesses to "promote the open data revolution". One of the newest apps to appear using government data is an iPhone app called Your Taxi Meter, which uses live data from local councils to find out from a car's registration number whether it is a licensed taxi - so that would-be passengers can check on it before they get in.

Every open spending data site in the US ranked and listed | News. The Follow the Money 2012 report has this week revealed the good news that more US states are being open about their public spending by publishing their transactions on their websites. It has also exposed the states of Arkansas, Idaho, Iowa, Montana and Wyoming, which are keeping their finances behind a password-protected wall or are just not publishing at all. A network of US Public Interest Research Groups (US PIRGs), which produced the report, revealed that 46 states now "allow residents to access checkbook-level information about government expenditures online".

The checkbook means a digital copy of who receives state money, how much, and for what purpose. Perhaps to make sense of this 'checkbook' concept it's useful to compare US and UK public finance transparency. In the UK councils have been publishing lists of their spending over £500 for just over a year. The government site Direct Gov provides a service to look up your local council's spending data. This is just the UK way. Mapping technologies | Technology. Witness Confident: can a street violence map encourage more victims to report crime? | News. Today a new map of crime called Street Violence is launched by the charity Witness Confident. Unlike Police.uk, which displays official statistics of crimes at street level, the Street Violence map will display accounts of street robberies and attacks from witnesses and victims. The motivation for this, as Witness Confident explain, is that a significant portion of violent crime is not reported, so making it easier and quicker to share information about crime can help to reduce these lost cases.

The map also serves as a report to help people learn about how crime is tackled in their area. This comes at a time when this reporting increasingly comes from police and council newsletters, according to data from the Home Office's latest study. Perhaps the most important feature of the Street Violence map is the way it's connected to the Metropolitan Police Service (MPS) through a form that will essentially email the Police as if you were using their own email service. So what do you think? Keeping these data-hungry technology companies at bay | Dan Gillmor. The US Federal Trade Commission says mobile app developers are doing far too little to help parents protect the privacy of children who use phones and tablets. And, according to a story in the Wall Street Journal, the head of an app developers trade group agrees; he's quoted as saying: "Parents should have clear, simple, easy-to-use tools to protect their children's privacy."

I have an even better idea for the mobile industry, and the tech industry in general: how about providing clear, simple, easy-to-use tools to protect every person's privacy? And for the regulators: is there a way to require the industry to do this, in a way that won't cause more problems than it solves? In recent days, one mini-debacle after another has demonstrated the tech industry's true nature. This is an industry that considers the idea of providing serious privacy choices for its product users an unacceptable deterrent to one of its key business models. I have some sympathy for the app developers.

Android 'free' apps pass user data to ad networks, study finds | Technology. Android: can 'free' apps access user information? Advertising networks used by apps in Android devices can get access to user information, according to an investigation by a UK information security company. MWR Infosecurity found that a significant number of the top 50 "free" apps which generate money for the developer and advertisers by connecting to an American advertising network pass on details about the phone's user to the network – a move that may breach European data protection laws. With roughly a quarter of the UK's phone users using Android phones, and with millions of apps downloaded every month – often for free, supported by advertising, rather than paid-for – the gap in security is a source of concern. The study was commissioned by Channel 4 News.

The code that MWR Infosecurity found gave advertising networks access to contacts, calendar and location. It came from a large US ad network called MobClix. Channel 4 said that it had not responded to repeated requests for comment. Data protection | Technology. Strata 2012: who goes to a conference about #data? Mapped | News. Free our data | Technology. Facts are sacred: the power of data - out now on Kindle and iBooks | News. We have a new ebook out for Kindle and it's about how we work with data at the Guardian - and how that data is changing the world around us. It's available now in the US, Canada and around the world - having been published in the UK just before Christmas. What's in it? It's a combination of original writing and the best data coverage from the Datablog and Datastore.

"Comment is free," wrote Guardian editor CP Scott in 1921, "but facts are sacred". Ninety years later, publishing those sacred facts has become a new type of journalism in itself: data journalism. And it's rapidly becoming part of the establishment. Facts Are Sacred, edited by Simon Rogers, shows how the Guardian Datastore and Datablog do it. It will be available as an enhanced iBook soon too, complete with videos and graphics. • Buy it on Amazon.co.uk • Buy it on Amazon.com • Buy it on iBooks • Published as enhanced iBook soon. You don't have to have a Kindle or an iPad - here's how you can read it on your PC or Mac.

Open data: proof that knowledge isn't always power | Housing network | Guardian Professional. Open data is about making information collected and created by public service agencies freely available online, for inspection and analysis by anyone who cares to look. Having grown up in the United States and Canada, this approach is spreading fast through the UK public sector. For example, local councils are now obliged to publish details of all items of spending over £500. Should housing associations join this move to greater transparency and throw open their information vaults to boost accountability to residents and other stakeholders? If it can provide genuinely helpful data in a form that customers and communities can easily interpret and use to improve services and opportunities, then yes. Why not? The recent storm of publicity about the police crime maps site perfectly illustrates both the exciting potential of open data and its pitfalls.

Most providers are well used to details of their performance being published and compared to their peers. TheBigClean. If you'd like to help organise a Big Clean event near you, please add your name and location to the list below! We will then add this to the website www.bigclean.org. Planning will be discussed on the Big Clean list. Berlin Dublin Aidan McGuire, ScraperWiki Jyväskylä (Finland) 19.3. 10-18.00 Finnish time, Hannikaisenkatu 18 Antti Poikola London Francis Irving Richard Goodwin (@TSOTechnology) Prague 19.3. 10:00 - 17:00 (GMT+1), National Technical Library Jindřich Mynarz (@jindrichmynarz) Josef Šlerka (@josefslerka) Montreal James McKinney (@mckinneyjames) Open Knowledge Foundation.

Refine, reuse and request data. The Big Clean. Text and metadata extraction with Apache Tika. Open data, scraping e thacks com Software Livre. R-Function to Source all Functions from a GitHub Repository. Quelques commandes Unix avancées. We now tackle the third part of the series of tutorials devoted to Ubuntu commands and Unix commands in general. Before continuing, I recommend looking at the posts on some Unix basics and some essential commands.

In this third part, we will cover some advanced Unix commands (which does not mean they are complicated). It is from this video onward that you really appreciate the advantage of the console under Unix and how much faster it can be than a conventional graphical interface. Here is the outline of what we will learn. Processes: listing processes (ps & top); stopping processes (kill). Search: finding files (locate & find). Filtering data: displaying the start or end of a file (head & tail); searching for keywords in a file (grep & sed); splitting into columns (cut); sorting data (sort); counting occurrences (wc).
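To show what the filtering commands in that outline do, here is a minimal Python sketch of simplified equivalents of head, tail, grep, cut, sort and wc working on a list of lines. The sample data and function names are invented for illustration, and the real commands have many more options than are modelled here.

```python
import re

# Made-up sample input, loosely shaped like colon-separated records.
SAMPLE = """\
carol:1003:/srv/carol
alice:1001:/home/alice
dave:1004:/home/dave
bob:1002:/home/bob
"""


def head(lines, n=10):
    """Like `head -n N`: keep the first n lines."""
    return lines[:n]


def tail(lines, n=10):
    """Like `tail -n N`: keep the last n lines."""
    return lines[-n:] if n > 0 else []


def grep(lines, pattern):
    """Like `grep PATTERN`: keep lines matching a regular expression."""
    return [line for line in lines if re.search(pattern, line)]


def cut(lines, delimiter, field):
    """Like `cut -d DELIM -f FIELD` (1-based); simplified edge cases."""
    result = []
    for line in lines:
        parts = line.split(delimiter)
        # Simplification: a missing field yields an empty string.
        result.append(parts[field - 1] if field <= len(parts) else "")
    return result


def sort(lines):
    """Like `sort`: plain lexicographic ordering."""
    return sorted(lines)


def wc(lines):
    """Like `wc -l` and `wc -w`: return (line count, word count)."""
    return len(lines), sum(len(line.split()) for line in lines)


if __name__ == "__main__":
    lines = SAMPLE.splitlines()
    # Roughly: grep /home/ file | cut -d: -f1 | sort
    print(sort(cut(grep(lines, r"/home/"), ":", 1)))
    print(wc(lines))
```

Each function takes and returns a list of lines, so calls can be nested in the same way shell commands are chained with pipes.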

Tutoriels Unix. UNIX / Linux Tutorial for Beginners. Unix tutoriel. Unix - Aide-mémoire. R-Function to Read Data from Google Docs Spreadsheets. Local Projects. Local Project – Art Space New York. Introducing the PivotViewer Control for Silverlight. Data Visualization: Modern Approaches. Martin Wattenberg: Research. Using Wireframes to Streamline Your Development Process. 50 Great Examples of Data Visualization. Thinkmap visualization software facilitates communication, learning, and discovery.

A Periodic Table of Visualization Methods. Thanks for choosing Tableau Public. PivotViewer - Microsoft Download Center - Confirmation. PivotViewer Control. Web scraping in Java with Jsoup, Part 1 | A Web Coding Blog. 10 Popular Sites Like Getpivot (Updated: Mar 21st. PowerPivot | Microsoft BI. OpenPaths. Elements-for-constructing-social-learning-environments-e1268231388833.jpg (Image JPEG, 1200x900 pixels) - Redimensionnée (97%) 37 Data-ish Blogs You Should Know About. Virtualization and Java: An Introduction to Memory Management. The Numbers Guy. Basketball Geek. How Companies Learn Your Secrets. Kodu Offers Pop-Up Computer Programming for Children.

Informatique | Blog de Bibichette. [Fonctions] Batcher.fr MS-DOS batch scripts Ressources. Building Great Web Applications. I2P | Fansub Streaming - Actu numérique, Japon et streaming de fansub d'animes. Utiliser des DNS différents | Fansub Streaming - Actu numérique, Japon et streaming de fansub d'animes. Boucle en batch (1/1) The Freenet Project - /download. NSPpart.pdf. Prediction-api-r-client - R client library for the Google Prediction API. Prediction API. Free Data Visualization Software. Customizing Block Histogram.

Lancement de Microsoft Pivot | Zesty. DV Géo - donnees - Geocible. [batch] contenu d'un fichier dans une variabl [Résolu] | CommentCaMarche. [Variables] Batcher.fr MS-DOS batch scripts Ressources. [Commandes] Batcher.fr MS-DOS batch scripts Ressources. [Téléchargements] Batcher ! MS-DOS batch scripts Ressources. Word Cloud in R. Freenet Doc. Freenet | Fansub Streaming - Actu numérique, Japon et streaming de fansub d'animes. Ressources | Fansub Streaming - Actu numérique, Japon et streaming de fansub d'animes. Unix et android. Google Releases Rosetta Stone for Dart to JavaScript.

Terminate an R Session. NoSQL : 5 minutes pour comprendre. Nosql. Data Publica, le portail français des données publiques et de l'open data.