Big Data - Analytics - Providers

Since data scientists Alasdair Allan and Pete Warden (disclosure: Pete writes for ReadWriteWeb) presented information about how iPhones store a log of your location data in an unencrypted file, there's been a mix of reactions. Some are outraged by the privacy implications. Some don't see why it's a big deal, citing either the forensic community's prior knowledge of the logs or the fact that many people share location information on Foursquare. Others have been intrigued at the possibilities of exploring their own personal location information.

Stalk Yourself: Use R to Analyze Your iPhone Location Data

http://www.readwriteweb.com/hack/2011/04/stalk-yourself-use-r-to-analyz.php
http://radar.oreilly.com/2011/04/data-hand-tools.html

Data hand tools - O'Reilly Radar

The flowering of data science has both driven, and been driven by, an explosion of powerful tools. R provides a great platform for doing statistical analysis, Hadoop provides a framework for orchestrating large clusters to solve problems in parallel, and many NoSQL databases exist for storing huge amounts of unstructured data. The heavy machinery for serious number crunching includes perennials such as Mathematica, Matlab, and Octave, most of which have been extended for use with large clusters and other big iron. But these tools haven't negated the value of much simpler tools; in fact, they're an essential part of a data scientist's toolkit.

How the Web of Data Is Changing the Nature of Business | MFG Blog

http://mfglabs.com/blog-charles-darwin-et-les-api#blog_bar How is the web of data changing the nature of business? The synergies between the life sciences and the internet are fascinating. Last summer in San Francisco, Sam Ramji laid out an analogy between Darwin's theory of evolution and the emergence of APIs, which is particularly interesting given the issues raised by the dynamics of the web.

MFG Labs

At the f8 conference last September, Facebook presented the Custom Open Graph, which adds a new dimension to the famous "like". By introducing new action verbs such as "listen" or "read", Facebook is revolutionizing the way people consume and share content with their friends. Thanks to these Open Graph Actions, you can see in real time what your friends are listening to on Spotify or reading on Yahoo! Previously limited to a handful of applications, the Custom Open Graph has just been made available by Facebook to several of its privileged partners, including Cinémur. http://mfglabs.com/
Teradata

Building a beautiful design is a great experience. Seeing the design break apart when people start putting in real content, though, is painful. That’s why testing it as soon as possible with real information to see how it fares is so important.

YQL: Using Web Content For Non-Programmers

http://coding.smashingmagazine.com/2010/12/21/yql-using-web-content-for-non-programmers/
http://www.google.com/publicdata/home

Public Data Explorer

Human Development Report 2011, United Nations Development Programme. The data used to calculate the Human Development Index (HDI) and other composite indices presented in the Human Development Report ... harmonized unemployment data for European countries. This dataset was prepared by Google based on data downloaded from ... Harmonized Index of Consumer Prices (HICP) for European countries by product and service group. ...
http://www.readwriteweb.com/hack/2011/02/a-free-visual-programming-language-for-big-data.php Until the last few years, large scale data processing was something only big companies could afford to do. As Hadoop has emerged, it has put the power of Google's MapReduce approach into the hands of mere mortals. The biggest challenge is that it still requires a fair amount of technical knowledge to set up and use. Initiatives like Hive and Pig aim at making Hadoop more accessible to traditional database users, but they're still pretty daunting.

A Free Visual Programming Language for Big Data

Today at the Strata conference The Stanford Visualization Group debuted a Web-based visual tool for cleaning up messy data called DataWrangler. According to its website, "Wrangler allows interactive transformation of messy, real-world data into the data tables analysis tools expect." Data can be exported as a CSV or TSV or as JSON data. Another thing I often hear is that a large fraction of the time spent by analysts -- some say the majority of time -- involves data preparation and cleaning: transforming formats, rearranging nesting structures, removing outliers, and so on.

The Stanford Visualization Group Debuts Visual Tool for Cleaning Up Data

http://www.readwriteweb.com/hack/2011/02/datawrangler.php

Dataveyes: Interactive Data Visualizations

The web of data may be a great concept, but what does it look like in practice? What is it for? Who is it for? Data visualization is precisely what will answer these questions. Last year I took a dive into this fascinating world by trying to answer a simple question: how do you navigate a web of data? My conclusion was this: "data visualization is what will gradually let us take advantage of all the richness the web has to offer". http://nicolas.cynober.fr/blog/717,dataveyes-visualisations-interactives-de-donnees.html

How to Make Bubble Charts

http://flowingdata.com/2010/11/23/how-to-make-bubble-charts/ A bubble chart can also just be straight up proportionally sized bubbles, but here we're going to cover how to create the variety that is like a scatterplot with a third, bubbly dimension. The advantage of this chart type is that it lets you compare three variables at once. One is on the x-axis, one is on the y-axis, and the third is represented by area size of bubbles. Have a look at the final chart to see what we're making.
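Because the third variable is encoded as bubble area rather than radius, the radius has to grow with the square root of the value. The following is a minimal Python sketch of that scaling, not the tutorial's own code; the function name `bubble_radii` and the `max_radius` parameter are hypothetical names chosen for illustration.

```python
import math


def bubble_radii(values, max_radius=30.0):
    """Return one radius per value so that bubble AREA is proportional to the value.

    Hypothetical helper: the largest value gets max_radius, and every other
    radius is scaled by the square root of its share of the largest value.
    """
    if not values:
        return []
    if any(v < 0 for v in values):
        raise ValueError("bubble sizes must be non-negative")
    largest = max(values)
    if largest == 0:
        raise ValueError("at least one value must be positive")
    # area = pi * r^2, so r must scale with sqrt(value) for area to scale with value
    return [max_radius * math.sqrt(v / largest) for v in values]


# Illustrative data only: a value four times larger gets twice the radius.
radii = bubble_radii([25, 100], max_radius=10.0)
```

Scaling the radius directly by the value would make a bubble with four times the value look sixteen times larger, which is why the square root matters here.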
On Tuesday I heard the dynamic University of Massachusetts at Lowell professor Georges Grinstein talk about WEAVE (Web-based Analysis and Visualization Environment), a visualization tool for public data. One of the coolest things about WEAVE is the very idea of it. About 10 government agencies decided three years ago (before the Gov 2.0 movement was hot) to put their data out for easy public consumption, and to collaborate around it with the hope of eventually being able to combine all their data. These governments have combined into the Open Indicators Consortium to fund and guide development.

Update on WEAVE government data visualization software - O'Reilly Radar

Needlebase

Needlebase™ is a revolutionary platform for acquiring, integrating, cleansing, analyzing and publishing data on the web. Using Needlebase through a web browser, without programmers or DBAs, you can easily:
acquire data from multiple sources: A simple tagging process quickly imports structured data from complex websites, XML feeds, and spreadsheets into a unified database of your design.
merge, deduplicate and cleanse: Needlebase uses intelligent semantics to help you find and merge variant forms of the same record. Your merges, edits and deletions persist even after the original data is refreshed from its source.
build and publish custom data views: Use Needlebase's visual UI and powerful query language to configure exactly your desired view of the data, whether as a list, table, grid, or map.
The US Department of Justice appears to have made a deal with Google that would allow it to acquire ITA Software, a company that provides airline data to travel search engines, for $700 million with a list of conditions. One of my favorite websites in the world is now in Google's hands. Built as a side-project by ITA engineers, Needlebase is a point-and-click data extraction tool that recommends merges and allows for data to be visualized in multiple ways including as maps. I am not a technology blogger because I am moved by news that air travelers may have to pay a few hundred dollars more per year.

Google, ITA & The Future of DIY Data Mining Tools

Google Labs has come out with a new tool that it is calling "Like Google Trends in reverse." Google Correlate allows users to enter a data series and get back queries that follow a similar pattern. Correlate is based on the technology that Google used to create Google Flu Trends. When you enter a data set into Correlate, it uses the Pearson correlation coefficient, a statistical measure of how closely two data series move together, to surface the search terms whose pattern correlates most strongly with your data. Data can be entered into Correlate either from a spreadsheet or as a CSV file. Correlate also has pre-existing data sets broken down by location, such as by state.
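To make the Pearson correlation coefficient concrete, here is a minimal Python sketch that scores candidate series against a target series and ranks them from most to least correlated. It illustrates the statistic only and makes no claim about how Google Correlate is implemented; the function names `pearson` and `rank_queries` and the sample data are hypothetical.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("series must have the same length (at least 2)")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations from the means (unnormalized covariance)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    dev_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    dev_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if dev_x == 0 or dev_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / (dev_x * dev_y)


def rank_queries(target, candidates):
    """Return (name, coefficient) pairs sorted from most to least correlated."""
    scored = [(name, pearson(target, series)) for name, series in candidates.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Made-up weekly counts, for illustration only.
target = [1, 2, 3, 4]
candidates = {"rising query": [2, 4, 6, 8], "falling query": [4, 3, 2, 1]}
ranking = rank_queries(target, candidates)
```

A coefficient near 1 means the two series rise and fall together, a value near -1 means they move in opposite directions, and a value near 0 means there is no linear relationship.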

Google Launches Tool to Make Mining Search Data Easy

If you've been within shouting distance of me over the last month, you've probably heard me singing the praises of Needlebase, a great new point-and-click tool for extracting, sorting and visualizing data from across pages around the web. I've been using it for all kinds of things and now you can too. When we first reviewed Needle here on ReadWriteWeb, it was in closed beta and new users had to request an account.

Awesome: DIY Data Tool Needlebase Now Available to Everyone

Vertical Big Data analytics providers