Big Data

What is data science?

We’ve all heard it: according to Hal Varian, statistics is the next sexy job.

Five years ago, in What is Web 2.0, Tim O’Reilly said that “data is the next Intel Inside.” But what does that statement mean? Why do we suddenly care about statistics and about data? In this post, I examine the many sides of data science — the technologies, the companies and the unique skill sets. The web is full of “data-driven apps.” One of the earlier data products on the Web was the CDDB database. Google is a master at creating data products. Google’s breakthrough was realizing that a search engine could use input other than the text on the page.

Flu trends: Google was able to spot trends in the Swine Flu epidemic roughly two weeks before the Centers for Disease Control by analyzing searches that people were making in different regions of the country. Google isn’t the only company that knows how to use data. In the last few years, there has been an explosion in the amount of data that’s available.

Will We Exploit Data or Will Data Exploit Us?

Interest in big data and open data is understandably growing all over the world. The combination of several technology innovations, in areas like social media, cloud computing, and analytics, offers scenarios that we could hardly imagine in the past. And the trend toward greater transparency and openness that is being championed by many governments and NGOs is almost creating a “perfect storm” around the ability to extract wealth from the growing masses of data that are freely available over the Internet. It is not just about data that was previously kept behind boundaries and that governments are liberating through their various “data.gov” initiatives.

It is also about new data that is generated through idea contests, online dialogues, social games, photo and video sharing sites, webcasts and webcams, and the like. On the other hand, no human being can make sense of such a mass of data, so we do need tools, intermediaries, and agents that make it digestible to us. The solution is focus.

Building data startups: Fast, big, and focused

This is a written follow-up to a talk presented at a recent Strata online event.

A new breed of startup is emerging, built to take advantage of the rising tides of data across a variety of verticals and the maturing ecosystem of tools for its large-scale analysis. These are data startups, and they are the sumo wrestlers on the startup stage. The weight of data is a source of their competitive advantage. But as with their sumo mentors, size alone is not enough. The most successful data startups must be fast (with data), big (with analytics), and focused (with services).

Setting the stage: The attack of the exponentials

Why this style of startup is arising today, rather than a decade ago, owes to a confluence of forces that I call the Attack of the Exponentials.

At the same time, these technological forces are not symmetric: CPU and storage costs have fallen faster than those of network and disk I/O.

Big Data does not answer any specific need

01net, July 12, 2011, 4:05 pm

If some are to be believed, organizations are sitting on mountains of information whose analysis and cross-referencing would produce wealth that has so far remained intangible.

It is true that information management is often the Achilles' heel of organizations. Yet while they are still bogged down in projects to make their data consistent, they are now being urged to produce, and above all to digest, a new mass of information: “big data”, currently at the center of the debate. The principle? Behind this new buzz, it is hard not to see a nebulous object whose only aim is to dress up the data warehouse concept.

Big Data, the next IT revolution.

The Business of Big Data.

Gartner Says Solving 'Big Data' Challenge Involves More Than Just Managing Volumes of Data

STAMFORD, Conn., June 27, 2011

Gartner Special Report Examines How to Leverage Pattern-Based Strategy to Gain Value in Big Data

Many IT leaders are attempting to manage "big data" challenges by focusing on the high volumes of information to the exclusion of the many other dimensions of information management, leaving massive challenges to be addressed later, according to Gartner, Inc.

Big data is a popular term used to acknowledge the exponential growth, availability and use of information in the data-rich landscape of tomorrow. The term "big data" puts an inordinate focus on the issue of information volume (in every aspect from storage through transform/transport to analysis). Big data is also heavily weighted toward current issues and can lead to short-sighted decisions that will hamper the enterprise's information architecture as IT leaders try to expand and change it to meet changing business needs.

Big Data: progress in data analysis

The proliferation of data collection tools (such as the web or our mobile phones, which very easily record our movements, but also our actions and our relationships…) and the improvement of data analysis tools are giving companies increasingly novel marketing capabilities, argues Lee Gomes in Technology Review.

He gives a simple and striking example: the San Francisco Giants, the American baseball team that won the World Series and the National League championship, which introduced dynamic pricing developed by Qcue, allowing ticket prices to change with demand right up to the last minute. The idea is to match prices to demand in order to avoid unsold seats and to better capture bidding effects (which otherwise mostly benefit the black market). This dynamic pricing increased the club's revenue by 6% in 2010.

Toward algorithmic commerce.