BigData

Cloud Computing. Google Spreadsheets. Big Data in one infographic.

Big Data Right Now: Five Trendy Open Source Technologies

Big Data is on every CIO’s mind this quarter, and for good reason.

Companies will have spent $4.3 billion on Big Data technologies by the end of 2012. But here’s where it gets interesting. Those initial investments will in turn trigger a domino effect of upgrades and new initiatives valued at $34 billion for 2013, per Gartner. Over a five-year period, spending is estimated at $232 billion. What you’re seeing right now is only the tip of a gigantic iceberg. Big Data is presently synonymous with technologies like Hadoop and the “NoSQL” class of databases, including MongoDB (a document store) and Cassandra (a key-value store).

But there are new, untapped advantages and non-trivially large opportunities beyond these usual suspects. Did you know that there are over 250K viable open source technologies on the market today? We have a lot of…choices, to say the least. What’s on our own radar, and what’s coming down the pipe for Fortune 2000 companies?

Storm and Kafka. Why should you care? Drill and Dremel. Internet and social media: the 2012 figures - INNOVATION. Leaders in Big Data.

Web à Québec: between big data, microdata, and Google

Last week, the third edition of the Web à Québec event took place. Gina Desjardins and I had hosted a panel on social networks at the first edition. This conference series has grown considerably since then, and the speakers had the pleasure of sharing their knowledge in front of packed rooms, some of them even overflowing. From the talks to the hallway conversations (often just as important), I took away three themes: big data, microdata, and the Google model.

Quorum-based Journaling in CDH4.1

Given the requirement to avoid single points of failure (SPOFs) and custom hardware, we knew that any design we decided upon would involve storing multiple replicas of the metadata on multiple commodity nodes.

Given this, we added the following additional requirements:

As a company focused on making Hadoop easier to deploy and operate, we also considered the following operational requirements:

Consistency with other Hadoop components: any new components introduced by the design should operate similarly to existing components; for example, they should use XML-based configuration files, log4j logging, and the same metrics framework.

Operations-focused metrics: since the system is a critical part of NameNode operation, we put a high emphasis on exposing metrics.

QuorumJournalManager

After discussion internally at Cloudera, with our customers, and with the community, we designed a system called QuorumJournalManager.

Distributed Commit Protocol. Fencing and Epoch Numbers. Testing. Summary. Acknowledgements. Aaron T.
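
The quorum idea named in the excerpt above can be sketched in a few lines. This is a minimal illustration only, not Cloudera's QuorumJournalManager: the `JournalNode` and `QuorumWriter` classes, their methods, and the three-node setup are all hypothetical, and recovery of partially written edits is left out. It assumes that an edit counts as committed once a majority of nodes accept it, and that each node refuses writers whose epoch number is older than the newest one it has promised to.

```python
from dataclasses import dataclass, field


@dataclass
class JournalNode:
    """Hypothetical stand-in for one replica that stores edit-log entries."""
    name: str
    available: bool = True
    promised_epoch: int = 0
    log: list = field(default_factory=list)

    def new_epoch(self, epoch: int) -> bool:
        # Accept only a strictly newer epoch; this is what fences old writers.
        if not self.available or epoch <= self.promised_epoch:
            return False
        self.promised_epoch = epoch
        return True

    def journal(self, epoch: int, txid: int, edit: str) -> bool:
        # Refuse the write if the node is down or the writer's epoch is stale.
        if not self.available or epoch < self.promised_epoch:
            return False
        self.log.append((txid, edit))
        return True


class QuorumWriter:
    """Hypothetical writer that needs a majority of nodes for every step."""

    def __init__(self, nodes, epoch: int):
        self.nodes = nodes
        self.epoch = epoch
        self.majority = len(nodes) // 2 + 1
        # Claim the epoch on a majority before any write is allowed.
        acks = sum(node.new_epoch(epoch) for node in nodes)
        if acks < self.majority:
            raise RuntimeError("could not establish epoch on a majority")

    def write(self, txid: int, edit: str) -> bool:
        # The edit is committed only if a majority of nodes accepted it.
        acks = sum(node.journal(self.epoch, txid, edit) for node in self.nodes)
        return acks >= self.majority
```

With three nodes, a write still commits while one node is down. Once a second writer claims a higher epoch on a majority, every write from the first writer is refused, which is the fencing behavior the sketch is meant to show.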

OpenData

Data Analytics. Introduction to data warehouse. DataVisualization. MasterCard Big Data For Shopping Habits. 5 Big Data Startups to Watch in 2012. Big data, without question, is a 2011 buzzword finalist. But like all metaphors, it communicates a universal understanding that data dominates our lives and will increasingly do so in the years ahead. How can you deny that a company’s success will depend in great part on how it views data and its value? The belief is not lost on venture capitalists, who have invested $350 million in Hadoop and NoSQL startups since 2008.

To commemorate this megatrend, we’ve picked five big data startups and one honorable mention that we believe are ones to watch in 2012. Here they are: Cloudera. Cloudera took the spotlight once again this year. Jeff Hammerbacher is Cloudera’s chief scientist. I’m looking to hire someone to work closely with me at Cloudera in my role as Chief Scientist. Cloudera will face its most serious competition this year.

Jeff Kelly of Wikibon writes: Earlier this year, Cloudera and Dell announced a partnership.

Issue #1: Quo Vadis, Big Data?

A lot of people like to make predictions. I don’t. But I love filing them for later reference. Here’s a roundup of predictions for 2013. Most of them are about the Big Data market; very few mention NoSQL databases. Why? To frame the context of these predictions, let’s start with the forecast of the Big Data market from Gartner Research.

Big data

From Wikipedia, the free encyclopedia. A data visualization created by IBM[1] shows that the big data Wikipedia edits with the Pearle bot is more meaningful when enhanced with colors and positions[2]. Growth and digitization of global information storage capacity[3]. At these new orders of magnitude, the capture, storage, search, sharing, analysis, and visualization of data must be redefined.

Ubiquitous computing

From Wikipedia, the free encyclopedia. The evolution of computers: the race toward miniaturization and diffusion into the surrounding environment[1]. Ubiquitous computing is the third era in the history of computing, following the era of personal computers and that of mainframes.

DataScientist

The Big Bang of Big Data - INNOVATION. Big Data News. Big Data videos on Dailymotion.