Big Data

The innovation race: European CIOs are falling behind. 01Business, 26 September 2013, 10:55

"European culture emphasizes the past. Where Asia has its foot flat on the accelerator, we are moving forward with the brakes locked," quips Frédéric Pichard, head of barometers at CSC. For five years, the CIO Barometer from the IT services firm CSC has measured the challenges and changes facing the IT function within the enterprise. This year, the study was conducted on a worldwide survey base (see the methodology box). It highlights the dynamism of the so-called emerging countries and regions, and a certain difficulty on the Old Continent in driving innovation within and through IT. The causes are a heavier IT "legacy" and a greater distance between the IT department and the company's decision-making bodies.

Security takes priority before moving to the cloud

What are the most important challenges for your IT department in the coming years?

In Asia, 66% of CIOs see their budget increasing. Photo credit: Benjamin Ellis.

Graph databases: an overview.

In a previous article, we introduced a few concepts about graphs and illustrated them with two examples using the graph database Neo4j. In recent years, many companies have developed their own graph database solution, either as vendors, such as Neo Technology with Neo4j, Objectivity with InfiniteGraph, or Sparsity with dex*, or by building their own solution to integrate into their application, as LinkedIn and Twitter have done.

It is therefore quite difficult to find one's way around this rich landscape, which continues to evolve very quickly. In this new article, which focuses on graph databases, we provide what is needed to understand where they sit in their ecosystem, relative to other types of databases and to other kinds of tools dedicated to graph processing.

Such a database therefore generally meets the following criteria:

Graph storage and graph processing.

Hadoop. From Wikipedia, the free encyclopedia.

Hadoop was created by Doug Cutting and has been one of the Apache Software Foundation's projects since 2009.

History

In 2004, Google published a paper presenting its algorithm based on large-scale analytical operations across a large cluster of servers, MapReduce, along with its cluster file system, GoogleFS.
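
To make the MapReduce idea mentioned above more concrete, here is a minimal single-process sketch of the map, shuffle, and reduce steps, using word counting as the example. It does not use Hadoop or any Hadoop API; the function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) are hypothetical and chosen only for illustration. A real cluster would run the map and reduce steps in parallel on many machines.

```python
from collections import defaultdict
from typing import Iterable, Iterator


def map_phase(document: str) -> Iterator[tuple[str, int]]:
    """Emit one (word, 1) pair for every word in the document."""
    for word in document.lower().split():
        yield word, 1


def shuffle(pairs: Iterable[tuple[str, int]]) -> dict[str, list[int]]:
    """Group the emitted values by key, between the map and reduce steps."""
    groups: dict[str, list[int]] = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key: str, values: list[int]) -> tuple[str, int]:
    """Combine all values for one key into a single result."""
    return key, sum(values)


def word_count(documents: list[str]) -> dict[str, int]:
    """Run map, shuffle, and reduce in sequence (illustrative only)."""
    pairs = (pair for doc in documents for pair in map_phase(doc))
    grouped = shuffle(pairs)
    return dict(reduce_phase(key, values) for key, values in grouped.items())
```

Calling `word_count` on a list of strings returns a dictionary mapping each lowercase word to its number of occurrences. The point of the split is that each `map_phase` call and each `reduce_phase` call is independent of the others, which is what lets a framework distribute them across servers.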

Doug Cutting, who was working at the time on the development of Apache Lucene and running into problems similar to those of the Mountain View firm, then decided to take up the concepts described in the paper and develop his own open source version of the tools, which would become the Hadoop project.

Architecture

Hadoop Distributed File System

An HDFS machine architecture (also called an HDFS cluster) relies on two major types of components:

MapReduce

CIO Agenda: Big Data Ecosystem.

In terms of 'forces' affecting the CIO Agenda, Information Strategy and Enterprise Architecture, Big Data is increasingly important.

This is due to explosive growth in the number of data source types: applications, digital media, mobiles, users, customers, unstructured data sets, sensors, emails, blogs, etc. Data is complex and arrives in mixed formats (text, video, audio). On-demand infrastructure scalability (including massively scalable storage) is needed to deliver Big Data capabilities, as are robust analytics and visualisation tools and techniques for distributed, parallel systems. Increasing bandwidth availability has also led to exponential data growth rates and new capabilities, e.g. social networks, video and microblogging.

Figure 1: A (simplified) Big Data Ecosystem, source: Steve Nimmons

Where do you start in formulating a reference architecture for Big Data and sourcing suppliers for a Big Data ecosystem?

Big Data: A Revolution That Will Transform How We Live, Work and Think

Hadoop Overview.

Visualization-based data discovery tools.

Visualization-based data discovery tools may account for less than 5% of the Business Intelligence (BI) market, but they are punching above their weight in terms of profile.

In 2011, Gartner placed Visualisation at the peak of the BI Hype Cycle. Although this indicates the category may lose some of its lustre, Gartner is still predicting a compound annual growth rate of 30% over the next 5 years. If true, this means the category will increase in value from $427 million to $1,606 million over the period, a growth rate 3 times that of the overall BI market.

So what are Data Visualisation tools and how are they defined? According to Gartner, there are 3 common elements.

Who are the main vendors in the Data Visualisation category? Gartner ranked the leading players based on estimated revenue (Figure 2). With the expected growth of this category, large traditional BI vendors like Microsoft are scrambling for market share.

So why the growth and hype in the category? Rapid prototyping.

The Definition of Enterprise Big Data. With David Vellante

With the inaugural O'Reilly Media Strata conference, the topic of big data is coming into sharper focus.

When O'Reilly initiates coverage of a topic through an event like Strata, you can be sure the content will be well thought out, rich, relevant and visionary in nature. A key theme that emerged from the event was that Big Data is not just about cool technologies and Web 2.0 companies experimenting with gigantic data sets. Rather, it is about defining new value streams based on leveraging information.

Big-data Background

Big Data is emerging from the realms of science projects at Web companies to help companies like telecommunication giants understand exactly which customers are unhappy with service and what processes caused the dissatisfaction, and predict which customers are going to change carriers. The IT techniques and tools to execute big data processing are new, very important and exciting.

Enterprise Big Data

Big-data Definition

Big data has the following characteristics:

From Big Data to Big Business.

Big Data and Language Technologies.

42 Big Data Startups – Big Data News.

Published by Jeff Vance at Startup50. Which ones are missing? I would add Pervasive, Tableau, Splunk, Lavastorm, Yottamine, Alteryx, Pivotal as well as non-product companies, for instance publishers like DataScienceCentral (self-funded, profitable, with a large list of big data clients). This list contains (too) many Hadoop-related companies. Here's a compilation of the most analytic ones, compiled by Gregory.

SiSense: Big Data analytics and BI platform.
Skytree: machine-learning-based platforms for Big Data analytics.
Splice Machine: a Hadoop-based, SQL-compliant database designed for Big Data applications.
Statwing: tools that make it easy for anyone to use the same statistical analysis tools that data scientists and statisticians use.
SumAll: an analytics tool that helps businesses make more money by using their own data.

And some added by Gregory (top 20 Big Data startups by amount of venture capital raised):

Anyone interested in publishing a list of the top 20 analytic startups?