Platfora

The year in big data and data science Big data and data science have both been with us for a while. According to McKinsey & Company’s May 2011 report on big data, back in 2009 “nearly all sectors in the U.S. economy had at least an average of 200 terabytes of stored data … per company with more than 1,000 employees.” And on the data-science front, Amazon’s John Rauser used his presentation at Strata New York (below) to trace the profession of data scientist all the way back to 18th-century German astronomer Tobias Mayer. Of course, novelty and growth are separate things, and in 2011, there were a number of new technologies and companies developed to address big data’s issues of storage, transfer, and analysis. With that as a backdrop, below I take a look at three evolving data trends that played an important role over the last year. The ubiquity of Hadoop It was a big year for investment for Apache Hadoop-based companies. More data, more privacy and security concerns Open data’s inflection point Related:

Flume Speech, Language & Multimedia < Technology Services | Raytheon BBN Technologies For nearly four decades, Raytheon BBN Technologies has been a leader in speech and language technologies. Since the early 1970s, Raytheon BBN Technologies has been performing pioneering research in automatic speech recognition. Over the years, Raytheon BBN Technologies has had many firsts, including the first demonstration, in the early 1990s, of real-time, large-vocabulary, speaker-independent continuous speech recognition on commercial, off-the-shelf hardware. Raytheon BBN's Byblos, our primary speech recognition system, is an automatically trainable system that utilizes probabilistic hidden Markov models, and it continues to represent the state of the art in large-vocabulary, speaker-independent speech recognition. The Byblos engine forms the core of our application suite that includes Audio Indexer and Audio Monitoring System. Our natural language processing technologies can locate, identify, and organize information from a variety of sources and in multiple languages.

What is big data? Big data is data that exceeds the processing capacity of conventional database systems. The data is too big, moves too fast, or doesn’t fit the strictures of your database architectures. To gain value from this data, you must choose an alternative way to process it. The hot IT buzzword of 2012, big data has become viable as cost-effective approaches have emerged to tame the volume, velocity and variability of massive data. The value of big data to an organization falls into two categories: analytical use, and enabling new products. The past decade’s successful web startups are prime examples of big data used as an enabler of new products and services. The emergence of big data into the enterprise brings with it a necessary counterpart: agility. What does big data look like? As a catch-all term, “big data” can be pretty nebulous, in the same way that the term “cloud” covers diverse technologies. Volume This volume presents the most immediate challenge to conventional IT structures. Variety

Skytree Text Analysis and Text Mining Software: Lexalytics Les promesses du Big Data | ParisTech Review Le déluge des données numériques, évoqué dans nos colonnes par George Day et David Reibstein, n’impacte pas que les métiers du marketing. C’est l’ensemble des organisations de production qui est touché, et au-delà l’enjeu de compétitivité concerne les économies nationales. Ceux qui seront capables d’utiliser ces données auront une longueur d’avance pour connaître les opinions et détecter les mouvements culturels, mais aussi pour comprendre ce qui se joue au sein de leur organisation, en améliorant les processus et en informant mieux la prise de décision. Encore faut-il s’en donner les moyens: c’est tout la difficulté du “big data”, qui est à la fois une promesse et un défi. L’ère de l’information La question a d’abord surgi au sein du monde académique, quand une équipe dirigée par Peter Lyman et Hal R. Lyman et Varian évoquaient aussi la croissance déjà vertigineuse des échanges en ligne, avec le fameux Web 2.0 où chacun est un éditeur en puissance. Que faire de ces données?

Trifacta PeerIndex — We Value Social