The quest to find decision-making insights in the modern data flood is certainly an appealing notion. After all, there is so much data, from the traditional stuff inside corporate databases to e-mail, Web-browsing patterns, social-network messages and sensor data. Information drives decisions, so more of it ought to open the door to better decisions.
Revolution Analytics, the company that is extending R, the open source statistical programming language, with proprietary extensions, is making available a free set of extensions that allow its R engine to run atop Hadoop clusters. Statisticians who are familiar with R can now analyze unstructured data stored in the Hadoop Distributed File System, the data store used for MapReduce, the method of chewing on unstructured data that Google pioneered for its search engine and that rival Yahoo! mimicked and open sourced as the Apache Hadoop project. R can now also run against HBase, the non-relational, column-oriented distributed data store that mimics Google's BigTable and is essentially a database for Hadoop for holding structured data.
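For readers unfamiliar with the MapReduce method mentioned above, the idea can be sketched in a few lines of Python. This is only an in-memory illustration of the map, shuffle and reduce steps using a word count. It is not Hadoop's API or Revolution Analytics' R extensions, and the function names (map_phase, shuffle, reduce_phase, word_count) are hypothetical.

```python
from collections import defaultdict


def map_phase(record):
    """Map step: emit a (word, 1) pair for every word in one text record."""
    for word in record.lower().split():
        yield word, 1


def shuffle(pairs):
    """Shuffle step: group all emitted values by key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce step: collapse the values for one key into a single total."""
    return key, sum(values)


def word_count(records):
    """Run map, shuffle and reduce in sequence over a list of text records."""
    pairs = (pair for record in records for pair in map_phase(record))
    grouped = shuffle(pairs)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    print(word_count(["big data", "Big insights"]))
```

On a real cluster the map and reduce steps would run in parallel on many machines, with the framework handling the shuffle between them. Here everything runs in one process purely to show the shape of the computation.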
danah boyd (Microsoft Research; New York University (NYU) - Department of Media, Culture, and Communication; University of New South Wales (UNSW); Harvard University - Berkman Center for Internet & Society) and Kate Crawford (University of New South Wales (UNSW)). September 21, 2011. A Decade in Internet Time: Symposium on the Dynamics of the Internet and Society, September 2011. Abstract: The era of Big Data has begun.
The amount of data in our world has been exploding, and analyzing large data sets—so-called big data—will become a key basis of competition, underpinning new waves of productivity growth, innovation, and consumer surplus, according to research by MGI and McKinsey's Business Technology Office. Leaders in every sector will have to grapple with the implications of big data, not just a few data-oriented managers. The increasing volume and detail of information captured by enterprises, the rise of multimedia, social media, and the Internet of Things will fuel exponential growth in data for the foreseeable future.
Platfora, a data management software provider based on Hadoop, announced on Thursday that it has raised $5.7 million from Andreessen Horowitz just a few months after the company was founded. Hadoop is an open-source data-management software framework. It’s useful for companies that store enormous amounts of data and have to index it regularly. That can include financial services companies that have to track previous prices and old transactions, or companies like Yahoo that need to access search information regularly.