big data

data resources

Fusion Tables - Google Drive | Extend, visualize and share data online

Docs: Docs keeps everything and everyone on the same page. Add artichokes to a shared shopping list, or put the finishing touches on your business plan from the lobby before the meeting, right from your mobile device. Sheets: Sheets is more than just columns and rows. http://www.google.com/drive/start/apps.html#fusiontables
http://kkovacs.eu/cassandra-vs-mongodb-vs-couchdb-vs-redis (Yes it's a long title, since people kept asking me to write about this and that too :) I do when it has a point.) While SQL databases are insanely useful tools, their monopoly in the last decades is coming to an end. And it's just time: I can't even count the things that were forced into relational databases, but never really fitted them. (That being said, relational databases will always be the best for the stuff that has relations.) But the differences between NoSQL databases are much bigger than they ever were between one SQL database and another.

Cassandra vs MongoDB vs CouchDB vs Redis vs Riak vs HBase comparison :: KKovacs

https://cwiki.apache.org/confluence/display/Hive/Home

Home - Apache Hive

The Apache Hive™ data warehouse software facilitates querying and managing large datasets residing in distributed storage. Built on top of Apache Hadoop™, it provides: tools to enable easy data extract/transform/load (ETL); a mechanism to impose structure on a variety of data formats; access to files stored either directly in Apache HDFS™ or in other data storage systems such as Apache HBase™; and query execution via MapReduce. Hive defines a simple SQL-like query language, called QL, that enables users familiar with SQL to query the data.

maui-indexer - Maui - Multi-purpose automatic topic indexing

http://code.google.com/p/maui-indexer/ Summary Maui automatically identifies main topics in text documents. Depending on the task, topics are tags, keywords, keyphrases, vocabulary terms, descriptors, index terms or titles of Wikipedia articles.

Rapid-I - RapidMiner

http://rapid-i.com/content/view/181/190/ "Many thanks to all of you at Rapid-I for your hospitality and professionalism before and during the training course this week. For me there were so many highlights, but I just want to mention that my appreciation of what is possible with RapidMiner increased significantly during the 3 days - great product!" Russ Weedon, Basis06, Switzerland "RapidMiner is an awesome package. Thank you for making such powerful functionality available in such a convenient form." Michael Van Kleeck, USA
http://www.revolutionanalytics.com/products/r-for-apache-hadoop.php

Using Revolution R Enterprise With Apache Hadoop for 'Big Analytics'

The 'Big Data' explosion of the last few years has led to new infrastructure investments around storage and data architectures. Apache Hadoop has rapidly become a leading option for storing and performing operations on big data. Meanwhile, R has emerged as the tool of choice for data scientists modeling and running advanced analytics. Revolution Analytics brings R to Hadoop, giving companies a way to get better returns on their big data investments and extract unique, competitive insights from advanced analytics with the most cost-effective solution on the market. Read the white paper "Advanced 'Big Data' Analytics with R and Hadoop" PDF
http://www.greenplum.com/products/chorus

Chorus: Productivity engine for Data Science Teams | Greenplum

Real-Time Collaboration: Greenplum Chorus breaks down the silos across the enterprise by replacing the backlog of email with a single interface for all your organization’s data, together with virtual databases for exploration and innovation, and social collaboration for insight and analysis. Greenplum Chorus provides rich social network features that revolve around datasets, insights, and other key Chorus components, allowing all Big Data Analytics stakeholders to participate and collaborate in the same environment. The result is that the data science team can collaboratively discover, share, and discuss insights that have a meaningful impact on the business. Data Exploration: Gone are the days of hunting through email or shared drives for one specific comment or piece of code.