Reviews & Comparisons

Cassandra, Hive, and Hadoop: How We Picked Our Analytics Stack | MarkedUp - Analytics and Insights for Windows 8. When we first made MarkedUp Analytics available on an invite-only basis back in September, we had no idea how quickly the service would be adopted. By the time we completely opened MarkedUp to the public in December, our business was going gangbusters. But we ran into a massive problem by the end of November: it was clear that RavenDB, our chosen database while we were prototyping our service, wasn’t going to be able to keep growing with us. So we had to find an alternative database and data analysis system, quickly!

The Nature of Analytic Data

We started by thinking about our data, now that we were moving out of the “validation” phase and into the “scaling” phase of our business.

Analytics is a weird business when it comes to read / write characteristics and data access patterns. Most CRUD applications, mobile apps, and e-commerce software share one pattern of read / write characteristics; in analytics, though, the relationship is inverted.

Database Criteria. HBase. Riak.

Database - MongoDB vs. Cassandra.

Why We Migrated from MongoDB to Riak. Powering hundreds of thousands of clicks and shares and hundreds of millions of pageviews every month, Shareaholic found itself needing to scale up its big data architecture. Learn why we chose Riak, what it does for us, and how we imported 100 gigabytes of data from a live Mongo database while maintaining data consistency and zero downtime. I'll discuss the things that went smoothly and the things I'd do differently next time. About the author: Robby leads tech at Shareaholic in Cambridge, MA.

Cassandra vs MongoDB vs CouchDB vs Redis vs Riak vs HBase vs Couchbase vs Hypertable vs ElasticSearch vs Accumulo vs VoltDB vs Scalaris comparison :: Software architect Kristof Kovacs. While SQL databases are insanely useful tools, their monopoly of the last decades is coming to an end. And it's about time: I can't even count the things that were forced into relational databases, but never really fitted them. (That being said, relational databases will always be the best for the stuff that has relations.) But the differences between NoSQL databases are much bigger than they ever were between one SQL database and another.

This puts a bigger responsibility on software architects to choose the appropriate one for a project right at the beginning. In this light, here is a comparison of open source NoSQL databases:

The most popular ones

Redis. Best used: For rapidly changing data with a foreseeable database size (should fit mostly in memory). For example: To store real-time stock prices.

Cassandra. Best used: When you need to store data so huge that it doesn't fit on a server, but still want a friendly familiar interface to it.

MongoDB. ElasticSearch. CouchDB. Accumulo.

NoSQL.

A NoSQL (often interpreted as Not Only SQL[1][2]) database provides a mechanism for storage and retrieval of data that is modeled in means other than the tabular relations used in relational databases. Motivations for this approach include simplicity of design, horizontal scaling and finer control over availability. The data structure (e.g. key-value, graph, or document) differs from the RDBMS, and therefore some operations are faster in NoSQL and some in RDBMS. There are differences though, and the particular suitability of a given NoSQL DB depends on the problem it must solve (e.g. does the solution use graph algorithms?).

History

There have been various approaches to classify NoSQL databases, each with different categories and subcategories. A more detailed classification is the following, by Stephen Yen:[9]

Performance

Examples

Graph