Visual Guide to NoSQL Systems - Nathan Hurst's Blog

There are so many NoSQL systems these days that it's hard to get a quick overview of the major trade-offs involved when evaluating relational and non-relational systems in non-single-server environments. I've developed this visual primer with quite a lot of help (see credits at the end), and it's still a work in progress, so let me know if you see anything misplaced or missing, and I'll fix it. Without further ado, here's what you came here for (and further explanation after the visual).

Note: RDBMSs (MySQL, Postgres, etc.) are only featured here for comparison purposes. Also, some of these systems can vary their features by configuration (I use the default configuration here, but will try to delve into others later).

As you can see, there are three primary concerns you must balance when choosing a data management system: consistency, availability, and partition tolerance. According to the CAP Theorem, you can only pick two.
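As a rough illustration of the "pick two" framing (a simplification of the theorem, not a statement about any particular system), the sketch below enumerates the three possible pairs and names the property each pair gives up. The names `PROPERTIES` and `cap_choices` are hypothetical and exist only for this example.

```python
from itertools import combinations

# The three concerns named in the CAP discussion above.
PROPERTIES = ("consistency", "availability", "partition tolerance")


def cap_choices():
    """Return every pair of properties together with the one given up."""
    choices = []
    for pair in combinations(PROPERTIES, 2):
        # Exactly one property is left out of each pair.
        given_up = next(p for p in PROPERTIES if p not in pair)
        choices.append((pair, given_up))
    return choices


for pair, given_up in cap_choices():
    print(f"keep {pair[0]} + {pair[1]}, give up {given_up}")
```

Each of the three properties is the one given up in exactly one of the pairs.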

NoSQL, NewSQL and Beyond

Last week the 451 Group published the conclusions of a report detailing the growing set of options in the information management space. In the process they also clarified what they meant by "NewSQL".

“NewSQL” is our shorthand for the various new scalable/high performance SQL database vendors. [...NewSQL vendors] have in common the development of new relational database products and services designed to bring the benefits of the relational model to distributed architectures, or to improve the performance of relational databases to the extent that horizontal scalability is no longer a necessity. We would include (in no particular order) Clustrix, GenieDB, ScalArc, Schooner, VoltDB, RethinkDB, ScaleDB, Akiban, CodeFutures, ScaleBase, Translattice, and NimbusDB, as well as Drizzle, MySQL Cluster with NDB, and MySQL with HandlerSocket. The latter group includes Tokutek and JustOne DB.

NoSQL Data Modeling Techniques « Highly Scalable Blog

NoSQL databases are often compared by various non-functional criteria, such as scalability, performance, and consistency. This aspect of NoSQL is well-studied both in practice and theory because specific non-functional properties are often the main justification for NoSQL usage and fundamental results on distributed systems like the CAP theorem apply well to NoSQL systems. At the same time, NoSQL data modeling is not so well studied and lacks the systematic theory found in relational databases. In this article I provide a short comparison of NoSQL system families from the data modeling point of view and digest several common modeling techniques. I would like to thank Daniel Kirkdorffer who reviewed the article and cleaned up the grammar.

NoSQL databases: The Cost of Migration

Migrating to a NoSQL database is not a free ride. There are some costs and complexity involved in this process. I’ve found a good list of the costs involved in a slide from Tom Melendez’ (a bit old) presentation (embedded below):

Maven – Maven in 5 Minutes

Prerequisites

You must have an understanding of how to install software on your computer. If you do not know how to do this, please ask someone at your office, school, etc., or pay someone to explain this to you. The Maven mailing lists are not the best place to ask for this advice.

Installation

Maven is a Java tool, so you must have Java installed in order to proceed.

NoSQL Pain? Learn How to Read/write Scale Without a Complete Re-write

Lately I've been reading more cases where different people have started to realize the limitations of the NoSQL promise of database scalability. Note the references below: Take MongoDB for example. It's damn fast, but it doesn't really know how to save data reliably to disk. I've had it set up in a replica pair to mitigate that risk. Guess what - both servers in the pair failed and corrupted their data files on the same day.

FAQ

Why is Redis different compared to other key-value stores? There are two main reasons. Redis is a different evolution path in the key-value DBs where values can contain more complex data types, with atomic operations defined against those data types.

MongoDB Best Practices

Hello from the Engine Yard Data Team! We wanted to let you know what we've been up to since the last time we blogged. When the team was formed earlier in the year, our first job was to expand our stack with MongoDB.

4 Reasons Why Data Engineers Don't Use Cassandra - Open Source

Three years ago, I was stuck trying to get a use case fit into my Oracle database. It was getting expensive fast and I was running out of budget. A friend suggested I try Apache Cassandra for the task and the time series use case was perfect.

NoSQL Database Technical Overview

The Oracle NoSQL Database is a distributed key-value database. It is designed to provide highly reliable, scalable and available data storage across a configurable set of systems that function as storage nodes. Data is stored as key-value pairs, which are written to particular storage node(s), based on the hashed value of the primary key. Storage nodes are replicated to ensure high availability, rapid failover in the event of a node failure and optimal load balancing of queries. Customer applications are written using an easy-to-use Java/C API to read and write data.
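The idea of routing each key-value pair to a storage node based on the hash of its primary key can be sketched in a few lines. This is a toy illustration only, not Oracle NoSQL Database's actual partitioning scheme or API: the MD5 hash, the modulo placement, and the names `node_for_key` and `ToyCluster` are all assumptions made for the example, and replication is left out.

```python
import hashlib


def node_for_key(primary_key: str, num_nodes: int) -> int:
    """Map a primary key to a storage-node index by hashing it (assumed scheme)."""
    digest = hashlib.md5(primary_key.encode("utf-8")).digest()
    # Interpret the first 8 bytes as an integer and fold it onto the node range.
    return int.from_bytes(digest[:8], "big") % num_nodes


class ToyCluster:
    """In-memory stand-in for a set of storage nodes (hypothetical)."""

    def __init__(self, num_nodes: int):
        self.nodes = [dict() for _ in range(num_nodes)]

    def put(self, key: str, value):
        # The same key always hashes to the same node, so reads can find it.
        self.nodes[node_for_key(key, len(self.nodes))][key] = value

    def get(self, key: str):
        return self.nodes[node_for_key(key, len(self.nodes))].get(key)


cluster = ToyCluster(num_nodes=3)
cluster.put("user:42", {"name": "Ada"})
print(cluster.get("user:42"))
```

Because placement depends only on the key, a reader needs no lookup table to find the node holding a value. A plain modulo like this reshuffles most keys when the node count changes, which is one reason real systems tend to use more elaborate placement.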

Cassandra vs MongoDB vs CouchDB vs Redis vs Riak vs HBase comparison

(Yes it's a long title, since people kept asking me to write about this and that too :) I do when it has a point.) While SQL databases are insanely useful tools, their monopoly in the last decades is coming to an end. And it's just time: I can't even count the things that were forced into relational databases, but never really fitted them. (That being said, relational databases will always be the best for the stuff that has relations.)