NOSQL and Polyglot Persistence


NOSQL Patterns. Over the last couple of years, a new kind of data storage mechanism has emerged for storing data at large scale.

These storage solutions differ quite significantly from the RDBMS model and are also known as NOSQL. Some of the key players include: Google BigTable, HBase, Hypertable; Amazon Dynamo, Voldemort, Cassandra, Riak; Redis; CouchDB, MongoDB.

These solutions have a number of characteristics in common: they are key-value stores; they run on a large number of commodity machines; data is partitioned and replicated among these machines; and the data consistency requirement is relaxed (because the CAP theorem proves that you cannot get Consistency, Availability and Partition tolerance at the same time).

The aim of this blog is to extract the underlying technologies that these solutions have in common, and to get a deeper understanding of the implications for your application's design. I do not intend to compare the features of these solutions, nor to suggest which one to use.

API model

The basic form of API access is.
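To make the shared characteristics listed above more concrete (key-value access, data partitioned and replicated across machines), here is a minimal Python sketch. It is not taken from any of the products named: the class `PartitionedStore`, its `put`/`get` methods, and the choice of a simple hash-modulo placement with ring-order replicas are all assumptions made for illustration.

```python
import hashlib


class PartitionedStore:
    """Toy key-value store spread over several in-memory 'machines'.

    Hypothetical illustration only; real systems use far more
    elaborate placement, failure detection and consistency schemes.
    """

    def __init__(self, node_count=4, replicas=2):
        if not 1 <= replicas <= node_count:
            raise ValueError("replicas must be between 1 and node_count")
        # Each dict stands in for one commodity machine.
        self.nodes = [{} for _ in range(node_count)]
        self.replicas = replicas

    def _owners(self, key):
        # Hash the key to pick a primary node, then take the next
        # nodes in ring order as additional replicas.
        digest = hashlib.md5(key.encode("utf-8")).hexdigest()
        primary = int(digest, 16) % len(self.nodes)
        return [(primary + i) % len(self.nodes) for i in range(self.replicas)]

    def put(self, key, value):
        # Write the value to every node that owns this key.
        for index in self._owners(key):
            self.nodes[index][key] = value

    def get(self, key, failed=()):
        # Read from the first owning node that is not marked as failed.
        for index in self._owners(key):
            if index not in failed:
                return self.nodes[index].get(key)
        raise KeyError(f"no live replica for {key!r}")


store = PartitionedStore(node_count=4, replicas=2)
store.put("user:42", {"name": "Ada"})

# Simulate losing the primary node: the read falls back to a replica.
primary = store._owners("user:42")[0]
value_after_failure = store.get("user:42", failed={primary})
```

Because every key is written to two nodes, a read can still succeed when the primary is unavailable. The sketch ignores what happens when replicas disagree, which is exactly where the relaxed consistency requirement comes into play.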

Big Data Architecture at LinkedIn. 1. I am here with Sid Anand today. Sid has recently joined LinkedIn. How are you doing, Sid? So you made the move over to LinkedIn. What is your job title, and can you tell us a little about the move over to LinkedIn? Yes, sure. 5. That is an excellent question too. So some vendors, like, say, Neo4j, provide a solution that runs on one machine because they recognize this is a difficult problem, but for companies like LinkedIn or Facebook, where the graphs don’t fit on one machine, that is not a good solution. 6. That’s another great question. 7. Voldemort is a Dynamo-inspired key-value store that was created by Jay Kreps in 2008.

NoSQL: The Dawn of Polyglot Persistence. January 18, 2010 by Stephan Schmidt

For some developers polyglot programming is already reality.

I’m not such a big fan of polyglot programming, that is, using many programming languages in one company. Especially for small companies there are hurdles, like turnover. I’ve seen projects stranded because no one understood a particular language. Or as Alex Ruiz writes: I haven’t seen any practical evidence yet to convince me this is a good idea. Contrary to that, I’m a big fan of polyglot persistence. Many applications may require a non-traditional data store (say, something like MongoDB) for their core domain, but have other features that fit perfectly into a relational database: say, a CMS that relies heavily on custom fields and has a traditional user management system.

SQL is just fine

But you might say: “SQL is working for me!”. Many companies on the web wave front have created their own storage systems to better suit their needs: Flickr, Facebook, Google and Amazon, to name only a few.

Best of breed.

The Evolving Panorama of Data.

The NOSQL Tapes, vol. 37: Rickard Öberg on Polyglot Persistence in Practice.

The dark side of NoSQL. September 30, 2009 by Stephan Schmidt

There is a dark side to most of the current NoSQL databases. People rarely talk about it. They talk about performance, about how easy schemaless databases are to use. About nice APIs. They are mostly developers, not operations staff or system administrators. The three problems no one talks about (almost no one; I had a good talk with the Infinispan lead [1]) are: ad hoc data fixing (either no query language available or no skills); ad hoc reporting (either no query language available or no in-house skills); data export (sometimes no API way to access all data). In an insightful comment to my blog post “Essential storage tradeoff: Simple Reads vs.

My application relies on hundreds of queries that need to run in real-time against all of that transactional data, with no offline cubes or Hadoop clusters. Data export: NoSQL databases are affected differently by those problems. What is your NoSQL strategy?

Swarm: A true distributed programming language.

Fundamentals

The fundamental concept behind Swarm is that we should “move the computation, not the data”.
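The idea of moving the computation rather than the data can be sketched in a few lines of Python. This is only an illustration of the principle, not of how Swarm works: the `Node` class, its `fetch` and `run` methods, and the `items_shipped` counter are hypothetical names invented here, and the function is passed explicitly rather than migrated transparently.

```python
class Node:
    """Hypothetical stand-in for a machine that holds some data locally."""

    def __init__(self, name, data):
        self.name = name
        self.data = data          # data that lives only on this node
        self.items_shipped = 0    # crude count of items sent to a caller

    def fetch(self, key):
        # "Move the data": ship the whole value to the caller.
        value = self.data[key]
        self.items_shipped += len(value)
        return value

    def run(self, key, computation):
        # "Move the computation": run the function where the data lives
        # and ship back only the (assumed small, single-item) result.
        result = computation(self.data[key])
        self.items_shipped += 1
        return result


node = Node("n1", {"log": list(range(100_000))})

# Approach 1: pull all the data over, then compute locally.
total_by_moving_data = sum(node.fetch("log"))
shipped_moving_data = node.items_shipped

# Approach 2: send the computation to the node instead.
node.items_shipped = 0
total_by_moving_code = node.run("log", sum)
shipped_moving_code = node.items_shipped
```

Both approaches produce the same total, but the second ships one item instead of one hundred thousand, which is the saving the principle aims for when the data is large and the result is small.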

Swarm: A true distributed programming language

The Swarm prototype is a simple stack-based language, akin to a primitive version of the Java bytecode interpreter. I wanted the proof of concept to be quick to implement, while demonstrating that the concept could work for a popular runtime like the JVM or Microsoft’s CLR. Update (Sept 17th 09): Swarm is now implemented as a Scala library, so you program in normal Scala, rather than in a custom stack-based language as with the prototype described here.

It uses the Scala 2.8 Continuations plugin to achieve this.

The Prototype

The prototype is implemented in Scala, and I will use snippets of Scala code below, but a knowledge of Scala won’t be required to understand the rest of this article. As with the JVM, there are three places to store data in the Swarm VM: the stack, a local variable array, and the store.

The “Store”