The Net Takeaway: SQL and Hadoop SQL and Hadoop · 11/20/2008 12:23 PM, Database Analysis I don’t know why there is so much confusion over the role of MapReduce oriented databases like Hadoop vs. SQL oriented databases. It’s actually pretty simple. There are 2 things people want to do with databases: Select and Aggregate/Report, aka Process. The Select portion is filtering: finding specific data points based on attributes like time, category, etc. So, how do we tell databases to do these 2 things? While some programmers immediately get what SQL can do, others find it to be “YAL”, “Yet Another Language”. MapReduce is a programming concept that’s been around for a while in the object-oriented world, but has recently become more popular as scripting languages rise and as processors become more parallel. Therefore, if you think about it, both Hadoop and SQL databases are doing the same thing: Selecting some data (the Map phase) and Processing it (the Reduce phase). So, why the sturm und drang? But we aren’t there yet. SQL vs.
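To make the Select/Process analogy concrete, here is a minimal Python sketch. The sample rows, field names, and the two helper functions are hypothetical, invented for illustration; this is not Hadoop's API. The map step filters and emits key/value pairs (the Select), and the reduce step sums per key (the Process). In SQL the same intent would read roughly as `SELECT category, SUM(amount) FROM sales WHERE year = 2008 GROUP BY category`, with `sales` being an assumed table name.

```python
from collections import defaultdict

# Hypothetical sample rows; field names are illustrative only.
rows = [
    {"category": "books", "year": 2008, "amount": 12.0},
    {"category": "books", "year": 2007, "amount": 8.0},
    {"category": "music", "year": 2008, "amount": 5.0},
    {"category": "books", "year": 2008, "amount": 3.0},
]


def map_phase(rows, year):
    """Select: keep rows for one year and emit (key, value) pairs."""
    for row in rows:
        if row["year"] == year:
            yield row["category"], row["amount"]


def reduce_phase(pairs):
    """Process: sum the values emitted for each key."""
    totals = defaultdict(float)
    for key, value in pairs:
        totals[key] += value
    return dict(totals)


totals_2008 = reduce_phase(map_phase(rows, 2008))
print(totals_2008)
```

In a real MapReduce system the pairs would be partitioned by key across machines between the two phases; the sketch runs both in one process to show only the shape of the computation.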
Getting Started with NoSQL « myNoSQL A couple of weeks ago, I had the pleasure to sit down with Mathias Meyer, Chief Visionary at Scalarium, a Berlin startup, and discuss NoSQL adoption. Like myself, Mathias is really excited about NoSQL and he uses every opportunity to introduce more people to the NoSQL space. Recently he gave quite a few presentations around Europe about NoSQL databases. The discussion focused on how someone would start learning and using NoSQL databases and the path to follow in this new ecosystem. Alex: How does one get started with NoSQL? Mathias: Well, that’s a question I get quite a lot, but it is not that easy to answer. From a business perspective, you are probably going to find some use cases where storing your data in a relational database doesn’t make too much sense and you’ll start looking for ways to get it out of the database. Alex: So, as a developer you should just give yourself a chance to play with the new shiny toys. Mathias: Indeed. You can’t really give a universal answer here.
Code Monkeyism | The Blog for Developers by Stephan Schmidt PHP MongoDB driver examples & tips A large proportion of support requests to MongoLab are questions about how to properly configure and use a particular MongoDB driver. This blog post is the third of a series where we are covering each of the major MongoDB drivers in depth. The driver we’ll be covering here is the PHP driver, developed and maintained by the MongoDB, Inc. team (primarily @derickr, @bjori and @jmikola). In this post: This post aims to help you understand how to configure and use the PHP driver effectively in your MongoDB application. A simple PHP example You can find a straightforward example on connecting, inserting, updating and querying using the PHP driver in MongoLab’s Language Center. Production-ready connection settings We often see incorrect configurations of the driver, particularly around timeouts and replica set connections. Additional connection options that are supported by the PHP driver can be found here. PHP driver tips & tricks Index builds can sometimes block new connections connectionTimeoutMS
How Digg is Built? Using a Bunch of NoSQL technologies The picture should speak for Digg’s polyglot persistency approach: But here is also a description of the data stores in use: Digg stores data in multiple types of systems depending on the type of data and the access patterns, and also for historical reasons in some cases :) Cassandra: The primary store for “Object-like” access patterns for such things as Items (stories), Users, Diggs and the indexes that surround them. I know this will sound strange, but isn’t it too much in there? @antirez Original title and link: How Digg is Built? via: by Alex Popescu & Ana-Maria Bacalu
The Apache Cassandra Project Hadoop Has Promise but Also Problems... Show Me the Cheaper or Simpler Alternatives Jessica E. Vascellaro for WSJ: But some early adopters of Hadoop now say using the technology is challenging and rolling it out will take time.[…]Mr. Boroditsky says Hadoop is “immature” and comes with additional costs of hiring in-house expertise and consultants. I’m starting to believe that the “Hadoop has problems and is complex” chorus is a vendor reaction very similar to the reaction they had to open source in general. How many other tools can lead you to the same solution? It would be great if Hadoop administration got simpler, operational costs went down, and know-how were easier to find. Original title and link: Hadoop Has Promise but Also Problems… Show Me the Cheaper or Simpler Alternatives (NoSQL database©myNoSQL)
Aggregation in MongoDB 2.6: Things Worth Knowing TL;DR: The powerful aggregation framework in MongoDB is even more powerful in MongoDB 2.6 The MongoDB 2.6 release improved the aggregation framework (one of MongoDB's best features) considerably. We often hear from customers who are unaware of the aggregation framework, or unsure exactly why they should be using it. Introducing aggregation The aggregation framework in MongoDB has become the go-to tool for a range of problems which would traditionally have been solved with the map-reduce engine. Step by step We prefer to show rather than tell, so let's look at a worked example. If you don't have a collection like that, try this Node.js program which will make you a million documents: (The program uses the Faker.js library to mock up records.) What we want to know is how many of those records belong to the same zip code. Now we want to use the aggregate's $group operator as the first element of the pipeline. We add this as the next element in the pipeline array and we put that into the shell...
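The zip-code count described above can be sketched as follows. The excerpt does not show the original pipeline or the document schema, so the field name `zip`, the sample documents, and the `$sort` second stage are all assumptions. The `pipeline` list shows what such a `$group` stage might look like when written as Python dictionaries, and the helper `group_count_by` (hypothetical, not part of any MongoDB driver) reproduces the same counting in plain Python so the logic can be checked without a server.

```python
from collections import Counter

# Assumed shape of the pipeline; "zip" is a guessed field name and the
# $sort stage is an assumption, since the excerpt does not name it.
pipeline = [
    {"$group": {"_id": "$zip", "count": {"$sum": 1}}},
    {"$sort": {"count": -1}},
]

# Hypothetical sample documents standing in for the generated collection.
docs = [
    {"name": "Ann", "zip": "10001"},
    {"name": "Bob", "zip": "94105"},
    {"name": "Cal", "zip": "10001"},
    {"name": "Dee", "zip": "10001"},
    {"name": "Eve", "zip": "94105"},
    {"name": "Fay", "zip": "60601"},
]


def group_count_by(docs, field):
    """Count documents per distinct value of `field`, largest group first."""
    counts = Counter(doc[field] for doc in docs)
    return [{"_id": key, "count": n} for key, n in counts.most_common()]


result = group_count_by(docs, "zip")
for row in result:
    print(row)
```

Against a live collection, a list like `pipeline` would be passed to the driver's aggregate call instead of running the helper; the plain-Python version is only meant to show what `$group` with `{"$sum": 1}` computes.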
nodechat.js – Using node.js, backbone.js, socket.io, and redis to make a real time chat app Geek fun: take node.js and a NoSQL database — usually it is MongoDB, CouchDB, or Redis, but adventurous types could even try Riak, HBase, or Cassandra — and create a “real-time” chat or collaborative editor: nodechat.js is a simple, realtime chat app that leverages node.js, backbone.js, socket.IO, and redis. I wrote it as an exercise and I am sharing it because there are relatively few working examples using all these pieces together. The outcome? Update: A node.js, socket.io, and CouchDB post. Original title and link: nodechat.js – Using node.js, backbone.js, socket.io, and redis to make a real time chat app (NoSQL databases © myNoSQL)
CouchDB: Technical Overview A Database for the Web CouchDB is a database that completely embraces the web. Store your data with JSON documents. Access your documents with your web browser, via HTTP. CouchDB comes with a suite of features, such as on-the-fly document transformation and real-time change notifications, that makes web app development a breeze. See the introduction, technical overview, or one of the guides for more information. Want to Contribute? CouchDB is an open source project. One of the first things you should do is actually use CouchDB, and get to know it, read about it, evangelise it, and engage with the wider community. Why don’t you check out JIRA and help us triage some of those issues? Do you want to contribute code?
Reflections on the New Database Revolution | The Database Revolution The recent Bloor Group roundtable discussion on The New Database Revolution covered a lot of ground. Both the benefits and challenges of the new database paradigms were discussed and a broad range of questions were raised by the audience. In thinking about how the discussion went, there are a number of themes that are worth further mention. The Need for History One important theme is the need to understand how we got where we are today. Today, we find architectures in enterprises that cannot be explained by conscious decisions based on a single current architecture. So we have an existing architecture, which is already very complex and may not be fully understood (or understandable) and we are going to have to add to it to support the new database platforms. Before and After the Relational Interlude Another historical topic that came up in the roundtable was the existence of databases in the long-forgotten world prior to the dominance of the relational paradigm.
MongoDB Blog By Sunil Sadasivin, CTO at Buffer Buffer, powered by experiments and metrics At Buffer, every product decision we make is driven by quantitative metrics. We have always sought to be lean in our decision making, and one of the core tenets of being lean is launching experimental features early and measuring their impact. Buffer is a social media tool to help you schedule and space out your posts on social media networks like Twitter, Facebook, Google+ and LinkedIn. We started in late 2010 and thanks to a keen focus on analytical data, we have now grown to over 1.5 million users and 155k unique active users per month. When I started at Buffer in September 2012 we were using a mixture of Google Analytics, Kissmetrics and an internal tool to track our app usage and analytics. We took the plunge in April 2013 to build our own metrics framework using MongoDB. Why we chose MongoDB At the time we were evaluating datastores, we had no idea what our data would look like. Tracking events Result: