
Document Oriented


Things they don't tell you about MongoDB. MongoDB is by far the most popular NoSQL database in Brazil (at least judging by the number of blog posts and articles written about it here that I read). It’s really an amazing solution, but what really bothers me is that very few people know about its limitations. So I see the same story repeating itself: people unhappy with it, treating its limitations as if they were bugs.

This post is about some of its limitations that really caught me by surprise, so that if you are thinking of adopting it, at least you’ll be warned about them and can avoid these headaches. Hungry for bytes This was my first surprise: MongoDB consumes too much disk space. If storage space is a restriction for your project you MUST take this into consideration. Data replication with the replica-set strategy is amazing, but has its limitations. The replica-set strategy for data replication in MongoDB is amazing. Master-slave replication will not ensure high availability for you. Avoid the 32 bit version.

Genghis, the single-file MongoDB admin app. Apache CouchDB. MongoDB. Richardwilly98/elasticsearch-river-mongodb. Amazon EC2 and MongoDB configuration for great performance. Sometimes, we prefer using Amazon EC2 directly for our Rails stack. No offense to Heroku, but we need a more controlled environment; and no offense to EngineYard, as they don’t yet support MongoDB in their environment. We were faced with several problems that we wanted to solve: control our environment without MongoDB hogging all the memory; choose the right instance; choose the right filesystem for optimal performance. Choice of EC2 instance is always an interesting one: when you have to shell out the money from your own pocket, you want neither to overspend nor to underutilize the instance.

Over the course of changing instances and tuning them for performance, we realized these important pointers: MongoDB runs well on ext4 systems and really badly on ext3. When you attach an EBS volume to the instance, you need to format the filesystem. After we tried these stunts we found that this is indeed well documented on the mongodb site!! Freeing page caches: # echo 1 > /proc/sys/vm/drop_caches Add swap space! Understanding MongoDB Storage - PolySpot Blog. Posted on July 4, 2012 by Arnaud BAILLY How MongoDB stores its Data This article tries to explain the intricacies of MongoDB storage and how it affects the performance of the database.

It was sparked by questions from people more familiar with “traditional” databases, and more specifically questions regarding the memory consumption of mongod and its impact on other processes running on the same host. It is rather Linux-centric. The following figure tries to present how the various components of a MongoDB node (Disks, File System, RAM) interact to provide access to the database. Note that this scheme is not specific to so-called NOSQL databases and actually is rather a general depiction of how databases interact with the operating system to provide data. Memory-mapped Files MongoDB stores its data in files called extents with a standard size of 2GB (actually, the process is a little bit more complex). Running htop and filtering on mongod yields the following picture: References. A Year with MongoDB - Engineering at Kiip.

This week marks the one year anniversary of Kiip running MongoDB in production. As of this week, we’ve also moved over 95% of our data off of MongoDB onto systems such as Riak and PostgreSQL, depending which solution made sense for the way we use our data. This post highlights our experience with MongoDB over the past year. A future post will elaborate on the migration process: how we evaluated the proper solutions to migrate to and how we migrated the data from MongoDB. First, some numbers about our data to give context to the scale being discussed. Data size: 240 GB. Total documents: 85,000,000. Operations per second: 520 (creates, reads, updates, etc.). The Good We were initially attracted to MongoDB due to the features highlighted on the website as well as word of mouth from those who had used it successfully.

Schemaless - Being a document data store, the schemaless-nature of MongoDB helps a lot. The Bad What We’re Doing Now In retrospect, MongoDB was not the right solution for Kiip. MongoDB strategies for the disk-averse. Feb 09th Behind the scenes at foursquare, we have a lot of data collection efforts that present interesting scaling puzzles. One is the venue metrics system, which allows business owners to get information about checkins to their venue over time. It lets them see the effect of specials, understand their clientele’s demographics, and even identify their most loyal customers. To store this data, we need to handle tens of writes per second across millions of venues, interleaved with infrequent reads of the last 90 days of data for a given venue. Ideally, reads would return within a second or two. We’re fans of MongoDB, and one natural way to store this information would be to have one document for every active (venue, hour) pair which contains various counters (men, women, etc.).
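The one-document-per-(venue, hour) layout described above can be sketched in plain Python, with a dict standing in for a collection. This is only an illustration: the field names (`venue`, `hour`, `counters`) and the helper names `hour_bucket`, `record_checkin`, and `read_range` are assumptions, not foursquare's actual schema or code.

```python
from datetime import datetime


def hour_bucket(ts: datetime) -> datetime:
    """Truncate a timestamp to the start of its hour."""
    return ts.replace(minute=0, second=0, microsecond=0)


def record_checkin(store: dict, venue_id: str, ts: datetime, counter: str) -> dict:
    """Increment one counter in the document for this (venue, hour) pair.

    `store` is a plain dict standing in for a collection; the document
    layout used here is hypothetical.
    """
    hour = hour_bucket(ts)
    key = (venue_id, hour)
    # Create the document on first write for this pair, then bump the counter.
    doc = store.setdefault(key, {"venue": venue_id, "hour": hour, "counters": {}})
    doc["counters"][counter] = doc["counters"].get(counter, 0) + 1
    return doc


def read_range(store: dict, venue_id: str, start: datetime, end: datetime) -> list:
    """Return the venue's documents with start <= hour < end, oldest first."""
    docs = [d for (v, h), d in store.items() if v == venue_id and start <= h < end]
    return sorted(docs, key=lambda d: d["hour"])
```

In this layout each write touches a single small document, while a 90-day read for one venue has to gather up to 90 × 24 = 2,160 hourly documents, which is why their placement on disk matters once the data no longer fits in RAM.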

If we hold all of the data live in RAM, this is no big deal. Locality, locality, locality A lot of other database systems provide ways to establish disk locality. They to query a range of this data: An Articulate Introduction to MongoDB. NOSQL has become a very heated topic for large web-scale deployments, where scalability and semi-structured data drive the DB requirements towards NOSQL.

There have been many NOSQL products evolving over the last couple of years. In my past blogs, I have been covering the underlying distributed system theory of NOSQL, as well as some specific products such as CouchDB and Cassandra/HBase. Last Friday I was very lucky to meet Jared Rosoff from 10gen at a technical conference and have a discussion about the technical architecture of MongoDB.

I found the information very useful and want to share it with more people. One thing that impresses me about MongoDB is that it is extremely easy to use, and the underlying architecture is also very easy to understand. Here are some simple admin steps to start/stop the MongoDB server: mkdir /data/lib ... ... Major difference from RDBMS MongoDB differs from RDBMS in the following way: Query processing MongoDB belongs to the type of document-oriented DB.

Trees in MongoDB. To model hierarchical or nested data relationships, you can use references to implement tree-like structures. The following Tree pattern examples model book categories that have hierarchical relationships. Model Tree Structures with Child References The Child References pattern stores each tree node in a document; in addition to the tree node, the document stores in an array the id(s) of the node’s children. Consider the following hierarchy of categories: Tree data model for a sample hierarchy of categories. The following example models the tree using Child References, storing the reference to the node’s children in the field children: The Child References pattern provides a suitable solution to tree storage as long as no operations on subtrees are necessary. Model Tree Structures with Parent References The Parent References pattern stores each tree node in a document; in addition to the tree node, the document stores the id of the node’s parent.
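As a rough sketch of the two patterns, the documents can be written as plain Python dicts and queried with list comprehensions in place of database queries. The category names below are an assumed sample hierarchy (the excerpt does not list them), and the helper function names are hypothetical; `children` and `parent` are the reference fields the patterns describe.

```python
# Child References: each node lists the ids of its children.
# The category names are an assumed sample hierarchy for illustration.
child_refs = [
    {"_id": "MongoDB", "children": []},
    {"_id": "dbm", "children": []},
    {"_id": "Databases", "children": ["MongoDB", "dbm"]},
    {"_id": "Languages", "children": []},
    {"_id": "Programming", "children": ["Databases", "Languages"]},
    {"_id": "Books", "children": ["Programming"]},
]

# Parent References: each node stores the id of its parent (None for the root).
parent_refs = [
    {"_id": "MongoDB", "parent": "Databases"},
    {"_id": "dbm", "parent": "Databases"},
    {"_id": "Databases", "parent": "Programming"},
    {"_id": "Languages", "parent": "Programming"},
    {"_id": "Programming", "parent": "Books"},
    {"_id": "Books", "parent": None},
]


def children_from_child_refs(docs, node_id):
    """Read the children array straight off the node's own document."""
    for doc in docs:
        if doc["_id"] == node_id:
            return list(doc["children"])
    return []


def parent_from_child_refs(docs, node_id):
    """Find the document whose children array contains node_id."""
    for doc in docs:
        if node_id in doc["children"]:
            return doc["_id"]
    return None


def children_from_parent_refs(docs, node_id):
    """Collect every document whose parent field points at node_id."""
    return [doc["_id"] for doc in docs if doc["parent"] == node_id]


def ancestors_from_parent_refs(docs, node_id):
    """Walk parent links up to the root, one lookup per level."""
    parent_of = {doc["_id"]: doc["parent"] for doc in docs}
    path = []
    current = parent_of.get(node_id)
    while current is not None:
        path.append(current)
        current = parent_of.get(current)
    return path
```

Either layout answers a direct parent or child lookup in one step, but walking a whole subtree or ancestor chain takes one lookup per level, which is consistent with the remark that Child References suit trees where no subtree operations are needed.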

SimpleDB. Amazon SimpleDB is a highly available NoSQL data store that offloads the work of database administration. Developers simply store and query data items via web services requests and Amazon SimpleDB does the rest. Unbound by the strict requirements of a relational database, Amazon SimpleDB is optimized to provide high availability and flexibility, with little or no administrative burden. Behind the scenes, Amazon SimpleDB creates and manages multiple geographically distributed replicas of your data automatically to enable high availability and data durability. The service charges you only for the resources actually consumed in storing your data and serving your requests. The service allows you to focus fully on value-added application development, rather than arduous and time-consuming database administration.

Amazon SimpleDB automatically creates multiple geographically distributed copies of each data item you store. Application examples include: Couchbase | Document-Oriented NoSQL Database. RavenDB - 2nd generation document database. Document-oriented database. This article is about the software type. For usage/deployment instances, see Full text database. A document-oriented database is a computer program designed for storing, retrieving, and managing document-oriented information, also known as semi-structured data.

Document-oriented databases are one of the main categories of NoSQL databases, and the popularity of the term "document-oriented database" (or "document store") has grown[1] with the use of the term NoSQL itself. In contrast to relational databases and their notion of a "Relation", i.e., a tuple (or row) of related, strongly typed data items, these systems are designed around an abstract notion of a "document".

Document-oriented databases are inherently a subclass of the key-value store, another NoSQL database concept. XML databases are a specific subclass of document-oriented databases. Documents To understand the difference, consider this text document: Bob Smith 123 Back St. Now consider the same document marked up in pseudo-XML: