
RavenDB - 2nd generation document database

Document-oriented database
A document-oriented database is a computer program designed for storing, retrieving, and managing document-oriented information, also known as semi-structured data. Document-oriented databases are inherently a subclass of the key-value store, another NoSQL database concept. XML databases are a specific subclass of document-oriented databases.
Documents
The central concept of a document-oriented database is the document, a term used in the usual English sense of a group of data that encodes some sort of user-readable information. To understand the difference, consider this text document: Bob Smith 123 Back St. Although it is clear to the reader that this document contains the address for a contact, there is no information within the document that indicates that, nor information on what the individual fields represent. Now consider the same document marked up in pseudo-XML:
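To make the contrast between unlabeled text and a structured document concrete, here is a minimal Python sketch. The field names (`type`, `name`, `address`, `street`) and the helper `find_by_field` are assumptions chosen for illustration; they do not come from the excerpt or from any particular database.

```python
import json

# Plain text: the reader has to infer what each part means.
plain_text = "Bob Smith\n123 Back St."

# Structured document: hypothetical field names label every value explicitly.
contact = {
    "type": "contact",
    "name": {"first": "Bob", "last": "Smith"},
    "address": {"street": "123 Back St."},
}

_MISSING = object()  # sentinel for "field not present"


def find_by_field(documents, path, value):
    """Return the documents whose nested field at `path` (a tuple of keys) equals `value`."""
    matches = []
    for doc in documents:
        node = doc
        for key in path:
            if not isinstance(node, dict) or key not in node:
                node = _MISSING
                break
            node = node[key]
        if node is not _MISSING and node == value:
            matches.append(doc)
    return matches


# The document serializes to a self-describing string that a store could persist.
serialized = json.dumps(contact, sort_keys=True)
```

Because the fields are labeled, a lookup such as `find_by_field([contact], ("name", "last"), "Smith")` can be answered without guessing which line of text holds the surname.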

SimpleDB Amazon SimpleDB is a highly available NoSQL data store that offloads the work of database administration. Developers simply store and query data items via web services requests and Amazon SimpleDB does the rest. Unbound by the strict requirements of a relational database, Amazon SimpleDB is optimized to provide high availability and flexibility, with little or no administrative burden. The service allows you to focus fully on value-added application development, rather than arduous and time-consuming database administration. Amazon SimpleDB automatically creates multiple geographically distributed copies of each data item you store. As your business changes or application evolves, you can easily reflect these changes in Amazon SimpleDB without worrying about breaking a rigid schema or needing to refactor code – simply add another attribute to your Amazon SimpleDB data set when needed. Amazon SimpleDB passes on to you the financial benefits of Amazon’s scale.

richardwilly98/elasticsearch-river-mongodb

MongoDB strategies for the disk-averse Feb 09th Behind the scenes at foursquare, we have a lot of data collection efforts that present interesting scaling puzzles. One is the venue metrics system, which allows business owners to get information about checkins to their venue over time. It lets them see the effect of specials, understand their clientele’s demographics, and even identify their most loyal customers. To store this data, we need to handle tens of writes per second across millions of venues, interleaved with infrequent reads of the last 90 days of data for a given venue. We’re fans of MongoDB, and one natural way to store this information would be to have one document for every active (venue, hour) pair which contains various counters (men, women, etc.). If we hold all of the data live in RAM, this is no big deal. Locality, locality, locality: A lot of other database systems provide ways to establish disk locality. MongoDB offers two options that address this, but neither is quite right for our application.
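The "one document per active (venue, hour) pair" layout can be sketched in a few lines of Python. This is an in-memory stand-in, not foursquare's implementation and not MongoDB code: the dictionary `store`, the counter names, and the functions `record_checkin` and `last_n_days` are all hypothetical.

```python
from collections import defaultdict
from datetime import datetime, timedelta

# In-memory stand-in for a collection: one entry per (venue_id, hour) pair,
# each holding a small set of counters.
store = defaultdict(lambda: defaultdict(int))


def hour_bucket(ts):
    """Truncate a timestamp to the start of its hour."""
    return ts.replace(minute=0, second=0, microsecond=0)


def record_checkin(venue_id, ts, group):
    """Increment the counters of the (venue, hour) document for one checkin."""
    counters = store[(venue_id, hour_bucket(ts))]
    counters["total"] += 1
    counters[group] += 1


def last_n_days(venue_id, now, days=90):
    """Sum the counters of every hourly document for a venue in the last `days` days."""
    cutoff = now - timedelta(days=days)
    totals = defaultdict(int)
    for (vid, bucket), counters in store.items():
        if vid == venue_id and bucket >= cutoff:
            for name, count in counters.items():
                totals[name] += count
    return dict(totals)
```

Writes touch exactly one small document, while a 90-day read has to visit every hourly document for the venue. That read pattern is why the excerpt cares about where those documents sit on disk.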

A Year with MongoDB - Engineering at Kiip This week marks the one year anniversary of Kiip running MongoDB in production. As of this week, we’ve also moved over 95% of our data off of MongoDB onto systems such as Riak and PostgreSQL, depending on which solution made sense for the way we use our data. This post highlights our experience with MongoDB over the past year. First, some numbers about our data to give context to the scale being discussed. Data size: 240 GB; total documents: 85,000,000; operations per second: 520 (creates, reads, updates, etc.). The Good: We were initially attracted to MongoDB due to the features highlighted on the website as well as word of mouth from those who had used it successfully. Schemaless - Being a document data store, the schemaless nature of MongoDB helps a lot. The Bad: Although MongoDB has a lot of nice features on the surface, most of them are marred by underlying architectural issues. Non-counting B-Trees - MongoDB uses non-counting B-trees as the underlying data structure to index data.

An Articulate Introduction to MongoDB NoSQL has become a very heated topic for large web-scale deployments, where scalability and semi-structured data have driven database requirements towards NoSQL. Many NoSQL products have evolved over the last couple of years. In my past blogs, I have covered the underlying distributed system theory of NoSQL, as well as some specific products such as CouchDB and Cassandra/HBase. Last Friday I was very lucky to meet Jared Rosoff from 10gen at a technical conference and discuss the technical architecture of MongoDB. I found the information very useful and want to share it with more people. One thing that impresses me about MongoDB is that it is extremely easy to use, and the underlying architecture is also very easy to understand. Here are some simple admin steps to start/stop the MongoDB server: mkdir /data/lib ... Major difference from RDBMS: MongoDB differs from RDBMS in the following way. Query processing: MongoDB belongs to the type of document-oriented DB.

Understanding MongoDB Storage - PolySpot Blog Posted on July 4, 2012 by Arnaud BAILLY How MongoDB stores its Data: This article tries to explain the intricacies of MongoDB storage and how it affects the performance of the database. It was sparked by questions from people more familiar with “traditional” databases, and more specifically questions regarding the memory consumption of mongod and its impact on other processes running on the same host. It is rather Linux-centric. The following figure tries to present how the various components of a MongoDB node (Disks, File System, RAM) interact to provide access to the database. Note that this scheme is not specific to so-called NoSQL databases and is actually a rather general depiction of how databases interact with the operating system to provide data. Memory-mapped Files: MongoDB stores its data in files called extents with a standard size of 2GB (actually, the process is a little bit more complex). Running htop and filtering on mongod yields the following picture:
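As a rough illustration of what memory-mapping a data file means, the sketch below uses Python's standard `mmap` module on a small temporary file. It only demonstrates the general operating-system mechanism; it is not MongoDB's storage code, and the function name `write_and_read_via_mmap` and the 4096-byte file size are assumptions made for the example.

```python
import mmap
import os
import tempfile


def write_and_read_via_mmap(payload: bytes, size: int = 4096) -> bytes:
    """Write `payload` through a memory mapping, then read it back with ordinary file I/O."""
    if len(payload) > size:
        raise ValueError("payload does not fit in the mapped file")
    fd, path = tempfile.mkstemp()
    try:
        os.ftruncate(fd, size)  # preallocate the file, as a data file would be
        with mmap.mmap(fd, size) as mapped:
            # Writing to the mapping is a plain memory write; the OS page cache
            # decides when the dirty pages actually reach the disk.
            mapped[0:len(payload)] = payload
            mapped.flush()  # ask the OS to write the dirty pages back now
        os.lseek(fd, 0, os.SEEK_SET)
        return os.read(fd, len(payload))
    finally:
        os.close(fd)
        os.remove(path)
```

Since mapped pages live in the operating system's page cache, a process that maps large files can show a large memory footprint in tools like htop even though the kernel is free to evict those pages when other processes need memory.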

Amazon EC2 and MongoDB configuration for great performance Sometimes, we prefer using Amazon EC2 directly for our Rails stack. No offense to Heroku but we need a more controlled environment; and no offense to EngineYard as they don’t support MongoDB on their environment as yet. We were faced with several problems that we wanted to solve: control our environment without MongoDB hogging all the memory; choose the right instance; choose the right filesystem for optimal performance. Choice of EC2 instance is always an interesting one: when you have to shell out the money from your pocket, you neither want to overspend nor underutilize the instance. Over the course of changing instances and tuning them for performance, we realized these important pointers: MongoDB runs well on ext4 filesystems and really badly on ext3. When you attach an EBS volume to the instance, you need to format the filesystem. After we tried these stunts we found that this is indeed well documented on the MongoDB site! Freeing page caches:
# echo 1 > /proc/sys/vm/drop_caches
Add swap space!
