Examples

Amazon Architecture | High Scalability. This is a wonderfully informative Amazon update based on Joachim Rohde's discovery of an interview with Amazon's CTO. You'll learn about how Amazon organizes their teams around services, the CAP theorem of building scalable systems, how they deploy software, and a lot more. Many new additions from the ACM Queue article have also been included. Amazon grew from a tiny online bookstore to one of the largest stores on earth. They did it while pioneering new and interesting ways to rate, review, and recommend products. Greg Linden shared his version of Amazon's birth pangs in a series of blog articles.

Flickr Architecture | High Scalability. Scaling Twitter: Making Twitter 10000 Percent Faster | High Scalability. Update 6: Some interesting changes from Twitter's Evan Weaver: everything in RAM now, database is a backup; peaks at 300 tweets/second; every tweet is followed by an average of 126 people; vector cache of tweet IDs; row cache; fragment cache; page cache; keep separate caches; GC makes Ruby hard to optimize, so they went with Scala; Thrift and HTTP are used internally; 100s of internal requests for every external request; rewrote the MQ but kept the interface the same; 3 queues are used to load balance requests; extensive A/B testing for backwards compatibility; switched to the C memcached client for speed; optimize the critical path; it's faster to get cached results from memory over the network than to recompute them locally. Update 5: Twitter on Scala.
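
Below is a minimal sketch of that check-the-cache-before-recomputing pattern, assuming a local memcached instance reached through the pymemcache library. The key scheme and the recompute_timeline() helper are hypothetical, for illustration only; Twitter's actual implementation was in Scala/Ruby, not Python.

```python
# Minimal sketch of the caching pattern described above, assuming a local
# memcached instance and the pymemcache client library. The key scheme and
# recompute_timeline() helper are hypothetical, not Twitter's code.
import json
from pymemcache.client.base import Client

vector_cache = Client(("127.0.0.1", 11211), key_prefix=b"vec:")

def timeline_tweet_ids(user_id: int) -> list[int]:
    """Return the cached vector of tweet IDs for a user, recomputing on a miss."""
    key = str(user_id)
    cached = vector_cache.get(key)
    if cached is not None:
        # Pulling the precomputed vector out of memcached over the network
        # is cheaper than rebuilding it from the primary store.
        return json.loads(cached)

    tweet_ids = recompute_timeline(user_id)                # expensive path
    vector_cache.set(key, json.dumps(tweet_ids), expire=300)
    return tweet_ids

def recompute_timeline(user_id: int) -> list[int]:
    # Placeholder for the slow path: fetch followed users from the database,
    # merge their recent tweet IDs, and sort by time.
    raise NotImplementedError
```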

A Conversation with Steve Jenson, Alex Payne, and Robey Pointer by Bill Venners. Twitter started as a side project and blew up fast, going from 0 to millions of page views within a few terrifying months. Lessons learned at Facebook. 7 Scaling Strategies Facebook Used to Grow to 500 Million Users. Robert Johnson, a director of engineering at Facebook, celebrated Facebook's monumental achievement of reaching 500 million users by sharing the scaling principles that helped reach that milestone. In case you weren't suitably impressed by the 500 million user number, Robert ratchets up the numbers game with these impressive figures: 1 million users per engineer; 500 million active users; 100 billion hits per day; 50 billion photos; 2 trillion objects cached, with hundreds of millions of requests per second; 130 TB of logs every day. How did Facebook get to this point?
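
To put the daily figures above on a more familiar per-second scale, here is a quick back-of-envelope conversion; the numbers come from the post, the arithmetic is just 86,400 seconds per day.

```python
# Back-of-envelope conversion of the daily figures quoted above into
# per-second rates; only standard arithmetic, no Facebook internals.
SECONDS_PER_DAY = 24 * 60 * 60                     # 86,400

hits_per_day = 100e9                               # 100 billion hits per day
log_bytes_per_day = 130e12                         # 130 TB of logs per day

print(f"hits per second:   {hits_per_day / SECONDS_PER_DAY:,.0f}")             # ~1,157,407
print(f"log MB per second: {log_bytes_per_day / SECONDS_PER_DAY / 1e6:,.0f}")  # ~1,505
```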

People Matter Most. These principles are not really new, but I think when you see them all laid out together like this it's easy to see how they all work together to make a self-reinforcing virtuous circle. Will these principles be enough to grow the next 500 million users? Facebook's New Real-time Messaging System: HBase to Store 135+ Billion Messages a Month. You may have read somewhere that Facebook has introduced a new Social Inbox integrating email, IM, SMS, text messages, and on-site Facebook messages. All in all they need to store over 135 billion messages a month. Where do they store all that stuff? Facebook's Kannan Muthukkaruppan gives the surprise answer in The Underlying Technology of Messages: HBase.
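
As a rough illustration of what "one Social Inbox for every channel" implies for the data model, here is a toy Python sketch that normalizes messages from different channels into a single per-user, time-ordered store. The field names and the in-memory dict are purely illustrative, not Facebook's schema.

```python
# Toy illustration of a unified inbox: every email, chat line, SMS, or
# on-site message is normalized into one record type and appended to a
# single per-user, time-ordered store. Names are illustrative only.
from dataclasses import dataclass, field
from enum import Enum
import time

class Channel(Enum):
    EMAIL = "email"
    CHAT = "chat"
    SMS = "sms"
    ONSITE = "onsite"

@dataclass
class Message:
    owner_id: int        # whose inbox this lands in
    sender_id: int
    channel: Channel
    body: str
    sent_at: float = field(default_factory=time.time)

inbox: dict[int, list[Message]] = {}

def deliver(msg: Message) -> None:
    """Append a message to its owner's inbox, regardless of channel."""
    inbox.setdefault(msg.owner_id, []).append(msg)

deliver(Message(owner_id=42, sender_id=7, channel=Channel.SMS, body="running late"))
deliver(Message(owner_id=42, sender_id=7, channel=Channel.EMAIL, body="notes attached"))
```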

HBase beat out MySQL, Cassandra, and a few others. Why a surprise? HBase is a scale-out table store supporting very high rates of row-level updates over massive amounts of data. Facebook chose HBase because they monitored their usage and figured out what they really needed: a short set of temporal data that tends to be volatile, and an ever-growing set of data that rarely gets accessed. Makes sense. Some key aspects of their system stand out, and I wouldn't sleep on the idea that Facebook's existing experience with HDFS/Hadoop/Hive was a big adoption driver for HBase. Facebook: Why our 'next-gen' comms ditched MySQL. About a year ago, when Facebook set out to build its email-meets-chat-meets-everything-else messaging system, the company knew its infrastructure couldn't run the thing. "[The Facebook infrastructure] wasn't really ready to handle a bunch of different forms of messaging and have it happen in real time," says Joel Seligstein, a Facebook engineer who worked on the project.
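
Here is a minimal sketch of how those two access patterns could map onto an HBase table, written against the happybase Thrift client. The table name, column families, and row-key layout are assumptions for illustration, not the schema Facebook describes.

```python
# Illustrative HBase layout via the happybase Thrift client (assumes an HBase
# Thrift server on localhost and an existing 'messages' table with two column
# families): 'meta' for the small, volatile, recently accessed data and
# 'body' for the ever-growing message content that is rarely read again.
import time
import happybase

connection = happybase.Connection("localhost")     # HBase Thrift gateway
messages = connection.table("messages")

def row_key(user_id: int, sent_at: float) -> bytes:
    # Reverse the timestamp so a prefix scan returns the newest rows first.
    reversed_ts = 2**63 - int(sent_at * 1000)
    return f"{user_id:020d}:{reversed_ts:020d}".encode()

def store_message(user_id: int, sender: str, subject: str, body: str) -> None:
    messages.put(row_key(user_id, time.time()), {
        b"meta:sender": sender.encode(),    # hot, scanned for the inbox view
        b"meta:subject": subject.encode(),
        b"body:text": body.encode(),        # cold, rarely re-read
    })

def recent_message_headers(user_id: int, limit: int = 25):
    # Row-level reads stay cheap: prefix-scan one user's slice of the table
    # and fetch only the lightweight 'meta' family.
    prefix = f"{user_id:020d}:".encode()
    return list(messages.scan(row_prefix=prefix, columns=[b"meta"], limit=limit))
```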

So Seligstein and crew mocked up a multifaceted messaging prototype, tossed it onto various distributed storage platforms, and ran a Big Data bake-off. The winner was HBase, the open source distributed database modeled after Google's proprietary BigTable platform. Facebook was already using MySQL for message storage, the open source Cassandra platform for inbox search, and the proprietary Haystack platform for storing photos. "The email workload is a write dominated workload." Facebook also picked HBase because it's designed for automatic failover.
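
To make the write-dominated point concrete, here is a small sketch of a write path using the same assumed happybase table as above: messages are appended continuously through batched puts, while any given message body may never be read again. The batch size and names are illustrative, not Facebook's implementation.

```python
# Sketch of a write-dominated path: buffer puts client-side and flush them
# in groups. Reuses the assumed 'messages' table from the sketch above.
import happybase

connection = happybase.Connection("localhost")
messages = connection.table("messages")

def bulk_append(rows):
    """rows: iterable of (row_key_bytes, {b'family:qualifier': value}) pairs."""
    # happybase's batch() buffers mutations and sends them to HBase in
    # groups of batch_size, keeping per-message write overhead low.
    with messages.batch(batch_size=1000) as batch:
        for key, data in rows:
            batch.put(key, data)
```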