Scalability

Amazon's Dynamo. In two weeks we’ll present a paper on the Dynamo technology at SOSP, the prestigious biennial Operating Systems conference.

Amazon's Dynamo

Dynamo is internal technology developed at Amazon to address the need for an incrementally scalable, highly available key-value storage system. The technology is designed to give its users the ability to trade off cost, consistency, durability and performance, while maintaining high availability. Let me emphasize the internal technology part before it gets misunderstood: Dynamo is not directly exposed externally as a web service; however, Dynamo and similar Amazon technologies are used to power parts of our Amazon Web Services, such as S3. We submitted the technology for publication in SOSP because many of the techniques used in Dynamo originate in the operating systems and distributed systems research of the past years: DHTs, consistent hashing, versioning, vector clocks, quorum, anti-entropy based recovery, etc. The official reference for the paper is: Amazon.com.

Java NIO. As we already mentioned on the blog, for the USI 2011 challenge we looked at various NIO web servers and frameworks in Java.

Java NIO

The principle was simple: by laying out the challenge specification, we identified a few technical needs: a solution for JSON marshalling, an NIO web server supporting long polling, and a solution for persisting and sharing data. Our approach was to build POCs implementing user creation and long polling in order to pick the best solution. The solution had to be simple and quick to implement, and had to hold up under substantial load when tested with ab, the Apache benchmarking tool, and the Async Http Client library. For JSON, we all quickly agreed on using the Jackson library.

Why NIO? Let us first go back to basics: we cannot justify our choice without explaining what this Java API is. The test: Restlet, Tomcat 7, Deft.

Building Scalable Systems: an Asynchronous Approach.

The LMAX Architecture. LMAX is a new retail financial trading platform.

The LMAX Architecture

As a result it has to process many trades with low latency. The system is built on the JVM platform and centers on a Business Logic Processor that can handle 6 million orders per second on a single thread. The Business Logic Processor runs entirely in-memory using event sourcing. The Business Logic Processor is surrounded by Disruptors - a concurrency component that implements a network of queues that operate without needing locks. During the design process the team concluded that recent directions in high-performance concurrency models using queues are fundamentally at odds with modern CPU design.
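To make the event sourcing idea in this excerpt concrete, here is a minimal Python sketch. It is not LMAX's code (their system runs on the JVM); the names `OrderPlaced`, `OrderCancelled`, `OrderBook` and `replay` are hypothetical and chosen only for illustration. It assumes a single thread applies events one at a time to in-memory state, and that the same state can be rebuilt by replaying the stored event journal.

```python
from dataclasses import dataclass, field


# Hypothetical event types; the real LMAX event model is not described here.
@dataclass(frozen=True)
class OrderPlaced:
    order_id: int
    quantity: int


@dataclass(frozen=True)
class OrderCancelled:
    order_id: int


@dataclass
class OrderBook:
    """In-memory state that changes only by applying events in order."""
    open_orders: dict = field(default_factory=dict)

    def apply(self, event):
        if isinstance(event, OrderPlaced):
            self.open_orders[event.order_id] = event.quantity
        elif isinstance(event, OrderCancelled):
            self.open_orders.pop(event.order_id, None)
        else:
            raise TypeError(f"unknown event: {event!r}")


def replay(journal):
    """Rebuild state from scratch by applying every journaled event."""
    book = OrderBook()
    for event in journal:
        book.apply(event)
    return book


# The journal is the durable record; the live state is derived from it.
journal = [OrderPlaced(1, 100), OrderPlaced(2, 250), OrderCancelled(1)]

live = OrderBook()
for event in journal:
    live.apply(event)

recovered = replay(journal)
```

Because the state is a pure function of the event sequence, a processor that loses its memory can be restored by replaying the journal, which is what allows the business logic to stay entirely in memory.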

Over the last few years we keep hearing that "the free lunch is over"[1] - we can't expect increases in individual CPU speed. So I was fascinated to hear about a talk at QCon London in March last year from LMAX.

Concurrent Programming Using The Disruptor.

Facebook's Realtime Analytics System. Recently, I was reading Todd Hoff's write-up on Facebook's real-time analytics system.

Facebook's Realtime Analytics System

As usual, Todd did an excellent job in summarizing this video from Alex Himel, Engineering Manager at Facebook. In this first post, I’d like to summarize the case study, and consider some things that weren't mentioned in the summaries. This will lead to an architecture for building your own Real Time Analytics for Big-Data that might be easier to implement, using Facebook's experience as a starting point and guide, as well as the experience gathered through recent work with a few GigaSpaces customers. The second post provides a summary of that new approach as well as a pattern and a demo for building your own Real Time Analytics system. The business drive for real time analytics: time is money. The main drive for many of the innovations around realtime analytics has to do with competitiveness and cost, just as with most other advances. Why now? Technology advancement.

High Scalability.