
CAP Theorem


Ask For Forgiveness Programming - Or How We'll Program 1000 Cores

The argument for a massively multicore future is now familiar: while clock speeds have leveled off, device density is increasing, so the future is cheap chips with hundreds and thousands of cores.

That’s the inexorable logic behind our multicore future. The unsolved question that lurks deep in the dark part of a programmer’s mind is: how on earth are we to program these things? For problems that aren’t embarrassingly parallel, we really have no idea. IBM Research’s David Ungar has an idea. And it’s radical in the extreme... Grace Hopper once advised “It's easier to ask for forgiveness than it is to get permission.”

You may recognize David as the co-creator of the Self programming language, the inspiration for the HotSpot technology in the JVM and the prototype model used by JavaScript. During a talk on his research, Everything You Know (about Parallel Programming) Is Wrong! ... No Shared Lock Goes Unpunished ... Jakob Engblom recounts a similar line of thought in his blog: ... Lessons from Nature.

Myth: Eric Brewer on Why Banks are BASE Not ACID - Availability Is Revenue

In NoSQL: Past, Present, Future Eric Brewer has a particularly fine section explaining the often hard-to-understand ideas of BASE (Basically Available, Soft State, Eventually Consistent), ACID (Atomicity, Consistency, Isolation, Durability), and CAP (Consistency, Availability, Partition Tolerance) in terms of a pernicious, long-standing myth about the sanctity of consistency in banking.

Myth: Money is important, so banks must use transactions to keep money safe and consistent, right? Reality: Banking transactions are inconsistent, particularly for ATMs. ATMs are designed to have a normal case behaviour and a partition mode behaviour. In partition mode Availability is chosen over Consistency. Why? Your ATM transaction must go through, so Availability is more important than Consistency. This is not a new problem for the financial industry. During the Renaissance, when the modern banking system started to take shape, everything was partitioned. Consistency, it turns out, is not the Holy Grail.

CAP Twelve Years Later: How the "Rules" Have Changed

This article first appeared in Computer magazine and is brought to you by InfoQ & IEEE Computer Society.

The CAP theorem asserts that any networked shared-data system can have only two of three desirable properties. However, by explicitly handling partitions, designers can optimize consistency and availability, thereby achieving some trade-off of all three. In the decade since its introduction, designers and researchers have used (and sometimes abused) the CAP theorem as a reason to explore a wide variety of novel distributed systems.
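One way to picture "explicitly handling partitions" is the ATM case mentioned earlier: a normal mode that checks with the bank, and a partition mode that stays available but bounds the damage. The sketch below is a toy model, not how any real bank works. The class `Atm`, the `partition_limit` cap, and the `heal` step that replays pending withdrawals and reports an overdraft are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Atm:
    """Toy ATM with a normal mode and a partition mode (hypothetical model)."""
    bank_balance: int                  # authoritative balance held by the bank
    partition_limit: int = 200         # assumed cap on withdrawals while partitioned
    partitioned: bool = False
    pending: list = field(default_factory=list)  # withdrawals the bank has not seen

    def withdraw(self, amount: int) -> bool:
        if not self.partitioned:
            # Normal mode: consult the authoritative balance before paying out.
            if amount > self.bank_balance:
                return False
            self.bank_balance -= amount
            return True
        # Partition mode: stay available, but bound the total risk taken on.
        if sum(self.pending) + amount > self.partition_limit:
            return False
        self.pending.append(amount)
        return True

    def heal(self) -> int:
        """End the partition, replay pending withdrawals, return any overdraft."""
        self.partitioned = False
        self.bank_balance -= sum(self.pending)
        self.pending.clear()
        return max(0, -self.bank_balance)
```

In this model the invariant "balance never goes below zero" is allowed to break during a partition and is repaired afterwards, for example by charging the overdraft back to the customer. The cap only limits how far the balance can drift while the ATM cannot reach the bank.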

The NoSQL movement also has applied it as an argument against traditional databases. The CAP theorem states that any networked shared-data system can have at most two of three desirable properties: consistency (C), equivalent to having a single up-to-date copy of the data; high availability (A) of that data (for updates); and tolerance to network partitions (P). Why "2 of 3" is misleading ... In fact, this exact discussion led to the CAP theorem. ... Atomicity (A). ...

How to beat the CAP theorem

The CAP theorem states a database cannot guarantee consistency, availability, and partition-tolerance at the same time.

But you can't sacrifice partition-tolerance (see here and here), so you must make a tradeoff between availability and consistency. Managing this tradeoff is a central focus of the NoSQL movement. Consistency means that after you do a successful write, future reads will always take that write into account. Availability means that you can always read and write to the system.
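To make these two definitions concrete, here is a minimal sketch of a two-replica store that is forced to pick one of them once the replicas cannot talk to each other. Everything in it is hypothetical: the `TwoNodeStore` class, its `"CP"` and `"AP"` modes, and the `partitioned` flag are illustrative names, and "unavailable" is simplified to mean "rejects writes".

```python
class Replica:
    """A single copy of one value."""

    def __init__(self):
        self.value = None


class TwoNodeStore:
    """Two replicas, 'a' and 'b', with a switchable partition between them."""

    def __init__(self, mode):
        if mode not in ("CP", "AP"):
            raise ValueError("mode must be 'CP' or 'AP'")
        self.mode = mode
        self.a = Replica()
        self.b = Replica()
        self.partitioned = False

    def write(self, node, value):
        """Write through replica `node` ('a' or 'b'); return True on success."""
        if not self.partitioned:
            # No partition: update both copies, so reads anywhere see the write.
            self.a.value = value
            self.b.value = value
            return True
        if self.mode == "CP":
            # Consistency first: refuse the write rather than let copies diverge.
            return False
        # Availability first: accept the write locally; the other copy goes stale.
        getattr(self, node).value = value
        return True

    def read(self, node):
        return getattr(self, node).value
```

With `mode="CP"`, a write during a partition fails but both replicas keep returning the same value. With `mode="AP"`, the write succeeds but a read from the other replica misses it until the copies are reconciled, and that reconciliation is the part left to the application.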

During a partition, you can only have one of these properties. Systems that choose consistency over availability have to deal with some awkward issues. The other option is choosing availability over consistency. I believe that maintaining eventual consistency in the application layer is too heavy a burden for developers. So sacrificing availability is problematic, and eventual consistency is too complex to reasonably build applications on. There is another way. ... What is a data system? ... That's it.