Tuesday, February 10, 2015

Paper summary: Perspectives on the CAP theorem

This is a 2012 short paper by Seth Gilbert and Nancy Lynch that appeared in a special issue commemorating the 12th anniversary of the CAP theorem. Gilbert and Lynch get to write in this special issue because they were the first to publish a proof of the CAP conjecture put forward by Eric Brewer in his PODC 2000 keynote.

In this paper, Gilbert and Lynch aim to situate the CAP theorem in the broader context of a family of results in distributed computing theory that shows impossibility of guaranteeing both safety and liveness in an unreliable distributed system. A quick refresher on specifications is in order. Safety says "nothing bad happens". (You can satisfy safety by doing nothing.) Liveness says "eventually something good happens". Unreliability (aka failure model) says "such and such faults happen".

The impossibility results surveyed in relation to CAP concern slightly different problems and slightly different fault models. While it is easy to confuse CAP with those results on a superficial look, on a closer inspection we see that the results are all distinct and none subsume CAP.

The CAP problem 

The CAP theorem does NOT consider the consensus problem, but considers an easier problem: the atomic read/write register (aka atomic storage) problem. Atomic means that the system provides linearizability, a strong type of single-copy consistency that guarantees that a read returns the most recent version of data. The specifications of this problem are as follows. (The first is the safety property, the second one liveness.)
Consistency: The system provides its clients with a single register (emulated by multiple storage nodes), and each client can read from or write to that register.
Availability: Each request for read or write eventually receives a response.

The FLP (Fischer-Lynch-Paterson) and the attacking generals impossibility results consider the consensus problem. The specifications for consensus are as follows. (The first two are safety properties, the last one a liveness property.)
Agreement: No two processes can commit different decisions.
Validity (Non-triviality): If all initial values are the same, nodes must commit that value.
Termination: Nodes commit eventually.
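
To make the three consensus properties concrete, here is a minimal Python sketch that checks them against the outcome of a finite run. The function names and the convention that None means "has not committed yet" are my own assumptions for illustration; they are not taken from the paper.

```python
# Hypothetical checkers for the consensus properties, evaluated on the
# outcome of a finite run. decisions[i] is the value committed by node i,
# or None if node i has not committed (an assumed convention).

def agreement(decisions):
    """Safety: no two nodes commit different values."""
    committed = {d for d in decisions if d is not None}
    return len(committed) <= 1

def validity(initial_values, decisions):
    """Safety: if all initial values are the same, only that value is committed."""
    if len(set(initial_values)) != 1:
        return True  # the property only constrains the unanimous case
    return all(d is None or d == initial_values[0] for d in decisions)

def termination(decisions):
    """Liveness, approximated on a finite run: every node has committed."""
    return all(d is not None for d in decisions)
```

A run in which no node ever commits, such as decisions = [None, None, None], passes agreement and validity but fails termination. That is the "you can satisfy safety by doing nothing" observation in code form.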

So here is the difference between consensus and atomic storage. Consensus is supposed to dutifully remember a value that is anchored (stored by a majority of nodes). Consensus is loyal to making that value persist as the committed decision. Atomic storage doesn't have that responsibility. The nodes don't need to commit to a decision value, so the system doesn't need to keep track of and remember whether a value is anchored. The atomic storage system as a whole accepts new writes as long as the reads don't return results that betray the single register (i.e., single-copy) abstraction.

And what is the implication of this difference? The FLP result declares that even under the reliable channels assumption, consensus is impossible to solve in an asynchronous system with node crash failures. For example, Paxos loses liveness because it cannot converge to a single leader in an asynchronous model. Did the current leader crash? The failure detector cannot be accurate. If the failure detector incorrectly says that the leader (who is supposed to ensure and remember that a value is anchored) has not crashed, liveness is violated since nodes keep waiting on a failed leader. If the failure detector incorrectly says that the leader has crashed, then you have multiple leaders, and liveness is violated because the leaders duel with forever-escalating ballot numbers to get the majority to accept their proposals.

On the other hand, since the atomic storage problem doesn't care about remembering whether a value is anchored, it is oblivious to dueling clients, and as such it is solvable despite crashes of fewer than half of the nodes in the FLP model (i.e., with reliable channels in an asynchronous system). I had blogged about the Attiya, Bar-Noy, Dolev (ABD) algorithm that achieves this feat.
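
The quorum idea behind ABD can be sketched in a few lines of Python. This is a simplified, single-process simulation under my own assumptions: a single writer, integer timestamps, every live replica answering at once, and an exception standing in for "the operation would block". The names Replica, query, store, write, and read are hypothetical and do not come from the ABD paper.

```python
from dataclasses import dataclass

@dataclass
class Replica:
    ts: int = 0            # timestamp of the latest value this replica stores
    value: object = None
    crashed: bool = False

def majority(n):
    return n // 2 + 1

def query(replicas):
    """Phase 1: collect (ts, value) pairs from the live replicas."""
    replies = [(r.ts, r.value) for r in replicas if not r.crashed]
    if len(replies) < majority(len(replicas)):
        raise RuntimeError("no majority reachable; the operation would block")
    return replies

def store(replicas, ts, value):
    """Phase 2: push (ts, value) to the live replicas and count acks."""
    acks = 0
    for r in replicas:
        if r.crashed:
            continue
        if ts > r.ts:      # keep only the newest value
            r.ts, r.value = ts, value
        acks += 1
    if acks < majority(len(replicas)):
        raise RuntimeError("no majority reachable; the operation would block")

def write(replicas, value):
    ts = max(t for t, _ in query(replicas)) + 1
    store(replicas, ts, value)

def read(replicas):
    ts, value = max(query(replicas), key=lambda pair: pair[0])
    store(replicas, ts, value)   # write-back so later reads cannot see older data
    return value
```

With five replicas, a write followed by two crashes still lets read return the written value, because any two majorities intersect in at least one replica. A third crash leaves no majority, which is where the fewer-than-half bound comes from.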

Now that we know the atomic storage problem is solvable with reliable channels when only a minority of nodes crash, what can we say about atomic storage in the presence of unreliable channels? That is covered by the CAP theorem's fault model, which we discuss next.

The CAP fault model 

We discussed the specifications of the problems considered by CAP, FLP, and attacking generals, but we omitted to talk about another important part of the system specification, the unreliability/fault model.

Above I had introduced the FLP fault model when discussing solvability of consensus versus atomic storage in the FLP model. The FLP fault model assumes reliable channels, an asynchronous system, and crash failures. Of course, by assuming reliable channels, you don't get reliable channels in your deployment. That is just wishful thinking. But since the attacking generals impossibility result proved that consensus is not achievable in the presence of unreliable channels, FLP had to consider reliable channels. Even then, we have disappointment; consensus is also impossible in the FLP model.

CAP does something courageous and considers unreliable channels again (as in the attacking generals fault model) in its fault model. Since CAP is concerned with the atomic storage problem, which is a slightly easier problem than consensus, the attacking generals impossibility result does not subsume the CAP result.

The CAP result says that the atomic storage problem is also impossible to solve under unreliable channels.

Recall that ABD solved the atomic storage problem in the FLP model. If we move to the CAP fault model and allow partitions, we observe from the ABD algorithm that it blocks (loses availability) for a read or write request that arrives at a node in a minority partition. Just as CAP says, either consistency or availability has to give.
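
The blocking behavior follows directly from the majority requirement. Here is a small Python sketch, assuming a partition is described as a list of disjoint sets of node ids; the helper names are hypothetical and chosen only for illustration.

```python
def reachable(node, partitions):
    """Return the set of nodes on the same side of the partition as `node`."""
    for group in partitions:
        if node in group:
            return group
    raise ValueError(f"node {node} is not in any partition")

def request_completes(node, partitions, n):
    """A quorum-based request completes only if a majority of all n nodes is reachable."""
    return len(reachable(node, partitions)) > n // 2

# Five nodes split 3/2: requests arriving on the majority side can complete,
# requests arriving on the minority side wait for acks that never come.
split = [{0, 1, 2}, {3, 4}]
majority_side_ok = request_completes(0, split, n=5)
minority_side_ok = request_completes(4, split, n=5)
```

Note that with an even split, such as four nodes divided 2/2, neither side has a majority, so every request blocks.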

Here is the proof sketch verbatim from the Gilbert-Lynch paper.

Similar to the attacking generals result, the CAP result is oblivious to whether the system is synchronous or asynchronous, and holds in both cases.

What is remaining?

Observe from the CAP proof sketch that the CAP fault model is very coarse. When it says unreliable channels, it allows you to assume the worst case (i.e., no message makes it through at all), and to prove the impossibility result for that worst case.
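
The worst-case scenario can be played out as a toy Python example. This is my own rendering of the standard two-node argument, not the paper's text: nodes A and B both start with v0, a client writes v1 at A, every message from A to B is lost, and another client then reads at B. The policy names "available" and "consistent" are assumptions for illustration.

```python
def read_at_b_during_partition(policy):
    """Return what a client reading at node B sees, or None if B never answers."""
    a = {"value": "v0"}
    b = {"value": "v0"}

    # A client writes v1 at node A; A applies the write and acknowledges it.
    a["value"] = "v1"

    # Worst-case channel: the update from A to B is lost, so B never learns of v1.
    message_delivered = False
    if message_delivered:
        b["value"] = a["value"]

    if policy == "available":
        return b["value"]   # B answers from its local state
    if policy == "consistent":
        return None         # B withholds its answer rather than risk a stale read
    raise ValueError(f"unknown policy: {policy}")

def is_linearizable_read(response):
    """After the write of v1 has completed, an atomic register must return v1."""
    return response == "v1"
```

Under the "available" policy, B returns v0, which is not a linearizable response. Under the "consistent" policy, B returns nothing, which gives up availability. No third policy helps, because B has no information that distinguishes this run from one in which the write never happened.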

What if we quantify and limit the unreliability of the channels to more realistic scenarios? Can we prove more refined versions of CAP? What consistency level can a system provide if the system model allows eventual message arrival? A recent technical report from the University of Texas at Austin, "Consistency, Availability, and Convergence", looks at that problem. We will discuss that paper next in our distributed systems seminar.

More about CAP tradeoffs

The Gilbert-Lynch paper discusses some of the practical implications of the CAP theorem and says that Consistency versus Availability should not be seen as an absolute and binary tradeoff. Instead you can consider shades of Consistency versus Availability. Also you can make different Consistency versus Availability tradeoffs at the data level, operation level, and subsystem level. These observations are very similar to the suggestions made in Eric Brewer's article in the same special issue: "CAP 12 years later, how the rules have changed".

The Gilbert-Lynch paper also mentions the scalability problems caused by trying to enforce consistency, but leaves that discussion as future work. The PACELC model by Daniel Abadi provides a more detailed explanation of the latency versus consistency tradeoffs in the absence of partitions.
