Ramblings on serializability

Here is a very raw/immature idea I have been toying with recently. I decided to write it down and share in order to get feedback about whether this is worth exploring further.


Serializability of reads and writes (sequential consistency) is an important feature in distributed systems. A significant application of serializability is in transaction processing in distributed databases (e.g., Spanner).

Paxos serialization versus serializability

Paxos is a method for fault-tolerant state machine replication. In order to achieve fault-tolerant state machine replication, Paxos employs consensus to serialize operations at a leader and apply the operations at each replica in this exact serialized order (dictated by the leader).

While serialization is not an end in Paxos but rather a means, Paxos is seen nowadays as a defacto method for serialization rather than a method for fault-tolerant state machine replication. Maybe the ZooKeeper implementation of a Paxos-based lock-service help promote this confusion and blur the lines between Paxos's true goal (fault-tolerant replication) with its side effect (serialization).

In fact Paxos serialization is overkill, it is too strong. Paxos will serialize operations in a total order, which is not necessarily needed for sequential consistency. Today in many applications where knowing the total order and replicated-logging of that order is not important, Paxos is still (ab)used.

Serialization is not and should not be tightly-related to a replication solution.

OK, so what? (What is wrong with using a drug for its side effect?)

This confusion may permeate the misconception that serialization is inherently a reactive process. Many people have now been thinking about serialization from the Paxos perspective and as an inherently on-demand process.

(Probably there are other problems that arise due to this confusion, but for now I am only picking on this one.)

Proactive serialization

Serialization can actually be proactive and anticipatory. The workers does not need to make a serialization request to a Paxos leader. By getting rid of such a request (albeit in a fraction of the cases), latency can be reduced and throughput can be improved.

By adopting the proactive serialization philosophy, a lock service can provide the locks to some workers proactively so those workers can go ahead. Embracing this philosophy, a lock-service master can anticipate a future request and provide some locks ahead of time. The benefit of this is to eliminate the need for an explicit request (again in a fraction of the cases) and improve throughput and reduce latency.

A couple paragraphs above I said that "Today in many applications where knowing the total order and replicated-logging of that order is not important, Paxos is still (ab)used." Picking up on that thread I will claim further (without much evidence for now) that in those cases, proactive serialization could do a better job.

What are some ways to achieve proactive serialization?

How can proactive serialization be achieved? One approach is to employ machine learning. The lock service learns the patterns of requests for locks over time and then starts distributing these locks speculatively before they are requested.

Employing presignaling can be another approach to achieve proactive serialization. The nodes can anticipate which locks they will be needing in the next round or minute and ask for those locks in bulk speculatively. The lock service can choose to honor some of these requests and achieve proactive serialization to improve performance.

Another approach to achieve proactive serialization can be to do a static analysis of workers' programs/actions and come up with patterns/sequence of accesses. When the lock service observes one of the detected patterns, it can give the locks in advance to the respective workers that are mentioned in the rest of the pattern.

Did I pick on Paxos unfairly?

Actually, none of these proactive approaches are ruled out in Paxos. The Paxos leader itself could be making these proactive/speculative lock giving requests in advance. So, did I really need to pick on Paxos? Maybe not.

However, Paxos serialization in total order is still overkill and can cause problems in a Paxos-based solution. Google Spanner used a smart solution of assigning separate conflict domains (at the tablet level) to Paxos groups. This way it was able to provide locks in parallel across separate conflict domains.

If we do not tightly-couple serialization with a Paxos-based replication service, we can find more efficient and new solutions for serialization.


Murat said…
In response to questions posed here,
I posted the following comment.

The lock service can be implemented distributedly and does not necessarily have a master.

Sorry, I should have been more clear. What I had in mind was to give out locks on a lease. And even then the lock service can revoke the locks by negotiating with the node; If the node that a lock is leased to is not using it, and another node requests, lock service can request it back to give to the other node.

I am hoping that it could be possible to implement this proactive leasing part as "hints" as defined in the Hints for computer system design.


Use hints to speed up normal execution: A hint, like a cache entry, is the saved result of some computation. It is different in two ways: it may be wrong, and it is not necessarily reached by an associative lookup. Because a hint may be wrong, there must be a way to check its correctness before taking any unrecoverable action. It is checked against the 'truth', information that must be correct but can be optimized for this purpose rather than for efficient execution. Like a cache entry, the purpose of a hint is to make the system run faster. Usually this means that it must be correct nearly all the time. Hints are very relevant for speculative execution, and you can find several examples of hints and speculative execution in computer systems. One example of a hint is in the Ethernet, in which lack of a carrier signal on the cable is used as a hint that a packet can be sent. If two senders take the hint simultaneously, there is a collision that both can detect; both stop, delay for a randomly chosen interval, and then try again.

Popular posts from this blog

Graviton2 and Graviton3

Foundational distributed systems papers

Learning a technical subject

Learning about distributed systems: where to start?

Strict-serializability, but at what cost, for what purpose?

CockroachDB: The Resilient Geo-Distributed SQL Database

Amazon Aurora: Design Considerations + On Avoiding Distributed Consensus for I/Os, Commits, and Membership Changes

Warp: Lightweight Multi-Key Transactions for Key-Value Stores

Anna: A Key-Value Store For Any Scale

Your attitude determines your success