SerializabilitySerializability of reads and writes (sequential consistency) is an important feature in distributed systems. A significant application of serializability is in transaction processing in distributed databases (e.g., Spanner).
Paxos serialization versus serializabilityPaxos is a method for fault-tolerant state machine replication. In order to achieve fault-tolerant state machine replication, Paxos employs consensus to serialize operations at a leader and apply the operations at each replica in this exact serialized order (dictated by the leader).
While serialization is not an end in Paxos but rather a means, Paxos is seen nowadays as a defacto method for serialization rather than a method for fault-tolerant state machine replication. Maybe the ZooKeeper implementation of a Paxos-based lock-service help promote this confusion and blur the lines between Paxos's true goal (fault-tolerant replication) with its side effect (serialization).
In fact Paxos serialization is overkill, it is too strong. Paxos will serialize operations in a total order, which is not necessarily needed for sequential consistency. Today in many applications where knowing the total order and replicated-logging of that order is not important, Paxos is still (ab)used.
Serialization is not and should not be tightly-related to a replication solution.
OK, so what? (What is wrong with using a drug for its side effect?)This confusion may permeate the misconception that serialization is inherently a reactive process. Many people have now been thinking about serialization from the Paxos perspective and as an inherently on-demand process.
(Probably there are other problems that arise due to this confusion, but for now I am only picking on this one.)
Proactive serializationSerialization can actually be proactive and anticipatory. The workers does not need to make a serialization request to a Paxos leader. By getting rid of such a request (albeit in a fraction of the cases), latency can be reduced and throughput can be improved.
By adopting the proactive serialization philosophy, a lock service can provide the locks to some workers proactively so those workers can go ahead. Embracing this philosophy, a lock-service
A couple paragraphs above I said that "Today in many applications where knowing the total order and replicated-logging of that order is not important, Paxos is still (ab)used." Picking up on that thread I will claim further (without much evidence for now) that in those cases, proactive serialization could do a better job.
What are some ways to achieve proactive serialization?How can proactive serialization be achieved? One approach is to employ machine learning. The lock service learns the patterns of requests for locks over time and then starts distributing these locks speculatively before they are requested.
Employing presignaling can be another approach to achieve proactive serialization. The nodes can anticipate which locks they will be needing in the next round or minute and ask for those locks in bulk speculatively. The lock service can choose to honor some of these requests and achieve proactive serialization to improve performance.
Another approach to achieve proactive serialization can be to do a static analysis of workers' programs/actions and come up with patterns/sequence of accesses. When the lock service observes one of the detected patterns, it can give the locks in advance to the respective workers that are mentioned in the rest of the pattern.
Did I pick on Paxos unfairly?Actually, none of these proactive approaches are ruled out in Paxos. The Paxos leader itself could be making these proactive/speculative lock giving requests in advance. So, did I really need to pick on Paxos? Maybe not.
However, Paxos serialization in total order is still overkill and can cause problems in a Paxos-based solution. Google Spanner used a smart solution of assigning separate conflict domains (at the tablet level) to Paxos groups. This way it was able to provide locks in parallel across separate conflict domains.
If we do not tightly-couple serialization with a Paxos-based replication service, we can find more efficient and new solutions for serialization.