Paper review. Asynchronous consensus without rounds

This paper by Robbert van Renesse appeared on Arxiv two weeks ago. (Update: Huh, I missed this earlier, but the paper has a footnote that says it was written in 2010.) The paper looks very interesting. I only got to skim the paper, but I will give this a careful read later.

All published crash and Byzantine fault tolerant asynchronous consensus protocols use rounds. (Yes, indeed... Paxos, Viewstamped Replication, even Nakamoto consensus, and Avalanche protocol all use rounds.) Rounds are such an inherent part of consensus algorithms that it is tempting to conclude that solving fault tolerant consensus requires ordered rounds. This paper shows that such a conclusion would be wrong by showing an asynchronous consensus protocol that does not use rounds.

The protocol is named after Texel, an island of Netherlands. Presumably this is because 1) Robbert is Dutch, and 2) he wants to name an alternative island to Paxos island in a sea farther away from the Ionian sea. Texel provides binary consensus (can only decide between 0 and 1 as input votes) without rounds. Nodes query/poll other nodes to make up their minds. If 2/3rd vote is same, the value is anchored. Texel is reminiscent of Avalanche in that it is a decentralized binary consensus algorithm that works by nodes polling other nodes. However, in contrast to Avalanche which uses rounds and is probabilistic, Texel does not use rounds and it is deterministic. Instead of rounds, nodes use querying of consistent cuts and change their vote. The consistent cuts are identified by using vector clocks in the Texel algorithm.

Texel is not a very efficient algorithm, because each node makes up its own mind in a decentralized manner. In contrast, by having leaders coordinate other nodes for which proposal to vote on for a given round, Paxos achieved efficiency and economy.

What is more, in Texel, processes are not allowed to respond to queries if they are experimenting themselves. How is this supposed to terminate/decide? In this respect, Texel again resembles the Avalanche algorithm. If two conflicting proposals exist, consensus is not guaranteed to terminate for Avalanche. The same thing holds for Texel as well.

Texel requires 3f+1 processes, where f is the number of processes that can crash. Byzantine fault-tolerant Texel requires 5f+1 nodes. (I need to check how Texel deals with vector clock integrity in the presence of Byzantine nodes. Maybe it is done through signing entries by corresponding processes, because vector clock entries are only maxed, not updated by processes other than the originators.)

Conclusion

Ok, so what do we have here? Texel is not very efficient. It may not terminate. It uses more processes to tolerate f faulty processes. This all makes me think, rounds are a  great abstraction for distributed systems. Ain't nothing wrong with rounds. They are implementable via ballotnumbers as in PAxos and you are done. The paper also doesn't claim that there is something  wrong with rounds, and neither does it claim that solving consensus without rounds brings any advantages.

On the other hand, this is still a super exciting paper, because Texel proves that distributed fault-tolerant consensus is possible without rounds. Texel breaks new ground! It may be possible to have more useful instances of no-round consensus algorithms in the future. The Texel protocol is derived using stepwise refinement. (Robbert had also used stepwise refinement in his work on chain replication. It is a technique that keeps on giving.) Starting from a high-level specificition of Consensus, an intermediate level specification  called ProtoConsensus with no-rounds is shown to refine the Consensus specification, and Texel is shown to refine ProtoConsensus. It may be possible to search for alternative implementations refining ProtoConsensus.

I am happy that new territory is being explored for decentralized consensus for the last couple years. It is exciting times for distributed systems and algorithms.

MAD questions

1. How could we extend this to multi-decree consensus?
Texel is a single-instance consensus algorithm? Is it possible to extend Texel to multiple-instance consensus in an efficient way and implement state machine replication using it? Is it possible to do linearizable reads from that multi-instance algorithm? Given that there is no leader, and commit time of an update operation is murky, this will be tricky.

2. Is it possible to use HLC instead of VC for finding consistent cuts?
I suppose it may be possible to use HLC for the non-byzantine version of Texel and benefit from loosely synchronized clocks, but I don't know if there would be a big practical gain.

3. How do we implement this protocol in TLA+?
In PlusCal implementing Texel won't be very hard. Modeling Texel in PlusCal may provide value because it will let you test different invariants and temporal properties and exploring variations on the protocol. If I can get the scaffold for this in place, I may even assign this as course project this year.

Comments

Popular posts from this blog

The end of a myth: Distributed transactions can scale

Hints for Distributed Systems Design

Foundational distributed systems papers

Learning about distributed systems: where to start?

Metastable failures in the wild

Scalable OLTP in the Cloud: What’s the BIG DEAL?

The demise of coding is greatly exaggerated

SIGMOD panel: Future of Database System Architectures

Dude, where's my Emacs?

There is plenty of room at the bottom