The Stellar Consensus Protocol: A Federated Model for Internet-level Consensus

Last week in our seminar we discussed the Stellar consensus paper.

The paper is long, 32 pages. It looks like the paper is written the way the protocol is conceived and reasoned about. First comes a section on the federated Byzantine agreement (FBA) model, which talks about quorum slices and the quorums that result from them. The next section talks about the requirements for quorum intersections, and defines the dispensable sets with associated theorems. Then comes the federated voting section, with subsections titled:  Voting with open membership, Blocking sets, Accepting statements, Accepting is not enough, Statement confirmation, and Liveness and neutralization. On page 19, the Stellar Consensus Protocol (SCP) section starts, with more definitions and the proofs intertwined with the protocol description. At this point, the reader is already overwhelmed, trying to maintain in his mind theories about how the previous 19 pages might connect back to this protocol, and is confronted with the task of reading through a 9 pages long SCP protocol section.

The paper would be significantly improved if it was rewritten top-down: not in a way the protocol is conceived and proved, but in a reader-friendly manner prioritizing the clear communication of the protocol basics.

It was hard reading through this paper, and I didn't read it thoroughly. Here is what I understand:
Stellar Consensus Protocol is PBFT which uses quorums derived from quorum slices of federated participants, instead of traditional quorums from a closed set of participants.
It would be nice if SCP provided a mechanism that prevents participants from selecting bad quorum slices that lead to bad quorums.

Background and context

Traditional Byzantine Agreement protocols have closed membership: the number of participating nodes and their identities (via public private keys or via non-transferable symmetric-key MACs) are fixed.

SCP is a Federation-based Byzantine Agreement (FBA) protocol which allows open membership instead of requiring closed membership. In a federated open membership model, we don't know the number of nodes in the entire system or all of their identities. Nodes may join and leave.

One way to deal with open membership is Proof-of-Work based blockchains as in BitCoin. That has problems with excessive energy consumption, inscalability of throughput, and due to probabilistic nature of the commit long wait times to have a good guarantee of irreversibility of a transaction.

SCP does not use proof-of-work based blockchains. It adapts the PBFT protocol to work in an open membership federated environment. PBFT is a 3-phase deterministic byzantine consensus protocol. It has similarities with Paxos: in fact if you extend Paxos which just tolerates crash faults to tolerate byzantine faults, you get pretty much the PBFT protocol.

The federated model

The federated model means that a node can have a PBFT/consensus agreement with a set of nodes it specifies, without involving all the nodes in the network in this agreement.

To this end, each node specifies a quorum slice in its config file. The  quorum slice consists of the nodes it trusts, and hopefully be a diverse well-balanced portfolio. By declaring its quorum slice, this node says that it finds the consortium of these nodes (not necessarily individually each one) trustworthy, and will rely on this consortium to convince itself of the agreement and will rely on them to bless/endorse its transactions. Traditional non-federated Byzantine agreement requires all nodes to accept the same slices, in FBA the key innovation is enabling each node to chose its own quorum slice set.

These quorum slices are used for constructing quorums. A quorum is a set of nodes sufficient to reach agreement.

For safety, any two quorums in the network need to intersect, and the intersection should contain nonByzantine nodes. If the quorum intersection consists entirely of Byzantine nodes, then SCP cannot guarantee safety.

The onus is on the users to provide good quorum slices. SCP does not provide a way to check the soundness/integrity of quorum slices which give rise to quorums. Again if the quorum intersection consists entirely of Byzantine nodes, safety is violated and SCP doesn't accept responsibility of that. To quote the paper: "SCP can only guarantee safety when nodes choose adequate quorum slices."

The SCP protocol starts with a nomination phase, which if run long enough, eventually produces the same set of candidate values at every intact node, which means nodes can combine the candidate values in a deterministic way to produce a single composite value for the slot. Upon predicted/approximated convergence of nomination phase, the nodes start the ballot phase to perform federated voting (PBFT) to commit and abort ballots associated with composite values.

When intact nodes agree to commit a ballot, the value associated with the ballot will be externalized for the slot in question. When they agree to abort a ballot, the ballot's value becomes irrelevant. If a ballot gets stuck in a state where one or more intact nodes cannot commit or abort it, then nodes try again with a higher ballot; they associate the new ballot with the same value as the stuck one in case any node believes the stuck ballot was committed.

Safety results from ensuring that all stuck and committed ballots are associated with the same value. Liveness follows from the fact that a stuck ballot can be neutralized by moving to a higher ballot. The good news about SCP is that provided that the quorum condition is satisfied, any committed decision is instantly irreversible.

Here are some videos on SCP (well mostly the motivation and setup of Federated Byzantine Agreement without the SCP protocol description): and

MAD questions

1. What are the scalability limits of SCP? 
That the quorums need to intersect is a limitation. If the quorum selections are not done carefully, you may need majority quorums for the system, and PBFT based protocol would suffer after 20 nodes in quorum and 40 nodes in the system. But there are better ways to select your quorum slices: Instead of a flat system if you use a hierarchical system, with tier 1 nodes, tier 2 nodes, tier 3 nodes, and choose your quorums through this hierarchy you can satisfy the quorum property with about log(N) nodes in contrast to N/2 nodes. Hierarchies actually work pretty well for scaling. Think of a tree with 10 children per node, at level 4 there will 10,000 nodes, and level 5 100,000 nodes.

2. There has been a lot of work on quorum systems and probabilistic quorum systems. Can those be employed to help with the scalability problem in SCP? 

Maybe even some graph algorithms can be relevant, like the "Sparse Partitions" work by Baruch Awerbuch and David Peleg. 

3. Is it possible to come up with a service for SCP that provides checks and prevents nodes from selecting bad quorum slices that lead to bad quorums? 
But why would you tryst that service, that service itself should be built in a trustless way.

4. How can we implement sharding per services in SCP?
In the SCP model described in the paper all transactions are related and potentially interfering/dependent on each other since there is no sharding per services considered. How can we implement sharding support for SCP that provides parallelism inside the services but also allows occasional cross service transactions and prevents double spending. Would it be possible to build something similar to the Aspen model for SCP?


Popular posts from this blog

The end of a myth: Distributed transactions can scale

Foundational distributed systems papers

Strict-serializability, but at what cost, for what purpose?

Learning about distributed systems: where to start?

Speedy Transactions in Multicore In-Memory Databases

The Seattle Report on Database Research (2022)

Checking statistical properties of protocols using TLA+

Anna: A Key-Value Store For Any Scale

Amazon Aurora: Design Considerations + On Avoiding Distributed Consensus for I/Os, Commits, and Membership Changes

SQLite: Past, Present, and Future