Mergeable Replicated Data Types

This paper is by Gowtham Kaki, Swarn Priya, KC Sivaramakrishnan, and Suresh Jagannathan, and it appeared in OOPSLA 19.

We all know and love CRDTs. Using CRDTs, any replica can accept updates without remote synchronization. The updates may reach to replicas and applied in each in different order, but provided that the updates commute each replica converges to the same state eventually. CRDTs provide out-of-the-box eventual consistency to replicated data types with commutative operations, examples include set data type with member-addition/removal and counter data type with addition/subtraction.

Unfortunately CRDTs are limited to commutative operations. If we also wanted to have multiplication for the counter data structure, since that won't commute with addition and subtraction, we would be in trouble. MRDTs extend CRDT idea to allow non-commutative operations for certain data structures.

The idea in MRDTs is to use a 3-way merge formula. If you have an extra bit of information, the least common ancestor (LCA), you can merge the two replica states v1 and v2 leveraging the LCA as the starting point context. The paper notes that this is similar to how Git operates when merging branches.

Let's consider this idea for the counter data structure (with both addition and multiplication operations). Consider two replicas starting from LCA l=5. One applied multiplication by 2 and ended up V1=10. The other applied addition with -1 and ended up with V2=4. The three way merge suggested for v1, v2, and l is as follows: merge= L + (V1-L) + (V2-L). This is applied in each replica unilaterally, without coordination, and each replica ends up with value 9 (i.e., 5 + 5 -1).

Note that this mergeable counter does not guarantee linearizability (for instance, if the concurrent operations in Fig. 2 are mult 2 and mult 3, then the merge result would be 25 and not 30). Nonetheless, it guarantees convergence, with a somewhat meaningful semantics.

Let's consider the three-way merge idea for sets. Given two replicas {1,2} and {2,3,4}, without additional context, we would say that the merge would be {1,2,3,4}. But if we knew that the LCA for these two replicas was {1,2,3}, we would be able to infer, there was a removal of 3 for the first replica/branch, and removal of 1 and addition of 4 on the second replica/branch, and thus the merge result would become {2,4}. This is indeed the case for applying the 3-way merge formula adopted for the set data structure:
merge = $(L \cap S1 \cap S2)  \cup (S1 \setminus L) \cup (S2 \setminus L) $

Invertible relational specifications

The neat thing in the paper is that this same 3-way merge idea and semantics  generalizes to the lists, queues, priority queues, binary trees, etc. The paper presents a general instantiable merging framework with the use of abstraction/concretization relation. The abstraction idea is useful because it is easier to define this mergeable property at the abstract level. Moreover, since several data structures share the same abstract data-type, the approach becomes simple to use.

A data structure can be uniquely characterized by the contents it holds, and the structural relationships defined among them. The paper identifies a member relationship and occurred before relationship and show that these two relationships are enough to model many data-structures at the concrete level in the abstract. And, best of all, the general formula is the same as the formula for the counter data structure, but with the member and occurred-before relationships changed to fit the data structure.

The set data structure discussed above is considered as the base abstract data structure all the other data structures can be mapped to. The paper shows that many data structures can be mapped losslessly to the rich domain of relations over sets, wherein relations so merged can be mapped back to the concrete domain to yield consistent and useful definitions of these aforementioned concepts. For these data structures, the merge semantics for arbitrary data types can be automatically derived, given a pair of abstraction ($\alpha$) and concretization ($\gamma$) functions for each data type that map the values of that type to and from the relational domain (the pair ($\alpha,\gamma$) is an *invertible relational specification* of the type).

The paper provides an OCaml library that lets you derive MRDTs from ordinary OCaml data types. The @@deriving merge annotation tells the compiler to automatically derive the merge function.

The video presentation of the paper by Gowtham is easy to follow, and explains  the motivation and the basic idea clearly.


MRDTs also have several limitations. They don't apply to arbitrary complex data structures or objects. To merge concurrent versions, the 3-way merge formula ignores the operations and instead focus on the difference between each version and the LCA.  This is why this idea may not work well for operations that require a precondition to execute. The precondition may be satisfied in one replica and not satisfied in the other replica.

The paper appeared in a programming languages conference, and several parts in the paper are hard to follow for people outside that domain. This paper has good applicability to distributed systems and especially distributed databases. A nice followup to this paper could be to implement this in a NoSQL database, such as  MongoDB or Cassandra, or even Voldemort (for simplicity). Without a Git based infrastructure, how feasible is it to implement MRDTs in these data stores?

Another interesting thing would to be show applications of MRDT idea in the context of coordination avoidance in database systems.


Popular posts from this blog

Graviton2 and Graviton3

Foundational distributed systems papers

Learning a technical subject

Your attitude determines your success

Learning about distributed systems: where to start?

Progress beats perfect

CockroachDB: The Resilient Geo-Distributed SQL Database

Warp: Lightweight Multi-Key Transactions for Key-Value Stores

Amazon Aurora: Design Considerations + On Avoiding Distributed Consensus for I/Os, Commits, and Membership Changes

Anna: A Key-Value Store For Any Scale