Elle: Inferring Isolation Anomalies from Experimental Observations

- April 17, 2020

This paper is by Kyle Kingsbury (of Jepsen fame) and Peter Alvaro (of beach wandering fame) and is available on Arxiv.

Adya et.al. (2000) showed that transaction isolation anomalies can be defined in terms of cycles over a Direct Serialization Graph (DSG) that captures the dependencies between transactions. Unfortunately, it was hard to utilize this DGS technique for isolation anomaly checking in practice because many database systems do not have any concept of a version order, or they do not expose that ordering information to clients. This paper shows that it is possible to use an encoding trick on the client side to emulate/maintain that ordering information and ensure that the results of database reads reveal information about their version history. The solution they find is the list data structure, which is supported by many databases. The paper also shows that lighter weight data structures, such as sets, can also be useful for checking violations of weaker isolation properties.

Building on these, the paper presents Elle: a novel checker which infers a DSG using client-observed transactions. Elle not only detect anomalies, it can also discriminate between them, and provide concise explanations of each. The paper gives evidence of its effectiveness via a case study of four real databases using Elle.

The paper is valuable because it builds a bridge between research on dependency graphs and emerging techniques for black-box database testing. On the impact side, I believe Elle the checker will have enormous real work impact as part of the Jepsen framework as it extends checking to multi-object transactions (the serializability branch in the tree). Elle is released as an opensource project (https://github.com/jepsen-io/elle).

In my summary below, I use paragraphs and figures lifted from the arxiv version. I strongly recommend you read the paper, and give Elle a try.

Motivation

Many databases do not provide the isolation guarantees they claim. Checkers help detect violation of isolation guarantees. By generating client workloads and injecting faults, checkers produce anomalies that witness a violation of a stated guarantee.

Many checkers use a particular pattern of transactions for checking. These also have several drawbacks. They find a small number of anomalies in a specific pattern of transactions, and tell us nothing about the behavior of other patterns. They require hand-proven invariants, and each property may require a separate test.

More general checkers exist. But their use is limited by the NP-complete nature of linearizability checking, and the combinatorial explosion of states in a concurrent multi-register system. Serializability checking is also (in general) NP-complete and unlike linearizability, one cannot use real-time constraints to reduce the search space.

Instead of solving for a transaction order, Elle uses its knowledge of the transactions issued by the client, the objects written, and the values returned by reads to reason about the possible dependency graphs of the database in the language of Adya's Direct Serialization Graph (DSG) formalism.

DSG is a graph over transactions in some history H, whose edges are given by these dependencies. It provides a simple way to represent isolation violation anomalies and we can check these properties in linear time: intermediate and aborted reads are straightforward to detect, and once we constructed the dependency graph, cycle detection is solvable in O(vertices + edges) time, thanks to Tarjan's algorithm for strongly connected components.

However, there is one significant obstacle to working with an Adya history: we don’t have it. Many database systems do not have any concept of a version order, or they do not expose that ordering information to clients. This paper shows that it is possible to use an encoding trick on the client side to emulate/maintain that ordering information and ensure that the results of database reads reveal information about their version history. The solution they find is the list data structure, which is supported by many databases.

Deducing dependencies

We can infer properties of every history compatible with a given client observation, by taking advantage of datatypes which allow us to trace the sequence of versions which gave rise to the current version of the object. If every value written to a given register is unique, then we can recover the transaction which gave rise to any observed version. We call this property recoverability: every version we observe can be mapped to a specific write.

Recoverability allows us to infer read dependencies. But blind writes to a register "destroy history". To circumvent this we consider richer datatypes, whose writes do preserve some information about previous versions.

For instance, we could take increment operations on counters, starting at 0. The problem here is we can't tell which increment produced a particular version. This keeps us from inferring write-write, write-read, and read-write dependencies

What if we let our objects be sets of elements, and had each write add a unique element to a given set? While we can recover some (but not all) write-write, write-read, and read-write dependencies, we cannot identify all write-write dependencies due to lack of ordering.

But we can add order to our values by letting each version be a list, to which a write appends a unique value. As with counters and sets, we can use traceability to reconstruct read-read, write-read, and read-write dependencies— but because we have the full version history, we can precisely identify read-write and write-write dependencies for every transaction whose writes were observed by some read.

While Elle can make limited inferences from read-write registers, it shines with richer datatypes, like append-only lists. The paper also shows that lighterweight data structures, such as sets, can also be useful for checking violations of weaker isolation properties.

Implementation and evaluation

Elle is straightforward to run against real-world databases. Most transactional databases offer some kind of list with append. The SQL standard’s CONCAT function and the TEXT datatype are a natural choice for encoding lists, e.g. as comma-separated strings. Some SQL databases, like Postgres, offer JSON collection types. Document stores typically offer native support for ordered collections.

The authors implemented Elle as a checker in the opensource distributed systems testing framework Jepsen and applied it to four distributed systems, including SQL, document, and graph databases, namely TiDB, YugaByte DB, FaunaDB, and Dgraph. Elle revealed anomalies in each of them.

Elle's performance on real-world workloads was excellent; where Knossos (Jepsen’s main linearizability checker) often timed out or ran out of memory after a few hundred transactions, Elle did not exhibit Knossos’ exponential runtimes and was able to check histories of hundreds of thousands of transactions in tens of seconds.

Another nice thing about Elle is that it is informative. It provides a human-readable explanation of why each witness must be an instance of the claimed anomaly.