A read-only transaction anomaly under snapshot isolation

This paper, from Sigmod 2004, is short and sweet. Under snapshot isolation level, it shows a surprising example of a transaction history where the read-only transaction triggers a serialization anomaly, even when the update transactions are serializable.

This is surprising because it was assumed that, under snapshot isolation, read-only transactions always execute serializably without ever needing to wait or abort because of concurrent update transactions.

Background

Snapshot isolation is an attractive consistency model for transactions.  Wikipedia has a very nice summary:

In databases, and transaction processing (transaction management), snapshot isolation is a guarantee that all reads made in a transaction will see a consistent snapshot of the database (in practice it reads the last committed values that existed at the time it started), and the transaction itself will successfully commit only if no updates it has made conflict with any concurrent updates made since that snapshot.

Snapshot isolation has been adopted by several major database management systems, such as InterBase, Firebird, Oracle, MySQL, PostgreSQL, SQL Anywhere, MongoDB and Microsoft SQL Server (2005 and later). The main reason for its adoption is that it allows better performance than serializability, yet still avoids most of the concurrency anomalies that serializability avoids (but not always all). In practice snapshot isolation is implemented within multiversion concurrency control (MVCC), where generational values of each data item (versions) are maintained: MVCC is a common way to increase concurrency and performance by generating a new version of a database object each time the object is written, and allowing transactions' read operations of several last relevant versions (of each object).


More information about snapshot isolation can be found at https://jepsen.io/consistency/models/snapshot-isolation
 

But, here are the basics.

Reading from a snapshot means that a transaction never sees the partial results of other transactions: T sees all the changes made by transactions that commit before start(T), and it sees no changes made by transactions that commit after start(T).

Snapshot isolation uses the first-committer-wins rule: A transaction ti will successfully commit if and only if (iff) no concurrent transaction tk has already committed writes/updates that ti intends to write.

Informally, serializability means that transactions appear to have occurred in some total order. Snapshot isolation comes close to satisfying serializability without being too strict it allows more transactions to occur in parallel for better performance.

The first-committer-wins rule allows snapshot isolation to avoid the common type of concurrency anomalies serializability avoids, like the lost-update error as shown in example 1.


Example 1: Avoiding Lost-Updates

If transaction t1 tries to modify a data item x while a concurrent t2 also tries to modify x, then the first committer wins rule will cause one of the transactions to abort, so the first update will not be lost.  The example history H1 below shows the values read and written in a versioned notation; when ti writes a version of x, it is named xi.

H1: R1(x0,50) R2(x0,50) W2(x2,70) C2 W1(x1,60) A1


Since this is the first example, let's do a step-by-step walk-through:

  • transaction t1 does its read of x and finds x0=50 from its snapshot taken at the beginning of t1
  • t2 starts and does its read of x and find x0=50 from its snapshot as well
  • t2 does its write of x, versioned as x2 following the notation, and increments x by 20 to write x2=70
  • t2 commits
  • t1 does its write of x, incrementing it by 10, and writes its version as x1=60
  • during validation, t1 finds conflict with t2, and due to the first-committer-wins rule, t1 aborts.

This history in snapshot isolation level leaves x with the value of 70, since only t2, which was attempting to add a  increment of 20 to x, was able to complete. In ReadCommitted isolation level, H1 would cause a Lost-Update anomaly.


Example 2: write skew anomaly

Snapshot isolation does not ensure that all executed histories are serializable. It is well known that snapshot isolation is prone to the write-skew anomaly shown below.

Suppose x and y are data items with a constraint that x+y>0. Assume initially x0=70 and y0=80. Under snapshot isolation, transaction t1 reads x0 and y0, then subtracts 100 from X, assuming it is safe because sum is 150. Transaction t2 reads x0 and y0, and subtracts 100 from y, assuming it is safe from the same reason. Each update itself is safe, but the resulting history H2 is not serializable:

H2: R1(x0,70) R2(x0,70) R1(y0,80) R2(y0,80) W1(X1,-30) C1 W2(Y2,-30) C2


Here the final committed state (x1 and y2) violates the constraints x+y>0. This problem was not detected by the first-committer-wins rule because two different data items are updated, each under the assumption that the other remained stable. Hence the name "write skew".

One way to fix this is to require that each read of x and y to update y to give the impression of a write of x (this can be done using SELECT FOR UPDATE statement). Now since it seems that both x and y are updated in H2, a conflict is forced to occur and first-committer-wins rule will work. Another approach requires that each constraint on the sum of two accounts be materialized in another item z and requires that all updates of x and y must keep z up to date.


Example 3: A read only transaction anomaly in snapshot isolation

It was assumed that, under snapshot isolation, read-only transactions always execute serializably without ever needing to wait or abort because of concurrent update transactions. This seemed obvious because all reads take place at an instant of time, when all committed transactions have completed their writes and no write of non-committed transactions are visible. That would mean that read-only transactions will not read anomalous results so long as the update transactions do not write such results, right? But example 3 falsifies this.

Suppose x and y are data items denoting a checking account balance and a savings account balance, and initially x0=0 and y0=0. In history H3 below, transaction t1 deposits 20 to the savings account y, t2 subtracts 10 from the checking account x accepting an overdraft with a penalty charge of 1 if x+y goes negative. Finally, t3 is a read-only transaction that retrieves the values of x and y and prints them out. For one sequence of operations, this can result in the following history under snapshot isolation:

H3: R2(x0,0) R2(y0, 0) R1(y0,0) W1(y1,20) C1 R3(x0,0) R3(y1,20) C3 W2(x2,-11) C2

The read-only transaction t3 prints out x=0 and y=20, while the final values are y=20 and x=-11. This can't happen in any serializable execution since if 20 was added to y before 10 was subtracted from x, no charge of -1 would ever occur, and the final x+y would be equal to 10 and not 9.

Notice that we can serialize the updates (as t2 followed by t1) had it not been for the read-only transaction t3 in there. But the root cause of this anomaly is also related to the write-skew anomaly in example 2. The cause of the anomaly here is that t2 reads x0=0 and y0=0, then writes x2=-11, while t1 concurrently updates y1 to hold 20. This concurrent transaction t1 would change the behavior of t2, if t2 started after t1 committed. Although concurrent transactions don't see each other's results in snapshot isolation, the read-only transaction t3 will see the committed state of t1 and expose the out-of serial order commits in H3.

The paper does not say this, but in fact the two remedies suggested for avoiding the write-skew anomaly also fixes the anomaly in example 3. If we give more context to the update transactions (e.g., SELECT FOR UPDATE or a materialized view z to represent x+y) the anomaly is avoided.

Comments

Tobin Baker said…
The anomaly only exists because serializability per se is too weak a consistency condition to be useful (it does not respect the linearizable commit order of transactions). No read-only anomaly is possible if the update transaction history is strictly serializable. It is trivial (but not cheap) to make snapshot isolation strictly serializable by certifying the absence of read-write conflicts (technically, antidependencies in the conflict graph that violate commit order). Popular mechanisms for making snapshot isolation "serializable" such as SSI or SSN allow read-write conflicts (as long as they do not form "dangerous structures"), and hence violate strict serializability.

Popular posts from this blog

Hints for Distributed Systems Design

Learning about distributed systems: where to start?

Making database systems usable

Looming Liability Machines (LLMs)

Advice to the young

Foundational distributed systems papers

Linearizability: A Correctness Condition for Concurrent Objects

Distributed Transactions at Scale in Amazon DynamoDB

Understanding the Performance Implications of Storage-Disaggregated Databases

Designing Data Intensive Applications (DDIA) Book