Life beyond Distributed Transactions: an Apostate's Opinion

Pat Helland is one of the veterans of the database community. He worked at Tandem Computers with Jim Gray. His tribute to Jim Gray, which gives a lot of insight into Jim Gray as a researcher, is worth reading again and again.

This 2007 position paper from Pat Helland is about extreme scalability in cloud systems, and is by its nature anti-transactional. Since Pat has been a strong advocate for transactions and global serializability for most of his career, the paper is aptly subtitled an apostate's opinion.

This paper is very relevant to the NoSQL movement. Pat introduces the "entities and activities" abstractions as building primitives for extremely scalable cloud systems. He also talks at length about the need to craft good workflow/business logic on top of these primitives.

Entity and activities abstractions
Entities are collections of named (keyed) data which may be atomically updated within the entity but never atomically updated across entities. An entity lives on a single machine at a time and the application can only manipulate one entity atomically. A consequence of almost-infinite scaling is that this programmatic abstraction must be exposed to the developer of business logic. Each entity has a unique ID, and entities represent disjoint sets of data.
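To make the entity abstraction concrete, here is a minimal in-memory sketch. The names `Entity`, `EntityStore`, and `atomic_update` are hypothetical and do not come from the paper; the sketch only assumes that each entity has a unique ID and that an update either fully applies to one entity or not at all.

```python
import threading
from typing import Any, Callable, Dict


class Entity:
    """Keyed data that is only ever updated atomically as a single unit."""

    def __init__(self, entity_id: str) -> None:
        self.entity_id = entity_id
        self.data: Dict[str, Any] = {}
        self._lock = threading.Lock()

    def atomic_update(self, fn: Callable[[Dict[str, Any]], None]) -> None:
        # Run fn against a draft copy and commit only if fn returns normally.
        # The copy is shallow, so this sketch assumes flat (non-nested) values.
        with self._lock:
            draft = dict(self.data)
            fn(draft)
            self.data = draft


class EntityStore:
    """Maps unique IDs to entities. There is deliberately no multi-entity update."""

    def __init__(self) -> None:
        self._entities: Dict[str, Entity] = {}

    def get(self, entity_id: str) -> Entity:
        return self._entities.setdefault(entity_id, Entity(entity_id))
```

The point of the design is what is missing: the store offers no call that touches two entities in one atomic step, so the developer of the business logic sees the scaling boundary directly.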

Since you can’t update the data across two entities in the same transaction, you need a mechanism to update the data in different transactions. The connection between the entities is via a message addressed to the other entity.

Activities comprise the collection of state within the entities used to manage messaging relationships with a single partner entity. Activities keep track of messages between entities. This can be used to keep entities eventually consistent, even when we are limited to transactions on a single entity. (Messaging notifies the other entity about this activity, and the other entity may update its state.)
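One way to picture an activity is as a small per-partner record stored inside the entity. The sketch below is an illustration under assumptions, not the paper's design: the names `Activity` and `Account` are hypothetical, messages are plain tuples, and delivery from a given partner is assumed to be in order but possibly repeated.

```python
from dataclasses import dataclass, field
from typing import Dict, Tuple

Message = Tuple[str, int, int]  # (sender_id, sequence_number, amount)


@dataclass
class Activity:
    """State one entity keeps about its conversation with one partner."""
    next_seq_out: int = 0   # sequence number for the next outgoing message
    last_seq_in: int = -1   # highest incoming sequence number already applied


@dataclass
class Account:
    entity_id: str
    balance: int = 0
    activities: Dict[str, Activity] = field(default_factory=dict)

    def _activity(self, partner_id: str) -> Activity:
        return self.activities.setdefault(partner_id, Activity())

    def send_credit(self, partner_id: str, amount: int) -> Message:
        # Single-entity transaction: debit locally and record the send.
        act = self._activity(partner_id)
        self.balance -= amount
        msg = (self.entity_id, act.next_seq_out, amount)
        act.next_seq_out += 1
        return msg

    def receive_credit(self, sender_id: str, seq: int, amount: int) -> bool:
        # Single-entity transaction: apply the message unless it is a repeat.
        act = self._activity(sender_id)
        if seq <= act.last_seq_in:
            return False  # duplicate delivery, already applied
        self.balance += amount
        act.last_seq_in = seq
        return True
```

Each method touches exactly one entity. The two accounts converge only after the message arrives, which is the eventual consistency described above, and a retried message is ignored because the activity remembers what has been seen.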

The key-value tuple concept widely employed in key-value stores in cloud computing systems is a good example of an entity. However, key-value tuples do not specify any explicit "activities". Note that if we can make the messages between entities idempotent, then we don't need to keep activities for entities; hence the entity+activities concept reduces to the key-value tuple concept.
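A small sketch of why idempotence removes the need for activity state. The helper names (`apply_put`, `apply_increment`, `deliver`) are hypothetical and the store is just a dictionary; the only claim is that re-delivering an absolute write is harmless, while re-delivering a relative update is not.

```python
from typing import Callable, Dict

Store = Dict[str, int]


def apply_put(store: Store, key: str, value: int) -> None:
    # Idempotent: applying the same message again leaves the same state.
    store[key] = value


def apply_increment(store: Store, key: str, delta: int) -> None:
    # Not idempotent: every redelivery changes the state again.
    store[key] = store.get(key, 0) + delta


def deliver(handler: Callable[[Store, str, int], None],
            store: Store, key: str, arg: int, times: int) -> None:
    # Simulate at-least-once delivery by applying the same message repeatedly.
    for _ in range(times):
        handler(store, key, arg)


puts: Store = {}
deliver(apply_put, puts, "stock", 7, times=3)

incs: Store = {}
deliver(apply_increment, incs, "stock", 7, times=3)
```

After three deliveries `puts["stock"]` is still 7, while `incs["stock"]` has drifted to 21. A receiver of increment-style messages therefore has to remember which ones it has already applied, and that memory is precisely an activity.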

In fact, several developers have independently invented different ad hoc ways of implementing activities on top of entities. What Pat is advocating is to explicitly recognize activities and develop a standard primitive for implementing them, in order to avoid inconsistency bugs.

An example of an activity is found in Google's Percolator paper, which describes the system that replaced MapReduce for building Google's web search index. Percolator provides distributed transaction middleware built on top of BigTable. Each row is an entity, since a BigTable transaction is atomic only with respect to a single row. However, to build a distributed transaction, the system must remember the state of the transaction with respect to the other rows involved, i.e., the "activities". This Percolator metadata is encoded in separate columns of the same row in BigTable; for example, Percolator records primary and secondary locks there. (See Figure 5 for the full list.) I guess using coordination services such as ZooKeeper is another way of implicitly implementing activities.


Workflow is for dealing with uncertainty at a distance
In a system which cannot count on atomic distributed transactions, the management of uncertainty must be implemented in the business logic. The uncertainty of the outcome is held in the business semantics rather than in the record lock. This is simply workflow. Think about the style of interactions common across businesses. Contracts between businesses include time commitments, cancellation clauses, reserved resources, and much more. The semantics of uncertainty is wrapped up in the behaviour of the business functionality. While more complicated to implement than simply using atomic distributed transactions, it is how the real world works. Again, this is simply an argument for workflow but it is fine-grained workflow with entities as the participants.
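As an illustration of holding uncertainty in business semantics rather than in a record lock, here is a minimal sketch of a reservation with a time commitment and a cancellation path. The `Inventory` class and its method names are hypothetical, and time is passed in explicitly as a number so the example stays deterministic.

```python
from dataclasses import dataclass, field
from typing import Dict, Tuple


@dataclass
class Inventory:
    """A single entity that tracks stock plus tentative holds on it."""
    available: int
    holds: Dict[str, Tuple[int, float]] = field(default_factory=dict)  # order_id -> (qty, deadline)

    def expire(self, now: float) -> None:
        # Time commitment: holds not confirmed by their deadline are released.
        for order_id, (qty, deadline) in list(self.holds.items()):
            if now >= deadline:
                del self.holds[order_id]
                self.available += qty

    def reserve(self, order_id: str, qty: int, now: float, ttl: float) -> bool:
        self.expire(now)
        if order_id in self.holds:
            return True  # retried request; the hold already exists
        if qty > self.available:
            return False
        self.available -= qty
        self.holds[order_id] = (qty, now + ttl)
        return True

    def confirm(self, order_id: str, now: float) -> bool:
        # Succeeds only if the hold is still live; the stock is then consumed.
        self.expire(now)
        return self.holds.pop(order_id, None) is not None

    def cancel(self, order_id: str) -> None:
        # Cancellation clause: give the reserved stock back.
        hold = self.holds.pop(order_id, None)
        if hold is not None:
            self.available += hold[0]
```

Between `reserve` and `confirm` the outcome is uncertain, but that uncertainty is an ordinary piece of entity state (the hold and its deadline) rather than a lock held across machines. The ordering entity would drive these calls by messages and decide what to do when a confirmation comes back as failed.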

Concluding remarks
Systematic support for implementing the activities concept is still lacking today. It seems that this concept needs to be addressed more explicitly and more methodically in order to improve NoSQL systems.

Workflow is also prescribed as the way to deal with the lack of atomic distributed transactions. Workflow requires the developer to think hard and decide on the business logic for dealing with the decentralized nature of the process: time commitments, cancellation clauses, reserved resources, etc. But is there any support for developing, testing, and verifying workflows?
