Posts

Showing posts from August, 2018

TLA+ specification of the bounded staleness and strong consistency guarantees

Image
In my previous post, I had presented a TLA+ modeling of distributed data store that provides the consistent prefix property. In this post, I extend this model slightly to build bounded and strong consistency. In fact the strong consistency specification is achieved when we take the Delta on the bounded consistency as 1.

The TLA+ (well, PlusCal to be more accurate) code for this model is available at https://www.dropbox.com/s/qvmhhgjf9iycaca/boundedstrongP.tla?dl=0

The system model As in the previous post, we assume there is a write region (with ID=1) and read regions (with IDs 2 through NumRegions). The write region performs the write and copies it to the read regions. There are FIFO communication channels between the write and read regions.
WriteRegion == 1
ReadRegions == 2..NumRegions
chan = [n \in 1..NumRegions |-> <<>>]; 

We use D to denote the Delta on the bounded staleness consistency. Bounded staleness ensures that read results are not too stale. That is, the read o…

TLA+ specification of the consistent prefix guarantee

Image
We had covered consistency properties in the recent posts. Now to keep it more concrete and to give you a taste of TLA+, I present a model of a distributed data store that provides the consistent prefix property.

In my next post, I will extend this model slightly to present bounded and strong consistency properties. And, in the post after that, I will add a client model to show how this data store can be extended to provide session consistency. The TLA+ (well, PlusCal to be more accurate) code for this model is available here. 

The system model We assume there is a write region (with ID=1) and read regions (with IDs 2 through NumRegions). The write region performs the write and copies it to the read regions. There are FIFO communication channels between the write and read regions.
WriteRegion == 1
ReadRegions == 2..NumRegions
chan = [n \in 1..NumRegions |-> <<>>]; 

The write region actions
The write region has 3 actions to perform:

Commit the write in the write regionMultic…

Mind Your State for Your State of Mind

Image
This article by Pat Helland appeared at ACM Queue on July 2018. Pat is a self-confessed apostate. He is also a database philosopher; look at the title of his recent publications: Standing on distributed shoulders of giants, Life beyond distributed transactions, Immutability changes everything, Heisenberg was on the "write" track, consistently eventual, etc.

This "mind your state for your state of mind" article looks at the history of interactions of applications and storage/databases, and charts their co-evolution as they move into the distributed and scalable world.

The evolution of state, storage, and computing Storage has evolved from disks directly attached to your computer to shared appliances such as SANs (storage area networks) leading the way to storage clusters of commodity servers contained in a network.

Computing evolved from a single-process on a single server, to multiple processes communicating on a single server, to RPCs (remote procedure calls) acro…

Logical index organization in Cosmos DB

Image
This post zooms into the logical indexing subsystem mentioned in my previous post on "Schema-Agnostic Indexing with Azure Cosmos DB". 

With the advent of big data, we face a big data-integration problem. It is very hard to enforce a schema (structure/type system) on data, and irregularities and entropy is a fact of life. You will be better off if you accept this as a given, rather than pretend you are very organized, you can foresee all required fields in your application/database, and every branch of your organization will be disciplined enough to use the same format to collect/store data.

A patch employed by relational databases is to add sparse new columns to accommodate for possibilities and to store a superset of the schemas. However, after you invoke an alter-table on a big data set, you realize this doesn't scale well, and start searching for schema-agnostic solutions.

Achieving schema agnosticism As we discussed in the previous post, JSON provides a solution for …

Replicated Data Consistency Explained Through Baseball

Image
I had mentioned this report in my first overview post about my sabbatical at Cosmos DB. This is a 2011 technical report by Doug Terry, who was working at Microsoft Research at the time. The paper explains the consistency models through publishing of baseball scores via multiple channels. (Don't worry, you don't have to know baseball rules or scoring to follow the example; the example works as well for publishing of, say, basketball scores.)

Many consistency models have been proposed for distributed and replicated systems. The reason for exploring different consistency models is that there are fundamental tradeoffs between consistency, performance, and availability.

While seeing the latest written value for an object is desirable and reasonable to provide within a datacenter, offering strong consistency for georeplicated services across continents results in lower performance and reduced availability for reads, writes, or both.

On the opposite end of the spectrum, with eventual…

Schema-Agnostic Indexing with Azure Cosmos DB

Image
This VLDB'15 paper is authored by people from the Cosmos DB team and Microsoft research. The paper has "DocumentDB" in the title because DocumentDB was the project that evolved into Azure Cosmos DB. The project started in 2010 to address the developer pain-points inside Microsoft for supporting large scale applications. In 2015 the first generation of the technology was made available to Azure developers as Azure DocumentDB. It has since added many new features and introduced significant new capabilities, resulting in Azure Cosmos DB!

This paper describes the schema-agnostic indexing subsystem of Cosmos DB.  By being fully schema-agnostic, Cosmos DB provides many benefits to the developers. Keeping the database schema and indexes in-sync with an application's schema is especially painful for globally distributed apps. But with a schema-agnostic database, you can iterate your app quickly without worrying of schemas or indexes. Cosmos DB automatically indexes all the d…

The many faces of consistency

Image
This is a lovely paper (2016) from Marcos Aguilera and Doug Terry. It is an easy read. It manages to be technical and enlightening without involving analysis, derivation, or formulas.

The paper talks about consistency. In distributed/networked systems (which includes pretty much any practical computer system today), data sharing and replication brings up a fundamental question: What should happen if a client modifies some data items and simultaneously, or within a short time, another client reads or modifies the same items, possibly at a different replica?

The answer depends on the context/application. Sometimes eventual consistency is what is desired (e.g., DNS), and sometimes you need strong consistency (e.g., reservations, accounting).

The paper points out two different types of consistency, state consistency and operation consistency, and focuses mainly on comparing/contrasting these two approaches.

When I see the terms state versus operation consistency, without reading the rest …

Popular posts from this blog

I have seen things

SOSP19 File Systems Unfit as Distributed Storage Backends: Lessons from 10 Years of Ceph Evolution

PigPaxos: Devouring the communication bottlenecks in distributed consensus

Frugal computing

Fine-Grained Replicated State Machines for a Cluster Storage System

Learning about distributed systems: where to start?

My Distributed Systems Seminar's reading list for Spring 2020

Cross-chain Deals and Adversarial Commerce

Book review. Tiny Habits (2020)

Zoom Distributed Systems Reading Group