Posts

Showing posts matching the search for logical clock

Logical clocks and Vector clocks modeling in TLA+/PlusCal

Image
In a distributed system, there is no shared state, counter, or any other kind of global clock.  So we can not implicitly associate an event with its time, because one node may have a different clock than another. Time synchronization is not easy to achieve, and failures complicate things. It turns out we care about time because of its utility in ordering of the events. Using this observation, in 1978, Leslie Lamport offered a time-free definition for "happens before": Event A happens before event B (denoted as A hb B) if and only if A can causally affect B. In the context of distributed systems, A hb B iff 1. A and B are on the same node and A is earlier in computation than B 2. A is the send of a message and B is the receive event for that message 3. There is some third event C, which A hb C, and C hb B. This also suggest the definition for "concurrent" relation. Events A and B are concurrent iff $\neg( A ~hb~ B) \land \neg( B ~hb~ A)$ To capture the hb ...

Implementation of Cluster-wide Logical Clock and Causal Consistency in MongoDB

Image
This paper (SIGMOD 2019) discusses the design of causal consistency and cluster-wide logical clock for MongoDB.  Causal consistency preserves Lamport's happened-before (transitive and partial) order of events in a distributed system. If an event A causes another event B, causal consistency ensures that every process in the system observes A before observing B. Causal consistency guarantees read-your-writes, write-follows-reads, monotonic reads, and monotonic writes. Causal consistency is the strictest consistency level where the system can still make progress (always-available one-way convergence) in the presence of partitions as discussed in  the "Consistency, Availability, and Convergence" paper . This makes causal-consistency a very attractive model for modern distributed applications. https://jepsen.io/consistency The paper provides a simple and balanced design for causal-consistency in MongoDB. The ideas are simple, yet the implementation and operationalizing (backw...

Paper Summary: On the use of Clocks to Enforce Consistency in the Cloud

This paper is by Manuel Bravo, Nuno Diegues, Jingna Zeng, Paolo Romano, Luis Rodrigues, and appeared in IEEE Data Engineering Bulletin 2015. The purpose of this paper is to revisit how the logical and physical clock concepts are applied in the context of developing distributed data store systems for the cloud and review the choice of clocks in relation to consistency/performance tradeoffs. The use of clocks in weak consistency data stores  Dynamo employs sloppy quorums and hinted hand-off and uses version vector (a special case of vector clocks) to track causal dependencies within the replication group of each key. A version vector contains one entry for each replica (thus the size of clocks grows linearly with the number of replicas). The purpose of this metadata is to detect conflicting updates and to be used in the conflict reconciliation function. Here is a link to my Dynamo review. COPS is a geo-replicated datastore and it assigns a scalar clock to each object. Clien...

Analysis of Bounds on Hybrid Vector Clocks

Image
This work is in collaboration with Sorrachai Yingchareonthawornchai and Sandeep Kulkarni at the Michigan State University and is currently under submission. Practice of distributed systems employs loosely synchronized clocks, mostly using NTP. Unfortunately, perfect synchronization is unachievable due to messaging with uncertain latency, clock skew, and failures. These sync errors lead to anomalies. For example, a send event at Branch1 may be assigned a timestamp greater than the corresponding receive event at Branch2, because Branch1's clock is slightly ahead of Branch2's clock. This leads to /inconsistent state snapshots/ because, at time T=12:00, a money transfer is recorded as received at Branch2, whereas it is not recorded as sent at Branch1. Theory of distributed systems shrugs and doesn't even try. Theory abstracts away from the physical clock and uses logical clocks for ordering events. These are basically counters, as in Lamport's clocks and vector cloc...

Use of Time in Distributed Databases (part 1)

Image
Distributed systems are characterized by nodes executing concurrently with no shared state and no common clock. Coordination between nodes are needed to satisfy some correctness properties, but since coordination requires message communication there is a performance tradeoff preventing nodes from frequently communicating/coordinating. Timestamping and ordering Why do the nodes, in particular database nodes, need coordinating anyway? The goal of coordination among nodes is to perform event ordering and align independent timelines of nodes with respect to each other. This is why timestamping events and ordering them is very important. Nodes run concurrently without knowing what the other nodes are doing at the moment. They learn about each other's states/events only by sending and receiving messages and this information by definition come from the past state of the nodes. Each node needs to compose a coherent view of the system from these messages and all the while the system is movi...

Hybrid clocks crate in Rust

Image
I have been learning Rust in my spare time. It has a very steep learning curve, but in return you get to use an elegant, fast, and safe programming language you can use anywhere from embedded systems to datacenter computing. Rust stole like an artist the best concepts from several languages and combined them under one roof with style. Rust has seen a lot of traction recently, and I think Rust is here to stay for a long time.  In the last couple months, I have been writing small programs to practice Rust, and recently I decided I should go bigger and consider reading a real code base, but something with manageable size. I chose to focus on the hybrid clocks crate because it was the right size, and it solved an important problem.  The hybrid clocks crate showcases many of the best practices of Rust. It has a thoughtful design. It shows practical use of struct and trait implementations and trait objects . It was initially hard for me to wrap my mind around the design, because t...

Use of Time in Distributed Databases (part 4): Synchronized clocks in production databases

Image
This is part 4 of our "Use of Time in Distributed Databases" series . In this post, we explore how synchronized physical clocks enhance production database systems. Spanner Google's Spanner (OSDI'12) implemented a novel approach to handling time in distributed database systems through its TrueTime API. TrueTime API provides time as an interval that is guaranteed to contain the actual time, maintained within about 6ms (this is 2012 published number which improved significantly since then) of uncertainty using GPS receivers and atomic clocks. This explicit handling of time uncertainty allows Spanner to provide strong consistency guarantees while operating at a global scale. Spanner uses multi-version concurrency control (MVCC) and achieves external consistency (linearizability) for current transactions through techniques like "commit wait," where transactions wait out the uncertainty in their commit timestamps before making their writes visible. Spanner uses ...

Hybrid Logical Clocks

Image
Here I will write about our recent work on Hybrid Logical Clocks, which provides a feasible alternative to Google's TrueTime. A brief history of time (in distributed systems) Logical Clocks (LC) was proposed in 1978 by Lamport for ordering events in an asynchronous distributed system. LC has several drawbacks for modern systems. Firstly, LC is divorced from physical time (PT), as a result we cannot query events in relation to real-time. Secondly, to capture happened-before relations, LC assumes that there are no backchannels and all communication occurs within the system. Physical Time (PT) leverages on physical clocks at nodes that are synchronized using the Network Time Protocol (NTP). PT also has several drawbacks. Firstly, in a geographically distributed system obtaining precise clock synchronization is very hard; there will unavoidably be uncertainty intervals. Secondly,   PT has several kinks such as leap seconds and non-monotonic updates.  And, when the uncertaint...

Popular posts from this blog

Hints for Distributed Systems Design

My Time at MIT

Scalable OLTP in the Cloud: What’s the BIG DEAL?

Foundational distributed systems papers

Advice to the young

Learning about distributed systems: where to start?

Distributed Transactions at Scale in Amazon DynamoDB

Making database systems usable

Looming Liability Machines (LLMs)

Analyzing Metastable Failures in Distributed Systems