Utilizing highly synchronized clocks in distributed databases

This master's thesis at Lund University Sweden explores how CockroachDB's transactional performance can be improved by using tightly synchronized clocks. The paper addresses two questions: how to integrate high-precision clock synchronization into CockroachDB and the resulting impact on performance. Given the publicly available clock synchronization technologies like Amazon Time Sync Service and ClockBound, the researchers (Jacob and Fabian) argue that the traditional skepticism around the reliability of clocks in distributed systems is outdated. 


CockroachDB vs Spanner approaches

CockroachDB uses loosely-synchronized NTP clocks, and achieves linearizability by using Hybrid Logical Clocks (HLC) and relying on a static maximum offset (max_offset=500milliseconds) to account for clock skew. However, this approach has limitations, particularly in handling transactional conflicts within a predefined uncertainty interval. When a transaction reads a value with a timestamp falling within this interval, it triggers an uncertainty restart to ensure consistency. These restarts, categorized as necessary (caused by actual clock skew) or unnecessary (caused by latency), increase transaction latency. By using tight clock synchronization and reducing uncertainty intervals, it is possible to achieve performance gains.

To investigate this further, the paper compares CockroachDB's approach with Google's Spanner, which uses the TrueTime API. TrueTime provides bounded timestamps that guarantee the actual time falls within an interval. This enables Spanner to ensure linearizability using commit-wait, which delays transaction commits until timestamp bounds guarantee consistency. That is, a transaction must wait until tmax of the transaction’s timestamp has passed before committing and releasing its locks. As tmax provides an upper bound for timestamps generated across all nodes, it is guaranteed that the changes made by the transaction will be visible across the entire cluster, achieving linearizability.

To adopt tightly synchronized clocks for CockroachDB, the paper considers two main approaches:

  1. Adopting a commit-wait model: This approach, inspired by Google Spanner, involves waiting for a certain period after a transaction acquires all necessary locks to ensure that its changes are visible across the entire cluster. However, the implementation complexity was deemed significant for the current project, and this is not pursued.
  2. Dynamically adjusting uncertainty intervals: This approach focuses on dynamically reducing the size of uncertainty intervals by leveraging bounded timestamps provided by high-precision clock synchronization services, like AWS TimeSync. By dynamically adjusting these intervals based on the actual clock skew, the number of unnecessary transaction restarts can be significantly reduced, leading to improved performance.


TrueClock and ClockBound   

The authors implemented the second approach by modifying CockroachDB to utilize a TrueTime like API and ClockBound. They ran into practical challenges during implementation. Initial tests revealed that retrieving bounded timestamps from ClockBound introduced significant latency (50 microseconds) compared to standard system clock readings (60 nanoseconds). To address this, they ported the open-source ClockBound daemon as a Go library, called TrueClock, which allowed them to include it directly in CockroachDB as it is also written in Go. This in turn removed the need for a datagram request for each clock reading. This reduced clock reading latency from 50 microseconds to 250 nanoseconds, a negligible overhead for database operations.


Evaluation

To test their hypothesis and evaluate the transactional performance gains, the researchers modified CockroachDB to replace the static max_offset with dynamic values calculated based on real-time clockbound synchronization precision. They deployed a CockroachDB cluster consisting of three replicas across three availability zones in the eu-north-1 (Stockholm) region. As predicted, this change significantly reduced the uncertainty interval size, improving the database's ability to process transactions efficiently. The uncertainty intervals shrank by a factor of over 500 compared to the standard configuration, with maximum bounds of 1.4 milliseconds versus the default 500 milliseconds.

Experiments showed significant performance improvements, with read latency reduced by up to 47% and write latency by up to 43%. The authors attribute these gains partly to reduced latch contention. In CockroachDB, latches are used to serialize operations, even for the read operations. Shorter uncertainty intervals allow latches to release sooner, reducing contention and the time writes must wait for reads to complete. Interestingly, although only reads are directly affected by uncertainty restarts, reducing their latency also indirectly benefited write performance due to lower contention.


Conclusion

This work emphasizes the importance of embracing high-precision clocks as they become increasingly accessible in production environments. By dynamically adapting to real-time synchronization precision in place of static/pessimistic assumptions about clock skew, the experiments showed improved performance even with the latches for reads still being in place.

This highlights the potential of integrating modern clock synchronization methods into distributed databases. The results suggest that tighter synchronization not only improves transactional throughput but also offers a feasible path to achieving stronger consistency models like linearizability without significant overhead. The guiding principle should be to use clock synchronization for performance, and not for correctness. 



Comments

Dichen Li said…
This is great learning to me. Also impressive how you discovered research works not only from published articles, but also master's theses. I'm also curious about your last statement: "The guiding principle should be to use clock synchronization for performance, and not for correctness." This work indeed improved performance but did not change the CockroachDB's core algorithm to ensure linearizability (one aspect of correctness). But in Spanner, bounded clock error was a core mechanism to ensure linearizability. (And smaller clock error bound can be used to improve performance.)

Popular posts from this blog

Hints for Distributed Systems Design

Learning about distributed systems: where to start?

Making database systems usable

Looming Liability Machines (LLMs)

Advice to the young

Foundational distributed systems papers

Distributed Transactions at Scale in Amazon DynamoDB

Linearizability: A Correctness Condition for Concurrent Objects

Understanding the Performance Implications of Storage-Disaggregated Databases

Designing Data Intensive Applications (DDIA) Book