Posts

Showing posts from December, 2024

Utilizing highly synchronized clocks in distributed databases

This master's thesis at Lund University, Sweden explores how CockroachDB's transactional performance can be improved by using tightly synchronized clocks. The paper addresses two questions: how to integrate high-precision clock synchronization into CockroachDB, and what the resulting impact on performance is. Given publicly available clock synchronization technologies like Amazon Time Sync Service and ClockBound, the researchers (Jacob and Fabian) argue that the traditional skepticism about the reliability of clocks in distributed systems is outdated. CockroachDB vs Spanner approaches: CockroachDB uses loosely synchronized NTP clocks and achieves linearizability by using Hybrid Logical Clocks (HLC) and relying on a static maximum offset (max_offset = 500 milliseconds) to account for clock skew. However, this approach has limitations, particularly in handling transactional conflicts within a predefined uncertainty interval. When a transaction reads a value with a timestamp falling ...
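A minimal Go sketch of the uncertainty-interval idea described above, assuming a fixed max_offset bound on clock skew; the names and structure here are illustrative, not CockroachDB's actual implementation:

```go
package main

import (
	"fmt"
	"time"
)

// Assumed bound on clock skew between nodes (CockroachDB's default is 500ms;
// this constant is only illustrative).
const maxOffset = 500 * time.Millisecond

// uncertain reports whether a value written at writeTS falls inside the
// read's uncertainty interval (readTS, readTS+maxOffset]: the write carries a
// higher timestamp than the read, but with skewed clocks it may still have
// happened before the read in real time.
func uncertain(readTS, writeTS time.Time) bool {
	return writeTS.After(readTS) && !writeTS.After(readTS.Add(maxOffset))
}

func main() {
	read := time.Now()
	write := read.Add(200 * time.Millisecond) // within the 500ms interval
	if uncertain(read, write) {
		// In CockroachDB terms this would force an uncertainty restart:
		// the transaction retries at a timestamp above the conflicting write.
		fmt.Println("uncertain read: restart the transaction at a higher timestamp")
	}
}
```

Tighter clock synchronization shrinks this interval, which is exactly why the thesis expects fewer uncertainty restarts and better transactional performance.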

Best of Metadata in 2024

I can't believe we wasted another good year. It is time to reflect on the best posts at the Metadata blog in 2024. (I think you guys should tip me just because I didn't call this post "Metadata wrapped".) Distributed systems posts: Transactional storage for geo-replicated systems (SOSP'11): I like this paper because it asked the right questions and introduced parallel snapshot isolation. No individual part is novel (vector clocks, csets), but their composition and their application to WAN web applications is. Walter showed how to limit WAN coordination while still supporting useful applications. An Hourglass Architecture for Distributed Systems (SRDS'24 Keynote): This work successfully bridges theoretical research and practical implementation in large-scale distributed systems in the Facebook/Meta control plane. The shared log abstraction separates the system into two layers: the database layer (proposers/learners) and the log layer (acceptors)....
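As a rough Go sketch of the two-layer split described for the hourglass/shared-log design: the log layer exposes only a narrow append/read interface, and the database layer builds its state by replaying the log. The SharedLog interface and memLog type are my own illustrative names, not the paper's API:

```go
package main

import (
	"fmt"
	"sync"
)

// SharedLog is the narrow waist of the hourglass: the log layer
// (the "acceptors") only exposes append and read of a totally ordered log.
type SharedLog interface {
	Append(entry []byte) (pos int)
	Read(pos int) ([]byte, bool)
}

// memLog is an in-memory stand-in for the replicated log layer.
type memLog struct {
	mu      sync.Mutex
	entries [][]byte
}

func (l *memLog) Append(entry []byte) int {
	l.mu.Lock()
	defer l.mu.Unlock()
	l.entries = append(l.entries, entry)
	return len(l.entries) - 1
}

func (l *memLog) Read(pos int) ([]byte, bool) {
	l.mu.Lock()
	defer l.mu.Unlock()
	if pos < 0 || pos >= len(l.entries) {
		return nil, false
	}
	return l.entries[pos], true
}

// The database layer (the "proposers/learners") derives its state by
// replaying log entries in log order.
func replay(log SharedLog) []string {
	var state []string
	for pos := 0; ; pos++ {
		entry, ok := log.Read(pos)
		if !ok {
			return state
		}
		state = append(state, string(entry))
	}
}

func main() {
	log := &memLog{}
	log.Append([]byte("create table t"))
	log.Append([]byte("insert into t values (1)"))
	fmt.Println(replay(log))
}
```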

Index for Designing Data Intensive Applications (DDIA) book

The DDIA book is a great textbook, because it is not written as a textbook but more as a guidebook. Textbooks are generally bland and boring. Textbooks written by professors are even more so, because those are often written to impress other professors and to flaunt academic flair. Few textbooks take teaching as the primary goal. The DDIA book has clear writing, and it is pragmatic. It is as if your smart colleague is filling you in on the fundamentals as well as the intricacies (dirty secrets) of their field. It is genuine and authentic. Kudos, Kleppmann! Here are my summaries of the DDIA book chapters. A new version of the book is coming soon, and I look forward to seeing the updated content. Designing Data Intensive Applications (DDIA) Book, Chp 1. Intro and Chp 2. Data Models and Query Languages DDIA: Chp 3. Storage and Retrieval (Part 1) DDIA: Chp 3. Storage and Retrieval (Part 2) DDIA: Chp 4. Encoding and Evolution (Part 1) DDIA: Chp 4. Encoding and Evolution (Part 2) DDIA: C...

DDIA: Chapter 11 - Stream Processing

Daily batch processes introduce significant latency, since input changes are reflected in the output only after a day. For fast-paced businesses, this is too slow. To reduce delays, stream processing runs more frequently (e.g., every second) or continuously, handling events as they happen. In stream processing, a record is typically called an event: a small, immutable object containing the details of an occurrence, often with a timestamp. Polling for new events becomes costly when striving for low-latency continuous processing. Frequent polling increases overhead, as most requests return no new data. Instead, systems should notify consumers when new events are available. Messaging systems handle this by pushing events from producers to consumers. Direct messaging systems require application code to handle message loss and assume producers and consumers are always online, limiting fault tolerance. Message brokers (or message queues) improve reliability by acting as intermediaries. P...
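Here is a toy Go sketch of the push model that brokers enable, in contrast to polling: producers publish events to the broker, and the broker delivers them to the consumer as they arrive. The Broker type and its methods are illustrative names, not any particular system's API:

```go
package main

import (
	"fmt"
	"time"
)

// Event is a small, immutable record of something that happened.
type Event struct {
	Timestamp time.Time
	Payload   string
}

// Broker decouples producers from consumers: producers publish without
// knowing who consumes, and consumers are pushed events instead of polling.
type Broker struct {
	ch chan Event
}

func NewBroker(buffer int) *Broker {
	return &Broker{ch: make(chan Event, buffer)}
}

// Publish is called by producers; the broker buffers the event.
func (b *Broker) Publish(payload string) {
	b.ch <- Event{Timestamp: time.Now(), Payload: payload}
}

// Subscribe pushes events to the handler as they become available,
// so the consumer never has to poll for new data.
func (b *Broker) Subscribe(handler func(Event)) {
	go func() {
		for ev := range b.ch {
			handler(ev)
		}
	}()
}

func main() {
	broker := NewBroker(16)
	broker.Subscribe(func(ev Event) {
		fmt.Println("consumed:", ev.Payload)
	})
	broker.Publish("page_view user=42")
	broker.Publish("click user=42 button=buy")
	time.Sleep(100 * time.Millisecond) // let the consumer drain the buffer
	close(broker.ch)
}
```

A real broker would add durability and acknowledgments on top of this buffering, which is what gives it better fault tolerance than direct producer-to-consumer messaging.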

Exploring the NaiadClock TLA+ model in TLA-Web

I have been impressed by the usability of TLA-Web from Will Schultz. Recently I have been using it internally for my TLA+ modeling of MongoDB catalog protocols, and found it very useful for exploring and understanding behavior. This got me thinking that TLA-Web would also be really useful when exploring and understanding an unfamiliar spec I picked up on the web. To test my hunch, I browsed through the TLA+ spec examples here, and I came across this spec about the Naiad Clock. Since I had read the DBSP paper recently, this was all the more interesting to me. I had written about Naiad in 2014, and about dataflow systems more broadly in 2017. Getting to the ASCII version of the spec: Unfortunately, I would not be able to play with the spec, because it only came in paper form: "The Naiad Clock Protocol: Specification, Model Checking, and Correctness Proof." The spec was available only as 13 pages of LaTeX symbols in Appendix A of this paper. I did briefly consider manually transfor...

Popular posts from this blog

Hints for Distributed Systems Design

Learning about distributed systems: where to start?

Making database systems usable

Looming Liability Machines (LLMs)

Foundational distributed systems papers

Advice to the young

Linearizability: A Correctness Condition for Concurrent Objects

Understanding the Performance Implications of Storage-Disaggregated Databases

Scalable OLTP in the Cloud: What’s the BIG DEAL?

Designing Data Intensive Applications (DDIA) Book