Posts

Showing posts from July, 2025

Chapter 7: Distributed Recovery (Concurrency Control Book)

Chapter 7 of the book Concurrency Control and Recovery in Database Systems by Bernstein, Hadzilacos, and Goodman (1987) tackles the distributed commit problem: ensuring atomic commit across a set of distributed sites that may fail independently. The chapter covers these concepts:

The challenges of transaction processing in distributed database systems (which weren't around in 1987)
Failure models (site and communication) and timeout-based detection
The definition and guarantees of Atomic Commitment Protocols (ACPs)
The Two-Phase Commit (2PC) protocol (and its cooperative termination variant)
The limitations of 2PC (especially blocking)
Introduction and advantages of the Three-Phase Commit (3PC) protocol

Despite its rigor and methodical development, the chapter feels like a suspense movie today. We, the readers, equipped with modern tools like the FLP impossibility result and the Paxos protocol, watch as the authors try to navigate a minefield, unaware of the lurking impossibility results that were pu...

Popular posts from this blog

Hints for Distributed Systems Design

My Time at MIT

Making database systems usable

Advice to the young

Looming Liability Machines (LLMs)

Learning about distributed systems: where to start?

Foundational distributed systems papers

Scalable OLTP in the Cloud: What’s the BIG DEAL?

Distributed Transactions at Scale in Amazon DynamoDB

Linearizability: A Correctness Condition for Concurrent Objects