Chapter 7: Distributed Recovery (Concurrency Control Book)

Chapter 7 of Concurrency Control and Recovery in Database Systems by Bernstein, Hadzilacos, and Goodman (1987) tackles the distributed commit problem: ensuring atomic commit across a set of distributed sites that may fail independently. The chapter covers these concepts:

- The challenges of transaction processing in distributed database systems (which weren't around in 1987)
- Failure models (site and communication) and timeout-based detection
- The definition and guarantees of Atomic Commitment Protocols (ACPs)
- The Two-Phase Commit (2PC) protocol (and its cooperative termination variant)
- The limitations of 2PC (especially blocking)
- Introduction and advantages of the Three-Phase Commit (3PC) protocol

Despite its rigor and methodical development, the chapter feels like a suspense movie today. We, the readers, equipped with modern tools like the FLP impossibility result and the Paxos protocol, watch as the authors try to navigate a minefield, unaware of the lurking impossibility results that were pu...