
Best of Metadata in 2024

I can't believe we wasted another good year. It is time to reflect back on the best posts at the Metadata blog in 2024. (I think you guys should tip me just because I didn't call this post "Metadata wrapped".)

Distributed systems posts

Transactional storage for geo-replicated systems (SOSP'11): I like this paper because it asked the right questions and introduced parallel snapshot isolation. No individual part is novel (vector clocks, csets), but their composition and their application to WAN web applications was novel. Walter showed how to limit WAN coordination while still building useful applications.

An Hourglass Architecture for Distributed Systems (SRDS'24 Keynote): This work successfully bridges theoretical research and practical implementation in large-scale distributed systems in the Facebook/Meta control plane. The shared log abstraction separates the system into two layers: the database layer (proposers/learners) and the log layer (acceptors)....
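To make the hourglass split concrete, here is a minimal Python sketch (my illustration, not Meta's implementation): the log layer only orders and stores entries, while database replicas append commands and learn state by replaying the log.

# A toy shared log: the log layer orders entries (acceptors),
# the database layer appends and replays them (proposers/learners).

class SharedLog:
    """Log layer: a totally ordered, append-only sequence of entries."""
    def __init__(self):
        self._entries = []

    def append(self, entry) -> int:
        self._entries.append(entry)
        return len(self._entries) - 1   # position in the total order

    def read(self, from_pos: int):
        return list(enumerate(self._entries[from_pos:], start=from_pos))

class KVDatabase:
    """Database layer: proposes by appending, learns by replaying."""
    def __init__(self, log: SharedLog):
        self.log = log
        self.state = {}
        self.applied = 0            # next log position to apply

    def put(self, key, value):
        self.log.append(("put", key, value))

    def catch_up(self):
        for pos, (op, key, value) in self.log.read(self.applied):
            if op == "put":
                self.state[key] = value
            self.applied = pos + 1

log = SharedLog()
db1, db2 = KVDatabase(log), KVDatabase(log)
db1.put("x", 1)
db2.catch_up()
assert db2.state["x"] == 1   # replicas converge by replaying the log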

Index for Designing Data Intensive Applications (DDIA) book

The DDIA book is a great textbook because it is not written as a textbook, but more as a guidebook. Textbooks are generally bland and boring. Textbooks written by professors are even more so, because those are often written to impress other professors and to flaunt academic flair. Few textbooks take teaching as the primary goal. The DDIA book has clear writing, and it is pragmatic. It is as if your smart colleague is filling you in on the fundamentals as well as the intricacies (dirty secrets) of their field. It is genuine and authentic. Kudos, Kleppmann! Here are my summaries of the DDIA book chapters. A new version of the book is coming soon, and I look forward to seeing the updated content.

Designing Data Intensive Applications (DDIA) Book, Chp 1. Intro and Chp 2. Data Models and Query Languages
DDIA: Chp 3. Storage and Retrieval (Part 1)
DDIA: Chp 3. Storage and Retrieval (Part 2)
DDIA: Chp 4. Encoding and Evolution (Part 1)
DDIA: Chp 4. Encoding and Evolution (Part 2)
DDIA: C...

DDIA: Chapter 11 - Stream Processing

Daily batch processes introduce significant latency, since input changes are reflected in the output only after a day. For a fast-paced business, this is too slow. To reduce delays, stream processing runs more frequently (e.g., every second) or continuously, where events are handled as they happen. In stream processing, a record is typically called an event: a small, immutable object containing the details of an occurrence, often with a timestamp. Polling for new events becomes costly when striving for low-latency continuous processing. Frequent polling increases overhead because most requests return no new data. Instead, systems should notify consumers when new events are available. Messaging systems handle this by pushing events from producers to consumers. Direct messaging systems require application code to handle message loss and assume producers and consumers are always online, limiting fault tolerance. Message brokers (or message queues) improve reliability by acting as intermediaries. P...
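To make the push-versus-poll distinction concrete, here is a toy broker sketch in Python (my illustration, not from the book): consumers register callbacks and the broker pushes each event to them as it arrives, so nobody busy-polls for new data.

import queue
import threading

class Broker:
    """A toy message broker: producers publish, the broker pushes
    each event to consumer callbacks as it arrives."""
    def __init__(self):
        self._queue = queue.Queue()
        self._consumers = []

    def subscribe(self, callback):
        self._consumers.append(callback)

    def publish(self, event):
        self._queue.put(event)

    def run(self):
        while True:
            event = self._queue.get()   # blocks; no busy polling
            if event is None:           # sentinel to shut down
                return
            for deliver in self._consumers:
                deliver(event)          # push to every consumer

broker = Broker()
broker.subscribe(lambda e: print("consumer got:", e))
t = threading.Thread(target=broker.run)
t.start()
broker.publish({"type": "page_view", "ts": 1700000000})
broker.publish(None)   # stop the broker loop
t.join()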

Exploring the NaiadClock TLA+ model in TLA-Web

I have been impressed by the usability of TLA-Web from Will Schultz. Recently I have been using it for my TLA+ modeling of MongoDB catalog protocols internally, and found it very useful for exploring and understanding behavior. This got me thinking that TLA-Web would be really useful when exploring and understanding an unfamiliar spec I picked up on the web. To test my hunch, I browsed through the TLA+ spec examples here, and I came across this spec about the Naiad Clock. Since I had read the DBSP paper recently, this was all the more interesting to me. I had written about Naiad in 2014, and about dataflow systems more broadly in 2017.

Getting to the ASCII version of the spec

Unfortunately, I would not be able to play with the spec, because it only came in paper form: "The Naiad Clock Protocol: Specification, Model Checking, and Correctness Proof." The spec was available only as 13 pages of LaTeX symbols in Appendix A of this paper. I did briefly consider manually transfor...

Blood draw

[trigger warning: blood] I had my first blood draw in 13 years yesterday. The lengthy gap is not random. My last blood draw had gone horribly wrong.

The last time

That previous visit had been for a fasting blood draw. Until then, I'd never had issues with blood draws before. Nurses always complimented me on my veins. One of them said that I had "veins like a garden hose, they are impossible to miss." So I felt zero fear or anxiety about the procedure. But I think I made the mistake of looking at the syringe. My blood flows fast, and a blood draw goes twice as quickly for me as for my wife. I saw my bright red blood enthusiastically gushing into the syringe. That was the last thing I remember. Then came nothing. It was the void, and after an indeterminate time, came the reboot. I literally experienced my brain booting up like an old Intel 486 computer booting up DOS. First the BIOS kicked in; it checked for available memory, scanned the disk to locate the operating system, and starte...

Everything is a Transaction: Unifying Logical Concurrency Control and Physical Data Structure Maintenance in Database Management Systems

This paper from CIDR'21 introduces the Deferred Action Framework (DAF). The framework aims to unify transaction control and data structure maintenance in multi-version concurrency control (MVCC) database systems, particularly for complex maintenance tasks like garbage collection and index cleanup. In MVCC systems, transactions and data maintenance tasks (like garbage collection) often interfere, requiring complex coordination. This can hurt system performance and code maintainability, since physical structures (e.g., B+ trees) aren't inherently designed to handle multi-versioning. The paper proposes DAF to handle deferred actions: maintenance tasks that are queued to execute only when they no longer interfere with active transactions. DAF relies on timestamp-based ordering and executes tasks only when their actions won't affect concurrent transactions. DAF guarantees to process actions deferred at some timestamp t only after all transactions started before t have ...
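Here is a minimal sketch of the deferred-action idea in Python (my simplification, not the paper's code): an action deferred at timestamp t runs only after every transaction that started before t has finished.

import heapq

class DeferredActionQueue:
    def __init__(self):
        self._clock = 0
        self._active = set()    # start timestamps of live transactions
        self._deferred = []     # min-heap of (defer_ts, action)

    def begin(self):
        self._clock += 1
        self._active.add(self._clock)
        return self._clock

    def commit(self, start_ts):
        self._active.discard(start_ts)
        self._drain()

    def defer(self, action):
        self._clock += 1
        heapq.heappush(self._deferred, (self._clock, action))

    def _drain(self):
        # The oldest running transaction bounds what is safe to run.
        low_watermark = min(self._active, default=self._clock + 1)
        while self._deferred and self._deferred[0][0] < low_watermark:
            _, action = heapq.heappop(self._deferred)
            action()

daf = DeferredActionQueue()
t1 = daf.begin()
daf.defer(lambda: print("GC old version of row 42"))
t2 = daf.begin()
daf.commit(t2)   # t1 is still running: GC stays queued
daf.commit(t1)   # now safe: GC runs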

DDIA: Chp 10. Batch Processing

Batch processing allows large-scale data transformations through bulk-synchronous processing. The simplicity of this approach allowed building reliable, scalable, maintainable applications with it. If you recall, "reliable-scalable-maintainable" was what we set out to learn when we began the DDIA book. The story of MapReduce starts when Google engineers realized there were a lot of repetitive tasks involved in computing over large data. These tasks often involved individually processing elements and then gathering and fusing their output. Interestingly, this bears a striking resemblance to the electromechanical IBM card-sorting machines of the 1940s-50s. MapReduce also got some inspiration from the map and reduce operations in Lisp: (map square '(1 2 3 4)) gives us (1 4 9 16), and (reduce + '(1 4 9 16)) gives us 30. The key innovation of Google's MapReduce framework was its ability to simplify parallel processing by abstracting away complex network communication and ...
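The Lisp example generalizes to the word-count staple. Here is a minimal single-machine sketch of the map, shuffle, and reduce phases in Python (my illustration, not Google's code):

from collections import defaultdict

def map_phase(doc_id, text):
    for word in text.split():
        yield (word.lower(), 1)      # emit (key, value) pairs

def shuffle(pairs):
    groups = defaultdict(list)       # group values by key,
    for key, value in pairs:         # like the card-sorting step
        groups[key].append(value)
    return groups

def reduce_phase(word, counts):
    return (word, sum(counts))

docs = {1: "the quick fox", 2: "the lazy dog"}
pairs = [kv for doc_id, text in docs.items()
         for kv in map_phase(doc_id, text)]
result = dict(reduce_phase(w, c) for w, c in shuffle(pairs).items())
print(result)   # {'the': 2, 'quick': 1, 'fox': 1, 'lazy': 1, 'dog': 1}

In the real framework the map and reduce calls run on many machines, and the shuffle is done by the network; the programmer only supplies the two functions.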

DBSP: Automatic Incremental View Maintenance for Rich Query Languages

Incremental computation represents a transformative (!) approach to data processing. Instead of recomputing everything when your input changes slightly, incremental computation aims to reuse the original output and efficiently update the results. Efficiently means performing work proportional only to the input and output changes. This paper introduces DBSP, a programming language inspired by signal processing (hence the name DB-SP). DBSP is simple, yet it offers extensive computational capabilities. With just four operators, it covers complex database queries, including the entire relational algebra, set and multiset computations, nested relations, aggregations, recursive queries, and streaming computations.

Basic DBSP operators

The language is designed to capture computation on streams. Streams are represented as infinite vectors indexed by consecutive time steps. Each stream can represent values of any type and incorporates basic mathematical operations like addition, subtraction, and ...
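Here is a small Python sketch of DBSP-style operators on finite stream prefixes (my illustration based on the paper's definitions): delay (z^-1), pointwise lifting, and the integration/differentiation pair, which invert each other.

def delay(stream, zero=0):
    """z^-1: shift the stream by one step, padding with zero."""
    return [zero] + stream[:-1]

def lift(f, *streams):
    """Apply a scalar function pointwise to one or more streams."""
    return [f(*vals) for vals in zip(*streams)]

def integrate(stream):
    """I: running sum of the stream's values."""
    out, acc = [], 0
    for v in stream:
        acc += v
        out.append(acc)
    return out

def differentiate(stream):
    """D: consecutive differences, D(s) = s - z^-1(s)."""
    return lift(lambda a, b: a - b, stream, delay(stream))

s = [1, 3, 6, 10]
assert integrate(differentiate(s)) == s          # I and D are inverses
assert differentiate(integrate([1, 2, 3])) == [1, 2, 3]

The inverse relationship between I and D is what makes incremental view maintenance fall out mechanically: differentiate the input, run the (lifted) query on changes, and integrate the output.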

UB Hacking 2024

I attended the University at Buffalo Hacking event over the weekend. It was fun. There were 90+ projects, and I judged 15 of them. There were some interesting talks as well. It was good to see youthful energy. It feels good to teach the next generation something. Another thing: GeoGuessr played as a group game under time pressure is a lot of fun. This may be a great family activity.

Why should you care about proofs, if all you want to do is coding?

Atri and Andrew decided this would be a good talk to give at a hackathon. Daring! They did a good job imparting their passion for the joys and benefits of mathematical thinking. They talked about Paul Erdos's concept of "The Book" of proofs, and the difference between a correct proof and a great proof from "The Book". They talked about the deep insight that you can achieve through abstract mathematical thinking. They also mentioned that if you don't have good insight into the problem or your program, you will have a hard time de...

DDIA: Chp 9. Consistency and Consensus

Chapter 9 of the Designing Data Intensive Applications (DDIA) book has the following 4 sections (which contain a total of 10 subsections):

Consistency guarantees
Linearizability
Ordering guarantees
Distributed transactions and consensus

TMI (Too much info)

The chapter tries to do too much. Almost an entire semester of distributed systems content is force-jammed into this chapter. I don't think the discord reading group will be able to complete this chapter and make sense of what is going on. For a more manageable load, much of this chapter should be moved to other chapters. The chapter starts by explaining linearizability. It would have been better to explain this in Chapter 5, Replication. Linearizability sounds easy, but it takes time to internalize. I had discussed linearizability here, and later revisited it with this paper. And of course it was mentioned in many other posts. The chapter continues with the CAP tradeoff, rightly calling it "the unhelpful CAP theorem...

Auto-WLM: machine learning enhanced workload management in Amazon Redshift

This paper appeared in Sigmod'23.

What?

Auto-WLM is a machine learning based *automatic workload manager* currently used in production in Amazon Redshift. I thought this would be a machine learning paper, you know, deep learning and stuff. But this paper turned out to be a practical/applied data systems paper. At its core, this paper is about improving query performance and resource utilization in data warehouses, possibly the first for a database system in production at scale. They are not using deep learning, and rightly so! The main takeaway from the paper is that locally-trained simple models (using XGBoost, a decision-tree-based model built from the query plan trees) outperformed globally trained models, likely due to their ability to "instance optimize" to specific databases and workloads. They are using simple ML. And it works.

Why?

This is an important problem. If tuning is done prematurely, resources are unnecessarily wasted, and if it is done too late, overall...
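As a rough illustration of the approach (not Auto-WLM's actual code; the plan features, numbers, and hyperparameters below are made up, and this assumes the xgboost library is installed), one could train a small gradient-boosted-tree regressor on features extracted from query plans to predict runtimes:

import numpy as np
from xgboost import XGBRegressor

# Hypothetical per-query features derived from the plan tree:
# [num_joins, num_scans, estimated_rows, plan_depth]
X = np.array([
    [0, 1, 1e3, 2],
    [2, 3, 1e6, 5],
    [1, 2, 5e4, 3],
    [4, 5, 1e7, 8],
])
y = np.array([0.05, 4.2, 0.6, 30.0])   # observed runtimes in seconds

model = XGBRegressor(n_estimators=50, max_depth=3)
model.fit(X, y)

# Predict the cost of a new query's plan before scheduling it.
print(model.predict(np.array([[1, 2, 2e5, 4]])))

Such a model is cheap to retrain per cluster, which is what lets it "instance optimize" to a specific database and workload instead of relying on one global model.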
