Showing posts from January, 2024

Fault-Tolerant Replication with Pull-Based Consensus in MongoDB

This paper, from NSDI 2021, presents the design and implementation of strongly consistent replication in MongoDB using a consensus protocol derived from Raft. Raft provides fault-tolerant state-machine-replication (SMR) over asynchronous networks. Raft (like most SMR protocols) uses push-based replication. But MongoDB uses pull-based replication scheme, so when integrating/invigorating MongoDB's SMR with Raft, this caused challenges. The paper focuses on examining and solving these challenges, and explaining the resulting MongoSMR protocol (my term, not the paper's).  The paper restricts itself to the strongest consistency level, linearizability, but it also talks about how serving weaker models interact/shape decisions made in MongoDB's replication protocol. The paper talks about extensions/optimizations of MongoDB SMR protocol, but I skip those for brevity. I also skip the evaluation section, and just focus on the core of the SMR protocol. Design Background Unlike conve

Looking Back at Postgres

This is a 2019 article by Joe Hellerstein, the Jim Gray Professor of Computer Science at UC Berkeley. Last year at Sigmod, Joe was awarded the Ted Codd innovation award where he gave an awesome overview of his research agenda. This article, written to be included in Stonebraker’s Turing Award book, provides a retrospective on Postgres project, which Stonebraker led from the mid-1980’s to the mid-1990’s. I love this article a lot, because as I have written before, context is my crack : "The more context I know, the better I become able to locate something almost spatially, and the more I can make sense of it. Even reading the history and motivation for the subject can give my understanding a big boost." The entire paper is context, and it even has a section titled "Context", how cool is that? The footnotes on the article are also excellent! Very interesting gems there as well. Disclaimer: I use a lot of text from the article to summarize it. The features and the imp

Scalable OLTP in the Cloud: What’s the BIG DEAL?

This paper is from Pat Helland, the apostate philosopher of database systems, overall a superb person, and a good friend of mine. The paper appeared this week at CIDR'24. (Check out the program for other interesting papers). The motivating question behind this work is: " What are the asymptotic limits to scale for cloud OLTP (OnLine Transaction Processing) systems? " Pat says that the CIDR 2023 paper "Is Scalable OLTP in the Cloud a Solved Problem?" prompted this question.  The answer to the question? Pat says that the answer lies in the joint responsibility of database and the application. If you know of Pat's work, which I have summarized several in this blog , you would know that Pat has been advocating along these lines before. But this paper provides a very crisp, specific, concrete answer. Read on for my summary of the paper. Disclaimer: This is a wisdom and technical information/detail packed 13-page paper, so I will try my best to summarize the sa

Oblivious Paxos: Privacy-Preserving Consensus Over Secret-Shares

This paper appeared in SOCC'23. The paper presents a primary-backup secret-shared state machine (PBSSM) architecture and the associated consensus protocol, Oblivious Paxos (OPaxos). OPaxos enables privacy-preserving consensus by allowing acceptors to safely and consistently agree on a secret-shared value without untrusted acceptors knowing the value. OPaxos protocol overview OPaxos uses (t, n) threshold secret-sharing. This means generating n secret-shares from a single secret value such that it is possible to reconstruct the secret with just t shares. In order to make (t, n) threshold secret-sharing play well with Paxos, the protocol requires that the cardinality of the intersection of any phase1 quorum and phase2 quorum is larger than t.  This can be achieved by choosing the quorum size as (n+t)/2. More accurately,  one quorum (say phase1) would have cardinality as the ceiling of (n+t)/2, and the other quorum (phase2) would have cardinality as floor of this (n+t)/2. The justi

Dude, where's my Emacs?

It had been 3 years since I had to setup a new laptop. Past Murat had left me helpful instructions , but I got a bad surprise while configuring my Emacs setup. My beloved starter-kits, which worked smoothly under Emacs 27, stopped working under Emacs 29. I tried to fix the errors, but this proved futile due to my limited elisp/emacs skills.  I explored alternative starter-kits to configure Emacs. Doom emerged as the dominant choice, but it appeared large, bloated, and complex for my liking. I tried a couple small starter kits, but I faced other problems and was unable to integrate my customizations and get to a reasonable setup.  I am an Emacs user for 25 years, and I know a thing or two, but it seems everyone's now an Emacs pro. I don't get how people are able to accept and work with those complicated config files/directories. I started to speculate. Maybe the ordinary users left, leaving behind the proficient Emacs enthusiasts, who kept writing more and more intricate confi

Recent reads

Here are brief reviews for the 6 books I read in the last couple  of months. How Not to Be Wrong: The Power of Mathematical Thinking (Jordan Ellenberg, 2014) This book discusses applying mathematical thinking in everyday situations for decision-making in order to avoid common pitfalls. The book shows how mathematical principles can be applied to real-world scenarios involving statistics, probability, and game theory. There are some fun real world examples discussed including lotteries, election polls, finance, tax schemes, and medicine. The book aims to show readers how embracing mathematical reasoning can guard them against being misled by faulty arguments or misinterpretations of data. The book is written in an engaging and accessible style, but it gets tedious as it is more than 450 pages long. Inevitably, it delves into repetitive explanations, creating monotony and disorganization. The book loses focus as it disperses its attention. Did we really need this long of a book? Couldn&

The checklist manifesto (Dr. Atul Gawande, 2009)

This book advocates for integrating checklists as potent safety and fault-tolerance tools across diverse domains. Atul Gawande, a prominent surgeon, enriches the narrative with numerous surgery cases to emphasize the effectiveness of checklists in handling intricate tasks. The surgery cases are graphic. The unemotional, matter-of-fact tone of the audiobook paradoxically intensified the emotional impact for me. Listening to those accounts gave me sweaty palms, and I instinctively clenched myself with pain. The graphic details effectively drive home the point. While listening, I couldn't help but lament that a simple checklist could have caught the mistake, averting all the blood and suffering. The book also delves into how the aviation industry has successfully embraced checklists to reduce errors and improve communication. Gawande argues that disciplined checklist use in aviation significantly enhances safety and reliability. He details how the aviation industry rigorously operati

Popular posts from this blog

The end of a myth: Distributed transactions can scale

Foundational distributed systems papers

Hints for Distributed Systems Design

Learning about distributed systems: where to start?

Metastable failures in the wild

Scalable OLTP in the Cloud: What’s the BIG DEAL?

SIGMOD panel: Future of Database System Architectures

Amazon Aurora: Design Considerations + On Avoiding Distributed Consensus for I/Os, Commits, and Membership Changes

Dude, where's my Emacs?

There is plenty of room at the bottom