Posts

A life well lived

What makes you content and fulfilled? What stops you from doing more of that? These are important questions, and you should take time to ask them every now and then and deliberate on your answers. But let's dissect these questions first. The words "content" and "fulfilled" both imply "a state of satisfaction." The word fulfilled actually means a bit more; it is satisfaction that comes from fully developing one's abilities or character. Being content and fulfilled are different from being happy or pleased. The latter imply a feeling of satisfaction (which is often fleeting), and not a state of satisfaction. Many things can give you an initial jolt of pleasure but leave you without any contentment. Watching mindless TV makes you feel good while doing it, but then you don't feel content or satisfied. Surfing the Internet for "tech news", memes, and political news feel good, because it gives you a dopamine hit, but then it leave...

Book review Draft No. 4 by John McPhee

This book is a collection/compilation of many essays McPhee wrote over his career as a prominent writer for the New Yorker. Here is one of those essays, which forms the basis for one of the chapters in the book and gives the book its title. The book is about the craft of writing, with its many aspects. But the book is stamped with McPhee's unmistakable take on writing, which puts a big emphasis on the structure/composition of the piece: You can build a strong, sound, and artful structure. You can build a structure in such a way that it causes people to want to keep turning pages. A compelling structure in nonfiction can have an attracting effect analogous to a story line in fiction.  I thoroughly enjoyed reading the book. To tell the truth, I was prejudiced going into this. I imagined McPhee would be a boring high-brow NewYorker writer, who would use fancy words I don't understand. Quite the contrary, he is a very colorful figure, a great observer, and insightful and sinc...

Paper review. Sharding the Shards: Managing Datastore Locality at Scale with Akkio

Image
This paper by Facebook, which appeared in OSDI'18, describes the data locality management service, Akkio. Akkio has been in production use at Facebook since 2014. It manages over 100PB of data, and processes over 10 million data accesses per second. Why do we need to manage locality?  Replicating all data to all datacenters is difficult to justify economically (due to the extra storage and WAN networking costs) when acceptable durability and request serving latency could be achieved with 3 replicas. It looks like Facebook had been doing full replication (at least for ViewState and AccessState applications discussed in the evaluation) to all the 6 datacenters back-in-the-day, but as the operation and the number of datacenters grew, this became untenable. So, let's find suitable home-bases for data, instead of fully replicating it to all datacenters. But the problem is access locality is not static. What was a good location/configuration for the data ceases to become suita...

Paper review: Probabilistically Bounded Staleness for Practical Partial Quorums

Image
There is a fundamental trade-off between operation latency and data consistency in distributed database replication. The PBS paper (VLDB'12) examines this trade-off for partial quorum replicated data stores. Quorum systems We can categorize quorum systems into strict versus partial quorums. Strict quorum systems ensure strong consistency  by ensuring that read & write replica sets overlap: $R + W > N$. Here N is the total number of replicas in the quorum, R is the number of replicas that need to reply to a read query, and W is the number of replicas that need to reply to a write query. Employing partial quorums  can lower latency by requiring fewer replicas to respond, but R and W need not overlap: $R+W \leq N$. Such partial quorums offer eventual consistency . Here is a visual representation of an expanding quorum system. The coordinator forwards a write requests to all N replicas, and wait for W acknowledgements for responding back to the client for completi...

Paper review. An Empirical Study on Crash Recovery Bugs in Large-Scale Distributed Systems

Image
Crashes happen. In fact they occur so commonly that they are classified as anticipated faults. In a large cluster of several hundred machines, you will have one node crashing every couple of hours. Unfortunately, as this paper shows, we are still not very competent at handling crash failures. This paper from 2018 presents a comprehensive empirical study of 103 crash recovery  bugs from 4 popular open-source distributed systems: ZooKeeper, Hadoop MapReduce, Cassandra, and HBase. For all the studied bugs, they analyze their root causes, triggering conditions, bug impacts and fixing. Summary of the findings Crash recovery bugs are caused by five types of bug patterns: incorrect backup (17%) incorrect crash/reboot detection (18%) incorrect state identification (16%) incorrect state recovery (28%)  concurrency (21%) Almost all (97%) of crash recovery bugs involve no more than four nodes. This finding indicates that we can detect crash recovery bugs in a small set of...

Paper review. Serverless computing: One step forward, two steps back

Serverless computing offers the potential to program the cloud in an autoscaling and pay-only-per-invocation manner. This paper from UC Berkeley (to appear at CIDR 19) discusses limitations in the first-generation serverless computing, and argues that its autoscaling potential is at odds with data-centric and distributed computing. I think the paper is written to ignite debate on the topic, so here I am writing some of my takes on these arguments. Overall, I think the paper could have been written in a more constructive tone. After you read the entire paper, you get a sense that the authors want to open constructive dialogue for improving the state of serverless rather than to crucify it. However, the overly-critical tone of the paper leads to some unfair claims,  not only about serverless, but also about cloud computing as well. This paragraph in the introduction is a good example. New computing platforms have typically fostered innovation in programming languages and environ...

Two-phase commit and beyond

Image
In this post, we model and explore the two-phase commit protocol using TLA+. The two-phase commit protocol is practical and is used in many distributed systems today. Yet it is still brief enough that we can model it quickly, and learn a lot from modeling it. In fact, we see that through this example we get to illustrate a fundamental impossibility result in distributed systems directly. The two-phase commit problem A transaction is performed over resource managers (RMs).  All RMs must agree on whether the transaction is committed  or aborted . The transaction manager (TM)  finalizes the transaction with a commit or abort decision. For the transaction to be committed, each participating RM must be prepared to commit it. Otherwise, the transaction must be aborted. Some notes about modeling We perform this modeling in the shared memory model, rather than in message passing, to keep things simple. This also ensures that the model checking is fast. But we add no...

Popular posts from this blog

Hints for Distributed Systems Design

The Agentic Self: Parallels Between AI and Self-Improvement

Learning about distributed systems: where to start?

Foundational distributed systems papers

Building a Database on S3

Cloudspecs: Cloud Hardware Evolution Through the Looking Glass

TLA+ mental models

Advice to the young

Analyzing Metastable Failures in Distributed Systems

My Time at MIT