
Showing posts from January, 2019

Paper review: Probabilistically Bounded Staleness for Practical Partial Quorums

There is a fundamental trade-off between operation latency and data consistency in distributed database replication. The PBS paper (VLDB'12) examines this trade-off for partial quorum replicated data stores.

Quorum systems

We can categorize quorum systems into strict versus partial quorums. Strict quorum systems provide strong consistency by ensuring that read and write replica sets overlap: $R + W > N$. Here N is the total number of replicas in the quorum, R is the number of replicas that need to reply to a read query, and W is the number of replicas that need to reply to a write query. Employing partial quorums can lower latency by requiring fewer replicas to respond, but then R and W need not overlap: $R + W \leq N$. Such partial quorums offer only eventual consistency.

In an expanding quorum system, the coordinator forwards a write request to all N replicas and waits for W acknowledgments before reporting the write as complete to the client.
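To make the overlap condition concrete, here is a minimal sketch (my own illustration, not code from the paper) of the classic probabilistic-quorum calculation: if the read and write replica sets are chosen uniformly at random, the probability that a read misses the latest write is $\binom{N-W}{R} / \binom{N}{R}$, which is zero exactly when $R + W > N$.

```python
from math import comb

def stale_read_probability(n: int, r: int, w: int) -> float:
    """Chance that a read quorum of size r sees none of the w replicas
    that acknowledged the latest write, when both quorums are drawn
    uniformly at random from n replicas."""
    if r + w > n:
        return 0.0  # strict quorum: the read and write sets must overlap
    # The read misses the write iff all r read replicas come from the
    # n - w replicas that did not acknowledge the write.
    return comb(n - w, r) / comb(n, r)

print(stale_read_probability(3, 1, 1))  # R=W=1, N=3 -> 0.666...
print(stale_read_probability(3, 2, 2))  # strict quorum -> 0.0
```

With N=3 and R=W=1, roughly two thirds of reads can return stale data immediately after a write. PBS refines this single-shot model by quantifying staleness in time (t-visibility) and in versions (k-staleness).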

Paper review. An Empirical Study on Crash Recovery Bugs in Large-Scale Distributed Systems

Crashes happen. In fact, they occur so commonly that they are classified as anticipated faults. In a large cluster of several hundred machines, you will have one node crashing every couple of hours. Unfortunately, as this paper shows, we are still not very competent at handling crash failures. This paper from 2018 presents a comprehensive empirical study of 103 crash recovery bugs from 4 popular open-source distributed systems: ZooKeeper, Hadoop MapReduce, Cassandra, and HBase. For all the studied bugs, the authors analyze the root causes, triggering conditions, bug impacts, and fixes.

Summary of the findings

Crash recovery bugs are caused by five types of bug patterns: incorrect backup (17%), incorrect crash/reboot detection (18%), incorrect state identification (16%), incorrect state recovery (28%), and concurrency (21%). Almost all (97%) of crash recovery bugs involve no more than four nodes. This finding indicates that we can detect crash recovery bugs in a small set of nodes, rather than requiring a large cluster.
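As a back-of-the-envelope check on the crash-frequency claim above (the figures below are my assumptions, not the paper's): if each machine crashes independently about once every 1000 hours, a 500-node cluster sees roughly one node crash every two hours.

```python
# Hypothetical figures for illustration; these numbers are my assumptions,
# not taken from the paper.
nodes = 500                  # "several hundred machines"
per_node_mtbf_hours = 1000   # assume each machine crashes once every ~6 weeks
mean_hours_between_crashes = per_node_mtbf_hours / nodes
print(f"~one node crash every {mean_hours_between_crashes:.1f} hours")  # ~2.0
```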

Paper review. Serverless computing: One step forward, two steps back

Serverless computing offers the potential to program the cloud in an autoscaling, pay-only-per-invocation manner. This paper from UC Berkeley (to appear at CIDR'19) discusses limitations of first-generation serverless computing, and argues that its autoscaling potential is at odds with data-centric and distributed computing. I think the paper is written to ignite debate on the topic, so here I am writing some of my takes on these arguments. Overall, I think the paper could have been written in a more constructive tone. After you read the entire paper, you get the sense that the authors want to open a constructive dialogue for improving the state of serverless rather than to crucify it. However, the overly critical tone of the paper leads to some unfair claims, not only about serverless but also about cloud computing in general. This paragraph in the introduction is a good example: "New computing platforms have typically fostered innovation in programming languages and environments."
