Posts

DDIA: Chp 9. Consistency and Consensus

Chapter 9 of the Designing Data Intensive Applications (DDIA) book has the following 4 sections (which contain a total of 10 subsections):
- Consistency guarantees
- Linearizability
- Ordering guarantees
- Distributed transactions and consensus

TMI (Too much info)
The chapter tries to do too much. Almost an entire semester of distributed systems content is force-jammed into this chapter. I don't think the discord reading group will be able to complete this chapter and make sense of what is going on. For a more manageable load, much of this chapter should be moved to other chapters. The chapter starts by explaining linearizability. It would have been better to explain this in Chapter 5, Replication. Linearizability sounds easy, but it takes time to internalize. I had discussed linearizability here, and later revisited it with this paper. And of course it was mentioned in many other posts. The chapter continues with the CAP tradeoff, rightly calling it "the unhelpful CAP theorem".
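To make the definition concrete, here is a minimal brute-force linearizability checker for a single register. This is my own illustrative sketch, not code from the book: it searches for a total order of the operations that respects real-time precedence and ordinary register semantics.

```python
from itertools import permutations

def is_linearizable(history, initial=None):
    """history: list of (invoke_time, return_time, kind, value) tuples,
    where kind is "write" or "read". Brute force, fine for a handful of ops."""
    for order in permutations(history):
        # Real-time constraint: if a returned before b was invoked,
        # a must appear before b in the linearization.
        if any(a[1] < b[0] and i > j
               for i, a in enumerate(order)
               for j, b in enumerate(order)):
            continue
        # Sequential register semantics: every read sees the latest write.
        current, legal = initial, True
        for _, _, kind, value in order:
            if kind == "write":
                current = value
            elif value != current:
                legal = False
                break
        if legal:
            return True
    return False

# A write of 1 completes at t=3, yet a read starting at t=4 still returns the
# old value (None): no valid linearization exists, so this history is not linearizable.
history = [(0, 3, "write", 1), (1, 2, "read", 1), (4, 5, "read", None)]
print(is_linearizable(history))  # False
```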

Auto-WLM: machine learning enhanced workload management in Amazon Redshift

This paper appeared in Sigmod'23.

What?
Auto-WLM is a machine learning based *automatic workload manager* currently used in production in Amazon Redshift. I thought this would be a machine learning paper, you know, deep learning and stuff. But this paper turned out to be a practical/applied data systems paper. At its core, this paper is about improving query performance and resource utilization in data warehouses, possibly the first ML-based workload manager for a database system in production at scale. They are not using deep learning, and rightly so! The main takeaway from the paper is that locally-trained simple models (using XGBoost, a decision tree-based model built from the query plan trees) outperformed globally trained models, likely due to their ability to "instance optimize" to specific databases and workloads. They are using simple ML. And it works.

Why?
This is an important problem. If tuning is done prematurely, resources are unnecessarily wasted, and if it is done too late, overall
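As a rough sketch of the idea (not the paper's actual pipeline; the feature names and numbers below are made up), a locally-trained XGBoost regressor can map query-plan features to a predicted execution time, which can then drive admission and queueing decisions:

```python
import numpy as np
import xgboost as xgb

# Hypothetical per-query plan features: (num_joins, num_scans, estimated_rows, bytes_scanned)
X_train = np.array([
    [1, 2, 1e4, 5e6],
    [3, 4, 1e7, 2e9],
    [0, 1, 5e2, 1e5],
    [2, 3, 1e6, 8e8],
])
y_train = np.array([0.8, 45.0, 0.1, 12.0])  # observed execution times in seconds

# A small decision-tree ensemble, retrained locally per cluster/workload.
model = xgb.XGBRegressor(n_estimators=50, max_depth=4)
model.fit(X_train, y_train)

# The predicted cost of an incoming query can be used, e.g., to route
# short-predicted queries to a fast lane and gate heavy ones behind admission control.
new_query = np.array([[1, 2, 2e4, 6e6]])
print(model.predict(new_query))
```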

DDIA: Chp 8. The Trouble with Distributed Systems

This is a long chapter. It touches on so many things. Here is the table of contents:
- Faults and partial failures
- Unreliable networks
  - detecting faults
  - timeouts and unbounded delays
  - sync vs async networks
- Unreliable clocks
  - monotonic vs time-of-day clocks
  - clock sync and accuracy
  - relying on sync clocks
  - process pauses
- Knowledge, truth, and lies
  - truth defined by majority
  - byzantine faults
  - system model and reality

I don't know if listing a deluge of problems is the best way to approach this chapter. Reading these is fine, but it doesn't mean you learn them. I think you need time and hands-on work to internalize these.

What can go wrong? Computers can crash. Unfortunately they don't fail cleanly. Fail-fast is failing fast! And again unfortunately, partial failures (limping computers) are very difficult to deal with. Even worse, with transistor density so high, we now need to deal with silent failures. We have memory corruption and even silent faults from CPUs. The HPTS'24 s
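To make one of these points concrete, here is a small Python illustration (my own, not from the book) of the monotonic vs time-of-day clock distinction: durations, and hence timeouts, should be measured with the monotonic clock, because the wall clock can jump when NTP adjusts it.

```python
import time

def measure_with_wall_clock(fn):
    start = time.time()          # time-of-day clock: subject to NTP steps, can even go backwards
    fn()
    return time.time() - start   # may be negative or wildly off if the clock was adjusted meanwhile

def measure_with_monotonic_clock(fn):
    start = time.monotonic()     # monotonic clock: only moves forward, safe for measuring durations
    fn()
    return time.monotonic() - start

print(measure_with_monotonic_clock(lambda: time.sleep(0.1)))  # ~0.1
```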

DDIA: Chp 7. Transactions (Part 2): Serializability

We are continuing from the first part of our Chapter 7 review. Serializable isolation ensures that the final result of concurrent transactions is equivalent to what it would have been had they been run one at a time, without any concurrency. This eliminates concurrency anomalies, since the transactions behave as they would in a sequential environment. Databases offering serializability typically use one of three methods:
- Executing transactions literally in a serial order
- Two-phase locking (2PL)
- Optimistic concurrency control, like serializable snapshot isolation (SSI)

For now, we will focus on single-node databases. The book discusses how these methods apply to distributed systems in Chapter 9.

Actual Serial Execution
Serial transaction execution is used in systems like VoltDB/H-Store, Redis, and Datomic. While limiting the database to a single thread can boost performance by eliminating coordination overhead, it restricts throughput to the capacity of one CPU core. Unlike tradition
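For illustration, here is a toy single-node 2PL sketch (my own, heavily simplified: exclusive locks for both reads and writes, no deadlock detection): locks are acquired as data is accessed (the growing phase) and released only at commit (the shrinking phase).

```python
import threading

class LockManager:
    def __init__(self):
        self._locks = {}                 # key -> re-entrant lock
        self._mutex = threading.Lock()

    def acquire(self, key):
        with self._mutex:
            lk = self._locks.setdefault(key, threading.RLock())
        lk.acquire()                     # blocks if another transaction holds the key
        return lk

class Transaction:
    def __init__(self, store, lock_manager):
        self.store = store
        self.lm = lock_manager
        self.held = []                   # every lock acquired is held until commit

    def read(self, key):
        self.held.append(self.lm.acquire(key))   # growing phase
        return self.store.get(key)

    def write(self, key, value):
        self.held.append(self.lm.acquire(key))
        self.store[key] = value

    def commit(self):
        for lk in self.held:             # shrinking phase: release everything at the end
            lk.release()
        self.held.clear()

store, lm = {"x": 1}, LockManager()
txn = Transaction(store, lm)
txn.write("x", txn.read("x") + 1)        # read lock then write lock on "x" (re-entrant)
txn.commit()
print(store)  # {'x': 2}
```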

DDIA: Chp 7. Transactions (Part 1)

Chapter 7 of the Designing Data Intensive Applications (DDIA) book discusses transactions, yay! Transactions in database systems group multiple operations into a logical unit, a box of operations if you will. They simplify error handling and help manage concurrency issues. See the Gray-Reuter book's introduction and fault-tolerance sections for the motivation on this. The ACID properties (Atomicity, Consistency, Isolation, Durability) are often used to describe transaction characteristics, but their precise meaning has been left fuzzy by implementations. Atomicity ensures all operations in a transaction either complete fully or not at all. It's not about concurrency, but about handling faults during transaction processing. Consistency is an application-level concern, ensuring that transactions maintain application-defined data integrity rules. This consistency is not the same consistency as in CAP, and it is best to forget about the C in ACID, and understand
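Here is a minimal sketch of atomicity using Python's standard-library sqlite3 (my example, not from the book): the two updates of a transfer either commit together or roll back together when a fault hits mid-transaction, so the application-level invariant (the total balance) is preserved.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (name TEXT PRIMARY KEY, balance INTEGER)")
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("alice", 100), ("bob", 0)])
conn.commit()

def transfer(conn, src, dst, amount):
    try:
        with conn:  # transaction scope: commit on success, rollback on exception
            conn.execute("UPDATE accounts SET balance = balance - ? WHERE name = ?", (amount, src))
            conn.execute("UPDATE accounts SET balance = balance + ? WHERE name = ?", (amount, dst))
            raise RuntimeError("simulated crash before commit")
    except RuntimeError:
        pass

transfer(conn, "alice", "bob", 50)
# Neither update is visible: the partial work was rolled back atomically.
print(list(conn.execute("SELECT * FROM accounts")))  # [('alice', 100), ('bob', 0)]
```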

700

This is a special milestone: the 700th post, after 14 years of blogging here. 700 posts is a lot of blogging. But that comes down to 50 posts per year, which is about one post a week, totally doable, right? If I can get another 14 years of blogging at this rate, I will get to 1400. That is more than the EWD documents in terms of the sheer number of posts, not that I am comparing my posts with the EWD documents. (BTW, of course I had a blog post about the EWDs.) I hope I can go for many more years, because I do enjoy this. Writing in the open and getting occasional feedback is nice. But I don't do it for the feedback. I wrote about "Why I blog" recently, so I am not going to rehash that topic. Instead, a good use of this post may be to serve as an index to other posts in this blog. I noticed I keep referring people to my previous posts where I covered a question before. 700 posts is a lot of posts, so maybe an index at this snapshot can help people get more out of this blog. Let

SRDS Day 2

Ok, continuing on the SRDS day 1 post, I bring you SRDS day 2. Here are my summaries from the keynote, and from the talks for which I took some notes.

Mahesh Balakrishnan's Keynote
Mahesh's keynote was titled "An Hourglass Architecture for Distributed Systems". His talk focused on the evolution of his perspective on distributed systems research and the importance of abstraction in managing complexity. He began by reflecting on how, in his PhD days, he believed the goal was to develop better protocols with nice properties. He said he later realized that the true challenge in distributed systems research lies in creating new abstractions that simplify these complex systems. Complexity creeps into distributed systems through failures, asynchrony, and change. Mahesh also confessed that he didn't realize the extent of the importance of managing change until his days in industry. While other fields in computer science have successfully built robust abstractions (su

SRDS Day 1

This week, I was at the 43rd International Symposium on Reliable Distributed Systems (SRDS 2024) in Charlotte, NC. The conference venue was at UNC Charlotte, which has a large and beautiful campus. I was the Program Committee chair for SRDS'24 along with Silvia Bonomi. Marco Vieira and Bojan Cukic were the general co-chairs. A lot of work goes into organizing a conference. Silvia and I recruited the reviewers (27 reviewers, plus external reviewers), oversaw the reviewing process (using the HotCRP conference review management system), and came up with the conference program for the accepted papers. We got 87 submissions; 2 papers were desk-rejected. Each paper received 3 reviews, which brought it up to 261 reviews in total. 27 papers were finally accepted (an acceptance rate of ~30%), and 1 was withdrawn after the notification. Marco and Bojan took care of everything else, including all local arrangements and website management, which are a lot more involved than you would guess.

DDIA: Chp 6. Partitioning

Chapter 6 of the Designing Data Intensive Applications (DDIA) book discusses partitioning, a key technique for scaling to large datasets and high query throughput in distributed databases. By breaking data into smaller partitions, it can be distributed across multiple nodes in a shared-nothing cluster. This allows the storage and processing load to be spread across many disks and processors. Fun fact: partitioned databases were pioneered in the 1980s by products such as Teradata and Tandem NonStop SQL, and were rediscovered in the 2000s by NoSQL databases and Hadoop-based data warehouses. Partitioning is often combined with replication, where each partition is stored on multiple nodes for fault tolerance. In a typical setup, each partition has a leader node that handles writes, with follower nodes replicating the data.

Partitioning of Key-Value Data
There are two main approaches to partitioning key-value data. Partitioning by key range: assign a continuous range of keys to each partition, like v
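As an illustrative sketch (the boundaries and keys below are made up), the two approaches can be contrasted in a few lines: key-range partitioning keeps adjacent keys together and supports range scans but risks hot spots, while hash partitioning spreads load evenly but scatters key ranges across partitions.

```python
import bisect
import hashlib

# Hypothetical split points: partition 0 holds first letters before "g",
# partition 1 from "g" to before "n", partition 2 from "n" to before "t",
# and partition 3 the rest.
RANGE_BOUNDARIES = ["g", "n", "t"]

def range_partition(key):
    return bisect.bisect_right(RANGE_BOUNDARIES, key[0])

def hash_partition(key, num_partitions=4):
    digest = hashlib.md5(key.encode()).hexdigest()
    return int(digest, 16) % num_partitions

for key in ["apple", "banana", "cherry", "grape", "zebra"]:
    print(f"{key}: range partition {range_partition(key)}, hash partition {hash_partition(key)}")
```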

Popular posts from this blog

Hints for Distributed Systems Design

Learning about distributed systems: where to start?

Making database systems usable

Looming Liability Machines (LLMs)

Foundational distributed systems papers

Advice to the young

Linearizability: A Correctness Condition for Concurrent Objects

Scalable OLTP in the Cloud: What’s the BIG DEAL?

Understanding the Performance Implications of Storage-Disaggregated Databases

Designing Data Intensive Applications (DDIA) Book