Posts

Showing posts from July, 2024

Advice to the young

I notice I haven't written any advice posts recently. Here is a collection of my pre-2020 advice posts. I've been feeling all this elderly wisdom pent up in me, ready to pour out at any moment. So here it goes. Get ready to quench your thirst from my fount of wisdom. No, man, think for yourself, and only get what works for you.

It is called foundations, not theory

Foundations of computer science (or rather any field of study) are the most important topics you can learn. These lay down the frame of thinking/perspective for that area of study. Yet I am saddened to hear these called "theory" and labeled "unpractical". This couldn't be farther from the truth. Take a look at how I recommend studying distributed systems. Don't you dare call this "theory" and "unpractical". This lays the bedrock that you build your practice on. Don't skimp on the foundations. Don't build your home on quicksand.

Keep your hands dirty, your mind cl

Understanding the Performance Implications of Storage-Disaggregated Databases

Storage-compute disaggregation in databases has emerged as a pivotal architecture in cloud environments, as evidenced by Amazon (Aurora), Microsoft (Socrates), Google (AlloyDB), Alibaba (PolarDB), and Huawei (Taurus). This approach decouples compute from storage, allowing compute and storage resources to scale independently and elastically. It provides fault-tolerance at the storage level. You can then share the storage with other services, such as adding read-only replicas for the databases. You can even use the storage level for easier sharding of your database. Finally, you can also use it to export a changelog asynchronously to feed into peripheral cloud services, such as analytics. Disaggregated architecture was the topic of a SIGMOD'23 panel. I think this quote summarizes the industry's thinking on the topic. "Disaggregated architecture is here, and is not going anywhere. In a disaggregated architecture, storage is fungible, and computing scales independently.

Unanimous 2PC: Fault-tolerant Distributed Transactions Can be Fast and Simple

This paper (PaPoC'24) is a cute paper. It isn't practical or very novel, but I think it is a good thought-provoking paper. It did bring together the work/ideas around transaction commit for me. It also has TLA+ specs in the appendix, which could be helpful to you in your modeling adventures. I don't like the paper's introduction and motivation sections, so I will explain these my way.

The problem with 2PC

2PC is a commit protocol. A coordinator (transaction manager, TM) consistently decides whether to commit or abort based on feedback from the participants (resource managers, RMs). If one RM sees/applies a commit, all RMs should eventually apply commit. If one RM sees/applies an abort, all RMs should eventually apply abort. The figure below shows 2PC in action. (If you are looking for a deeper dive into 2PC, read this.) There are three different transactions going on here. All transactions access two RMs, X and Y. T1, the blue transaction, is coordinated by C1, and it ends up committing.
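To make the commit rule above concrete, here is a minimal in-memory sketch of plain 2PC in Python. It is not the paper's Unanimous 2PC, and it ignores failures, timeouts, and messaging entirely. The names `ResourceManager`, `two_phase_commit`, and `will_vote_yes` are made up for illustration.

```python
from enum import Enum


class Vote(Enum):
    YES = "yes"
    NO = "no"


class Decision(Enum):
    COMMIT = "commit"
    ABORT = "abort"


class ResourceManager:
    """Hypothetical participant (RM); votes in phase 1, applies the decision in phase 2."""

    def __init__(self, name, will_vote_yes=True):
        self.name = name
        self.will_vote_yes = will_vote_yes  # assumption: the vote is fixed up front
        self.state = "working"

    def prepare(self):
        # Phase 1: the RM either promises it can commit, or aborts unilaterally.
        if self.will_vote_yes:
            self.state = "prepared"
            return Vote.YES
        self.state = "aborted"
        return Vote.NO

    def finish(self, decision):
        # Phase 2: every RM applies the coordinator's single decision.
        self.state = "committed" if decision is Decision.COMMIT else "aborted"


def two_phase_commit(rms):
    """Coordinator (TM): commit only if every RM voted YES, otherwise abort."""
    votes = [rm.prepare() for rm in rms]
    if all(vote is Vote.YES for vote in votes):
        decision = Decision.COMMIT
    else:
        decision = Decision.ABORT
    for rm in rms:
        rm.finish(decision)
    return decision


if __name__ == "__main__":
    x, y = ResourceManager("X"), ResourceManager("Y")
    print(two_phase_commit([x, y]), x.state, y.state)
```

With two RMs that both vote yes, both end up committed. If either one votes no, both end up aborted, which is the all-or-nothing property described above. What this sketch leaves out, a coordinator or RM crashing between the two phases, is exactly where 2PC gets into trouble.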

Popular posts from this blog

Hints for Distributed Systems Design

Learning about distributed systems: where to start?

Making database systems usable

Looming Liability Machines (LLMs)

Foundational distributed systems papers

Advice to the young

Linearizability: A Correctness Condition for Concurrent Objects

Scalable OLTP in the Cloud: What’s the BIG DEAL?

Understanding the Performance Implications of Storage-Disaggregated Databases

Designing Data Intensive Applications (DDIA) Book