On distributed systems broadly defined and other curiosities. The opinions on this site are my own.
Talk on innovation
Get link
Facebook
Twitter
Pinterest
Email
Other Apps
-
Today I gave a talk to juniors and seniors at Bilkent University, CS department on innovation and entrepreneurship. Here is the link to the pdf for those who requested it.
This is definitely not a "learn distributed systems in 21 days" post. I recommend a principled, from the foundations-up, studying of distributed systems, which will take a good three months in the first pass, and many more months to build competence after that. If you are practical and coding oriented you may not like my advice much. You may object saying, "Shouldn't I learn distributed systems with coding and hands on? Why can I not get started by deploying a Hadoop cluster, or studying the Raft code." I think that is the wrong way to go about learning distributed systems, because seeing similar code and programming language constructs will make you think this is familiar territory, and will give you a false sense of security. But, nothing can be further from the truth. Distributed systems need radically different software than centralized systems do. --A. Tannenbaum This quotation is literally the first sentence in my distributed systems syllabus. Inst
This is with apologies to Butler Lampson, who published the " Hints for computer system design " paper 40 years ago in SOSP'83. I don't claim to match that work of course. I just thought I could draft this post to organize my thinking about designing distributed systems and get feedback from others. I start with the same disclaimer Lampson gave. These hints are not novel, not foolproof recipes, not laws of design, not precisely formulated, and not always appropriate. They are just hints. They are context dependent, and some of them may be controversial. That being said, I have seen these hints successfully applied in distributed systems design throughout my 25 years in the field, starting from the theory of distributed systems (98-01), immersing into the practice of wireless sensor networks (01-11), and working on cloud computing systems both in the academia and industry ever since. These heuristic principles have been applied knowingly or unknowingly and has proven
I talked about the importance of reading foundational papers last week. To followup, here is my compilation of foundational papers in the distributed systems area. (I focused on the core distributed systems area, and did not cover networking, security, distributed ledgers, verification work etc. I even left out distributed transactions, I hope to cover them at a later date.) I classified the papers by subject, and listed them in chronological order. I also listed expository papers and blog posts at the end of each section. Time and State in Distributed Systems Time, Clocks, and the Ordering of Events in a Distributed System. Leslie Lamport, Commn. of the ACM, 1978. Distributed Snapshots: Determining Global States of a Distributed System. K. Mani Chandy Leslie Lamport, ACM Transactions on Computer Systems, 1985. Virtual Time and Global States of Distributed Systems. Mattern, F. 1988. Practical uses of synchronized clocks in distributed systems. B. Liskov, 1991. Expository papers
This paper appeared in OSDI'22. There is a great summary of the paper by Aleksey (one of the authors and my former PhD student, go Aleksey!). There is also a great conference presentation video from Lexiang. Below I will provide a brief overview of the paper followed by my discussion points. This topic is very interesting and important, so I hope you have fun learning about this. Metastability concept and categories Metastable failure is defined as permanent overload with low throughput even after the fault-trigger is removed. It is an emergent behavior of a system, and it naturally arises from the optimizations for the common case that lead to sustained work amplification. In this paper, the authors are able to capture/abstract the system behavior of interest in terms of two parameters, the load and capacity. If the load is above capacity, you have work piling up, right? Or if the capacity drops under the sustained load level, the same effect, right? Both of these create a tem
NVDIA CEO Jensen Huang recently made very contraversial remarks : "Over the course of the last 10 years, 15 years, almost everybody who sits on a stage like this would tell you that it is vital that your children learn computer science, and everybody should learn how to program. And in fact, it’s almost exactly the opposite. It is our job to create computing technology such that nobody has to program and that the programming language is human. Everybody in the world is now a programmer. This is the miracle of artificial intelligence." I am not going to wise crack and say that this is power poisioning and this is what happens when your company valuation more than triples in a year and surpasses Amazon and Google. (Although I don't discount this effect completely.) Jensen is very smart and also has some great wisdom , so I think we should give this the benefit of doubt and try to respond in a thoughtful manner. A response is warranted because this statement got a lot of p
This paper is from Pat Helland, the apostate philosopher of database systems, overall a superb person, and a good friend of mine. The paper appeared this week at CIDR'24. (Check out the program for other interesting papers). The motivating question behind this work is: " What are the asymptotic limits to scale for cloud OLTP (OnLine Transaction Processing) systems? " Pat says that the CIDR 2023 paper "Is Scalable OLTP in the Cloud a Solved Problem?" prompted this question. The answer to the question? Pat says that the answer lies in the joint responsibility of database and the application. If you know of Pat's work, which I have summarized several in this blog , you would know that Pat has been advocating along these lines before. But this paper provides a very crisp, specific, concrete answer. Read on for my summary of the paper. Disclaimer: This is a wisdom and technical information/detail packed 13-page paper, so I will try my best to summarize the sa
This paper appeared in VLDB'17. The paper presents NAM-DB, a scalable distributed database system that uses RDMA (mostly 1-way RDMA) and a novel timestamp oracle to support snapshot isolation (SI) transactions. NAM stands for network-attached-memory architecture, which leverages RDMA to enable compute nodes talk directly to a pool of memory nodes. Remote direct memory access (RDMA) allows bypassing the CPU when transferring data from one machine to another. This helps relieve a major factor in scalability of distributed transactions: the CPU overhead of the TCP/IP stack. With so many messages to process, CPU may spend most of the time serializing/deserializing network messages, leaving little room for the actual work. We had seen this phenomena first hand when we were researching the performance bottlenecks of Paxos protocols. This paper reminds me of the "Is Scalable OLTP in the Cloud a Solved Problem? (CIDR 2023)" which we reviewed recently. The two papers share one
I mentioned this panel in my SIGMOD/PODS day 2 writeup. The panel consisted of (from right to left) Gustavo Alonso (ETH), Swami Sivasubramanian (AWS), Anastasia Ailamaki (EPFL), Raghu Ramakrishnan (Microsoft), and Sam Madden (MIT). Sailesh Krishnamurthy from Google was unable to attend, so Anastasia who is on sabbatical at Google, also covered for him. The panel started with 5 minutes speeches from participants. The panel was lively and playful. The panelists tried to make their speeches controversial, and have zingers for each other. Sam talked about the demise of DBMS monolith. He said, the Oracle of 90s is dead because it was incompatible with modern data ecosystem, and the modern DBMSs are shifting towards disaggregated designs. When Sam was a PhD student under Eric Brewer, Eric said we should build disaggregated databases, and people said that won't happen. But we are there now: Disaggregation has arrived! While Sam mentioned about the benefits of disaggregation, he also ca
My blog has been going for 14 years now, and has just passed 4 million pageviews. Yay! I remember the 1 million pageviews moment in 2017 ! The main reason I was able to persist for so long is because I blog for selfish reasons. Let me try to unpack why I blog, and why I keep blogging. I write for myself The audience I have in mind is myself. I blog to clarify my understanding and thinking about a topic. Reading a research/technical paper is already time consuming. I can't do it in less than 4 hours. Period. I love learning. And I am fortunate that I get to read research papers as part of my work. I double-dip on this effort to blog about them, to improve my understanding of these papers. Writing a blog post is the final step in my pipeline for reading a paper. I think my blog reviews of papers hits a good niche. Research papers are written for the wrong audience (or rather maybe the right audience but for the wrong reason): they are written to please 3 specific expert reviewers
This is a pun on the saying "there is always room at the top". This is also the title of a famous Feynman lecture from 1959 , where he made a case for nanotechnology. In this post, I will try to argue that there is plenty of room at the bottom for distributed algorithms. Most work on distributed algorithms are done at a high-level abstraction plane. These high level solutions do not transfer well to the implementation level, and this opens a lot of space to explore at the implementation level. But this is not just an opportunistic argument. It is imperative to target the implementation level with our distributed algorithms work, otherwise they remain as theoretical, unused, inapplicable. Let me try to demonstrate using consensus protocols as examples. Someone else can make the same case using another subdomain. Be mindful of what is swept under the rug What is succinct at the high level could be very hard to implement and get right at the low level. Paxos seems simple , bu
Comments
Thanks for the post