Posts

Showing posts from January, 2015

Dijkstra's stabilizing token ring algorithm

Image
One of the classical algorithms I teach in my distributed systems class is Dijkstra's stabilizing token ring algorithm. This algorithm has started the self-stabilization field as a subfield of fault-tolerance. And, it still receives interest even after 40 years. There has been probably hundreds of self-stabilization papers that revisits Dijkstra's stabilizing token ring algorithm as part of a solution or as part of a case study. This algorithm never gets old for me as well. I still enjoy talking about this algorithm in class and thinking about it once in a while. I guess this is because it is a very elegant algorithm. Linked is Dijkstra's original paper introducing the algorithm. This paper also includes two variants of the stabilizing token ring algorithm, 3-state and 4-state token ring algorithms. Dijkstra would later do a followup writeup, titled, "A belated proof of self-stabilization" , where he gave a proof of stabilization for 3-state token ring progra...

My Distributed Systems Seminar reading list for Spring 2015

Below is the list of papers I plan to discuss in my distributed systems seminar this semester. If you have some suggestions on other good/recent papers to cover, please let me know in the comments. Perspectives on the CAP Theorem Consistency, availability, and convergence. Salt: combining ACID and BASE in a distributed database Extracting More Concurrency from Distributed Transactions Eidetic Systems Simple Testing Can Prevent Most Critical Failures: An Analysis of Production Failures in Distributed Data-Intensive Systems The Mystery Machine: End-to-end Performance Analysis of Large-scale Internet Services All File Systems Are Not Created Equal: On the Complexity of Crafting Crash-Consistent Applications GraphX: Graph Processing in a Distributed Dataflow Framework Architecture of a database system Arrakis: The Operating System is the Control Plane Fast Databases with Fast Durability and Recovery Through Multicore Parallelism Project Adam: Building an Efficient and Scala...

My experience with using TLA+ in distributed systems class

Image
I used TLA+ in my distributed system class in Fall 2014. ( To learn the backstory on this, read my pre-semester TLA+ post. ) In short, I loved the experience and I am hooked. Integrating TLA+ to the class gave students a way to get a hands-on experience in algorithms design and correctness verification. Even for a sophisticated algorithm, such as Paxos, I can refer the students to TLA+ to practice and play with the algorithm, so they can internalize what is going on. Throughout the semester as I had more chance to work with TLA+, I started growing a library of TLA+ modeling of distributed algorithms. Next time I teach the class, I hope to provide the students with a TLA+ model of each algorithm I cover in the class. The student feedback was also positive. The students liked TLA+ since it gave them a way to experiment and supported them in reasoning about the algorithms. Several students complained of the learning curve. They wanted me to cover TLA+ in more detail in the class. (The...

Popular posts from this blog

Hints for Distributed Systems Design

My Time at MIT

Making database systems usable

Looming Liability Machines (LLMs)

Learning about distributed systems: where to start?

Advice to the young

Scalable OLTP in the Cloud: What’s the BIG DEAL?

Foundational distributed systems papers

Distributed Transactions at Scale in Amazon DynamoDB

Linearizability: A Correctness Condition for Concurrent Objects