Showing posts from January, 2015

Dijkstra's stabilizing token ring algorithm

One of the classical algorithms I teach in my distributed systems class is Dijkstra's stabilizing token ring algorithm. This algorithm has started the self-stabilization field as a subfield of fault-tolerance. And, it still receives interest even after 40 years. There has been probably hundreds of self-stabilization papers that revisits Dijkstra's stabilizing token ring algorithm as part of a solution or as part of a case study. This algorithm never gets old for me as well. I still enjoy talking about this algorithm in class and thinking about it once in a while. I guess this is because it is a very elegant algorithm. Linked is Dijkstra's original paper introducing the algorithm. This paper also includes two variants of the stabilizing token ring algorithm, 3-state and 4-state token ring algorithms. Dijkstra would later do a followup writeup, titled, "A belated proof of self-stabilization" , where he gave a proof of stabilization for 3-state token ring progra

My Distributed Systems Seminar reading list for Spring 2015

Below is the list of papers I plan to discuss in my distributed systems seminar this semester. If you have some suggestions on other good/recent papers to cover, please let me know in the comments. Perspectives on the CAP Theorem Consistency, availability, and convergence. Salt: combining ACID and BASE in a distributed database Extracting More Concurrency from Distributed Transactions Eidetic Systems Simple Testing Can Prevent Most Critical Failures: An Analysis of Production Failures in Distributed Data-Intensive Systems The Mystery Machine: End-to-end Performance Analysis of Large-scale Internet Services All File Systems Are Not Created Equal: On the Complexity of Crafting Crash-Consistent Applications GraphX: Graph Processing in a Distributed Dataflow Framework Architecture of a database system Arrakis: The Operating System is the Control Plane Fast Databases with Fast Durability and Recovery Through Multicore Parallelism Project Adam: Building an Efficient and Scala

My experience with using TLA+ in distributed systems class

I used TLA+ in my distributed system class in Fall 2014. ( To learn the backstory on this, read my pre-semester TLA+ post. ) In short, I loved the experience and I am hooked. Integrating TLA+ to the class gave students a way to get a hands-on experience in algorithms design and correctness verification. Even for a sophisticated algorithm, such as Paxos, I can refer the students to TLA+ to practice and play with the algorithm, so they can internalize what is going on. Throughout the semester as I had more chance to work with TLA+, I started growing a library of TLA+ modeling of distributed algorithms. Next time I teach the class, I hope to provide the students with a TLA+ model of each algorithm I cover in the class. The student feedback was also positive. The students liked TLA+ since it gave them a way to experiment and supported them in reasoning about the algorithms. Several students complained of the learning curve. They wanted me to cover TLA+ in more detail in the class. (The

Popular posts from this blog

Foundational distributed systems papers

I have seen things

My Distributed Systems Seminar's reading list for Fall 2020

PigPaxos: Devouring the communication bottlenecks in distributed consensus

Learning about distributed systems: where to start?

Fine-Grained Replicated State Machines for a Cluster Storage System

Read papers, Not too much, Mostly foundational ones

Facebook's software architecture

Paxos unpacked

Research, writing, and career advice