Posts

Showing posts from January, 2015

Dijkstra's stabilizing token ring algorithm

Image
One of the classical algorithms I teach in my distributed systems class is Dijkstra's stabilizing token ring algorithm. This algorithm has started the self-stabilization field as a subfield of fault-tolerance. And, it still receives interest even after 40 years. There has been probably hundreds of self-stabilization papers that revisits Dijkstra's stabilizing token ring algorithm as part of a solution or as part of a case study. This algorithm never gets old for me as well. I still enjoy talking about this algorithm in class and thinking about it once in a while. I guess this is because it is a very elegant algorithm. Linked is Dijkstra's original paper introducing the algorithm. This paper also includes two variants of the stabilizing token ring algorithm, 3-state and 4-state token ring algorithms. Dijkstra would later do a followup writeup, titled, "A belated proof of self-stabilization" , where he gave a proof of stabilization for 3-state token ring progra

My Distributed Systems Seminar reading list for Spring 2015

Below is the list of papers I plan to discuss in my distributed systems seminar this semester. If you have some suggestions on other good/recent papers to cover, please let me know in the comments. Perspectives on the CAP Theorem Consistency, availability, and convergence. Salt: combining ACID and BASE in a distributed database Extracting More Concurrency from Distributed Transactions Eidetic Systems Simple Testing Can Prevent Most Critical Failures: An Analysis of Production Failures in Distributed Data-Intensive Systems The Mystery Machine: End-to-end Performance Analysis of Large-scale Internet Services All File Systems Are Not Created Equal: On the Complexity of Crafting Crash-Consistent Applications GraphX: Graph Processing in a Distributed Dataflow Framework Architecture of a database system Arrakis: The Operating System is the Control Plane Fast Databases with Fast Durability and Recovery Through Multicore Parallelism Project Adam: Building an Efficient and Scala

My experience with using TLA+ in distributed systems class

Image
I used TLA+ in my distributed system class in Fall 2014. ( To learn the backstory on this, read my pre-semester TLA+ post. ) In short, I loved the experience and I am hooked. Integrating TLA+ to the class gave students a way to get a hands-on experience in algorithms design and correctness verification. Even for a sophisticated algorithm, such as Paxos, I can refer the students to TLA+ to practice and play with the algorithm, so they can internalize what is going on. Throughout the semester as I had more chance to work with TLA+, I started growing a library of TLA+ modeling of distributed algorithms. Next time I teach the class, I hope to provide the students with a TLA+ model of each algorithm I cover in the class. The student feedback was also positive. The students liked TLA+ since it gave them a way to experiment and supported them in reasoning about the algorithms. Several students complained of the learning curve. They wanted me to cover TLA+ in more detail in the class. (The

Popular posts from this blog

Graviton2 and Graviton3

Foundational distributed systems papers

Learning a technical subject

Your attitude determines your success

Learning about distributed systems: where to start?

Progress beats perfect

CockroachDB: The Resilient Geo-Distributed SQL Database

Warp: Lightweight Multi-Key Transactions for Key-Value Stores

Amazon Aurora: Design Considerations + On Avoiding Distributed Consensus for I/Os, Commits, and Membership Changes

Anna: A Key-Value Store For Any Scale