From Clarity to Efficiency for Distributed Algorithms

Y. A. Liu, S. D. Stoller, B. Lin, M. Gorbovitski, Appeared in OOPSLA'12.

This paper describes a high-level language, DistAlgo, for clear description of distributed algorithms (e.g., Paxos, mutual execution). In contrast to embarrassingly parallel tasks which involve very regular and simple synchronization patterns (e.g., MapReduce tasks), distributed algorithms generally involve complex and tight synchronization patterns. DistAlgo specifies these complex synchronization conditions via using logical quantifications over message histories of processes involved in the algorithm. But, unfortunately, this turns out to be extremely inefficient if executed straightforwardly: each quantifier will cause a linear factor in running time, and any use of the history of messages sent and received will cause space usage to be unbounded.

To address this inefficiency, this paper focuses on optimizing implementations of DistAlgo programs by incrementalization of these expensive synchronization conditions. To this end, their method first transforms these quantifications are into set queries, and then transforms further to minimize their nesting. Finally, from here, the method systematically extracts single quantified order comparisons, and transforms them into efficient incremental updates of auxiliary variables as messages are sent and received. (An earlier paper that focuses more on the syntax of DistAlgo, and uses Paxos as a running example, appeared at SSS'12.).

To give a glance of DistAlgo and how it does this transformations, I am including some figures from the paper as is. The paper uses Lamport's mutual exclusion algorithm as a running example and shows its DistAlgo and automatically optimized DistAlgo implementation. DistAlgo fills a gap and seems like it can be very useful in the coming years for writing distributed programs. Distributed algorithms can be expressed in DistAlgo clearly at a high level, like in pseudocode, but also precisely, like in formal specification languages, and be executed as part of real applications, as in programming languages. DistAlgo extends Python with processes as objects, send/receive of messages, yield points, synchronization conditions as logical quantifications, and configuration of distributed system topology. DistAlgo compiler and optimizer are implemented in Python (building on Python's parser module and AST package) and generates executable Python code.


Hugh said…
This comment has been removed by a blog administrator.

Popular posts from this blog

Foundational distributed systems papers

Your attitude determines your success

Progress beats perfect

Cores that don't count

Silent data corruptions at scale

Learning about distributed systems: where to start?

Read papers, Not too much, Mostly foundational ones

Sundial: Fault-tolerant Clock Synchronization for Datacenters


Using Lightweight Formal Methods to Validate a Key-Value Storage Node in Amazon S3 (SOSP21)