Distributed systems analogies

- January 12, 2020

I start my distributed systems class each semester with a video of a murmuration of starlings. I tell the students that this qualifies as a distributed system, because the definition fits: a collection of autonomous nodes communicating to address a problem collectively, with no shared memory and no common physical clock.

From the simple local actions of each starling (readjusting its position with respect to its neighbors), the global behavior of the murmuration emerges in poetic beauty. I like this analogy pedagogically because it is very visual. What I don't like about the analogy is that the specification of the global behavior is lax. The starlings don't crash into each other, fine, but there is no other constraint on the global behavior of the murmuration: it can go in any direction, split, rejoin, etc.
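To make the "local actions, global behavior" point concrete, here is a minimal sketch in Python. It is a toy, not a model of real starlings: it assumes the birds sit on a ring, that each bird sees only its two immediate neighbors, and that a bird's only action is to average its heading with theirs. The names `step`, `spread`, and `simulate` are made up for this illustration.

```python
import random


def step(headings):
    """One synchronous round: each bird averages its heading with its
    left and right neighbors on the ring. No bird sees the whole flock."""
    n = len(headings)
    return [
        (headings[(i - 1) % n] + headings[i] + headings[(i + 1) % n]) / 3.0
        for i in range(n)
    ]


def spread(headings):
    """Disagreement in the flock: largest heading minus smallest heading."""
    return max(headings) - min(headings)


def simulate(n_birds=20, rounds=500, seed=0):
    """Start from random headings and apply the local rule repeatedly.

    Headings are kept in [-1, 1] radians (an assumption of this toy) so
    that plain averaging is meaningful and angles never wrap around.
    """
    rng = random.Random(seed)
    initial = [rng.uniform(-1.0, 1.0) for _ in range(n_birds)]
    headings = list(initial)
    for _ in range(rounds):
        headings = step(headings)
    return initial, headings


if __name__ == "__main__":
    initial, final = simulate()
    print("initial spread:", spread(initial))
    print("final spread:  ", spread(final))
```

No bird is told the flock's direction, yet the headings converge to a common value (the average of the initial headings) purely through neighbor-to-neighbor adjustment. The toy also shows the laxness of the specification: the rule guarantees agreement, but says nothing about which direction the flock ends up flying in.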

What could be a better analogy for a distributed system?

To answer this question, let's start by considering an auxiliary question: What is the "action word" for distributed systems? For databases, the characteristic action word is storing/retrieving. For machine learning, it is perception. How about distributed systems? What do you think?

I think the action word for distributed systems is coordination. The fundamental challenge in distributed systems is to coordinate the behavior of nodes which execute concurrently with limited information about each other.

This brings me to my distributed systems analogy: a soccer team. The players try to execute certain plays/tactics, but no player can read the other players' minds. There is no shared memory/state among players, and the timing of the players can also be off. Each player has only a limited, incomplete view of the system (e.g., an evolving attack). Yet, the global objective for the team is well-defined: defend your goal, and eventually score a goal.

I like this analogy because it is very visual, and the global specification is nontrivial. It is easy to see "a collection of autonomous nodes communicating to address a problem collectively, with no shared memory and no common physical clock." One may even call FC Barcelona's tiki-taka a finely tuned microservices deployment.

It may be possible to stretch the analogy by relating team tactics to distributed algorithms, injuries to crash faults, etc. One part of the analogy I dislike is that the team is pitted against an opposing team. We don't put distributed systems in opposition to each other in deployment.

Well, there is the orchestra analogy, which also involves coordination. But I don't think an orchestra is a fitting analogy for a distributed system. The musicians play notes collectively, but the coordination is not with respect to each other or to dynamic inputs to the system, but with respect to the musical composition, which is heavily pre-scripted. Maybe the orchestra analogy is more suitable for parallel computing. Jazz players jamming/improvising is more suitable as a distributed systems analogy.

A team of construction workers also makes a good analogy for distributed systems. Something I noticed in the video is that these workers don't get in each other's way; unlike many distributed systems, they don't seem to require strict coordination/synchronization. Maybe this is a suitable analogy for a big data processing system, with map-reduce workers.

If you are not worried about the visual demonstration component of the analogy, coordination within a company/corporation may also serve as a distributed systems analogy. If you consider a company as a distributed system, you can think of several organization tactics: centralized, decentralized, federated, tiered, or hierarchical. I think there should be good synergy between management science and distributed systems, but I don't know if this has been explored much. (Please let me know if you have good leads on this.)

What are closely related or synergistic disciplines to distributed systems?

The other day I was wondering whether it would make sense to couple the adjective "distributed" with other disciplines. For example, distributed biology, distributed chemistry, distributed math, distributed economics, distributed medicine, distributed education, distributed civil engineering, distributed geology, etc.

Most of these don't make sense. But some do. When the nodes have agency, "distributed" makes sense and may become applicable. For example, distributed economics/finance may make sense because different markets and players affect the economy. Maybe even distributed medicine makes sense, if you think of fighting an epidemic as a coordination problem where vaccinations/treatments start from many different locations.

Distributed civil engineering may make sense because of the transportation problem, but centralized algorithms may be sufficient for the static parts of transportation logistics. Distributed algorithms may be more appropriate for real-time traffic engineering solutions, and definitely for the sub-second geographical coordination required in some distribution management systems, such as electric grids.