Sunday, August 27, 2017

Paper summary: Performance clarity as a first-class design principle (HOTOS'17)

This paper appeared in HOTOS'17 and is by Kay Ousterhout, Christopher Canel, Max Wolffe, Sylvia Ratnasamy, and Scott Shenker. (The team is at UC Berkeley Christopher is now at CMU.)

The paper argues that performance clarity is as important a design goal as performance or scalability. Performance clarity means ease of understanding where bottlenecks lie in a deployment and the performance implications of various system changes.

The motivation for giving priority to performance clarity arises from the performance opaqaness of distributed systems we have, and how hard it is to configure/tune them to optimize performance. In current distributed data processing systems, a small change in hardware or software configurations may cause disproportional impact on performance. An earlier paper, Ernest, showed that selecting an appropriate cloud instance type could improve performance by 1.9× without increasing cost. And when things run slowly in a deployment, it is hard to determine where the bottlenecks are.

This discussion on performance clarity reminds me of a Tony Hoare quote on software design:
There are two ways of constructing a software design: One way is to make it so simple that there are obviously no deficiencies, and the Other way is to make it so complicated that there are no obvious deficiencies. The first method is far more difficult. 
This paper advocates to design the system so clear that if there are deficiencies in the deployment/configuration, they become obvious and can be accounted for.


Concretely, the paper proposes an architecture for data analytics frameworks (Spark in particular) in which jobs are decomposed into schedulable small units called "monotasks". Each monotask use exactly one of CPU, disk, and network.

This is an unconventional take. Today's frameworks break jobs into tasks that time-share the use of CPU, disk, and network. Concurrent tasks may contend for resources, even when their aggregate resource use does not exceed the capacity of the machine. In contrast, a single monotask has exclusive access to a resource. The benefit this provides is guaranteed resource isolation without any cross contention from another tasks.

Using monotasks to explicitly separate resources makes the bottleneck visible: the bottleneck is simply the resource (disk, CPU, or network) with the longest queue.


The hypothesis tested in the evaluation is that "using monotasks improves performance clarity without sacrificing high performance".

They give a prototype implementation of monotasks for Apache Spark, called MonoSpark, that  achieves performance parity with Spark for benchmark workloads. Their monotask implementation also yields a simple performance model that can predict the effects of future hardware and software changes.

So does monotasks hurt performance? I am quoting from the paper:
"Because using monotasks serializes the resource use of a multitask, the work done by one of today’s multitasks will take longer to complete using monotasks. To compensate, MonoSpark relies on the fact that today’s jobs are typically broken into many more multitasks than there are slots to run those tasks, and MonoSpark can run more concurrent tasks than today’s frameworks. For example, on a worker machine with four CPU cores, Apache Spark would typically run four concurrent tasks: one on each core, as shown in Figure 1. With MonoSpark, on the other hand, four CPU monotasks can run concurrently with a network monotask (to read input data for future CPU monotasks) and with two disk monotasks (to write output), so runtime of a job as a whole is similar using MonoSpark and Spark. In short, using monotasks replaces pipelining with statistical multiplexing, which is robust without requiring careful tuning."


The monotasks idea is very simple. But simple ideas have a chance to win. Performance predictability becomes a snap with this, because it is so easy to see/understand what will happen in a slightly different configuration. Monotasks provide very nice observability and resource isolation features.

I suspect this is somewhat Spark specific. It looks like this works best with predictable and CPU/disk/network balanced tasks/applications. These conditions are satisfied for Spark and Spark workloads.

Finally, the monotasks ideas will not work in current virtual machine, container, or cloud environments. Monotasks require direct access to machines to implement per disk, per CPU, and network schedulers.

No comments:

Two-phase commit and beyond

In this post, we model and explore the two-phase commit protocol using TLA+. The two-phase commit protocol is practical and is used in man...