Arrakis: The OS is the control plane

This paper (authored by Simon Peter, Jialin Li, Irene Zhang, Dan R. K. Ports, Doug Woos, Arvind Krishnamurthy, and Thomas Anderson, University of Washington; Timothy Roscoe, ETH Zürich) received a best paper award at OSDI 2014.

The paper "described and evaluated Arrakis, a new operating system designed to remove the kernel from the I/O data path without compromising process isolation. Unlike a traditional operating system, which mediates all I/O operations to enforce process isolation and resource limits, Arrakis uses device hardware to deliver I/O directly to a customized user-level library. The Arrakis kernel operates in the control plane, configuring the hardware to limit application misbehavior."

The Arrakis paper avoids mentioning containers, but what it proposes is highly applicable to container technology. Containers aim to provide the isolation and portability of VMs without incurring their overhead. So containers run a set of applications directly on the OS and bare metal, with better performance than running them on top of a VM layer. Arrakis provides OS-level technology that improves efficiency toward the same goal.

The Arrakis approach is also closely related to the ExoKernel and MicroKernel approaches. Containers, ExoKernel, Xen Unikernel, and the Arrakis project form a spectrum of designs between the monolithic OS and the microkernel OS. It seems like Tanenbaum will have the last laugh.

Hardware support


Arrakis exploits the hardware support provided for virtual-machine-level virtualization, and pushes it further to implement virtualization at the application (or potentially the container) level. Arrakis is built on Barrelfish, which already supports standalone user-mode device drivers, akin to those found in microkernels. The paper argues that with some modifications the idea can be brought to Linux as well.

This is what Arrakis requires from the hardware:
"Arrakis assumes the network devices provide support for virtualization by presenting themselves as multiple virtual network interface cards (VNICs) and that they can also multiplex/demultiplex packets based on complex filter expressions, directly to queues that can be managed entirely in user space without the need for kernel intervention. Similarly, each storage controller exposes multiple virtual storage interface controllers (VSICs) in our model. Each VSIC provides independent storage command queues (e.g., of SCSI or ATA format) that are multiplexed by the hardware. Associated with each such virtual interface card (VIC) are queues and rate limiters."

"Network cards that support SR-IOV have the key elements of this model: they allow the creation of multiple VNICs that each may have multiple send and receive queues, and support at least rudimentary transmit and receive filters."

"Storage controllers have some parts of the technology needed to provide the interface we describe. For example, RAID adapters have a translation layer that is able to provide virtual disks above physical extents, and SSDs use a flash translation layer for wear-leveling. SCSI host-bus adapters support SR-IOV technology for virtualization and can expose multiple VSICs, and the NVMe standard proposes multiple command queues for scalability."
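To make the hardware model quoted above a bit more concrete, here is a toy Python simulation of the control plane / data plane split. It is only a sketch under loud assumptions: the names `VNIC`, `SimulatedNIC`, `install_filter`, and `receive` are made up for illustration and are not the Arrakis API, the filter is a plain Python predicate rather than a real hardware filter expression, and the rate limiter is reduced to a simple byte budget.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import Callable, Deque, Dict, List, Optional, Tuple

# A packet is reduced to the two fields this sketch needs (hypothetical format).
Packet = Dict[str, int]  # e.g. {"dst_port": 80, "len": 100}


@dataclass
class VNIC:
    """Hypothetical virtual NIC: one receive queue plus a byte budget."""
    owner: str
    byte_budget: int  # crude stand-in for a hardware rate limiter
    rx_queue: Deque[Packet] = field(default_factory=deque)


class SimulatedNIC:
    """Toy model of a NIC that demultiplexes packets to VNIC queues by filter."""

    def __init__(self) -> None:
        self._filters: List[Tuple[Callable[[Packet], bool], VNIC]] = []

    # Control plane: in this model only the kernel would call this.
    def install_filter(self, predicate: Callable[[Packet], bool], vnic: VNIC) -> None:
        self._filters.append((predicate, vnic))

    # Data plane: the "hardware" delivers packets with no per-packet kernel work.
    def receive(self, pkt: Packet) -> Optional[str]:
        for predicate, vnic in self._filters:
            if predicate(pkt):
                if pkt["len"] > vnic.byte_budget:
                    return None  # over the configured limit: drop
                vnic.byte_budget -= pkt["len"]
                vnic.rx_queue.append(pkt)  # the application polls this queue directly
                return vnic.owner
        return None  # no filter matched: drop


def demo():
    nic = SimulatedNIC()
    web = VNIC(owner="web", byte_budget=150)
    kv = VNIC(owner="kv", byte_budget=1000)

    # One-time setup by the control plane.
    nic.install_filter(lambda p: p["dst_port"] == 80, web)
    nic.install_filter(lambda p: p["dst_port"] == 6379, kv)

    packets = [
        {"dst_port": 80, "len": 100},
        {"dst_port": 6379, "len": 200},
        {"dst_port": 80, "len": 100},  # exceeds web's remaining budget of 50
        {"dst_port": 22, "len": 60},   # matches no filter
    ]
    routed = [nic.receive(p) for p in packets]
    return routed, len(web.rx_queue), len(kv.rx_queue)
```

The point of the sketch is the division of labor: filters and limits are set up once through the control-plane call, and after that each packet lands in an application's queue (or is dropped) without the kernel being on the path. Real SR-IOV hardware enforces this with DMA queues and an IOMMU, which the simulation does not attempt to model.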
