SOSP19 Day 2, Scaling Symbolic Evaluation for Automated Verification of Systems Code with Serval

The verification session was the first session of Day 2. I like formal methods, and I enjoyed these papers. In this post I will only talk about the first paper in the session, the Serval paper. (You can read about SOSP19 Day 1 here.)

This paper is by Luke Nelson (University of Washington), James Bornholt (University of Washington), Ronghui Gu (Columbia University), Andrew Baumann (Microsoft Research), Emina Torlak (University of Washington), and Xi Wang (University of Washington).

This paper received a best paper award at SOSP19, and the software is publicly available at https://unsat.cs.washington.edu/projects/serval/.

SOSP has a tradition of publishing systems verification papers, such as seL4 (SOSP'09), Ironclad Apps (OSDI'14), FSCQ (SOSP'15), CertiKOS (OSDI'16), and Komodo (SOSP'17). A downside of systems verification is that it is very effort-intensive: the CertiKOS manual proof consists of more than 200K lines.

To help address this problem, this paper introduces Serval, a framework for developing automated verifiers for systems software. Serval accomplishes this by lifting interpreters written by developers into automated verifiers. It also provides a systematic approach to identifying and repairing verification performance bottlenecks using symbolic profiling and optimizations.

Wait, wait... What is an interpreter? And what is lifting?

In prior work on automated verification (such as Hyperkernel, SOSP'17), the verifier implements symbolic evaluation for one specific system, so it is not reusable or general. To make the verifier reusable and general, in Serval the developers write an interpreter for an instruction set using Rosette, an extension of the Racket language for symbolic reasoning. Serval leverages Rosette to "lift" the interpreter into a verifier, which means transforming a regular program so that it works on symbolic values. The developers also supply the system specifications to be verified.
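
To make "lifting" concrete, here is a toy sketch in Rosette (my own illustration, not code from the paper). The interpreter below handles a made-up two-instruction ISA over a single register; because Rosette lifts the language constructs the interpreter uses, the same code runs on a symbolic register value, at which point verify turns it into a verifier:

    #lang rosette/safe

    ; A plain interpreter for a made-up two-instruction ISA.
    ; prog is a list of (op . operand) pairs; reg is the sole register.
    (define (interpret prog reg)
      (if (null? prog)
          reg
          (let ([i (car prog)])
            (interpret (cdr prog)
                       (cond [(equal? (car i) 'addi) (+ reg (cdr i))]
                             [(equal? (car i) 'muli) (* reg (cdr i))]
                             [else reg])))))

    ; Lifting: Rosette runs the same code on a symbolic value,
    ; so the interpreter doubles as a verifier.
    (define-symbolic r0 integer?)

    ; Spec: this program adds 3 to ANY starting register value.
    (verify
     (assert (= (interpret (list (cons 'addi 1) (cons 'addi 2)) r0)
                (+ r0 3))))
    ; => (unsat), i.e., no counterexample exists

The point is that the interpreter itself is ordinary code; all the symbolic-reasoning machinery comes from the host language, which is what makes the approach reusable across instruction sets.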


In the Serval framework, the verifier consists of the lifted interpreter plus symbolic optimizations. The workflow is: write the verifier as an interpreter, run symbolic profiling to find the bottlenecks, and apply optimizations until verification becomes tractable.
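
As an aside on the profiling step: Rosette ships a symbolic profiler (SymPro, from the same group), and if I remember correctly it is exposed as a raco tool, invoked roughly as follows (verifier.rkt is a placeholder file name):

    $ raco symprofile verifier.rkt

The report points at the parts of the program where symbolic state grows fastest, which is where the optimizations below should be aimed.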


Serval uses Rosette's symbolic evaluation, which merges paths rather than enumerating them one by one, to keep state-space explosion in check. But the program counter (PC) becoming symbolic is bad, as it unnecessarily opens up the search space: the evaluator must consider every possible jump target. Serval counters this with symbolic optimizations:

  • peephole optimizations,
  • fine-tuning how symbolic evaluation proceeds, and
  • using domain knowledge to restrict the PC to a small set of concrete values, avoiding the path explosion problem (a toy sketch of this idea follows below).

Unfortunately I didn't understand much about the first two optimizations from listening to the presentation.
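
The third optimization, at least, maps onto a Rosette idiom I can sketch (again, my own toy example, not Serval's code): instead of letting a symbolic PC flow into instruction lookup, case-split on the finitely many concrete addresses the PC can take, so that each branch of the evaluation proceeds with a concrete PC:

    #lang rosette/safe

    ; Toy dispatch for a two-address program. Branching on concrete
    ; targets keeps pc concrete within each branch, instead of letting
    ; a symbolic pc index into the program.
    (define (exec-at pc x)
      (cond [(= pc 0) (+ x 1)]   ; instruction at address 0
            [(= pc 1) (* x 2)]   ; instruction at address 1
            [else x]))           ; unknown target: a no-op in this toy

    (define-symbolic pc x integer?)

    ; Rosette evaluates the finitely many branches under their guards;
    ; the result is one of three known effects, whatever pc is.
    (verify
     (assert (let ([r (exec-at pc x)])
               (or (= r (+ x 1)) (= r (* x 2)) (= r x)))))
    ; => (unsat)

My understanding is that the domain knowledge here is the finite set of jump targets recoverable from the binary, which is what justifies this kind of split.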

Using Serval, the authors build automated verifiers for the RISC-V, x86-32, LLVM, and BPF instruction sets. Targeting the low-level end of the compilation stack can be an advantage for verification, because we don't need to trust the higher-level language toolchains. Future work will consider how the low-level guarantees identified and verified by Serval could be connected to high-level data structures and proofs.

To show that existing systems can be retrofitted for Serval, the authors retrofitted CertiKOS and Komodo. They mention this takes around 4 weeks for a new system. They also found 15 new bugs in the Linux BPF JIT.

I will read the paper carefully to understand Serval better. It seems promising for scaling verification to practical systems. Of course the process still requires expertise and several weeks' worth of effort, but that improves on the state of the art, which demands many months of effort.
