My Distributed Systems Seminar's reading list for Spring 2017

Below is the first draft of the list of papers I plan to discuss in my distributed systems seminar in the Spring semester. If you have suggestions for good recent papers to cover, please let me know in the comments.

Datacenter Operating System

Firmament: Fast, Centralized Cluster Scheduling at Scale (OSDI 16)
Large-scale cluster management at Google with Borg (EuroSys 15)
Apache Hadoop YARN: Yet Another Resource Negotiator (SoCC 13)
Slicer: Auto-Sharding for Datacenter Applications (OSDI 16)

Monitoring

Pivot Tracing: Dynamic Causal Monitoring for Distributed Systems (SOSP 15)
Shasta: Interactive Reporting At Scale (SIGMOD 16)
Adaptive Logging: Optimizing Logging and Recovery Costs in Distributed In-memory Databases (SIGMOD 16)

Consistency

The many faces of consistency (2016)
The SNOW Theorem and Latency-Optimal Read-Only Transactions (OSDI 16)
Incremental Consistency Guarantees for Replicated Objects (OSDI 16)
Just Say NO to Paxos Overhead: Replacing Consensus with Network Ordering (OSDI 16)
FaSST: Fast, Scalable and Simple Distributed Transactions with Two-Sided (RDMA) Datagram RPCs (OSDI 16)

BFT 

The Honey Badger of BFT Protocols (2016)
The Bitcoin Backbone Protocol: Analysis and Applications (2015)
XFT: Practical Fault Tolerance beyond Crashes (OSDI 16)

Links

2016 Seminar reading list
2015 Seminar reading list

Comments

Punya said…
Thanks for posting this. I believe the pivot tracing link might be pointing to the wrong URL - could you check?
蓝葻 said…
+1
This seems to be the correct link:
http://sigops.org/sosp/sosp15/current/2015-Monterey/printable/122-mace.pdf
Murat said…
I corrected the link. Thank you.
Nice collection! Also, I was wondering, are you aware of https://dcos.io/ which is the actual Datacenter Operating System (DC/OS)?
Sam BESSALAH said…
Great List.
I would add two papers to the Datacenter OS section. One is DRF (Dominant Resource Fairness), https://people.eecs.berkeley.edu/~alig/papers/drf.pdf, which is the mechanism that underpins Mesos (https://people.eecs.berkeley.edu/~alig/papers/mesos.pdf).
Sam BESSALAH said…
Oh, and one interesting paper is Google's Dapper paper on distributed systems tracing: http://static.googleusercontent.com/media/research.google.com/en//archive/papers/dapper-2010-1.pdf. A close open source implementation is the Zipkin project.
