- Learning Machine Learning: A beginner's journey. I wrote this as a quick recap of what I found confusing and useful while learning about machine learning. I had no idea this would blow up. This received the most pageviews: 45,000 and counting.
- TensorFlow: A system for large-scale machine learning.
- Realtime Data Processing at Facebook.
- Measuring and Understanding Consistency at Facebook.
- Holistic Configuration Management at Facebook.
- Why Does the Cloud Stop Computing? Lessons from Hundreds of Service Outages.
- TaxDC: A Taxonomy of Non-Deterministic Concurrency Bugs in Datacenter Distributed Systems.
- Modular Composition of Coordination Services. Make sure you read the comment at the end, where one of the authors, Kfir Lev-Ari, answers the questions raised in my post.
- Implementing Linearizability at Large Scale and Low Latency.
- Consensus in the Cloud: Paxos Systems Demystified.
- Modeling the Dining Philosophers Algorithm in TLA+.
- Modeling Paxos and Flexible Paxos in PlusCal and TLA+.
- TLA+ modeling of chain-replicated key-value store.