Showing posts from October, 2021

Log structured protocols in Delos (SOSP21)

This paper picks up from where the "Virtual Consensus in Delos" paper finishes , and  talks about using Delos to build control plane databases at Facebook. These are my notes from Mahesh's excellent conference presentation. These control plane databases at Facebook were required to support multiple APIs: some SQL, some key-value pairs, and some ZooKeeper namespaces. Each such API typically requires a separate database to be built. But implementing and operating even a single zero dependency control plane system is difficult. To cut this dilemma, they observe that these databases have a similar structure: a consensus protocol at the bottom, and a replicated state machine on top. A lot of that state machine uses generic logic that can be reused across different APIs. This implies that there is an hourglass architecture at work here, where Delos platform is the narrow waist. This paper focuses on protocols which allow us to layer multiple apis on this common Delos platform.

David G. Andersen AMA (SOSP Day 3)

Dave has broad research interests in computer systems in the networked environment. What do you think systems community should be working on but isn't getting enough attention? With the stuttering of Moore's law as we get into nanoscales, there is more need to extract performance from systems through integrated co-design of hardware and software. We need integrated work through the entire stack to make systems faster and more reliable. How do you pick research ideas/projects? I stumble on interesting questions. There is a lot of cross-fertilization going on between different areas I am working on. If everything else fails, once a year or so, I take a notebook, go for a walk, and write down my ideas. My two sabbaticals were also very fruitful for finding research ideas and projects. At Google, I worked with AI/ML people, which opened new horizons for me. What do you think about the future prospects of blockchain systems? Will there be a killer application, like ever? There are v

Jon Howell AMA at SOSP21 (Day 2)

 There were several AMAs at SOSP21, where the attendees can ask questions. I really like Jon Howell 's AMA session. I asked two questions I was interested about learning.   Why do you think verification is at a breakaway point now? What are the factors that make this the go-big moment? We are now figuring out how to make common problems cheap. Historically Coq proving was about being clever: do something not done before. Now we are focusing on how do we make the common stuff cheap. The Ironclad project began when Rustan Leino came to our team, and said I have a tool, let's use this tool. The Dafny demos were interesting to me. A red squiggly line showed a bug. When you give hint, the bug went away. It was a lightbulb moment. It took me extra line of code/explanation for me to get it. The verifier already got it. We had verification runouts in slicon valley,  with tlc, tlaps and others. The verifier send back question mark, and we try to decode it, what we realize was, almost

SOSP21 conference (Day 1)

SOSP, along with OSDI, is the premiere conference in computer systems. SOSP was held biannually; I had attended SOSP19 in person, and shared notes and paper summaries from that.  This year SOSP is virtual, which made it a lot easier to travel to. It is nice attending the conference from the convenience of your home. The experience is not too inferior to physical conference attending if you take the convenience factor into account.  Here are some notes from SOSP21. If you find and interesting paper, you can dig in, because: All SOSP papers are available as open access All the presentations are also available on YouTube Opening The conference opened with announcements from program committee (PC) chairs. SOSP21 had 348 submissions from 2078 authors, resulting on avg 6 authors per paper. 54 out of 348 papers are accepted. Reviews were conducted by 64 PC members, who produced 1500+ reviews. The trend is clear, the number of submissions are growing steeply.  Last year OSDI had announced i


Distributed/networked systems employ data replication to achieve availability and scalability. Consistency is concerned with the question of what should happen if a client modifies some data items and concurrently another client reads or modifies the same items possibly at a different replica. Linearizability is a strong form of consistency. (That is why it is also called as strong-consistency.) For a system to satisfy linearizability,  each operation must appear (from client perspective) to occur at an instantaneous point between its start time (when the client submits it) and finish time (when the client receives the response), and  execution at these instantaneous points should form a valid sequential execution (i.e., it should be as if operations are executed one at a time ---without concurrency, like they are being executed by a single node)  Let's simplify things further. In practice, consistency is often defined for systems that have two very specific operations: read and w

Using Lightweight Formal Methods to Validate a Key-Value Storage Node in Amazon S3 (SOSP21)

This paper comes from my colleagues at AWS S3 Automated Reasoning Group, detailing their experience applying lightweight formal methods to a new class of storage node developed for S3 storage backend. Lightweight formal methods emphasize automation and usability. In this case, the approach involves three prongs: developing executable reference models as specifications, checking implementation conformance to those models, and building infrastructure to ensure the models remain accurate in the future. ShardStore ShardStore is a new append-only key-value storage node developed for AWS S3 backend. It is over 40K lines of Rust code. Shardstore is a log-structured merge tree (LSM tree) but with shard data stored outside the tree to reduce write amplification. ShardStore employs soft updates for avoiding the cost of redirecting writes through a write-ahead log while still being crash consistent. A soft updates implementation ensures that only writes whose dependencies are persisted are sent

Popular posts from this blog

Learning a technical subject

Foundational distributed systems papers

Learning about distributed systems: where to start?

Strict-serializability, but at what cost, for what purpose?

CockroachDB: The Resilient Geo-Distributed SQL Database

Amazon Aurora: Design Considerations + On Avoiding Distributed Consensus for I/Os, Commits, and Membership Changes

Anna: A Key-Value Store For Any Scale

Warp: Lightweight Multi-Key Transactions for Key-Value Stores

The Seattle Report on Database Research (2022)

Graviton2 and Graviton3