Posts

Showing posts from April, 2024

Linux OSSNA'24

Image
Last week I attended the Linux Foundation Open Software Summit North America (OSSNA). The TLA+ Conference (which I wrote about here) was colocated with Linux OSSNA as a pre-conference event, and having made the continental cross, it made sense to stay for the first two days of OSSNA. Linus Torvalds would speak on day 2, and there were several sessions about different aspects of Linux  ecosystem. I had read that in one of the previous years 3000 people attended OSSNA, so it seemed like a good opportunity to learn about current state of the opensource ecosystem. The embedded opensource summit was also colocated with this event as well. So how bad could that be? It was pretty bad. I regretted staying for the conference. It was nice seeing Linus speak live, and I checked that out of my nerd bucket list, but otherwise this was an underwhelming and boring two days.  I will be whining a bit in the next couple pages, before I share you my notes from the Linus session. But you may l...

TLA+ conference 2024

Image
The TLA+ foundation got established last year in April as an independent, non-profit organization under the umbrella of the Linux Foundation. This April 15th, we colocated our annual TLA+ conference with the Linux OSSNA event to reach out to developers attending this event. Being under Linux Foundation, we didn't have to pay for the venue, but we paid $3K for professional recording of the sessions, so we can share the talks with the world at large. The videos will be available within a couple weeks.  Marc Brooker: 15 years of TLA+ Marc Brooker gave the keynote this Monday in TLA+ conference 2024. His keynote was a good mix of inspiration for formal methods and challenging formal methods to do better. Brooker is a VP/distinguished engineer at AWS. He has been on-call for 15 years, and were involved with many popular AWS services, including Aurora, Lambda, EC2, EBS, API gateway, iot, bedrock. He defines himself as mostly a practitioner, building large scale distributed systems. Br...

OLTP Through the Looking Glass, and What We Found There

Image
This paper appeared in Sigmod 2008.  The goal of the paper is to rethink the Online Transaction Processing (OLTP) database architecture (which remained largely unchanged from late 1970s) and to test out a streamlined/stripped-out in-memory database architecture. To this end, the paper dissects where time is spent inside of a single node OLTP database system, and carefully measures the performance of various stripped down variants of that system. Motivation OLTP databases were optimized for the computer technology of the 1970s-80s.  Multi-threaded execution was adopted to allow other transactions to make progress while one transaction waits for data to be fetched from the slow, rotating disk drive. Locking-based concurrency control was adopted since conflicts were inevitable in these disk-bound transactions, and since the small dataset sizes and the workloads of the time resulted in extremely "hot" keys. The computing world now is quite a different environment. Fast SSDs and a...

Chardonnay: Fast and General Datacenter Transactions for On-Disk Databases (OSDI'23)

Image
This paper appeared in OSDI'23. I haven't read the paper, and I am going by the two presentations I have seen of the paper, and by a quick skim of the paper. The paper has a cool idea, and I decided I should not procrastinate more waiting to read the paper in detail, and instead just capture and communicate this idea. Datacenter networking got very fast, much faster than SSD/disk I/O. That means, rather than two-phase commit (2PC), cold reads from SSD/disk is the dominant latency. Chardonnay banks on this trend in an interesting manner. It executes read/write transactions in "dry run" mode first, just to discover the readset and writeset and pin them to main memory. Then in "definitive" mode, it executes these transactions using two-phase-locking (2PL) and 2PC swiftly because everything (mostly) is in-memory. Having learned the readset and writeset during dry run also allows definitive execution achieve deadlock-freedom (and prevent aborts due to deadlock a...

A critique of ANSI SQL Isolation layers (Transaction Processing Book followup)

Image
This paper is from Sigmod 1995. As the title says, it critiques the ANSI SQL 92 standard (book) with respect to problems in isolation layer definitions. This is also a good follow up to t he Transaction Book chapter 7 where we reviewed the isolation layers . In fact, here Gray (with co-authors) gets a chance to amend the content in the Transaction Book with recent developments and better understanding/mapping of isolation layers to the degrees 0-4. Snapshot isolation (SI) finally makes its debut here in Gray's writing. Why so late? As we discussed earlier in the 80s the database keys were hot, and people rightly ignored optimistic concurrency control as it was not suitable for those workloads. Moreover, memory space was very scarce commodity, and multi-version (as in snapshot storage) was also off the table. With its increased adoption (with MySQL's use of SI around the corner) we see coverage of snapshot isolation in Gray's writing.  The paper is very well written, and is...

Implementation of Cluster-wide Logical Clock and Causal Consistency in MongoDB

Image
This paper (SIGMOD 2019) discusses the design of causal consistency and cluster-wide logical clock for MongoDB.  Causal consistency preserves Lamport's happened-before (transitive and partial) order of events in a distributed system. If an event A causes another event B, causal consistency ensures that every process in the system observes A before observing B. Causal consistency guarantees read-your-writes, write-follows-reads, monotonic reads, and monotonic writes. Causal consistency is the strictest consistency level where the system can still make progress (always-available one-way convergence) in the presence of partitions as discussed in  the "Consistency, Availability, and Convergence" paper . This makes causal-consistency a very attractive model for modern distributed applications. https://jepsen.io/consistency The paper provides a simple and balanced design for causal-consistency in MongoDB. The ideas are simple, yet the implementation and operationalizing (backw...

Popular posts from this blog

Hints for Distributed Systems Design

Learning about distributed systems: where to start?

Making database systems usable

Looming Liability Machines (LLMs)

Advice to the young

Foundational distributed systems papers

Distributed Transactions at Scale in Amazon DynamoDB

Linearizability: A Correctness Condition for Concurrent Objects

Understanding the Performance Implications of Storage-Disaggregated Databases

Designing Data Intensive Applications (DDIA) Book