Posts

Our Florida vacation

No paper review this week. Instead, I stumbled upon notes buried in my blog.org entries. With Buffalo now cold and snowy, reminiscing about last June's hot Florida vacation seemed fitting. True to our tradition, we drove there from Buffalo, relishing the two-day road trip. We love road trips: in 2018-19 we crossed the US East-to-West and then West-to-East. I had documented the East-to-West drive here. This time, it was North-to-South, all the way to the southernmost point of the US, the Key West Islands. Yes, the driving was a bit tiring. But with our new SUV, good audiobooks, and sightseeing on the way, it was enjoyable. We Hotwire'd the hotels for the drive around the afternoon of each driving day. That is how the Demirbas family rolls. Our AirBnB in Orlando was at a resort. It was a 5 bedroom rental. We were able to get it cheap at $200 a day after taxes and everything. It was very comfortable, and we enjoyed the lazy life at the resort. Oh God, everything is big

Towards Modern Development of Cloud Applications

This paper is from HotOS'23. At 6 pages, it is an easy-to-read paper, but it is not an easy-to-agree-with paper. The message is controversial: Don't do microservices, write a monolith, and our runtime will take care of deployment and distribution. This is a big claim, and we have been burned by ambitious attempts like this many times before. I realize big claims are part of the style of HotOS, where work-in-progress and sometimes provocative papers make a debut to kickstart a discussion. This paper sure does a good job of starting a discussion. Good: There is code, and it is open source, so this is not just a speculation paper. A Go framework does exist, which has been under development for some time inside Google. Given Google's expertise in infrastructure and Go, I think this framework will be a big boon to the Google Cloud Platform (GCP), if it gets into production. To evaluate the framework (let's call it ServiceWeaver, by its GitHub name, shall we?), they consider

Epoxy: ACID Transactions Across Diverse Data Stores

This VLDB'23 paper is a lovely and useful/practical piece of work. It is database in a can! It goes through all aspects of the protocol and implementation and solves a real, practical problem. As with any systems (and especially distributed systems) problem, you initially think "oh, how hard/complicated could this be", but as you delve into the details, you realize there are a lot of things to consider and resolve. The paper does a great job presenting the challenges, and walking the reader through them. Ok, what is this Epoxy work about? Epoxy leverages a Postgres transactional database as the primary/coordinator and extends multiversion concurrency control (MVCC) for cross-data store isolation. It provides isolation as well as atomicity and durability through its optimistic concurrency control (OCC) plus two-phase commit (2PC) protocol. Epoxy was implemented as a bolt-on shim layer for five diverse data stores: Postgres, MySQL, Elasticsearch, MongoDB, and Google Cloud Sto

PolarDB-SCC: A Cloud-Native Database Ensuring Low Latency for Strongly Consistent Reads

This paper from Alibaba Group appeared in VLDB'23. It talks about how to perform low-latency strongly-consistent reads from secondaries in PolarDB database deployments. PolarDB adopts the canonical primary-secondary architecture of relational databases. The primary is a read-write (RW) node, and the secondaries are read-only (RO) nodes. Having RO nodes helps with executing queries and with scaling out query performance. This is essentially the AWS Aurora architecture. Durability is satisfied through shared storage, so we can ignore that and orthogonally focus on the optimizations for improving RO node performance. The way to improve the RO node performance is by shipping the redo log (essentially WAL) to these RO nodes so they can keep their buffers ready, and serve reads from the buffer quickly, rather than having to reach out to shared storage. The PolarDB architecture follows the same ideas. On top of this, they are interested in being able to serve strong-consistency r

Cabbage, Goat, and Wolf Puzzle in TLA+

Over the past couple of months, I have been harping on the value of TLA+ for teaching engineers the art of abstraction. It is important to emphasize that this is an art, not a science, and it is best learned through studying examples and practicing hands-on with modeling. TLA+ excels in providing rapid feedback on your modeling and designs, which facilitates this learning process significantly. Modeling the "cabbage, goat, and wolf" puzzle taught me that tackling real/physical-world scenarios is a great way to practice abstraction and design -- cutting out the clutter and focusing on the core challenge. How will you represent this world? What will be your actors, objects, and the relations between them? Will one approach result in a cleaner protocol model than another? Or, as I like to ask, does the protocol flow cleanly from the model? The puzzle description: Here is the puzzle. A farmer with a wolf, a goat, and a cabbage must cross a river by boat. The boat can carry only t
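
To make the modeling questions above concrete, here is a minimal Python sketch (not the TLA+ model from the post) of one possible representation. The excerpt is cut off before the rules are stated, so the sketch assumes the classic version of the puzzle: the farmer rows with at most one item, the wolf must not be left alone with the goat, and the goat must not be left alone with the cabbage. The names `ITEMS`, `UNSAFE`, `is_safe`, and `solve` are hypothetical and chosen only for illustration.

```python
from collections import deque

# Assumed classic rules (the excerpt is truncated before stating them).
ITEMS = frozenset({"wolf", "goat", "cabbage"})
UNSAFE = [{"wolf", "goat"}, {"goat", "cabbage"}]


def is_safe(bank):
    """A bank without the farmer is safe if no eater is left with its food."""
    return not any(pair <= bank for pair in UNSAFE)


def solve():
    """Breadth-first search over states; returns a shortest list of states."""
    # State: (items on the starting bank, is the farmer on the starting bank?)
    start = (ITEMS, True)
    goal = (frozenset(), False)
    parent = {start: None}
    queue = deque([start])
    while queue:
        state = queue.popleft()
        if state == goal:
            path = []
            while state is not None:
                path.append(state)
                state = parent[state]
            return path[::-1]
        left, farmer_left = state
        here = left if farmer_left else ITEMS - left
        # The farmer crosses alone (cargo is None) or with one item.
        for cargo in [None, *sorted(here)]:
            new_left = set(left)
            if cargo is not None:
                if farmer_left:
                    new_left.discard(cargo)
                else:
                    new_left.add(cargo)
            new_left = frozenset(new_left)
            # After crossing, the bank the farmer just left is unattended.
            unattended = new_left if farmer_left else ITEMS - new_left
            nxt = (new_left, not farmer_left)
            if is_safe(unattended) and nxt not in parent:
                parent[nxt] = state
                queue.append(nxt)
    return None
```

Under these assumed rules, `solve()` returns eight states, that is, seven crossings, and the first crossing takes the goat. Representing the world as "what is on the starting bank plus where the farmer is" keeps the safety check to a single subset test, which is one answer to the question of whether the protocol flows cleanly from the model.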

Kora: A Cloud-Native Event Streaming Platform For Kafka

This paper from VLDB'23 (awarded the Best Industry Paper) describes how Confluent built Kora, to provide Kafka as a managed cloud event streaming platform. Kora combines best practices to deliver cloud features such as high availability, durability, scalability, elasticity, cost efficiency, performance, and multi-tenancy. For example, the Kora architecture decouples its storage and compute tiers to facilitate elasticity, performance, and cost efficiency. As another example, Kora defines a Logical Kafka Cluster (LKC) abstraction to serve as the user-visible unit of provisioning, so it can help customers distance themselves from the underlying hardware and think in terms of application requirements. The writing of the paper could be much better. I think the paper fails to sympathize with the reader, who lacks the context about Kafka in the first place, and rushes into explaining the mechanics of how Kora makes Kafka a cloud managed offering. The motivation and use cases of Kafka could hav

TiDB: A Raft-based HTAP Database

This paper is from VLDB 2020. TiDB is an open-source Hybrid Transactional and Analytical Processing (HTAP) database, developed by PingCAP. The TiDB server, written in Go, is the query/transaction processing component; it is stateless, in the sense that it does not store data and it is for computing only. The underlying key-value store, TiKV, is written in Rust, and it uses RocksDB as the storage engine. They add a columnar store called TiFlash, which gets most of the coverage in this paper. In this figure, PD stands for Placement Driver, which is responsible for managing Raft ranges, and automatically moving ranges to balance workloads. PD also hosts the timestamp oracle (TSO), which provides strictly increasing and globally unique timestamps to serve as transaction IDs. Each timestamp includes the physical time and logical time. The physical time refers to the current time with millisecond accuracy, and the logical time takes 18 bits. If you know about CockroachDB/CRDB ( here i
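
As a rough illustration of the timestamp format described above, the following Python sketch packs a millisecond physical time and an 18-bit logical counter into a single integer. The 18-bit width comes from the text; placing the logical counter in the low-order bits, and the names `compose_ts`, `split_ts`, and `ToyOracle`, are assumptions made for this sketch and not PD's actual code. The toy oracle is single-process and ignores persistence and leader failover.

```python
from typing import Callable, Tuple

LOGICAL_BITS = 18  # from the text: the logical time takes 18 bits
LOGICAL_MASK = (1 << LOGICAL_BITS) - 1


def compose_ts(physical_ms: int, logical: int) -> int:
    """Pack physical time (ms) and logical counter into one integer.

    Assumption: the logical counter occupies the low-order bits.
    """
    if not 0 <= logical <= LOGICAL_MASK:
        raise ValueError("logical counter does not fit in 18 bits")
    return (physical_ms << LOGICAL_BITS) | logical


def split_ts(ts: int) -> Tuple[int, int]:
    """Recover (physical_ms, logical) from a packed timestamp."""
    return ts >> LOGICAL_BITS, ts & LOGICAL_MASK


class ToyOracle:
    """Hypothetical single-process oracle handing out increasing timestamps."""

    def __init__(self, clock: Callable[[], int]):
        self._clock = clock  # returns the current time in milliseconds
        self._physical = 0
        self._logical = 0

    def next_ts(self) -> int:
        now = self._clock()
        if now > self._physical:
            # The clock advanced: adopt it and reset the counter.
            self._physical, self._logical = now, 0
        else:
            # Same millisecond (or clock went backwards): bump the counter.
            self._logical += 1
            if self._logical > LOGICAL_MASK:
                # Counter exhausted: move on to the next millisecond.
                self._physical += 1
                self._logical = 0
        return compose_ts(self._physical, self._logical)
```

With this layout, comparing two packed timestamps as plain integers orders them first by physical time and then by logical counter, which is why a single integer can serve as a totally ordered transaction ID.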

Popular posts from this blog

The end of a myth: Distributed transactions can scale

Foundational distributed systems papers

Hints for Distributed Systems Design

Learning about distributed systems: where to start?

Speedy Transactions in Multicore In-Memory Databases

Metastable failures in the wild

Amazon Aurora: Design Considerations + On Avoiding Distributed Consensus for I/Os, Commits, and Membership Changes

SIGMOD panel: Future of Database System Architectures

The Seattle Report on Database Research (2022)

There is plenty of room at the bottom