Posts

Building a Database on S3

Image
Hold your horses, though. I'm not unveiling a new S3-native database. This paper is from 2008. Many of its protocols feel clunky today. Yet it nails the core idea that defines modern cloud-native databases: separate storage from compute. The authors propose a shared-disk design over Amazon S3, with stateless clients executing transactions. The paper provides a blueprint for serverless before the term existed. SQS as WAL and S3 as Pagestore The 2008 S3 was painfully slow, and 100 ms reads weren't unusual. To hide that latency, the database separates "commit" from "apply". Clients write small, idempotent redo logs to Amazon Simple Queue Service (SQS) instead of touching S3 directly. An asynchronous checkpoint by a client applies those logs to B-tree pages on S3 later. This design shows strong parallels to modern disaggregated architectures . SQS becomes the write-ahead log (WAL) and logstore. S3 becomes the pagestore. Modern Aurora follows a similar logic : t...

800th blog post: Write that Blog!

Image
I had given an email interview to the "Write That Blog!" newsletter. That came out today , which coincided with my 800th blog post. I am including my answers also here.  Why did you start blogging – and why do you continue? In 2010, when I was a professor, one of my colleagues in the department was teaching a cloud computing seminar. I wanted to enter that field coming from theory of distributed systems, and later wireless sensor networks fields. So I attended the seminar. As I read the papers, I started blogging about them. That is how I learn and retain concepts better, by writing about them. Writing things down helps crystalize ideas for me. It lets me understand papers more deeply and build on that understanding.  The post on MapReduce , the first paper discussed in the seminar, seems to have opened the floodgates of my blogging streak, which has been going strong for 15 years. I think a big influence on me has been  the EWD documents . I remember the day I came acros...

Popular posts from this blog

Hints for Distributed Systems Design

The F word

The Agentic Self: Parallels Between AI and Self-Improvement

Learning about distributed systems: where to start?

Foundational distributed systems papers

Cloudspecs: Cloud Hardware Evolution Through the Looking Glass

Advice to the young

Agentic AI and The Mythical Agent-Month

Are We Becoming Architects or Butlers to LLMs?

Welcome to Town Al-Gasr