Metadata

On distributed systems broadly defined and other curiosities. The opinions on this site are my own.

My iPhone 4 has a transparent screen

- November 07, 2010

I just took a picture of my palm and set it as the wallpaper to create the transparency effect. The trick worked well on some unsuspecting friends.


Murat Demirbas

Murat: I am a principal research scientist at MongoDB Research. Ex-AWS. Ex-professor at SUNY Buffalo. I work on distributed systems, distributed consensus, and cloud computing. You can follow me on Mastodon or Twitter.
