Showing posts from 2021

Best of metadata in 2021

As has become our tradition, here are some highlights from my 2021 posts.

Systems
- Foundational distributed systems papers
- There is plenty of room at the bottom
- Graviton2 and Graviton3
- Fail-silent Corruption Execution Errors (CEEs) at CPU/cores
- SOSP21 conference (Day 1)
- Using Lightweight Formal Methods to Validate a Key-Value Storage Node in Amazon S3
- Building Distributed Systems With Stateright
- Sundial: Fault-tolerant Clock Synchronization for Datacenters
- Do tightly synchronized clocks help consensus?

Databases
- Linearizability
- What's Really New with NewSQL?
- A read-only transaction anomaly under snapshot isolation
- FoundationDB Record Layer: A Multi-Tenant Structured Datastore

Misc
- Learning a technical subject
- Your attitude determines your success
- Humans of Computer Systems: Obdurodon
- Facebook: The Inside Story (2020) by Steven Levy

Previous years in review
- Year in review 2020
- Year in review 2019
- Year in review 2018
- Research, writing, and career advice

Learning a technical subject

I love learning. I wanted to write about how I learn, so I can analyze whether there is a method to this madness. I will first talk about what my learning process looks like in abstract terms, and then I'll give an analogy to make things more concrete and visual.

Learning is a messy process for me
I know some very clear thinkers. They are very organized and methodical. I am not like that. These tidy thinkers seem to learn a new subject quickly (and effortlessly) by studying the rules of the subject and then deriving everything about that subject from that set of rules. They speak in precise statements and have clear and hard-set opinions about the subject. They seem to thrive most in theoretical subjects. In my observation, those tidy learners are in the minority. Maybe the tidy thinkers are able to pull this feat off because they come from a neighboring domain/subject and map the context there to this subject quickly. But, again from my experience, it doesn't feel like that. It s

A read-only transaction anomaly under snapshot isolation

This paper, from SIGMOD Record 2004, is short and sweet. Under the snapshot isolation level, it shows a surprising example of a transaction history where a read-only transaction triggers a serialization anomaly, even when the update transactions are serializable. This is surprising because it was assumed that, under snapshot isolation, read-only transactions always execute serializably without ever needing to wait or abort because of concurrent update transactions.

Background
Snapshot isolation is an attractive consistency model for transactions. Wikipedia has a very nice summary: In databases, and transaction processing (transaction management), snapshot isolation is a guarantee that all reads made in a transaction will see a consistent snapshot of the database (in practice it reads the last committed values that existed at the time it started), and the transaction itself will successfully commit only if no updates it has made conflict with any concurrent updates made since that snapshot. Sn
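The snapshot isolation rule quoted above can be made concrete with a small sketch. This is a toy, single-process illustration under my own assumptions (a logical clock for commit timestamps and a first-committer-wins conflict check); the names `SnapshotStore` and `Transaction` are hypothetical and do not come from the paper or from any real database.

```python
class SnapshotStore:
    """Toy multi-version store (hypothetical, for illustration only)."""

    def __init__(self):
        self.versions = {}  # key -> list of (commit_ts, value), oldest first
        self.clock = 0      # logical commit timestamp

    def begin(self):
        # A transaction's snapshot is everything committed up to now.
        return Transaction(self, self.clock)

    def read_at(self, key, ts):
        # Return the last value committed at or before timestamp ts.
        value = None
        for commit_ts, v in self.versions.get(key, []):
            if commit_ts <= ts:
                value = v
        return value

    def last_commit_ts(self, key):
        history = self.versions.get(key, [])
        return history[-1][0] if history else 0


class Transaction:
    def __init__(self, store, start_ts):
        self.store = store
        self.start_ts = start_ts
        self.writes = {}  # buffered writes, invisible to others until commit

    def read(self, key):
        # Reads see the transaction's own writes, else its snapshot.
        if key in self.writes:
            return self.writes[key]
        return self.store.read_at(key, self.start_ts)

    def write(self, key, value):
        self.writes[key] = value

    def commit(self):
        # Abort if a concurrent transaction already committed a write
        # to any key we wrote (first committer wins).
        for key in self.writes:
            if self.store.last_commit_ts(key) > self.start_ts:
                return False
        self.store.clock += 1
        for key, value in self.writes.items():
            self.store.versions.setdefault(key, []).append(
                (self.store.clock, value))
        return True
```

Note that in this sketch a read-only transaction has an empty write set, so it never waits or aborts. That is the property that made people assume read-only transactions were always safe under snapshot isolation.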

Humans of Computer Systems: Ted

Programming

Q: How did you learn to program?
A: Through a project in high school.

Q: Tell us about the most interesting/significant piece of code you wrote.
A: Electronic mail system.

Q: Who did you learn most from about computer systems?
A: Edsger Dijkstra https://www.cs.utexas.edu/users/EWD/

Q: What is the best code you have seen?
A: IBM 360 operating system

Q: What do you believe are the most important skills to be successful in your field?
A: There are many paths to success, and a variety of skills to get there - no MOST IMPORTANT

Q: What quality or ability do you value most in a computer systems person?
A: The ability to explain.

Personal

Q: Which of your work/code/accomplishments are you most proud of?
A: Read, e.g., Erich Fromm - pride is not a quality that should be considered

Q: What comes to you easy that others find hard? What are your superpowers?
A: Recursion is natural to me.

Q: What was a blessing in disguise for you? What seemed like a failure at the time but led to something better later

Graviton2 and Graviton3

What do modern cloud workloads look like? And what does that have to do with new chip designs? I found these gems in Peter DeSantis's ReInvent20 and ReInvent21 talks. These talks are very informative and educational. Me likey! The speakers at ReInvent are not just introducing new products/services, but they are also explaining the thought processes behind them. To come up with this summary, I edited the YouTube video transcripts slightly (mostly shortening them). The presentation narratives have been really well planned, so I think this makes a good read.

Graviton2
This part is from the ReInvent2020 talk by Peter DeSantis. Graviton2 is the best performing general purpose processor in our cloud by a wide margin. It also offers significantly lower cost. And it's also the most power efficient processor we've ever deployed. Our plan was to build a processor that was optimized for AWS and modern cloud workloads. But, what do modern cloud workloads look like? Let's start by

Chess

The World Chess Championship is on. Magnus is the clear favorite, but Nepomniachtchi is coming on strong. It is a lot of fun. I haven't been posting for some time. And this seems like a good time to talk about chess.

My journey picking up on chess, again
As a kid I wasn't much interested in chess. My younger brother was into chess; he would buy books to review grandmasters' games. I didn't go further than playing casual games with him. Maybe it was the quarantine that triggered this, but in the last year I started playing some chess on the smartphone. I had an Android phone for the last 3 years, and I used a random chess app downloaded from the Play Store. I thought the app was very neat because it let me play against the computer at different levels and it allowed me to go back and try different things. Like Git, you know. The app also gave me hints. I thought this was a dope way to improve one's chess skills. Little did I know, I was just scratching the surface. Three yea

What’s Really New with NewSQL?

This paper is by Andy Pavlo and Matthew Aslett, and it appeared in SIGMOD Record in 2016. NoSQL managed to scale horizontally, but this came at the expense of losing transactions and rich querying capability. NewSQL followed NoSQL to amend things and restore balance to the force. NewSQL is a class of modern relational DBMSs that seek to provide scalability on par with NoSQL for OLTP read-write workloads while still maintaining ACID guarantees for transactions. Let's dissect this definition. The biggest benefit of NewSQL is that developers do not have to write code to deal with eventually consistent updates as they would in a NoSQL system, because they will be able to use ACID transactions and SQL-like rich querying capabilities. NewSQL is about OLTP (online transaction processing), not OLAP (online analytical processing, as in data warehouse systems). Well, with the caveat that it can also be about HTAP (hybrid transactional-analytical processing), as the paper mentions under the future trends secti

Rabia: Simplifying State-Machine Replication Through Randomization (SOSP'21)

This paper appeared in SOSP'21. I took notes and screen-snapshots during the presentation of this paper, and decided to put together a summary of what I understood from it. The paper has a simple idea and a somewhat unexpected result. It will be interesting to dive deep and explore the extent to which this idea can be applied in practice. Here is the idea. State-machine replication through Paxos, more accurately MultiPaxos, is commonly used in practice. (Yes, that includes state-machine replication through Raft, if I have to spell this out to the Rafters among us.) The MultiPaxos leader drives the protocol. It is basically a one-round, phase-2 execution of "accept this!" "yes, boss" between the leader and followers, where the phase-3 commit can be piggybacked onto the next phase-2 message. This is as simple and efficient as it gets. You can't beat it! Or can you? The paper argues that, although the happy path of MultiPaxos-based state-machine replication is simple and efficient
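The MultiPaxos happy path described above can be sketched roughly as follows. This is a deliberately simplified illustration under strong assumptions: a single stable leader, no ballots, no failures or message loss, and synchronous in-process calls standing in for the network. The `Leader` and `Follower` classes are hypothetical names for this sketch, not code from the Rabia paper or from any Paxos library.

```python
class Follower:
    def __init__(self):
        self.log = {}       # slot -> accepted command
        self.committed = 0  # highest slot this follower knows is committed

    def on_accept(self, slot, command, leader_commit):
        # Phase 2: the leader says "accept this!" and we answer "yes, boss".
        self.log[slot] = command
        # The phase-3 commit notice is piggybacked on the phase-2 message.
        self.committed = max(self.committed, leader_commit)
        return True


class Leader:
    def __init__(self, followers):
        self.followers = followers
        self.log = {}
        self.next_slot = 1
        self.committed = 0

    def propose(self, command):
        slot = self.next_slot
        self.next_slot += 1
        self.log[slot] = command
        acks = 1  # the leader counts its own acceptance
        for follower in self.followers:
            if follower.on_accept(slot, command, self.committed):
                acks += 1
        cluster_size = len(self.followers) + 1
        if acks > cluster_size // 2:
            # A majority accepted, so the slot is decided.
            self.committed = slot
            return True
        return False
```

In this sketch, followers learn that slot 1 is committed only when the accept message for slot 2 arrives, which is what piggybacking the commit means here. Everything that makes real MultiPaxos hard (leader election, recovery, reconfiguration) is left out.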

Log structured protocols in Delos (SOSP21)

This paper picks up from where the "Virtual Consensus in Delos" paper finishes, and talks about using Delos to build control plane databases at Facebook. These are my notes from Mahesh's excellent conference presentation. These control plane databases at Facebook were required to support multiple APIs: some SQL, some key-value pairs, and some ZooKeeper namespaces. Each such API typically requires a separate database to be built. But implementing and operating even a single zero-dependency control plane system is difficult. To resolve this dilemma, they observe that these databases have a similar structure: a consensus protocol at the bottom, and a replicated state machine on top. A lot of that state machine uses generic logic that can be reused across different APIs. This implies that there is an hourglass architecture at work here, where the Delos platform is the narrow waist. This paper focuses on protocols which allow us to layer multiple APIs on this common Delos platform.
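To make the hourglass idea a bit more tangible, here is a minimal sketch of a shared log at the bottom, generic replay logic in the middle, and API-specific apply functions on top. All names here (`SharedLog`, `ReplicatedStateMachine`, `kv_apply`, `namespace_apply`) are hypothetical and chosen for illustration; this is not the Delos API, and the in-memory list merely stands in for a consensus-backed log.

```python
class SharedLog:
    """Stand-in for the consensus layer: an ordered, append-only log."""

    def __init__(self):
        self.entries = []

    def append(self, entry):
        self.entries.append(entry)
        return len(self.entries) - 1  # position of the new entry

    def read_from(self, position):
        return self.entries[position:]


class ReplicatedStateMachine:
    """Generic logic reused across APIs: append ops, replay them in order."""

    def __init__(self, log, apply_fn):
        self.log = log
        self.apply_fn = apply_fn  # the only API-specific part
        self.state = {}
        self.applied = 0          # number of log entries applied so far

    def propose(self, op):
        self.log.append(op)
        self.sync()

    def sync(self):
        # Catch up by applying every entry we have not seen yet.
        for op in self.log.read_from(self.applied):
            self.apply_fn(self.state, op)
            self.applied += 1


def kv_apply(state, op):
    # Key-value API: ("put", key, value) or ("delete", key, None).
    kind, key, value = op
    if kind == "put":
        state[key] = value
    elif kind == "delete":
        state.pop(key, None)


def namespace_apply(state, op):
    # ZooKeeper-like API: ("create", path, data) fails if the path exists.
    kind, path, data = op
    if kind == "create" and path not in state:
        state[path] = data
```

Two replicas of the key-value database would share one `SharedLog` and converge by calling `sync()`, while a namespace database would reuse the same `ReplicatedStateMachine` class over its own log with `namespace_apply`.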

David G. Andersen AMA (SOSP Day 3)

Dave has broad research interests in computer systems in the networked environment.

Q: What do you think the systems community should be working on that isn't getting enough attention?
A: With the stuttering of Moore's law as we get into nanoscales, there is more need to extract performance from systems through integrated co-design of hardware and software. We need integrated work through the entire stack to make systems faster and more reliable.

Q: How do you pick research ideas/projects?
A: I stumble on interesting questions. There is a lot of cross-fertilization going on between different areas I am working on. If everything else fails, once a year or so, I take a notebook, go for a walk, and write down my ideas. My two sabbaticals were also very fruitful for finding research ideas and projects. At Google, I worked with AI/ML people, which opened new horizons for me.

Q: What do you think about the future prospects of blockchain systems? Will there be a killer application, like ever?
A: There are v

Jon Howell AMA at SOSP21 (Day 2)

There were several AMAs at SOSP21, where the attendees could ask questions. I really liked Jon Howell's AMA session. I asked two questions I was interested in learning about.

Q: Why do you think verification is at a breakaway point now? What are the factors that make this the go-big moment?
A: We are now figuring out how to make common problems cheap. Historically Coq proving was about being clever: do something not done before. Now we are focusing on how to make the common stuff cheap. The Ironclad project began when Rustan Leino came to our team, and said, "I have a tool, let's use this tool." The Dafny demos were interesting to me. A red squiggly line showed a bug. When you gave a hint, the bug went away. It was a lightbulb moment. It took an extra line of code/explanation for me to get it. The verifier already got it. We had verification runouts in Silicon Valley, with TLC, TLAPS, and others. The verifier sends back a question mark, and we try to decode it. What we realized was, almost

SOSP21 conference (Day 1)

SOSP, along with OSDI, is the premier conference in computer systems. SOSP was held biennially; I had attended SOSP19 in person, and shared notes and paper summaries from that. This year SOSP is virtual, which makes it a lot easier to travel to. It is nice attending the conference from the convenience of your home. The experience is not too inferior to attending a physical conference if you take the convenience factor into account. Here are some notes from SOSP21. If you find an interesting paper, you can dig in, because:
- All SOSP papers are available as open access
- All the presentations are also available on YouTube

Opening
The conference opened with announcements from the program committee (PC) chairs. SOSP21 had 348 submissions from 2078 authors, an average of about 6 authors per paper. 54 out of 348 papers were accepted. Reviews were conducted by 64 PC members, who produced 1500+ reviews. The trend is clear: the number of submissions is growing steeply. Last year OSDI had announced i
