Posts

My Time at MIT

Twenty years ago, in 2004-2005, I spent a year at MIT’s Computer Science department as a postdoc working with Professor Nancy Lynch. It was an extraordinary experience. Life at MIT felt like paradise, and leaving felt like being cast out.
MIT Culture
MIT’s Stata Center was the best CS building in the world at the time. Designed by Frank Gehry, it was a striking masterpiece of abstract architecture (although, like all abstractions, it was a bit leaky). Furniture from Herman Miller complemented this design. I remember seeing price tags of $400 on simple yellow chairs. The building buzzed with activity. Every two weeks, postdocs were invited to the faculty lunch on Thursdays, and on alternating weeks we had group lunches. Free food seemed to materialize somewhere in the building almost daily, and the food trucks outside were also good. MIT thrived on constant research discussions, collaborations, and talks. Research talks were advertised on posters at the urinals, as a practical touch of M...

Hanging in there

I have been reviewing papers for USENIX ATC and handling work stuff at MongoDB Research. I cannot blog about either yet. So, instead of a paper review or technical blog, I share some random observations and my current mood. Bear with me as I vent here. You may disagree with some of my takes. Use the comments to share your thoughts. Damn, Buffalo. Your winter is brutal and depressing. (Others on r/Buffalo also suffer; many suggest video games, drugs, or drinking.) After 20 years of Buffalo winters, I am fed up with the snow and cold. When I taught the distributed systems course in fall semesters, I would ask the new batch of Indian students how many had never seen snow, and all hands would shoot up. I would warn them that by winter's end they would despise that magic fairy dust. Ugh, sidewalks are already piled high with snow that freezes, muddies, and decays into holes. Forgive the gloomy start. I had a bad flu ten days ago. Just as I began to recover, another milder bout struck. My joint...

Intelligence wants to be everywhere

Imagine a world where intelligence permeates every corner of existence, from the devices in your home to the trees in your backyard. This is a world where everything is alive with contemplation, purpose, and the ability to learn, adapt, and grow. A world where intelligence radiates from everywhere.
Ubiquitous AI
In mathematics, one way to understand a concept is to push it to its extremes. Let's apply that to AI. Enabled by the rapid advancements in LLMs, inference capabilities, chip efficiency, and energy availability, imagine a future where AGI is embedded in the fabric of our lives, radiating from everyday objects. Technology has always moved toward the ethereal. We went from horses to cars powered by liquid fuel, and then to electric vehicles that run on invisible currents of energy. Electricity is easier to transmit, store, and harness than gasoline. I was struck by this recently when I saw an electric car charging in a remote state park. No gas stations and no pipelines aroun...

GaussDB-Global: A Geographically Distributed Database System

This paper, presented in the industry track of ICDE 2024, introduces GaussDB-Global (GlobalDB), Huawei's geographically distributed database system. GlobalDB replaces the centralized transaction management (GTM) of GaussDB with a decentralized system based on synchronized global clocks (GClock). This design mirrors Google Spanner's TrueTime approach and its commit-wait technique, which provides externally serializable transactions by waiting out the uncertainty interval. However, GlobalDB claims compatibility with commodity hardware, avoiding the need for specialized networking infrastructure for synchronized clock distribution. The GClock system uses GPS receivers and atomic clocks as the global time source device at each regional cluster. Each node synchronizes its clock with the global time source over TCP every 1 millisecond. Clock deviation is kept low because synchronization is achieved within 60 microseconds as a TCP round trip, and the CPU’s clock drift is bound...

Use of Time in Distributed Databases (part 5): Lessons learned

This concludes our series on the use of time in distributed databases, where we explored how the use of time in distributed systems evolved from a simple ordering mechanism to a sophisticated tool for coordination and performance optimization. A key takeaway is that time serves as a shared reference frame that enables nodes to make consistent decisions without constant communication. While the AI community grapples with alignment challenges, in distributed systems we have long confronted our own fundamental alignment problem. When nodes operate independently, they essentially exist in their own temporal universes. Synchronized time provides the global reference frame that bridges these isolated worlds, allowing nodes to align their events and states coherently. At its core, synchronized time serves as an alignment mechanism in distributed systems. As explored in Part 1, synchronized clocks enable nodes to establish "common knowledge" through a shared time reference, which is pow...

Use of Time in Distributed Databases (part 4): Synchronized clocks in production databases

This is part 4 of our "Use of Time in Distributed Databases" series. In this post, we explore how synchronized physical clocks enhance production database systems.
Spanner
Google's Spanner (OSDI'12) implemented a novel approach to handling time in distributed database systems through its TrueTime API. The TrueTime API provides time as an interval that is guaranteed to contain the actual time, with uncertainty maintained within about 6ms (this is the number published in 2012, which has improved significantly since then) using GPS receivers and atomic clocks. This explicit handling of time uncertainty allows Spanner to provide strong consistency guarantees while operating at a global scale. Spanner uses multi-version concurrency control (MVCC) and achieves external consistency (linearizability) for current transactions through techniques like "commit wait," where transactions wait out the uncertainty in their commit timestamps before making their writes visible. Spanner uses ...
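The commit-wait idea in the excerpt above can be illustrated with a small Python sketch. This is only an illustration under stated assumptions, not Spanner's actual interface: `UncertainClock`, `TTInterval`, `commit_with_wait`, and the `epsilon_s` parameter are hypothetical names, and the local monotonic clock stands in for a real GPS/atomic-clock time source.

```python
import time
from dataclasses import dataclass


@dataclass(frozen=True)
class TTInterval:
    """Interval assumed to contain the true time."""
    earliest: float
    latest: float


class UncertainClock:
    """Hypothetical stand-in for a TrueTime-style clock (not Spanner's real API)."""

    def __init__(self, epsilon_s, source=time.monotonic):
        self.epsilon_s = epsilon_s  # assumed one-sided uncertainty bound, in seconds
        self._source = source

    def now(self):
        t = self._source()
        return TTInterval(t - self.epsilon_s, t + self.epsilon_s)

    def after(self, ts):
        # True only once ts has definitely passed on every correct clock.
        return self.now().earliest > ts


def commit_with_wait(clock, apply_writes, poll_s=0.0005):
    """Pick a commit timestamp, wait out the uncertainty, then expose the writes."""
    commit_ts = clock.now().latest
    while not clock.after(commit_ts):
        time.sleep(poll_s)
    apply_writes(commit_ts)  # writes become visible only after the wait
    return commit_ts
```

With this construction the wait lasts roughly twice the uncertainty bound, which is why a smaller bound translates directly into lower commit latency.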

I Can’t Believe It’s Not Causal! Scalable Causal Consistency with No Slowdown Cascades

I recently came across the Occult paper (NSDI'17) during my series on "The Use of Time in Distributed Databases." I had high expectations, but my in-depth reading surfaced significant concerns about its contributions and claims. Let me share my analysis, as there are still many valuable lessons to learn from Occult about causality maintenance and distributed systems design.
The Core Value Proposition
Occult (Observable Causal Consistency Using Lossy Timestamps) positions itself as a breakthrough in handling causal consistency at scale. The paper's key claim is that it's "the first scalable, geo-replicated data store that provides causal consistency without slowdown cascades." The problem they address is illustrated in Figure 1, where a slow/failed shard A (with delayed replication from master to secondary) can create cascading delays across other shards (B and C) due to dependency-waiting during write replication. This is what the paper means by "...

Use of Time in Distributed Databases (part 3): Synchronized clocks in databases

This is part 3 of our "Use of Time in Distributed Databases" series. In this post, we explore how synchronized physical clocks enhance database systems, focusing on research and prototype databases. Discussion of time's role in production databases will follow in our next post. To begin, let's revisit the utility of synchronized clocks in distributed systems. As highlighted in Part 1, synchronized clocks provide a shared time reference across distributed nodes and partitions. For simple, single-key replication tasks, such precision is often unnecessary and leader-based approaches such as MultiPaxos or Raft are much more appropriate. Even WPaxos might be considered if you need a WAN deployment. Of course, if you want to go very fancy by using leaderless designs, such as those in the EPaxos family/Tempo/Accord, then dependency graphs and time synchronization re-enter the picture. The true value of synchronized clocks becomes apparent in distributed multi-key operat...

Use of Time in Distributed Databases (part 2): Use of logical clocks in databases

This is part 2 of our "Use of Time in Distributed Databases" series. We talk about the use of logical clocks in databases in this post. We consider three different approaches: vector clocks, dependency graph maintenance, and epoch service. In the upcoming posts we will allow in physical clocks for timestamping, so there are no (almost no) physical clocks involved in the systems in part 2.
1. Vector clocks
Dynamo: Amazon's highly available key-value store (SOSP'07)
Dynamo employs sloppy quorums and hinted hand-off and uses version vectors (a special case of vector clocks) to track causal dependencies within the replication group of each key. A version vector contains one entry for each replica (thus the size of clocks grows linearly with the number of replicas). The purpose of this metadata is to detect conflicting updates and to be used in the conflict reconciliation function. Dynamo provides eventual consistency thanks to this reconciliation function and conflict detect...
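To make the conflict-detection role of version vectors concrete, here is a minimal Python sketch. It is a generic illustration of the technique, not Dynamo's code: the function names (`increment`, `descends`, `compare`, `merge`) and the replica identifiers are hypothetical.

```python
from typing import Dict

# A version vector maps a replica id to the number of updates seen from it.
VersionVector = Dict[str, int]


def increment(vv: VersionVector, replica: str) -> VersionVector:
    """Return a copy of vv with the given replica's counter bumped by one."""
    out = dict(vv)
    out[replica] = out.get(replica, 0) + 1
    return out


def descends(a: VersionVector, b: VersionVector) -> bool:
    """True if a has seen every update that b has (entrywise a >= b)."""
    return all(a.get(r, 0) >= c for r, c in b.items())


def compare(a: VersionVector, b: VersionVector) -> str:
    """Classify two versions as equal, ordered, or concurrent (conflicting)."""
    a_ge, b_ge = descends(a, b), descends(b, a)
    if a_ge and b_ge:
        return "equal"
    if a_ge:
        return "after"
    if b_ge:
        return "before"
    return "concurrent"


def merge(a: VersionVector, b: VersionVector) -> VersionVector:
    """Entrywise maximum, used to stamp a reconciled version."""
    return {r: max(a.get(r, 0), b.get(r, 0)) for r in set(a) | set(b)}
```

Two versions that compare as "concurrent" are the ones a reconciliation function would need to resolve; the merged vector then dominates both inputs.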
