SIGMOD panel: Future of Database System Architectures

I mentioned this panel in my SIGMOD/PODS day 2 writeup. 

The panel consisted of (from right to left) Gustavo Alonso (ETH), Swami Sivasubramanian (AWS), Anastasia Ailamaki (EPFL), Raghu Ramakrishnan (Microsoft), and Sam Madden (MIT). Sailesh Krishnamurthy from Google was unable to attend, so Anastasia, who is on sabbatical at Google, also covered for him. The panel started with 5-minute speeches from the participants. The panel was lively and playful. The panelists tried to make their speeches controversial, and to have zingers for each other.


Sam talked about the demise of the DBMS monolith. He said the Oracle of the 90s is dead because it was incompatible with the modern data ecosystem, and modern DBMSs are shifting towards disaggregated designs. When Sam was a PhD student under Eric Brewer, Eric said we should build disaggregated databases, and people said that won't happen. But we are there now: Disaggregation has arrived! While Sam mentioned the benefits of disaggregation, he also cautioned about the hype, and warned that a big microservices architecture, if not done right, can result in complexity and expensive cloud bills. He said his team at MIT is focused on building better-designed disaggregated systems, with better fault-tolerance, adaptive storage systems, smart offloading, and management of complex multi-system cloud deployments.

Gustavo's talk was very funny. He delivered zingers in a deadpan tone, and got the crowd loling. He said he is a modern person, and has learned to replace bad words with more politically correct versions. So he learned that instead of saying fucked up, it is more appropriate to say disaggregated, as in "I am feeling really disaggregated today". He said the point of disaggregation is somebody selling you something, because disaggregation generates a lot of data movement back and forth. He said that the Oracle cloud runs on Exadata, but it is closer to old mainframes. He argued that disaggregation will increase latency significantly, and you need to compensate with caching, which will disaggregate you. He called disaggregation a "bad idea that spreads: from storage to memory".

Anastasia's talk was titled "DB engines and hardware: p(r)ay as you go". Instead of disaggregation in the architecture, she talked about what comes next. Previously, coprocessors did the mundane work, and the CPU did the real/heavy work. Now, accelerators do the real/heavy work, and the CPU takes a supporting role. She said now is the real hardware-software codesign era, where the software and hardware need to adapt to each other. She talked about their ProteusDB project. She said coding is headed to extinction, with specs being written in a high-level language and with the advances in LLMs. She said we can use LLMs and DBs for generating graph algorithms and parallel query processing on GPUs. But in the future we will have more than that: we will have self-correcting systems that can optimize hardware and software, and self-expanding systems.

Swami said that when Raghu invited him to be on the panel, he didn't know he would have to disagree with his PhD advisor Gustavo. He said that disaggregation has already arrived. He took a customer-focused view and said that the boundary between analytics, transactional, and ML is irrelevant for customers, and these are artificial distinctions of the research community that need to die. He built on the hardware-software codesign theme Anastasia mentioned. He said that humans are not good at high-cardinality problems, this is where ML helps, and there is not enough investment in how to use ML for building DBs. Being on-call at 2am, debugging, makes you appreciate these things. He said, being known as the NoSQL guy, he would controversially claim that "SQL is going to die" because LLMs are going to reinvent spec, and allow natural-language-based coding. (Well, the SQL paper was titled "SEQUEL: A structured English query language", so SQL will likely survive this one as well :-). LLMs are going to change how data-driven apps are built, and the attention will move to making guardrail decisions about correctness/security when using LLMs. In a nondeterministic LLM world, how do you ensure correctness? What is the interface to the database? He said that this is going to be such an exciting time to be in industry, as the walls are coming down between various database personas (user archetypes/skills), and we are given a blank slate to reinvent the world.

Raghu said "Sorry Gustavo, you are wrong, disaggregated architecture is here, and is not going anywhere. In a disaggregated architecture, storage is fungible, and computing scales independently. Customer value is here, and the technical problems will be solved in time. Thanks to the complexities of disaggregation problems, every assistant prof is going to get tenure figuring out how to solve them." He said LLMs are not controversial for this demographic, and he believes that for anything someone does with a DB, there will be an interactive LLM model right next to it. Productivity will be a metric we will track. Assembly is not taught today, and in the future, programming languages are not going to be taught. Natural language is the new programming language.

After these speeches, a Q&A session started.

There was some discussion about who is the liability agent, if LLMs are not liable.

There was discussion about the productivity LLMs afford. Swami and Raghu argued that the speed of development is much more important, and ease of use trumps everything else. Disaggregation or not, with the cloud you can build 10-100x faster than trying to do it on-prem. Customers want ease of use, and fast time to production.

One question was, how do we measure productivity? The things that win are not the fastest but the easiest to use, and academia is misaligned on this. A panel member (I think Raghu) suggested: "find your colleagues in the sociology dept and team up with them, form user research groups, and evaluate productivity gains and ease of use".

There was more discussion about disaggregation. The panel mentioned more benefits of disaggregation. Thanks to disaggregation, cloud gives you constant time recovery. Disaggregation also simplifies the management of data by providing separation of concerns, and making the database disappear. You shouldn't have to choose how many machines/resources you need.

Someone asked whether there is a conflict of interest for cloud companies adopting disaggregation, as this may mean increased cloud bills in some cases. Panel members mentioned how the cloud providers dropped prices, and that they are thinking of how to get to the slow-to-move regions of the world, rather than increasing profit margins.

There was discussion about how the SIGMOD attendees can be more entrepreneurial. Sam said there are great experts in this room: start companies and get the money, because people who are less capable are getting 100 million dollars starting companies. The recipe was: pick something you are an expert in, with a big user domain, then use LLMs to disrupt it.

Gustavo was not buying the LLM recipe. He said, if he were to do a startup, he would build verticals, particular niche architectures.

There was friendly banter and plenty of zingers. Gustavo delivered (with perfect comedic timing) many zingers at his student Swami. Swami was very respectful, and only laughed at these along with the crowd. He won my sympathy and respect big time.

At one point, as AWS and Azure were getting most of the coverage in zingers (maybe since Sailesh from Google was unable to attend), an audience member from the Google BigQuery team reminded people that there is another cloud provider besides AWS and Azure that people can complain about.

There was even more rejoicing about ML and LLMs. Some select quotes included: "These will provide access to information not data." "Databases are exciting now, as fundamental limits are blown up." "LLMs are like the internet, big inflection point!"

This was Tuesday afternoon. On Wednesday morning, I visited a friend and collaborator at Microsoft Research. He was unable to give me guest wifi access because the email link he got for approving my access failed to resolve an address. I later asked him to print a paper for me, and he said printing from a Mac laptop required a long/intrusive configuration process and he hadn't configured his laptop. (He is a very skilled developer and is very technical, he just refused to follow the lengthy configuration instructions.) This was a good reality check. After the LLM-fueled future of databases panel, it was back to the present day of real software systems.

Okay, as a balancing counterpoint, on Wednesday and Thursday there were some presentations and demos about how prototype systems used LLMs for querying databases. But there is still an arduous process going from prototypes and research papers to actual products.

