Posts

Showing posts with the label databases

Transaction Healing: Scaling Optimistic Concurrency Control on Multicores

This paper from SIGMOD 2016 proposes a transaction healing approach to improve the scalability of Optimistic Concurrency Control (OCC) in main-memory OLTP systems running on multicore architectures. Instead of discarding the entire execution when validation fails, the system repairs only the inconsistent operations, which improves throughput in high-contention scenarios. If this sounds familiar, it's because we recently reviewed the Morty paper from EuroSys 2023, which applied healing ideas to interactive transactions using continuations to support re-execution. This 2016 Transaction Healing paper is scoped to static stored procedures, and focuses on integrating healing into OCC for that setting.

Key Ideas

OCC works well under low contention because it separates reads from writes and keeps critical sections short (only for validation). But under high contention, especially in workloads with skewed access patterns (like Zipfian distributions), transactions are frequent...
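The validate-then-heal commit loop can be sketched roughly as follows. This is a minimal illustration of the idea, not the paper's implementation; the names (`Txn`, `validate`, `heal`) and the single-threaded versioned store are my own simplifications, and for brevity `heal` re-runs the whole transaction logic where the paper re-executes only the affected operations:

```python
# Sketch of OCC with transaction healing (hypothetical names). Each
# transaction tracks the versions it read; at commit time we validate
# those reads, and on failure we repair the stale reads and re-execute
# rather than aborting the whole transaction.

store = {"x": (0, 10), "y": (0, 20)}  # key -> (version, value)

class Txn:
    def __init__(self):
        self.reads = {}    # key -> version observed
        self.writes = {}   # key -> new value

    def read(self, key):
        version, value = store[key]
        self.reads[key] = version
        return value

    def write(self, key, value):
        self.writes[key] = value

def validate(txn):
    """Return keys whose version changed since the transaction read them."""
    return [k for k, v in txn.reads.items() if store[k][0] != v]

def heal(txn, stale_keys, logic):
    """Repair stale reads, then re-execute (the paper re-runs only the
    operations that depend on the stale reads)."""
    for k in stale_keys:
        txn.reads.pop(k)
    logic(txn)

def commit(txn, logic):
    while True:
        stale = validate(txn)
        if not stale:
            for k, v in txn.writes.items():
                version, _ = store.get(k, (0, None))
                store[k] = (version + 1, v)
            return
        heal(txn, stale, logic)

def transfer(txn):
    x = txn.read("x")
    txn.write("y", x + 1)

t = Txn()
transfer(t)
commit(t, transfer)
```

Under low contention the loop exits after one validation, matching OCC's short critical section; under contention, healing replaces the abort-and-retry that would otherwise waste the whole execution.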

Morty: Scaling Concurrency Control with Re-Execution

This EuroSys '23 paper reads like an SOSP best paper. Maybe it helped that EuroSys 2023 was in Rome; academic conferences are more enjoyable when the venue doubles as a vacation.

The Problem

Morty tackles a fundamental question: how can we improve concurrency under serializable isolation (SER), especially without giving up on interactive transactions? Unlike deterministic databases (e.g., Calvin) that require transactions to declare their read and write sets upfront, Morty supports transactions that issue dynamic reads and writes based on earlier results. Transactional systems, particularly in geo-replicated settings, struggle under contention: high WAN latency stretches transaction durations, increasing the window for conflicts. The traditional answer is blind exponential backoff, but that leads to low CPU utilization; as Morty's evaluation shows, TAPIR and Spanner replicas often sit below 17% utilization under contention. Morty's approach to tackle the problem is to start from...

Everything is a Transaction: Unifying Logical Concurrency Control and Physical Data Structure Maintenance in Database Management Systems

This paper from CIDR'21 introduces the Deferred Action Framework (DAF), which aims to unify transaction control and physical data structure maintenance in multi-version concurrency control (MVCC) database systems, particularly for complex maintenance tasks like garbage collection and index cleanup.

In MVCC systems, transactions and data maintenance tasks (like garbage collection) often interfere, requiring complex coordination. This hurts both performance and code maintainability, since physical structures (e.g., B+ trees) aren't inherently designed to handle multi-versioning. The paper proposes DAF to handle deferred actions: maintenance tasks that are queued to execute only when they no longer interfere with active transactions. DAF relies on timestamp-based ordering and executes tasks only when their actions won't affect concurrent transactions. DAF guarantees to process actions deferred at some timestamp t only after all transactions started before t have ...
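The deferral rule (run an action deferred at timestamp t only after every transaction that started before t has finished) can be sketched as a timestamp-ordered queue drained up to a safe horizon. A minimal single-threaded illustration with hypothetical names, not the paper's code:

```python
import heapq

# Sketch of a deferred-action queue in the spirit of DAF (hypothetical
# names). An action deferred at timestamp t runs only once no active
# transaction with a start timestamp before t remains.

class DeferredActionQueue:
    def __init__(self):
        self.clock = 0
        self.active = set()   # start timestamps of running transactions
        self.queue = []       # min-heap of (deferred_ts, action)

    def begin_txn(self):
        self.clock += 1
        self.active.add(self.clock)
        return self.clock

    def finish_txn(self, ts):
        self.active.remove(ts)
        self._drain()

    def defer(self, action):
        self.clock += 1
        heapq.heappush(self.queue, (self.clock, action))
        self._drain()

    def _drain(self):
        # Safe horizon: the oldest still-active start timestamp. Any
        # action deferred before it can no longer interfere.
        horizon = min(self.active) if self.active else self.clock + 1
        while self.queue and self.queue[0][0] < horizon:
            _, action = heapq.heappop(self.queue)
            action()

ran = []
q = DeferredActionQueue()
t1 = q.begin_txn()
q.defer(lambda: ran.append("gc-old-version"))  # blocked: t1 is older
assert ran == []                               # GC held back
q.finish_txn(t1)
assert ran == ["gc-old-version"]               # now safe to run
```

The appeal of the design is that garbage collection, index cleanup, and similar tasks all become instances of the same deferral primitive instead of each carrying its own coordination logic.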

DBSP: Automatic Incremental View Maintenance for Rich Query Languages

Incremental computation represents a transformative (!) approach to data processing. Instead of recomputing everything when your input changes slightly, incremental computation aims to reuse the original output and efficiently update the results. Efficiently here means performing work proportional only to the input and output changes. This paper introduces DBSP, a programming language inspired by signal processing (hence the name DB-SP). DBSP is simple, yet it offers extensive computational capabilities. With just four operators, it covers complex database queries, including the entire relational algebra, set and multiset computations, nested relations, aggregations, recursive queries, and streaming computations.

Basic DBSP operators

The language is designed to capture computation on streams. Streams are represented as infinite vectors indexed by consecutive time. Each stream can represent values of any type and incorporates basic mathematical operations like addition, subtraction, and ...
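The stream operators can be illustrated over finite integer streams. This is a toy sketch of the ideas (lifting, delay, differentiation, integration), not the DBSP implementation; DBSP streams carry abelian-group values (such as Z-weighted sets), so plain integers stand in here:

```python
# Toy sketch of DBSP-style stream operators. A stream is modeled as a
# finite list of values indexed by time; values must support + and -.

def delay(s):
    """z^-1: shift the stream by one step, emitting 0 first."""
    return [0] + s[:-1]

def differentiate(s):
    """D(s)[t] = s[t] - s[t-1]: the change (delta) at each step."""
    return [v - d for v, d in zip(s, delay(s))]

def integrate(s):
    """I(s)[t] = s[0] + ... + s[t]: accumulate deltas back into totals."""
    out, total = [], 0
    for v in s:
        total += v
        out.append(total)
    return out

def lift(f):
    """Apply a scalar function pointwise over time."""
    return lambda s: [f(v) for v in s]

s = [1, 3, 6, 10]
assert integrate(differentiate(s)) == s  # D and I invert each other
assert differentiate(integrate(s)) == s
assert lift(lambda x: 2 * x)(s) == [2, 6, 12, 20]
```

The inversion between D and I is what makes incrementalization mechanical: a query over full states can be converted into one over deltas by wrapping it between differentiation and integration.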

HPTS'24 Day 1, part 1

Wow, what a week that was! The two days of HPTS (Monday and Tuesday this week) felt like a week to me. I learned a lot, had many good conversations, and even squeezed in some beach walks. HPTS has been operating since 1985, convening mostly every two years. It has been described as Davos for database systems. Pat Helland has been running HPTS since 1989, and as usual he kicked it off with some short opening remarks at 8:30am Monday morning. Pat reminisced about the early HPTSs, which discussed cutting-edge work on scalable computing, punching past 100 txn on a mainframe! Then he quickly set the context of the workshop. This event is about meeting people. There are 130 people, with 20+ of them students. He said, "Your job is to meet everyone; HPTS is about the connections you make." The HPTS website also emphasizes this: "HPTS is about the breaks. The presentations punctuate the breaks! HPTS exists to promote community and relationships. This comm...

Making database systems usable

C. J. Date's Sigmod 1983 keynote, "Database Usability", was prescient. Usability is the most important thing to customers. They care less about impressive benchmarks or clever algorithms, and more about whether they can operate and use a database efficiently to query, update, analyze, and persist their data with minimal headache. (BTW, does anyone have a link to the contents of this Sigmod'83 talk? There is no transcript around, except for this short abstract.) The paper we cover today is from Sigmod 2007. It takes the database usability problem raised in that 1983 keynote head-on, and calls out that the king is still naked.

Let's give some context for the year 2007. Yes, the XML format was still popular then; the use-case in the paper is XQuery. The paper does not contain any reference to JSON. MongoDB would be released in 2009 with the document model, and that seems to be great timing for some of the usability pains mentioned in the paper! Web 2.0 was in ...

Understanding the Performance Implications of Storage-Disaggregated Databases

Storage-compute disaggregation in databases has emerged as a pivotal architecture in cloud environments, as evidenced by Amazon (Aurora), Microsoft (Socrates), Google (AlloyDB), Alibaba (PolarDB), and Huawei (Taurus). This approach decouples compute from storage, allowing for independent and elastic scaling of compute and storage resources. It provides fault tolerance at the storage level. You can then share the storage across other services, such as adding read-only replicas for the database. You can even use the storage level for easier sharding of your database. Finally, you can also use it to export a changelog asynchronously to feed into peripheral cloud services, such as analytics.

Disaggregated architecture was the topic of a Sigmod'23 panel. I think this quote summarizes the industry's thinking on the topic: "Disaggregated architecture is here, and is not going anywhere. In a disaggregated architecture, storage is fungible, and computing scales independently. ...
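The core data flow in these designs can be sketched in a few lines: the primary compute node ships only log records to the storage service, which materializes pages by replaying the log, so read replicas can read a page once the log has been applied up to the LSN they need. This is a simplified illustration of the general pattern, not any vendor's actual API or protocol:

```python
# Sketch of the "ship the log, not the pages" flow common to
# storage-disaggregated databases (hypothetical names, heavily
# simplified: no replication, quorums, or real page format).

class StorageService:
    def __init__(self):
        self.log = []          # (lsn, page_id, new_value)
        self.pages = {}        # page_id -> materialized value
        self.applied_lsn = 0

    def append_log(self, page_id, new_value):
        """Primary compute node writes a redo log record only."""
        lsn = len(self.log) + 1
        self.log.append((lsn, page_id, new_value))
        return lsn

    def replay(self):
        """Background apply loop: turn log records into page versions."""
        for lsn, page_id, value in self.log[self.applied_lsn:]:
            self.pages[page_id] = value
            self.applied_lsn = lsn

    def read_page(self, page_id, min_lsn):
        """A read replica needs the log applied at least up to min_lsn."""
        while self.applied_lsn < min_lsn:
            self.replay()  # stand-in for waiting on the apply loop
        return self.pages[page_id]

storage = StorageService()
lsn = storage.append_log("page-1", "hello")  # compute ships log only
value = storage.read_page("page-1", lsn)     # replica reads applied page
```

The same log stream is what makes the other benefits cheap: read replicas, sharding at the storage layer, and the asynchronous changelog export all consume the log the primary already produces.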

Popular posts from this blog

Hints for Distributed Systems Design

My Time at MIT

Scalable OLTP in the Cloud: What’s the BIG DEAL?

Foundational distributed systems papers

Advice to the young

Learning about distributed systems: where to start?

Distributed Transactions at Scale in Amazon DynamoDB

Making database systems usable

Looming Liability Machines (LLMs)

Analyzing Metastable Failures in Distributed Systems