Showing posts with the label benchmarks

Take Out the TraChe: Maximizing (Tra)nsactional Ca(che) Hit Rate

This paper appeared in OSDI 2023 in July, and is authored by Audrey Cheng, David Chu, Terrance Li, Jason Chan, Natacha Crooks, Joseph M. Hellerstein, and Ion Stoica (UC Berkeley) and Xiangyao Yu (University of Wisconsin-Madison). The paper seems to break the acronym rule. It defines one acronym, TraChe, in the title, but uses another one, Detox, for the rest of the paper. I guess I agree: referring to your system as Trache is not very appealing. I kid, because I love the authors. Looks like I cover almost any paper Audrey Cheng writes these days. Speaking of Audrey, she wrote a brief summary of this paper here: https://audreyccheng.com/blog/transaction-caching. Hmm, she didn't use the word "TraChe" in her post even once. I wonder which one of the advisors on the paper coined the typo TraChe, and pushed it to the title. Ok, enough with the trache talk.

The Problem

You have been doing caching wrong for your transactional workloads! That ...

Characterizing Microservice Dependency and Performance: Alibaba Trace Analysis

This paper got the best paper award at SOCC 2021. The paper conducts a comprehensive study of large-scale microservices deployed in Alibaba clusters. They analyze the behavior of more than 20,000 microservices in a 7-day period and profile their characteristics based on the 10 billion call traces collected. They find that microservice graphs are dynamic at runtime; most graphs are scattered to grow like a tree; and the size of call graphs follows a heavy-tail distribution. Based on their findings, they offer some practical tips about improving microservice runtime performance. They also develop a stochastic model to simulate microservice call graph dependencies and show that it approximates the dataset they collected (which is available at https://github.com/alibaba/clusterdata ).

What are microservices?

Microservices is a software development approach that divides an application into independently deployable services, owned by small teams organized around business capabilities. Each servi...

Polyjuice: High-Performance Transactions via Learned Concurrency Control (OSDI'21)

This paper appeared in OSDI 2021. I really like this paper. It is informative: it taught me new things about concurrency control techniques. It is novel: it shows a practical application of simple machine learning to an important systems problem, concurrency control. It shows significant benefits for a limited but reasonable setup. It has good evaluation coverage and explanation. It is a good followup paper to the benchmarking papers we have been looking at recently. So let's go on learning about how Polyjuice was brewed.

Problem and motivation

There is no supreme concurrency control (CC) algorithm for all conditions. Different CC algorithms return the best outcome under different conditions. Consider two extreme CC algorithms. Two-phase locking (2PL) waits for every dependent transaction to finish. Optimistic concurrency control (OCC) doesn't track or wait for any dependent transaction, but validates at the end. We find that OCC is better with less contention, because it avoids...

TAOBench: An End-to-End Benchmark for Social Network Workloads

TAOBench is an open-source benchmarking framework that captures the social graph workload at Meta (who am I kidding, I'll call it Facebook). This paper (which will appear at VLDB'2022) studies the production workloads of Facebook's social graph datastore TAO, and distills them to a small set of representative features. The integrity of TAOBench's workloads is validated by testing them against their production counterparts. The paper also describes several use cases of TAOBench at Facebook. Finally, the paper uses TAOBench to evaluate five popular distributed database systems (Spanner, CockroachDB, Yugabyte, TiDB, PlanetScale). The paper is a potpourri of many subquests. It feels a bit unfocused, but maybe I shouldn't complain because the paper is full to the brim, and provides three papers for the price of one.

TAO

TAO is a read-optimized geographically distributed in-memory data store that provides access to Facebook's social graph for diverse products and b...
