Posts

Showing posts from July, 2016

Kafka, Samza, and the Unix Philosophy of Distributed Data

Image
This paper is very related to the "Realtime Data Processing at Facebook" paper I reviewed in my previous post. As I mentioned there Kafka does basically the same thing as Facebook's Scribe, and Samza is a stream processing system on Kafka. This paper is very easy to read. It is delightful in its simplicity. It summarizes the design of Apache Kafka and Apache Samza and compares their design principles to the design philosophy of Unix, in particular, Unix pipes. Who says plumbing can't be sexy? (Seriously, don't Google this.) So without further ado, I present to you Mike Rowe of distributed systems. Motivation/Applications I had talked about the motivation and applications of stream processing in the Facebook post. The application domain is basically building web services that adapt to your behaviour and personalize on the fly, including Facebook, Quora, Linkedin, Twitter, Youtube, Amazon, etc. These webservices take in your most recent actions (likes, cli

Realtime Data Processing at Facebook

Image
Recently there has been a lot of development in realtime data processing systems, including Twitter's Storm and Heron, Google's Millwheel, and LinkedIn's Samza. This paper presents Facebook's Realtime data processing system architecture and its Puma, Swift, and Stylus stream processing systems. The paper is titled "Realtime Data Processing at Facebook" and it appeared at Sigmod'16, June 26-July 1. Motivation and applications Facebook runs hundreds of realtime data pipelines in productions. As a motivation of the realtime data processing system the paper gives Chorus as an example. The Chorus data pipeline transforms a stream of individual Facebook posts into aggregated, anonymized, and annotated visual summaries. E.g., what are the top 5 topics being discussed for the election today? What are the demographic breakdowns (age, gender, country) of World Cup fans? Another big application is the mobile analytics pipelines that provide realtime feedback fo

Efficient Replication of Large Data Objects

Image
This paper appeared in DISC 2003 , and describes an application of the ABD replicated atomic storage algorithm  for replication of large objects. When objects being replicated is much larger than the size of the metadata (such as tags or pointers), it is efficient to tradeoff performing cheaper operations on the metadata in order to avoid expensive operations on the data itself. The basic idea of the algorithm is to separately store copies of the data objects in replica servers , and information about where the most up-to-date copies are located in directory servers . This Layered Data Replication (LDR) approach adopts the ABD algorithm for atomic fault-tolerant replication of the metadata, and prescribes how the replication of the data objects in the replica servers can accompany replication of the metadata in directory servers in a concurrent and consistent fashion: In order to read the data, a client first reads the directories to find the set of up-to-date replicas, then reads

Replex: A Scalable, Highly Available Multi-Index Data Store

Image
This paper received the best paper award at Usenix ATC'16 last week. It considers a timely important problem. With NoSQL databases, we got scalability, availability, and performance, but we lost secondary keys.  How do we put back the secondary indices, without compromising scalability, availability, and performance. The paper mentions that previous work on Hyperdex  did a good job of re-introducing secondary keys to NoSQL , but with overhead: Hyperdex generates and partitions an additional copy of the datastore for each key. This introduces overhead for both storage and performance: supporting just one secondary key doubles storage requirements and write latencies. Replex adds secondary keys to NoSQL databases without that overhead. The key insight of Replex is to combine the need to replicate for fault-tolerance and the need to replicate for index availability. After replication, Replex has both replicated and indexed a row, so there is no need for explicit indexing. How

Popular posts from this blog

Hints for Distributed Systems Design

Learning about distributed systems: where to start?

Making database systems usable

Looming Liability Machines (LLMs)

Foundational distributed systems papers

Advice to the young

Linearizability: A Correctness Condition for Concurrent Objects

Understanding the Performance Implications of Storage-Disaggregated Databases

Scalable OLTP in the Cloud: What’s the BIG DEAL?

Designing Data Intensive Applications (DDIA) Book