Posts

Showing posts from March, 2014

How to write your research paper

The legend of Köroğlu I will start with a story of oppression and an uprising, involving a mythical horse in the process. Way to start a post about writing research papers, huh? In 16th century Anatolia, there was a corrupt and oppressive mayor in the Bolu state. One day the mayor decided that he should find the best horse in the world and gift it to the Sultan. He contacted a very skilled horse breeder. The breeder said that a horse deserving of the Sultan should be very special, and that none of his own horses was worthy. He went on a quest for this horse himself. One day he saw some street kids abusing a feeble and awkward-looking foal. He immediately recognized the potential in this foal, bought it, and headed for the mayor's palace. The mayor was outraged. Being the ignorant oppressor he was, he thought the breeder was mocking him by offering this weak, awkward foal. The mayor immediately ordered the breeder to be blinded. The breeder had a young son, who beca

Dapper, a Large-Scale Distributed Systems Tracing Infrastructure

This paper is from Google. This is a refreshingly honest and humble paper. The paper is not pretending to be sophisticated and it doesn't have the "we have it all, we know it all" attitude. The paper presents the Dapper tool, which is trying to solve a real problem, and it honestly reports how this simple, straightforward solution fares and where it can be improved. This is the attitude of genuine researchers and seekers of truth. It is sad to see that this paper was not published at any conference and has remained listed as a Google Technical Report since April 2010. What was the problem? Not enough novelty? Not enough graphs? Use case: Performance monitoring tail at scale Dapper is Google's production distributed systems tracing infrastructure. The primary application for Dapper is performance monitoring to identify the sources of latency tails at scale. A front-end service may distribute a web query to many hundreds of query servers. An engineer looking on

Naiad: A timely dataflow system

What is in a name? Naiad is from Microsoft Research. Dryad, a general purpose runtime for execution of data parallel applications, was also from Microsoft Research. An application written for Dryad is modeled as a directed acyclic graph (DAG), and Dryad is the "tree nymph" in Greek mythology. Naiad is a stream processing platform, and Naiad is the "stream nymph" in Greek mythology. Naiad is an open-source project that has been receiving a lot of attention recently. I expect we will hear more about Naiad, because it is very useful for low-latency real-time querying and high-throughput incremental processing of streaming big data. What is not to like? Naiad is especially useful for incremental processing of graphs. As has been observed before, MapReduce is inappropriate for graph processing because of the large number of iterations needed in graph applications. MapReduce follows a functional model, so using MapReduce requires passing the entire state of the grap
