Posts

Showing posts from November, 2012

From Clarity to Efficiency for Distributed Algorithms

Image
Y. A. Liu, S. D. Stoller, B. Lin, M. Gorbovitski, Appeared in OOPSLA'12. This paper describes a high-level language, DistAlgo, for clear description of distributed algorithms (e.g., Paxos, mutual execution). In contrast to embarrassingly parallel tasks which involve very regular and simple synchronization patterns (e.g., MapReduce tasks), distributed algorithms generally involve complex and tight synchronization patterns. DistAlgo specifies these complex synchronization conditions via using logical quantifications over message histories of processes involved in the algorithm. But, unfortunately, this turns out to be extremely inefficient if executed straightforwardly: each quantifier will cause a linear factor in running time, and any use of the history of messages sent and received will cause space usage to be unbounded. To address this inefficiency, this paper focuses on optimizing implementations of DistAlgo programs by incrementalization of these expensive synchronization co

Making Geo-Replicated Systems Fast as Possible, Consistent when Necessary

Image
Appeared in OSDI '12, Cheng Li, Daniel Porto, Allen Clement, Johannes Gehrke, Nuno Preguica, and Rodrigo Rodrigues. In order to reduce latencies to geographically distributed users, big webservices companies (Google, Yahoo, Facebook, Twitter) replicate data across geographical regions. But replication across datacenters, create cross-site consistency problems, which is further complicated with the huge WAN latency delays. If you want to have strong consistent updates across sites, you have to pay the price in terms of latency (basically you revert to doing synchronous replication via say Paxos as in MDCC). This is the ELC part in the PACELC . To alleviate this latency versus consistency tension, this paper proposes RedBlue consistency, which enables blue operations to be fast/asynchronous (and eventually consistent) while the remaining red operations are strongly-consistent/synchronous (and slow). So a program is partitioned into red and blue operations, which run with differ

Sensys'12 (day2)

Here are my much anticipated (at least by Ted :-) notes from the second day of Sensys. Kyun queue: a sensor network system to monitor road traffic queues I missed this talk, but this sounds like an interesting paper. The idea seems to be to deploy a transmitter and a receiver node on the opposite sides of a road. Depending on the density and the speed of the cars passing between the transmitter and the receiver, the received radio signal has different signatures. The work studies these signals and figures out the road conditions based on these. Low cost crowdcounting using audio tones Crowd counting can be useful for event planning, learning popularity of aisles, etc. This work assumes that everyone in the crowd has a smartphone and this counting app running on their phone. (This is the biggest limitation of the work.) This work uses the audio tones (again ultrasound tones) emitted by the phones to count the people. The app employs the smartphone mic and speaker, no additional

Sensys'12 (day 1)

I am attending Sensys'12  at Toronto, which is a short distance from Buffalo. Sensys is a premiere conference for wireless sensor network and, more recently, smartphone sensing research. Here are my rough notes from the first day of conference. Energy-efficient GPS sensing with cloud offloading This paper is really good, it blew me away. Later at the conference banquet Wednesday night, the paper got selected as the Best Paper . This work aims to break up the GPS as a blackbox and see how we can make it more energy-efficient. The paper goes into fairly technical detail about how GPS works, which I cannot replicate here. Some basics are: GPS uses CDMA to transmit 50bps information over a carrier signal of 1.575Ghz. Seeing more satellites improves accuracy. GPS localization requires the following phases on the device: acquisition, tracking, decoding, code-phase, least-square calculation. The insight of the paper is to offload some of these phases to the cloud, so the device d

An abundance of solutions

My son started the Kindergarten this year. I try to be a good parent and work with him closely on his sight-words and handwriting. Today we were reviewing his weekly reader magazine, and I read to him the following question from the magazine. There is only one pumpkin left at the pumpkin patch. Mingo and Zip both want the pumpkin. What should they do? I guess the easy answer (maybe, the answer the magazine tries to teach) is that they share the pumpkin. But my son thought about a little bit, and said they should ask the farmer if the farmer has more pumpkins somewhere. I went along, and together we thought of several more solutions to the problem. Mingo and Zip carve the pumpkin nicely, and sell the carved pumpkin at a good price at the bazaar and buy two pumpkins with that money. They divide the pumpkin into half. For dividing fairly, you know the trick right. One of them gets to cut in two pieces, and the other gets to choose. (However, my wife argues this is not fair if the

MDCC: Multi-Data Center Consistency

Image
In order to reduce latencies to geographically distributed users, Google, Yahoo, Facebook, Twitter, etc., replicate data across geographical regions. But replication across datacenters, over the WAN, is expensive. WAN delays are in the hundreds of milliseconds and vary significantly, so common wisdom is that synchronous wide-area replication is unfeasible, which means strong consistency should be diluted/relaxed. (See COPS paper for an example.) In the MDCC work (Tim Kraska, Gene Pang, Michael J. Franklin, Samuel Madden, March 2012), the authors describe an " optimistic commit protocol, that does not require a master or partitioning, and is strongly consistent at a cost similar to eventually consistent protocols ". Optimistic and strongly-consistent is an odd/unlikely couple. The way MDCC achieves this is by starting with Paxos, which is a strongly consistent protocol. MDCC then adds optimizations to Paxos to obtain an optimistic commit protocol that does not require a m

Auto-dropboxify all web-downloaded documents

I often download pdf presentations from the web, but then when I reopen them in my laptop later, I wonder if they got updated/changed in the meanwhile. It would be nice to have a service where my local copy also gets automatically updated/synced (a la dropbox) when the authoritative source copy gets changed. (The difference from the dropbox model is here the source does not need to cooperate by deploying a sync software, e.g. dropbox, or even be aware that its document is being synced.) So here is my project idea: auto-sync all web/url downloaded documents, so the downloaded local copy of a file is kept up-to-date with the authoritative copy on the web. I think many people will find this useful. Some other examples of documents downloaded from the web and would benefit from auto-updating include: city documents (e.g., street-parking rules, garbage collection rules, etc.), event calendars, tax documents, CVs. (Do you have any other examples?) This shouldn't be too hard to build

Popular posts from this blog

Hints for Distributed Systems Design

Learning about distributed systems: where to start?

Making database systems usable

Looming Liability Machines (LLMs)

Foundational distributed systems papers

Advice to the young

Linearizability: A Correctness Condition for Concurrent Objects

Scalable OLTP in the Cloud: What’s the BIG DEAL?

Understanding the Performance Implications of Storage-Disaggregated Databases

Designing Data Intensive Applications (DDIA) Book