Showing posts from May, 2011

Online migration for geodistributed storage systems, Usenix ATC 2011

This paper investigates the problem of migrating data between data centers. Data needs to be moved from one center to another based on the access patterns, for example, the user may have moved from East to West coast. The problem is complicated by the large size of data that needs to be moved, the requirement to perform the migration online without blocking access to any part of the data from anywhere, and finally that the data can be accessed/modified concurrently in different locations.
To address this problem, the paper proposes an overlay abstraction. The goal of the abstraction is to implement migration as a service, so that the developer does not have to deal with the race conditions that may result while migrating data in ad hoc ways. The analogy of overlay is a sheet of transparencies. Remember the old days before powerpoint? The presenters used to print the slides on transparencies, and do animation by overlaying one transparency over another. The overlay idea is similar. &quo…

Refuse to Crash with Re-FUSE

This paper appeared in Eurosys'10. This is a well written paper: the paper holds your hand, and takes you for a walk in the park. At each step of the path, you can easily predict what is coming next. I like this kind of easy-reading papers, compared to the cryptic or ambiguous papers which make you wander around or try to guess which paths to take through a jungle of junctions.
The goal of this work is to provide support for restartable user-level filesystems. But, before I can tell you more about that, we first need to discuss user-filesystems. User-filesystems provides a way to add custom features (such as encryption, deduplication, access to databases, access to Amazon S3, etc.) on top of existing kernel-level filesystems. FUSE is a popular software that facilitates building user-filesystems on top of kernel-level filesystems. FUSE is available for Linux, FreeBSD, NetBSD, and MacOSX, and more than 200 userfilesystems have already been implemented using FUSE. GlusterFS, HDFS, ZFS…

Sabbatical help

My tenure at University at Bufffalo, SUNY has just become official after the President signed on it. Having completed my 6 years at UB, I am now eligible to spend a sabbatical year. I am planning to spend 6 months of my sabbatical in the industry/research-lab working on cloud computing challenges. If you have any suggestions/connections to make this happen, please send me an email.

Why are algorithms not scalable?

Recently, a colleague emailed me the following: Since you have been reading so much about clouds, CAP, and presumably lots of consensus things, you can answer better the question of algorithm scalability. How scalable are the popular algorithms? Can they do a reasonable job of consensus with 100,000 processes? Is this even a reasonable question? What are the fundamental problems, the algorithms or the lower level communication issues? These are actually the right kind of questions to ask probing for deeper CS concepts. Here are my preliminary answers to these questions.
From what I read, consensus with 100K processes is really out of question. Paxos consensus was deployed on 5 nodes for GFS and similar systems: Zookeper, Megastore, etc. As another example Sinfonia's participant nodes in a transaction is also around limited to 5-10.
So what is wrong with algorithms, why are they unscalable? I guess one obstacle against scalability is the "online" processing requirement, …

On designing and deploying Internet scale services

This 2008 paper presents hard-earned lessons from Jim Hamilton's experience over the last 20 years in high-scale data-centric software systems and internet-scale services. I liken this paper to "Elements of Style" for the domain of Internet scale services. Like the "Elements of Style" this paper is also not to be consumed at once, it is to be visited again and again every so often.
There are three main overarching principles: expect failures, keep things simple, automate everything. We will see reflections of these three principles in several subareas pertaining to Internet-scale services below.
Overall Application Design Low-cost administration correlates highly with how closely the development, test, and operations teams work together. Some of the operations-friendly basics that have the biggest impact on overall service design are as follows.
/Design for failure/ Armando Fox had argued that the best way to test the failure path is never to shut the service down …

Popular posts from this blog

I have seen things

SOSP19 File Systems Unfit as Distributed Storage Backends: Lessons from 10 Years of Ceph Evolution

PigPaxos: Devouring the communication bottlenecks in distributed consensus

Frugal computing

Fine-Grained Replicated State Machines for a Cluster Storage System

Learning about distributed systems: where to start?

My Distributed Systems Seminar's reading list for Spring 2020

Cross-chain Deals and Adversarial Commerce

Book review. Tiny Habits (2020)

Zoom Distributed Systems Reading Group