Metadata

Posts

Showing posts from May, 2012

My advice to the 2012 class

- May 23, 2012

I feel sad at graduations. When I graduated college in 1997, all I felt was sadness and melancholy, rather than joy. I chose not to attend my MS and PhD graduation ceremonies, but I remember I felt sad after both defenses. It turns out, I also feel sad at my students' graduations. I graduated 3 PhD students and a couple MS students with the thesis option. It was always sad to see them depart. This semester, I have been teaching distributed systems to the graduating class (seniors) at my sabbatical institute, Bilkent University. They are bright and talented (for example, Ollaa was developed by 3 of them). For the last class of their semester and their college lives as well, instead of discussing yet more about distributed systems, we walked out of the classroom, sat on the grass outside, and had a nice chat together. I gave my students some parting advice. I tried to convey what I thought would be most useful to them as they pursue careers as knowledge/IT workers and research

Replicated/Fault-tolerant atomic storage

- May 02, 2012

The cloud computing community has been showing a lot of love for replicated/fault-tolerant storage these days. Examples of replicated storage at the datacenter level are GFS, Dynamo, Cassandra, and at the WAN level PNUTS, COPS, and Walter. I was searching for foundational distributed algorithms on this topic, and found this nice tutorial paper on replicated atomic storage: Reconfiguring Replicated Atomic Storage: A Tutorial , M. K. Aguilera, I. Keidar, D. Malkhi, J-P Martin, and A. Shraer, 2010. Replication provides masking fault-tolerance to crash failures. However, this would be a limited/transient fault-tolerance unless you reconfigure your storage service to add a new replica to replace the crashed node. It turns out, this on-the-fly reconfiguration of a replicated storage service is a subtle/complicated issue due to the concurrency and fault-tolerance issues involved. This team at MSR @ Slicon Valley has been working on reconfiguration issues in replicated storage for some ti