Posts

Showing posts from February, 2019

Paper review. Sharding the Shards: Managing Datastore Locality at Scale with Akkio

This paper by Facebook, which appeared in OSDI'18, describes Akkio, a data locality management service. Akkio has been in production use at Facebook since 2014. It manages over 100PB of data and processes over 10 million data accesses per second. Why do we need to manage locality? Replicating all data to all datacenters is difficult to justify economically (due to the extra storage and WAN networking costs) when acceptable durability and request serving latency could be achieved with 3 replicas. It looks like Facebook had been doing full replication to all 6 datacenters back in the day (at least for the ViewState and AccessState applications discussed in the evaluation), but as the operation and the number of datacenters grew, this became untenable. So, let's find suitable home bases for data instead of fully replicating it to all datacenters. But the problem is that access locality is not static. What was a good location/configuration for the data ceases to become suita
