Showing posts from September, 2012

Don’t Settle for Eventual: Scalable Causal Consistency for Wide-Area Storage with COPS

Wyatt Lloyd, Michael J. Freedman, Michael Kaminsky, David G. Andersen Proc. 23rd ACM Symposium on Operating Systems Principles (SOSP ’11) Cascais, Portugal, October 2011. Geo-replicated, distributed data stores are all the rage now. These stores provide wide area network (WAN) replication of all data across geographic regions in all the datacenters. Geo-replication is very useful for global-scale web-applications, especially social networking applications. In social networking, your updates to your account (tweets, posts, etc) are replicated across several regions because you may have followers there, and they should get low-latency access to your updates. The paper refers to properties desired of such a geo-replication service with the acronym ALPS (Availability, low Latency, Partition-tolerance, and high Scalability). Of course, in addition to ALPS, we need to add some consistency requirement to the geo-replication service, because otherwise the updates to a record can arriv

Scalable distributed data structures for internet service construction

I think this 2000 paper (by Gribble, Brewer, Hellerstein, and Culler) may as well be the original NoSQL paper. The paper starts off by identifying the problems with RDBMS that prohibit scalability. (This is how you would motivate a NoSQL key-value store system even today :-) RDBMSs have not been designed with Internet service workloads, service properties, and cluster environments in mind, and as a result they fail to provide the right scaling, consistency, or availability guarantees.  RDBMSs permit users to decouple the logical structure of their data from its physical layout, which is a good thing, but excessive data independence (isolation of application from modifying the layout of data definition and organization) can make parallelization and therefore scaling hard.  RDBMSs always choose consistency over availability. The paper then advocates a design sweet-point for achieving scalable and consistent data management for web services. RDBMS is out because it provides a too-h

Popular posts from this blog

Learning about distributed systems: where to start?

Hints for Distributed Systems Design

Foundational distributed systems papers

Metastable failures in the wild

Scalable OLTP in the Cloud: What’s the BIG DEAL?

SIGMOD panel: Future of Database System Architectures

The end of a myth: Distributed transactions can scale

There is plenty of room at the bottom

Distributed Transactions at Scale in Amazon DynamoDB

Dude, where's my Emacs?