Showing posts from August, 2015

New directions for distributed systems research in cloud computing

This post is a continuation of my earlier post on "a distributed systems research agenda for cloud computing". Here are some directions I think are useful for distributed systems research in cloud computing.

Data-driven/data-aware algorithms: Please check the Facebook and Google software architecture diagrams in these two links: Facebook Software Stack, Google Software Stack. You will notice that the architecture is all about data: almost all components are about either data processing or data storage.

This trend may indicate that distributed algorithms need to adapt to the data they operate on to improve performance. So, we may see the adoption of machine learning as input/feedback to the algorithms, and the algorithms becoming data-driven and data-aware. (For example, this could be a good way to attack the tail-latency problem discussed here.)
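As a rough illustration of what "data-aware" could mean for tail latency, here is a minimal sketch in which the decision to send a backup (hedged) request is driven by recently observed latencies rather than a fixed timeout. The class name `AdaptiveHedgeThreshold`, its parameters, and the nearest-rank percentile rule are all assumptions made for this example; they are not taken from the posts or from any particular system.

```python
from collections import deque


class AdaptiveHedgeThreshold:
    """Hypothetical helper: derives a hedging threshold from observed latencies."""

    def __init__(self, window=100, percentile=95, default_ms=50.0):
        # Sliding window of the most recent latency samples (oldest dropped first).
        self.samples = deque(maxlen=window)
        self.percentile = percentile  # integer percent, e.g. 95 for p95
        self.default_ms = default_ms  # fallback until data has been observed

    def observe(self, latency_ms):
        """Record the latency of a completed request."""
        self.samples.append(latency_ms)

    def threshold_ms(self):
        """Return the nearest-rank percentile of the current window."""
        if not self.samples:
            return self.default_ms
        ordered = sorted(self.samples)
        # Integer ceiling of percentile * n / 100, clamped to at least 1.
        rank = max(1, (self.percentile * len(ordered) + 99) // 100)
        return ordered[rank - 1]

    def should_hedge(self, elapsed_ms):
        """True if a request has been outstanding longer than the threshold."""
        return elapsed_ms > self.threshold_ms()
```

A caller would invoke `observe` after each completed request and consult `should_hedge` for requests still in flight. The threshold then follows the workload: if latencies rise across the board, hedging fires later instead of flooding the system with backup requests.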

Similarly, driven by the demand from the large-scale cloud computing services, we may see power-management, ene…

How to go for 10X

I think the 10X term originated from this book. (Correct me if I am wrong. I didn't check this.)

It seems like Larry and Sergey are fans of this concept (so should you be!). Actually, reading this January 2013 piece, you can sense that the Alphabet transition was in the works by then.

10X doesn't just mean go fast, get quick results, and get 10X more done in the same time. If you think about it, that is actually a pretty incremental mode of operation, and it is how you incur technical debt. It also means it is just a matter of time before others do the same thing, probably much better and more completely. Trading off quality for time is often not a good deal (at least in the academic research domain).

10X means transformative rather than incremental improvement. Peter Thiel explains this well in his book Zero to One, 2014. The main theme in the book is: Don't do incremental business, invent a new transformational product/approach. Technology is 0-1, globalization is 1-n. Mos…

A distributed systems research agenda for cloud computing

Distributed systems (a.k.a. distributed algorithms) is an old field, almost 40 years old. It gave us impossibility proofs on the theory side, and also algorithms like Paxos, logical/vector clocks, 2/3-phase commit, leader election, dining philosophers, graph coloring, and spanning tree construction, which are widely adopted in practice. Cloud computing is a relatively new field in contrast. It provides new opportunities as well as new challenges for the distributed systems/algorithms area. Below I briefly discuss some of these opportunities and challenges.

Opportunities: Cloud computing provides abundance. Nodes are replaceable, even hot swappable. You can dedicate several nodes to running customized support services, such as monitoring, logging, and storage/recovery services. These opportunities are likely to have an impact on how fault-tolerance is considered in distributed systems/algorithms work.

Programmatic interfaces and service-oriented architecture are also hallmarks of cloud computing …
