
Showing posts from December, 2016

Selected blog posts from 2016

This is my 42nd post of the year. As is the custom in a year-end post, I mention some highlights from the 41 posts I wrote earlier in 2016.

Machine Learning
Learning Machine Learning: A beginner's journey. I wrote this to quickly recap what I found confusing and useful while I learned about machine learning. I had no idea this would blow up. It received the most pageviews: 45,000 and counting.

TensorFlow: A system for large-scale machine learning.

Facebook papers
Realtime Data Processing at Facebook.
Measuring and Understanding Consistency at Facebook.
Holistic Configuration Management at Facebook.

Fault-tolerance
Why Does the Cloud Stop Computing? Lessons from Hundreds of Service Outages.
TaxDC: A Taxonomy of Non-Deterministic Concurrency Bugs in Datacenter Distributed Systems.

Distributed Coordination
Modular Composition of Coordination Services. Make sure you read the comment at the end, where one of the authors, Kfir Lev-Ari, answers the questions raised in my post.
Implementing Lin…

Learning Machine Learning: A beginner's journey

I have been learning about machine learning and deep learning (ML/DL) for the last year. I think ML/DL is here to stay. I don't think this is a fad or bubble! Here is why:

ML/DL has results. It is hard to argue against success.

ML/DL has been on the rise organically since 1985 (with the backpropagation algorithm) and went through another phase of acceleration after 2005 (with the wide availability of big data and distributed data processing platforms). The rise of ML/DL follows a rising-curve pattern, not the pattern of a hyped, ephemeral bubble. Since it grew gradually over many years, I am betting it will be around for at least the same amount of time.

ML/DL has been co-developed with applications. It has developed very much on the practice side through trial and error, and its theory still lags a bit behind and is unable to explain many things. According to Nassim Taleb's heuristics, ML/DL is antifragile.

ML/DL has the market behind it. Big money provides big incentive an…

Book review: Intuition pumps and other tools for thinking

The title of this book grabbed my attention immediately. Intuition pumps is a very visual term, and who doesn't like to learn about tools for thinking? The premise of the book is given in the first quote:
"You can't do much carpentry with your bare hands and you can't do much thinking with your bare brain." -- Bo Dahlbom.
The book is by the philosopher Daniel Dennett. It is surprisingly readable for a philosophy book, a genre that tends to be full of jargon and big words. Dennett took special care to write in a simple and clean way. For this, he recruited help from undergraduate students at his university, Tufts. The book's content was discussed in an undergraduate seminar Dennett offered, and he then had these students help review the book. This is later revealed as one of Dennett's thinking tools: "Explain to nonexperts: use a decoy audience". I think that worked: the book is accessible to an undergraduate, albeit a motivated one.

The first chapter explained …

TLA+ project2 solution (2016)

In a previous post, I gave the description of project2. Here is the TLA+ project2 solution, made available on GitHub. Below I will briefly explain the solution.

If you are not familiar with the chain-replication protocol, read this brief summary. 

The setup and variables
The first part is about the setup. We declare the process IDs for storage nodes (denoted as Nodes), the clients, and the configurator process.


Each node maintains a "db", where the item is updated and queried. For simplicity, our distributed key-value store maintains only a single item, initialized to ver=-1, val=-1, cli=-1.

The nodes also have an auxiliary variable "up", which is used to model the status of the node. A node can go "down" at any time, provided that fewer than FAILNUM nodes are currently down.

Each node and client has a message box; initially all message boxes are empty.

Finally, there is a global variable chain. Initially it is an empty sequence. As the configur…
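
To make this setup concrete, here is a minimal sketch of the declarations and the initial state. This is only my illustration of the description above; the actual solution on GitHub is more detailed, but the names (Nodes, Clients, FAILNUM, db, up, msgs, chain) follow the text.

---- MODULE ChainKVSetupSketch ----
\* Sketch of the setup described in the post; not the exact project solution.
EXTENDS Integers, Sequences
CONSTANTS Nodes, Clients, FAILNUM

VARIABLES db,     \* db[n]: the single stored item at node n
          up,     \* up[n]: TRUE while node n is up
          msgs,   \* msgs[p]: message box of each node and client
          chain   \* current chain configuration, a sequence of node IDs

Init == /\ db    = [n \in Nodes |-> [ver |-> -1, val |-> -1, cli |-> -1]]
        /\ up    = [n \in Nodes |-> TRUE]
        /\ msgs  = [p \in Nodes \cup Clients |-> {}]
        /\ chain = <<>>
====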

Feedback from my distributed systems class

I am done with my distributed systems class this semester. At the end of the semester, I polled my students for feedback about the class. I was especially interested in hearing their feedback on using TLA+ in the course and course projects, and on using research paper reviewing for homework assignments. Of course, I didn't mention these when I solicited their feedback. I asked them to write a brief review of what they learned in my distributed systems class this semester.

I received overwhelmingly positive feedback. Below I share some of the comments about TLA+ and the paper review assignments. (I removed the flattering feedback about my teaching, though I enjoyed reading it. I put a lot of effort into my teaching, and I am happy when the students recognize how much I care. That keeps me motivated.)

Feedback about using TLA+
I assigned two TLA+ projects. The first project was modeling the Voldemort key-value store with client-side routing, and the second project was about exte…

Modeling a Replicated Storage System in TLA+, Project 2

Two weeks ago, I described phase 1 of the TLA+ project I assigned in my distributed systems class and promised to give you the solution soon. I put the solution for phase 1 on GitHub.

In this post, I describe phase 2 of the project, where we use Chain Replication to ensure persistence and consistency of the stored objects. Read my chain replication summary to refresh on the algorithm.


Before performing the write, the client reads from the tail of the chain to learn the highest-versioned write completed. The client then increments the version number and performs the write to the head of the chain with this version number. Since a node failure may lead to a request being lost (be it a read request or an update request), the client needs to repeat the request until a response is returned from the tail. The client cannot start a new request before it receives a reply to its earlier request. (A simplified sketch of this client step follows below.)
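
Here is a simplified sketch of that client step, again my own illustration rather than the exact project solution: it collapses the tail read and the head write into one atomic action and omits the retry loop, which in the real model goes through the message boxes until the tail acknowledges.

---- MODULE ChainClientSketch ----
\* Illustration of the client write described above; not the exact solution.
EXTENDS Integers, Sequences
CONSTANTS Nodes, Clients
VARIABLES db, up, msgs, chain   \* as in the setup: item copies, node status, message boxes, chain

ClientWrite(c, v) ==
  /\ chain # <<>>                      \* a chain has been configured
  /\ LET head == chain[1]
         tail == chain[Len(chain)]
         hver == db[tail].ver          \* highest-versioned completed write, learned from the tail
     IN  msgs' = [msgs EXCEPT ![head] = @ \cup
                    {[type |-> "update", ver |-> hver + 1, val |-> v, cli |-> c]}]
  /\ UNCHANGED <<db, up, chain>>
====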

A configurator process (an infallible process which would in practice be imp…

Emerging trends in big data software

Mike Franklin, a famous expert on data science, visited our department at the University at Buffalo in May to talk about emerging trends in big data software. I took some notes during his talk and decided to summarize and share them here.

Mike recently joined the University of Chicago as the chair of its Computer Science Department; before that he was the head of UC Berkeley's AMPLab (Algorithms, Machines and People Laboratory), where he was involved with the Spark and Mesos projects, which had wide academic and industrial impact. Naturally, the talk included a lot of discussion about AMPLab projects, and in particular Spark.

Mike described AMPLab as a federation of faculty who collaborate around an interesting emerging trend. AMPLab has 90 faculty and students in total. Half of the funding comes from the government, and the other half from industrial sponsors. The industrial sponsors also provide constant feedback about what the lab works on and how it matters for them. As an AMPLab…
