Monday, December 21, 2015

What technological advancements will the next decade bring?

There have been amazing progress in technology in recent years. Just to name a few, I would mention deep learning, big data, ubiquitous tablets and smartphones, advances in medicine, BitCoin, 3D printing, accurate voice recognition, Arduino, Raspberry pie, maker movement, drones, etc. I can't help but wonder what we will see next, in the coming decade. What discoveries we will be able to make? How would the world change? I am very curious and excited about the future.

To extrapolate to 2025, indulge me as I go back to the year 2005 to extrapolate. Ten years ago, iPhone was still not introduced (iPhone 1 got introduced on June 29, 2007). Smartphones were rare. Internet was not a household item. Cloud was not there. Tablets were not there. In fact, none of the technologies mentioned in the previous paragraph were there. In ten years we have come a long way.

Let me add an anectode about how I witnessed this ride in the last 10 years. In 2005, I joined University at Buffalo as an Assistant professor, and  started teaching CSE 646: Wireless and Mobile Computing, a graduate level project-based course on wireless sensor networks. In order to pique the interest of my students, I would start the class by a review-assignment on a short science-fiction story by Vernor Vinge, "Synthetic Serendipity". (If you have 10-15 minutes, do read this story; it is really good.) So, every year, we would talk about which technology mentioned in this short story may get realized soon. A funny thing started to happen around 2010. It seemed like, every year we would find a couple technologies mentioned in the book to get realized: Google Glass, delivery drones, Arduinos, robot sets for education, augmented reality, BitCoin, holograms, etc.

This leads me to be optimistic about the kind of progress we would see in the next 10 years. (Ok, maybe not as optimistic as the singularity guys and what their law of accelerating returns predict.) Let me wear my wizard hat, and look into my crystal ball. (I will stick to predicting only computing related advances with which I have some familiarity.)

Advances in machine learning 

Machine learning is all the rage now. Thanks to big data and cloud computing, machine learning stole the show in the recent years. However, machine learning is still just simple statistical learning today. In other words it is still in its infancy. I think we are going to see much better technology and accelerated progress in machine learning in the coming years. Five years ago, I would not put singularity before 2150-2200 timeline. Nowadays, I am not so sure. The AI winter is over, and AI is blooming. With all the attention it is receiving we are bound to witness more breakthroughs in machine learning.

I wager that by 2025 we can expect to have AI as smart as a 3rd-grader. This AI will of course be a savant at book/Google knowledge in specialized fields, but in addition it will be able to synthesize and contextualize the information and use cognition as well as a 3rd-grader can. (This article summarizes cognitive capabilities of an 8-10 year old.)

This will of course be accessible through our smartphones. Siri will get that good. We will have a personal asistant who has human intelligence at the level of a 3rd-grader, and who will have all the world's knowledge accessible through filtering of a 3rd grader's synthesis/creative capability. It was already uncanny when I witnessed my 4 year old having a conversation with Siri a couple months ago. I wonder how this would feel to the middle-aged and elderly. It may come as a total culture shock and be an alienating experience as predicted in the Synthetic Serendipity story.

This is the pattern of software-based innovation. Software-based innovation virtualizes a physical thing, and then improves on it every year thanks to the availability of exponentially more computing/querying power. By 2025, we will have a virtual personal assistant, virtual nanny, virtual teacher, and a virtual doctor in our smartphones.

If you have missed the movie "Her", schedule sometime to watch it. I thought it was pretty good. Come to think of it, in the most recent Terminator movie the SkyNet was rising from the smartphones. Coincidence?

Advances in Internet of Things (IoT)

Arduino and maker movement made a splash. Raspberry Pie is already getting dirt cheap. Yet, the Internet of Things domain is still looking for a killer application. I expect the killer application to arrive, and obviously I don't know what it will be. Maybe it will be in querying the physical spaces as well as we can query the cyber space with Google. Being able to query for car keys at the house, or items in schools, warehouses. I don't know. Maybe we will become a security/surveillance-obsessed community and employ IoT to record/query audio and video.

In any case, I wager that IoT will finally take off. I include 3D-printing technology as auxiliary to IoT. In 2025, you may not quite be able to download and print a car, but I believe we will start downloading and printing toy cars for our children at least. (Whatever you can't print at home will be printed at an Amazon warehouse nearby and will be drone-delivered to you within a couple hours.) Of course the toy car would be steerable from the Internet and it would streaming videos to YouTube :-) Most amazing of technologies always start as toys, then find serious use. I remember in 2005 I had an argument with a graduate student who thought "the newly released Google Maps was a silly toy idea because we already have paper maps" (I kid you not).

Printable IoT utensils/tools/gadgets may help revolutionalize/democratize retail the way cloud computing revolutionalized/democratized the IT startups domain. Walmart's business may be in jeopardy in 2025, because printable IoT technology will accelerate the innovation in consumer items/tools and bridge the gap between idea to production and delivery in retail. If you have a good idea for a smart kitchen utensil, you will be able to prototype it and get it into production and retail at large-scale quickly. This will be like Kickstarter on steroids. We may see next Instagrams, Dropboxes, Apples(?), flourish in the domain of printable IoTs.

The way cloud computing provided pay-per-use elastic-scalability to software-as-a-service, an IoT printing platform can provide pay-per-use elastic-scalability to retail-as-a-service. Amazon was the cloud computing platform provider which facilitated/accelerated IT-based startups. Can Amazon be the platform provider for IoT printing business as well? Wow, I might have figured out Amazon's convergent strategy to world domination in 2025.

Software is already eating the world, and by 2025 software will be rebuilding/reintegrating to the world. Maybe for future science-fiction reading in my classes, I should assign Neil Stephenson's "Diamond Age", which is set in a future world in which programmable-nanotechnology affects all aspects of life.

Sunday, December 20, 2015

My Distributed Systems Seminar's reading list for Spring 2016

Below is the list of papers I plan to discuss in my distributed systems seminar in Spring'16 semester. These are all very exciting papers, and I am looking forward to the Spring semester.

If you have some suggestions on other good/recent papers to cover, please let me know in the comments.

Distributed coordination

  1. No compromises: distributed transactions with consistency, availability, and performance, SOSP 15
  2. Implementing Linearizability at Large Scale and Low Latency, SOSP 15
  3. High-Performance ACID via Modular Concurrency Control, SOSP 15
  4. Existential Consistency: Measuring and Understanding Consistency at Facebook, SOSP 15
  5. Holistic Configuration Management at Facebook, SOSP 15
  6. Building Consistent Transactions with Inconsistent Replication, SOSP 15
  7. Bolt-on Causal Consistency, Sigmod 13
  8. The Design and Implementation of the Wave Transactional Filesystem

Big data

  1. Arabesque: A System for Distributed Graph Mining, SOSP 15
  2. Petuum: A New Platform for Distributed Machine Learning on Big Data, KDD 15
  3. Graphlab: A new framework for parallel machine learning
  4. Twitter Heron: Stream Processing at Scale, Sigmod 15
  5. Chimera: Large-scale classification using machine learning, rules, and crowdsourcing, VLDB 14


  1. The Mystery Machine: End-to-end Performance Analysis of Large-scale Internet Services, OSDI 14
  2. Pivot Tracing: Dynamic Causal Monitoring for Distributed Systems, SOSP 15

Formal methods

  1. IronFleet: Proving Practical Distributed Systems Correct, SOSP 15

Friday, December 18, 2015

Paper summary: Detecting global predicates in distributed systems with clocks

This is a 2000 paper by Scott Stoller. The paper is about detecting global predicates in distributed systems.

There has been a lot of previous work on predicate detection (e.g., Marzullo & Neiger WDAG 1991, Verissimo 1993), but those work considered vector clock (VC) timestamped events sorted via happened-before (hb) relationship. This paper proposes a framework for predicate detection over events timestamped with approximately-synchronized (think NTP) physical-time (PT) clocks.

This was a tough/deep paper to read and a rewarding one as well. I think this paper should receive more interest from distributed systems developers as it has applications to the cloud computing monitoring services. As you can see in Facebook stack and Google stack, monitoring services are an indispensible component of large-scale cloud computing systems.


Using PT for timestamping (and predicate detection) has several advantages over using VC. VC is O(N), whereas PT is a scalar. For capturing hb, VC assumes that all communication occur in the present system and there are no backchannels, which is not practical for today's large-scale integrated system of systems. Using PT alleviates the backchannel problem: even if there is not direct communication path, if event B happened in physical time later than event A, then we can still identify that event A can affect event B due to out-of-bound communication. Finally, using PT for timestamping allows the devops to be able to query the system in relation to physical time, whereas VC fails to support it.

It turns out that using PT also provides benefits over using VC in terms of complexity of predicate detection. The worst case complexity for predicated detection using hb captured by VC is Ω(EN), where E is the maximum number of events executed by each process, and N is the number of processes. On the other hand, the smaller the uncertainties in events' PT timestamps, i.e., the less the events overlap, the cheaper the PT-based predicate detection gets. With some assumptions on the inter-event spacing being larger than time synchronization uncertainty, it is possible to have worst-case time complexity for PT-based predicate detection to be O(3N E N2) --- linear in E.

I will say more on this point and how using our hybrid clocks can further reduce cost of predicate detection at the end of this post.

Computation model

A local computation has the form e1, s1, e2, s2, e3, s3, where the ei are events, and the si are local states. A global state of a distributed system is a collection of local states.

Each event e has an interval timestamp its(e), which is an interval with lower endpoint lwr(e) and upper endpoint upr(e). PT timestamping should satisfy the following conditions.
TS1: ∀ e: lwr(e) ≤ upr(e)
TS2: ∀ e1 with a succeeding event e2 on the same process: lwr(e1) ≤ lwr(e2) and upr(e1)≤ upr(e2)
TS3: ∀ e1,e2: if e1 occurred before e2 in real time, then lwr(e1) ≤ upr(e2)

TS3 is the strongest among these assumptions. Still this is a fairly general practical modeling of approximate time synchronization: The intervals of events do not need to be all equal across processes or even along the events on the same process.

Generic theory of consistent global states (CGS)

A consistent global state (CGS) is a global cut that could have occurred during the computation.

The paper shows that CGS lattice theory is not specific to hb but applies to any partial generic ordering gb on events, that satisfy TS2. Two local states are concurrent if they are not related by gb. A global state is consistent with respect to gb if its constituent local states are pairwise concurrent.
consisgb (g) ≡ ∀ i, j: i ≠ j: \neg (g(i) gb g(j))

By adopting PT instead of VC for timestamping and ordering, the paper defines two ordering relations db ("definitely occurred before") and pb ("possibly occurred before") among events as follows.

e1 db e2 ≡ upr(e1) < lwr(e2)

e1 pb e2 ≡ ¬ (e2 db e1)
Notice how pb is elegantly defined in terms of db.

By substituting db and pb orderings in the generic CGS framework above, the paper obtains definitions of 3 distinct detection modalities: Poss-db θ (“θ possibly held”), Def-db θ (“θ definitely held”), and Inst θ (“θ definitely held at a specific instant”). We talk about these detection modalities and developing efficient detection algorithms for them in the next two sections.

Detection based on strong event ordering, db

Substituting db ordering in the generic theory of CGS, we get Poss-db and Def-db. A computation satisfies Poss-db Q iff there exists some interleaving of events that is consistent with db and in which the system passes through a global state satisfying predicate Q. A computation satisfies Def-db Q iff for every interleaving of events that is consistent with db, Q holds. These two are analog to Poss-hb and Def-hb for the hb relation on the CGS framework. Figure 1 illustrates the nuance between Poss-db and Def-db. If Q holds only at CGS <3,1>, then Poss-db holds, but Def-db would not since computation might have taken <2,2> path instead of <3,1> path and Q does not hold at <2,2>. Def-db would hold if Q holds in both <3,1> and <2,2>. Or,  instead if Q holds only at CGS <2,1>, then Def-db still holds.

As we mentioned in the beginning, the quality of time synchronization affect the cost of predicate detection. If the bounds on the offsets are comparable to or smaller than the mean interval between events that potentially truthify or falsify θ, then the number of global states that must be checked is comparable to the number of global states that the system actually passed through during execution, which is O(NE). In contrast, the number of global states considered in the asynchronous hb case is O(EN).

Assume the interval between consecutive events at a process is at least τ.
For Poss-db and Def-db worst-case time complexities are as follows:
- if τ > 2ε, O(3N E N2)
- if τ ≤ 2ε, O((4ε/τ +2){N-1} E N2)

This second option is the problematic zone. For τ << ε the time complexity of detection algorithm grows quickly. The figure below illustrates this, and shows how number of CGS change with respect to E and the ratio μ/ε, where μ is the mean inter-event time.

Detection based on weak event ordering, pb

In contrast to db, pb is a total order and not a partial order.

Substituting pb ordering in the generic theory of CGS, we get Poss-pb and Def-pb. In fact, pb collapses Def-pb and Pos-pb into one, Inst. This is because there are no alternate-paths in the pb computation. pb looks at inevitable global states, i.e., SCGS. The below computation has only two SCGSs (1,2) and (3,2).

This is similar to Properly detection by Fromentin and Raynal 1997. The computation contains O(NE) local states and there are only O(NE) SCGSs. And, the worst case time complexity of algorithm: O((N log N) E)

This is really low cost. So, where is the catch?

It turns out Inst detection is not enough/complete. Inst may miss some CGSs as illustrated below. (The paper doesn't explicitly discuss this, so this gave me some confusion before I could figure this out.)

How would hybrid time/clocks benefit

In Stoller's world, you need to make a binary choice before hand: go either with VC- or PT- based timestamping and detection.

But, going with VC becomes disadvantageous in many cases: when there isn't enough communication to introduce enough hb restrictions. PT can also become very disadvantageous: when μ/ε << 1. (Figure 6 shows how quickly number of CGSs to consider can grow in this region.)

Even worse, within the same system you can have VC become disadvantegous in some regions/phases of computation and PT in others. (You can have excited communication caused by closeby events within ε. So for ε in 10ms, you can have several regions where μ/ε to be <<1. This increases the complexity of Deff-db and Poss-db greatly. Especially for large N.)

We had recently introduced hybrid clocks, and in particular hybrid vector clocks (HVC). With HVC you don't have to choose one over another; HVC offers you the lowest cost detection of VC and PT at any point. Moreover while VC is of  O(N) size, with HVC thanks to loosely-synchronized clock assumption it is possible to keep the sizes of HVC to be a couple entries at each process. HVC captures the communications in the timestamps and provides the best of VC and PT worlds.

We are investigating these issues with my colleague Sandeep Kulkarni at Michigan State, as part of our NSF supported project on highly auditable distributed systems.

It also looks like the db and pb relations should have applications to linearizability/serializability in distributed databases. And it will be interesting to investigate these further.

Two-phase commit and beyond

In this post, we model and explore the two-phase commit protocol using TLA+. The two-phase commit protocol is practical and is used in man...