Metadata

Posts

Showing posts from July, 2014

Hybrid Logical Clocks

- July 27, 2014

Here I will write about our recent work on Hybrid Logical Clocks, which provides a feasible alternative to Google's TrueTime. A brief history of time (in distributed systems) Logical Clocks (LC) was proposed in 1978 by Lamport for ordering events in an asynchronous distributed system. LC has several drawbacks for modern systems. Firstly, LC is divorced from physical time (PT), as a result we cannot query events in relation to real-time. Secondly, to capture happened-before relations, LC assumes that there are no backchannels and all communication occurs within the system. Physical Time (PT) leverages on physical clocks at nodes that are synchronized using the Network Time Protocol (NTP). PT also has several drawbacks. Firstly, in a geographically distributed system obtaining precise clock synchronization is very hard; there will unavoidably be uncertainty intervals. Secondly, PT has several kinks such as leap seconds and non-monotonic updates. And, when the uncertainty int

Distributed is not necessarily more scalable than centralized

- July 04, 2014

Centralized is not necessarily unscalable! Many people automatically associate centralized with unscalable, and distributed with scalable. And, this is getting ridiculous. In the Spring semester, in my seminar class, a PhD student was pitching me a project for distributed storage: syncing from phone to work/home computers and other phones. The pitch started with the sentence "Dropbox is unscalable, because it is centralized". I was flabbergasted, and I asked a couple of times "Really? Do you actually claim that Dropbox is unscalable?". The student persisted and kept repeating that "Dropbox has a bottleneck because it is a centralized storage solution, and the distributed solution doesn't have that bottleneck". I couldn't believe my ears. Dropbox already proved it is scalable: It serves files for more than 200 million users, who store 1 billion files every 24 hours. That it has a centralized architecture hosted in the cloud doesn't make i