Posts

Showing posts with the label programming

DBSP: Automatic Incremental View Maintenance for Rich Query Languages

Image
Incremental computation represents a transformative (!) approach to data processing. Instead of recomputing everything when your input changes slightly, incremental computation aims to reuse the original output and efficiently update the results. Efficiently means performing work proportional only to input and output changes. This paper introduces DBSP, a programming language inspired by signal processing (hence the name DB-SP). DBSP is  simple, yet it offers extensive computational capabilities. With just four operators, it covers complex database queries, including entire relational algebra, set and multiset computations, nested relations, aggregations, recursive queries, and streaming computations. Basic DBSP operators  The language is designed to capture computation on streams. Streams are represented as infinite vectors indexed by consecutive time. Each stream can represent values of any type and incorporates basic mathematical operations like addition, subtraction, and ...

New directions in cloud programming

Image
This paper appeared in CIDR'21 . This paper is also on operationalizing CALM theorem, and is a good companion to the CALM-CRDT paper we covered yesterday. The paper starts by pointing out the challenges of cloud programmability. It says that most developers find it hard to harness the enormous potential of the cloud, and that the cloud has yet to provide a programming environment that exposes the inherent power of the platform. The paper then lays out an agenda for providing a new generation of cloud programming environment to programmers in an evolutionary fashion. I would like to start by challenging the claim that the cloud about not yet providing a noteworthy programming environment. I think there are many examples of successful cloud programming paradigms and frameworks that have emerged in the past decade, such as MapReduce, Resilient Distributed Datasets, Hadoop environment, Spark environment, real time data processing and streaming systems, distributed machine learning syst...

Silent data corruptions at scale

Image
This paper from Facebook (Arxiv Feb 2021) is referred in the Google fail-silent Corruption Execution Errors (CEEs) paper as the most related work. Both papers discuss the same phenomenon, and say that we need to update our belief about quality-tested CPUs not having logic errors, and that if they had an error it would be a fail-stop or at least fail-noisy hardware errors triggering machine checks. This paper provides an account of how Facebook have observed CEEs over several years. After running a wide range of silent error test scenarios across 100K  machines, they found that 100s of CPUs are identified as having these errors, showing that CEEs are a systemic issue across generations. This paper, as the Google paper, does not name specific vendor or chipset types. Also the ~1/1000 ratio reported here matches the ~1/1000 mercurial core ratio that the Google paper reports. The paper claims that silent data corruptions can occur due to device characteristics and are repeatable at s...

Hybrid clocks crate in Rust

Image
I have been learning Rust in my spare time. It has a very steep learning curve, but in return you get to use an elegant, fast, and safe programming language you can use anywhere from embedded systems to datacenter computing. Rust stole like an artist the best concepts from several languages and combined them under one roof with style. Rust has seen a lot of traction recently, and I think Rust is here to stay for a long time.  In the last couple months, I have been writing small programs to practice Rust, and recently I decided I should go bigger and consider reading a real code base, but something with manageable size. I chose to focus on the hybrid clocks crate because it was the right size, and it solved an important problem.  The hybrid clocks crate showcases many of the best practices of Rust. It has a thoughtful design. It shows practical use of struct and trait implementations and trait objects . It was initially hard for me to wrap my mind around the design, because t...

Ahmet's Unity project

Image
Around the beginning of the quarantine, my son, Ahmet (age 12), has started working on the Unity framework . Unity is a popular game engine, like Unreal Engine 4, and Godot. It was launched in 2005, aiming to democratize game development. It is very versatile and beginner friendly. There are many YouTube tutorials about Unity. It is also powerful, as it includes 2D, 3D terrain engines, physics simulator, real-time dynamic shadows, graphics rendering, networked multi-player support, etc. Unity was used for building many amazing games , including Call of Duty: Mobile. Ahmet's Unity journey  I would have loved to say that I supported Ahmet in his quest to learn Unity. But I am just a professor, I am hopelessly disconnected with cool new programming environments/frameworks. I couldn't even help him install the thing, when he had difficulties in the beginning. This was a 5GB installation, and he would ran into problems in the last GB, and also had problems with package dependenc...

If You’re Not Writing a Program, Don't Use a Programming Language

This article by Leslie Lamport has appeared recently at the Distributed Computing & Education Column. This article focuses more on the distinction between the TLA+ modeling language versus programming languages. Earlier I had written a blog post about more general benefits of modeling and model-checking. This is not a technical article so instead of trying to summarize/review the main points, I just include some choice quotes from the article below. The most important takeout message is at the end, which I emphasize with the bold font. Algorithms are not programs, and they can be expressed in a simpler and more expressive language. That language is the one used by almost every branch of science and engineering to precisely describe and reason about the objects they study: the language of mathematics. Programming is too often taken to mean coding, and the algorithm is almost always developed along with the code. To understand why this is bad, imagine trying to discover Eucli...

Think before you code

Image
I have been reading Flash boys by Michael Lewis . It is a fascinating book about Wall Street and high-frequency traders . I will write a review about the book later but I can't refrain from an overview comment. Apart from the greediness, malice, and sneakiness of Wall Street people, another thing that stood out about them was the pervasive cluelessness and ineptness. It turns out nobody knew what they were doing, including the people that are supposed to be regulating things, even sometimes the people that are gaming the system. This brought to my mind Steve Jobs' quote from his 1994 interview: "Everything in this world... was created by people no smarter than you." Anyways, today I will talk about something completely different in the book that caught my eye. It is about thinking before coding. On Page 132 of the book: Russians had a reputation for being the best programmers on Wall Street, and Serge thought he knew why: They had been forced to learn to progr...

Popular posts from this blog

Hints for Distributed Systems Design

My Time at MIT

Advice to the young

Scalable OLTP in the Cloud: What’s the BIG DEAL?

Learning about distributed systems: where to start?

Foundational distributed systems papers

Distributed Transactions at Scale in Amazon DynamoDB

Making database systems usable

Looming Liability Machines (LLMs)

Analyzing Metastable Failures in Distributed Systems