DDIA: Chp 4. Encoding and Evolution (Part 2)

This second part of Chapter 4 of the Designing Data Intensive Applications (DDIA) book discusses methods of data flow in distributed systems, covering dataflow through databases, service calls, and asynchronous message passing.

For databases, the process writing to the database encodes the data, and the reading process decodes it. We need both backward and forward compatibility, as older and newer versions of code may coexist during rolling upgrades. The book emphasizes that *data often outlives code*, hence this makes schema evolution crucial. The techniques we discussed in the first part of the chapter for encoding/decoding and backward/forward compatibility of schemas apply here. Most databases avoid rewriting large datasets when schema changes occur, instead opting for simple changes like adding nullable columns.


For service calls, the chapter primarily discusses web services, which use HTTP as the underlying protocol. Web services are used not only for client-server communication, but also for inter-service communication within an organization, and cross-organizational data exchange. Two main approaches here are REST and SOAP.

  • REST is a design philosophy built on HTTP principles. It emphasizes simple data formats and the use of HTTP features for various functions. It is popular especially for cross-organizational service integration and microservices architectures.
  • SOAP is (was?) an XML-based protocol that aims to be independent of HTTP. It comes with a complex set of related standards (WS-*) and relies heavily on tools and code generation due to its complexity.

The chapter also discusses remote procedure calls (RPCs) technology, and covers its history briefly by mentioning technologies like EJB, RMI, DCOM, and CORBA. RPC is an abstraction layer over message passing, making the underlying remote communication with a regular procedure call. The chapter criticizes the fundamental flaw of RPC: network requests are fundamentally different from local function calls due to their unpredictability and potential for failure. If you still haven't read the 1994 classic "a note on distributed computing", fix that today.

Despite these issues, RPC continues to be used. Modern systems often use futures or promises to handle asynchronous actions and potential failures. You can argue that amending RPCs with futures/promises is just another attempt at masking the impedance mismatch and hide the underlying asynchronous distributed system model, and complicates the use of RPCs further. I keep thinking whether just using asynchronous message-passing be the better and more natural option for distributed systems. I guess the big benefit from RPCs is to keep the program logic together in one place, you don't lose context trough decoupling the message-send and corresponding message-reception. And maybe momentum is also partially responsible for RPC still going strong. Similar concerns were in play in the threads versus events debate. 


The final section of this chapter discusses asynchronous message passing, particularly focusing on message brokers. Advantages of using message brokers over direct RPC include improved system reliability, automatic message redelivery, decoupling of senders and recipients, and support for multiple recipients.

The chapter mentions both commercial (TIBCO, IBM WebSphere) and open-source (RabbitMQ, ActiveMQ, HornetQ, NATS, and Apache Kafka) message broker implementations. Message brokers typically work with named queues or topics, ensuring message delivery to one or more consumers or subscribers.

The actor model is a programming model for concurrency within a single process. Distributed actor frameworks integrate a message broker with the actor programming model. Popular distributed actor frameworks (Akka, Orleans, and Erlang OTP) handle message encoding and support rolling upgrades.

Comments

Popular posts from this blog

Hints for Distributed Systems Design

Learning about distributed systems: where to start?

Making database systems usable

Looming Liability Machines (LLMs)

Advice to the young

Foundational distributed systems papers

Distributed Transactions at Scale in Amazon DynamoDB

Linearizability: A Correctness Condition for Concurrent Objects

Understanding the Performance Implications of Storage-Disaggregated Databases

Designing Data Intensive Applications (DDIA) Book