Transaction Processing Book, Gray/Reuter 1992

 I started reading this book as part of Alex Petrov's book club.

I am loving the book. We will do Chapter 3 soon, and I am late on reporting on this, but here is some material from the first chapters. Unfortunately I don't have the time to write fuller reviews of the first chapters, so I will try to give you the gist.

Foreword: Why We Wrote this Book

The purpose of this book is to give you an understanding of how large, distributed, heterogeneous computer systems can be made to work reliably.

An integrated (and integrating) perspective and methodology is needed to approach the distributed systems problem. Our goal is to demonstrate that transactions provide this integrative conceptual framework, and that distributed transaction-oriented operating systems are the enabling technology. In a nutshell: without transactions, distributed systems cannot be made to work for typical real-life applications.

I am very much impressed by the distributed systems insight Gray provides throughout the book. Jim Gray was a distributed systems person. Fight me! Or bite me, I don't care.

Transaction processing concepts were conceived to master the complexity in single-processor online applications. If anything, these concepts are even more critical now for the successful implementation of massively distributed systems that work and fail in much more complex ways. This book shows how transaction concepts apply to distributed systems and how they allow us to build high-performance, high-availability applications with finite budgets and risks.

There are many books on database systems, both conventional and distributed; on operating systems; on computer communications; on application development—you name it.  Such presentations offer many options and alternatives, but rarely give a sense of which are the good ideas and which are the not-so-good ones, and why. More specifically, were you ever to design or build a real system, these algorithm overviews would rarely tell you how or where to start.

This book takes a very practical and quantitative approach to its topics. For fault-tolerance analysis, it uses many surveys (some of them Gray's own studies) to back up the arguments and claims it puts forward.

As I read the book, I am in constant awe of the clarity of thinking behind the book. Gray was a master of abstraction.

1. Introduction

Six thousand years ago, the Sumerians invented writing for transaction processing. An abstract system state, represented as marks on clay tablets/ledgers, was maintained. Today, we would call this the database. Scribes recorded state changes with new records (clay tablets) in the database. Today, we would call these state changes transactions.

This book contains considerably more information about the ACID properties. For now, however, a transaction can be considered a collection of actions with the following properties:

  • Atomicity. A transaction’s changes to the state are atomic: either all happen or none happen. These changes include database changes, messages, and actions on transducers.
  • Consistency. A transaction is a correct transformation of the state. The actions taken as a group do not violate any of the integrity constraints associated with the state. This requires that the transaction be a correct program.
  • Isolation. Even though transactions execute concurrently, it appears to each transaction, T, that others executed either before T or after T, but not both.
  • Durability. Once a transaction completes successfully (commits), its changes to the state survive failures.
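To make these properties concrete, here is a minimal sketch (mine, not the book's) of an atomic transfer using Python's sqlite3; the table, account names, and amounts are made up for illustration. Either both updates commit together, or the whole transaction rolls back. Isolation is what concurrency control adds on top of this: concurrent transfers behave as if they ran one at a time.

```python
import sqlite3

# Minimal sketch: a "transfer" transaction that is atomic -- either both
# account updates commit together, or neither does.
conn = sqlite3.connect(":memory:", isolation_level=None)  # manage transactions explicitly
conn.execute("CREATE TABLE account (id TEXT PRIMARY KEY, balance INTEGER)")
conn.execute("INSERT INTO account VALUES ('alice', 100), ('bob', 0)")

def transfer(src, dst, amount):
    try:
        conn.execute("BEGIN")  # start the transaction
        conn.execute("UPDATE account SET balance = balance - ? WHERE id = ?", (amount, src))
        # Consistency: the transaction must not violate the integrity constraint
        # that balances stay non-negative.
        (balance,) = conn.execute("SELECT balance FROM account WHERE id = ?", (src,)).fetchone()
        if balance < 0:
            raise ValueError("insufficient funds")
        conn.execute("UPDATE account SET balance = balance + ? WHERE id = ?", (amount, dst))
        conn.execute("COMMIT")    # commit point; with an on-disk database the changes are now durable
    except Exception:
        conn.execute("ROLLBACK")  # atomicity: none of the changes happen
        raise

transfer("alice", "bob", 30)       # commits: alice=70, bob=30
try:
    transfer("alice", "bob", 500)  # rolls back: balances unchanged
except ValueError:
    pass
print(dict(conn.execute("SELECT id, balance FROM account")))  # {'alice': 70, 'bob': 30}
```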

2. Basic Computer System Terms

The Five-Minute Rule. How shall we manage these huge memories? The answers so far have been clustering and sequential access. However, there is one more useful technique for managing caches, called the five-minute rule. Given that we know what the data access patterns are, when should data be kept in main memory and when should it be kept on disk? The simple way of answering this question is, Frequently accessed data should be in main memory, while it is cheaper to store infrequently accessed data on disk. Unfortunately, the statement is a little vague: What does frequently mean? The five-minute rule says frequently means five minutes, but the rule reflects a way of reasoning that also applies to any cache-secondary memory structure. In those cases, depending on relative storage and access costs, frequently may turn out to be milliseconds, or it may turn out to be days.

The basic principles of the five-minute rule held well over the years.
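The rule falls out of a simple break-even calculation: keep a page cached when the cost of the RAM it occupies is less than the cost of the disk arm time needed to keep re-reading it at its access frequency. Here is a back-of-the-envelope sketch; the prices, access rate, and page size are my own illustrative assumptions, not the book's figures.

```python
# Back-of-the-envelope five-minute-rule calculation.
# All prices and rates below are illustrative assumptions, not the book's figures.

disk_price = 1200.0          # $ per disk drive (assumed)
disk_accesses_per_sec = 100  # random accesses/second one drive can deliver (assumed)
ram_price_per_mb = 5.0       # $ per megabyte of RAM (assumed)
page_size_kb = 8             # page size (assumed)

pages_per_mb = 1024 // page_size_kb
cost_per_disk_access_per_sec = disk_price / disk_accesses_per_sec  # $ to "own" 1 access/sec
cost_per_page_of_ram = ram_price_per_mb / pages_per_mb             # $ to cache one page

# Break-even inter-access interval: cache the page if it is touched more often than this.
break_even_seconds = cost_per_disk_access_per_sec / cost_per_page_of_ram
print(f"cache a page if it is re-read at least every {break_even_seconds:.0f} seconds "
      f"(~{break_even_seconds/60:.1f} minutes)")
```

With these assumed numbers the break-even interval comes out to roughly five minutes; change the relative prices of disk arms and RAM and the same reasoning yields milliseconds or days, as the book notes.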

Shared nothing. In a shared-nothing design, each memory is dedicated to a single processor. All accesses to that data must pass through that processor. Processors communicate by sending messages to each other via the communications network.

Shared global. In a shared-global design, each processor has some private memory not accessible to other processors. There is, however, a pool of global memory shared by the collection of processors. This global memory is usually addressed in blocks (units of a few kilobytes or more) and is RAM disk or disk.

Shared memory. In a shared-memory design, each processor has transparent access to all memory. If multiple processors access the data concurrently, the underlying hardware regulates the access to the shared data and provides each processor a current view of the data.
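A toy way to see the difference between the first and last designs (my own sketch, not from the book): in a shared-nothing setup, each worker owns its partition of the data and others must send it request messages, while in a shared-memory setup any worker can touch any item directly, with access regulated by a lock.

```python
import threading, queue

# --- Shared-nothing style: each "processor" owns its partition and is the only
# --- one that touches it; others communicate by sending it request messages.
partition = {"alice": 100, "bob": 0}          # owned exclusively by the owner thread
requests: queue.Queue = queue.Queue()

def owner():
    while True:
        key, reply = requests.get()
        if key is None:                       # shutdown message
            break
        reply.put(partition.get(key))         # only the owner reads its memory

def remote_read(key):
    reply: queue.Queue = queue.Queue()
    requests.put((key, reply))                # message to the owning processor
    return reply.get()

t = threading.Thread(target=owner)
t.start()
print(remote_read("alice"))                   # 100, obtained via message passing
requests.put((None, None))
t.join()

# --- Shared-memory style: every thread can address the data directly; the
# --- hardware (modeled here by a lock) regulates concurrent access.
shared = {"alice": 100, "bob": 0}
lock = threading.Lock()

def local_read(key):
    with lock:
        return shared[key]

print(local_read("alice"))                    # 100, obtained via direct access
```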


Unfortunately, both for today and for many years to come, software dominates the cost of databases and communications. It is possible, but not likely, that the newer, faster processors will make software less of a bottleneck.

Comments

Mark Callaghan said…
That is a great book. I read it long ago to prepare for an interview with the InnoDB team.
