Wednesday, July 4, 2018

If You’re Not Writing a Program, Don't Use a Programming Language

This article by Leslie Lamport has appeared recently at the Distributed Computing & Education Column.

This article focuses more on the distinction between the TLA+ modeling language versus programming languages. Earlier I had written a blog post about more general benefits of modeling and model-checking.

This is not a technical article so instead of trying to summarize/review the main points, I just include some choice quotes from the article below. The most important takeout message is at the end, which I emphasize with the bold font.

Algorithms are not programs, and they can be expressed in a simpler and more expressive language. That language is the one used by almost every branch of science and engineering to precisely describe and reason about the objects they study: the language of mathematics.

Programming is too often taken to mean coding, and the algorithm is almost always developed along with the code. To understand why this is bad, imagine trying to discover Euclid's algorithm by thinking in terms of code rather than in terms of mathematics.

The [TLA+] abstraction helped a lot in coming to a much cleaner architecture (we witnessed first-hand the brainwashing done by years of C programming). One of the results was that the code size is about 10× less than in [the previous version].  
(Eric Verhulst, Raymond T. Boute, José Miguel Sampaio Faria, Bernard H. C. Sputh, and Vitaliy Mezhuyev. Formal Development of a Network-Centric RTOS. Springer, New York, 2011.)
The common obsession with languages might lead readers to think this result was due to some magical features of TLA+. It wasn’t. It was due to TLA+ letting users think mathematically.

A correctness property of an algorithm is often best expressed as a higher-level algorithm. Proving correctness then means proving that the original algorithm refines the higher-level one. This usually involves both data refinement and step refinement.

I expect that this kind of refinement sounds like magic to most readers, who won't believe that it can work in practice. I will simply report that among the refinement proofs I have written is a machine-checked correctness proof of the consensus algorithm at the heart of a subtle fault-tolerant distributed algorithm by Castro and Liskov that uses 3F + 1 processes, up to F of which may be malicious (Byzantine). The proof shows that the Castro-Liskov consensus algorithm refines a version of the 2F + 1 process Paxos consensus algorithm that tolerates F benignly faulty processes. Steps of malicious processes, as well as many steps taken by the good processes to prevent malicious ones from causing an incorrect execution of Paxos, refine stuttering steps of the Paxos algorithm. I found that viewing the Castro-Liskov algorithm as a refinement of Paxos was the best way to understand it.

Today, programming is generally equated with coding. It's hard to convince students who want to write code that they should learn to think mathematically, above the code level, about what they’re doing. Perhaps the following observation will give them pause. It's quite likely that during their lifetime, machine learning will completely change the nature of programming. The programming languages they are now using will seem as quaint as Cobol, and the coding skills they are learning will be of little use. But mathematics will remain the queen of science, and the ability to think mathematically will always be useful.

Related posts on TLA+

Hillel maintains a nice tutorial for TLA+. He has an upcoming book on TLA+.

I have written up a lot on TLA+. Here are two gentle introduction posts I had written in 2014 and 2015. I gave many examples of modeling with TLA+, as I think lack of examples is one of the biggest obstacles before TLA+ achieving wider adoption. Some include modeling of 2-phase commit transactionssynchronized round consensuschain replication, hygenic dining philosophers, and Paxos and Flexible Paxos. I also use TLA+ in my research papers to provide an unambiguous pseudocode for algorithms, as well as a way to model check their correctness. 

This summer, I will be joining Microsoft Azure Cosmos DB team for my sabbatical for 6 months. I will get a chance to use TLA+ in practice in the context of a very large-scale globally distributed, multi-model database service. So you can expect many more TLA+ posts in this space soon.

MAD questions

1. Instead of the succinctness distinction between a programming language and a specification language, maybe the difference in benefits comes more from the process. You should approach programming in a two phase manner: think before you code. Before you start writing any pretensibly executable code, think first and write a design/algorithm/model first using that programming language as pseudocode.

But then, wouldn't this have worked for the RTOS project cited above? They could have written the design/algorithm in pseudocode C first, and would have achieved the same benefits. But maybe trying to write the design/algorithm in the programming language bring a lot of baggage with it (brainwashing as the quotation above mentions) that this is impossible. Maybe this is the curse of the Whorfian syndrome Lamport wrote about in another article. "Computer scientists collectively suffer from what I call the Whorfian syndrome -- the confusion of language with reality."

2. Would a succinct programming language nullify (or weaken) the arguments in this article? What is your favorite succinct programming language?

Paul Graham swears by the benefits of Lisp. He quotes ESR there: "Lisp is worth learning for the profound enlightenment experience you will have when you finally get it; that experience will make you a better programmer for the rest of your days, even if you never actually use Lisp itself a lot."

Maybe the benefits come from declarative versus operational thinking? So declarative languages would have an advantage over operational languages.

I am not a programming language guy. And I don't want to start a PL frame war, so maybe I should stop. (This may be the most dangerous MAD question I had written :-)

No comments: