Cabbage, Goat, and Wolf Puzzle in TLA+

Over the past couple of months, I have been harping on the value of TLA+ for teaching engineers the art of abstraction. It is important to emphasize that this is an art, not a science, and it is best learned through studying examples and practicing hands-on with modeling. TLA+ excels in providing rapid feedback on your modeling and designs, which facilitates this learning process significantly.

Modeling the "cabbage, goat, and wolf" puzzle taught me that tackling real/physical-world scenarios is a great way to practice abstraction and design -- cutting out the clutter and focusing on the core challenge.

How will you represent this world? What will be your actors, objects, and the relations between them? Will one approach result in a cleaner protocol model than another? Or, as I like to ask, does the protocol flow cleanly from the model?

The puzzle description

Here is the puzzle.

A farmer with a wolf, a goat, and a cabbage must cross a river by boat. The boat can carry only the farmer and a single item. If left unattended together, the wolf would eat the goat, or the goat would eat the cabbage. How can they cross the river without anything being eaten?

 
Write a model of this system in TLA+/Pluscal, and challenge TLC (TLA+'s model checker) that the solution state cannot be reached, and follow the counterexample generated by TLC to see the solution.

If you like to test/improve your modeling/designing/abstraction skills, give this a try before you check my solution. This is not as easy as it looks. Modeling necessitates (and cultivates) a deep understanding of the problem. Modeling is nature’s way of telling us how lousy our initial understanding is.

Acknowledgment: I recently conversed with Michel Charpentier, and asked him about what type of TLA+ projects he uses in his teaching. Some of his projects are about modeling multihreaded Java programs, like bounded buffer problem, but several of them are about modeling real/physical world setups, like a customer ticketing system, and the "cabbage, goat, and wolf" puzzle. Inspired by this, I decided to try modeling the puzzle myself.

Modeling the problem

Let's consider the river's banks first, as they are central to the puzzle. Each bank can be represented as a set. Initially, the left bank contains {Cabbage, Goat, Wolf, Farmer}, while the right bank remains empty. The boat is another crucial element since it serves as the means of transportation and communication between the two banks. It is also initially empty.

Now, do we need the farmer in our model? Why doesn't the boat move itself from one side to the other? It turns out that including the farmer is useful for formalizing safety more cleanly. When the farmer is present on a bank, it is okay to have both a predator and prey together in that bank. And, in the absence of the farmer, we need to prevent leaving a prey and a predator together.

So let's capture this condition first. Placing a subset S of objects from the universal set {C=Cabbage, G=Goat, W=Wolf, F=Farmer} in a specific location (either one of the banks or the boat) is Allowed under the following condition: Either the farmer F is there in that set S (line 12), or in the absence of the farmer (lines 13-14) it is NOT the case that we have a prey and predator together in S (meaning Cabbage & Goat or Goat & Wolf should not be present together).


Now, let's define the TypeOK condition, which serves as a valuable tool to prevent inadvertent errors in the subsequent modeling. This is particularly helpful since TLA+ is not a typed language. TypeOK asserts that the total number of objects in the system (across both banks and the boat) remains constant at 4 (line 30). It also asserts that the boat can carry at most 2 objects (line 31), and the contents of the boat and the banks must consist of nothing but the objects F, C, G, W (lines 32-34).

Now, let's establish the negation of the solution state. NotSolved simply asserts that the second bank does not contain all four objects. Note that due to the checks imposed by TypeOK, we can confidently express NotSolved solely in terms of cardinality. At the model checking time, we present a challenge to the TLC model checker by declaring that NotSolved is an invariant (i.e., it must always hold). TLC, being a model checker, rigorously examines all possible system behaviors of the system, seeking any counterexample where NotSolved is violated. If TLC identifies such a counterexample, it essentially reveals the sequence of actions required to solve the puzzle.


With this setup in place, let's consider what the system actions would be. We want to load the boat from one bank and unload it at the other bank. By treating these actions symmetrically, we ensure that the same loading and unloading process can be applied in the opposite direction as well. To achieve this symmetry, we define a river bank as a process and instantiate it with "Sides 1 and 2" (line 39).

Unloading is straightforward (lines 48-51). If the boat isn't empty (line 49), the bank can unload its contents to this location (line 50), and set the boat to be empty (line 51). But, what if this is the same boat we just loaded from this bank? Yes, unloading the boat right after loading it is wasteful, but it is still a permissible action. The model checker will explore all possible behaviors. While this action is not going to contribute to a solution (i.e., the shortest counterexample the model checker identifies), it is alright to include it in the exploration for the sake of the simplicity of the model.

Loading is straightforward as well (lines 42-47). We enable this branch, when the current bank is nonempty, and the boat is empty (line 43). We then choose a random yet safe boat content that does not leave a prey and predator unattended on this side (line 44). The randomness, which comes from the *with* keyword, enables the model checker to explore all behaviors allowed by the SafeBoats definition. The rest is easy, load the boat with that SafeBoats selection (line 45), and remove that selection from this bank (line 46).

Ok, let's unpack the SafeBoats expression. The boat B's contents should be a subset of the objects found on the bank that invoked SafeBoats (line 24). The farmer should be on the boat (line 25), and the cardinality of the boat cannot be more than 2 (line 26). Most importantly, when the boat B departs, the remaining objects in this bank should satisfy the Allowed predicate we discussed in the beginning (line 27). SafeBoats returns the set of all boat configurations B that satisfy these conditions.

We are done! Let's take a step back, and discuss our model as a whole.

Analyzing the model

The point of abstraction/modeling/designing is to remove unnecessary details that may cause distraction, and to cut to the core of the issue. So, what are the unnecessary details this model removes? (Before you read on, think about what your answer is to this question.)

The key abstraction here is the removal of the operational approach in lieu of a declarative approach. The model intentionally omits the specific ordering of actions (an operation sequence). While beginners instinctively attempt to spell out the operational steps (saying this happens and then that happens etc.), the experienced modelers establish the system model, define potential actions, and allow them to unfold naturally. Think declaratively, detached from time, like the aliens in the Arrival book/movie.

Additionally, we abstained from modeling the Farmer, Cabbage, Goat, and Wolf as active agents; instead, they are employed as inert objects without independent will. Our primary actors in this model are the two banks, realized as processes. These bank processes are responsible for updating the bank and boat variables, that we modeled as sets.


Does the protocol flow from the model?

I would say, yes. After we write the model, we didn't have to pull rabbits out of a hat. The model affords a clear path forward for the protocol.

To give an example of pulling a rabbit out of a hat, this is what I had to do with my first modeling attempt of this problem. I had painted myself to a corner, and ended up having to introduce a "swap" action between the contents of the boat and the bank. In other words, I had to hard-wire the solution logic to the model without a rationale about why this was the obvious choice. The "swap" did not logically flow from the rest of the model. I then realized that if I added the farmer object, I can get out of this special case handling, without having to use convoluted workarounds.


Can this model be improved?

The answer is yes again. A true master of abstraction would abstract to a higher plane to address the most generalized problem associated with this problem space. I am still not there.


Finally, I'd like to clarify one point. This model showcases just a small fraction of TLA+'s strengths. The model represents a nondeterministic execution (using the with keyword), and it doesn't even touch upon concurrent execution where processes yield to each other at several points in their execution. TLA+ truly excels for modeling concurrent and distributed systems.

The counterexample trace

You can find the counterexample trace below. The counterexample consists of 15 steps. Since the model has only two variables: the banks (which has two entries for side 1 and side 2), and the boat.
  1. Initially bank[1] has all 4 objects, bank[2] is empty. The boat is empty.
  2. Bank1 process loads the boat with Farmer&Goat, leaving Wolf&Cabbage behind.
  3. Bank2 process unloads the boat.
  4. Bank2 loads the boat with only the Farmer.
  5. Bank1 unloads the boat.
  6. Bank1 loads with Farmer&Cabbage, leaving the Wolf behind.
  7. Bank2 unloads the boat. Now it has Farmer, Goat, and Cabbage.
  8. Bank2 loads the boat with Farmer&Goat, leaving Cabbage behind.
  9. Bank1 unloads the boat. Now it has Farmer, Goat, and Wolf.
  10. Bank1 loads the boat with Farmer&Wolf.
  11. Bank2 unloads the boat. Now it has Farmer, Wolf, and Cabbage.
  12. Bank2 loads the boat with only the Farmer, leaving Wolf&Cabbage behind.
  13. Bank1 unloads the boat.
  14. Bank1 loads the boat with Farmer&Goat, leaving nothing behind.
  15. Bank1 unloads the boat. The property NotSolved is violated, because the solution state is achieved.



Config file

Below is the config file for the model for completeness. You can find this config file and the TLA+ model discussed above at https://github.com/muratdem/PlusCal-examples/tree/master/Puzzles


A pure TLA+ solution

Update: Michel shared his TLA+ solution, the third revision of the model as he had been using this example for a while. This is in TLA+ as contrast to PlusCal above, and this is more succinct.  I was surprised by the priming of the expression R (to stand for the right bank), but yeah that makes sense, and that makes the model more elegant. There is no boat here, the transfer of content between the banks are immediate. There is still the use of the man/farmer object to define Safe.


Another PlusCal solution


This one is inspired by Michel's TLA spec. It does not use process concept in PlusCal, rather gives a basic algorithm. Note the use of Alias for displaying counterexample more accessibly.





Comments

Ozan Sazak said…
Thanks for the post!

Popular posts from this blog

Hints for Distributed Systems Design

Learning about distributed systems: where to start?

Making database systems usable

Looming Liability Machines (LLMs)

Foundational distributed systems papers

Advice to the young

Linearizability: A Correctness Condition for Concurrent Objects

Understanding the Performance Implications of Storage-Disaggregated Databases

Scalable OLTP in the Cloud: What’s the BIG DEAL?

Designing Data Intensive Applications (DDIA) Book