Modeling Token Buckets in PlusCal and TLA+
Retry storms are infamous in distributed systems. It is easy to run into them. Inevitably, a downstream service experiences a hiccup, so your clients automatically retry their failed requests. Those retries add more load to the struggling service, causing more failures, which trigger more retries. Before you know it, the tiny unavailability cascades into a full-blown self-inflicted denial of service. Token Bucket is a popular technique that helps with gracefully avoiding retry storms. Here is how the token bucket algorithm works for retries: Sending is always free. When a client sends a brand-new request, it doesn't need a token. It just sends it. Successes give you credit. Every time a request succeeds, the client deposits a small fraction of a token into its bucket (up to a maximum capacity). Retries costs you credit. If a request fails, the client must spend a whole token (or a large fraction of one) to attempt a retry. If the downstream service is healthy, the bucket stays ful...