Amazon’s recent announcement of their spec-driven AI tool, Kiro, inspired me to write a blog post on a completely unrelated topic: formal specifications. In particular, I wanted to write abou…| Surfing Complexity
Good UI is necessary to making correct software. If people have trouble using the product, they’re more likely to do the wrong thing. Personally, though, I can’t stand UI work. I don’t think it’s “beneath me” or anything, it just doesn’t quite mesh with my brain. Visuals and interfaces give me anxiety. Mad respect for the people who can handle it. I love formal methods. Recently my friend Kevin Lynagh released Sketch.| Hillel Wayne
If you’re a regular reader of this blog, you’ll have noticed that I tend to write about two topics in particular: Resilience engineering Formal methods I haven’t found many people…| Surfing Complexity
FizzBee is a new formal specification language, originally announced back in May of last year. FizzBee’s author, Jayaprabhakar (JP) Kadarkarai, reached out to me recently and asked me what I …| Surfing Complexity
This paper (NSDI'25) applies lightweight formal methods (hence the pun "smart casual" in contrast to formal attire) to the Confidential Con...| muratbuffalo.blogspot.com
HASLab Seminar Abstract: Modern databases underlying large-scale Internet services guarantee immediate availability and tolerate network partitions at the expense of providing only weak forms of consistency, commonly dubbed eventual consistency. Even though the folklore notion of eventual consistency is very … Continue reading →| HASlab
Analysing interactive devices based on information resource constraints Analysis of the usability of an interactive system requires both an understanding of how the system is to be used and a means of assessing the system against that understanding. Such analytic … Continue reading →| HASlab
Justine Tunney recently wrote a blog post titled The Fastest Mutexes where she describes how she implemented mutexes in Cosmopolitan Libc. The post discusses how her implementation uses futexes by …| Surfing Complexity
The first person to explain answer set programming (ASP, pronounced ‘ay ess pee’) to me (Joe Osborn) told me about the three line implementation of graph coloring and the termination guarantees. Like any PL person, I recoiled in horror: you have an always-terminating language where I can solve NP-complete problems in three lines? So every […]| weaselhat
Meet Amazon Science’s newest research area.| Amazon Science
Last year a client asked me to solve a problem: they wanted to be able to compose two large TLA+ specs as part of a larger system. Normally you’re not supposed to do this and instead write one large spec with both systems hardcoded in, but these specs were enormous and had many internal invariants of their own. They needed a way to develop the two specs independently and then integrate them with minimal overhead.| Hillel Wayne
I’ve recently done a lot of work in Alloy and it’s got me thinking about a common specification pitfall. Everything in the main post applies to all formal specifications, everything in dropdowns is for experienced Alloy users. Consider a simple model of a dependency tree. We have a set of top-level dependencies for our program, which have their own dependencies, etc. We can model it this way in Alloy:| Hillel Wayne
This is just a way of thinking about formal specification that I find really useful. The terms originally come from Michael Jackson’s Software Requirements and Specifications. In specification, the machine is the part of the system you have direct control over and the world is all the parts that you don’t. Take a simple transfer spec: ---- MODULE transfer ---- EXTENDS TLC, Integers CONSTANTS People, Money, NumTransfers (* --algorithm transfer variables acct \in [People -> Money]; define \...| Hillel Wayne
Forward and Backward Reasoning in Proof Assistants| concerningquality.com
tl;dr: online Alloy reference here. If you’ve read this blog at all you probably know I’m a big fan of Alloy. It’s a simple, powerful formal method whose entire syntax fits in just two pages. You can learn it in a day and be productive in two. And it can be used to model everything from package management to database migrations to authorization systems. Two years ago I was invited to the Workshop on the Future of Alloy.| Hillel Wayne
This is an excerpt from Michael Jackson (not the singer)’s book Software Requirements and Specifications, in his discussion of “Raw Materials”. It’s worth letting stand on its own. First, make a list of all the languages you use in all your development activities. Be honest: don’t list all of the languages you have ever heard of; include only languages you have used for real work at least once… Now, ask yourself which of the languages on your list you would choose to say these thi...| Hillel Wayne
Logical Time and Deterministic Execution| concerningquality.com
Efficient and Flexible Model-Based Testing| concerningquality.com
The Case for Models| concerningquality.com
img {border-style: groove;} Someone recently told me a project isn’t real until you do a retrospective, so I think it’s time to do one for Let’s Prove Leftpad. Short explanation: it’s a repository of proofs of leftpad, in different proof systems. Long explanation: the rest of this post. Background I’m into formal methods, the discipline of proving software correct. Consider the following contrived code: def add(x: int, y: int): int { if(x == 12976 && y == 14867) { return x - y; } re...| Hillel Wayne
Some useful mental models from the world of formal methods.| Hillel Wayne
When teaching formal methods, I often get asked how we can connect the specifications we write to the code we produce. One way we do this is with refinement. All examples are in TLA+.1 Imagine we’re designing a multithreaded counter. We don’t know how we’re going to implement it yet, so we start by specifying the abstract behavior.2 ---- MODULE abstract ---- EXTENDS Integers, Sequences VARIABLES pc, counter vars == <> \* Two threads Threads == 1.| Hillel Wayne
Recently my friend Lars Hupel and I had a discussion on why formal methods don’t compose well. You can read the conversation here. We focused mostly on composing formally-verified code. I want to talk a little more about the difficulties in composing specifications. This is half because it’s difficult for interesting reasons and half because it’s a common question from beginners. Beginners to formal specification expect specifications should be organized like programs: multiple independ...| Hillel Wayne
There’s not a whole lot on TLA+ technique out there: all the resources are either introductions or case studies. Good for people starting out, bad for people past that. I think we need to write more intermediate-level stuff, what Ben Kuhn calls Blub studies. Here’s an attempt at that. Most TLA+ properties are invariants, properties that must be true for every state in the behavior. If we have a simple counter:| Hillel Wayne
For latency, anyway. A common architecture pattern is the “task queue”: tasks enter an ordered queue, workers pop tasks and process them. This pattern is used everywhere from distributed systems to thread pools. It applies just as well to human systems, like waiting in line at a supermarket or a bank. Task queues are popular because they are simple, easy to understand, and scale well. There are two primary performance metrics for a task queue.| Hillel Wayne
My work brings me though a lot of software correctness techniques, things like type theory, test-driven development (TDD), and formal methods. The surrounding communities all have the same problem: they can’t get people using these techniques. They all ask “Don’t people care about correct software?” To which the insiders usually answer “programmers don’t care about correctness, they just care about shoveling out garbage to make money!” I’ve heard this from every single communi...| Hillel Wayne
My job these days is teaching TLA+ and formal methods: specifying designs to find bugs in them. But just knowing the syntax isn’t enough to write specs, and it helps to have examples to draw from. I recently read Chris Siebenmann’s Even in Go, concurrency is still not easy and thought it would make a good case study for writing a spec. In it, he gives an example of Go code which deadlocks:| Hillel Wayne
Decision tables are easy, simple, and powerful. You can teach them in five minutes and write one in half that time. You can look at a table and understand what it’s saying, see if it matches your problem, and check for design flaws. So it’s kinda weird that there’s basically nothing about them online. I wrote an introduction a while back, but I want something a little more formal. So this post will reintroduce the core ideas in a more formal way and then talk about some of the technique...| Hillel Wayne
I’m in the process of revising some of my workshops. I realized that one particular important aspect of TLA+, fairness, is something that I’ve struggled to explain well. Then again, everybody struggles to explain it well! It’s also a great example of how messy concurrent systems can get, so I figured it would make a good post for the blog. Stuttering Take the following spec of a clock. I used TLA+ but it should mostly be clear from context:| Hillel Wayne
In most testing frameworks, you’re expected to write a lot of low-level tests and few high-level tests. The reasoning is that end-to-end tests are slow and expensive and only cover a tiny amount of the program’s state space. It’s better to focus on testing all the parts in isolation versus testing that they all work together. In practice, global correctness does not follow from local correctness. It’s possible for every part to be individually rock-solid but the system to be broken as...| Hillel Wayne
This is an “intro packet” you can use to argue for the benefits of formal methods (FM) to your boss. It’s a short explanation, a list of benefits and case studies, and a demo. Everything’s in TLA+, but the arguments apply equally well to Alloy, B, statecharts, etc. Adapt the material to your specific needs. Quick notational note: I’m leaving out the code verification side of formal methods, mostly because design verification is a much easier sell.| Hillel Wayne
When we design programs, we usually look for two kinds of properties: that “bad things” never happen and that “good things” are guaranteed to happen. These are called safety and liveness properties, respectively. These are properties that we want to hold true for every possible program behavior. “We always complete every request” is a liveness property. If our system has it, every program trace will complete every request. If it doesn’t hold, I can give you a example behavior wh...| Hillel Wayne
Most of my formal methods examples deal with concurrency, because time is evil and I hate it. While FM is very effective for that, it gives people a limited sense of how flexible these tools are. One of the most common questions I get from people is Formal methods looks really useful for distributed systems, but I’m not making a distributed system. Is FM still useful for me?| Hillel Wayne
In my job I dig up a lot of obscure techniques. Some of them are broadly useful: formal specification, design-by-contract, etc. These I specialize in. Some of them are useful, but for a smaller group of programmers. That’s stuff like metamorphic testing, theorem proving, constraint solving. These I write articles about. Then there’s the stuff where I have no idea how to make it useful. Either they’re so incredibly specialist or they could be useful if it weren’t for some deep problems.| Hillel Wayne
A common question I get about specs is how to model bad actors. Usually this is one of two contexts: The spec involves several interacting agents sharing a protocol, but some of the nodes are faulty or malicious: they will intentionally try to subvert the system. The spec involves an agent subject to outside forces, like someone can throw a rock at your sensor. These “open world” situations are a great place to use formal methods.| Hillel Wayne
A few people have told me that they’ve enjoyed learning formal methods but aren’t sure how to actually use it. They’re mostly doing short sprints at work and aren’t building new systems from scratch. This tells me there’s some confusion about what makes specifications useful, and that we need a resource on applying them in practice. This is a short guide to using specifications at work in a way that’s accessible to beginners, applicable in many contexts, and provides solid busines...| Hillel Wayne
I saw this question on the Software Engineering Stack Exchange: What are the barriers that prevent widespread adoption of formal methods? The question was closed as opinion-based, and most of the answers were things like “its too expensive!!!” or “website isn’t airplane!!!” These are sorta kinda true but don’t explain very much. I wrote this to provide a larger historical picture of formal methods, why they’re actually so unused, and what we’re doing to make them used.| Hillel Wayne
I recently did a corporate TLA+ workshop and some people asked what TLA+ specs look like in practice. If you looked at the most common public examples, you’d probably come away thinking that people only used it for critical consensus algorithms. This is a problem for two reasons: first, it makes it harder to learn TLA+, as there aren’t simpler examples to experiment with. Second, it makes it hard for people to see how TLA+ is useful for them.| Hillel Wayne
I am delighted to announce that my book, Practical TLA+, is now available! When I stumbled into TLA+ in 2016 I had no idea it would so define my passions, my values, and my career. I definitely didn’t think I’d be writing a book. But 11 months and 220 pages later, here we are! This is the largest project I’ve ever done and I’m incredibly excited to share it with you all.| Hillel Wayne
I really like the c2 wiki as a historical artifact: a lot of big names in Agile and Extreme Programming argued with each other there.1 One thing they did was the Extreme Programming Challenge, where people tried to test the limits of these approaches. They’d find edge problems (designing databases, remote teams, etc) and see if Agile/XP still worked in that context. In almost every case they quickly decided that yes, it works just fine.| Hillel Wayne
Functional programming and immutability are hot right now. On one hand, this is pretty great as there’s lots of nice things about functional programming. On the other hand, people get a little overzealous and start claiming that imperative code is unnatural or that purity is always preferable to mutation. I think that the appropriate paradigm is heavily dependent on context, but a lot of people speak in universals. I keep hearing that it’s easier to analyze pure functional code than mutab...| Hillel Wayne
This post was commissioned by Graham Christensen as part of the Great Slate fundraiser. Thank you! I wanted to help out the Great Slate fundraiser and donated two technical essays of the purchaser’s choice. Graham bought one and asked for Something that has to do with Nix or NixOS This was a bit of a challenge, given that I didn’t even know what Nix was. I had never used it and still haven’t, nor can I use it for the near future, because my only computer right now is a Windows box.| Hillel Wayne