Verifying Distributed Systems: The Operational Approach. Thomas Ridge. In POPL 2009. [ bib | doi | project page | pdf ]
This work develops an integrated approach to the verification of behaviourally rich programs, founded directly on operational semantics. The power of the approach is demonstrated with a state-of-the-art verification of a core piece of distributed infrastructure, involving networking, a filesystem, and concurrent OCaml code. The formalization is in higher-order logic and proof support is provided by the HOL4 theorem prover.

Difficult verification problems demand a wide range of techniques. Here these include ground and symbolic evaluation, local reasoning, separation, invariants, Hoare-style assertional reasoning, rely/guarantee, inductive reasoning about protocol correctness, multiple refinement, and linearizability. While each of these techniques is useful in isolation, they are even more so in combination. The first contribution of this paper is to present the operational approach and describe how existing techniques, including all those mentioned above, may be cleanly and precisely integrated in this setting.

The second contribution is to show how to combine verifications of individual library functions with arbitrary and unknown user code in a compositional manner, focusing on the problems of private state and encapsulation.

The third contribution is the example verification itself. The infrastructure must behave correctly under arbitrary patterns of host and network failure, whilst for performance reasons the code also includes data races on shared state. Both features make the verification particularly challenging.