Many hardware and compiler optimisations introduced to speed up single-threaded programs also introduce additional, sometimes surprising, behaviours for concurrent programs with shared mutable state. How many of these extra behaviours occur in practice depends on the combination of the hardware, compiler, runtime, etc. that make up the platform. A memory model, which prescribes what values each read of a concurrent program can read, allows programmers to determine whether a program behaves as expected without having to worry about the details of the platform. However, capturing these behaviours in a memory model without also including undesirable "out-of-thin-air" behaviours that do not occur in practice has proved elusive. The memory model of C and C++ allows out-of-thin-air behaviour, while the Java memory model fails to capture some behaviours that are introduced in practice by compiler optimisations. In this thesis, we propose a memory model that forbids out-of-thin-air behaviour, yet allows the behaviours that do occur. Our memory model follows operational intuitions of how the hardware and compilers operate. We illustrate that it behaves as desired on a series of litmus tests. We show that it captures at least some of the expected behaviours, that it forms an envelope around some common compiler optimisations, and that it is implementable on common hardware using the expected compilation schemes. We also show that it supports some established programming idioms.
Relaxed-memory concurrency is now mainstream in both hardware and programming languages, but there is little support for the programmer of such systems. In this highly non-deterministic setting, ingrained assumptions like causality or the global view of memory do not hold. It is dangerous to use intuition, specifications are universally unreliable, and testing outcomes are dependent on hardware that is getting more permissive of odd behaviour with each generation. Relaxed-memory concurrency introduces complications that pervade the whole system, from processors, to compilers, programming languages and software.
There has been an effort to tame some of the mystery of relaxed-memory systems by applying a range of techniques, from exhaustive testing to mechanised formal specification. These techniques have established mathematical models of hardware architectures like x86, Power and ARM, and programming languages like Java. Formal models of these systems are superior to prose specifications: they are unambiguous, one can prove properties about them, and they can be executed, permitting one to test the model directly. The clarity of these formal models enables precise critical discussion, and has led to the discovery of bugs in processors and, in the case of Java, x86 and Power, in the specifications themselves.
In 2011, the C and C++ languages introduced relaxed-memory concurrency to the language specification. This was the culmination of a six-year process on which I had a significant impact. This thesis details my work in mathematically formalising, refining and validating the 2011 C and C++ concurrency design. It provides a mechanised formal model of C and C++ concurrency, refinements to the design that removed major errors from the specification before ratification, a proof in HOL4 (for a restricted set of programs) that the model supports a simpler interface for regular programmers, and, in collaboration with others, an online tool for testing intuitions about the model, proofs that the language is efficiently implementable above the relaxed x86 and Power architectures, a concurrent reasoning principle for proving specifications of libraries correct, and an in-depth analysis of problems that remain in the design.
When constructing complex concurrent systems, abstraction is vital: programmers should be able to reason about concurrent libraries in terms of abstract specifications that hide the implementation details. Relaxed memory models present substantial challenges in this respect, as libraries need not provide sequentially consistent abstractions: to avoid unnecessary synchronisation, they may allow clients to observe relaxed memory effects, and library specifications must capture these.
In this paper, we propose a criterion for sound library abstraction in the new C11 and C++11 memory model, generalising the standard sequentially consistent notion of linearizability. We prove that our criterion soundly captures all client-library interactions, both through call and return values, and through the subtle synchronisation effects arising from the memory model. To illustrate our approach, we verify implementations against specifications for the lock-free Treiber stack and a producer-consumer queue. Ours is the first approach to compositional reasoning for concurrent C11/C++11 programs.
Previous work formalized the C++ memory model in Isabelle/HOL in an effort to clarify the proposed standard's semantics. Here we employ the model finder Nitpick to check litmus test programs that exercise the memory model, including a simple locking algorithm. Nitpick is built on Kodkod (Alloy's backend) but understands Isabelle's richer logic; hence it can be applied directly to the C++ memory model. We only need to give it a few hints, and thanks to the underlying SAT solver it scales much better than the Cppmem explicit-state model checker. This case study inspired optimizations in Nitpick from which other formalizations can now benefit.
Current proposals for concurrent shared-memory languages, including C++ and C, provide sequential consistency only for programs without data races (the DRF guarantee). While the implications of such a contract for hardware optimisations are relatively well-understood, the correctness of compiler optimisations under the DRF guarantee is less clear, and experience with Java shows that this area is error-prone.
In this paper we give a rigorous study of optimisations that involve both reordering and elimination of memory reads and writes, covering many practically important optimisations. We first define powerful classes of transformations semantically, in a language-independent trace semantics. We prove that any composition of these transformations is sound with respect to the DRF guarantee, and moreover that they provide basic security guarantees (no thin-air reads) even for programs with data races. To give a concrete example, we apply our semantic results to a simple imperative language and prove that several syntactic transformations are safe for that language. We also discuss some surprising limitations of the DRF guarantee.