Relaxed-Memory Concurrency

Multiprocessors are now pervasive and concurrent programming is becoming mainstream, but typical multiprocessors (x86, Sparc, Power, ARM, Itanium) and programming languages (C, C++, Java) do not provide the sequentially consistent shared memory that has been assumed by most work on semantics and verification. Instead, they have subtle relaxed (or weak) memory models, exposing behaviour that arises from hardware and compiler optimisations to the programmer. Moreover, these memory models have usually been described only in ambiguous (and sometimes flawed) prose, leading to widespread confusion. This page collects work by a group of people working to develop mathematically rigorous and usable semantics for multiprocessor programs. We are focussing on three processor architectures (x86, Power, and ARM), on the recent revisions of the C++ and C languages (C++11 and C11), and on reasoning and verification using these models.

All Papers, Chronologically
Software and Testing Tools
People

x86
Reasoning about x86 concurrency: TRF
Power and ARM
C11 and C++11
Compiling from C/C++11 Concurrency to Power
Compiler Verification: CompCertTSO, from Concurrent Clight (with TSO semantics) to x86-TSO
Correctness of Compiler Optimisations for DRF
Program Logics for x86-TSO

Position Papers, Tutorials and Invited Talks
Funding

All Papers, Chronologically

A Tutorial Introduction to the ARM and POWER Relaxed Memory Models. Luc Maranget, Susmit Sarkar, and Peter Sewell. Draft.
Library Abstraction for C/C++ Concurrency (extended version). Mark Batty, Mike Dodds and Alexey Gotsman. In POPL 2013.
An Axiomatic Memory Model for POWER Multiprocessors. Sela Mador-Haim, Luc Maranget, Susmit Sarkar, Kayvan Memarian, Jade Alglave, Scott Owens, Rajeev Alur, Milo M. K. Martin, Peter Sewell, Derek Williams. In CAV 2012. (more details)
CompCertTSO: A Verified Compiler for Relaxed-Memory Concurrency. Jaroslav Sevcik, Viktor Vafeiadis, Francesco Zappa Nardelli, Suresh Jagannathan, Peter Sewell. Draft. (more details)
Synchronising C/C++ and POWER. Susmit Sarkar, Kayvan Memarian, Scott Owens, Mark Batty, Peter Sewell, Luc Maranget, Jade Alglave, Derek Williams. In PLDI 2012. (more details)
Clarifying and Compiling C/C++ Concurrency: from C++11 to POWER. Mark Batty, Kayvan Memarian, Scott Owens, Susmit Sarkar, and Peter Sewell. In POPL 2012. (more details)
Verifying Fence Elimination Optimisations. Viktor Vafeiadis, Francesco Zappa Nardelli. In SAS 2011. (more details)
Understanding POWER Multiprocessors. Susmit Sarkar, Peter Sewell, Jade Alglave, Luc Maranget, Derek Williams. In PLDI 2011. (more details)
Safe Optimisations for Shared-Memory Concurrent Programs. Jaroslav Ševčík. In PLDI 2011.
Nitpicking C++ Concurrency. Jasmin Christian Blanchette, Tjark Weber, Mark Batty, Scott Owens, Susmit Sarkar. In PPDP 2011.
Litmus: Running Tests Against Hardware. Jade Alglave, Luc Maranget, Susmit Sarkar and Peter Sewell. Tool Demonstration Paper. In TACAS 2011.
Relaxed-Memory Concurrency and Verified Compilation. Jaroslav Sevcik, Viktor Vafeiadis, Francesco Zappa Nardelli, Suresh Jagannathan, Peter Sewell. In POPL 2011. (more details)
Mathematizing C++ Concurrency. Mark Batty, Scott Owens, Susmit Sarkar, Peter Sewell, and Tjark Weber. In POPL 2011. (more details)
x86-TSO: A Rigorous and Usable Programmer's Model for x86 Multiprocessors. Peter Sewell, Susmit Sarkar, Scott Owens, Francesco Zappa Nardelli, Magnus O. Myreen. In Communications of the ACM (Research Highlights) 2010 No.7.
A Rely-Guarantee proof system for x86-TSO assembly code programs. Tom Ridge. In VSTTE 2010.
Reasoning about the Implementation of Concurrency Abstractions on x86-TSO. Scott Owens. In ECOOP 2010. (more details)
Fences in Weak Memory Models. Jade Alglave, Luc Maranget, Susmit Sarkar and Peter Sewell. In CAV 2010.
Fences in Weak Memory Models (Extended Version). Jade Alglave, Luc Maranget, Susmit Sarkar and Peter Sewell. In FMSD, Volume 40, Number 2, April 2012.
A Better x86 Memory Model: x86-TSO. Scott Owens, Susmit Sarkar, and Peter Sewell. In TPHOLs 2009. (more details)
The Semantics of x86-CC Multiprocessor Machine Code (in POPL 2009). Susmit Sarkar, Peter Sewell, Francesco Zappa Nardelli, Scott Owens, Tom Ridge, Thomas Braibant, Magnus O. Myreen, and Jade Alglave. (more details)
The Semantics of Power and ARM Multiprocessor Machine Code. Jade Alglave, Anthony Fox, Samin Ishtiaq, Magnus O. Myreen, Susmit Sarkar, Peter Sewell, and Francesco Zappa Nardelli. In DAMP 2009. (more details)

Software and Testing Tools

People

University of Cambridge:

University of Kent:

Scott Owens (previous Cambridge page)

University of York:

Mike Dodds

University of Leicester:

Tom Ridge

University College London:

Jade Alglave

Microsoft and MSR Cambridge:

Jaroslav Sevcik
Samin Ishtiaq

INRIA Paris-Rocquencourt:

Thomas Braibant (now INRIA Grenoble)
Luc Maranget
Pankaj Pawan
Francesco Zappa Nardelli

MPI-SWS:

Viktor Vafeiadis

Uppsala University:

Tjark Weber

Purdue University:

Suresh Jagannathan

IBM:

Derek Williams

University of Pennsylvania:

Sela Mador-Haim

x86 (more details)

For x86, we have produced the x86-TSO model. To the best of our knowledge, x86-TSO is sound with respect to actual processor behaviour, matches the current vendor intentions, and is a good model to program above. The model is described in the CACM paper below, written to be accessible for a broad audience, and formally defined in the TPHOLs 2009 paper below. The POPL 2009 paper gives a rather different model, x86-CC, that was based on the misleading vendor documentation of the time but that is not sound with respect to observable behaviour.

x86-TSO: A Rigorous and Usable Programmer's Model for x86 Multiprocessors. Peter Sewell, Susmit Sarkar, Scott Owens, Francesco Zappa Nardelli, Magnus O. Myreen. In Communications of the ACM (Research Highlights) 2010 No.7.
A Better x86 Memory Model: x86-TSO. Scott Owens, Susmit Sarkar, and Peter Sewell. In TPHOLs 2009.
The Semantics of x86-CC Multiprocessor Machine Code (in POPL 2009). Susmit Sarkar, Peter Sewell, Francesco Zappa Nardelli, Scott Owens, Tom Ridge, Thomas Braibant, Magnus O. Myreen, and Jade Alglave.
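
As an informal illustration of the store-buffer behaviour that x86-TSO describes, the following C++ sketch exhaustively explores a toy machine in which each thread has a FIFO store buffer in front of a shared memory, and runs the classic store-buffering litmus test on it. It is a simplification for exposition only, not the formal model from the papers above: the names `Instr`, `Thread`, `State`, `explore` and `sb_outcomes` are hypothetical, there are only two threads and two locations, and locked instructions are omitted.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <deque>
#include <set>
#include <utility>
#include <vector>

// Hypothetical encoding of a tiny instruction set (illustration only).
enum class Op { Store, Load, Fence };
struct Instr { Op op; int addr; int val; };  // val is used by Store only

struct Thread {
    std::vector<Instr> prog;
    std::size_t pc = 0;
    std::deque<std::pair<int, int>> buf;  // FIFO store buffer: (addr, value)
    int reg = 0;                          // result of the thread's last Load
};

struct State {
    std::array<int, 2> mem{{0, 0}};  // shared memory: locations 0 and 1
    std::array<Thread, 2> th;
};

using Outcomes = std::set<std::pair<int, int>>;

// Explore every interleaving of "execute next instruction" and
// "drain the oldest buffered store to memory" for both threads.
void explore(const State& s, Outcomes& out) {
    bool progressed = false;
    for (int t = 0; t < 2; ++t) {
        if (!s.th[t].buf.empty()) {  // drain oldest store to shared memory
            State n = s;
            std::pair<int, int> w = n.th[t].buf.front();
            n.th[t].buf.pop_front();
            n.mem[w.first] = w.second;
            explore(n, out);
            progressed = true;
        }
        if (s.th[t].pc < s.th[t].prog.size()) {
            const Instr i = s.th[t].prog[s.th[t].pc];
            // A fence may only execute once the thread's buffer is empty.
            if (i.op == Op::Fence && !s.th[t].buf.empty()) continue;
            State n = s;
            Thread& me = n.th[t];
            ++me.pc;
            if (i.op == Op::Store) {
                me.buf.emplace_back(i.addr, i.val);  // buffered, not yet visible
            } else if (i.op == Op::Load) {
                me.reg = n.mem[i.addr];
                // The newest buffered store to the same address wins.
                for (const auto& e : me.buf)
                    if (e.first == i.addr) me.reg = e.second;
            }
            explore(n, out);
            progressed = true;
        }
    }
    if (!progressed) out.insert(std::make_pair(s.th[0].reg, s.th[1].reg));
}

// Store-buffering test: thread t writes 1 to location t, then reads the
// other location. Returns all reachable (reg of thread 0, reg of thread 1).
Outcomes sb_outcomes(bool with_fence) {
    State s;
    for (int t = 0; t < 2; ++t) {
        s.th[t].prog.push_back({Op::Store, t, 1});
        if (with_fence) s.th[t].prog.push_back({Op::Fence, 0, 0});
        s.th[t].prog.push_back({Op::Load, 1 - t, 0});
    }
    Outcomes out;
    explore(s, out);
    return out;
}
```

In this toy machine, `sb_outcomes(false)` contains the outcome (0, 0), in which both loads read the initial value, which is not possible in a sequentially consistent execution. With a fence between each store and the following load, `sb_outcomes(true)` no longer contains it.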

Reasoning about x86 concurrency: TRF

The low-level assembly-language implementations of the abstractions that support language-level concurrency, such as locks and concurrent data structures, are particularly interesting and challenging to reason about because they invariably contain data races. The ECOOP 2010 paper below develops a novel principle for reasoning about assembly programs on our previous x86-TSO memory model, and uses it to analyze implementations of several concurrency abstractions: two spinlocks (from Linux), a non-blocking write protocol, the double-checked locking idiom, and java.util.concurrent's Parker. This triangular race freedom (TRF) principle strengthens the usual data-race freedom style of reasoning.

Reasoning about the Implementation of Concurrency Abstractions on x86-TSO. Scott Owens. In ECOOP 2010. (more details)

Power and ARM

Power and ARM have broadly similar relaxed-memory behaviour, as described in the tutorial below. We have produced several formal models. The PLDI 2011 paper below gives an abstract-machine model for Power that, to the best of our knowledge, accurately captures the architectural intent and observable processor behaviour for a wide range of subtle examples. This model was principally validated against Power, but we believe it is also broadly applicable to ARM. It was revised and extended to support the Power load-reserve/store-conditional operations as described in Sections 2 and 3 of our PLDI 2012 paper below. The CAV 2012 paper gives an axiomatic model that is provably equivalent to the PLDI 2011 operational model. The TACAS 2011 tool paper describes our Litmus tool for experimentally testing the behaviour of hardware; the associated diy tool suite supports generating and running litmus tests for Power, ARM, and x86. The CAV 2010 paper gives an earlier axiomatic model that is sound with respect to current POWER processor behaviour, as far as we know, but does not match the architectural intent for some examples (as described in the PLDI 2011 paper). It is based on a general framework developed in Alglave's thesis. The DAMP 2009 paper gives a very preliminary axiomatic model, based on a naive reading of the vendor documentation. See also more details of the CAV and DAMP work.

A Tutorial Introduction to the ARM and POWER Relaxed Memory Models. Luc Maranget, Susmit Sarkar, and Peter Sewell. Draft.
Understanding POWER Multiprocessors. Susmit Sarkar, Peter Sewell, Jade Alglave, Luc Maranget, Derek Williams. In PLDI 2011. (more details)
Synchronising C/C++ and POWER. Susmit Sarkar, Kayvan Memarian, Scott Owens, Mark Batty, Peter Sewell, Luc Maranget, Jade Alglave, Derek Williams. In PLDI 2012. (more details)
An Axiomatic Memory Model for POWER Multiprocessors. Sela Mador-Haim, Luc Maranget, Susmit Sarkar, Kayvan Memarian, Jade Alglave, Scott Owens, Rajeev Alur, Milo M. K. Martin, Peter Sewell, Derek Williams. In CAV 2012. (more details)
Litmus: Running Tests Against Hardware. Jade Alglave, Luc Maranget, Susmit Sarkar and Peter Sewell. Tool Demonstration Paper. In TACAS 2011. (the diy tool suite)
Fences in Weak Memory Models. Jade Alglave, Luc Maranget, Susmit Sarkar and Peter Sewell. In CAV 2010.
The Semantics of Power and ARM Multiprocessor Machine Code. Jade Alglave, Anthony Fox, Samin Ishtiaq, Magnus O. Myreen, Susmit Sarkar, Peter Sewell, and Francesco Zappa Nardelli. In DAMP 2009.

C11 and C++11

C and C++ are defined by standards, but those standards have historically not covered the behaviour of concurrent programs, motivating an ongoing effort to specify concurrent behaviour in the new revisions of C and C++, C11 and C++11 [WG14, WG21, Boehm and Adve, PLDI 2008]. The key issue here is the multiprocessor relaxed-memory behaviour induced by hardware and compiler optimisations. The proposals aim to provide strong guarantees for race-free programs, together with new (but subtle) relaxed-memory atomic primitives for high-performance concurrent code. However, the draft standards (e.g. the Final Committee Draft, N3092) are prose documents: while the result of careful deliberation, they have almost inevitably been unclear on some points, and have been subject to some subtle semantic flaws. The POPL 2011 paper below gives a mathematically rigorous semantics for C++ concurrency. To the best of our knowledge, this captures the intent of the revised standards texts, incorporating changes that we suggested; see the further details for the current full version of the model. We hope that this will aid discussion of any further changes to the standard, provide an unambiguous correctness condition for compilers, and give a basis for analysis and verification of concurrent C and C++ programs. As a starting point for the latter, the paper also proves correctness of a compilation scheme from C/C++11 concurrency to x86-TSO. The PPDP 2011 paper uses the Isabelle Nitpick tool to efficiently explore C/C++11 executions.

Mathematizing C++ Concurrency. Mark Batty, Scott Owens, Susmit Sarkar, Peter Sewell, and Tjark Weber. In POPL 2011. (more details)
Nitpicking C++ Concurrency. Jasmin Christian Blanchette, Tjark Weber, Mark Batty, Scott Owens, Susmit Sarkar. In PPDP 2011.
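
As an informal illustration of race-free programming with the C++11 atomic primitives discussed above, the following sketch passes a message from one thread to another using a release store and an acquire load. The function name `message_passing_once` and the variable names are hypothetical; this is a minimal example of the idiom, not code taken from the papers above.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical helper: returns the value the consumer reads from `payload`.
int message_passing_once() {
    int payload = 0;                 // ordinary, non-atomic data
    std::atomic<bool> ready{false};  // atomic flag used for synchronisation
    int seen = -1;

    std::thread producer([&] {
        payload = 42;                                  // write the data
        ready.store(true, std::memory_order_release);  // then publish it
    });
    std::thread consumer([&] {
        // Spin until the flag is observed. The acquire load synchronises
        // with the release store, so the write to payload happens before
        // the read below and the two accesses do not race.
        while (!ready.load(std::memory_order_acquire)) {
            std::this_thread::yield();
        }
        seen = payload;
    });

    producer.join();
    consumer.join();
    assert(seen == 42);  // guaranteed by the release/acquire pairing
    return seen;
}
```

If both accesses to `ready` used `std::memory_order_relaxed` instead, the accesses to `payload` would no longer be ordered and the program would contain a data race, so the language would give it no defined behaviour.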

Compiling from C/C++11 Concurrency to Power

In these POPL 2012 and PLDI 2012 papers we prove correctness of the proposed compilation scheme from C/C++11 concurrency primitives to Power (again, the situation for ARM is similar). The first deals with the various forms of C/C++11 atomic accesses with relaxed memory orders. The second extends this to cover other synchronisation operations including C/C++11 read-modify-write operations, locks, and fences; it includes an operational model for the Power load-reserve/store-conditional operations which are used to implement the former.

Clarifying and Compiling C/C++ Concurrency: from C++11 to POWER. Mark Batty, Kayvan Memarian, Scott Owens, Susmit Sarkar, and Peter Sewell. In POPL 2012. (more details)
Synchronising C/C++ and POWER. Susmit Sarkar, Kayvan Memarian, Scott Owens, Mark Batty, Peter Sewell, Luc Maranget, Jade Alglave, Derek Williams. In PLDI 2012. (more details)
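
To give a flavour of what such a compilation scheme looks like, the following sketch encodes a per-access mapping from C/C++11 atomic loads and stores to Power instruction sequences as a lookup function. The instruction sequences are our recollection of the commonly cited proposed mapping and should be checked against the papers above before being relied on; the names `Access`, `Order` and `power_sequence` are hypothetical, and consume loads, read-modify-writes and fences are left out.

```cpp
#include <cassert>
#include <string>

enum class Access { Load, Store };
enum class Order { Relaxed, Acquire, Release, SeqCst };

// Returns an (assumed) Power instruction sequence for one atomic access,
// or an empty string when the memory order is not valid for that access.
std::string power_sequence(Access a, Order o) {
    if (a == Access::Load) {
        switch (o) {
            case Order::Relaxed: return "ld";
            case Order::Acquire: return "ld; cmp; bc; isync";
            case Order::SeqCst:  return "hwsync; ld; cmp; bc; isync";
            default:             return "";  // release is not a load order
        }
    }
    switch (o) {
        case Order::Relaxed: return "st";
        case Order::Release: return "lwsync; st";
        case Order::SeqCst:  return "hwsync; st";
        default:             return "";  // acquire is not a store order
    }
}
```

A correctness proof for such a scheme has to show that every behaviour of the generated Power code, under the Power model, is allowed for the source program by the C/C++11 model.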

Compiler Verification: CompCertTSO, from Concurrent Clight (with TSO semantics) to x86-TSO (more details)

CompCertTSO is a compiler that generates x86 assembly code from ClightTSO, a large subset of the C programming language enhanced with concurrency primitives for thread management and synchronisation, and with a TSO relaxed memory model. The development is based on the CompCert 1.5 compiler from sequential Clight to PowerPC and ARM (developed by Leroy et al., INRIA). The CompCertTSO compiler is written mostly within the specification language of the Coq proof assistant, and its correctness --- the fact that, for well-defined source programs, any behaviour of the generated code (with x86-TSO semantics) is permitted by the source-language semantics --- has been proved within Coq. The POPL 2011 paper and draft below describe some of this work, and the SAS 2011 paper below describes further work on verified fence elimination.

CompCertTSO: A Verified Compiler for Relaxed-Memory Concurrency. Jaroslav Sevcik, Viktor Vafeiadis, Francesco Zappa Nardelli, Suresh Jagannathan, Peter Sewell. Draft.
Relaxed-Memory Concurrency and Verified Compilation. Jaroslav Sevcik, Viktor Vafeiadis, Francesco Zappa Nardelli, Suresh Jagannathan, Peter Sewell. In POPL 2011.
Verifying Fence Elimination Optimisations. Viktor Vafeiadis, Francesco Zappa Nardelli. In SAS 2011.

Correctness of Compiler Optimisations for DRF

Current proposals for concurrent shared-memory languages, including C++ and C, give well-defined semantics (in fact, sequential consistency) only for programs without data races; this is known as the DRF guarantee. The correctness of compiler optimisations under the DRF guarantee is not completely clear, and experience with Java shows that this area is error-prone. The PLDI 2011 paper below gives a rigorous study of optimisations that involve both reordering and elimination of memory reads and writes, covering many practically important optimisations. It also discusses some surprising limitations of the DRF guarantee. Earlier work by Ševčík and Aspinall, in ECOOP 2008, showed that the Java Memory Model disallows some standard optimisations, including some performed by the Hotspot JVM. See also more details.

Safe Optimisations for Shared-Memory Concurrent Programs. Jaroslav Ševčík. In PLDI 2011.

Program Logics for x86-TSO

The paper below develops a program logic for x86-TSO assembly programs, extending rely-guarantee reasoning. It is formalised in HOL4 and illustrated with Simpson's 4-slot algorithm.

A Rely-Guarantee proof system for x86-TSO assembly code programs. Tom Ridge. In VSTTE 2010.

Position Papers, Tutorials and Invited Talks

Making Sense of Relaxed-Memory Concurrency, Invited Talk, 6th International Workshop on Systems Software Verification (SSV 2011), Peter Sewell
Making Sense of Shared-Memory Concurrency: from Architecture to Programming Language, Invited Tutorial, MEMOCODE 2011. Peter Sewell
Shared Memory: An Elusive Abstraction. Summer school lectures, UPMARC Summer School on Multicore Computing, 2011. Francesco Zappa Nardelli
Making Sense of Real-World Concurrency, Invited Talk, ARM Global Engineering Conference, May 2011. Peter Sewell
Making Sense of Multiprocessor Memory, University of Kent School of Computing Seminar, March 2011. Peter Sewell
Shared Memory: An Elusive Abstraction. Summer school lectures, ECOOP 2010. Francesco Zappa Nardelli
Relaxed Memory Models in 2010. Summer school lectures, UPMARC Summer School on Multicore Computing. Peter Sewell
Memory, an Elusive Abstraction. Invited talk, International Symposium on Memory Management (ISMM 2010). Peter Sewell
Low-level Concurrency - For Real. Summer school lectures, TIC 2010 (Third International School on Trends in Concurrency). Peter Sewell
Multiprocessor Architectures Don't Really Exist (But They Should). Invited talk, Microprocessor Test and Verification (MTV 2010). Peter Sewell and Susmit Sarkar.
Relaxed memory models must be rigorous. Position paper, in EC^2 2009. Francesco Zappa Nardelli, Peter Sewell, Jaroslav Sevcik, Susmit Sarkar, Scott Owens, Luc Maranget, Mark Batty, and Jade Alglave

Funding

We acknowledge funding from:

ANR grant ANR-11-JS02-011 WMC
Semantic Foundations for Real-World Systems. EPSRC Leadership Fellowship (Sewell) EP/H005633/1. 2009-2014.
Multiprocessors: From Microarchitecture to Semantic Theory. EPSRC Postdoctoral Research Fellowship (Sarkar) EP/H027351. 2010-2013.
Reasoning with Relaxed Memory Models. EPSRC grant EP/F036345. 2008-2012. Sewell, Parkinson, Fraser, Zappa Nardelli, Sarkar
INRIA Equipes Associées MM
ANR grant ANR-06-SETI-010-02 ParSec
