Clarifying and Compiling C/C++ Concurrency: from C++11 to POWER

The upcoming C and C++ revised standards add concurrency to the languages, for the first time, in the form of a subtle relaxed memory model (the C++11 model). This aims to permit compiler optimisation and to accommodate the differing relaxed-memory behaviours of mainstream multiprocessors, combining simple semantics for most code with high-performance low-level atomics for concurrency libraries.

In this paper, we first establish two simpler but provably equivalent models for C++11, one for the full language and another for the subset without consume operations. Subsetting further to the fragment without low-level atomics, we identify a subtlety arising from atomic initialisation and prove that, under an additional condition, the model is equivalent to sequential consistency for race-free programs.

We then prove our main result, the correctness of two proposed compilation schemes for the C++11 load and store concurrency primitives to Power assembly, having noted that an earlier proposal was flawed. (The main ideas apply also to ARM, which has a similar relaxed memory architecture.)

This should inform the ongoing development of production compilers for C++11 and C1x, clarifies what properties of the machine architecture are required, and builds confidence in the C++11 and Power semantics.

Papers

Errata

See here.

Supplementary material

The models are expressed in Lem code, a lightweight language for executable mathematical definitions.

POWER Model (for more details, see Understanding POWER Multiprocessors)

C++11 Models (for more details, see Mathematizing C++ Concurrency)

Proofs

Examples used in pointwise optimality result

People

University of Cambridge:

INRIA Paris-Rocquencourt:


[validate]