Simplifying ARM Concurrency: Multicopy-atomic Axiomatic and Operational Models for ARMv8

Christopher Pulte, Shaked Flur, Will Deacon, Jon French, Susmit Sarkar, Peter Sewell

ARM has a relaxed memory model, previously specified in informal prose for ARMv7 and ARMv8. Over time, and partly due to work building formal semantics for ARM concurrency, it has become clear that some of the complexity of the model is not justified by the potential benefits. In particular, the model was originally non-multicopy-atomic: writes could become visible to some other threads before becoming visible to all. However, this has not been exploited in production implementations, the corresponding potential hardware optimisations are thought to have insufficient benefits in the ARM context, and it gives rise to subtle complications when combined with other ARMv8 features. The ARMv8 architecture has therefore been revised: it now has a multicopy-atomic model. It has also been simplified in other respects, including more straightforward notions of dependency, and the architecture now includes a formal concurrency model.

In this paper we detail these changes and discuss their motivation. We define two formal concurrency models: an operational one, simplifying the Flowing model of Flur et al., and the axiomatic model of the revised ARMv8 specification. The models were developed by an academic group and by ARM staff, respectively, and this extended collaboration partly motivated the above changes. We prove the equivalence of the two models. The operational model is integrated into an executable exploration tool with a new web interface, demonstrated by exhaustively checking the possible behaviours of a loop-unrolled version of a Linux kernel lock implementation, a previously known bug due to unprevented speculation, and a fixed version.

Paper

Simplifying ARM Concurrency: Multicopy-atomic Axiomatic and Operational Models for ARMv8. Christopher Pulte, Shaked Flur, Will Deacon, Jon French, Susmit Sarkar, Peter Sewell. In POPL 2018.

Errata

The paper's definition of candidate executions derived from operational model traces has an omission, and the definition of load/store-exclusive instructions in the text description in the supplementary material has a mismatch with the formal model, corrected here. See here for the details.

People

Christopher Pulte, University of Cambridge
Shaked Flur, University of Cambridge
Will Deacon, ARM Ltd.
Jon French, University of Cambridge
Susmit Sarkar, University of St Andrews
Peter Sewell, University of Cambridge

Supplementary Material

Web interface for the Flat-operational model (best used in Chrome/Chromium), and its user-guide help page.
The full text version of the Flat-operational model.
The Herd version of the ARMv8 axiomatic model.
All the litmus tests that appear in the paper.
The proof that Flat-operational and ARMv8-axiomatic are equivalent. In particular, Section 1 explains, for each clause of the ARMv8 axiomatic model, why it is sound with respect to the operational model. (This was updated after submission to typeset the proof.)
ARMv8-axiomatic results for the non-mixed-size test suite.
Flat-operational results for the non-mixed-size test suite.
Flat-operational results for the mixed-size test suite.