C remains central to our computing infrastructure but still lacks a
clear and complete semantics. Programmers lack tools to explore the
range of behaviours they should expect; compiler development lacks
test oracles; and formal verification and analysis must make
(explicitly or implicitly) many choices about the specific C they
target. The ISO standards for C have been developed over many years
but they remain unclear in some important respects, and in some areas
there are differences between the properties of C that are relied on
by the corpus of systems code in production, the properties that
compiler implementers aim to provide, and the properties of C as it is
specified in ISO C11.
The Cerberus project is developing semantic models for a substantial
fragment of C. It has several distinctive features:
This is a step towards a clear, consistent, and unambiguous semantics for C.
- Where the ISO C11 standard is clear and corresponds with
practice, Cerberus aims to follow that.
- Where there are ambiguities or differences, chiefly in
the memory layout model (the behaviour of pointers,
pointer arithmetic, uninitialised values, etc.), we aim to
clarify the de facto standards and understand how the ISO
standard could be reconciled with them.
- Cerberus precisely defines the range of allowed behaviour, not just
that of some specific implementation.
- It is executable, to explore either all behaviours or single
paths of small test programs.
- Its thread-local semantics is factored via an elaboration into a
simpler Core language, to make it readable and conceptually and
mathematically tractable; the dynamic semantics of Core can be
linked with various memory object models
- The Cerberus front-end is written from scratch to closely follow the C11
standard, including a parser that follows the C11 standard
grammar, and a typechecker.
- The Cerberus BMC tool supports bounded model checking (for small
examples), combining support for much of the Cerberus
thread-local semantics, a modern memory object model, and an
arbitrary axiomatic concurrency model.
- A previous version of Cerberus supported integration with an
operational concurrency model, proved equivalent to the C/C++11
axiomatic concurrency model of Batty et al.
Cerberus Web Interface
The Cerberus web
interface lets one interactively, randomly, or exhaustively
explore the behaviour of small sequential C test programs in the Cerberus
Cerberus BMC Web Interface
The Cerberus BMC web
interface lets one
explore the behaviour of small concurrent C test programs with
respect to an arbitrary axiomatic concurrency model.
- Cerberus-BMC: a principled reference semantics and exploration tool
for concurrent and sequential C. Stella Lau, Victor B. F. Gomes, Kayvan Memarian, Jean Pichon-Pharabod, and
In CAV 2019: Proc. 31st International Conference on
Computer-Aided Verification, July 2019.
Exploring C Semantics and Pointer Provenance.
Kayvan Memarian, Victor B. F. Gomes, Brooks Davis, Stephen Kell,
Robert N. M. Watson, Peter Sewell.
In Proc. 46th ACM SIGPLAN Symposium on Principles of Programming Languages
Supplementary material (tests and test results).
Into the Depths of C: Elaborating the De Facto Standards.
Kayvan Memarian, Justus Matthiesen,
James Lingard, Kyndylan Nienhuis, David Chisnall,
Robert N. M. Watson, and Peter Sewell.
In Proc. 37th
ACM SIGPLAN Conference on Programming Language Design and
Implementation (PLDI 2016). Distinguished paper award.
- An operational semantics for C/C++11
Kyndylan Nienhuis, Kayvan Memarian, and Peter Sewell.
In OOPSLA 2016.
This gives an operational semantics for C11 concurrency, proved
equivalent (in Isabelle/HOL) to the axiomatic model of Batty et al.
C Memory Object Model Study Group
The C/C++ Memory Object Models: ISO WG21 papers
WG21 Cologne meeting, July 2019
The C Memory Object Model: ISO WG14 papers
WG14 virtual meeting, 2022-07
WG14 virtual meeting, 2021-03
WG14 virtual meeting, 2020-10
WG14 London meeting, 2019-04
C provenance semantics: slides (extracts from N2363).
Peter Sewell, Kayvan Memarian, Victor
B. F. Gomes, Jens Gustedt, and Martin Uecker.
ISO/IEC JTC1/SC22/WG14 N2378. April 2019.
N2362: Moving to a provenance-aware memory object model for
C, Jens Gustedt, Peter Sewell, Kayvan Memarian, Victor
B. F. Gomes, and Martin Uecker.
ISO/IEC JTC1/SC22/WG14 N2362 v1,
http://www.open-std.org/jtc1/sc22/wg14/www/docs/n2362.pdf, March 2019.
N2363: C provenance semantics: examples,
Peter Sewell, Kayvan Memarian, Victor B. F. Gomes, Jens Gustedt, and Martin
ISO/IEC JTC1/SC22/WG14 N2363,
http://www.open-std.org/jtc1/sc22/wg14/www/docs/n2363.pdf, April 2019.
- tarball of N2363
N2364: C provenance semantics: detailed semantics (for
PNVI-plain, PNVI address-exposed, PNVI address-exposed user-disambiguation,
and PVI models), Peter Sewell, Kayvan Memarian, and Victor
B. F. Gomes.
ISO/IEC JTC1/SC22/WG14 N2364,
http://www.open-std.org/jtc1/sc22/wg14/www/docs/n2364.pdf, April 2019.
N2369: Pointer lifetime-end zap, Paul E.
McKenney, Maged Michael, and Peter Sewell.
ISO/IEC JTC1/SC22/WG14 N2369,
http://www.open-std.org/jtc1/sc22/wg14/www/docs/n2369.pdf, April 2019.
WG14 Pittsburgh meeting 2018-10
WG14 Brno meeting working drafts 2018-04
WG14 Brno pre-meeting mailing, 2018-04
WG14 Pittsburgh meeting, 2016-10
WG14 London meeting, 2016-04
Here is a tarball of our test programs, including a tool to run
them and log the compile- and run-time output:
In 2013, we disseminated a web form made of 42 questions, ranging over
the semantics of pointers, the representation of objects, and the
interaction of the two with other values (e.g. through the use of cast
operators). Each question consisted of a prose description of a
programming idiom, followed by a concrete C program.
We targeted this to a small number of experts, but this survey demanded a lot offrom the respondents; it was best done by discussion in person over several hours - so this was limited to 16 participants. A (no longer active) snapshot of the questions is here.
In April-September 2015 we distributed a survey of 15 questions about
C (What is C in practice? (Cerberus survey v2)). We were asking what
C is in current mainstream practice: the behaviour that programmers
assume they can rely on, the behaviour provided by mainstream
compilers, and the idioms used in existing code, especially systems
code. We were explicitly not asking what the ISO C standard permits,
which is often more restrictive, or about obsolete or obscure hardware
or compilers. We focussed on the behaviour of memory and pointers.
We had responses from 323 people, including many compiler
and OS developers. Here
is a summary of the results and the (anonymised) detailed comments:
The main Cerberus developer is Kayvan Memarian. The experimental CN backend is by Christopher Pulte. Victor Gomes made substantial contributions across the system, and Stella Lau was the main developer of Cerberus BMC. Kyndylan Nienhuis worked on the operational semantics for C11 concurrency. Cerberus originated with Justus Matthiesen's 2010-11 Part II project dissertation and his 2011-12 MPhil dissertation. James Lingard's 2013-14 MPhil dissertation developed a certifying translation validator for simple C programs for the Clang front-end, w.r.t. the Cerberus and Vellvm semantics.
This software was developed largely within the Rigorous Engineering of
Mainstream Systems (REMS) project at the University of Cambridge. It
has received funding from the European Research Council (ERC) under
the European Union's Horizon 2020 research and innovation programme
(grant agreement No 789108, ELVER); the EPSRC Programme Grant REMS:
Rigorous Engineering of Mainstream Systems (EP/K008528/1); an EPSRC
Leadership Fellowship EP/H005633 (Sewell); a Gates Cambridge
Scholarship (Nienhuis); an MIT EECS Graduate Alumni Fellowship (Lau);
The work is also part of the CTSRD projects sponsored by the Defense
Advanced Research Projects Agency (DARPA) and the Air Force Research
Laboratory (AFRL), under contract FA8750-10-C-0237. The views,
opinions, and/or findings contained in the papers are those of the
authors and should not be interpreted as representing the official
views or policies, either expressed or implied, of the Department of
Defense or the U.S. Government.