C remains central to our computing infrastructure but still lacks a
clear and complete semantics. Programmers lack tools to explore the
range of behaviours they should expect; compiler development lacks
test oracles; and formal verification and analysis must make
(explicitly or implicitly) many choices about the specific C they
target. The ISO standards for C have been developed over many years
but they remain unclear in some important respects, and in some areas
there are differences between the properties of C that are relied on
by the corpus of systems code in production, the properties that
compiler implementers aim to provide, and the properties of C as it is
specified in ISO C11.
The Cerberus project is developing a semantic model for a substantial
fragment of C. It has several distinctive features:
This is a step towards a clear, consistent, and unambiguous semantics for C.
- Where the ISO C11 standard is clear and corresponds with
practice, Cerberus aims to follow that.
- Where there are differences, chiefly in the memory layout
model (the behaviour of pointers, pointer arithmetic,
uninitialised values, etc.), we aim to clarify the de facto
standards and understand how the ISO standard could be reconciled
- Cerberus precisely defines a range of allowed behaviour, not just
that of some specific implementation.
- It is executable, to explore all behaviours or single paths of
small test programs.
- It is designed to be integrated with an operational model for C11
concurrency, with a mechanised proof of equivalence to the
axiomatic C11 model of Batty et al.
- Its thread-local semantics is factored via an elaboration into a
simpler Core language, to make it readable and conceptually and
mathematically tractable; the dynamic semantics of Core can be
linked with our sequential memory layout model or with that
operational concurrency model.
- Its front-end is written from scratch to closely follow the C11
standard, including a parser that follows the C11 standard
grammar, and a typechecker.
The C Memory Layout Model
We have assembled a collection of questions about the C memory layout
semantics, supported by small test-case programs and
discussion with respect to the ISO and de facto standards. This note
summarises the most important questions, from an ISO-standard point of
The following discusses the questions in more detail, including the
test programs and some experimental data:
Here is a tarball of our test programs, including a tool to run
them and log the compile- and run-time output:
In April-September 2015 we distributed a survey of 15 questions about
C (What is C in practice? (Cerberus survey v2)). We were asking what
C is in current mainstream practice: the behaviour that programmers
assume they can rely on, the behaviour provided by mainstream
compilers, and the idioms used in existing code, especially systems
code. We were explicitly not asking what the ISO C standard permits,
which is often more restrictive, or about obsolete or obscure hardware
or compilers. We focussed on the behaviour of memory and pointers.
We had responses from 323 people, including many compiler
and OS developers. Here
is a summary of the results and the (anonymised) detailed comments:
The C Memory Concurrency Model
We have developed an operational semantics for C11 concurrency, proved
equivalent (in Isabelle/HOL) to the axiomatic model of Batty et al.
The current main developers are Kayvan Memarian, Victor Gomes, and Kyndylan Nienhuis.
Cerberus originated with Justus Matthiesen's 2010-11 Part II project
dissertation and his 2011-12 MPhil dissertation; James Lingard's
2013-14 MPhil dissertation developed a certifying translation
validator for simple C programs for the Clang front-end, w.r.t. the
Cerberus and Vellvm semantics.
This work is funded by REMS: Rigorous Engineering of Mainstream
Systems, EPSRC Programme Grant
EPSRC grant EP/H005633 (Leadership Fellowship, Sewell),
a Gates Cambridge Scholarship (Nienhuis).
This work is also part of the CTSRD projects sponsored by the Defense
Advanced Research Projects Agency (DARPA) and the Air Force Research
Laboratory (AFRL), under contract FA8750-10-C-0237.
The views, opinions, and/or findings contained in this paper are those of the
authors and should not be interpreted as representing the official views or
policies, either expressed or implied,
of the Department of Defense or the U.S. Government.