Cerberus

C remains central to our computing infrastructure but still lacks a clear and complete semantics. Programmers lack tools to explore the range of behaviours they should expect; compiler development lacks test oracles; and formal verification and analysis must make (explicitly or implicitly) many choices about the specific C they target. The ISO standards for C have been developed over many years but they remain unclear in some important respects, and in some areas there are differences between the properties of C that are relied on by the corpus of systems code in production, the properties that compiler implementers aim to provide, and the properties of C as it is specified in ISO C11.

The Cerberus project is developing a semantic model for a substantial fragment of C. It has several distinctive features:

This is a step towards a clear, consistent, and unambiguous semantics for C.

Papers

The C Memory Layout Model

We have assembled a collection of questions about the C memory layout semantics, supported by small test-case programs and discussion with respect to the ISO and de facto standards. This note summarises the most important questions, from an ISO-standard point of view: The following discusses the questions in more detail, including the test programs and some experimental data: Here is a tarball of our test programs, including a tool to run them and log the compile- and run-time output: In April-September 2015 we distributed a survey of 15 questions about C (What is C in practice? (Cerberus survey v2)). We were asking what C is in current mainstream practice: the behaviour that programmers assume they can rely on, the behaviour provided by mainstream compilers, and the idioms used in existing code, especially systems code. We were explicitly not asking what the ISO C standard permits, which is often more restrictive, or about obsolete or obscure hardware or compilers. We focussed on the behaviour of memory and pointers. We had responses from 323 people, including many compiler and OS developers. Here is a summary of the results and the (anonymised) detailed comments:

The C Memory Concurrency Model

We have developed an operational semantics for C11 concurrency, proved equivalent (in Isabelle/HOL) to the axiomatic model of Batty et al.

People

Contributors: The current main developers are Kayvan Memarian, Victor Gomes, and Kyndylan Nienhuis. Cerberus originated with Justus Matthiesen's 2010-11 Part II project dissertation and his 2011-12 MPhil dissertation; James Lingard's 2013-14 MPhil dissertation developed a certifying translation validator for simple C programs for the Clang front-end, w.r.t. the Cerberus and Vellvm semantics.

Funding

This work is funded by REMS: Rigorous Engineering of Mainstream Systems, EPSRC Programme Grant EP/K008528/1, EPSRC grant EP/H005633 (Leadership Fellowship, Sewell), and a Gates Cambridge Scholarship (Nienhuis). This work is also part of the CTSRD projects sponsored by the Defense Advanced Research Projects Agency (DARPA) and the Air Force Research Laboratory (AFRL), under contract FA8750-10-C-0237. The views, opinions, and/or findings contained in this paper are those of the authors and should not be interpreted as representing the official views or policies, either expressed or implied, of the Department of Defense or the U.S. Government.