REMS group project suggestions for 2025-2026

These might suit good students interested in combining semantics, systems, and/or mechanised reasoning. Some are probably best suited for an MPhil/Part III project rather than a Part II project, though it might be possible to craft a Part II project in a similar vein on a smaller scale. Members of the REMS group might be interested in supervising other projects, using or building semantics for aspects of real-world computer systems that have been developed in the group.

Cerberus-Fulminate student project suggestion (added 2025-08-26)

Potential supervisors: Rini Banerjee with Kayvan Memarian / Neel Krishnaswami / Peter Sewell

Systems software (e.g. operating systems, hypervisors) forms the foundation of all modern computing infrastructure, and so ensuring its reliability is critical. The majority of such software is written in low-level, imperative programming languages like C. Reasoning about idiomatic, real-world C code is challenging for a number of reasons: (1) its semantics is complex, including undefined behaviours, complex control-flow, and implicit type coercions; and (2) C allows programmers to manually manage memory via pointers, which improves the performance of their code but also leads to complex ownership patterns which are difficult to reason about.

One tool for reasoning about low-level, pointer-manipulating programs is separation logic, which provides operators for explicitly describing sophisticated ownership patterns, such as those present in a typical C program. For example, p ↦ v reads as “the address p points to the value v in heap memory”, and the * operator allows assertions on distinct subsets of the heap, or heaplets, to be composed together easily.

Separation logic has formed the basis of a number of proof and static analysis tools in recent years. One such tool, CN [1], supports separation-logic testing and verification of C programs via user-provided specifications. CN is built on top of Cerberus [2], a well-validated executable semantics for a large subset of ISO C, meaning it can be applied to a large fragment of real C programs. The runtime testing part of the CN toolchain, known as Fulminate [3], translates the user-provided specifications into C assertions and augments the source program with checks in the expected places, which includes dynamic ownership checking. Separation-logic specification testing is a relatively recent research endeavour, due to both technical and social barriers up to this point, and we would like to explore this direction further.

For this project, we would like to extend the Cerberus runtime to perform runtime ownership checks and logical assertion checks using the existing CN specification information in the CN/Cerberus pipeline. These would morally be the same checks as what Fulminate is doing currently, with the added benefits of:

Being able to quickly add runtime checking of new CN constructs to the pipeline in the future, without having to augment the existing Fulminate machinery (which requires specialist knowledge of the codebase and additional considerations on how functional CN features can even be represented in C)
Being able to debug directly from the OCaml that CN, Fulminate and Cerberus are implemented in, rather than trying to decipher the result of running gdb/lldb on the Fulminate-instrumented C program

[1] CN: Verifying Systems C Code with Separation-Logic Refinement Types. Christopher Pulte, Dhruv C. Makwana, Thomas Sewell, Kayvan Memarian, Peter Sewell, Neel Krishnaswami. POPL 2023.

[2] The Cerberus C Semantics. Kayvan Memarian. Technical report, 2023.

[3] Fulminate: Testing CN Separation-Logic Specifications in C. Rini Banerjee, Kayvan Memarian, Dhruv Makwana, Christopher Pulte, Neel Krishnaswami, Peter Sewell. POPL 2025.

Semantics-based symbolic execution of relocatable object files in Compiler Explorer (or in VScode, or in gdb/lldb)

Compiler Explorer can compile code in a range of languages, showing the source and compiled code in various ways, for example: https://gcc.godbolt.org/z/aezKnnrTa.

It can run the code - but it can’t show you what each individual instruction does.

We have instruction-set architecture (ISA) semantics in Sail for Arm (automatically translated from the Arm-internal ASL definition), RISC-V (maintained by RISC-V International), and various CHERI architectures.

What one would actually like to see execute is the relocatable object file generated by a compiler, as the relocations, together with the generated DWARF debug information, relate the assembly and runtime state to the source program. That’s made more complex by the fact that the debug information is also subject to relocations.

Previous work has developed an executable semantics for a substantial fragment of DWARF, extended in Matej Urban’s Part II project to support some Arm relocations and a symbolic evaluator.

This project would combine work on the semantics with development of a GUI - ideally integrated with Compiler Explorer, and/or as a Visual Studio plugin, and/or integrated with the gdb or lldb command-line interface, that lets one step through the generated object file, both at the granularity of instructions (like a traditional debugger) and also within the intra-instruction behaviour defined by the Sail models. One would have to make the DWARF semantics more complete and robust, e.g. to deal with different DWARF versions and all the Arm and RISC-V relocations, and validate that that really works on a range of object files. The GUI could be of varying levels of sophistication, to let the user control stepping both between and within instructions, and to render as much of the machine state as the user wants at any time.

Testable executable specification of ELF-based dynamic linking

Supervisors: Stephen Kell (King’s College London) and Peter Sewell

A dynamic linker is involved in the execution of most programs today. However, dynamic linking remains poorly understood, both conceptually (by programmers) and semantically (in its specification). This project is about producing a testable, executable specification of ELF-based dynamic linking, or at least some subset of it, and applying it in one or more ways.

The core of the project will involve learning about dynamic linking and producing a basic specification, executable in some convenient fashion e.g. as a checker that runs immediately before control passes to user code. This would check that linker-performed mappings (of needed libraries) and bindings (applying relocations etc. to plumb data and code together) meet expectations. It might also run again later, if dynamic loading occurs. Initial testing could focus on straightforward cases and “known deficiencies” (e.g. that copy-relocation of a protected symbol does not work correctly on older versions of glibc, that DF_1_INTERPOSE is unimplemented by glibc and musl, and other examples that can be found in bug trackers).

Extension aspects could include some of the following.

Testing it on a substantial corpus of programs e.g. drawn from FreeBSD, and ideally on more than one dynamic linker (e.g. FreeBSD’s, glibc’s, musl’s).
Extending it to complex features of dynamic linking and/or different scenarios (e.g. there is an opportunity to model CHERI linking here, which makes various security-critical additions to the basic ELF model).
Extending the support for automated testing, e.g. generation of random inputs or mutations thereof, to allow fuzz testing.
Using it for one or more simple proofs, e.g. that a specific single dynamic relocation at a call site has the result of yielding the expected call instruction (with reference to an instruction set semantics).

Various starting points are available: a Lem specification for compile-time (‘static’) linking published at OOPSLA 2016 (The missing link: explaining ELF static linking, semantically. Stephen Kell, Dominic P. Mulligan, and Peter Sewell), and a prior attempt at a load-time checker, in Simon Ser’s Paris MPRI project.

Since the specification would be written in Lem, a purely functional ML-family specification language, functional programming skills are required. Familiarity with systems programming and outline familiarity with linking are also important, although it is not important to understand linking deeply at the start of the project.

Porting ArchSem, a CPU architecture semantic framework, from Rocq to Lean

Supervisors: Thibaut Pérami and Tobias/Mathieu/Leo?

We have developed a Rocq framework ArchSem to describe the semantics of model CPU architecture. It joins Rocq code produced from Sail models for Arm and RISC-V with concurrency weak memory models (TODO some links) through an interface that is essentially a free monad over a given set of architectural effects such as reading or writing registers and memory.

ArchSem allows us, using those two components, to define whole-CPU architecture models for a multicore CPU, with virtual memory, self-modifying code, etc. The goal is that those models can be extracted for execution in OCaml, although this hasn’t been done yet.

We have also recently made a new backend of Sail targeting Lean to be able to extract those Sail models to Lean. This backend currently only targets a simple sequential state monad with no concurrency support.

The goal of this project is to port ArchSem into Lean, integrating it with the Sail-to-Lean backend in a way that can be used both for proofs in Lean, and for model testing and evaluation. For latter some performance optimisation might be needed if the student wants to pursue that road.

The student will have to implement in Lean, or reuse from existing libraries, some data structures such as an effect system, free monads and their traces, relational algebra constructs, and various concrete execution monads, potentially modifying them from the Rocq implementation to they can fit idiomatically in Lean.

Then they can use those to redefine the rest of the semantics as in ArchSem. The goal would be to be able to port successfully one concurrency model from there - not necessarily all of ArchSem.

Extensions could be: - Support execution of operational and/or axiomatic models within Lean on litmus tests, gluing onto or reimplementing an existing litmus test front-end. - Improve performance to be able to run larger and larger tests against the models. - Test implemented models against large existing test suites of architectural litmus test. - Prove some basic properties of the setup or the supporting libraries, for example correctness of some basic library functions. - Implement more than one model from either the original ArchSem implementation or from the literature and test them

The student will need a good knowledge of functional programming and ideally some experience with dependent type programming. Previous familiarity with an interactive theorem prover (ideally Rocq or Lean) is a plus although no proofs are required in this project. Obviously previous experience in Lean itself would be beneficial, but is not required. Some knowledge of system programming and assembly languages (Arm or RISC-V) would also be a significant plus.

Next-generation tools for exploring relaxed-memory concurrency behaviour.

Supervisors: Some combination of Peter Sewell, Thibaut Pérami, and Alasdair Armstrong.

RMEM and isla-axiomatic are tools for exploring the relaxed-memory concurrency behaviour allowed by the ARM, IBM POWER, RISC-V, and x86 architectures, as expressed by operational models (RMEM) and axiomatic models (isla-axiomatic). Both let one take small test programs, either concurrency litmus tests or standalone ELF binaries, and explore the envelope of all their architecturally allowed behaviour. For operational models, RMEM lets one do that interactively, randomly, or exhaustively; for axiomatic models; isla-axiomatic lets one compute whether some final-state assertion can hold. Each has a command-line version and a web version for convenient use - for RMEM, compiled to JavaScript and running in the browser; for isla-axiomatic, running as a server.

However:

we now have semantics for several aspects of systems behaviour - instruction fetch and icache/dcache management, virtual memory and TLB management, and exceptions;
work is in progress on more rigorously defined versions of these models, in the Rocq theorem prover rather than just in the code of the tool, and we need command-line and web-interface support for those;
the existing RMEM web interface is old and in several respects rather awkward; and
ideally one would have one common tool and interface, that can handle both kinds of model.

This project would be to create and evaluate such tooling. The exact scope would have to be determined, but it would involve both the underlying relaxed semantics of modern architectures and engineering usable web interfaces.