Project Suggestions 2023-24

These might suit good students interested in combining semantics, systems, and/or mechanised reasoning. They are probably best suited for an MPhil/Part III project rather than a Part II project, though it might be possible to craft a Part II project in a similar vein on a smaller scale.

Testable executable specification of ELF-based dynamic linking

Supervisors: Stephen Kell (King's College London) and Peter Sewell

A dynamic linker is involved in the execution of most programs today. However, dynamic linking remains poorly understood, both conceptually (by programmers) and semantically (in its specification). This project is about producing a testable, executable specification of ELF-based dynamic linking, or at least some subset of it, and applying it in one or more ways.

The core of the project will involve learning about dynamic linking and producing a basic specification, executable in some convenient fashion, e.g. as a checker that runs immediately before control passes to user code. This would check that linker-performed mappings (of needed libraries) and bindings (applying relocations etc. to plumb data and code together) meet expectations. It might also run again later, if dynamic loading occurs. Initial testing could focus on straightforward cases and on "known deficiencies" (e.g. that copy-relocation of a protected symbol does not work correctly on older versions of glibc, that DF_1_INTERPOSE is unimplemented by glibc and musl, and other examples that can be found in bug trackers).

Extension aspects could include some of the following.

Various starting points are available: a Lem specification for compile-time ('static') linking published at OOPSLA 2016 (The missing link: explaining ELF static linking, semantically. Stephen Kell, Dominic P. Mulligan, and Peter Sewell), and a prior attempt at a load-time checker, in Simon Ser's Paris MPRI project.

Since the specification would be written in Lem, a purely functional ML-family specification language, functional programming skills are required. Familiarity with systems programming and outline familiarity with linking are also important, although it is not important to understand linking deeply at the start of the project.

Test infrastructure for relaxed virtual memory concurrency

Supervisors: Ben Simner + Peter Sewell

The Arm-A architecture (like most others) has a "relaxed memory model", a concurrency semantics that allows hardware optimisations such as out-of-order execution to be observable in a program's concurrency behaviour - as described in the Multicore Semantics and Programming course slides.

Recent and ongoing research is extending the scope of relaxed-memory semantics to cover systems features, including virtual memory, instruction fetch, and exceptions:

An important part of this work is experimental testing of the behaviour of hardware on small litmus tests, and comparison of that with the behaviour allowed by models. The Litmus test harness has been developed for many years to do the former:

Virtual memory tests make this especially challenging: early litmus tests could be run in user mode, but testing Stage 1 and Stage 2 address translation requires the test harness to run at operating-system and hypervisor exception levels (EL1 and EL2 in Arm-A), respectively. The Litmus tool has some support for Stage 1 testing, and Simner has a bare-metal harness described in the ESOP 2022 paper above, but neither copes with the full range of virtual memory and other systems tests we are now using, especially supporting the complex initial virtual-memory configurations that are needed and the full combination of Stage 1 and Stage 2.

This project would develop a new experimental test harness for running systems tests (perhaps building on the above, or with some other strategy to be discussed).

Test generation for relaxed virtual memory concurrency

Supervisors: Thibaut Pérami + Ben Simner + Peter Sewell

The Arm-A architecture (like most others) has a "relaxed memory model", a concurrency semantics that allows hardware optimisations such as out-of-order execution to be observable in a program's concurrency behaviour - as described in the "Multicore Semantics and Programming" course.

Those semantics have been formalised by various models, some of which are executable - i.e., for which there is a software tool that takes as input a test case and a possible hardware behaviour and decides whether it is allowed by the architecture. This is essential when developing new formal models of the architecture or extending previous versions, both for comparing models against experimentally observed hardware behaviour (to validate the model and/or to discover hardware bugs) and for comparing one model against another. Such comparisons also rely critically on having a good suite of interesting tests, hand-written and/or automatically generated. Several tools for generating interesting tests have been developed, including diy (https://diy.inria.fr/) and memalloy (https://johnwickerson.github.io/memalloy/); related work, such as MemSynth (https://homes.cs.washington.edu/~emina/doc/memsynth.pldi17.pdf), uses SMT solvers to compare models.

Recent and ongoing research is extending the scope of relaxed-memory semantics to cover systems features, including virtual memory, instruction fetch, and exceptions:

We have a substantial corpus of hand-written tests, symbolic evaluation tooling to compute whether a test is allowed in an arbitrary model (isla, https://www.cl.cam.ac.uk/~pes20/isla/isla-cav2021-extended.pdf), and some tooling for executing tests experimentally on hardware. We now think that it would be best to explore clean-slate automatic test generation to cover these new features. The particular aim of this project is to implement a tool similar to diy that focuses on tests and models for the virtual memory architecture of Arm processors. There is scope for making use of SMT solvers to produce interesting tests, e.g. tests that discriminate between specific candidate models.

The student should:

Equivalence of Arm-A relaxed memory formalisations

Supervisors: Christopher Pulte + Thibaut Pérami + Ben Simner + Peter Sewell

The Arm-A architecture (like most others) has a "relaxed memory model", a concurrency semantics that allows hardware optimisations such as out-of-order execution to be observable in a program's concurrency behaviour - as described in the "Multicore Semantics and Programming" course.

To precisely define and document the semantics for hardware and software implementors and to support testing and verification, different formalisations have been developed for Armv8-A: (1) an operational semantics, expressing the concurrency behaviour using mechanisms resembling micro-architecture, and (2) an axiomatic model, capturing the concurrency behaviour abstractly using relations on memory events and axioms over them.
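To give a flavour of the axiomatic style (using the notation of the herd-family models, with po-loc the per-location restriction of program order, rf the reads-from relation, co coherence order, and fr the derived from-reads relation), the "internal visibility" axiom requires SC-per-location:

```latex
\[
\mathrm{acyclic}\bigl(\mathit{po\text{-}loc} \;\cup\; \mathit{rf} \;\cup\; \mathit{co} \;\cup\; \mathit{fr}\bigr)
\]
```

This particular axiom is common to many such models; the full Armv8-A axiomatic model adds further axioms, notably external visibility over an "ordered-before" relation and an atomicity condition for read-modify-write pairs.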

A detailed paper proof shows these two models to be equivalent (http://www.cl.cam.ac.uk/~pes20/armv8-mca/armv8-mca-draft.pdf, https://www.repository.cam.ac.uk/handle/1810/292229), but so far there is no mechanised proof of this - so the paper proof might contain errors, and in any case is hard to maintain as the models evolve. The goal of this project is a mechanised equivalence proof between versions of the axiomatic and operational models, cut down or simplified enough to make this viable within the project scope. This would involve formalising model definitions and mechanising the existing paper proof in a proof assistant such as Coq or Isabelle. Some initial infrastructure for these definitions will be available in Coq.

Students interested in this project should have some experience in working with a mechanised proof assistant.