These might suit good students interested in combining semantics, systems, and/or mechanised reasoning. Some are probably best suited for an MPhil/Part III project rather than a Part II project, though it might be possible to craft a Part II project in a similar vein on a smaller scale. Members of the REMS group might be interested in supervising other projects, using or building semantics for aspects of real-world computer systems that have been developed in the group.
Supervisors: Peter Sewell and Stephen Kell (King's College London)
The DWARF format is widely used to express debug information in compiled binaries to support source-level debugging, e.g. via gdb. It includes information about:
Previous work in the REMS project developed an executable specification for a large fragment of DWARF, as an extension to our specification for ELF files and compile-time ('static') linking published at OOPSLA 2016 (The missing link: explaining ELF static linking, semantically. Stephen Kell, Dominic P. Mulligan, and Peter Sewell). These are expressed in the pure functional specification language Lem, which is similar to the pure functional fragment of core OCaml and can be translated into OCaml for execution, and to theorem-prover definitions for reasoning. Previous work also developed tooling to render the debug information, as computed by this specification, into a convenient human-readable format, e.g. this example, and used the debug information for translation validation, to help prove the correspondence between source and binary for very small programs.
There are now several interesting challenges:
Supervisors: Stephen Kell (King's College London) and Peter Sewell
A dynamic linker is involved in the execution of most programs today. However, dynamic linking remains poorly understood, both conceptually (by programmers) and semantically (in its specification). This project is about producing a testable, executable specification of ELF-based dynamic linking, or at least some subset of it, and applying it in one or more ways.
The core of the project will involve learning about dynamic linking and
producing a basic specification, executable in some convenient fashion e.g. as
a checker that runs immediately before control passes to user code. This would
check that linker-performed mappings (of needed libraries) and bindings
(applying relocations etc. to plumb data and code together) meet expectations. It might also run again
later, if dynamic loading occurs. Initial testing could focus on
straightforward cases and "known deficiencies" (e.g. that copy-relocation of a
protected symbol does not work correctly on older versions of glibc, that
DF_1_INTERPOSE
is unimplemented by glibc and musl, and other examples that can
be found in bug trackers).
Extension aspects could include some of the following.
Since the specification would be written in Lem, a purely functional ML-family specification language, functional programming skills are required. Familiarity with systems programming and outline familiarity with linking are also important, although it is not important to understand linking deeply at the start of the project.