Department of Computer Science and Technology

CTSRD

Part II, Part III, and ACS student project ideas


This page documents dissertation and research project ideas based on CTSRD-related research for University of Cambridge Computer Laboratory students. The CTSRD project spans hardware, software, and theory, exploring the interactions between compatibility, performance, and security as we transform the way programs are represented in terms of instruction-set architecture, OS platform, and compilation. We have contributions in the following areas that might form the basis for a student development or research project (full project details can be found later on this page):

  • A novel security-centred capability-based CPU architecture, CHERI, that includes a hardware implementation (on FPGA), fast instruction-level emulator (based on QEMU), mature compiler suite (using Clang/LLVM), operating system (based on FreeBSD), and open-source application stack. The platform implements a new pointer type in hardware (a capability) that can provide the basis for pointer protection, fine-grained memory protection, and also fine-grained software compartmentalization. We also employ formal modeling to model and verify properties of the architecture.

    Hardware projects in this area might take our current CHERI as a starting point, exploring further hardware changes, compiler features, OS features, etc. For example, exploring how CHERI composes with DMA engines, 32-bit microcontrollers, and vector units would make for exciting hardware-software projects.

    On the purely software side, evaluating new software techniques to use CHERI, such as deploying capabilities ubiquitously in larger software packages, exploring the implications for software debugging and tracing, or developing new software compartmentalization models, would all be exciting projects.

    And, of course, projects are welcome to explore combinations of changes to hardware and software; for example, ISA extensions with corresponding compiler or OS changes, which might be explored using QEMU-CHERI and/or CHERI on FPGA as the project proceeds.

    See below for more ideas.

  • Our hardware prototype of CHERI is based on our open-source BERI 64-bit MIPS processor core, which could form the basis of other development and research projects such as exploring extensions to floating-point and vector processing, cryptographic coprocessors, debugging, and other forms of security extensions such as information-flow tracking. Projects in this area would likely be primarily hardware-facing, taking our current BERI (and perhaps CHERI) as a starting point, adding and evaluating new architectural features.

    See below for more ideas.

  • We have recently been exploring issues in I/O security from both hardware and software perspectives. When I/O peripherals are attached to buses such as USB and Thunderbolt, what attack surfaces are exposed: malicious access to other devices on the same bus, to microcontrollers and other hardware elements on the path to the general-purpose system, and to the general-purpose computing environment itself, including the host operating system, IOMMUs, and other system resources?

    Adversarial and defensive projects in both hardware and software would be appropriate, identifying new classes of vulnerabilities, exploring novel exploit techniques, and identifying, prototyping, and evaluating potential solutions. These projects might be done only on current hardware (e.g., Intel-based Macs, Windows PCs, and UNIX systems), or could be combined with our work on novel security architecture (e.g., using CHERI to defend against attacks arriving via DMA).

    Contact us (Robert Watson) for more specific ideas in this area.

  • We have been pursuing a number of ideas in software static and dynamic analysis and transformation to improve software security, primarily supported by new Clang and LLVM passes that rely on annotations to guide analysis and instrumentation.

    Our SOAAP project has used annotations to assist programmers in reasoning about software sandboxing for vulnerability mitigation, with recent extensions on not just analysing but also transforming software to automatically sandbox elements of larger applications (e.g., often vulnerable image processing). Potential new work might extend automatic sandboxing, or teach the SOAAP framework about other forms of sandboxing such as CHERI or Intel SGX compartmentalization.

    Our TESLA project has extended the C programming language to include annotations for temporal automata, allowing dynamic instrumentation to check temporal safety properties such as temporal memory safety for heap allocations, lock protocols, access-control checks, and protocol state machines. We have most recently been exploring using static model checking to eliminate the need for dynamic instrumentation, improving performance. Potential new work might use static checking to optimize instrumentation even where a property is too complex to fully check at compile time, or extend the set of properties being validated.

    See below for more ideas: SOAAP, TESLA.

  • We have previously worked extensively in the area of software compartmentalization based on current MMU-based processes, including our Capsicum OS capability framework, which extends conventional UNIX OS designs to be capability systems. Now widely deployed, the Capsicum model provides useful low-level primitives for software compartmentalization, but would benefit from a larger software stack providing higher-level abstraction facilitating easier deployment. Potential projects in this area might push Capsicum support into larger applications as a motivation for providing a more powerful supporting framework, as well as explore new ways to improve security, compatibility, and performance.

    See below for more ideas.

There are, of course, many other interesting projects that might be done in related areas, and you should feel free to talk to us about these as well. Please contact the listed potential supervisor, Dr Robert Watson, or Professor Simon Moore for further information.

BERI: Bluespec Extensible RISC Implementation

BERI is the Bluespec Extensible RISC Implementation, a prototype hardware-software research platform consisting of a RISC processor and peripherals synthesisable to FPGA, the FreeBSD operating system, Clang/LLVM compiler suite, and a large number of open-source applications. We have not drawn out specific project ideas for BERI, but there are many interesting things to do here, including support for additional devices, characterisation of the realism of the FPGA-based implementation as compared to fabricated processor designs, etc. Please contact Robert Watson or Jon Woodruff if you are interested in a potential project in this area.

CHERI: Capability Hardware Enhanced RISC Instructions

CHERI (Capability Hardware Enhanced RISC Instructions) is a novel CPU architecture implementing a hybrid capability model: in-address-space memory protection and fine-grained compartmentalization. The CTSRD project has prototyped the CPU on FPGA, and adapted the FreeBSD operating system and Clang/LLVM to run on the platform, as well as a broad array of open-source applications. The platform is now suitable for a variety of Part-II, Part-III, and ACS operating-system, compiler, and software research and development projects investigating software security, language security, program analysis and transformation, processor security, and the hardware-software interface. Potential supervisors exist in all of these areas; please contact us to be pointed in the direction of a suitable supervisor for your interests.

Read these three papers on the CHERI architecture, CHERI and the C programming language, and CHERI-based software compartmentalisation to learn more.

  • Debugger support for CHERI

    The CTSRD project has extended many parts of the hardware-software stack to support capabilities, including the ISA, processor, compiler, and operating system. However, we have only made very preliminary efforts to teach software debuggers (such as GDB or LLDB) to interpret capabilities, and the OS to provide access to capability-related features via debugging interfaces such as the ptrace() system call or core-dump mechanism. This project would seek to fill that gap by extending an existing debugger, such as the LLVM debugger (LLDB), to support capabilities, providing a mature debugging interface for capability-based applications. Research aspects include understanding the security implications for both debugger and debugged applications, exploring the semantic implications of capabilities for OS debugging APIs and mechanisms, and assessing the performance and complexity impacts of the changes.

    Prerequisites: Strong experience programming in C, the UNIX environment; Part-II security or Part-III/ACS R209 recommended.

    Potential supervisors: Robert Watson, David Chisnall

  • Operating-System Support for Tagged Architectures

    CHERI is a "tagged architecture", meaning that each word or line of memory has additional, unaddressable metadata associated with it. The CHERI model relies on a 1-bit tag indicating whether a 32-byte line is a capability or ordinary data. Other tagged models associate information-flow labels, pointer metadata, taint tracking, or other properties with words or lines. These novel techniques are being written about heavily in the hardware-software research community, but their implications for operating-system design are as yet unclear: contemporary OSes depend on simpler models of memory, in which all words are addressable and interchangeable (e.g., allowing virtual memory to be paged to a swap partition or file in the filesystem, which may not have obvious places to store tags). This project will explore how contemporary OS designs respond to the introduction of tags. Development tasks might include extending the process model, debugging interfaces, virtual-memory subsystem, filesystems, etc., to better support tags in a generalisable way, although with a focus on CHERI as a useful case study. Evaluation criteria might include the functional implications, complexity of changes, and performance impact.

    Prerequisites: Strong experience programming in C, the UNIX environment; Part-II security or Part-III/ACS R209 recommended.

    Potential supervisor: Robert Watson
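
The tag-loss problem that the tagged-architecture project describes for paging can be illustrated with a toy model. The sketch below is not CHERI's implementation: `TaggedMemory`, the swap helpers, and the rule that an ordinary data store clears the tag of every line it touches are simplifying assumptions made for illustration. Only the 1-bit tag per 32-byte line is taken from the text.

```python
LINE_BYTES = 32  # tag granularity from the text: one tag bit per 32-byte line


class TaggedMemory:
    """Toy model: addressable bytes plus one unaddressable tag bit per line."""

    def __init__(self, size):
        assert size % LINE_BYTES == 0
        self.data = bytearray(size)
        self.tags = [False] * (size // LINE_BYTES)

    def store_capability(self, addr, payload):
        # Assumed rule: capability stores are line-aligned and set the tag.
        assert addr % LINE_BYTES == 0 and len(payload) == LINE_BYTES
        self.data[addr:addr + LINE_BYTES] = payload
        self.tags[addr // LINE_BYTES] = True

    def store_data(self, addr, payload):
        # Assumed rule: ordinary stores clear the tag of each line they touch.
        if not payload:
            return
        self.data[addr:addr + len(payload)] = payload
        first = addr // LINE_BYTES
        last = (addr + len(payload) - 1) // LINE_BYTES
        for line in range(first, last + 1):
            self.tags[line] = False


def naive_swap_out(mem):
    # A conventional pager sees only the addressable bytes.
    return bytes(mem.data)


def naive_swap_in(image):
    mem = TaggedMemory(len(image))
    mem.data[:] = image
    return mem  # every tag is False: capabilities have become plain data


def tag_aware_swap_out(mem):
    # A tag-aware pager must find somewhere to keep the tags as well.
    return bytes(mem.data), list(mem.tags)


def tag_aware_swap_in(image, tags):
    mem = TaggedMemory(len(image))
    mem.data[:] = image
    mem.tags = list(tags)
    return mem
```

In this model the naive round trip preserves every byte yet silently invalidates every capability, which is the kind of functional implication the project would need to evaluate in a real virtual-memory subsystem.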

SOAAP: Security-Oriented Analysis of Application Programs

SOAAP is the Security-Oriented Analysis of Application Programs. Sandboxing technologies such as Capsicum and CHERI support the fine-grained compartmentalisation of large-scale applications such as web browsers and office suites, as well as multiple-component software such as the UNIX userspace. When deployed correctly, application compartmentalisation offers significant benefits by allowing policies to be imposed within applications, and in mitigating exploited vulnerabilities. However, application compartmentalisation remains an art rather than a science: identifying, implementing, and debugging partitioning strategies requires detailed expertise in both the application and security. SOAAP is exploring semi-automated techniques, grounded in static analysis, dynamic analysis, and automated program transformation, to improve the developer experience. We have an LLVM-based prototype that models Capsicum security, now being applied to applications such as the Chromium web browser.

Read this paper on SOAAP to learn more.

We have not drawn out specific project ideas for SOAAP, but there are many interesting things to do here, including developing new analysis techniques, developing automated program transformation techniques, evaluating effectiveness, etc. Please contact Robert Watson or Khilan Gudka if you are interested in a potential project in this area.

Capsicum: practical capabilities for UNIX

Capsicum is a hybrid capability-system model that merges historic ideas about capability-based security with contemporary UNIX operating systems. The Capsicum API is designed to support application compartmentalization, allowing user processes to execute in capability mode, unable to access global namespaces, with added APIs for capability-like features such as capabilities based on file descriptors, process descriptors, and a service model based on the Casper daemon. Developed in collaboration between the University of Cambridge and Google, ongoing Capsicum work is being funded by Google, DARPA, and the FreeBSD Foundation; Capsicum is available out-of-the-box in FreeBSD, and Google has developed patches for the Linux kernel.

Extending Capsicum and exploring how applications can use its features remains an active area of work, with many potential Part-II development or Part-III/ACS research projects.

Read this paper on Capsicum to learn more.

  • Capsicum Software TPM and Crypto Service Module

    Compartmentalized application components frequently require access to cryptographic services such as key management and TLS. However, these components often should not be granted direct network access, or even use of keying material to build TLS connections. Simultaneously, cryptographic code itself proves highly vulnerable, and should be sandboxed away from other sensitive application components. This project would develop a "software TPM"-like cryptographic module for Capsicum, intended to offer cryptographic services to sandboxed components while itself executing in one (or perhaps many) sandboxes. Services would include key storage, indirect use of keys (e.g., the ability to encrypt with a key while not having access to the key), and higher-level protocols such as TLS (e.g., the ability to communicate over TLS without having either socket access, or access to keying material used to authenticate the connection). Potential evaluation criteria would include security improvements and performance impact, with key questions including how to maximise security improvement while minimising overhead and code change.
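
    The "indirect use of keys" idea above can be sketched with two processes joined by a pipe: the key lives only in the service process, and the client can ask for operations with it but never read it. This is an illustrative sketch, not a design for the project. The function names are hypothetical, HMAC-SHA256 stands in for encryption because it is available in the Python standard library, and no Capsicum sandboxing is applied (on FreeBSD each side would additionally enter capability mode).

```python
import hashlib
import hmac
import os
from multiprocessing import Pipe


def serve_mac_requests(conn):
    """Service side: answer MAC requests until the client closes the pipe."""
    key = os.urandom(32)  # created inside the service; never sent over conn
    while True:
        try:
            message = conn.recv_bytes()
        except EOFError:
            break
        conn.send_bytes(hmac.new(key, message, hashlib.sha256).digest())


def start_key_service():
    """Fork the key-holding process; return the client's connection and its pid."""
    client_conn, service_conn = Pipe()
    pid = os.fork()
    if pid == 0:
        client_conn.close()
        try:
            serve_mac_requests(service_conn)
        finally:
            os._exit(0)  # the child must never return into the caller's code
    service_conn.close()
    return client_conn, pid


def mac_via_service(conn, message):
    """Client side: use the key indirectly by asking the service to apply it."""
    conn.send_bytes(message)
    return conn.recv_bytes()
```

    A client would call `start_key_service()` once, then `mac_via_service(conn, b"...")` for each message, and close the connection to shut the service down.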

    Prerequisites: Strong C programming experience, UNIX programming; Part-II security or Part-III/ACS R209 recommended.

    Potential supervisors: Robert Watson, Khilan Gudka

    This project does not require use of custom hardware.

  • Capsicum-Aware Shell

    The classic UNIX pipe programming model calls for self-contained programs with specific functions to be chained together in processing pipelines that stream data from one to another. The shell is responsible for constructing these pipelines, linking the output of one process with the input of another using UNIX pipes. Frequently, the first program accepts as its input redirection from a file, although it might also take a series of arguments passed as a wildcard. Likewise, output is frequently to one or more target files at the end of the pipeline. This structure has a strong potential alignment with Capsicum, in that access to files and data is set up by the shell rather than the individual components, which frequently require relatively little direct access to the filesystem, network, etc.

    This project is to create a Capsicum-aware shell, in which the shell itself runs with full ambient authority, but individual stages in the pipeline will run in capability mode, limiting their authority. The shell must set up any rights required by those stages and pass them via capabilities. Ideally the shell would support both historic UNIX pipeline stages (e.g., unmodified grep) as well as least-privilege-aware stages (e.g., a version of sed intended to run without ambient authority), allowing gradual adoption and a focus on applications most suited for this mode of operation.

    There are a large number of open design choices to explore in order to understand performance, security, usability, and programmability tradeoffs. For example: How might the shell determine whether a particular program is able to understand a capability-based execution environment? Will all programs be able to start up with only explicitly delegated privileges, or might some need to enter capability mode later after acquiring additional rights not easily identified by the shell? Will it be useful to add new shell syntax to allow network sockets to be set up, or to interpret not just wildcards but the need to delegate multiple files via a new ABI? To what extent can any user-visible change be avoided if starting with an existing shell such as sh, bash, or tcsh?
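
    The delegation pattern described for the shell, in which the shell opens files and pipes and each stage receives only descriptors, can be sketched as follows. This is a minimal illustration under stated assumptions: `run_pipeline` and the two one-line stage programs are hypothetical stand-ins, and no capability mode is entered, so the stages are merely written to need nothing beyond stdin and stdout rather than being prevented from using anything else.

```python
import os
import subprocess
import sys

# Hypothetical stand-in stages: each one touches only stdin and stdout.
UPPERCASE = "import sys; sys.stdout.write(sys.stdin.read().upper())"
KEEP_ERRORS = "import sys; sys.stdout.writelines(l for l in sys.stdin if 'ERROR' in l)"


def run_pipeline(input_path, output_path, stage_sources):
    """Shell side: resolve paths with ambient authority, then hand each
    stage only the descriptors it needs, as its stdin and stdout."""
    next_stdin = os.open(input_path, os.O_RDONLY)
    procs = []
    for index, source in enumerate(stage_sources):
        if index == len(stage_sources) - 1:
            # Last stage writes to the target file opened by the shell.
            read_end = None
            stage_stdout = os.open(
                output_path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
        else:
            read_end, stage_stdout = os.pipe()
        # close_fds defaults to True, so the child inherits nothing beyond
        # stdin, stdout, and stderr. A Capsicum-aware shell would also need
        # to restrict rights on these descriptors and arrange for the stage
        # to run in capability mode.
        procs.append(subprocess.Popen([sys.executable, "-c", source],
                                      stdin=next_stdin, stdout=stage_stdout))
        os.close(next_stdin)
        os.close(stage_stdout)
        next_stdin = read_end
    return [proc.wait() for proc in procs]
```

    For example, `run_pipeline("in.txt", "out.txt", [UPPERCASE, KEEP_ERRORS])` behaves like a two-stage shell pipeline with input and output redirection, with neither stage ever opening a path itself.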

    Prerequisites: Strong C programming experience, UNIX programming; Part-II security or Part-III/ACS R209 recommended.

    Potential supervisors: Robert Watson, Khilan Gudka

    This project does not require use of custom hardware.

  • Capsicum Support in GNUstep

    GNUstep is an implementation of the APIs from the OpenStep specification and later from Apple's Cocoa. This project will involve modifying the framework to make use of Capsicum, a simple set of capability APIs for FreeBSD. There are several steps:

    • Modify NSWorkspace to read a plist describing required services from an application or tool's property list and ensure that they have access to them, then call cap_enter before spawning. This part requires rtld to support sandboxed-on-launch code. If this support is not finished, then a first approximation can be achieved by calling cap_enter() from NSApplicationMain().
    • Implement a file chooser service that will pass file and directory descriptors to applications that try to use the standard open / save dialogs.
    • Ensure that the distributed notification centre and pasteboard server connections are open before entering the sandbox.

    At this stage, applications should be usable with Capsicum with some small modifications. There are several additional small changes that could make the porting easier:

    • Modify NSFileWrapper to use openat() and friends.
    • Create an NSString subclass that carries a file descriptor for a base path as an instance variable and allows all of the standard path modification operations to work relative to this.
    • Maintain a dictionary mapping paths to directory descriptors, so that attempts to open a file with NSFileHandle or NSFileWrapper will use openat() via this mechanism.
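
    The path-to-directory-descriptor dictionary in the last item can be sketched as below. This is an illustration only: the real work would be in Objective-C inside GNUstep, `DirectoryDescriptorMap` and its method names are hypothetical, and Python's `os.open(..., dir_fd=...)` is used here as the equivalent of calling openat().

```python
import os


class DirectoryDescriptorMap:
    """Maps directory paths to pre-opened descriptors, so later opens are
    performed relative to a descriptor instead of resolving a full path."""

    def __init__(self):
        self._dirs = {}

    def grant(self, directory):
        # Must happen before entering the sandbox, while paths still resolve.
        base = os.path.abspath(directory)
        self._dirs[base] = os.open(base, os.O_RDONLY | os.O_DIRECTORY)

    def open_path(self, path, flags=os.O_RDONLY, mode=0o600):
        path = os.path.abspath(path)
        # Prefer the longest granted directory that contains the path.
        for base in sorted(self._dirs, key=len, reverse=True):
            if path == base or path.startswith(base + os.sep):
                relative = os.path.relpath(path, base)
                return os.open(relative, flags, mode, dir_fd=self._dirs[base])
        raise PermissionError(f"no directory descriptor covers {path}")
```

    Code that previously opened a path directly would call `open_path()` instead, which is roughly the role the modified NSFileHandle and NSFileWrapper would play.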

    Evaluation should include some example applications running in sandboxes, a comparison with the MAC-based sandboxing on OS X, and ideally an example of using Distributed Objects to implement privilege separation in an existing application.

    Prerequisites: Knowledge of Objective-C, UNIX programming.

    Potential supervisor: David Chisnall

    This project does not require use of custom hardware.

TESLA: Temporally Enhanced System Logic Assertions

  • TESLA realtime and probability distributions

    Temporally Enhanced System Logic Assertions (TESLA) introduce C-language extensions for inline temporal assertions and automata validation, which in turn drive compiler-generated runtime instrumentation and validation of temporal effects. However, strict sequence-based properties are not the only type of temporal properties that may be of interest: we are also interested in realtime effects, such as validating assertions of realtime behaviour in protocols, and distribution effects over time, in which sampled values take on desired properties (e.g., randomness and monotonicity). This project would extend the TESLA framework to allow these properties to be declared and continuously validated during program execution.
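
    To make the idea of a sequence-based temporal property concrete, the sketch below checks one such property over a stream of events: within a bounded scope, an object may be used only if it was previously checked. It does not use TESLA's syntax or implementation. The `CheckBeforeUse` class and the event names are hypothetical, and in TESLA the events would come from compiler-generated instrumentation rather than explicit calls.

```python
class CheckBeforeUse:
    """Tiny automaton: inside one scope (between 'enter' and 'exit'),
    a 'use' of an object is valid only after a 'check' of that object."""

    def __init__(self):
        self.checked = set()
        self.violations = []

    def event(self, name, obj=None):
        if name in ("enter", "exit"):
            # Scope boundary: earlier checks no longer count.
            self.checked.clear()
        elif name == "check":
            self.checked.add(obj)
        elif name == "use":
            if obj not in self.checked:
                self.violations.append(obj)
        else:
            raise ValueError(f"unknown event {name!r}")
```

    Feeding it `enter, check(a), use(a), use(b), exit` records a single violation, for `b`.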

    This project will investigate possible extensions to TESLA improving its performance and functionality:

    • Adopting new static-analysis techniques that allow: (a) optimising out local checks where it can be determined that they would always pass; (b) improving run-time instrumentation performance by lifting checks out of loops (etc); and (c) providing compile-time failures for locally validatable properties, rather than requiring runtime testing.
    • Investigating 'distribution assertions' in which tracked data-structure fields are checked against temporal properties other than explicit automata -- e.g., probability distributions, mean/median/stddev, monotonic increase/decrease, likely/unlikely, etc.
    • More complex instrumented invariant structures, such as providing side-effect-free (functional?) code snippets that, at various events, validate those properties without requiring manual instrumentation; for example, the correctness of cached values that would otherwise require walking large data structures such as trees or linked lists.
    • Other interesting applications of TESLA's instrumentation and analysis approaches.
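
    As a rough illustration of the 'distribution assertions' item, the sketch below monitors one tracked field and checks monotonic increase and a bound on the running mean as values arrive. It is an assumption-laden sketch rather than a proposed design: `FieldMonitor` and its parameters are hypothetical, and the running statistics use Welford's online algorithm so that no history needs to be stored.

```python
import math


class FieldMonitor:
    """Tracks one instrumented field and records violated assertions."""

    def __init__(self, monotonic_increasing=False, mean_bounds=None):
        self.monotonic_increasing = monotonic_increasing
        self.mean_bounds = mean_bounds  # (low, high) or None
        self.count = 0
        self.mean = 0.0
        self._m2 = 0.0    # running sum of squared deviations (Welford)
        self._last = None
        self.violations = []

    def observe(self, value):
        # Monotonicity: each sample must be at least the previous one.
        if (self.monotonic_increasing and self._last is not None
                and value < self._last):
            self.violations.append(("monotonic", self._last, value))
        self._last = value
        # Welford's update of the running mean and squared deviations.
        self.count += 1
        delta = value - self.mean
        self.mean += delta / self.count
        self._m2 += delta * (value - self.mean)
        if self.mean_bounds is not None:
            low, high = self.mean_bounds
            if not low <= self.mean <= high:
                self.violations.append(("mean", self.mean))

    def stddev(self):
        # Sample standard deviation; defined as 0.0 for fewer than two samples.
        return math.sqrt(self._m2 / (self.count - 1)) if self.count > 1 else 0.0
```

    In a TESLA-style system the calls to `observe()` would be inserted by instrumentation at each write to the tracked field, which is where the compile-time and run-time overheads mentioned in the evaluation criteria would arise.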

    Potential evaluation criteria might involve usefulness (by qualitative metrics), false positives/negatives, and compile-time and run-time performance impact.

    Read this paper on TESLA to learn more.

    Prerequisites: strong working knowledge of the C programming language, and an interest in both compilers and low-level system software, such as operating system kernels and network daemons.

    Potential supervisor: Robert Watson

    This project does not require use of custom hardware.