Computer Laboratory

CHERI Frequently Asked Questions (FAQ)

Here are answers to some of the common questions we've received (or in some cases, anticipate) about CHERI. See the BERI FAQ for questions about the BERI platform.

CHERI: Capability Hardware Enhanced RISC Instructions

What is CHERI?

CHERI refers to Capability Hardware Enhanced RISC Instructions, an Instruction-Set Architecture (ISA) extension that implements a hybrid capability-system model providing fine-grained memory protection and scalable software compartmentalisation within processes. CHERI capabilities are a new hardware data type intended to support the robust and secure implementation of pointers. Capabilities hold a virtual address as well as metadata describing the memory resources referenced by the pointer (bounds, permissions, ...) and also a 1-bit tag that protects the pointer itself (integrity, valid provenance, ...). The architecture (and hence hardware) protects pointers in registers and memory by controlling their manipulation (e.g., enforcing monotonicity on bounds modifications), ensuring only authorized use (e.g., dereference within bounds), and preserving their in-memory integrity (e.g., detecting pointer corruption or injection). Policies are expressed by the software (OS, compiler, language runtime, and application) and enforced by the hardware.
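The properties listed above (bounds, permissions, a validity tag, monotonic derivation, and checked dereference) can be illustrated with a small software model. This is only a sketch for intuition: the class, method names, and permission bits are hypothetical, it does not reflect the actual CHERI capability encoding or instruction set, and real enforcement happens in hardware rather than in library code.

```python
from dataclasses import dataclass, replace

PERM_LOAD = 0x1   # hypothetical permission bits for this sketch
PERM_STORE = 0x2


class CapabilityFault(Exception):
    """Stands in for the exception real hardware would raise."""


@dataclass(frozen=True)
class Capability:
    base: int          # lowest address the capability authorizes
    length: int        # size of the authorized region in bytes
    perms: int         # bitmask of PERM_* values
    address: int       # current pointer value
    tag: bool = True   # validity bit; an untagged value cannot be dereferenced

    @property
    def top(self) -> int:
        return self.base + self.length

    def _require_tag(self) -> None:
        if not self.tag:
            raise CapabilityFault("tag is clear")

    def set_bounds(self, base: int, length: int) -> "Capability":
        """Derive a capability whose bounds lie within the current ones."""
        self._require_tag()
        if length < 0 or base < self.base or base + length > self.top:
            raise CapabilityFault("bounds can shrink but never grow")
        return replace(self, base=base, length=length, address=base)

    def and_perms(self, mask: int) -> "Capability":
        """Derive a capability holding a subset of the current permissions."""
        self._require_tag()
        return replace(self, perms=self.perms & mask)

    def add(self, delta: int) -> "Capability":
        """Pointer arithmetic: moves the address, leaves the bounds alone."""
        return replace(self, address=self.address + delta)

    def check(self, size: int, perm: int) -> None:
        """The test applied on every dereference of `size` bytes."""
        self._require_tag()
        if not self.perms & perm:
            raise CapabilityFault("permission missing")
        if self.address < self.base or self.address + size > self.top:
            raise CapabilityFault("access outside bounds")


heap = Capability(base=0x1000, length=0x100,
                  perms=PERM_LOAD | PERM_STORE, address=0x1000)
buf = heap.set_bounds(0x1010, 16)   # narrowing is allowed
buf.add(8).check(8, PERM_LOAD)      # last 8 bytes: in bounds, passes silently
```

In this model, `buf.set_bounds(0x1000, 0x100)` raises `CapabilityFault` because it would widen the bounds, and `buf.add(9).check(8, PERM_LOAD)` raises because the access would run one byte past the end of the 16-byte region.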

CHERI capabilities are a low-level primitive usable for many purposes. We employ them in implementing pointer protection and fine-grained memory protection for the C and C++ programming languages, for safe inter-language interoperability, and also for scalable fine-grained software compartmentalization. CHERI is targeted by the compiler and used to represent programming-language level protection properties, in contrast to conventional memory management units (MMUs) that are used to construct page-based virtual memory by operating systems. In CHERI, the capability coprocessor and MMU live side by side, which is what makes the model hybrid: it provides strong protection guarantees while allowing significant compatibility with current software at both binary and source-code levels -- a technique inspired by our earlier work on Capsicum, a hybrid capability-system model for UNIX.

CHERI also refers to our prototype implementation of the ISA, embodied in a capability coprocessor in the BERI implementation. We have released the BERI source code, along with adaptations of the FreeBSD operating system (CheriBSD) and LLVM compiler suite (CHERI Clang/LLVM). We have also adapted tools such as the GDB debugger and LLVM linker (LLD) to support conventional software development. Please see our hardware downloads and software downloads pages for more information. Research into BERI and CHERI has primarily been supported by DARPA and Google, but we've also had PhD students and postdocs supported by ARM, HP, EPSRC, and other government and industrial sponsors -- we are very grateful to all of these sponsors, without whom this work could not have been done!

What is a hybrid capability system?

Capabilities are unforgeable tokens of authority that may be passed from subject to subject (delegated) granting rights to objects; typically, capabilities incorporate both a reference to an object and a mask of permissions reflecting possible operations or methods on the object. Conventional capability systems constrain executing code so that it can access objects only as permitted via capabilities; this limitation might be enforced by constraints imposed by an ISA, operating-system API, programming language, network protocol, or even by static or dynamic limits imposed on a program using code analysis or transformation. Microkernels (such as seL4) often implement capability systems as their fundamental security model, as the model provides a strong mechanism on which many different policies can be implemented.

A hybrid capability system is one in which more conventional systems designs (such as a UNIX kernel or RISC processor) are adapted to support a capability model such that some, but not all, code is limited by capability-system constraints, and a set of pragmatic tradeoffs is adopted to allow conventional system objects to be exposed via more capability-esque models. For example, Capsicum composes a capability-system model with the UNIX API, treating file descriptors as capabilities, and allowing selected processes to be marked as losing access to global system namespaces, in effect imposing a capability system. Hybrid capability systems offer improved adoptability by allowing components of existing applications to be selectively migrated to a least-privilege programming model, although at the cost of reduced robustness and security as compared to a pure capability system and application suite written entirely with those goals in mind.
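The Capsicum example can be sketched as a toy model: descriptors carry a rights mask, and a process that has entered capability mode can no longer look objects up by name but can still use the descriptors it already holds. Every name below is hypothetical and chosen for illustration only. The real Capsicum interface on FreeBSD is a set of system calls such as cap_enter(2) and cap_rights_limit(2), and it differs in detail (for instance, rights are limited in place on an existing descriptor rather than by creating a new object).

```python
RIGHT_READ = 0x1    # hypothetical rights bits for this sketch
RIGHT_WRITE = 0x2


class CapabilityModeError(Exception):
    """Raised when sandboxed code reaches for a global namespace."""


class Descriptor:
    """A reference to an object plus a mask of permitted operations."""

    def __init__(self, data: bytearray, rights: int):
        self._data = data
        self.rights = rights

    def limit(self, rights: int) -> "Descriptor":
        """Return a descriptor for the same object with a subset of rights."""
        return Descriptor(self._data, self.rights & rights)

    def read(self) -> bytes:
        if not self.rights & RIGHT_READ:
            raise PermissionError("read not permitted")
        return bytes(self._data)

    def write(self, payload: bytes) -> None:
        if not self.rights & RIGHT_WRITE:
            raise PermissionError("write not permitted")
        self._data.extend(payload)


class Process:
    def __init__(self, filesystem: dict):
        self._fs = filesystem              # the "global namespace": path -> bytearray
        self.in_capability_mode = False

    def open(self, path: str, rights: int) -> Descriptor:
        """Name-based lookup: only available outside capability mode."""
        if self.in_capability_mode:
            raise CapabilityModeError("global namespace is unavailable")
        return Descriptor(self._fs[path], rights)

    def enter_capability_mode(self) -> None:
        self.in_capability_mode = True     # one-way: there is no method to leave


proc = Process({"/etc/motd": bytearray(b"hello")})
fd = proc.open("/etc/motd", RIGHT_READ | RIGHT_WRITE)
read_only = fd.limit(RIGHT_READ)           # delegate less authority than we hold
proc.enter_capability_mode()
assert read_only.read() == b"hello"        # held descriptors keep working
```

The hybrid aspect is that code running before `enter_capability_mode()` behaves like an ordinary UNIX program, while code running afterwards is confined to the authority expressed by the descriptors it was given.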

CHERI is a hybrid capability system in several senses:

  • CHERI's capability system is blended with a conventional RISC usermode architecture without disrupting the majority of key design decisions.
  • CHERI's capability system is cleanly and usefully composed with conventional ring-based privilege and virtual memory based on Memory Management Units (MMUs).
  • CHERI can be targeted by a C/C++-language compiler with strong compatibility, performance, and protection properties.
  • CHERI supports a range of OS models including conventional MMU-based virtual-memory designs, hybridized designs that host capability-based software within multiple virtual address spaces, and pure single-address-space capability systems.
  • CHERI is incrementally adoptable: Within pieces of software, capability-based design can be disregarded, partially adopted, or fully adopted with useful and predictable semantics.

What is the difference between BERI and CHERI?

BERI is the Bluespec Extensible RISC Implementation, a hardware description of a 64-bit pipelined RISC processor, as well as debugging tools and C-language simulated buses and devices. CHERI is a set of ISA and implementation extensions providing fine-grained memory protection and support for scalable software compartmentalisation, developed as part of the CTSRD Project, a joint project between SRI International and the University of Cambridge Computer Laboratory. The BERI implementation includes optionally compiled support for CHERI, enabled via the CP2 flag at compile-time. CHERI occupies the coprocessor-2 instruction encoding space, and must be explicitly enabled. You may find Capability Hardware Enhanced RISC Instructions: CHERI Instruction-Set Architecture (Version 6) (UCAM-TR-907) and our various research papers useful reading if this is of interest.

Why 32 capability registers?

As our starting point was the 64-bit MIPS ISA, we made a number of design choices to maximise congruence in the CHERI ISA, including selecting 32 capability registers to correspond to the 32 general-purpose registers in the MIPS ISA. This is an arbitrary choice, and one we may revisit due to its size: we believe that a 16-entry capability register file would likely be adequate.

Why a separate capability register file?

Our decision to use a separate register file for capabilities is also one that could be revisited: We modeled CHERI as a CP2 extension to MIPS, but there is no reason to think that we could not extend a general-purpose 64-bit register file to hold 128-bit capabilities (plus tags) in the same style that 32-bit register files are extended to 64 bits in many 64-bit architectures. This might offer reduced microarchitectural overhead by avoiding additional control logic for a second register file. Our CHERI RISC-V architectural sketch (in Capability Hardware Enhanced RISC Instructions: CHERI Instruction-Set Architecture (Version 6)) utilizes this approach.

Could you do it with fewer than 256 bits?

Over CHERI ISA versions 4 to 6, we explored and developed a 128-bit compressed capability format employing fat-pointer compression techniques. This approach exploits redundancy between the two 64-bit virtual addresses representing bounds and the 64-bit pointer itself. The CHERI-128 approach retains strong C-language compatibility (e.g., out-of-bounds pointers) and retains our required security properties (e.g., monotonicity), while also achieving good microarchitectural performance (i.e., avoiding multi-cycle delays for key operations). 128-bit capabilities substantially reduce the data-cache overhead of CHERI for pointer-intensive workloads. Support for 128-bit capabilities can be found in recent versions of our CHERI FPGA prototype and also QEMU-CHERI.
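One consequence of squeezing bounds into fewer bits can be sketched as follows. The model assumes, purely for illustration, that a region's size is stored as a short mantissa scaled by a power-of-two exponent, so that small regions are described exactly while large ones must have their bounds rounded outward to a coarser granule. The function name and the 8-bit mantissa width are assumptions made here; this is not the actual CHERI-128 encoding, which is specified in the ISA technical report.

```python
def representable_bounds(base: int, length: int, mantissa_bits: int = 8):
    """Round [base, base + length) outward until it fits the toy format.

    Returns (rounded_base, rounded_top, exponent), where both rounded
    bounds are multiples of 2**exponent and the region's size, counted
    in units of 2**exponent, fits in `mantissa_bits` bits.
    """
    if base < 0 or length < 0:
        raise ValueError("base and length must be non-negative")
    top = base + length
    exponent = 0
    while True:
        granule = 1 << exponent
        rounded_base = (base >> exponent) << exponent                 # round down
        rounded_top = ((top + granule - 1) >> exponent) << exponent   # round up
        if (rounded_top - rounded_base) >> exponent < (1 << mantissa_bits):
            return rounded_base, rounded_top, exponent
        exponent += 1


# A 64-byte object is described exactly.
assert representable_bounds(0x1000, 64) == (0x1000, 0x1040, 0)

# A 1000-byte object at an odd address needs 4-byte granules in this model,
# so its bounds grow slightly on both sides.
assert representable_bounds(0x1003, 1000) == (4096, 5100, 2)
```

In a scheme of this kind, a memory allocator would typically align and pad large allocations so that the rounded bounds never overlap a neighbouring object.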

Why tagged memory?

Many memory-based attacks on contemporary hardware-software designs rely on corrupting pointers or lengths. Tags provide strong pointer-integrity guarantees that are difficult to implement efficiently without hardware support. Tags add one bit of memory for every 128 or 256 bits of data, with a <1% memory overhead; they are maintained with cache lines and so obey normal cache-coherency rules. In our CHERI prototype, we partition physical memory, setting aside a portion to hold tags, rather than requiring a change to memory interfaces. Currently, that partition is hard-coded, but it would ideally be managed by the firmware or software supervisor.
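The overhead figure and the integrity property can both be illustrated with a toy model of tagged memory. The sketch assumes one tag bit per aligned 128-bit granule and assumes that any ordinary data store clears the tag of each granule it touches, so that a capability overwritten as plain bytes can no longer be used as a pointer. The class and method names are hypothetical, and the model says nothing about how the prototype lays out its tag partition in physical memory.

```python
CAP_BYTES = 16  # one tag per aligned 128-bit granule (assumption for this sketch)


class TaggedMemory:
    def __init__(self, size: int):
        if size % CAP_BYTES:
            raise ValueError("size must be a multiple of the granule size")
        self.data = bytearray(size)
        self.tags = [False] * (size // CAP_BYTES)

    def _check_range(self, addr: int, length: int) -> None:
        if addr < 0 or addr + length > len(self.data):
            raise IndexError("access outside memory")

    def store_capability(self, addr: int, raw: bytes, tag: bool) -> None:
        """Capability store: writes a whole granule and its tag together."""
        if addr % CAP_BYTES or len(raw) != CAP_BYTES:
            raise ValueError("capability stores must be aligned and granule-sized")
        self._check_range(addr, CAP_BYTES)
        self.data[addr:addr + CAP_BYTES] = raw
        self.tags[addr // CAP_BYTES] = tag

    def load_capability(self, addr: int):
        """Capability load: returns the raw bits and whether they are still valid."""
        if addr % CAP_BYTES:
            raise ValueError("capability loads must be aligned")
        self._check_range(addr, CAP_BYTES)
        return bytes(self.data[addr:addr + CAP_BYTES]), self.tags[addr // CAP_BYTES]

    def store_bytes(self, addr: int, raw: bytes) -> None:
        """Ordinary data store: clears the tag of every granule it touches."""
        if not raw:
            return
        self._check_range(addr, len(raw))
        self.data[addr:addr + len(raw)] = raw
        first = addr // CAP_BYTES
        last = (addr + len(raw) - 1) // CAP_BYTES
        for granule in range(first, last + 1):
            self.tags[granule] = False


def tag_overhead(capability_bits: int) -> float:
    """Fraction of extra memory needed for one tag bit per capability."""
    return 1 / capability_bits


mem = TaggedMemory(64)
mem.store_capability(16, bytes(range(16)), tag=True)
mem.store_bytes(20, b"\xff")              # corrupt one byte of the stored pointer
raw, tag = mem.load_capability(16)
assert tag is False                       # the corruption is detectable

assert tag_overhead(128) == 0.0078125     # about 0.78%
assert tag_overhead(256) == 0.00390625    # about 0.39%
```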

How specific is CHERI to the MIPS ISA?

In short, not very: we used the 64-bit MIPS ISA as a starting point as we required large address spaces and access to a conventional software stack, but CHERI is at heart a RISC rather than MIPS-ISA approach. CHERI is "localised" to MIPS in the sense that it occupies a MIPS-ISA coprocessor encoding, and we adopt a number of design conventions congruent to the MIPS ISA to ease compiler support, but it is easy to imagine applying these ideas to other 64-bit RISC ISAs such as ARMv8 and RISC-V. The current CHERI ISA (version 6) provides a detailed elaboration of CHERI within the 64-bit MIPS ISA, and two architectural sketches (64-bit RISC-V and x86_64) suggesting how the ideas might apply in other architectures. For details, see the Capability Hardware Enhanced RISC Instructions: CHERI Instruction-Set Architecture (Version 6) technical report.

How does CHERI compare with other memory-protection schemes?

Our 2014 ISCA paper includes a detailed comparison of the protection semantics and performance of CHERI as compared to other schemes, including software bounds checking, Intel MPX, HardBound, Mondriaan, and M-Machine. Each selects a different point in a larger tradeoff space. Key design choices that have motivated CHERI include a focus on providing strong protection for C-language pointers, hybridization with MMU-based virtualization, avoidance of hardware lookup tables and associative structures in the microarchitecture, and strong support for existing software stacks. More recent papers at venues such as IEEE SSP and ASPLOS have elaborated on these overheads as we have pursued utilizing capabilities in more and more aspects of software design.