Department of Computer Science and Technology

Technical reports

High-performance memory safety: optimizing the CHERI capability machine

Alexandre J. P. Joannou

May 2019, 132 pages

This technical report is based on a dissertation submitted September 2017 by the author for the degree of Doctor of Philosophy to the University of Cambridge, Peterhouse College.

DOI: 10.48456/tr-936

Abstract

This work presents optimizations for modern capability machines and specifically for the CHERI architecture, a 64-bit MIPS instruction set extension for security, supporting fine-grained memory protection through hardware enforced capabilities.

The original CHERI model uses 256-bit capabilities to carry information required for various checks helping to enforce memory safety, leading to increased memory bandwidth requirements and cache pressure when using CHERI capabilities in place of conventional 64-bit pointers. In order to mitigate this cost, I present two new 128-bit CHERI capability formats, using different compression techniques, while preserving C-language compatibility lacking in previous pointer compression schemes. I explore the trade-offs introduced by these new formats over the 256-bit format. I produce an implementation in the L3 ISA modelling language, collaborate on the hardware implementation, and provide an evaluation of the mechanism.

Another cost related to CHERI capabilities is the memory traffic increase due to capability-validity tags: to provide unforgeable capabilities, CHERI uses a tagged memory that preserves validity tags for every 256-bit memory word in a shadow space inaccessible to software. The CHERI hardware implementation of this shadow space uses a capability-validity-tag table in memory and caches it at the end of the cache hierarchy. To efficiently implement such a shadow space and improve on CHERI’s current approach, I use sparse data structures in a hierarchical tag-cache that filters unnecessary memory accesses. I present an in-depth study of this technique through a Python implementation of the hierarchical tag-cache, and also provide a hardware implementation and evaluation. I find that validity-tag traffic is reduced for all applications and scales with tag use. For legacy applications that do not use tags, there is near zero overhead.

Removing these costs through the use of the proposed optimizations makes the CHERI architecture more affordable and appealing for industrial adoption.

Full text

PDF (3.4 MB)

BibTeX record

@TechReport{UCAM-CL-TR-936,
  author =	 {Joannou, Alexandre J. P.},
  title = 	 {{High-performance memory safety: optimizing the CHERI
         	   capability machine}},
  year = 	 2019,
  month = 	 may,
  url = 	 {https://www.cl.cam.ac.uk/techreports/UCAM-CL-TR-936.pdf},
  institution =  {University of Cambridge, Computer Laboratory},
  doi = 	 {10.48456/tr-936},
  number = 	 {UCAM-CL-TR-936}
}