Computer Laboratory

Open Source Code and Data

This page contains links to the code and data that my group has published. For code and data supporting publications, please see the table below or links against each paper on the publications page.

Dynamic Binary Modification

Our CGO 2019 and VEE 2019 papers present Janus, a framework built on top of DynamoRIO that performs dynamic binary modification guided by static binary analysis. We demonstrated Janus parallelising DoAll loops in SPEC CPU2006 binaries at CGO 2019, then inserting software prefetches for indirect memory accesses and vectorising loops at VEE 2019. Janus is open-source software and available on GitHub.

As part of a blog post comparing two dynamic binary modification tools for AArch64 (DynamoRIO and Mambo), I wrote a tool that instruments indirect control transfer instructions and gathers three statistics: how many such instructions there are, how many distinct targets each one has, and how often each instruction transfers to the same target as on its previous execution. The data is available here. The archive also contains results from running most of SPEC CPU2006, compiled with gcc at optimisation level O2, on the reference input set.
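The three statistics described above can be illustrated with a small bookkeeping routine. This is only a sketch of the counting logic, not the tool itself: the names branch_stats, record_transfer and MAX_TARGETS are hypothetical, and the hooks that DynamoRIO or Mambo would use to call such a routine are deliberately left out.

```c
#include <stddef.h>
#include <stdint.h>

#define MAX_TARGETS 64  /* hypothetical cap on distinct targets tracked per instruction */

/* Record for one indirect control transfer instruction (hypothetical layout). */
typedef struct {
    uint64_t targets[MAX_TARGETS]; /* distinct targets seen so far */
    size_t   num_targets;          /* number of distinct targets */
    uint64_t last_target;          /* target taken on the previous execution */
    uint64_t executions;           /* times this instruction has executed */
    uint64_t same_as_last;         /* executions that repeated the previous target */
} branch_stats;

/* Conceptually called each time the instrumented instruction executes. */
static void record_transfer(branch_stats *s, uint64_t target)
{
    /* Count a repeat only if there was a previous execution to compare with. */
    if (s->executions > 0 && target == s->last_target)
        s->same_as_last++;

    /* Linear search for the target; add it if unseen and there is room. */
    size_t i;
    for (i = 0; i < s->num_targets; i++)
        if (s->targets[i] == target)
            break;
    if (i == s->num_targets && s->num_targets < MAX_TARGETS)
        s->targets[s->num_targets++] = target;

    s->last_target = target;
    s->executions++;
}
```

A zero-initialised branch_stats per instrumented instruction is enough to start counting. A real tool would probably want a hash table in place of the fixed-size array.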

Software Prefetching

Our CGO 2017 paper describes a technique for automatic insertion of software prefetch instructions for indirect memory accesses. Aside from the data repository accompanying the paper, you can also get the code directly from GitHub.

I have written a blog post about this technique and evaluated an additional benchmark that contains up to ten indirect array accesses, with software prefetches inserted for anywhere from zero to all of them. The benchmark can be downloaded here, with the same licence as the NAS parallel benchmark from which it comes. Please cite our publication if you use this workload.
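For readers unfamiliar with the idea, the following hand-written sketch shows the general shape of a software prefetch for an indirect access of the form data[index[i]]. It is not output from the compiler pass in the paper. The function name sum_indirect and the value of PREFETCH_DISTANCE are assumptions made for illustration, and __builtin_prefetch is the GCC/Clang builtin, so other compilers would need an equivalent.

```c
#include <stddef.h>

#define PREFETCH_DISTANCE 32  /* hypothetical look-ahead in iterations; needs tuning per machine */

/* Sums data[index[i]] for i in [0, n), prefetching each indirect target ahead of its use. */
static long sum_indirect(const long *data, const int *index, size_t n)
{
    long total = 0;
    for (size_t i = 0; i < n; i++) {
        /* index[] is read sequentially, so looking ahead in it is cheap.
           The bounds check ensures we never read index[] past its end. */
        if (i + PREFETCH_DISTANCE < n)
            __builtin_prefetch(&data[index[i + PREFETCH_DISTANCE]]);
        total += data[index[i]];
    }
    return total;
}
```

The prefetch only hints to the hardware and does not change the result, so the function returns the same sum with or without it. Whether it helps depends on the look-ahead distance and the target microarchitecture.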

Lynx Queue

Lynx is a lock-free single-producer/single-consumer software queue for fine-grained communication. You can download C and C++ versions here, both licensed under the GPL version 2. Lynx is described in our ICS 2016 paper and there is a blog post on it too. Please cite the paper if you use Lynx for any other publication.
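As background on what a lock-free single-producer/single-consumer queue looks like, here is a minimal textbook ring buffer using C11 atomics. It is explicitly not the Lynx design, which, as the paper title says, relies on OS and hardware support. All names here (spsc_queue, spsc_push, spsc_pop, QUEUE_CAPACITY) are hypothetical.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

#define QUEUE_CAPACITY 1024  /* hypothetical size; must be a power of two */

typedef struct {
    long buffer[QUEUE_CAPACITY];
    _Atomic size_t head;  /* next slot to read; written only by the consumer */
    _Atomic size_t tail;  /* next slot to write; written only by the producer */
} spsc_queue;

/* Producer side: returns false if the queue is full. */
static bool spsc_push(spsc_queue *q, long value)
{
    size_t tail = atomic_load_explicit(&q->tail, memory_order_relaxed);
    size_t head = atomic_load_explicit(&q->head, memory_order_acquire);
    if (tail - head == QUEUE_CAPACITY)
        return false;
    q->buffer[tail & (QUEUE_CAPACITY - 1)] = value;
    /* Release so the consumer sees the element before the new tail. */
    atomic_store_explicit(&q->tail, tail + 1, memory_order_release);
    return true;
}

/* Consumer side: returns false if the queue is empty. */
static bool spsc_pop(spsc_queue *q, long *out)
{
    size_t head = atomic_load_explicit(&q->head, memory_order_relaxed);
    size_t tail = atomic_load_explicit(&q->tail, memory_order_acquire);
    if (head == tail)
        return false;
    *out = q->buffer[head & (QUEUE_CAPACITY - 1)];
    /* Release so the producer sees the slot as free only after the read. */
    atomic_store_explicit(&q->head, head + 1, memory_order_release);
    return true;
}
```

A zero-initialised spsc_queue (for example one with static storage) starts out empty. Exactly one thread may call spsc_push and exactly one may call spsc_pop. With more producers or consumers this sketch is no longer safe.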

Data Repositories

Where we have released data repositories with our papers (pretty much all of them now), they are linked against each publication on the publications page and duplicated in the table below.

Paper | Venue | Data Repository
Duplo: A Framework for OCaml Post-Link Optimisation | ICFP 2020 | https://doi.org/10.17863/CAM.52533
Prefetching in Functional Languages | ISMM 2020 | https://doi.org/10.17863/CAM.51790
MuonTrap: Preventing Cross-Domain Spectre-Like Attacks by Capturing Speculative State | ISCA 2020 | https://doi.org/10.17863/CAM.50489
MarkUs: Drop-in use-after-free prevention for low-level languages | S & P 2020 | https://doi.org/10.17863/CAM.46071
Cornucopia: Temporal Safety for CHERI Heaps | S & P 2020 | https://doi.org/10.17863/CAM.51028
The Guardian Council: Parallel Programmable Hardware Security | ASPLOS 2020 | https://doi.org/10.17863/CAM.46514
HALO: Post-Link Heap-Layout Optimisation | CGO 2020 | https://doi.org/10.17863/CAM.46071
CHERIvoke: Characterising Pointer Revocation using CHERI Capabilities for Temporal Memory Safety | MICRO 2019 | https://doi.org/10.17863/CAM.42436
ParaMedic: Heterogeneous Parallel Error Correction | DSN 2019 | https://doi.org/10.17863/CAM.37963
Software Prefetching for Indirect Memory Accesses: A Microarchitectural Perspective | TOCS 36(3) 2019 | https://doi.org/10.17863/CAM.37731
The Janus Triad: Exploiting Parallelism Through Dynamic Binary Modification | VEE 2019 | https://doi.org/10.17863/CAM.37523
Janus: Statically-Driven and Profile-Guided Automatic Dynamic Binary Parallelisation | CGO 2019 | https://doi.org/10.17863/CAM.33893
Software prefetching for unstructured mesh applications | IA3 2018 | https://bitbucket.org/ioanhadade/au3x-ia3-reproduce/
Parallel Error Detection Using Heterogeneous Cores | DSN 2018 | https://doi.org/10.17863/CAM.21857
An Event-Triggered Programmable Prefetcher for Irregular Workloads | ASPLOS 2018 | https://doi.org/10.17863/CAM.17392
High Performance Fault Tolerance Through Predictive Instruction Re-Execution | DFT 2017 | https://doi.org/10.17863/CAM.11957
Software Prefetching for Indirect Memory Accesses | CGO 2017 | http://dx.doi.org/10.17863/CAM.6349
On Microarchitectural Mechanisms for Cache Wearout Reduction | TVLSI 25(3) 2017 | http://dx.doi.org/10.17863/CAM.6183
COMET: Communication-Optimized Multi-threaded Error-detection Technique | CASES 2016 | http://dx.doi.org/10.17863/CAM.590
Enhancing the L1 Data Cache Design to Mitigate HCI | CAL 15(2) 2016 | https://www.repository.cam.ac.uk/handle/1810/249006
Graph Prefetching Using Data Structure Knowledge | ICS 2016 | https://www.repository.cam.ac.uk/handle/1810/254642
Lynx: Using OS and Hardware Support for Fast Fine-Grained Inter-Core Communication | ICS 2016 | https://www.repository.cam.ac.uk/handle/1810/254651
Performance Implications of Transient Loop-Carried Data Dependences in Automatically Parallelized Loops | CC 2016 | https://www.repository.cam.ac.uk/handle/1810/253650
Throttling Automatic Vectorization: When Less Is More | PACT 2015 | https://www.repository.cam.ac.uk/handle/1810/250381
REPAIR: Hard-Error Recovery via Re-Execution | DFT 2015 | https://www.repository.cam.ac.uk/handle/1810/249207