Computer Laboratory

Open Source Code and Data

This page contains links to the code and data that my group has published. For code and data supporting publications, please see the table below or links against each paper on the publications page.

Dynamic Binary Modification

Our CGO 2019 and VEE 2019 papers present Janus, a framework built on top of DynamoRIO that performs dynamic binary modification guided by static binary analysis. We demonstrated Janus parallelising DoAll loops in SPEC CPU2006 binaries at CGO 2019, then inserting software prefetches for indirect memory accesses and vectorising loops at VEE 2019. Janus is open-source software and available on GitHub.

As part of a blog post comparing two dynamic binary modification tools for AArch64 (DynamoRIO and Mambo), I wrote a tool that instruments indirect control transfer instructions and gathers three statistics: how many such instructions there are, how many targets each one has, and how often each instruction transfers to the same target as it did the last time it was executed. The data is available here. The archive also contains results from running most of SPEC CPU2006, compiled with gcc at optimisation level O2 and run with the reference input set.
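The bookkeeping behind these statistics can be sketched in a few lines of C. This is not the tool from the blog post: the names `branch_stats`, `record_transfer` and `MAX_TARGETS` are hypothetical, and the sketch assumes that the instrumentation inserted before each indirect control transfer calls `record_transfer` with the resolved target address.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical cap on the distinct targets remembered per instruction. */
#define MAX_TARGETS 16

/* Per-instruction statistics; zero-initialise before first use. */
struct branch_stats {
    uint64_t executions;            /* times the instruction has executed */
    uint64_t same_as_last;          /* executions whose target matched the previous one */
    uint64_t last_target;           /* target of the most recent execution */
    uint64_t targets[MAX_TARGETS];  /* distinct targets seen so far */
    size_t num_targets;             /* number of valid entries in targets[] */
    bool overflowed;                /* set if more than MAX_TARGETS targets were seen */
};

/* Record one execution of an indirect control transfer to `target`. */
void record_transfer(struct branch_stats *s, uint64_t target)
{
    /* The first execution has no previous target to compare against. */
    if (s->executions > 0 && s->last_target == target)
        s->same_as_last++;
    s->executions++;
    s->last_target = target;

    /* Linear search is adequate for a small, fixed-size target set. */
    for (size_t i = 0; i < s->num_targets; i++)
        if (s->targets[i] == target)
            return;

    if (s->num_targets < MAX_TARGETS)
        s->targets[s->num_targets++] = target;
    else
        s->overflowed = true;
}
```

A real tool would keep one such record per instrumented instruction, keyed by its address, and would probably need an unbounded target set rather than a fixed array.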

Software Prefetching

Our CGO 2017 paper describes a technique for automatic insertion of software prefetch instructions for indirect memory accesses. Aside from the data repository accompanying the paper, you can also get the code directly from GitHub.
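To illustrate the kind of access pattern involved, here is a hand-written sketch of software prefetching for an indirect access of the form `data[idx[i]]`. It is not output from our compiler pass: the function name and `PREFETCH_DISTANCE` are hypothetical, the distance would need tuning for a given machine, and `__builtin_prefetch` assumes GCC or Clang.

```c
#include <stddef.h>

/* Hypothetical look-ahead distance in loop iterations; tune per machine. */
#define PREFETCH_DISTANCE 64

/* Sum data[idx[i]] for i in [0, n), prefetching future indirect targets. */
long sum_indirect(const long *data, const size_t *idx, size_t n)
{
    long total = 0;
    for (size_t i = 0; i < n; i++) {
        /* Prefetch the index array further ahead, so that the load of
         * idx[i + PREFETCH_DISTANCE] below is itself likely to hit in cache. */
        if (i + 2 * PREFETCH_DISTANCE < n)
            __builtin_prefetch(&idx[i + 2 * PREFETCH_DISTANCE], 0, 3);

        /* Prefetch the indirect target. The bounds check is required because
         * computing the address means actually reading idx[]. */
        if (i + PREFETCH_DISTANCE < n)
            __builtin_prefetch(&data[idx[i + PREFETCH_DISTANCE]], 0, 3);

        total += data[idx[i]];
    }
    return total;
}
```

The prefetches are hints only, so the function returns the same result as the plain loop; only the cache behaviour differs.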

I have written a blog post about this and evaluated an additional benchmark that contains up to ten indirect array accesses, with software prefetches inserted for anywhere from none to all of them. This can be downloaded here, with the same licence as the NAS parallel benchmark from which it comes. Please cite our publication if using this workload.

Lynx Queue

Lynx is a lock-free single-producer/single-consumer software queue for fine-grained communication. You can download C and C++ versions here, both licensed under the GPL version 2. Lynx is described in our ICS 2016 paper and there is a blog post on it too. Please cite the paper if you use Lynx in a publication.
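For readers unfamiliar with the setting, the sketch below shows a conventional lock-free single-producer/single-consumer ring buffer written with C11 atomics. It is not Lynx and does not reflect how Lynx is implemented; the paper describes the actual design. All names here (`spsc_queue`, `spsc_push`, `spsc_pop`, `QUEUE_CAPACITY`) are hypothetical.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical capacity; must be a power of two for the index mask. */
#define QUEUE_CAPACITY 1024

/* Zero-initialise before use. Exactly one thread may push and one may pop. */
struct spsc_queue {
    _Atomic size_t head;             /* next slot to read; written only by the consumer */
    _Atomic size_t tail;             /* next slot to write; written only by the producer */
    uint64_t slots[QUEUE_CAPACITY];
};

/* Producer side: returns false if the queue is full. */
bool spsc_push(struct spsc_queue *q, uint64_t value)
{
    size_t tail = atomic_load_explicit(&q->tail, memory_order_relaxed);
    size_t head = atomic_load_explicit(&q->head, memory_order_acquire);
    if (tail - head == QUEUE_CAPACITY)
        return false;
    q->slots[tail & (QUEUE_CAPACITY - 1)] = value;
    /* Release ordering publishes the slot write before the new tail. */
    atomic_store_explicit(&q->tail, tail + 1, memory_order_release);
    return true;
}

/* Consumer side: returns false if the queue is empty. */
bool spsc_pop(struct spsc_queue *q, uint64_t *value)
{
    size_t head = atomic_load_explicit(&q->head, memory_order_relaxed);
    size_t tail = atomic_load_explicit(&q->tail, memory_order_acquire);
    if (head == tail)
        return false;
    *value = q->slots[head & (QUEUE_CAPACITY - 1)];
    /* Release ordering ensures the slot is read before it can be reused. */
    atomic_store_explicit(&q->head, head + 1, memory_order_release);
    return true;
}
```

Every push and pop in this version pays for an explicit full or empty check and for sharing the `head` and `tail` counters between cores, which is the kind of per-element overhead that matters when communication is fine-grained.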

Data Repositories

Where we have released data repositories with our papers (pretty much all of them now), these are linked against each publication on the publications page and duplicated in the table below.

| Paper | Venue | Data Repository |
| --- | --- | --- |
| Duplo: A Framework for OCaml Post-Link Optimisation | ICFP 2020 | |
| Prefetching in Functional Languages | ISMM 2020 | |
| MuonTrap: Preventing Cross-Domain Spectre-Like Attacks by Capturing Speculative State | ISCA 2020 | |
| MarkUs: Drop-in use-after-free prevention for low-level languages | S & P 2020 | |
| Cornucopia: Temporal Safety for CHERI Heaps | S & P 2020 | |
| The Guardian Council: Parallel Programmable Hardware Security | ASPLOS 2020 | |
| HALO: Post-Link Heap-Layout Optimisation | CGO 2020 | |
| CHERIvoke: Characterising Pointer Revocation using CHERI Capabilities for Temporal Memory Safety | MICRO 2019 | |
| ParaMedic: Heterogeneous Parallel Error Correction | DSN 2019 | |
| Software Prefetching for Indirect Memory Accesses: A Microarchitectural Perspective | TOCS 36(3) 2019 | |
| The Janus Triad: Exploiting Parallelism Through Dynamic Binary Modification | VEE 2019 | |
| Janus: Statically-Driven and Profile-Guided Automatic Dynamic Binary Parallelisation | CGO 2019 | |
| Software prefetching for unstructured mesh applications | IA3 2018 | |
| Parallel Error Detection Using Heterogeneous Cores | DSN 2018 | |
| An Event-Triggered Programmable Prefetcher for Irregular Workloads | ASPLOS 2018 | |
| High Performance Fault Tolerance Through Predictive Instruction Re-Execution | DFT 2017 | |
| Software Prefetching for Indirect Memory Accesses | CGO 2017 | |
| On Microarchitectural Mechanisms for Cache Wearout Reduction | TVLSI 25(3) 2017 | |
| COMET: Communication-Optimized Multi-threaded Error-detection Technique | CASES 2016 | |
| Enhancing the L1 Data Cache Design to Mitigate HCI | CAL 15(2) 2016 | |
| Graph Prefetching Using Data Structure Knowledge | ICS 2016 | |
| Lynx: Using OS and Hardware Support for Fast Fine-Grained Inter-Core Communication | ICS 2016 | |
| Performance Implications of Transient Loop-Carried Data Dependences in Automatically Parallelized Loops | CC 2016 | |
| Throttling Automatic Vectorization: When Less Is More | PACT 2015 | |
| REPAIR: Hard-Error Recovery via Re-Execution | DFT 2015 | |