Open Source Code and Data

This page contains links to the code and data that my group has published. For code and data supporting publications, please see the table below or links against each paper on the publications page.

Dynamic Binary Modification

Our CGO 2019 and VEE 2019 papers present Janus, a framework built on top of DynamoRIO in which static binary analysis controls dynamic binary modification. We demonstrated Janus parallelising DoAll loops in SPEC CPU2006 binaries at CGO 2019, then inserting software prefetches for indirect memory accesses and vectorising loops at VEE 2019. Janus is open-source software and available on GitHub.

As part of a blog post comparing two dynamic binary modification tools for AArch64 (DynamoRIO and Mambo), I wrote a tool that instruments indirect control transfer instructions and gathers three statistics: how many such instructions there are, how many targets each one has, and how often each instruction transfers to the same target as it did the last time it was executed. The data is available here. The archive also contains results from running most of SPEC CPU2006, compiled with gcc at optimisation level O2 and run with the reference input set.
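To make the three statistics concrete, here is a minimal sketch in C of the per-instruction bookkeeping such a tool might keep. It is not the code from the archive: the names `branch_stats`, `record_transfer` and `MAX_TARGETS` are hypothetical, and the sketch assumes the instrumentation framework calls `record_transfer` with the resolved target each time an indirect control transfer executes.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical cap on distinct targets tracked per instruction. */
#define MAX_TARGETS 16

/* One record per instrumented indirect control transfer instruction. */
typedef struct {
    uint64_t executions;           /* times the instruction executed */
    uint64_t repeat_hits;          /* times the target matched the previous one */
    uint64_t last_target;          /* target seen on the previous execution */
    uint64_t targets[MAX_TARGETS]; /* distinct targets seen so far */
    size_t   num_targets;          /* number of valid entries in targets[] */
    int      overflowed;           /* set if more than MAX_TARGETS were seen */
} branch_stats;

/* Assumed to be called by the instrumentation on every execution. */
void record_transfer(branch_stats *s, uint64_t target)
{
    size_t i;

    /* Same target as the last time this instruction executed? */
    if (s->executions > 0 && s->last_target == target)
        s->repeat_hits++;

    /* Linear scan: has this target been seen before? */
    for (i = 0; i < s->num_targets; i++) {
        if (s->targets[i] == target)
            break;
    }
    if (i == s->num_targets) {
        if (s->num_targets < MAX_TARGETS)
            s->targets[s->num_targets++] = target;
        else
            s->overflowed = 1; /* num_targets is now only a lower bound */
    }

    s->last_target = target;
    s->executions++;
}
```

A zero-initialised `branch_stats` is a valid empty record. The fixed-size target array keeps the sketch short; a real tool would more likely use a hash table so that instructions with many targets are counted exactly.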

Software Prefetching

Our CGO 2017 paper describes a technique for automatic insertion of software prefetch instructions for indirect memory accesses. Aside from the data repository accompanying the paper, you can also get the code directly from GitHub.

I have written a blog post about this technique and evaluated an additional benchmark that contains up to ten indirect array accesses, with software prefetches inserted for anywhere from none to all of them. This can be downloaded here, with the same licence as the NAS parallel benchmark from which it comes. Please cite our publication if using this workload.
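For readers unfamiliar with the idea, the following hand-written C sketch shows what a software prefetch for an indirect access of the form `data[index[i]]` can look like. It is an illustration only, not output from our compiler pass: the function name `sum_indirect` and the `distance` parameter are hypothetical, and `__builtin_prefetch` is the GCC/Clang builtin, so other compilers will need a different intrinsic.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Sum data[index[i]] for i in [0, n), prefetching the element that will be
 * needed `distance` iterations in the future. The best distance depends on
 * the machine and the loop body, so it is left as a parameter here.
 */
int64_t sum_indirect(const int64_t *data, const uint32_t *index,
                     size_t n, size_t distance)
{
    int64_t sum = 0;

    for (size_t i = 0; i < n; i++) {
        /* Guard the look-ahead load of index[] so it stays in bounds. */
        if (distance < n - i) {
            /* rw = 0 (read), locality = 3 (keep in all cache levels). */
            __builtin_prefetch(&data[index[i + distance]], 0, 3);
        }
        sum += data[index[i]];
    }
    return sum;
}
```

The prefetch itself never faults, but computing its address requires a real load from `index`, which is why the look-ahead is bounds-checked. The result is identical to the loop without the prefetch; only the timing changes.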

Lynx Queue

Lynx is a lock-free single-producer/single-consumer software queue for fine-grained communication. You can download C and C++ versions here, both licensed under the GPL version 2. Lynx is described in our ICS 2016 paper and there is a blog post on it too. Please cite the paper if you use Lynx in your own publications.
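As background on what a lock-free single-producer/single-consumer queue is, here is a generic ring buffer written with C11 atomics. This is deliberately not Lynx: it is the textbook design with explicit full and empty checks, whereas Lynx, as the paper title says, uses OS and hardware support to make communication faster. All names (`spsc_queue`, `spsc_init`, `spsc_push`, `spsc_pop`, `QUEUE_CAPACITY`) are hypothetical.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical capacity; must be a power of two so indices wrap cleanly. */
#define QUEUE_CAPACITY 1024

typedef struct {
    _Atomic size_t head;            /* next slot to read; written by consumer only */
    _Atomic size_t tail;            /* next slot to write; written by producer only */
    uint64_t slots[QUEUE_CAPACITY]; /* payload storage */
} spsc_queue;

void spsc_init(spsc_queue *q)
{
    atomic_init(&q->head, 0);
    atomic_init(&q->tail, 0);
}

/* Producer side. Returns false if the queue is full. */
bool spsc_push(spsc_queue *q, uint64_t value)
{
    size_t tail = atomic_load_explicit(&q->tail, memory_order_relaxed);
    size_t head = atomic_load_explicit(&q->head, memory_order_acquire);

    if (tail - head == QUEUE_CAPACITY)
        return false;

    q->slots[tail % QUEUE_CAPACITY] = value;
    /* Release: the slot write above becomes visible before the new tail. */
    atomic_store_explicit(&q->tail, tail + 1, memory_order_release);
    return true;
}

/* Consumer side. Returns false if the queue is empty. */
bool spsc_pop(spsc_queue *q, uint64_t *out)
{
    size_t head = atomic_load_explicit(&q->head, memory_order_relaxed);
    size_t tail = atomic_load_explicit(&q->tail, memory_order_acquire);

    if (head == tail)
        return false;

    *out = q->slots[head % QUEUE_CAPACITY];
    /* Release: the slot read above completes before the slot is reusable. */
    atomic_store_explicit(&q->head, head + 1, memory_order_release);
    return true;
}
```

Exactly one thread may call `spsc_push` and exactly one may call `spsc_pop`; with that restriction no locks or compare-and-swap operations are needed, because each index has a single writer.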

Data Repositories

Where we have released data repositories with our papers (pretty much all of them now), these are linked against each publication on the publications page and duplicated in the table below.

Paper	Venue	Data Repository
ParaVerser: Harnessing Heterogeneous Parallelism for Affordable Fault Detection in Data Centers	DSN 2025	https://doi.org/10.5281/zenodo.15080017
Adaptive CHERI Compartmentalization for Heterogeneous Accelerators	ISCA 2025	https://doi.org/10.5281/zenodo.15100923
PIP: An Ensemble of Programming-Idiom Predictors	CBP 2025	https://doi.org/10.17863/CAM.119036
MASCOT: Predicting memory dependencies and opportunities for speculative memory bypassing	HPCA 2025	https://doi.org/10.17863/CAM.114977
Parallaft: Runtime-based CPU Fault Tolerance via Heterogeneous Parallelism	CGO 2025	https://doi.org/10.5281/zenodo.14084708
A Deep Technical Review of nZDC Fault Tolerance	CC 2025	https://doi.org/10.5281/zenodo.14678385
Advanced Dynamic Scalarisation for RISC-V GPGPUs	ICCD 2024	https://doi.org/10.17863/CAM.111868
OptiWISE: Combining Sampling and Instrumentation for Granular CPI Analysis	CGO 2024	http://doi.org/10.17863/CAM.104277
MineSweeper: A "Clean Sweep" for Drop-In Use-after-Free Prevention	ASPLOS 2022	https://doi.org/10.17863/CAM.78150
Quantifying the Semantic Gap Between Serial and Parallel Programming	IISWC 2021	https://doi.org/10.17863/CAM.76224
Compendia: Reducing Virtual-Memory Costs via Selective Densification	ISMM 2021	https://doi.org/10.17863/CAM.68700
Speculative Vectorisation with Selective Replay	ISCA 2021	https://doi.org/10.17863/CAM.68206
ParaDox: Eliminating Voltage Margins via Heterogeneous Fault Tolerance	HPCA 2021	https://doi.org/10.17863/CAM.61808
Cinnamon: A Domain-Specific Language for Binary Profiling and Monitoring	CGO 2021	https://doi.org/10.17863/CAM.62760
Duplo: A Framework for OCaml Post-Link Optimisation	ICFP 2020	https://doi.org/10.17863/CAM.52533
Prefetching in Functional Languages	ISMM 2020	https://doi.org/10.17863/CAM.51790
MuonTrap: Preventing Cross-Domain Spectre-Like Attacks by Capturing Speculative State	ISCA 2020	https://doi.org/10.17863/CAM.50489
MarkUs: Drop-in use-after-free prevention for low-level languages	S&P 2020	https://doi.org/10.17863/CAM.46535
Cornucopia: Temporal Safety for CHERI Heaps	S&P 2020	https://doi.org/10.17863/CAM.51028
Software Prefetching for Unstructured Mesh Applications	TOPC 7(1) 2020	https://bitbucket.org/ioanhadade/au3x-ia3-reproduce/
The Guardian Council: Parallel Programmable Hardware Security	ASPLOS 2020	https://doi.org/10.17863/CAM.46514
HALO: Post-Link Heap-Layout Optimisation	CGO 2020	https://doi.org/10.17863/CAM.46071
CHERIvoke: Characterising Pointer Revocation using CHERI Capabilities for Temporal Memory Safety	MICRO 2019	https://doi.org/10.17863/CAM.42436
ParaMedic: Heterogeneous Parallel Error Correction	DSN 2019	https://doi.org/10.17863/CAM.37963
Software Prefetching for Indirect Memory Accesses: A Microarchitectural Perspective	TOCS 36(3) 2019	https://doi.org/10.17863/CAM.37731
The Janus Triad: Exploiting Parallelism Through Dynamic Binary Modification	VEE 2019	https://doi.org/10.17863/CAM.37523
Janus: Statically-Driven and Profile-Guided Automatic Dynamic Binary Parallelisation	CGO 2019	https://doi.org/10.17863/CAM.33893
Software prefetching for unstructured mesh applications	IA³ 2018	https://bitbucket.org/ioanhadade/au3x-ia3-reproduce/
Parallel Error Detection Using Heterogeneous Cores	DSN 2018	https://doi.org/10.17863/CAM.21857
An Event-Triggered Programmable Prefetcher for Irregular Workloads	ASPLOS 2018	https://doi.org/10.17863/CAM.17392
High Performance Fault Tolerance Through Predictive Instruction Re-Execution	DFT 2017	https://doi.org/10.17863/CAM.11957
On Microarchitectural Mechanisms for Cache Wearout Reduction	TVLSI 25(3) 2017	http://dx.doi.org/10.17863/CAM.6183
Software Prefetching for Indirect Memory Accesses	CGO 2017	http://dx.doi.org/10.17863/CAM.6349
COMET: Communication-Optimized Multi-threaded Error-detection Technique	CASES 2016	http://dx.doi.org/10.17863/CAM.590
Enhancing the L1 Data Cache Design to Mitigate HCI	CAL 15(2) 2016	https://www.repository.cam.ac.uk/handle/1810/249006
Graph Prefetching Using Data Structure Knowledge	ICS 2016	https://www.repository.cam.ac.uk/handle/1810/254642
Lynx: Using OS and Hardware Support for Fast Fine-Grained Inter-Core Communication	ICS 2016	https://www.repository.cam.ac.uk/handle/1810/254651
Performance Implications of Transient Loop-Carried Data Dependences in Automatically Parallelized Loops	CC 2016	https://www.repository.cam.ac.uk/handle/1810/253650
Throttling Automatic Vectorization: When Less Is More	PACT 2015	https://www.repository.cam.ac.uk/handle/1810/250381
REPAIR: Hard-Error Recovery via Re-Execution	DFT 2015	https://www.repository.cam.ac.uk/handle/1810/249207

Department of Computer Science and Technology
