Computer Laboratory

Parts II, III & ACS Projects

These are some Part II, Part III and ACS project ideas that I've put together. The groupings are only loose and many of the projects could be adapted to suit any course stage. Please contact me for more information. At the end there is a list of successful projects from the recent past, to help seed ideas.

If you have your own ideas for a project that you'd like to do with me, then it would be great to hear them so please also get in touch. I'm interested in advising anybody wanting to do a project in computer architecture or compilers.

Part II Projects

Also, check out the Computer Architecture Group projects page for other ideas. Some of these are replicated there.

Garbage Collector For C

C is a language where the programmer, for better or worse, has complete control over memory allocation and subsequent deallocation. This causes obvious problems when reasoning about memory usage is complex: memory leaks occur through forgetting to deallocate memory, and double-frees occur when a block is deallocated multiple times. Wouldn't it be great to be able to manage memory in C automatically and avoid these issues? It turns out that Hans Boehm has already written a mark-and-sweep garbage collector. The aim of this project is to write a different one, perhaps using generations or reference counting.
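As a sketch of the reference-counting alternative, here is a minimal allocator in C that keeps a count in a header in front of each block; the `rc_*` names are illustrative, not from any existing library:

```c
#include <assert.h>
#include <stdlib.h>

/* Minimal reference-counting allocator sketch.  Every object carries a
 * count in a hidden header; rc_retain/rc_release adjust it, and the block
 * is freed when the count reaches zero.  Names are illustrative only. */

typedef struct {
    size_t refcount;
    /* payload follows the header */
} rc_header;

void *rc_alloc(size_t size) {
    rc_header *h = malloc(sizeof(rc_header) + size);
    if (!h) return NULL;
    h->refcount = 1;              /* the caller holds the first reference */
    return h + 1;                 /* hand back a pointer past the header  */
}

static rc_header *header_of(void *p) { return (rc_header *)p - 1; }

void *rc_retain(void *p) {
    header_of(p)->refcount++;
    return p;
}

void rc_release(void *p) {
    rc_header *h = header_of(p);
    if (--h->refcount == 0)
        free(h);                  /* last reference gone: reclaim */
}

size_t rc_count(void *p) { return header_of(p)->refcount; }
```

Note that plain reference counting cannot reclaim cycles, which is one reason a real project might combine it with a tracing collector.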

JavaScript Dependence Profiler

JavaScript has become the de facto language for web applications, historically just on the client but extending to server-side applications too with the development of environments such as Node.js. In addition, subsets of JavaScript, such as asm.js, allow programs written in other languages to be run as web applications. However, it is currently difficult to understand the dependences between different parts of a JavaScript application (e.g. between different loop iterations or different functions). The aim of this project, therefore, is to build a dependence profiler for JavaScript. This may be a standalone tool, or integrated into the browser or JavaScript engine, and should provide information to the user about the dependences within an application in a useful and intuitive format.
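To make the idea concrete, here is a toy loop-carried dependence detector, written in C for brevity; a real profiler would do the same bookkeeping inside the JavaScript engine for every read and write, and all names here are illustrative:

```c
#include <assert.h>

/* Toy loop-carried dependence detector.  For each array slot we remember
 * the last iteration that wrote it; a read whose value came from an earlier
 * iteration is a loop-carried read-after-write dependence. */

#define N 8

static int last_writer[N];     /* iteration that last wrote each slot */

/* Counts loop-carried RAW dependences while running a[i] = a[i-1] + 1.
 * Requires n <= N. */
int profile_loop(int *a, int n) {
    int deps = 0;
    for (int i = 0; i < n; i++) last_writer[i] = -1;
    for (int i = 1; i < n; i++) {
        int src = i - 1;                       /* read a[i-1] */
        if (last_writer[src] != -1 && last_writer[src] < i)
            deps++;                            /* earlier iteration wrote it */
        a[i] = a[src] + 1;                     /* write a[i]  */
        last_writer[i] = i;
    }
    return deps;
}
```

In the instrumented loop, each of iterations 2..n-1 reads a value written by the previous iteration, so the loop cannot be parallelised naively; exactly the kind of fact the profiler should surface.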

Hardware Transactional Memory

Transactional memory systems have been gaining popularity since multicores became commonplace, due to their ease of use: they allow the programmer to avoid reasoning about locks while transparently supporting atomic access to data. There are a number of software transactional memory libraries available and hardware transactional memory has now become mainstream with Intel's release of the TSX extensions in Haswell processors (bugs notwithstanding). However, a significant downside is that no ordering can be enforced on transaction commit, meaning that TSX is a poor fit for techniques such as thread-level speculation. This project would start by implementing a simple version of hardware transactional memory within a simulator, such as gem5. It would then evaluate the amount of performance improvement available either through source code alterations, or automatically in hardware through speculative thread creation.
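TSX itself needs hardware support, but the core transactional mechanism can be sketched in software. The toy C version below (illustrative names, no relation to any real STM library, no bounds checks) buffers writes in a log, validates at commit that nothing it read has changed version, and aborts on conflict:

```c
#include <assert.h>

/* Toy versioned transaction sketch: reads record the version they saw,
 * writes are buffered, and commit validates the read set before applying
 * the write log.  Demo only: fixed sizes, no bounds checks, no threads. */

#define SLOTS 4
#define LOGMAX 8

static int mem[SLOTS];        /* "shared" memory          */
static int version[SLOTS];    /* bumped on every commit   */

typedef struct { int addr, val; } entry;
typedef struct {
    entry writes[LOGMAX]; int nw;              /* write log */
    int raddr[LOGMAX], rver[LOGMAX]; int nr;   /* read set  */
} tx;

void tx_begin(tx *t) { t->nw = t->nr = 0; }

int tx_read(tx *t, int addr) {
    for (int i = 0; i < t->nw; i++)            /* read-your-own-writes */
        if (t->writes[i].addr == addr) return t->writes[i].val;
    t->raddr[t->nr] = addr;
    t->rver[t->nr++] = version[addr];          /* remember version seen */
    return mem[addr];
}

void tx_write(tx *t, int addr, int val) {
    t->writes[t->nw].addr = addr;
    t->writes[t->nw++].val = val;
}

/* Returns 1 on commit, 0 on abort (a read slot changed underneath us). */
int tx_commit(tx *t) {
    for (int i = 0; i < t->nr; i++)
        if (version[t->raddr[i]] != t->rver[i]) return 0;
    for (int i = 0; i < t->nw; i++) {
        mem[t->writes[i].addr] = t->writes[i].val;
        version[t->writes[i].addr]++;
    }
    return 1;
}
```

For thread-level speculation, commit would additionally have to wait for a ticket matching the original program order, which is exactly the ordering hook TSX lacks.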

Part III / ACS Projects

Binary Alias Analysis

One of the (many!) problems with transforming application binaries is that a significant amount of information about the code has been lost during the compilation stages. Analysing and transforming binaries is an important task because it allows us to optimise programs even when we don't have access to the source code. A number of transformations rely on knowing whether two instructions access the same locations in memory, so that we can ensure we maintain application correctness when we perform our optimisations. Recently, a number of analyses have been proposed to identify the memory objects that a program uses, specifically for Intel x86 binaries. The aim of this project is to implement a subset of these analyses for ARM binaries, and develop strategies for coping with facets of the ISA that are not present in x86, like predicated execution. If time permits, this analysis could then be used to implement a code motion transformation to safely move instructions around in the binary.
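A simple starting point is to represent each memory operand abstractly as a base region plus a constant offset, as the analysis might recover from an ARM instruction like `ldr r0, [sp, #8]`. The sketch below (hypothetical types and names) declares two accesses disjoint only when it can prove it, and says "may alias" otherwise:

```c
#include <assert.h>

/* Abstract memory operand: base region plus constant offset and access
 * size, as a binary alias analysis might recover from an instruction
 * such as "ldr r0, [sp, #8]".  Illustrative sketch only. */
typedef enum { REG_SP, REG_GLOBAL, REG_UNKNOWN } base_t;
typedef struct { base_t base; int offset, size; } memref;

/* 1 = may alias, 0 = provably disjoint. */
int may_alias(memref a, memref b) {
    if (a.base == REG_UNKNOWN || b.base == REG_UNKNOWN)
        return 1;                                  /* no information     */
    if (a.base != b.base)
        return 0;   /* distinct regions (stack vs globals) cannot overlap */
    return a.offset < b.offset + b.size &&
           b.offset < a.offset + a.size;           /* interval overlap   */
}
```

The conservative default (unknown base implies "may alias") is what keeps a later code-motion transformation safe: an instruction may only move past another when `may_alias` returns 0.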

Vectorisation in a Dynamic Binary Translator

Modern computer processors contain short vector (SIMD) instructions to take advantage of data parallel operations. My group has a tool that supports binary modification of a program through analysis of its binary and then runtime dynamic translation to add features. We currently use it for adding thread-level parallelism and software prefetching, to improve performance. The idea of this project is to add in vectorisation support too, using an algorithm such as SLP. This is a challenging, yet achievable project, because it requires significant changes to the application binary, in addition to the static analysis required. As well as writing the analysis and transformation, internals of the DBT may need changing to allow efficient vectorisation of the code.
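The core of SLP is to find groups of isomorphic, independent scalar statements and pack them into one vector operation. The C sketch below shows the before and after, using the GCC/Clang vector extension to stand in for the SIMD instructions the DBT would emit:

```c
#include <assert.h>
#include <string.h>

/* SLP in miniature: four isomorphic scalar statements are packed into one
 * vector add.  The GCC/Clang vector extension stands in for real SIMD
 * codegen; a DBT would instead rewrite machine instructions directly. */
typedef int v4si __attribute__((vector_size(16)));

void add4_scalar(int *a, const int *b, const int *c) {
    a[0] = b[0] + c[0];           /* four independent, isomorphic */
    a[1] = b[1] + c[1];           /* statements: an SLP "pack"    */
    a[2] = b[2] + c[2];
    a[3] = b[3] + c[3];
}

void add4_slp(int *a, const int *b, const int *c) {
    v4si vb, vc;
    memcpy(&vb, b, sizeof vb);    /* gather the b operands  */
    memcpy(&vc, c, sizeof vc);    /* gather the c operands  */
    v4si va = vb + vc;            /* one SIMD add           */
    memcpy(a, &va, sizeof va);    /* scatter the results    */
}
```

The analysis part of the project is deciding when such packing is legal (no dependences between the packed statements) and profitable (the pack/unpack overhead is outweighed by the saved scalar operations).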

Old Projects

DoAll Parallelisation - Part II Project

The idea of this project was to identify DoAll parallelism within program loops using the LLVM intermediate representation and then split up the loops into multiple threads to execute each iteration concurrently. The project was extended to extract a form of DoAcross parallelism too.
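The transformation can be sketched as follows: a loop with no cross-iteration dependences is block-distributed across POSIX threads. This is a hand-written C illustration of the kind of code the LLVM pass would generate; all names are illustrative:

```c
#include <assert.h>
#include <pthread.h>

/* DoAll parallelisation sketch: iterations of an independent loop are
 * block-distributed across threads; each thread runs one chunk. */

#define N 1000
#define NTHREADS 4

static int data[N];

typedef struct { int lo, hi; } range;

static void *worker(void *arg) {
    range *r = (range *)arg;
    for (int i = r->lo; i < r->hi; i++)
        data[i] = i * i;          /* no cross-iteration dependences */
    return NULL;
}

void run_doall(void) {
    pthread_t tid[NTHREADS];
    range r[NTHREADS];
    int chunk = N / NTHREADS;
    for (int t = 0; t < NTHREADS; t++) {
        r[t].lo = t * chunk;
        r[t].hi = (t == NTHREADS - 1) ? N : (t + 1) * chunk;
        pthread_create(&tid[t], NULL, worker, &r[t]);
    }
    for (int t = 0; t < NTHREADS; t++)
        pthread_join(tid[t], NULL);
}
```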

Software Error Detection - Part II Project

This project implemented SWIFT, a technique to detect transient errors by adding extra software instructions.
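The essence of SWIFT can be illustrated in a few lines of C: each computation is duplicated on shadow copies of its inputs and the two results are compared before the value is allowed to escape. This is a hand-written sketch; the real technique inserts the duplicate instructions automatically at the compiler's instruction level:

```c
#include <assert.h>

/* SWIFT-style duplication in miniature: compute twice on shadow copies and
 * compare before the result escapes to memory.  A transient fault that
 * flips one copy shows up as a mismatch.  The volatiles stop the compiler
 * folding the two computations back into one. */
int checked_add(int a, int b, int *fault) {
    volatile int a2 = a, b2 = b;     /* shadow registers       */
    int r  = a + b;                  /* original computation   */
    int r2 = a2 + b2;                /* duplicated computation */
    *fault = (r != r2);              /* compare before the store */
    return r;
}
```

A real implementation also protects control flow (branch duplication and signatures), not just data computation.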

Dynamic Data-Dependence Analysis - Part II Project

Data-dependence profiling is time-consuming and uses a lot of space if full program traces are taken. To combat this, this project implemented SD3, an online data-dependence analyser. A library implementing SD3 was written and then interfaced with a program automatically by writing a pass within LLVM to insert calls into the library.
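The idea behind SD3's space saving can be sketched as follows: instead of logging every address, a strided access stream is compressed to (base, stride, count), and a conservative dependence test runs on the compressed form. This is illustrative C, not the actual SD3 implementation, and it assumes positive strides:

```c
#include <assert.h>

/* SD3's key idea in miniature: a loop's address stream is stored as
 * (base, stride, count) instead of a full trace.  streams_overlap is a
 * conservative may-dependence test on the compressed form; it assumes
 * positive strides.  Illustrative sketch only. */
typedef struct { long base, stride, count; } stream;

static long gcd(long a, long b) {
    while (b) { long t = a % b; a = b; b = t; }
    return a;
}

/* 1 if the two strided streams may reference a common address. */
int streams_overlap(stream a, stream b) {
    long a_end = a.base + (a.count - 1) * a.stride;
    long b_end = b.base + (b.count - 1) * b.stride;
    if (a_end < b.base || b_end < a.base)
        return 0;                             /* ranges are disjoint    */
    long g = gcd(a.stride, b.stride);
    return ((b.base - a.base) % g) == 0;      /* necessary GCD condition */
}
```

Two streams of a million accesses each are compared in constant time, which is what makes online profiling of long-running programs feasible.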

Function Call Parallelisation - Part II Project

This project attempted to run functions in their own threads. It used software transactional memory to deal with data dependences and performed the transformation automatically in LLVM.

Ultra-Fine-Grained Parallelisation - ACS Project

Automatic program parallelisation is the process of taking a sequential application and splitting it into threads that can be run independently. One of the challenges of automatic parallelisation is defining the amount of code that each thread should execute. This project took this towards an extreme, by splitting up the application into small units of work, with threads executing many of these tasks each.
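The execution model can be sketched as a pool of threads pulling tiny tasks from a shared atomic counter. This is a hand-written C illustration; the project's actual runtime was more sophisticated:

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>

/* Ultra-fine-grained sketch: the work is chopped into many small tasks and
 * a pool of threads claims the next task index from a shared counter. */

#define NTASKS 256
#define NTHREADS 4

static int results[NTASKS];
static atomic_int next_task;

static void *worker(void *arg) {
    (void)arg;
    for (;;) {
        int t = atomic_fetch_add(&next_task, 1);  /* claim one small task */
        if (t >= NTASKS) break;
        results[t] = t + 1;                       /* tiny unit of work    */
    }
    return NULL;
}

void run_tasks(void) {
    pthread_t tid[NTHREADS];
    atomic_store(&next_task, 0);
    for (int i = 0; i < NTHREADS; i++)
        pthread_create(&tid[i], NULL, worker, NULL);
    for (int i = 0; i < NTHREADS; i++)
        pthread_join(tid[i], NULL);
}
```

The trade-off the project explored is visible even here: smaller tasks balance load better, but each claim of the shared counter is synchronisation overhead.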

JavaScript Parallelisation - ACS Project

The main idea of this project was to transform the code generated by a JavaScript engine (we used Google's V8) to take advantage of thread-level parallelism. Essentially, DoAll loops were identified by the programmer (those where all iterations can run in parallel because there are no dependences) and the V8 engine split their iterations across different threads, running them all at the same time.