Computer Laboratory: Alan Mycroft's Masters' project suggestions
Here are some possible compiler-like project sugestions students taking the MPhil in ACS or Part III (MEng) in the Computer Science Tripos. They are all designed to have possibilities of resulting publications.
- Liveness-based Garbage Collection
- Liveness-based Pointer Analysis
- SSA-based Register Allocation in LLVM
- A Decompiler
A paper in CC'2014 on this topic shows how a simple language can be augmented with tables, similar to debugging information, about liveness which can be used to improved garbage collection by following only live data rather than reachable data. There are many ways this initial work could be extended, e.g. mutable data, avoiding multiple heap-node revisits etc.
A paper on this topic shows points-to analysis can be significantly improved (in performance and in accuracy) in GCC by storing only live points-to information. Exploring this idea for LLVM would give an additional data point.
Traditionally compilers eliminate the phi-functions of SSA form, and then do register allocation. Optimal register allocation is therefore NP-complete. A few years ago Sebastian Hack noted that register allocation directly on SSA form is linear, and NP-completeness only arises on SSA-elimination (at which time different heuristics are available).
The project is to implement Hack's work as part of LLVM.
In a previous project Fernando M Q Pereira developed an implementation of a similar idea in A Practical Approach for SSA Based Register Allocation. (Thanks to Pavel Yudin for pointing this out).
The aim of a decompiler is to produce as readable as possible high-level
output (e.g. C or higher) from low-level source (e.g. assembler or
binary). A simple system can be got working quite quickly
e.g. produce C statements like "AX=AX+BX;" for assembly code like
Doing it better has almost limitless possibilities. Cifuentes' dcc was one of the first decompilers. The Decompilation Page of the program-transformation wiki gives good background (but may be not up-to-date). There's some some work of mine: Type-based decompilation to recover struct/union/pointer type information from assembler, and Wei-Ming Khoo's Rendezvous project and up-coming PhD thesis on "decompilation as search". An excellent commercial disassembler and decompiler comes from Hex-Rays.