The Systems Research Group
Networks and Operating Systems
Part II/Diploma Project Suggestions, 1998-1999
Compiler/Language Related Projects
bdb -- Backwards Debugger
Using traditional debuggers to find ``where things go wrong'' can be frustrating
-- you must try to set a breakpoint just before the problem occurs and
then single step forwards until the problem is revealed. A more natural
way of working could be to set a breakpoint which is triggered when the
problem manifests itself and then to step backwards to reveal why execution
has taken that path.
This is non-trivial because many operations ``lose'' information (register
values/program counter/flags/etc) and so instructions are not generally
reversible. A ``backwards debugger'' would instrument the program being
debugged so that a ``step backwards'' operation is possible (at least a
few times in succession). It would need cunning data structures to hold
the ``lost'' information efficiently and (potentially) cunning optimizations
to reduce the time/space required to create this log.
Bytecode Rewriting
Implement some bytecode -> bytecode translations, probably using Java bytecode
and an existing implementation of the JVM. Some ideas are:
-
Feedback-directed inlining of method calls (maybe based on the output of
``java-prof'').
-
Antistatic for Java (removing static fields on classes) to allow multiple
applications to be executed within the same (unmodified) JVM.
-
General optimisation things (e.g. rearranging the order of local variables
so that the most frequently used ones are assigned to slot 0/1/2 (which
have specialized load/store operations)). /cite{"Optimizing Java - Theory
and practice" [Budimlic & Kennedy, Rice University]}
-
Run-time specialization of bytecode. For example creating an optimized
version of a method when either the value or more specific type information
of one of the arguments is known.
A significant chunk of this project is likely to involve developing an
extension to the existing reflection features in Java. At present
these allow code to determine information about classes and method signatures,
but do not provide ways of inspecting, modifying and re-loading classes
in an efficient and elegant manner.
JIT Compiler for Z80/6502/6510
Integrate a compiler to an existing Spectrum/C64/Amstrad CPC/Amiga emulator.
Particular things to look at might be:
-
Efficiently handling 8/16 bit values on a 32 bit processor. Maybe
combining several Z80/6502/6510 instructions into a single one.
-
Deciding when and how much to compile (the obvious approach of compiling
all 64k of code wouldn't work because the Z80/6502/6510 instruction sets
have variable length instructions and the compiler can't easily distinguish
programs from data).
``Anti-Static''
Nemesis is a single-address-space operating system. It was designed this
way to facilitate sharing of code and data. Unfortunately this presents
a problem for code generation: code that is to be shared may not access
variables through absolute memory locations.
As an example, consider the following C program:
#include <stdio.h>
static int c = 5;
static void greet(void)
{
printf("Foo!\n");
c--;
}
int main(int argc, char **argv)
{
while (c) {
greet();
}
exit(0);
}
On many architectures, the code generated to access the variable `c' will
be an instruction that reads or writes a fixed memory location. To write
shareable code for Nemesis we currently avoid all variable declarations
that result in the program having a data or bss section. An example:
#include <stdio.h>
struct hello_st {
int c;
};
static void greet(struct hello_st *st)
{
printf("Bar!\n");
st->c--;
}
int main(int argc, char **argv)
{
struct hello_st *st;
st=malloc(sizeof(*st));
if (!st) exit(1);
st->c=5;
while (st->c) {
greet(st);
}
exit(0);
}
This is easy in this trivial example, but much harder for practical programs.
We propose two approaches:
-
Modify the code generator of a compiler like egcs
to output code that always references state relative to an implicit `state
parameter' that is passed to every function.
-
Write a program that parses code such as the first example and outputs
code similar to that in the second example. This program would probably
have to sit between the preprocessor and the compiler to be able to cope
with complicated use of macros. Such a program may also have other applications.
If you're interested in the first approach talk to Stephen Early. If you're
interested in the second approach talk to Dickon Reed.
GCC/EGCS Front Ends
Nemesis coding is carried out in C in a highly stylised manner, and using
a large number of macros. This has worked so far, but can be cumbersome,
and somewhat tricky to control, particularly given the lack of typechecking
in C (and C macros), compared with the Nemesis type-system, the lack of
native support for the richer features of the Nemesis type-system, the
lack of exceptions, and so on. Given these problems, it would seem sensible
to design a language to cope with them. Two such projects are therefore
suggested:
-
Write a front-end for GCC or EGCS
to parse Nemesis style C (that is, C, but with types, method invocations,
exceptions, etc), or
-
Write a front-end for GCC or EGCS
to parse CLUless, a previously designed programming language for Nemesis
(`C and CLU smashed together'). A compiler for this already exists, and
was the subject of successful projects in two past years.
Project enquiries to:
Austin Donnelly, Cambridge Computer Laboratory, Austin.Donnelly@cl.cam.ac.uk
Stephen Early, Cambridge Computer Laboratory, Stephen.Early@cl.cam.ac.uk
Dickon Reed, Cambridge Computer Laboratory, Dickon.Reed@cl.cam.ac.uk
HTML gripes to:
Richard Mortier, Cambridge Computer Laboratory, Richard.Mortier@cl.cam.ac.uk
Last updated: $Date: 1998/09/10 15:14:43 $