Contact: Steven.Hand or Jon.Crowcroft (and cc: euan.harris@zeus.com)

TrafficScript type system

TrafficScript is the embedded scripting language in Zeus Traffic Manager, a high-performance Internet load balancer. It can manipulate connections passing through the Traffic Manager in a variety of ways: controlling routing decisions, rewriting web content, gathering statistics and so on. Our customers run hundreds of different TrafficScript rules on connections coming into their web sites, so it is critical that the language is fast - we don't want to slow down those requests - and safe - we don't want a rule to fail at run time and cause a connection to be aborted, especially if the error could have been detected and corrected at compile time.

TrafficScript has a Perl-like flavour and, in particular, it is dynamically typed. That makes it quick to write, but it means that some errors are only discovered at run time. The goal of this project is to analyze the TrafficScript language and add an optional type system which can detect common programming mistakes at compile time, preventing unfortunate errors at run time. We want to retain the dynamic flavour of the language, so the type system must be optional and must not require the use of type annotations, but it should be able to detect type errors and ideally give the compiler enough information to make aggressive optimizations which would not be possible in a purely dynamically-typed language.

We will provide an informal description of the syntax and semantics of the TrafficScript language. The first part of the project will be to implement a bytecode compiler for this language, and a virtual machine to run the bytecode. The virtual machine could be register- or stack-based - why not do both and compare them? The second part of the project will be to add type inference to the compiler and detect typing violations. You will probably find it helpful to write a formal definition of the language semantics for this part.

I/O focused Linux scheduler

The Linux kernel developers put a lot of effort into making the kernel's scheduling algorithm efficient and fair for a wide range of workloads, from single-processor laptops running interactive applications, through multi-processor database or e-mail servers, to supercomputers and large-scale HPC clusters. This inevitably leads to a trade-off between achieving optimum performance for a particular workload on the one hand, and generally good performance across a range of workloads on the other.

In contrast, a traffic manager has a very predictable workload. Typically there will be a small, fixed number of processes doing relatively little computation but a massive amount of I/O, particularly on the network. Interactive response times are not important. Given this workload profile, is it possible to write a new scheduler, or tune an existing one, to improve the performance of the traffic manager?

There are several ways to approach this project. For instance:

* Network I/O: look at interrupt handling issues and interactions with TCP offload on network cards.
* Scheduling: investigate wake-one semantics for a set of processes blocked in epoll_wait, avoiding the thundering herd problem (perhaps by making this an OS service, like process groups tied to particular resources/FDs) - see the sketch after this list.
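To make the epoll_wait point concrete, here is a minimal C sketch of the scenario, not part of the project description itself: several worker processes each register the same listening socket in their own epoll instance and block in epoll_wait. The port number, worker count and the EPOLLEXCLUSIVE flag (a Linux 4.5+ hint requesting wake-one behaviour) are illustrative assumptions; removing the flag reproduces the thundering herd, where every worker is woken for each new connection but only one accept() succeeds.

/*
 * Minimal sketch of the thundering herd scenario: NWORKERS processes each
 * create their own epoll instance, register the same listening socket and
 * block in epoll_wait().  Most error handling is omitted for brevity.
 * EPOLLEXCLUSIVE (Linux 4.5+) asks the kernel to wake only one waiter;
 * remove it to watch every worker wake up for each connection.
 */
#include <fcntl.h>
#include <netinet/in.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/epoll.h>
#include <sys/socket.h>
#include <unistd.h>

#define NWORKERS 4
#define PORT     8080   /* arbitrary port for the sketch */

static void worker(int listen_fd)
{
    int epfd = epoll_create1(0);
    struct epoll_event ev = { 0 };

    ev.events = EPOLLIN | EPOLLEXCLUSIVE;  /* wake-one hint; try plain EPOLLIN too */
    ev.data.fd = listen_fd;
    if (epfd < 0 || epoll_ctl(epfd, EPOLL_CTL_ADD, listen_fd, &ev) < 0) {
        perror("epoll setup");
        exit(1);
    }

    for (;;) {
        struct epoll_event out;
        if (epoll_wait(epfd, &out, 1, -1) <= 0)
            continue;

        /* Every wake-up that loses the race to accept() is wasted work. */
        int conn = accept(listen_fd, NULL, NULL);
        if (conn >= 0) {
            printf("pid %d accepted a connection\n", (int)getpid());
            close(conn);
        } else {
            printf("pid %d woke but lost the race\n", (int)getpid());
        }
    }
}

int main(void)
{
    int listen_fd = socket(AF_INET, SOCK_STREAM, 0);
    struct sockaddr_in addr = { 0 };

    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_ANY);
    addr.sin_port = htons(PORT);
    bind(listen_fd, (struct sockaddr *)&addr, sizeof(addr));
    listen(listen_fd, 128);
    fcntl(listen_fd, F_SETFL, O_NONBLOCK);  /* so a losing accept() returns EAGAIN */

    for (int i = 0; i < NWORKERS; i++)
        if (fork() == 0)
            worker(listen_fd);

    pause();  /* parent idles; kill the process group to stop the demo */
    return 0;
}

One simple measurement to start from is the ratio of wake-ups to successful accepts under load, with and without the wake-one flag.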
Evaluation of load balancing algorithms

Most traffic managers provide several different load balancing algorithms to spread incoming connections over back-end servers. These range from basic round-robin to more sophisticated algorithms which take factors like back-end server load and response time into account. These algorithms are often benchmarked against heavy, worst-case workloads. The purpose of this project is to benchmark a selection of load balancing algorithms against real-world network trace data. Which algorithm performs best in the ordinary case?

Having evaluated the existing load balancing algorithms, the second part of the project is to create a new algorithm - perhaps a blend of the existing approaches - which is optimized for the common-case workload.
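As a rough illustration of the gap between the simplest policy and a slightly smarter one, the C sketch below compares round-robin with least-connections selection. The node names and connection counts are invented for the example; a real evaluation would drive the same selection functions from recorded trace data rather than a hard-coded pool.

/*
 * Toy comparison of two load-balancing policies mentioned above:
 * plain round-robin and least-connections.  The pool contents are
 * invented; a benchmark would replay real trace data instead.
 */
#include <stdio.h>

struct node {
    const char *name;
    int active_conns;   /* connections currently open to this back end */
};

/* Round-robin: ignore server state, just cycle through the pool. */
static int pick_round_robin(int pool_size)
{
    static int next = 0;
    int choice = next;
    next = (next + 1) % pool_size;
    return choice;
}

/* Least-connections: pick the back end with the fewest active connections. */
static int pick_least_connections(const struct node *pool, int pool_size)
{
    int best = 0;
    for (int i = 1; i < pool_size; i++)
        if (pool[i].active_conns < pool[best].active_conns)
            best = i;
    return best;
}

int main(void)
{
    struct node pool[] = {
        { "web1", 12 }, { "web2", 3 }, { "web3", 7 },
    };
    int n = sizeof(pool) / sizeof(pool[0]);

    for (int req = 0; req < 5; req++) {
        int rr = pick_round_robin(n);
        int lc = pick_least_connections(pool, n);
        printf("request %d: round-robin -> %s, least-connections -> %s\n",
               req, pool[rr].name, pool[lc].name);
        pool[lc].active_conns++;   /* assume the least-connections choice was taken */
    }
    return 0;
}

The output shows round-robin cycling through the pool regardless of load, while least-connections keeps steering new requests to the lightest back end; the interesting question for the project is how much that difference matters on ordinary, real-world traffic.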