Part III & MPhil Project Suggestions

These are some suggestions for Part III and MPhil projects within the Digital Technology Group. If you have an idea related to our group's research interests that isn't mentioned here, get in touch.



Systems projects

  1. Characterising asynchronous behavior in the Linux kernel

    Modern operating system kernels have moved towards a model where the preferred way of executing operations is non-blocking and asynchronous. Many applications use these asynchronous features (for example, those of the storage and network stacks of the Linux kernel) to achieve better scalability.

    However, this also means that irrespective of application constraints, the exact timing of I/O operations now depends on asynchronous operation schedulers in the kernel. Those schedulers (for example, the IO scheduler) are responsible for managing the system-wide multiplexing of resources. Because of this, it becomes difficult to predict and understand how different applications will interact with each other and whether the workloads they service are synergistic or antagonistic.

    In this project, you will look at quantifying the side effects of applications executing asynchronous operations on other applications running on the same physical machine. The purpose is to understand the variance introduced in the latency/response times by interacting workloads and to discover optimisation opportunities (either in applications themselves or in the scheduling of operations).

    In order to achieve your goal, you will:

    1. Use a measurement infrastructure called Resourceful and extend it to record asynchronous behavior.
    2. Instrument a number of applications to expose their use of asynchronous operations.
    3. Run experiments to understand the interactions/variations in response times of those applications when they run concurrently on the same machine.
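
    As a rough illustration of step 3, the sketch below compares response-time samples from a workload running alone with samples taken while it shares the machine with another workload. The function names and the idea of reporting ratios are our own assumptions for illustration; real samples would come from the measurement infrastructure, not from hand-written lists.

```python
import math
import statistics


def summarise(latencies_ms):
    """Return mean, standard deviation and 99th percentile of latency samples.

    Needs at least two samples, because statistics.stdev requires them.
    """
    ordered = sorted(latencies_ms)
    # Nearest-rank percentile: the smallest sample with >= 99% of samples at or below it.
    rank = max(1, math.ceil(0.99 * len(ordered)))
    return {
        "mean": statistics.mean(ordered),
        "stdev": statistics.stdev(ordered),
        "p99": ordered[rank - 1],
    }


def interference(solo_ms, colocated_ms):
    """Ratio of co-located to solo statistics (hypothetical metric).

    Ratios well above 1 suggest the co-located workload is antagonistic.
    """
    solo, shared = summarise(solo_ms), summarise(colocated_ms)
    return {
        key: (shared[key] / solo[key] if solo[key] else float("inf"))
        for key in solo
    }
```

    For example, `interference([1.0, 1.1, 0.9, 1.0], [1.0, 2.0, 3.0, 2.0])` reports a mean ratio of about 2, i.e. the mean response time doubles under co-location.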

    If you complete this first stage of the project, we will look at extending the experiments for Linux containers.

    Interested students should have basic Operating Systems knowledge. Experience with the Linux kernel is advantageous as is programming experience in C.

    Well executed, this project will result in a top-tier publication.

    Contact: Dr R. Sohan

  2. Soroban

    We have internally created a machine-learning technique for understanding performance interference. Currently we only use this to infer the performance overhead of virtualisation on a highly-contended machine. This project would pick another area where there is performance interference and apply our machine-learning approach to comprehend it.

    Well executed, this project will result in a top-tier publication.

    Contact: Dr R. Sohan

  3. Resourceful For the HPCS Clan

    High performance computing workloads multiplex jobs across a cluster of machines. When they do so, there is often performance interference between badly interacting workloads. This project would be a collaboration with the university’s high performance computing service to find out how to combine some existing Linux mechanisms (e.g. cgroups) with an internal tool for measuring fine-grained resource consumption in order to limit bad interactions.

    Well executed, this project will result in a top-tier publication.

    Contact: Dr R. Sohan

  4. Characterizing the execution of interactive workloads on wimpy-core (Calxeda) and Tile (Tilera) architectures

    The current server/data center space is dominated by traditional x86-64 architectures. However, the drive for improved efficiency and low energy consumption has created the space for alternative architectures to exist. Calxeda (ARM) and Tilera are two such architectures, with hardware available publicly (and accessible within the DTG).

    Because of their novelty, the performance of running existing applications and the opportunities for optimising them for those architectures are not fully known.

    You will have the opportunity of choosing one of the architectures and characterizing the execution of server applications on top of it, in direct comparison to x86-64. We will try to understand things like:

    • The performance of the network and I/O stacks in comparison to x86-64.
    • The deployment of kernel-level measurement tools (perf / kprobes) or our custom probes implementation (kamprobes) to understand practical architectural differences.
    • Optimal scheduling/placement of interrupts, I/O and CPU tasks.
    • The scope for deploying virtualisation (Xen, containers) on top of those architectures.

    Well executed, this project will result in a top-tier publication.

    Contact: Dr R. Sohan

  5. Continuing Kamprobes

    Kamprobes is a probing system for the Linux kernel that is over 100x faster than the current Linux kernel probing system. However, at the moment it has several limitations: it can only be applied to function calls and it works on Linux only. We would like someone interested in low-level internals (writing C and assembly) to research how to apply Kamprobes to a wider selection of instructions, and to connect it to existing instrumentation systems such as DTrace.

    Well executed, this project will result in a top-tier publication.

    Contact: Dr R. Sohan

  6. Specialised Shadow kernels

    Shadow kernels are an award-winning idea from the DTG that allow an operating system to be specialised for individual processes by having multiple copies of the instruction stream and remapping pages using a hypervisor. We have currently investigated one use case, probing. However, there are other forms of specialisation, such as profile-guided optimisation, whereby the kernel is specialised to work faster for a specific process. This project would implement (at least) one specialisation and evaluate it.

    Well executed, this project will result in a top-tier publication.

    Contact: Dr R. Sohan

  7. Linux Kernel specialisation for advanced, virtualisation-assisted sandboxing

    We propose a new way of enforcing the sandboxing of Linux applications based on a primitive we have developed, called shadow kernels. In order to deny access to particular kernel functionalities for a given application, one can present that application with a kernel image in which the memory pages containing the restricted features are zeroed-out.

    We have already implemented the basic mechanisms for creating different kernel text sections and switching between them, under the control of the hypervisor (Xen).

    You will need to use this primitive to implement sandboxing and show that, even given exploitable code (e.g. NULL pointer dereferences), the application still can't access restricted features.

    Well executed, this project will result in a top-tier publication.

    Contact: Dr R. Sohan

  8. Disaggregating the Xen Scheduler

    Hypervisors, such as Xen, currently have an inbuilt scheduler that decides which domain (virtual machine) should run on each core next, using coarse-grained metrics. Each virtual machine then has its own scheduler, which (sometimes) competes with the hypervisor’s. This project would investigate whether this is the best approach, or whether we should delegate the role of scheduling to a privileged domain (virtual machine). A delegated scheduler could use VM introspection to read the state of each virtual machine in order to better understand which domain to schedule next.

    Well executed, this project will result in a top-tier publication.

    Contact: Dr R. Sohan

  9. Deep sleeping virtual machines

    Over the last 10 years there has been a lot of work on mobile devices to reduce the number of CPU wakeups, since each wakeup drains the battery. Traditionally this hasn’t been a problem for server operating systems; however, with the increasing use of virtualisation that might change. Each time a virtual machine wakes up it must be given a share of the physical resources, reducing what is available to the other virtual machines hosted on the same machine. This project would investigate how often virtual machines wake up and attempt to reduce that by applying techniques used to prevent mobile devices from waking.

    Well executed, this project will result in a top-tier publication.

    Contact: Dr R. Sohan

  10. Rethinking The Phi Scheduler

    Xeon Phi Knights Landing is an HPC architecture to be released by Intel that has 60 wimpy cores and 240 hardware threads. The chip can be used as a ‘regular’ processor (i.e. it can boot a contemporary OS). This project would consider how OS schedulers should scale with the number of cores and threads available.

    Well executed, this project will result in a top-tier publication.

    Contact: Dr R. Sohan

  11. Containers on the Phi

    Xeon Phi Knights Landing is an HPC architecture to be released by Intel that has 60 wimpy cores and 240 hardware threads. The chip can be used as a ‘regular’ processor (i.e. it can boot a contemporary OS). This project would consider how appropriate the chip is for running containers. In particular, a key concern with containers is poor networking performance. One direction for this project is to consider whether the (very fast) memory interconnect can be used as a basis for a userspace memory stack that allows co-hosted containers to communicate over a high-speed network without entering the OS kernel.

    Well executed, this project will result in a top-tier publication.

    Contact: Dr R. Sohan

  12. Fine-grained lineage to Apache Spark

    In-memory processing frameworks such as Apache Spark are increasingly being adopted in industry due to their good performance for many applications. We plan to add fine-grained provenance support to Spark. Spark already uses coarse-grained lineage (instead of data replication) to achieve fault tolerance, by recomputing a lost data partition. However, Spark is not able to capture precise relationships between input and output because (1) its lineage is coarse-grained and (2) stateful data flow is not tracked. This project will augment Spark to capture fine-grained lineage that can be leveraged effectively for data audit and debugging use cases.
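
    To make the distinction concrete, the toy sketch below tracks lineage per record rather than per partition. It is plain Python and does not use or imitate the Spark API; the function names are hypothetical, and a real implementation would have to deal with partitioning, shuffles and state.

```python
def tag_inputs(values):
    """Pair each input value with a lineage set holding its own input index."""
    return [(value, frozenset([index])) for index, value in enumerate(values)]


def traced_map(fn, records):
    """Apply fn to each value; each output inherits the lineage of its input."""
    return [(fn(value), lineage) for value, lineage in records]


def traced_reduce_by_key(key_fn, combine, records):
    """Combine values that share a key.

    The lineage of each result is the union of the lineage of every record
    that contributed to it, so an output can be traced to exact inputs.
    """
    groups = {}
    for value, lineage in records:
        key = key_fn(value)
        if key in groups:
            acc_value, acc_lineage = groups[key]
            groups[key] = (combine(acc_value, value), acc_lineage | lineage)
        else:
            groups[key] = (value, lineage)
    return groups


# Word count where every count remembers which input records produced it.
words = tag_inputs(["a", "b", "a"])
pairs = traced_map(lambda word: (word, 1), words)
counts = traced_reduce_by_key(
    lambda pair: pair[0],
    lambda left, right: (left[0], left[1] + right[1]),
    pairs,
)
```

    Here `counts["a"]` carries the lineage set `{0, 2}`, which is the kind of input-to-output relationship an audit or debugging tool would query.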

    Well executed, this project will result in a top-tier publication.

    Contact: Dr R. Sohan

  13. Deep packet-level inspection using Hadoop

    Inspecting network packets is important in many applications such as root-cause analysis and intrusion detection. In order for these applications to scale with current data volumes, the task of packet-level inspection has to leverage distributed "big data" processing frameworks. In this project, we plan to investigate how to build deep packet inspection tools on top of Hadoop. These tools will enable "realtime" analysis of network traffic without requiring costly/specialised hardware.

    Well executed, this project will result in a top-tier publication.

    Contact: Dr R. Sohan

Indoor Smartphone and Person Tracking

  1. Bluetooth Low Energy Phone Tracking

    This project will look at positioning mobile phones using BLE beacons distributed around the environment. This is an active topic in industry at the moment, but the beacons are expensive and the phone platforms have only just begun to support BLE properly. We will create beacons using Raspberry Pis and cheap Bluetooth dongles. Unknowns include what power to beacon at, what update rate can be expected, what the continuous scan costs on the phone, whether we can infer good distance estimates and how easy it will be to spoof beacons and thereby cause havoc! The project has a high chance of international publication and further PhD work. Programming for Android and/or iOS needed, along with good Linux skills. Previous knowledge of Bluetooth (in any form) is valuable but not essential.
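
    One common starting point for the distance question is the log-distance path loss model, sketched below. The default calibration value (-59 dBm at 1 m) and the path loss exponent (2.0, i.e. free space) are assumptions chosen for illustration; both would need to be measured for real beacons and real buildings, and indoor multipath may make the estimates poor.

```python
import statistics


def estimate_distance_m(rssi_dbm, rssi_at_1m_dbm=-59.0, path_loss_exponent=2.0):
    """Estimate distance from one RSSI reading using the log-distance model.

    rssi_dbm = rssi_at_1m_dbm - 10 * path_loss_exponent * log10(distance_m)
    solved for distance_m.
    """
    return 10 ** ((rssi_at_1m_dbm - rssi_dbm) / (10.0 * path_loss_exponent))


def smoothed_distance_m(rssi_window_dbm, **model):
    """Take the median of a window of readings before converting.

    The median damps the occasional outlier that a single reading would
    turn into a large distance error.
    """
    return estimate_distance_m(statistics.median(rssi_window_dbm), **model)
```

    With these defaults a reading of -79 dBm maps to 10 m; raising the exponent to 3.0 (a more cluttered environment) shrinks the same reading to roughly 4.6 m, which shows how sensitive the estimate is to calibration.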

    Contact: R. Harle

  2. Bluetooth Low Energy Sensor Network

    This project will invert the typical Bluetooth tracking scenario by placing BLE beacons on the person rather than in the environment, and building a sensor network of Raspberry Pi BLE sensors. Beacons are small and last for years, so each person could conceivably carry several to aid positioning and to minimise body attenuation. The research aim will be to establish the capabilities of such a system in terms of range, accuracy, power consumption, update rate and maximum beacon numbers. The project has a high chance of international publication and further PhD work. Experience working with Raspberry Pis or Linux environments essential. Previous knowledge of Bluetooth (in any form) is valuable but not essential.

    Contact: R. Harle

  3. Smartphone Camera-based Movement Classification

    A key problem in Pedestrian Dead Reckoning is determining the direction of motion. It is hard to distinguish between back steps, side steps and forward steps. This project will look at repurposing smartphone cameras to estimate relative movement direction based on feature tracking applied to the ceiling or floor. Many optical flow-like algorithms exist that can be trialled. The important result is not just that the direction is correct, but also that the drain on the smartphone battery is minimised. This project will be carried out using the Android platform: some experience of programming for it will be necessary. The optical algorithms may be imported from e.g. OpenCV (which has an Android port), or written from scratch if preferable.

    Contact: R. Harle

  4. WiFi-IR Positioning

    The dominant indoor positioning approach is the use of WiFi fingerprints. However, these typically cannot locate a device unambiguously to a room, since WiFi penetrates walls. We have in the past developed an infrared-based location system (the Active Badge) that had people wear IR emitters and used IR receivers in each room. IR was very good at room localisation (IR does not penetrate walls) but we could not realistically install receivers everywhere. This project will look at exploiting the rising number of IR transmitters on recent smartphones (e.g. Galaxy S4) and simple networked IR receivers (built from e.g. Raspberry Pis or similar) to create a modern-day Active Badge that is fused with WiFi positioning data to create a more robust and ubiquitous tracking system. Experience of Android programming essential.
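
    One simple way the two sources might be fused is sketched below: a nearest-neighbour match in WiFi signal space, restricted to a single room whenever an IR receiver has reported one. The survey format, the function names and the -100 dBm placeholder for unheard access points are all assumptions made for illustration, not a description of an existing system.

```python
import math

MISSING_DBM = -100.0  # assumed RSSI for an access point that was not heard


def rssi_distance(observed, fingerprint):
    """Euclidean distance in signal space over the union of access points."""
    access_points = set(observed) | set(fingerprint)
    return math.sqrt(sum(
        (observed.get(ap, MISSING_DBM) - fingerprint.get(ap, MISSING_DBM)) ** 2
        for ap in access_points
    ))


def locate(observed, survey, ir_room=None):
    """Return the name of the best-matching survey point.

    survey maps point name -> (room, fingerprint), where fingerprint maps
    access point id -> RSSI in dBm. If ir_room is given, only survey points
    in that room are considered; if that room has no survey points we fall
    back to WiFi alone.
    """
    candidates = {
        name: fingerprint
        for name, (room, fingerprint) in survey.items()
        if ir_room is None or room == ir_room
    }
    if not candidates:
        candidates = {name: fp for name, (_room, fp) in survey.items()}
    return min(candidates, key=lambda name: rssi_distance(observed, candidates[name]))


# Two survey points on either side of a wall look almost identical to WiFi.
survey = {
    "office_desk": ("office", {"ap1": -50.0, "ap2": -70.0}),
    "corridor": ("corridor", {"ap1": -52.0, "ap2": -68.0}),
}
observed = {"ap1": -52.0, "ap2": -69.0}
```

    With WiFi alone, `locate(observed, survey)` picks the corridor; once an IR receiver in the office has seen the phone, `locate(observed, survey, ir_room="office")` resolves the ambiguity in favour of the office.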

    Contact: R. Harle

Programming language research to support physical science researchers

The Computer Lab is currently running a research project to apply programming language research to support programming in the sciences, via tools and languages (a slightly longer synopsis can be found here). As part of this project, we are investigating augmenting code with specifications to aid verification, program comprehension and construction, and improve bug analysis.

  1. Language agnostic analysis of programming patterns

    Kythe is a project originating from Google that tries to unite software support for programming across many programming languages and environments. This project involves using Kythe to build language-agnostic analyses of programming patterns. We'd like to be able to analyse code in a variety of different languages to see what similarities we can find.

    For more details on the proposal see here

    Contact: Andrew Rice

  2. Stencil access specifications for verifying numerical code

    Stencil computations are array-based transformations, where each element of an output array, at position i, is computed from a finite set of elements in the neighbourhood of position i in some input array(s) (e.g., convolution, the Game of Life). Some stencils are complicated, detailed, and dense (see for example, this stencil computation in a Navier-Stokes fluid simulator), and errors can easily be introduced by accidentally permuting indices, offsets, or arrays, or even by omitting particular indices.
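
    As a small illustration of the kind of error in question, the sketch below defines a three-point mean stencil, a variant with one mistyped offset, and a run-time check that compares the offsets a kernel actually reads against a declared set. All names here are hypothetical, and the check is dynamic and written in Python purely for brevity; it only illustrates what a specification could state, not how it would be expressed or verified.

```python
class RecordingArray:
    """Wraps a list and records every index that is read from it."""

    def __init__(self, data):
        self.data = data
        self.reads = []

    def __getitem__(self, index):
        self.reads.append(index)
        return self.data[index]


def smooth(a, i):
    """Three-point mean stencil: output i depends on inputs i-1, i and i+1."""
    return (a[i - 1] + a[i] + a[i + 1]) / 3.0


def buggy_smooth(a, i):
    """Same intent, but with a mistyped offset (i+2 instead of i+1)."""
    return (a[i - 1] + a[i] + a[i + 2]) / 3.0


def offsets_used(kernel, data, i):
    """Return the set of offsets, relative to i, that kernel reads at i."""
    recorder = RecordingArray(data)
    kernel(recorder, i)
    return {index - i for index in recorder.reads}


def conforms(kernel, declared_offsets, data, i):
    """True if the kernel reads exactly the declared offsets at position i."""
    return offsets_used(kernel, data, i) == set(declared_offsets)
```

    On `data = list(range(10))`, `conforms(smooth, {-1, 0, 1}, data, 5)` holds, while `buggy_smooth` fails the same check because it reads offset 2. Both kernels run without raising an error, which is why such mistakes are easy to miss.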

    The goal of this project is to design and implement a language of abstract stencil specifications, which can be attached to an existing general-purpose language, e.g. Fortran. These specifications will provide a guide to the programmer and a verification technique for the compiler.

    For more examples of why this might be useful and how it might work see here

    Contact: Dominic Orchard

Smart phone usage and energy consumption

  1. Energy consumption of web-service APIs

    It is common for smartphone apps to make requests to server-side APIs either to download information or to post notifications. Commonly this is done using XML-RPC over HTTP. However, it could well be expected that this carries a considerable energy overhead due to the use of a TCP connection, the addition of HTTP headers and the text-based encoding of information. This project seeks to measure the potential energy savings of different options such as more efficient encodings (e.g. Google's Protobuf) and the use of UDP.
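
    To give a feel for the encoding overhead alone, the sketch below encodes the same three values once as an XML-RPC request body and once in a fixed binary layout. The method name, the fields and the binary layout are made up for illustration, and Python's `struct` module stands in for a real binary encoding such as Protobuf. Payload size is only a proxy: the energy actually consumed also depends on radio state, connection set-up and headers, which is what the project would measure.

```python
import struct
import xmlrpc.client

# Hypothetical layout: 1-byte method code, 4-byte unsigned user id and two
# 8-byte doubles, in network byte order with no padding.
BINARY_LAYOUT = "!BIdd"


def xmlrpc_payload(method, user_id, latitude, longitude):
    """Encode the call as an XML-RPC request body (without HTTP headers)."""
    xml = xmlrpc.client.dumps((user_id, latitude, longitude), methodname=method)
    return xml.encode("utf-8")


def binary_payload(method_code, user_id, latitude, longitude):
    """Encode the same fields using the fixed binary layout above."""
    return struct.pack(BINARY_LAYOUT, method_code, user_id, latitude, longitude)


text_body = xmlrpc_payload("report_location", 42, 52.2, 0.09)
binary_body = binary_payload(1, 42, 52.2, 0.09)
size_ratio = len(text_body) / len(binary_body)
```

    The binary body is 21 bytes, and the XML-RPC body is several times larger before any HTTP headers are added.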

    Energy measurement hardware is available, as are Android phones for testing and equipment for building a controlled WiFi testbed.

    Interested students will need to demonstrate good programming ability and application development along with a good understanding of TCP, UDP, IP, WiFi and cellular networking.

    Contact: Andrew Rice

  2. Reality-based benchmarks

    There are a variety of benchmarking tools available for Android which can produce a performance score for a particular handset. However, these tests do not really reflect the needs of actual phone users. The idea of this project is to use the Device Analyzer dataset to come up with better benchmarks.

    The project will need to survey the various properties that current benchmarks are attempting to measure. Data analysis from Device Analyzer can then be used to work out how many users would be interested in these measurements and to see if there are more important properties to measure. We have a variety of phone handsets which can then be used for testing to see how they perform with the new designs.

    Interested students will need to demonstrate good programming ability and have an interest in systems measurement. It is expected that data analysis will require some sort of distributed processing system such as Hadoop running on the DTG cluster.

    Contact: Andrew Rice