Computer Laboratory Home Page Search A-Z Directory Help
University of Cambridge Home Computer Laboratory
Student Projects (2002)
Computer Laboratory > Research > Systems Research Group > NetOS > Student Projects (2002)

Suggestions from Jon Crowcroft

  • Please see here for suggestions on signalling protocols, instant messaging, graphical routing, Java packet filters, multicast in Java, virtual weather studio, network simulation, remote office, watermarking, multicast routing, 3G networks, the eternity service, jig-saw creation, web-to-speach, GUIs for collaborative tools, virtual departments, multi-user distributed games, distributed/replicated source control, TCP optimisation, application-level multicast, visual wireless communication, DCCP, mobile-phones as smart cards for 1-time login.

System modelling projects

  • Running a combinatorial auction
    See here for further details. Contact: Richard Gibbens

  • Performance modelling tool for enterprise IP networks
    See here for further details. Contact: Richard Gibbens

File system projects

  • Querying File System
    Contemporary filing systems map names to files by using a hierarchical lookup scheme built from nested directories. This allows a relatively efficient name resolution algorithm along with considerable flexibility for the user. Search paths or naming conventions may be used to prefer a given version of a file. In many cases, however, users would prefer a scheme which allows them to retrieve or specify files based on queries, boolean expressions relating to meta-data such modification date, owner, type, keywords, etc. Such operations may return 0, 1, or many files. This project will investigate suitable schemes for both on-disk and in-core data structures to support a query filing system (QFS). A prototype implementation will be developed at user-space level ( e.g. under Linux using the UserFS library). As a possible extension, information about queries could be used to provide "hints" to the buffer cache and prefetching scheme and so improve performance. Note: this project has been done successfully a couple of times before; interested students may like to look at the part II project dissertations in the library for accelerated background reading. Contact: Steven Hand

  • Meta NFS
    The Systems Group has 10's of NFS file servers. Currently, AutoMount Daemon (amd) is used to enable any of our workstations to mount and access any of the servers. It would be much nicer if we had a replacement for AMD that made the myriad of servers appear as though it was one monster file system. It would automatically balance the placement of files among the servers so the user no longer has to worry about which file server has free space etc. It would mean that we could have a sensible directory hierarchy and naming scheme that wasn't forced to reflect where the data is actually located, as it does now.

    This project requires plenty of good distributed systems stuff (e.g. electing a master, atomic updates of meta-filesystem state etc.), and doesn't require any kernel hacking -- it can all be done in user space. It would also be really useful thing to have. Contact: Ian Pratt

Programming language projects

  • Easy lock-free programming
    Suppose that your usual programming language was extended to allow statements such as
        atomically {
           o.method1(x, y, z);
           p.method2(a, b, c);
    and that the language run-time system would ensure that concurrent executions of such code occur as-if atomically. This project aims to build a limited version of this construct based on our research on lock-free data structures. Contact: Tim Harris

  • Automatic debugging
    A recent paper from Stanford suggests trying to automatically find bugs in complex software systems by identifying parts of the code that violate common coding patterns. This project involves implementing a similar system, possibly based on Java bytecode and operating even without the source code to the program being debugged. We have several substantial Java programs (some written by me and inevitably bug-ridden!) and open-source libraries are available for converting class files into intermediate data structures. Contact: Tim Harris

  • Pervasive debugging
    A pervasive debugger provides the user with complete control over the application being debugged by executing it in a simulated environment. This can allow deterministic re-execution (and reverse execution) of multi-threaded programs and may aid debugging complex low-level aspects of programs' behaviour, e.g. memory access re-ordering on an SMP machine. We have an initial proof-of-concept implementation put together in a fortnight for a workshop demo. This project will produce a more thorough and well-engineered implementation. Contact: Tim Harris

  • Managing elastic structures
    Applications that are elastic in their memory requirements can operate with various amounts of memory, probably performing better as more is allocated to them. For example a program may maintain various caches from which data can be discarded if memory is tight. The basic JVM provides only crude information about memory usage. This project proposes extending a JVM implementation so that the garbage collector automatically tracks the amount of memory that various data structures consume. Contact: Tim Harris

Networking projects

  • Peer-to-peer routing overlay using skip lists
    Many recent peer-to-peer applications use an overlay routing substrate such as Tapestry or Chord to perform distributed object location and message passing services. An overlay should offer efficient key lookup, and manage nodes entering and leaving the network ("churn") by repairing routing table entries at each node. Most existing structured routing substrates including the above three are based on a distributed hashtable (DHT), where each node is responsible for managing a portion of a keyspace, like a bucket in a hash table. The aim of this project is to design and implement a peer-to-peer routing substrate using skip lists, a probabilistic data structure based on layered linked lists. Because of the "best effort" nature of pointers in higher lists, the overhead of managing churn may be substantially reduced. See Ian Pratt or Tim Moreton.

  • Receiver Layered Multicast
    Network audio and video require reasonably high bandwidth and relatively low jitter (variation in delay). These requirements make it difficult to deliver high fidelity data directly from the network. However these media types may be encoded in a layered format; i.e. as a number of layers 1, 2, 3, ... n. The receiver may choose to decode any prefix of the sequence, obtaining increased fidelity in each case. It is possible to distribute audio and video in a layered format where each of n layers is given a different multicast "channel". Receivers may then decide to receive the first k of these, and to decode (and render) the first j <= k of these in order to maintain real-time rates and/or to match the fidelity to the characteristics of the receiving terminal. This project would choose or construct a layered encoding scheme for either video or audio (or both), and develop transmitting and receiving software. As a first step, broadcast on a local ethernet could be used in the place of multicast, although it would be nicer (and more valid) to use multiple networks and a "proper" multicast protocol. Contact: Steven Hand

  • Distributed low-latency DNS
    Two (so far) projects have attempted to implement DNS over a peer-to-peer overlay network, most notably DDNS over Chord. DNS is a good candidate for a peer-to-peer application in terms of its requirements for availability, load balancing, and widespread caching, and the desire to minimise the administration needed to maintain it. However, the conclusions of this paper make interesting reading. Could you do better, and reduce the latency to the levels of conventional DNS? An implementation of Pastry or Tapestry will be available. See Ian Pratt or Tim Moreton.

  • Predicate Routing
    In current IP networks, the routing state of the system is primarily represented as a set of routing tables (local to each router) and a set of filtering rules (also local to each router or firewall). In contrast, Predicate Routing represents the state of the network as a set of boolean expressions associated with links which assert which kinds of packet can appear where. From these expressions, routing tables and filter rules can be derived automatically. Conversely, the consequences of a change in network state can be calculated for any point in the network (link, router, or end system), and predicates derived from known configuration state of routers and links. This subsumes notions of both routing and firewalling. This project would involve designing and implementing a network control plane for predicate routing, probably in centralised manner. An extension would be to develop an inference tool to alllow one to derive properties of the network. A more ambitious project is to attempt to build a distributed version either in simulation (for example using SSF), or by packaging up a real implementation (e.g. Zebra). Contact: Steven Hand

  • Fun with wireless networking equipment and PDAs
    This isn't so much a project description, as note to say that the SRG has a number of GSM/GPRS-capable phones and datacards, Wireless LAN cards, and iPAQ PDAs that could be used in the execution of part II projects, providing you can come up with a cool application or service to use them for. We can provide help with configuring the kit, and hosting the server-side component of any applications. Contact: Ian Pratt

  • Infrastructure health monitoring
    The Lab comprises a complex distributed system, consisting of over a hundred machines, all running many complex software services, all interconnected by tens of switches and routers. Unsurprisingly, things don't always work all the time... It would be very handy to develop a system to monitor the health of all the machines, their services and the network, and when things go wrong, try and figure out what corrective action to take.

    The project would consist of two parts: the first is quite straight forward, developing a few tools to probe and monitor the state of individual components (e.g. network topology, whether a service is currently running etc). The second part is more fun: developing an engine to receive the streams of data from the first stage, and enable queries over the collected data. Using prolog might be good for this. By doing queries over this data it should be possible to come up with a prediction as to which component has failed. For example, if all the machines attached to a particular network switch suddenly become un-pingable, there's a good chance that the switch has failed rather than all of the hosts failing simultaneously. Contact: Ian Pratt

Hardware projects

  • Fun with solid-state accelerometers
    It's now possible to buy miniature 2 axis accelerometers that output very accurate acceleration data in digital form. This can be integrated to determine relative velocity or even position. This project would involve building some simple hardware (perhaps interfacing to a serial line on a PDA), and writing software to use the acceleration data. For example, you could do a gesture based UI, auto-updating map, magic pen, tracking system for a radio controlled car etc... Contact: Ian Pratt

  • Ethernet to IR transceiver
    This project would require construction of a simple bit of hardware, probably using a micro-controller with integrated Ethernet interface. The device would enable over-the-network control of consumer appliances such as TVs and videos. The second part of the project would be to develop XML based control interfaces for the devices, and write useful applications that manipulate them (e.g. perhaps to provide TiVo like functionality on a VCR). This could potentially be done using the Iota XML programming language developed in the Lab. Contact: Ian Pratt

  • Software AM radio / Location using AM radio signals
    High-speed analog to digital converters now mean it is possible to digitize the entire AM radio band, then performing the 'tuning' in software on a PC. The first phase of this project is to assemble the hardware and software to build such a "software" radio (we have a high-speed ADC PCI card that could be used). One direction the project could take would be to use AM radio signals for location: AM radio stations emmit a very stable carrier frequency from a fixed location (the transmitter). By using a digital radio to monitor phase changes in the carrier frequency, it should be possible to tell whether a mobile device has moved towards, or away from, the transmitter. By tuning into several AM stations simultaneously it should be possible to recover accurate position information.

    This project could take two forms: A student interested in hardware could build the digital radio, or perhaps use a standard PCI card with high-speed ADC. Alternatively, the work could be done entirely in simultaion. Either way, there's loads of cool DSP stuff to be done. Contact: Ian Pratt

Security projects

  • Secure packets
    The goal of this project is to design and build a proof of concept implementation of a networking protocol that would enable "opportunistic encryption" to be utilized whenever two Internet hosts wished to communicate. The aim is to achieve this in as efficient a way as possible. For example, the payload data contained first few packets of a new flow would be encrypted using a public key cipher while a key exchange takes place piggy-backed on top of the data packets to enable a symmetric cipher to be used for the remainder of the flow. Multiple TCP and UDP connections between the same hosts could make use of the same tunnel, which would be timed-out after being idle for some period.

    A key extension to the project would be to prevent a class of denial of service attack whereby a host or hosts bombard a receiver with flow-setup packets that it needs to decipher using an expensive PK algorithm. One way to try and avoid this problem would be to use "crypto puzzles", whereby the sender needs to do a significant amount of computation in order to generate a ticket that is attached to the packet. The puzzle is asymmetric in that the receiver is able to easily check whether the ticket is valid for this particular packet. Hence, it's much harder to create a flood of packets that overloads the receiver. Contact: Ian Pratt