Department of Computer Science and Technology

Systems Research Group – NetOS

Student Projects (2015–2016)

NetOS

This page collects together various Part II project suggestions from the Network and Operating Systems part of the Systems Research Group. In all cases there is a contact e-mail address given; please get in touch if you want more information about the project.

Under construction: Please keep checking back, as more ideas will hopefully be added to this page during the coming weeks.

Note: there are a number of stumbling blocks in Part II project selection, proposal generation and execution. Some useful guidance from a CST alumnus (and current NetOS PhD student) is here.

Current project suggestions

TripIt! for Papers

Contact: Richard Mortier

Recently introduced rules from the UK Research Councils require that all published academic outputs be made available as open access within 3 months of publication, typically via institutional repositories. Unfortunately, most institutional repository submission processes are rather cumbersome, involve extensive human interaction, and are prone to being forgotten or delayed, potentially rendering an academic output inadmissible for the next Research Excellence Framework (REF) exercise.

TripIt is a rather useful service that allows a traveller to manage travel plans by forwarding email confirmations, tickets, receipts, etc. to an email address. Upon receipt, the TripIt service aggregates and parses the information concerned into a sequence of "trips" which can then be exported as (e.g.) a Google Calendar. This makes it quite straightforward to have trip details appear in one's calendar without needing to go through the tedious and error-prone process of re-entering all details manually.

This project will design and build a service that provides TripIt-like interaction for tracking academic outputs. Elaborating details of the workflow forms part of the project, with care needing to be paid to the requirements for REF eligibility as well as common academic working practices for conference and journal publication. The service can be implemented using any appropriate tools, though implementation as a MirageOS unikernel using the Irmin storage backend would be particularly welcome.
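As a rough illustration of the first step in such a workflow, the Python sketch below parses a forwarded plain-text email, pulls out anything that looks like a DOI, and computes a deposit deadline. Everything here is an assumption made for the example: the function names, the DOI pattern, the sample message (with a made-up DOI), and the reading of "3 months" as three calendar months. The actual REF rules would need to be checked as part of the project.

```python
import calendar
import re
from datetime import date
from email import message_from_string

# Hypothetical pattern: "10.", a registrant code, a slash, then a suffix.
DOI_RE = re.compile(r"\b10\.\d{4,9}/[^\s\"<>]+")


def add_months(d, months):
    """Return d shifted by `months`, clamping the day to the month's end."""
    index = d.month - 1 + months
    year, month = d.year + index // 12, index % 12 + 1
    day = min(d.day, calendar.monthrange(year, month)[1])
    return date(year, month, day)


def parse_forwarded(raw):
    """Extract the subject and any DOI-like strings from a plain-text email."""
    msg = message_from_string(raw)
    body = msg.get_payload()
    # Strip trailing punctuation that belongs to the sentence, not the DOI.
    dois = [m.rstrip(".,;)") for m in DOI_RE.findall(body)]
    return {"subject": msg["Subject"], "dois": dois}


def deposit_deadline(published):
    """Deposit deadline, assuming a flat three calendar months."""
    return add_months(published, 3)


raw = (
    "From: author@example.org\n"
    "Subject: Fwd: Your paper has been published\n"
    "\n"
    "Your article is now online at https://doi.org/10.1000/example.123.\n"
)
record = parse_forwarded(raw)
print(record["dois"], deposit_deadline(date(2015, 11, 30)))
```

A real service would also need to handle multipart and HTML mail, and publishers that do not mention a DOI at all.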


Fun with Bus Data

Contact: Richard Mortier

We have access to a near-realtime feed of bus data for Cambridge public buses, which presents several opportunities for a range of projects. The following two are of particular interest, but others may also be considered.

  1. At present, data is received via POST to a webserver, which then both archives and serves it. This project would design and build a scalable custom webserver to perform these tasks using MirageOS as one or more microservices. This would make use of Irmin to store and retrieve the data, and would then present a range of computed views of the data via URLs against which applications and visualisations might be developed. Extensions might include using Jitsu to auto-scale components in response to load, exploring how to manage long-term archival of data in an "append-only" store such as Irmin, and many others.
  2. The data is currently being simply visualised using Google Maps and similar mapping services. This project would explore the many more advanced data visualisations and analyses that could be carried out. Starting points include implementation of online visualisations such as for the MBTA metro dataset, or analyses as published on an earlier version of these data by Bacon et al (2011). "Using Real-Time Road Traffic Data to Evaluate Congestion". LNCS 6875:93-117.
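To make the first option a little more concrete, here is a small Python sketch of an append-only archive with two computed views layered on top. It is only a stand-in for the idea: the record fields (`vehicle`, `route`, `ts`, `lat`, `lon`) are hypothetical and do not describe the real feed, and the in-memory list takes the place of a store such as Irmin.

```python
import json


class AppendOnlyLog:
    """In-memory stand-in for an append-only store: records are never rewritten."""

    def __init__(self):
        self._lines = []

    def append(self, record):
        self._lines.append(json.dumps(record, sort_keys=True))
        return len(self._lines) - 1  # offset of the new record

    def scan(self, start=0):
        for line in self._lines[start:]:
            yield json.loads(line)


def latest_positions(log):
    """Computed view: the most recent observation for each vehicle."""
    latest = {}
    for rec in log.scan():
        seen = latest.get(rec["vehicle"])
        if seen is None or rec["ts"] >= seen["ts"]:
            latest[rec["vehicle"]] = rec
    return latest


def vehicles_on_route(log, route):
    """Computed view: every vehicle ever observed on a given route."""
    return sorted({rec["vehicle"] for rec in log.scan() if rec["route"] == route})


log = AppendOnlyLog()
log.append({"vehicle": "bus-1", "route": "A", "ts": 100, "lat": 52.205, "lon": 0.119})
log.append({"vehicle": "bus-2", "route": "B", "ts": 105, "lat": 52.210, "lon": 0.125})
log.append({"vehicle": "bus-1", "route": "A", "ts": 130, "lat": 52.207, "lon": 0.121})
views = latest_positions(log)
print(views["bus-1"]["ts"], vehicles_on_route(log, "A"))
```

Each view could sit behind its own URL. Because the views are recomputed from the log rather than stored, the archive itself only ever grows.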

Packet Capture on MirageOS

Contact: Richard Mortier

PCAP is a simple packet capture format, emitted by many standard tools such as tcpdump and wireshark. The commonly-used implementation of PCAP is the libpcap library which is, unfortunately, a common source of zero-day exploits at events such as DefCon. Rudimentary support for capturing and parsing PCAP format dumps already exists for MirageOS. This project would rationalise and extend this existing support to produce a working packet capture unikernel able to store captures to a MirageOS block device such as the recently introduced Archive format (where a tar file is mounted as a block device). The focus of the project will be system-level support for secure and efficient operation, able to capture large traces over long periods of time. Extensions might include playback of previously captured traces, with control over playback parameters (e.g., timing, packet metadata).
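For orientation, the classic PCAP layout is a 24-byte global header followed by a 16-byte header per captured packet. The Python sketch below writes and reads that layout with explicit bounds checks. It is an illustration of the format only, not the existing MirageOS code. The function names are made up for this example, and rejecting records longer than the snapshot length is a defensive choice rather than something the format strictly requires.

```python
import struct

MAGIC_USEC = 0xA1B2C3D4   # classic pcap with microsecond timestamps
GLOBAL_HDR = "IHHiIII"    # magic, major, minor, thiszone, sigfigs, snaplen, linktype
RECORD_HDR = "IIII"       # ts_sec, ts_usec, incl_len, orig_len


def build_pcap(packets, snaplen=65535, linktype=1):
    """Serialise (ts_sec, ts_usec, payload) tuples; linktype 1 is Ethernet."""
    out = struct.pack("<" + GLOBAL_HDR, MAGIC_USEC, 2, 4, 0, 0, snaplen, linktype)
    for ts_sec, ts_usec, payload in packets:
        out += struct.pack("<" + RECORD_HDR, ts_sec, ts_usec, len(payload), len(payload))
        out += payload
    return out


def parse_pcap(data):
    """Parse a classic pcap byte string, refusing any out-of-bounds length."""
    if len(data) < 24:
        raise ValueError("truncated global header")
    (magic,) = struct.unpack_from("<I", data, 0)
    if magic == MAGIC_USEC:
        endian = "<"
    elif magic == 0xD4C3B2A1:   # same magic written by a big-endian host
        endian = ">"
    else:
        raise ValueError("not a classic pcap capture")
    _, major, minor, _, _, snaplen, linktype = struct.unpack_from(endian + GLOBAL_HDR, data, 0)
    records, offset = [], 24
    while offset < len(data):
        if offset + 16 > len(data):
            raise ValueError("truncated record header")
        ts_sec, ts_usec, incl_len, orig_len = struct.unpack_from(endian + RECORD_HDR, data, offset)
        offset += 16
        if incl_len > snaplen or offset + incl_len > len(data):
            raise ValueError("record length out of bounds")
        records.append((ts_sec, ts_usec, orig_len, data[offset:offset + incl_len]))
        offset += incl_len
    return {"version": (major, minor), "snaplen": snaplen,
            "linktype": linktype, "records": records}


capture = build_pcap([(1, 500, b"\x01\x02\x03"), (2, 0, b"hello")])
parsed = parse_pcap(capture)
print(parsed["version"], len(parsed["records"]))
```

The length checks are the part that matters for secure operation: every length read from the capture is validated before it is used to slice the buffer.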

Pre-requisites: Familiarity with network packet capture. Existing competence programming in OCaml, or familiarity with another functional programming language.


MSMPTCP


Contact: Jon Crowcroft

Keywords: TCP, Multipath, Network Coding, Resource Pooling


MSMPTCP: Multisource MultiPath TCP using Random Linear Codes...

A lot of data out there in the Internet is replicated in many places (the most popular items on most streaming services are replicated many-fold, both within each data center and across many data centers, whether Netflix, Akamai, YouTube, or whatever).

Current tools load balance at the single request level, and current protocols are "point-to-point" (leaving aside one-to-many multicast for now). It has been shown that resource pooling, by striping data over multiple paths, can provide much simpler and often near-optimal load balancing. Multipath TCP (MPTCP) came out of this observation, and one paper also points out that BitTorrent, which provides a swarm (sources data from multiple servers), also helps on the other side of the equation.

In this work, we want to combine multiple sourcing, like BitTorrent/swarms, multiple paths, like MPTCP, with a third technique called network coding, which spreads information over both packets and flows, to maximise the degrees of freedom that resource pooling can enjoy.

The idea can be seen like this:

MPTCP has a single source and sink, but spreads packets over multiple paths. MSMPTCP adds degrees of freedom by sourcing packets from multiple servers. So we have many many-to-one flows, dispersed over multiple paths, and we can then use intermediaries (proxies, middleboxes, etc.) to network code flows together (cf. also COPE or MORE in the mesh wifi/roofnet work), as we now have more degrees of freedom to do this coding.

One way to view the new protocol is like MPTCP cut in half (plus coding):

so MPTCP

    /..........\
src -..........-dst
    \........../

thus MSMPTCP
        src1...\
        src2...-dst
        src3.../

    

The rules for rates on the subflows can be the same.

Note that the ... above contains switches/routers/proxies/middleboxes, which can implement recoding of packets from the same or different flows.

The goal here is not to replace ARQ (the TCP ack/retransmit mechanism) for reliability, but to gain more resource pooling through sharing of flows/paths, and thus to cope with (or rather, remove) traffic imbalances and therefore reduce hot spots in the network. In a data center (a classic deployment opportunity for MPTCP), MSMPTCP could use middleboxes (switches with network processors or even network function virtualisation) to do coding and recoding of packets from flows.

Key ideas from existing MIT work:

  1. use random linear codes (not fountain codes), because they can do anything fountain codes can do but not vice versa
  2. coding/decoding cost isn't a problem
  3. in a dynamic scenario a structured code won't work, because nodes can't be added or deleted
  4. sequence numbers/ack numbers now use the notion of information and degrees of freedom rather than byte sequence space
  5. acks have to be "multicast" from dst to srcs
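The following Python sketch shows the flavour of random linear coding with several sources and one destination. It is a simplification made for illustration, not the scheme from the MIT papers: it works over GF(2), so "linear combination" is just XOR, whereas practical systems typically use a larger field, and packets are modelled as integers. The names `encode` and `Decoder` are made up for the example. The decoder's rank is the number of degrees of freedom received so far, which is the quantity an ack would carry in place of a byte sequence number.

```python
import random


def encode(sources, rng):
    """Return (coeffs, payload): a random non-zero GF(2) combination of sources."""
    k = len(sources)
    coeffs = rng.randrange(1, 1 << k)      # bit i set means sources[i] is included
    payload = 0
    for i, src in enumerate(sources):
        if (coeffs >> i) & 1:
            payload ^= src
    return coeffs, payload


class Decoder:
    """Incremental Gaussian elimination over GF(2), kept in reduced form."""

    def __init__(self, k):
        self.k = k
        self.rows = {}                     # pivot bit -> (coeffs, payload)

    @property
    def rank(self):
        return len(self.rows)              # degrees of freedom seen so far

    def receive(self, coeffs, payload):
        """Add a coded packet; return True if it was innovative."""
        for pivot, (c, p) in self.rows.items():
            if (coeffs >> pivot) & 1:
                coeffs ^= c
                payload ^= p
        if coeffs == 0:
            return False                   # linearly dependent on earlier packets
        pivot = coeffs.bit_length() - 1
        for q, (c, p) in list(self.rows.items()):
            if (c >> pivot) & 1:           # clear the new pivot from older rows
                self.rows[q] = (c ^ coeffs, p ^ payload)
        self.rows[pivot] = (coeffs, payload)
        return True

    def decode(self):
        if self.rank < self.k:
            raise ValueError("not enough degrees of freedom yet")
        return [self.rows[i][1] for i in range(self.k)]


sources = [0x48656C6C, 0x6F2C206E, 0x6574776F, 0x726B2121]   # k = 4 packets
servers = [random.Random(seed) for seed in (1, 2, 3)]         # three replicas
dst = Decoder(len(sources))
sent = 0
while dst.rank < len(sources):
    rng = servers[sent % len(servers)]     # coded packets arrive from each replica in turn
    dst.receive(*encode(sources, rng))
    sent += 1
decoded = dst.decode()
print(sent, dst.rank, decoded == sources)
```

Note that the destination does not care which replica a packet came from, only whether it adds a degree of freedom. That is what lets the sources send without coordinating who transmits which part of the data.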

One question is "why xTCP?" Why not just do this on UDP or a new protocol? The main answer is to reuse the machinery of TCP for other things (including state setup/security); in addition, TCP-shaped packets are more likely to work in many scenarios (and not get blackholed/discarded by security devices/firewalls).

Things to do:

  • Implement: needs random linear codes + MPTCP source.
  • Evaluate overhead versus resource pooling (also known as statistical multiplexing) gain, for various workloads (needs realistic file/workload traces, which can be obtained from NetApp or Akamai or others).
  • Explore design choices for dealing with occasional packet loss.

References

  • Baseline for MPTCP (mptcp intro): this paper also contains references to various detailed design papers on MPTCP congestion control for sub-flow rate assignment.
  • MIT work on TCP with network coding, which also has baseline references on random linear codes: "Network Coding Meets TCP: Theory and Implementation", J. K. Sundararajan, D. Shah, M. Médard, S. Jakubczak, M. Mitzenmacher and J. Barros, invited paper, Proceedings of the IEEE, March 2011, pp. 490–512.
  • The MIT MPTCP with network coding paper
  • "Congestion Control for Coded Transport Layers", M. Kim, J. Cloud, A. ParandehGheibi, L. Urbina, K. Fouli, D. Leith and M. Médard, IEEE ICC Communication QoS, Reliability and Modeling Symposium, 2014.

Oblivious RAM on MirageOS

Contact: Richard Mortier, Nik Sultana

ORAM is a cryptographic protocol that allows clients to store and retrieve data from a server, but does not allow the server to learn the client's data (nor does it allow the server to learn whether the client is reading or writing data). The goal of this project is to implement a recent variant of this protocol [1] in OCaml to run on the Mirage operating system [2]. The protocol provides the client with a key-value store abstraction, which can be used to build other abstractions on the client side. Through this project you can get to implement privacy-friendly technology -- one of the most topical technological needs -- and make a contribution to the Mirage project on which others can build.
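To illustrate the property being asked for, the Python sketch below implements the simplest possible oblivious store: every access reads and re-encrypts every block, so the server sees exactly the same sequence of operations whether the client reads or writes, and whichever block it is interested in. This is the trivial linear-scan construction, not the variant referred to above; it costs O(N) per access, which is precisely what practical ORAM schemes improve on. The cipher is a toy keystream built from SHA-256 purely to keep the example self-contained (it is unauthenticated and not for real use), and all class and function names are hypothetical.

```python
import hashlib
import os

BLOCK = 16  # bytes per block (arbitrary size chosen for the example)


def _keystream(key, nonce, n):
    out, counter = b"", 0
    while len(out) < n:
        out += hashlib.sha256(key + nonce + counter.to_bytes(4, "big")).digest()
        counter += 1
    return out[:n]


def seal(key, plaintext):
    """Toy encryption: a fresh random nonce makes every ciphertext look new."""
    nonce = os.urandom(16)
    stream = _keystream(key, nonce, len(plaintext))
    return nonce + bytes(a ^ b for a, b in zip(plaintext, stream))


def unseal(key, blob):
    nonce, body = blob[:16], blob[16:]
    stream = _keystream(key, nonce, len(body))
    return bytes(a ^ b for a, b in zip(body, stream))


class Server:
    """Untrusted storage that records every operation it observes."""

    def __init__(self, n):
        self.slots = [b""] * n
        self.log = []

    def get(self, i):
        self.log.append(("get", i))
        return self.slots[i]

    def put(self, i, blob):
        self.log.append(("put", i))
        self.slots[i] = blob


class LinearOram:
    def __init__(self, server, n):
        self.key = os.urandom(32)
        self.server, self.n = server, n
        for i in range(n):
            server.put(i, seal(self.key, bytes(BLOCK)))

    def access(self, index, new_value=None):
        """Read block `index`, optionally overwriting it; touches every slot."""
        result = None
        for i in range(self.n):
            value = unseal(self.key, self.server.get(i))
            if i == index:
                result = value
                if new_value is not None:
                    value = new_value.ljust(BLOCK, b"\0")[:BLOCK]
            self.server.put(i, seal(self.key, value))   # re-encrypt everything
        return result


server = Server(4)
oram = LinearOram(server, 4)
server.log.clear()
oram.access(1, b"secret")
write_trace = list(server.log)
server.log.clear()
value = oram.access(1)
read_trace = list(server.log)
print(value.rstrip(b"\0"), write_trace == read_trace)
```

A key-value abstraction could then be layered on the client side by mapping keys to block indices.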

References

OS Support for Garbage Collection

Contact: David Chisnall

Prerequisites: A good knowledge of C and the ability to work with concurrent data structures using fine-grained locking

Microsoft Windows provides an API for receiving notifications related to which pages have been modified. This is used in the .NET runtime, but similar interfaces are lacking on other systems. This project will involve modifying the FreeBSD virtual memory subsystem to provide an API that allows garbage collectors to query this data and modifying the Boehm collector to use it.

The Boehm collector provides a platform-independent mechanism for retrieving a list of dirty pages, along with multiple implementations (the Windows API, using mmap() to mark pages as read-only and catching the faults, and a few others) so these changes will be relatively small. The OS changes will be more significant. Some important considerations include:

  • The OS uses the dirty bit to identify pages that are candidates for swapping, so its notion of a dirty page is not the same as the garbage collector’s.
  • Programs use multiple threads. As a first approximation, it would be acceptable to only query for dirty bits after calling pthread_suspend_all_np().
  • For correctness in a garbage collector, it is acceptable to provide a superset of the modified pages - scanning a page that is not modified only hurts performance - but missing a write is a serious problem.

The implementation will most likely involve adding a counter to each page that is incremented when the page moves from clean to dirty status and querying a range of pages to identify whether their counters have incremented since a previous call (make sure you handle overflow in the counters sensibly! This can be just by providing a ‘reset all counters’ API to userspace and).
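The counter scheme can be modelled in a few lines of Python, which may help when thinking about the corner cases before touching the kernel. This is a toy model under stated assumptions, not FreeBSD code: the class and function names are invented, the counter width is arbitrary, and the rule of also reporting pages that were already dirty at the previous query is one conservative way (an assumption of this sketch) to avoid missing a write to a page that never went through a clean-to-dirty transition in between.

```python
COUNTER_MOD = 256  # hypothetical 8-bit per-page counter, wraps on overflow


class PageTable:
    """Toy model of per-page clean-to-dirty transition counters."""

    def __init__(self, num_pages):
        self.dirty = [False] * num_pages
        self.counter = [0] * num_pages

    def write(self, page):
        if not self.dirty[page]:                 # clean -> dirty transition
            self.dirty[page] = True
            self.counter[page] = (self.counter[page] + 1) % COUNTER_MOD

    def clean(self, page):
        """Model the OS writing the page back (e.g. to swap)."""
        self.dirty[page] = False

    def snapshot(self, start, end):
        """State a collector would remember between queries."""
        return list(zip(self.counter[start:end], self.dirty[start:end]))


def modified_since(table, start, end, previous):
    """Superset of pages in [start, end) written since `previous` was taken."""
    current = table.snapshot(start, end)
    pages = []
    for i, ((old_count, was_dirty), (new_count, _)) in enumerate(zip(previous, current)):
        # A changed counter proves a write happened. A page that was already
        # dirty may have been written again without a transition, so it is
        # reported conservatively.
        if old_count != new_count or was_dirty:
            pages.append(start + i)
    return pages


table = PageTable(4)
snap0 = table.snapshot(0, 4)
table.write(2)
first = modified_since(table, 0, 4, snap0)

snap1 = table.snapshot(0, 4)      # page 2 is still dirty here
table.write(2)                    # no transition, counter unchanged
second = modified_since(table, 0, 4, snap1)

table.clean(2)
snap2 = table.snapshot(0, 4)
third = modified_since(table, 0, 4, snap2)
print(first, second, third)
```

Comparing counters for inequality tolerates wrap-around, except in the case where a page makes exactly `COUNTER_MOD` transitions between two queries.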

Evaluation should involve running the modified Boehm collector on some benchmarks and determining whether it provides better performance. If it doesn’t, then evaluation should describe what the overheads were that offset the speedup.


40G OpenFlow Switch for Software Defined Networks

Contact: Noa Zilberman

The concept of Software Defined Networks was introduced less than a decade ago, and has quickly taken on a significant role in today's networking. The NetFPGA [1] platform was the open-source platform used to first implement OpenFlow, the first example of SDN in practice [2] and today's common interface between the control and the data plane.
NetFPGA-SUME [3] is the third generation of NetFPGA and a technology enabler for datacentre research. In this project you will implement a 40G OpenFlow 1.4.0 switch [4] on the NetFPGA-SUME platform, based on the original NetFPGA 1G OpenFlow switch [5] and the recent NetFPGA SUME reference design flow. You will extend the 4Gbps data path of the original design to support 40Gbps, change the datapath interfaces to the standard AXI interface, and add support for the recent OpenFlow standard features.
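For readers new to OpenFlow, the core of the data plane is a flow table: each entry has a priority, a set of match fields (with unspecified fields acting as wildcards) and a list of actions, and a packet is handled by the highest-priority entry that matches it. The Python model below sketches only that lookup rule. It says nothing about how the NetFPGA designs implement it in hardware, where matching happens in parallel rather than by a loop, and the action strings are placeholders invented for the example.

```python
from dataclasses import dataclass


@dataclass
class FlowEntry:
    priority: int
    match: dict          # field name -> required value; absent fields are wildcards
    actions: list
    packets: int = 0     # per-entry packet counter


class FlowTable:
    def __init__(self):
        self.entries = []

    def add(self, entry):
        self.entries.append(entry)
        # Keep entries ordered so the first match is the highest-priority one.
        self.entries.sort(key=lambda e: e.priority, reverse=True)

    def lookup(self, pkt):
        for entry in self.entries:
            if all(pkt.get(name) == value for name, value in entry.match.items()):
                entry.packets += 1
                return entry.actions
        return []        # nothing matched at all: the packet is dropped


table = FlowTable()
table.add(FlowEntry(0, {}, ["controller"]))                        # table-miss entry
table.add(FlowEntry(10, {"eth_type": 0x0800}, ["output:2"]))
table.add(FlowEntry(20, {"eth_type": 0x0800, "ipv4_dst": "10.0.0.5"}, ["output:3"]))

specific = table.lookup({"in_port": 1, "eth_type": 0x0800, "ipv4_dst": "10.0.0.5"})
general = table.lookup({"in_port": 1, "eth_type": 0x0800, "ipv4_dst": "10.0.0.9"})
miss = table.lookup({"in_port": 1, "eth_type": 0x0806})
print(specific, general, miss)
```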

References:
[1] http://www.netfpga.org
[2] N. McKeown, T. Anderson, H. Balakrishnan, G. Parulkar, L. Peterson, J. Rexford, S. Shenker, and J. Turner, Openflow: enabling innovation in campus networks, ACM SIGCOMM CCR, vol. 38, no. 2, pp. 69–74, 2008
[3] Noa Zilberman, Yury Audzevich, Adam Covington, Andrew W. Moore. NetFPGA SUME: Toward Research Commodity 100Gb/s, IEEE Micro, vol. 34, no. 5, pp. 32–41, Sept.-Oct. 2014
[4] Open Networking Foundation, OpenFlow Switch Specification, April 2015, https://www.opennetworking.org/sdn-resources/technical-library
[5] NetFPGA-1G OpenFlow switch project

Pre-requisites: This project requires basic knowledge of computer networks and Verilog.


120G Switch

Contact: Noa Zilberman

The bandwidth of network switching silicon has increased by several orders of magnitude over the last decade. This rapid increase has called for innovation in datapath architectures, with many solutions competing for their place. NetFPGA-SUME [1], the third generation of the NetFPGA [2] open source networking platforms, is a technology enabler for 100Gb/s and datacentre research. As a reconfigurable hardware platform, it allows rapid prototyping of various architectures, making it possible to explore various trade-offs.
In this project you will extend the NetFPGA SUME 40G (4x10G) reference switch to a 120G (12x10G) switch and demonstrate its operation. This includes adding 8 new interfaces to the switch design and extending its datapath to support 120Gbps bandwidth. You will use the NetFPGA SUME python-based test harness to simulate and test the hardware that you design.
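A quick back-of-the-envelope calculation shows why the datapath itself has to change, not just the port count. The Python helpers below relate aggregate line rate, bus width and clock frequency. The specific figures plugged in (a 256-bit bus and a 200 MHz clock) are assumptions chosen for illustration and should be checked against the actual reference design. The calculation also ignores per-packet overheads such as partially filled bus words, so real requirements will be somewhat higher.

```python
import math


def required_clock_mhz(rate_gbps, bus_width_bits):
    """Minimum clock (MHz) for a bus of the given width to carry rate_gbps."""
    return rate_gbps * 1000 / bus_width_bits


def min_bus_width_bits(rate_gbps, clock_mhz):
    """Smallest power-of-two bus width that carries rate_gbps at clock_mhz."""
    bits = math.ceil(rate_gbps * 1000 / clock_mhz)
    return 1 << (bits - 1).bit_length()


# Assumed figures: a 256-bit datapath and a 200 MHz clock.
clock_40g = required_clock_mhz(40, 256)
clock_120g = required_clock_mhz(120, 256)
width_40g = min_bus_width_bits(40, 200)
width_120g = min_bus_width_bits(120, 200)
print(clock_40g, clock_120g, width_40g, width_120g)
```

Under these assumptions, tripling the aggregate rate means either a much faster clock on the same bus or a wider bus at the same clock, which is the kind of trade-off the project would need to explore.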

References:
[1] Noa Zilberman, Yury Audzevich, Adam Covington, Andrew W. Moore. NetFPGA SUME: Toward Research Commodity 100Gb/s, IEEE Micro, vol. 34, no. 5, pp. 32–41, Sept.-Oct. 2014
[2] http://www.netfpga.org

Pre-requisites: This project requires basic knowledge of computer networks and Verilog.


Encryption Offload Engine

Contact: Noa Zilberman

Encryption is widely used today to achieve confidentiality and maintain privacy. Still, the use of encryption can be limiting and inflexible. Homomorphic encryption, where operations can be done on ciphertexts without decrypting them first (e.g. [1]), is one way to overcome these limitations. However, such types of encryption require a large amount of resources and take considerable processing time. FPGAs are rapidly gaining traction as encryption acceleration platforms.
NetFPGA [2] is an open source platform used for research and rapid prototyping of devices. While originally designed for networking research, its latest platform, NetFPGA SUME [3], has been extensively adopted as an acceleration platform and offloading engine. In this project you will design a new NetFPGA-based offload engine that implements an encryption algorithm or offloads part of an encryption algorithm (e.g. for fully homomorphic encryption). It will exercise different existing modules within the NetFPGA reference designs (such as the DMA engine) and will require designing several new modules, such as a memory module and an encryption algorithm-specific block.
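As a small taste of what "operating on ciphertexts" means, the Python sketch below uses textbook RSA with tiny parameters, which happens to be homomorphic with respect to multiplication: multiplying two ciphertexts gives a ciphertext of the product. This is an illustration only. Unpadded RSA with these key sizes is completely insecure, and it is only partially homomorphic, a long way from the fully homomorphic schemes discussed in [1]. It does show that the dominant cost is modular exponentiation, the kind of arithmetic an offload engine might accelerate.

```python
# Toy parameters: far too small for any real use.
P, Q = 61, 53
N = P * Q                            # public modulus
E = 17                               # public exponent
D = pow(E, -1, (P - 1) * (Q - 1))    # private exponent (needs Python 3.8+)


def encrypt(m):
    return pow(m, E, N)


def decrypt(c):
    return pow(c, D, N)


a, b = 7, 12
# Multiply the ciphertexts without ever decrypting the inputs.
product_ciphertext = (encrypt(a) * encrypt(b)) % N
result = decrypt(product_ciphertext)
print(result == a * b)
```

The identity holds as long as the product stays below the modulus `N`; beyond that the result is reduced modulo `N`.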

References:
[1] Daniele Micciancio, Technical Perspective: A First Glimpse of Cryptography's Holy Grail, Communications of the ACM, vol. 53, no. 3, p. 96, 2010
[2] http://www.netfpga.org
[3] Noa Zilberman, Yury Audzevich, Adam Covington, Andrew W. Moore. NetFPGA SUME: Toward Research Commodity 100Gb/s, IEEE Micro, vol. 34, no. 5, pp. 32–41, Sept.-Oct. 2014

Pre-requisites: This project requires basic knowledge of Verilog; a background in cryptography is an advantage.


PhantomJS Rumpkernel

Contact: Richard Mortier

Rump Kernels are a neat way to create Unikernels out of existing code, essentially via cross-compilation. Phantom.JS and the related Casper.JS are command-line wrappers around the WebKit and Gecko HTML rendering engines (Phantom is the wrapper, Casper is a set of extensions making it easier to use to write website testing scripts). It would be interesting to use these to construct a one-shot stateless web-browser that could form part of a web measurement framework.

Building on the basic component, one might investigate how to control state transfer (e.g., cookies) between instances; how to extract and merge state from multiple parallel runs (e.g., using Irmin); or how to spoof multiple different browsers and behaviours, all the while making this as high-performance and scalable as possible.


Spoofing TCP with MirageOS

Contact: Richard Mortier

MirageOS is a framework developed by the group in which to write unikernels: compact, application-specific OS kernels. As traditional OS services, such as the network stack, are provided as libraries, there is an opportunity to manipulate those libraries to customise them for particular application demands.

Coupled with the project above, it would be interesting to write a TCP stack (or stacks) for MirageOS that replicated different signature behaviours of other stacks (Linux, BSD, etc). This would enable the web measurement referenced above to be extended to present as different host OSs as well as simply different browsers.


Functional Network Stacks with MirageOS

Contact: Richard Mortier

The use of Irmin to store the state of network stack components in MirageOS opens up several possibilities when it comes to auto-scaling unikernels, providing fast state transfer to replicas, and increasing inspection and manual (re-)configuration capabilities of running unikernels. See this talk by Mindy Preston for more details. It would be interesting to investigate how to apply this technique to other, larger, components in the MirageOS network stack (notably, TCP/IP), and to investigate the scaling and indirection properties thus enabled.


More Systems Projects at the DTG Project Page

Contact: Ripduman Sohan

Please see the DTG project suggestions page for a number of interesting systems projects.