Department of Computer Science and Technology

Systems Research Group – NetOS

ACS Projects (2016—2017)


This page collects together various ACS project suggestions from the Network and Operating Systems part of the Systems Research Group. In all cases there is a contact e-mail address given; please get in touch if you want more information about the project.

Under construction: please keep checking back, as more ideas will hopefully be added to this page during the coming weeks.

1.Rack-scale Fabric Topologies

Contact: Andrew Moore,Noa Zilberman (email)

Rack-scale computers contain hundreds to thousands of compute nodes, requiring a scalable and flexible interconnect topologies. The objective of this projects is to study aspects of fabric topologies within rack-scale computers and make recommendations on efficient rack-scale fabric topologies. As emerging rack-scale fabric architectures (e.g. [1][2]) differ significantly in their nature, existing simulation tools are inappropriate. This project will develop a flexible and scalable rack-scale fabric modelling tool and evaluate the performance of different rack-scale fabric topologies.

[1] Sergey Legtchenko, Nicholas Chen, Daniel Cletheroe, Antony Rowstron, Hugh Williams, and Xiaohan Zhao. XFabric: A Reconfigurable In-Rack Network for Rack-Scale Computers. NSDI 2016.
[2] Bob Alverson, Edwin Froese, Larry Kaplan and Duncan Roweth . Cray XC Series Network. Cray Inc, 2012.

Pre-requisites: This project requires basic knowledge of computer networks.

2. Modelling Interconnect Bottlenecks

Contact: Andrew Moore,Noa Zilberman (email)

New computer systems architectures seek to build hyper-converged, large systems, where a large number of compute nodes are connected through a dedicated fabric. Current computer architecture simulators provide a simplistic model of computing interconnect, failing to reflect interconnect bottlenecks.
This project aims to extend the gem5 simulator, by providing accurate modelling of the PCIe interconnect. It will then be used to show how interconnect bottleneck affects overall system performance.

[1] gem5

Pre-requisites: This project requires basic knowledge of computer networks and computer architecture.

3. Modelling Networking Bottlenecks

Contact: Andrew Moore,Noa Zilberman (email)

New computer systems architectures seek to build hyper-converged, large systems, where a large number of compute nodes are connected through a dedicated fabric. Current computer architecture simulators provide a simplistic model of networking devices, failing to reflect networking bottlenecks and properties. This project aims to extend the dist-gem5 simulator, by providing accurate modelling of networking devices. It will then be used to simulate rack-scale system performance.

[1] Mohammad Alian , Daehoon Kim, Nam Sung Kim, Gabor Dozsa, and Stephan Diestelhorst. Dist-gem5 Architecture. Micro-48 Tutorial.

Pre-requisites: This project requires basic knowledge of computer networks and computer architecture.

4. Power Analysis of Rack-Scale Systems

Contact: Andrew Moore,Noa Zilberman (email)

Rack-scale computing is an emerging technology in networked systems, replacing the server as the basic building block in enterprises and data centres ICT infrastructure. Rack-scale computing provides superior performance compared with rack enclosures fitted with stand-alone servers, providing scalable high performance networked systems. Power efficiency is of utmost importance to rack-scale computing: the power budget of the system is bounded by rack enclosure (typically 10kW-20kW), thus any increase in performance must still retain the same system power consumption. We are building an apparatus for the evaluation of rack-scale systems implementation at scale. This project will focus on the instrumentation of the system for power consumption measurement and analysis, mainly through the instrumentation existing on the NetFPGA SUME platform. The instrumentation will then be used to profile the power consumption of rack scale systems, using different architectures and fabric topologies, as well as using different workloads.

[1] NetFPGA
[2] Noa Zilberman, Yury Audzevich, Adam Covington, Andrew W. Moore. NetFPGA SUME: Toward Research Commodity 100Gb/s, IEEE Micro, vol.34, no.5, pp.32,41, Sept.-Oct. 2014
[3] Rack-scale Computing (Dagstuhl Seminar 15421)

Pre-requisites: This project requires basic knowledge of Verilog.

5. Exploring 100Gb/s Datapath Architectures

Contact: Andrew Moore,Noa Zilberman (email)

The bandwidth of network switching silicon has increased by several orders of magnitude over the last decade. This rapid increase has called for innovation in datapath architectures, with many solutions competing for their place.
NetFPGA-SUME [1] , the third generation of the NetFPGA [2] open source networking platforms, is a technology enabler for 100Gb/s and datacentre research. As a reconfigurable hardware platform, it allows rapid prototyping of various architectures, allowing to explore various trade offs.
In this project you will explore multiple 100Gb/s datapath architectures over the NetFPGA-SUME plarform and study different types of high bandwidth designs. You will research trade-offs in implementations, such as bandwidth performance, latency, power and resource usage . Real networking hardware platforms will be used for experimentation.

[1] Noa Zilberman, Yury Audzevich, Adam Covington, Andrew W. Moore. NetFPGA SUME: Toward Research Commodity 100Gb/s, IEEE Micro, vol.34, no.5, pp.32,41, Sept.-Oct. 2014

Pre-requisites: This project requires basic knowledge of computer networks and Verilog.

6. Interface Agnostic Fabrics

Contact: Andrew Moore,Noa Zilberman (email)

Computer architectures have entered a watershed as the quantity of network data generated by user applications exceeds the data-processing capacity of any individual computer. It will soon become impossible to scale existing systems. Despite this, both task-variety and task-complexity continue growing unabated. As networked computer-systems become akin to infrastructure, any limitation upon the growth in capacity and capabilities becomes an important constraint of concern to all computer users. The CAN project proposes a wholesale reconsideration in the design of computer architectures. We seek to reduce costs, save power, and increase performance in a multi-scale approach, applicable from the nano-scale to datacentre-scale computers. Highly programmable network and fabric devices are becoming the industry standard. To date, this programmability was mostly limited to the data path, with the interfaces being the known constant in any silicon device. Designing programmable interfaces, where the type of interface (e.g. networking, storage, compute interconnect) can be configured is a desirable property of SoCs. This project will study innovative ways to program SoC interfaces to work as different types of interconnects within a given system, providing unprecedented flexibility to SoCs.

[1] Noa Zilberman, Andrew W. Moore, Jon A. Crowcroft. "From Photons to Big Data Applications: Terminating Terabits", Royal Society Philosophical Transactions A, A vol.374, issue 2062, 2016

Pre-requisites: This project requires basic knowledge of computer networks and Verilog.

7. Providing QoS to VMs in the Fabric

Contact: Andrew Moore,Noa Zilberman (email)

The network is often the bottleneck for many data intensive applications. While within the host virtual machines (VMs) can be scheduled and provided a quality of service (QoS), it can not be done once the traffic is queued toward the NIC. It is possible today to provide some QoS over the PCIe express for up to a thousand VMs, but no more. In contrast, in networking it is common to provide QoS to hundreds of thousands of traffic flows. This project will look on mapping incoming traffic (to a fabric device) to VMs and jobs, in order to provide a large-scale QoS in the hardware. This will not only improve the QoS, but also improve the predictability and reduce latency of running applications.

[1] Noa Zilberman, Andrew W. Moore, Jon A. Crowcroft. "From Photons to Big Data Applications: Terminating Terabits", Royal Society Philosophical Transactions A, A vol.374, issue 2062, 2016

Pre-requisites: This project requires basic knowledge of computer networks and Verilog.

8. The Networking of Data Science

Contact: Andrew Moore,Noa Zilberman (email)

Data science has become part of our everyday life, even if we are not aware of it. Networked applications that process huge amounts of data run in the cloud and affect not only social networks and online shopping, but also finance, security and science. Due to the nature of data science applications, they usually run within data centres, and the knowledge of these application's behavior is limited to data centre operators. As data centre operators keep their data confidential, very little information was published (e.g. [1],[2]). In the lack of such ground truth, academic research is limited in its ability to develop novel system and networking solutions that fit the age of data science.
This project aims to create network profiles for different data science applications: from common key-value store to academic big-data projects (e.g. SKA - the square kilometer array), gathering ground truth data from application running in a local data centre. The outputs of this project will be used to model and assess new data centre architectures, directing future designs.

[1] Theophilus Benson, Aditya Akella and David Maltz. Network Traffic Characteristics of Data Centers in the Wild. Proceedings of the Internet Measurement Conference (IMC), 2010.
[2] Berk Atikoglu, Yuehai Xu, Eitan Frachtenberg, Song Jiang, and Mike Paleczny. Workload analysis of a large-scale key-value store. SIGMETRICS 2012.

9. Graph Processing Optimisation using Structured Bayesian Optimisation

Contact: Eiko Yoneki

Keywords: Machine Learning, Graph Processing, Optimisation

Graph processing can take advantage of processor heterogeneity (i.e. CPU/GPU) to adapt to structural data patterns. The overall aim of graph processing can be seen as scheduling irregular tasks to optimise data-parallel heterogeneous graph processing, and in this project, optimal dispatching graph elements to appropriate computation units will be explored using our recent work on the extension of Bayesian Optimisation. The optimisation platform will be provided, and the task in this project is building a graph specific auto-tuner on top of it, which requires a good knowledge of graph algorithms and understanding processing patterns and task scheduling.

[1] K. Nilakant and E. Yoneki: On the Efficacy of APUs for Heterogeneous Graph Computation. EuroSys - SMFA, 2014 (
[2] Dalibard, V., Yoneki, E.: Tuning Computer Systems with Structured Bayesian Optimization (

10. Distributed Stochastic Gradient Descent on Naiad

Contact: Eiko Yoneki

Keywords: Data flow programming, Machine Learning, Parallel Computing in Distributed Systems

Naiad is a distributed system framework that allows for automatic parallelisation of programs in a distributed environment [1]. In this project, you will use the RUST implementation of NAIAD and build a distributed implementation of Stochastic Gradient Descent (SGD) [2], which is common algorithms used in Machine Learning. The project will provide efficient parallel processing for SGD in optimised fashion, where different iterations of the algorithm for different data points can be run in parallel. An efficient distributed implementation using NAIAD can take advantage of the incremental and iterative computation. You would also write an application to evaluate the project.

[1] D. Murray, F. McSherry, R. Isaacs, M. Isard, P. Barham, and M. Abadi. Naiad: a timely dataflow system. SOSP, 2013.
[2] Kevin P Murphy. Machine learning: a probabilistic perspective. MIT press, 2012.

11. Building Graph Query Function using Functional Programming

Contact: Eiko Yoneki

Keywords: Graph, Functional programming, Database, NOSQL

Demand to store and search of online data with graph structure is emerging. Such data range from online social networks to web links and it requires efficient query processing in a scalable manner. In this project, you will build a graph query function (layer) to achieve efficient graph data processing. The graph query function builds on a lazy graph loaded from multiple sources and performs queries at the same speed or faster than a relational database. The function should be written in Clojure or another functional language to support practical lazy loading from data source to an in-memory graph database. Efficient representations for graph queries should be also investigated. The evaluation includes comparisons with existing Graph Database and the query capability comparing with SQL. You can start the project by our existing work in this area.

[1] Microsoft Research, Trinity project: Distributed graph database,
[2] Neo Technology, Java graph database,

12. Dynamic Task Scheduling on Heterogeneous CPU/GPU Environment using ML for Parallel Processing

Contact: Eiko Yoneki

Keywords: GPU Clusters, Heterogeneous many/multi-core, Parallel Computing, OpenCL, Task Scheduling

In this project, various aspects of parallel processing will be explored using a new generation of CPU/GPU integrated board, where more than one GPU clusters are placed on a chip. We use ARM based Mali-T628 MP6, in Exynos 5422 [1] [2]. Using OpenCL, tasks can be dispatched to GPU and CPU code in parallel. This new GPUs makes it possible to cluster the GPU nodes for different scale of parallel processing. We use a simulator on top of the hardware to experiment various task scheduling strategies explored by the machine learning methodologies for prediction of workload, vector instructions, and mixture of model parallelism and data parallelism. Application running on top could be image analysis or graph processing. Graph processing can take advantage of processor heterogeneity to adapt to structural data patterns. The overall aim of graph processing can be seen as scheduling irregular tasks to optimise data-parallel heterogeneous graph processing, by analysing the graph at runtime and dispatching graph elements to appropriate computation units. Efficient scheduling underlies the vision of a heterogeneous runtime platform for graph computation, where a data-centric scheduler is used to achieve optimal workload.


13. Approximate Algorithms Determining Local Clustering Coefficients Anonymously

Contact: Eiko Yoneki

Keywords: Sampling, Approximation, Privacy, Cluster Coefficient

Anonymous social networks are a new phenomenon in an increasingly privacy conscious world. A natural question to ask in this setting is whether we can continue to apply known principles of network science in such settings where neighbourhood information is concealed both between nodes and external observers. This project is to work on approximate algorithms that determines clustering coefficient in such settings. Clustering coefficients provide a way to relatively order nodes by importance in a network, determine densely connected neighbourhoods and distinguishing social graphs from web graphs. Algorithms to measure clustering coefficients have hitherto required global knowledge of graph links. This requires individual nodes in the graph to give up the identity of their neighbours. This project investigates an algorithm for estimating the clustering coefficient of a vertex by exchanging only anonymised set summaries of neighbours, from which it is difficult to reverse engineer individual links. The bulk of the project will consist of working on and improving sampling techniques to arrive at accurate local clustering coefficients without exchanging explicit neighbour lists in social networks.

[1] P. Flajolet, Eric Fusy, O. Gandouet, and et al. Hyperloglog: The analysis of a near-optimal cardinality estimation algorithm. In Proceedings of the International Conference on Analysis of Algorithms, 2007.

14. RasPi-Net: Building Stream Data Processing Platform over RasPiNET

Contact: Eiko Yoneki

Keywords: Raspberry Pi, Delay Tolerant Networks, Satellite Communication, Stream Processing

We have built a decentralised Raspberry Pi network (RasPiNET [1]), which can be deployed in wild and remote regions as a standalone network. The gateway Raspberry Pi nodes are integrated with satellite communication devices, where the light version of Delay Tolerant Network (DTN) bundle protocol is embedded. RasPiNET could consist of 10-100 nodes. As an example, a remote sensing application could be written either in RasPi or Smart phones that can connect to RasPi. Collected data could be processed within RasPiNET to reduce data size that streams over the satellite communication to the base location. The crowd sourcing application can run on top of RasPiNET, too. The goal of this project is building a stream processing platform in both directions: from data collection from RasPiNET nodes to the data processing nodes possibly via a satellite gateway and from bulk of data delivery to the satellite gateway node to disseminate necessary information to RasPiNET nodes. A good filtering function and RasPiNET in-network data aggregation could be developed.

[1] E. Yoneki: RasPiNET: Decentralised Communication and Sensing Platform with Satellite Connectivity. ACM CHANTS, 2014.
[2] Delay Tolerant Network Bundle Protocol:
[3] RockBlock Technology:

15. Clustering Entities across Multiple Documents in Massive Scale

Contact: Eiko Yoneki

Keywords: Clustering, Graph Partitioning, Random Walk, Distributed Algorithms

Many large-scale distributed problems include the optimal storage of large sets of graph structured data over several hosts - a key problem in today's Cloud infrastructure. However, in very large-scale distributed scenarios, state-of-the-art algorithms are not directly applicable, because frequent global operations over the entire graph are difficult. In [1], balanced graph partitioning is achieved by a fully distributed algorithm, called Ja-be-Ja that uses local search and simulated annealing techniques for graph partitioning annealing techniques for graph partitioning. The algorithm is massively parallel: each node is processed independently, and only the direct neighbours of the nodes and a small subset of random nodes in the graph need to be known locally. Strict synchronisation is not required. These features allow Ja-be-Ja to be easily adapted to any distributed graph processing system. This project starts by understanding Ja-be-Ja, and investigates further performance improvement. A case study: a graph-based approach to coreference resolution, where a graph representation of the documents and their context is used and applying a community detection algorithm based in [1] can speed up the task of coreference resolution by a very large degree.

[1] Fatemeh Rahimian, Amir H. Payberah, Sarunas Girdzijauskas, Mark Jelasity and Seif Haridi: JabeJa: A Distributed Algorithm for Balanced Graph Partitioning, IEEE International Conference on Self-Adaptive and Self-Organizing Systems (SASO), 2013.
[2] Fatemeh Rahimian, Amir H. Payberah, Sarunas Girdzijauskas, and Seif Haridi: Distributed Vertex-Cut Partitioning, DAIS, 2014.

16. Graph Compression in the Semi-External Memory Environment

Contact: Eiko Yoneki

Keywords: Graph Compression, Encoding

This project explores graph compression mechanisms as part of a project looking into high performance semi-external memory graph processing (see [1] to get an idea of semi-external memory approaches). The graph compression work will build on an in-house graph representation format that we have developed that allows succinct representation of graphs that show hierarchical structures. The project will explore ways to improve the representation yielding smaller graphs on disk that are less costly to traverse. A key element of the representation scheme is a recursive graph partitioning step that minimises the number of edges between partitions. This is a rich space for exploration of suitable algorithms. We are concerned primarily with experimentally evaluating I/O costs on large graphs and measuring the compression performance. However, a student with a theoretical background might consider designing algorithms with provable bounds on compression performance, which would be a big plus. If time allows you could also implement an efficient transformation tool based on the developed graph compression algorithm using parallel processing tools (e.g. map/reduce).

[1] R. Pearce, M. Gokhale and N. Amato: Multithreaded Asynchronous Graph Traversal for In-Memory and Semi-External Memory, ACM/IEEE High Performance Computing, Networking, Storage and Analysis, 2010.

17. Air traffic management for drones in smart cities

Contact: Cecilia Mascolo, Andrea Gaglione (email)

Unmanned aircraft systems, also called drones, have become widely used in civil applications (e.g., natural disaster control and monitoring, security, traffic and crowd management, infrastructure monitoring, agriculture, environmental monitoring) and are actively contributing to the development of smart cities of the future. As the number of such devices flying in the city skies at low altitudes is constantly increasing, one of the main challenges is their traffic regulation.
In this project, you will design and build novel traffic control and management services to prevent drones to collide with buildings, larger aircraft, or one another. You will design the airspace and flight planning services, and also exploit recent advances in communication technologies to guarantee safe and efficient operations.

18. Virtual reality game-based cognitive training for people at risk of dementia

Contact: Cecilia Mascolo (email) in cooperation with Hatice Gunes (Computer lab) and Dennis Chan, Cambridge Neuroscience

The management of dementia represents one of the greatest challenges for healthcare systems worldwide. In the absence of cures for diseases such as Alzheimer's disease (AD), there is increasing interest in the prevention of dementia via lifestyle interventions. There is extensive epidemiological evidence to suggest that cognitive, physical and social activity in later life can reduce dementia incidence by enhancing cognitive reserve (CR), such that cognition is maintained in the face of AD pathology. This has stimulated major interest in the use of puzzle- and task-based "brain training" methods as means of preventing AD. However, the efficacy of these approaches have been reduced by low adherence to the training programme, due to reduced user interest in undertaking such tasks long term as a result of limited enjoyment. Furthermore, such tasks engage only cognitive activity and do not promote the physical and social activity that additionally contribute to CR.
The problem with adherence has led researchers to explore the idea of "gamification" of cognitive training programmes, on the basis that a games-based approach may be more enjoyable and thus result in improved adherence. This innovative project builds on this concept in three ways. First, it will utilise immersive virtual reality (iVR) platforms for application of game-style cognitive training programmes. It is theorised that iVR-based games are more cognitively stimulating than traditional computer-based games, as a result of the superior sensory involvement, and that their first-person immersive nature may be easier for older people unused to desktop-based computer games.  Second, the social activity resulting from future multi-user iVR games would augment the benefits to CR provided by the cognitive activity alone. Finally, our previous work has shown that there are affective, physiological and behavioural differences when people play games in solo, competitive and collaborative modes [1]. These differing responses can feed into adaptive paradigms, such that games can be tailored to individual preferences, creating personalised programmes that may be superior to current "one size fits all" games in terms of adherence and CR enhancement.
In light of the above, this project will focus on interaction and game design in VR settings for people at risk of AD, and will constitute of the following tasks:

  • 1. Extensive review of existing VR games that can be played in solo, collaborative and competitive modes
  • 2. Case study with participants to record and measure their nonverbal behaviours and engagement levels when they play existing VR games in the following conditions
    • a. Fun games vs. cognitive games aimed at improving memory
    • b. Solo vs. collaborative vs. competitive game modes
  • 3. Qualitative and quantitative analysis of data recorded
    • a. Automatic feature extraction from recorded data
    • b. Machine learning to train predictive models
  • 4. Based on the findings obtained, designing and implementing a VR game tailored to older people at risk of AD e.g., using Unity 3D software with Unityscript  and C# (or similar) for game development. Special attention to be placed on use of system resources (processing and power) as well as on use of lightweight machine learning frameworks for local computation.

Suggested reading material:
D. Gabana Arellano, L. Tokarchuck and and H. Gunes, Measuring Affective, Physiological and Behavioural Differences in Solo, Competitive and Collaborative Games, Proc. of the International Conference on Intelligent Technologies for Interactive Entertainment (INTETAIN 2016), Utrecht, The Netherlands, Jun. 2016.
A Discussion of the Use of Virtual Reality in Dementia: al_reality_in_dementia.pdf
Virtual Reality app offers unique glimpse into life with dementia:
Sending your grandparents to university increases cognitive reserve: The Tasmanian Healthy Brain Project.
Does a Combination of Virtual Reality, Neuromodulation and Neuroimaging Provide a Comprehensive Platform for Neurorehabilitation? - A Narrative Review of the Literature.

19. Learning user context from the observation of network and systems data

Contact: Richard Mortier

Online services and applications are collecting more and more personal data about users to adapt and optimize their services to each individual user. Examples of service optimization range from your personalized Facebook feed to Google maps displaying the area around you. This personalization can take a wide range of inputs from the Web pages a user visits and the films she watches to the places she visits. We are particularly interested in services that adapt to a user’s context (e.g., the geographical location in the Google maps case). User context is fairly broad; it can encompass, e.g., the location, the time of day, who the user is with, what the user is doing. Our goal is to be able to infer user context only by observing technical data that we can collect automatically by instrumenting the user’s device or the network that carries the user’s traffic. To better understand the issues involved in inferring user context, in the European project User-Centric Networking, we have conducted a user study that simultaneously collected both technical and ethnographic data from a population of 25 users across the UK and France. The technical data includes full packet traces, location information, and the set of applications used on each participant’s devices. The user observations adopted an ethnographic approach to detail each participant’s online activities, routines, and exceptions, as well as their reasoning for conducting certain activities at certain times, locations, or with certain devices. Our initial analysis of this data highlights the gap between the user’s reasoning and what we can capture with network and application level statistics alone.

The goal of this project is two-fold: (i) demonstrate and properly evidence the gap between user reasoning and technical data; and (ii) develop methods to bridge this gap by analyzing the technical data to infer user context. To demonstrate the gap, the student will analyze the transcripts made by the ethnographers after the discussions with users to identify the vocabulary and ways in which users talk about their activities. This analysis will allow us to compare the user reasoning with the technical data we can gather from users. The student will then develop methods to process the technical data to extract the user context. Our manual analysis of the data identified a few data types that help capture context: location data, visited websites, and time of day. To make sense out of the data, however, we had to conduct a fair amount of manual search and inference. The goal is that the student will identify methods to automate this process. The student should develop scientific skills on network measurements and data analysis.

Desirable skills:

  • Knowledge of data analysis techniques
  • Knowledge of network traffic measurements
  • Knowledge of matlab or GNU R

[1] P. Brundell, A. Crabtree, R. Mortier, T. Rodden, P. Tennent, P. Tolmie. The Network from Above and Below. In Proc.of the ACM SIGCOMM W-MUST workshop, 2011.
[2] A. Brown, R. Mortier, T. Rodden. An exploration of user recognition on domestic networks using NetFlow records. Proceedings of the 2014 ACM International Joint Conference on Pervasive and Ubiquitous Computing: Adjunct Publication, 2014.

More Systems Projects at the DTG Project Page

Contact: Ripduman Sohan

Please see the DTG project suggestions page for a number of interesting systems projects.