NetOS
This page collects together various Part II project suggestions from
the Network and Operating Systems part of the Systems Research Group.
In all cases there is a contact e-mail address given; please get in
touch if you want more information about the project.
Note: there are a number of stumbling blocks in Part II project
selection, proposal generation and execution. Some useful guidance
from a CST alumnus (and current NetOS PhD student) is here.
-
Various web2.0ish, distributed (social) networksy projects.
Contact: Nishanth Sastry
Last update: 8/Sep/08
The general theme is post-Web 2.0, distributed or (social) networksy stuff. I am willing to oversee one, or at most two, of the projects below, based on interest. All other things being equal, I will choose to oversee the projects that interest me most (more asterisks below => more interest; possibility of answering research questions => more interest).
- Secretary Bot**
- In the dark ages before the Internet, a human secretary would guard access to the important employees, organise their days, take phone calls on their behalf, remind them of important meetings etc. These days, people are deluged by an enormous amount of information in the form of email, instant messaging, phone etc.[1], and face enormous difficulties in juggling multiple tasks [2].
The goal of the project is to build a secretary bot that would mediate on behalf of the user and manage their attention. Instant messages should only be allowed through if the user is not working on something important. The user need not be notified of unimportant emails whilst engaged in a complex task. Phone calls can be forwarded to voice mail directly. At the same time, important interrupts should be allowed through, and the user should be reminded of and prepared for imminent meetings and appointments.
An example implementation might use XMPP to interface with Google Talk, use the Google Calendar API to store and retrieve appointments, and use the Twitter API to send reminders to mobile phones even when the user is away from their computer. Interaction with the bot could be in a language resembling natural English, to make it seem even more secretary-like. By integrating tightly with the OS you choose, you can detect which apps (tasks) the user is focusing on, and determine whether or not it is an interruptible moment.
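To make the core idea concrete, the interruptibility policy at the heart of the bot could look something like the sketch below. Everything here is illustrative: the app names, the VIP-style `important` flag and the five-minute meeting threshold are invented, and no real XMPP or Calendar API is involved.

```python
from dataclasses import dataclass
from datetime import timedelta

# Hypothetical set of applications treated as "deep work" (purely illustrative).
FOCUS_APPS = {"emacs", "matlab", "latex-editor"}

@dataclass
class Interrupt:
    kind: str        # "im", "email" or "call"
    important: bool  # e.g. sender is on a user-configured VIP list

def decide(interrupt, active_app, next_meeting_in):
    """Return what the secretary bot should do with an incoming interrupt."""
    if interrupt.important:
        return "deliver"          # important interrupts always go through
    if interrupt.kind == "call":
        return "voicemail"        # unimportant calls go straight to voice mail
    if active_app in FOCUS_APPS:
        return "defer"            # hold IMs and email during deep work
    if next_meeting_in is not None and next_meeting_in < timedelta(minutes=5):
        return "defer"            # user is about to head to a meeting
    return "deliver"
```

The interesting project work is in replacing the hard-coded sets and thresholds with signals mined from the calendar, the active window, and learned user behaviour.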
[1] Models of Attention in Computing and Communication
[2] A diary study of task switching and interruptions
Skills and interests: HCI, NLP, Web 2.0 APIs
Challenge level: Simple. A safe grade would likely be easy to get. A high grade is not out of reach, but depends entirely on your creativity in coming up with useful secretary-like features, and on the level of integration achieved with the user's task flow.
- Distributed Twitter***
- Twitter is a popular micro-blogging site that sends user status updates ('tweets') to friends who subscribe to such feeds. Twitter uses SMS and instant messaging services to distribute these updates. The goal of this project is to instead use a pocket switched network [1] to transmit tweets. Implement an Android app that sends tweets to people nearby, who then send them on to other people they meet later, and so on, until all the intended recipients of a tweet receive it. Evaluate the idea using traces of real human contacts, which will be made available.
Extension 1: A person who has a lot of unused SMSes in their monthly plan could be used as a gateway to transmit tweet SMSes for people nearby.
Extension 2: Hashtags have developed as a means to provide context [2]. For instance, users can subscribe to a bus stop (e.g. JJ Thomson Ave) and be alerted that their bus is about to arrive via tweets from other passengers at previous bus stops. Ditto for current traffic conditions in major cities. Use presence at a bus stop to detect what contexts other users are interested in, and develop a method to automatically decide which contexts to subscribe to.
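The evaluation against contact traces can be done offline by replaying the trace. Below is a minimal flooding simulation; the `(time, a, b)` trace format is an assumption for illustration, not a prescribed input format.

```python
def epidemic_delivery(contacts, source, t0):
    """Replay a contact trace (each entry is (time, node_a, node_b)) and
    return, for each node, the time it first received a tweet posted by
    `source` at time t0. This is pure flooding: every carrier passes the
    tweet on at every contact, an upper bound on what smarter forwarding
    schemes can achieve."""
    have = {source: t0}
    for t, a, b in sorted(contacts):
        if t < t0:
            continue                      # contact happened before the tweet
        if a in have and b not in have:
            have[b] = t
        elif b in have and a not in have:
            have[a] = t
    return have
```

Restricting which contacts forward (e.g. only intended recipients, or only socially close carriers) and measuring the resulting delivery delay is where the project's real evaluation would begin.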
[1] Pocket Switched Networks: Real-world mobility and its consequences for opportunistic forwarding
[2] Usefulness of Micro-blogging (see section on Twitter as Message Bus)
Skills and interests: Web 2.0 APIs, Mobile platforms
Challenge level: Moderately simple.
- Spreading your Aura*****
- How can friends of your friends discover what you are good at, what your interests are and so forth? What about friends of friends of friends? What about friends of friends of friends of friends?...
[1] proposes a way to distribute expertise information via email in a social network using a modification of Bloom filters [2]. This can easily be extended to other kinds of keyword information as well. I call this spreading your aura. In this project, you will investigate how well an aura can be spread in a social network by mining friendship data from an online social network such as MySpace, Facebook or Orkut (APIs are available to help do this).
[1] Disseminating Expertise in Social Networks
[2] Bloom Filter
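A plain Bloom filter makes the mechanism concrete: each person summarises their keywords in a filter, and friends OR the filters together as the aura propagates hop by hop. Note this sketch is a textbook Bloom filter; the actual modification proposed in [1] will differ.

```python
import hashlib

class BloomFilter:
    """Minimal Bloom filter over strings: m bits, k hash functions."""
    def __init__(self, m=1024, k=4, bits=0):
        self.m, self.k, self.bits = m, k, bits

    def _positions(self, item):
        # Derive k positions by salting a cryptographic hash (simple, not fast).
        for i in range(self.k):
            h = hashlib.sha256(f"{i}:{item}".encode()).digest()
            yield int.from_bytes(h[:4], "big") % self.m

    def add(self, item):
        for p in self._positions(item):
            self.bits |= 1 << p

    def __contains__(self, item):
        return all(self.bits >> p & 1 for p in self._positions(item))

    def merge(self, other):
        """OR in a friend's filter -- how an 'aura' could spread one hop."""
        return BloomFilter(self.m, self.k, self.bits | other.bits)
```

Membership queries can give false positives (but never false negatives), which is exactly the trade-off the project would need to measure as auras accumulate over multiple hops.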
Skills and interests: Implementing algorithms and cool data structures
Challenge level: Moderately simple. With proper evaluation, this could be converted into a research paper.
- Folksonomy-based routing****
- [1] suggests a way to use people's interests to route data in a pocket switched network. User interests are captured by a folksonomy, and relations between interests can be found by forming a bipartite graph with the folksonomy tags on one side and the tagged objects on the other. Related tags are then used to make routing decisions. The goal of the project is to implement and evaluate this idea.
[1] Folksonomy-based Reasoning in Pocket Switched Networks
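The tag-relatedness computation at the heart of the idea can be approximated with a set-overlap measure on the bipartite graph. The sketch below uses cosine similarity between the object-sets of two tags; this is an illustrative choice and may differ from the exact measure used in [1].

```python
from math import sqrt

def tag_similarity(tagged, t1, t2):
    """Cosine similarity between two folksonomy tags, treating each tag as
    the set of objects it has been applied to (one side of the bipartite
    graph). `tagged` maps tag -> set of object ids."""
    a, b = tagged.get(t1, set()), tagged.get(t2, set())
    if not a or not b:
        return 0.0
    return len(a & b) / sqrt(len(a) * len(b))

def next_hop(carriers, tagged, content_tag):
    """Greedy folksonomy-based routing decision: among the currently
    encountered nodes (`carriers` maps node -> its interest tag), pick the
    one whose interest is most related to the content's tag."""
    return max(carriers,
               key=lambda node: tag_similarity(tagged, carriers[node], content_tag))
```

In the real system the choice would also weigh contact history and delivery probability, but this captures the core routing signal.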
Skills and interests: Social Networks, routing, graph algorithms
Challenge level: Moderately hard - most of the difficulty is in coming up with a convincing evaluation methodology for validating your solution. This could become a research paper.
- Statistical characterisation of online social networks****
- User generated content and the social web revolution have thrown up numerous sites where social interaction is part of the added value.
Examples:
YouTube, Flickr, Twitter, Overstock, Jaman, Amazon, Netflix (LoveFilm)
Analyse and characterise one of them. See [1,2] for illustrations of characterisation. The folksonomies embedded in many of these sites have not been characterised and offer an interesting opportunity.
[1] I Tube, You Tube, Everybody Tubes: Analyzing the World's Largest User Generated Content System
[2] Delivery Properties of Human Social Networks
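A typical first step in characterisation is the degree distribution of the interaction graph, usually examined as a CCDF on log-log axes to check for the heavy tails reported in studies like [1,2]. A minimal sketch:

```python
from collections import Counter

def degree_ccdf(edges):
    """Given an undirected edge list, return the degree distribution as a
    CCDF: a sorted list of (degree d, fraction of nodes with degree >= d)."""
    deg = Counter()
    for a, b in edges:
        deg[a] += 1
        deg[b] += 1
    n = len(deg)                      # nodes seen in the edge list
    counts = Counter(deg.values())    # how many nodes have each degree
    ccdf, above = [], 0
    for d in sorted(counts, reverse=True):
        above += counts[d]            # nodes with degree >= d
        ccdf.append((d, above / n))
    return sorted(ccdf)
```

The same scaffolding extends naturally to other distributions (video views, tag reuse, inter-comment times) once the site's data has been crawled.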
Skills and interests: statistical skills, scripting (Perl, Python, etc.)
Challenge level: Hard - the difficulty is in finding interesting things in the characterisation. If done well, there is a lot of scope to do great research. For instance, [1] got the best paper award at the 2007 Internet Measurement Conference.
- Automatic code splitting between server and client for web apps****
- Web apps have a reputation for being easier to write than systems programs, but a great deal of difficulty arises because different parts of the code are written in different languages and execute at different times (e.g. JavaScript on the client and JSP on the server). When writing the code, it is hard to keep in mind that some of it will not execute in a linear fashion. This complex information flow is the source of a lot of bugs and security holes.
This project would take an existing web application framework and add a preprocessor that would split code into server-side and client-side executables. [1] is an excellent first attempt at doing this, and solves some of the most difficult problems ([1] got the SOSP 2007 best paper award). The contribution of this project would be to make the idea practical by embedding it into a widely used web app framework (e.g. JSP, Rails, Django).
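To get a feel for the shape of the problem, here is a toy illustration of annotation-driven partitioning. The decorators and the partitioning step are invented for exposition ([1] actually partitions statically based on security labels, and a real preprocessor would also have to generate the RPC glue between the two halves).

```python
# The programmer writes one program and labels where each function must run;
# a (hypothetical) preprocessor then emits separate client and server
# executables. Here we only collect the two partitions.
PLACEMENT = {}

def client(fn):
    PLACEMENT[fn.__name__] = "client"
    return fn

def server(fn):
    PLACEMENT[fn.__name__] = "server"
    return fn

@server
def check_password(user, pw): ...   # must stay server-side for security

@client
def render_form(): ...              # pure UI work belongs on the client

def partition():
    """Group function names by where they must run."""
    out = {"client": [], "server": []}
    for name, side in PLACEMENT.items():
        out[side].append(name)
    return out
```

The hard research questions start exactly where this toy stops: inferring placements for unlabelled code, and keeping the split secure when data flows between the halves.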
[1] Secure Web Applications via Automatic Partitioning (the PPT may be a good intro)
Skills and interests: Experience writing web apps, strong programming skills, knowledge of security
Challenge level: Very Hard!! If successful (and sufficiently improved from [1]), this will be an outstanding piece of research.
-
Social Network based Spam Filtering
Contact: Jon Crowcroft
There are a lot of social networks out there - we have a project characterising
them. These are not just the obvious online systems like Facebook, Bebo,
LinkedIn, etc., but also the networks implicit in human activities such as
sports clubs, work, family, and contacts either face-to-face or via
communications (e-mail, IM, SMS, etc.).
It is conjectured (and has been shown by several research groups) that while
spam emanates from hijacked machines and even hijacked accounts, it has at
least one telling characteristic: it is distributed to people who have little
social relationship with the sender.
This project is to build a system based on work in the Haggle project, which
characterises social graphs by co-location, and on work in the spam filtering
world based on affiliation and communication graphs. Public data sets on
co-location, on affiliation (via the web) and on e-mail (believe it or not,
Enron's trial evidence!) can be used to test a tool - it could loosely be
integrated with the University's own deployed system (SpamAssassin), and can
be readily evaluated with a number of corpora of spam.
Note that spam filtering is relatively compute intensive, and algorithms to
(re-)compute social graphs over large datasets can be costly, so some thought
needs to be put into the implementation, which makes for an interesting as
well as useful project. [Note that a social net is a "temporal graph" with
nodes representing people and edges representing relationships - friends of
friends (of friends) are 2 (3) hops away. An edge relationship could be a
meeting (or a minimum number and duration of meetings), or it could be a
family relationship, a communications event, or co-authoring a paper (or 1B
project) with someone.]
This project is moderate, rather than easy or difficult. There could be some
help from Computing Service folks, but it is also nicely achievable in a
standalone way.
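The core filtering idea - score a message by the sender's distance from the recipient in the social graph - can be sketched as follows. This is a toy model: building the graph from co-location or e-mail logs, and combining the score with content-based filtering, is where the real work lies.

```python
from collections import deque

def social_distance(graph, src, dst, cap=6):
    """Breadth-first hop count between two addresses in a social graph
    (adjacency dict: node -> list of neighbours). Returns cap + 1 if there
    is no path within `cap` hops, so unknown senders score worst."""
    if src == dst:
        return 0
    seen, frontier = {src}, deque([(src, 0)])
    while frontier:
        node, d = frontier.popleft()
        if d == cap:
            continue
        for nbr in graph.get(node, ()):
            if nbr == dst:
                return d + 1
            if nbr not in seen:
                seen.add(nbr)
                frontier.append((nbr, d + 1))
    return cap + 1

def spam_score(graph, sender, recipient):
    """Crude score in [0, 1]: the socially closer the sender, the less
    spammy. A real tool would combine this with content-based scores
    (e.g. from SpamAssassin)."""
    return min(social_distance(graph, sender, recipient, cap=6), 7) / 7
```

Recomputing distances per-message is exactly the cost problem the proposal mentions; caching per-sender distances, or precomputing k-hop neighbourhoods, is one obvious optimisation to explore.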
Refs (papers can be found via Google Scholar):
-
Social Networks Project
-
Haggle Project
-
Spamassassin
-
Spam!, by
LF Cranor, BA LaMacchia - Communications of the ACM, 1998
-
Personal Email Networks: An Effective Anti-Spam Tool, by PO Boykin, V Roychowdhury -
-
SybilGuard: defending against sybil attacks via social networks, by
H Yu, M Kaminsky, PB Gibbons, A Flaxman
-
Re: Reliable Email, by
S Garriss, M Kaminsky, MJ Freedman, B Karp, et al
-
Collaborative blog spam filtering using adaptive percolation search
by S Han, Y Ahn, S Moon, H Jeong
-
MailRank: using ranking for spam detection, by
PA Chirita, J Diederich, W Nejdl
-
Collaborative Spam Filtering Using E-Mail Networks, by
JS Kong, BA Rezaei, N Sarshar, VP Roychowdhury,
An interesting add-on is to see whether the different social networks
(characterised by different edge types, where a type is face-to-face, e-mail,
telephone call, affiliation, etc.) are correlated with each other, or causal
of each other - and which is more effective as a spam predictor.
Along the way, various interesting statistics could be gathered - for example,
people who have more "friends" tend (allegedly) to be more linguistically
capable, so one could look at the content of their communication and estimate
their vocabulary size.
-
IDRM - an Inter-Domain Routing Protocol for MANETs
Contact: Jon Crowcroft
We have a whole slew of routing algorithms for mobile ad hoc networks (MANETs).
MANETs are next year's big thing, when radio devices start to interconnect
in-car entertainment systems. Then we'll have lots of separate MANETs and
we'll need to glue them together. We have a spec for just such a protocol. We
also have a simulated version, courtesy of IBM Research, but we don't have a
real implementation.
This project is to implement and evaluate IDRM on a collection of Windows
smartphones (with WiFi and other radios), laptops and fixed wireless nodes.
One challenge in this is to devise a suitable evaluation scenario.
A postdoc who was associated with the work last year will be around to help
out, as well as the proposer (see contact above!).
-
Peer-to-peer online Netflix (LoveFilm)
Contact: Jon Crowcroft
Net2netflix (name subject to change)
Discussion document
v0
Solution has 2 parts:
- delivery network: the plan is to use an off-the-shelf structured p2p
solution. Siddhu is looking at different approaches (BitTorrent, VidTorrent,
ZIGZAG for the delivery tree, etc.) and evaluating the pros and cons of each
one. An initial report is to be discussed around the end of September or so.
- superpeer overlay: where most of our "new stuff" will come into play,
allowing:
  - the deployment (push model) of superpeers on chosen XenoServers
  - the selection of "movie lists" by clients
  - admission control based on available resources, movies, and the
    probability of meeting the delivery guarantee for a given movie list
  - recommendations of alternative movies, etc.
  - prediction of delivery time based on available copies of each movie in
    the list and the estimated bandwidth to/from the nodes
  - management and monitoring of downloads, including the superpeer
    "stepping in" when it looks like a guarantee will not be met
The operation of the system could be summarised as follows:
- a client contacts the closest (or a close enough) superpeer, providing a
list of movies he or she wishes to view in the next few days
- the superpeer looks in its own cache of movies as well as in the peers
around it, to see which movies (and how many copies of each one) are
available in the client's vicinity
- the superpeer combines its own statistics with client-side measurements (?)
to come up with a prediction of the probability of a successful download
within a given time for the first few movies in the list, based on:
  - availability of movie copies in the superpeer's vicinity
  - availability of the movie in the superpeer's cache
  - bandwidth availability between the client and the peers that hold the movie
  - bandwidth availability between the superpeer and the client
  - ?
- based on the feasibility of transmitting one of the selected movies to the
client in the required time, and a "probability threshold", the superpeer's
admission control system accepts or denies the client's movie list request
- the superpeer monitors all downloads, and if a download looks like it is
not going to meet the delivery guarantee (in terms of time) it steps in,
allowing the movie to be downloaded directly from its own cache (at a
relatively predictable and stable throughput). This costs the superpeer
resources and is a "fallback" plan.
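The admission-control step can be sketched as a toy feasibility check. Every constant below is invented for illustration (the 4 GB movie size, the assumption that each nearby copy stays reachable independently with probability 0.5); a real superpeer would estimate these from its own statistics and client-side measurements.

```python
def admit(movie_list, availability, bandwidth_mbps, deadline_hours,
          size_gb=4.0, threshold=0.9):
    """Toy superpeer admission control: for each requested movie, estimate
    the probability of on-time delivery from the number of nearby copies and
    the available bandwidth, and accept the request only if every movie in
    the list clears the probability threshold."""
    def p_on_time(title):
        copies = availability.get(title, 0)
        if copies == 0:
            return 0.0
        # time to push the whole movie over the client's link, in hours
        transfer_hours = (size_gb * 8 * 1024) / (bandwidth_mbps * 3600)
        if transfer_hours > deadline_hours:
            return 0.0
        # assume each copy independently stays reachable with probability 0.5
        return 1 - 0.5 ** copies
    return all(p_on_time(t) >= threshold for t in movie_list)
```

Denied requests would then trigger the recommendation step (alternative movies with enough nearby copies) rather than an outright failure.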
could use PlanetLab for a start -- latency and throughput very stable
eric: there is a good chance we can do fairly ok
challenge 4: work around firewalls?
-
NetFPGA Storage Device
Contact: Andrew Moore
Network Attached Storage is not a new thing; commodity devices that
implement NAS are increasingly common. Ever wondered how they work? What
makes them hard and expensive (or cheap and easy)?
The aim of this project is to construct a simple NAS device using a dedicated
development board based upon the NetFPGA platform.
You will be provided with a development system: a Linux box with a NetFPGA
board and development tools. The NetFPGA board is equipped with SATA
connectors for attaching a disk system.
This project is very flexible and permits a student to make specific
decisions about the co-design of operating-system and hardware
components.
At one end of the scale, a network-attached disk may be a tiny PC,
implementing an entire NFS/CIFS system in software and accessing attached
disks in the same manner as a server. At the other end of the scale, to
achieve higher performance or use fewer components, a network-attached disk
may use a much more limited general-purpose CPU, implementing more of the
performance-critical tasks in FPGA hardware.
You will be expected to implement at least one file-system protocol,
e.g., NFSv2 over UDP, and demonstrate its functionality.
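If you choose NFSv2, the first milestone is speaking ONC RPC, the request/reply framing that every NFS call sits inside (RFC 1057). As a rough host-side illustration of the wire format (the project itself would push this parsing into Verilog, and this sketch ignores the variable-length credential and verifier fields that follow the fixed header):

```python
import struct

# Every ONC RPC call starts with six big-endian 32-bit words:
# xid, msg_type (0 = CALL), rpcvers (2), prog (100003 = NFS), vers, proc.
NFS_PROG = 100003

def parse_rpc_call(payload):
    """Decode the fixed part of an ONC RPC call header."""
    xid, msg_type, rpcvers, prog, vers, proc = struct.unpack(">6I", payload[:24])
    if msg_type != 0 or rpcvers != 2:
        raise ValueError("not an ONC RPC v2 call")
    return {"xid": xid, "prog": prog, "vers": vers, "proc": proc}
```

Dispatching on `prog`/`proc` (e.g. NFSv2 procedure numbers for LOOKUP, READ, WRITE) is the natural structure for the hardware/software split: parse and dispatch in the FPGA, handle slow paths in software.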
This project is based upon the NetFPGA platform - a board
specifically intended to allow the development of software/FPGA projects
for networking.
Confidence with a hardware compiler language (e.g. Verilog) is
mandatory, familiarity with C will also help greatly.
This project is advanced and should not be undertaken by a weak student.
-
Hardware (High-Speed) network-application identification
Contact: Andrew Moore
An active piece of work in the SRG has been on how to identify
applications from their network traffic...
This project would look to create a (high-speed, hardware)
implementation of what currently is a software prototype.
This project incorporates all the classic problems of
hardware/software co-design with decisions about code and algorithm
placement being critical in order to achieve best performance.
This project is based upon the NetFPGA platform - a board
specifically intended to allow the development of software/FPGA projects
for networking.
Familiarity with a hardware compiler language (e.g. Verilog) is
mandatory, familiarity with C/C++ will help a lot.
This project is advanced and should not be undertaken by a weak student.
-
NetFPGA implementation of flow-aware packet sampling mechanism
Contact: Andrew Moore
The aim of this project is to implement a flow-aware packet sampling
mechanism using a dedicated development board based upon the NetFPGA
platform.
The mechanism is already designed and a proof-of-concept implementation
written in C++ exists.
You will be provided a development system, a Linux box with NetFPGA
board and development tools.
You will be expected to implement the sampling mechanism and demonstrate
its functionality.
This project is based upon the NetFPGA platform - a board
specifically intended to allow the development of software/FPGA projects
for networking.
Familiarity with a hardware compiler language (e.g. Verilog) is
mandatory, familiarity with C/C++ will help a lot.
This project is advanced and should not be undertaken by a weak student.
-
Build a better Network monitor
Contact: Andrew Moore
Network monitoring hardware can be very expensive, requiring both
high-bandwidth I/O and high-reliability, high-resolution clocks.
The specific aim of this project is to build a networking monitoring
system based around the NetFPGA card.
The NetFPGA platform enables the development of novel and interesting
software/FPGA projects for networking. With four 1Gbps ports and
onboard SATA controllers, this system is very capable.
This includes examining the limits on network hardware's ability to maintain
accurate, high-reliability time-stamping. One aspect of this project is to
implement an existing time-stamp solution that sees common use in the
network-capture community but has previously required specialist hardware.
This project is based upon the NetFPGA platform - a board
specifically intended to allow the development of software/FPGA projects
for networking.
Comfort with a hardware compiler language (e.g. Verilog) is
mandatory, familiarity with C/C++ will also help.
-
NetFPGA Bright Idea
Contact: Andrew Moore
The NetFPGA platform enables the development of novel and interesting
software/FPGA projects for networking. With four 1Gbps ports this system is
very capable.
What is your bright idea?
This project is based upon the NetFPGA platform - a board
specifically intended to allow the development of software/FPGA projects
for networking.
Familiarity with a hardware compiler language (e.g. Verilog) is
mandatory.
-
NetFPGA implementation of MOOSE
Contact: Malcolm Scott
MOOSE is an extension to Ethernet being developed in the lab which adds hierarchical MAC addressing and address rewriting to improve its scalability. An implementation of a MOOSE switch on a NetFPGA board (see above) would be an interesting project. This should be implemented in a similar way to fast, high-end Ethernet switches — i.e. a true hardware switching fabric, not a software switch.
This would suit a fairly strong student with a good knowledge of Verilog, Linux tools and network protocols. Switch fabric design will be taught in Digital Communication II but you are likely to need to read up on this before the course.
-
Simulation of MOOSE
Contact: Malcolm Scott
MOOSE is an extension to Ethernet being developed in the lab which adds hierarchical MAC addressing and address rewriting to improve its scalability. A study of MOOSE in simulation on a large network does not yet exist, and would be a useful resource.
It is very strongly recommended to make use of an existing widely-used simulation tool; writing your own introduces considerable extra requirements for validation and testing and will take much longer to produce credible results. However, it will be necessary to adapt the tool you choose in order to add an implementation of MOOSE.
This is likely to require fluency in C or other languages depending on your choice of simulator.
-
BitTorrent for Mobile Content Sharing
Contact: Cecilia Mascolo
We have developed an approach to content sharing which allows users to share the content of their Bluetooth devices with peers they meet, for instance on public transport, in coffee shops, etc.
This project relates to the expansion of this tool with some ideas taken from BitTorrent: i.e., download sessions need not transfer full files but could allow the download of chunks.
The project will involve the development of this extension both for a mobile phone implementation and for a simulator called OMNeT++, on which large-scale performance can be tested.
Documentation on this project is available upon request (other extensions, such as trust related issues can also be added).
-
Application of the Cannon and Fox-Otto algorithms to efficient parallel
route planning methods
Lab Contact: Andrew Moore
Originator: Jagdish Modi, email: at89@dial.pipex.com
Special Resources: None
Various investigations have been made into parallel algorithms for
fastest-route determination, in particular (for example [1]) for journeys
between London Underground stations, based on the available travel
information. These involve a sequence of operations on route matrices. For
efficient computation it is useful both to avoid large sparse areas and to
minimize data movements.
One approach could be to modify the Cannon algorithm or the Fox-Otto
algorithm, which work on sub-blocks and are very space efficient. The use
of Cannon's algorithm, for example, avoids the need for a single processor
to collect and redistribute the data before and after each operation, and
is therefore efficient in both time and space. The Fox-Otto algorithm is
similar: for a matrix product A*B, each sub-block of matrix A is
broadcast across each row of processors and then multiplied with each
sub-block of matrix B. The B blocks are then rotated upwards and another
row broadcast is performed.
Developments might include:
- novel ways of using an analogue of matrix squaring instead of successive
multiplications, to reduce the total number of steps, and incorporating the
path specification for all paths simultaneously
- investigating different data layout algorithms taking account of the
processor topology, and their effect on the overall performance; a layout
algorithm that takes an arbitrary topology and returns an optimal data
layout would be of interest
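The matrix-squaring idea can be sketched in a few lines using min-plus products of route matrices (sequential Python purely for exposition; the project would perform these products block-wise across processors using Cannon or Fox-Otto):

```python
INF = float("inf")

def min_plus(A, B):
    """Min-plus 'multiplication' of route matrices: entry (i, j) is the
    cheapest i -> j journey going through one intermediate node k."""
    n = len(A)
    return [[min(A[i][k] + B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def all_pairs(W):
    """Repeated squaring: about log2(n) min-plus squarings yield all-pairs
    shortest journey times, instead of n - 1 successive multiplications."""
    D, steps = W, 1
    while steps < len(W) - 1:
        D = min_plus(D, D)
        steps *= 2
    return D
```

Because min-plus products have the same data-movement pattern as ordinary matrix multiplication, the Cannon/Fox-Otto block schedules carry over unchanged.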
Parallel computers have as a core structure a set of interconnected
processors. The optimal choice of interconnection for a particular problem
may not always be available. In fact, in practice the interconnection
usually takes place within some sort of plane two-dimensional lattice. In
the preparation of the parallel algorithm, it will be interesting to
explore the features of the two-dimensional graph arising from the
connections of stations (nodes) and the travel times between stations
(edges), and to compare this with the lattice of processor connectivity. If
the student is mathematically inclined, it may be worthwhile to investigate
certain properties possessed by the two-dimensional graph, such as degree
of symmetry and homogeneity, as well as the extent to which they can be
embedded onto a lattice.
In addition it may be useful to implement the algorithm on a random
topology of interconnected processors, evaluate the cost component, and
then to compare this with the cost on a lattice connectivity, so as to
determine the extent to which interconnection between processors plays an
important role. In this case it would be instructive to find out what other
attributes of the parallel machine have a bearing on the overall cost.
References:
[1] Reynolds, M. Improved Parallel Route Planning for the London Underground. Computer Science Tripos, Part II, Computer Laboratory, 2008.
[2] Cormen, Leiserson, Rivest, Stein. Introduction to Algorithms, second edition, 2003.
[3] Modi, J. Parallel Algorithms and Matrix Computation, OUP, 1988.
[4] Grama, Gupta, Karypis, Kumar. Introduction to Parallel Computing, second edition, Pearson Education, 2005.
[5] Akpan, O. Efficient Parallel Implementation of the Fox Algorithm. Computer Science Department, Bowie State University, 2003.
- Optimising object placement in a client-server distributed system
Contact: Stephen Kell
The goal of this project is to develop a client-server distributed system which monitors and
optimises itself with respect to the placement of its working storage and/or computation. In other
words, it shifts computation (threads) and state (objects) from client to server or vice-versa, in
order to improve overall performance. It must do this across variations in latency and throughput of
the link between client and server, and amid disparity of computational resources available at each.
If you've ever tried using a thin-client protocol (say an X11 application, or even a web
application) over a high-latency link, you will understand why this could be useful.
Usually the partitioning of computation and storage between the client and server locations is fixed. Yet users typically don't care where computation happens, nor where working storage lives. (They do care, if often only a little, about where persistent storage lives.) Performance of the system depends mostly on the locations of computation and working storage, both because of the differences in available resources and because of the available communication throughput and latency (and jitter, etc.).
To take on this project, you will need a very strong set of practical skills, and should already have ideas about how you would go about implementing such a system. For example, you might take an X11 client and server, and build a runtime monitoring and migration service which optimises the system by moving state (objects) or computations (threads; harder!) between client and server. One option would then be to reengineer the communication protocol to use an object middleware like Java RMI, and extend the JVM to support migration of objects (i.e. atomically replacing a local object with a stub which hands off to a migrated copy of the original object on a remote system). However, you should expect to bring your own ideas to the problem.
Experimental evaluation is fairly straightforward with this project. There are also (if you do it well!) lots of interesting algorithms involved. Difficulty: hard!
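The self-optimisation needs an explicit cost model to drive placement decisions. The one below is a deliberately crude sketch: every parameter (GFLOP counts, link speed, data sizes) is an invented stand-in for what the runtime monitor would actually measure, and contention and jitter are ignored.

```python
def best_location(input_mb, output_mb, work_gflop,
                  client_gflops, server_gflops, link_mbps, rtt_ms):
    """Estimate wall-clock seconds for running a task at the client, versus
    shipping its input to the server, computing there, and shipping the
    result back; return the cheaper location."""
    at_client = work_gflop / client_gflops
    transfer = (input_mb + output_mb) * 8 / link_mbps + 2 * rtt_ms / 1000
    at_server = transfer + work_gflop / server_gflops
    return "client" if at_client <= at_server else "server"
```

The interesting systems work is everything this omits: amortising migration cost over repeated invocations, reacting to changing link conditions, and the state consistency problems migration creates.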
-
A self-organising file system
Contact: Stephen Kell
If you're anything like me, your file system tends to build up collections of assorted downloads which ideally you'd like to sort through and file somewhere appropriate in a directory structure. You probably also have some files which you misfiled for various reasons. It'd be neat if a tool could learn the rules underlying your individual filing system, then automatically file stuff as it comes in and point out exceptions to the rules caused by misfiling.
There are several potentially interesting aspects to this: learning rules based on a
manually-classified training set; expressing rules in some sort of logic; implementing
interesting predicates in that logic by analysing files in a semantically aware way.
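The rule-learning aspect could start from something as simple as a naive Bayes classifier over filename tokens. This is a stand-in sketch only; the semantically aware analysis described above (looking inside files, NLP, audio/image recognition) would supply far richer predicates than the filename.

```python
import math
import re
from collections import Counter, defaultdict

def tokens(name):
    return re.findall(r"[a-z0-9]+", name.lower())

class FilingModel:
    """Multinomial naive Bayes: predict a target directory from the tokens
    of a filename, trained on a manually-classified set of examples."""
    def __init__(self):
        self.word_counts = defaultdict(Counter)  # directory -> token counts
        self.dir_counts = Counter()              # directory -> #examples
        self.vocab = set()

    def train(self, examples):                   # examples: [(filename, dir)]
        for name, d in examples:
            ws = tokens(name)
            self.word_counts[d].update(ws)
            self.dir_counts[d] += 1
            self.vocab.update(ws)

    def classify(self, name):
        def logp(d):
            # add-one smoothing so unseen tokens don't zero the probability
            total = sum(self.word_counts[d].values()) + len(self.vocab)
            return (math.log(self.dir_counts[d]) +
                    sum(math.log((self.word_counts[d][w] + 1) / total)
                        for w in tokens(name)))
        return max(self.dir_counts, key=logp)
```

Files whose predicted directory disagrees strongly with their actual location are exactly the "exceptions caused by misfiling" the tool should flag.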
Although I picked the file system as the target domain (since it's what I find hardest to keep
organised!), you could possibly do something similar with mail, bookmarks, and so on. However, the
idea is to go beyond simple syntactic rules on available metadata: you should aim to support
reasonably complex rules and sophisticated semantic discrimination, and this probably makes file
systems the most suitable domain. For example, in my file system I make several distinctions that
aren't captured simply by file name or type: distinctions among written articles based on what
they're about, distinctions among music based on whether it's from a record I own (i.e. has a
corresponding entry in my cddb cache), and so on. You might support these and other kinds of
semantic awareness using NLP techniques (using named entity recognition, perhaps) or image
recognition techniques, or music recognition techniques, and so on.
To take on this project, you should really be someone who has a need for such a system! In other
words you should have your own large file system to work with, and most importantly, some
non-trivial filing policies to implement. Given these, a thorough evaluation of the system will
involve measuring the accuracy of classification achieved by your system, and should be pretty easy.
Overall difficulty: moderate.
-
Visualisation of software
Contact: Stephen Kell
Visualisation of software is useful for a number of reasons: comprehension, prototyping, debugging,
profiling and so forth. Various projects are possible which would implement and evaluate a tool for
visualising software in some particular way, aimed towards one or other of these goals.
Specific example problems include: visualising dynamic structure of programs at the object level
(most likely for debugging); visualising static structure at the module level (for comprehension or
prototyping); visualising communication patterns within execution traces (for debugging or
profiling). The hardest part in all cases is making the visuals comprehensible to the viewer:
usually there will be too much data, and/or too dense an interconnection structure, for these to be
useful without further processing. For example, in the case of static graph-based abstractions, such
as call graphs or linkage graphs, usually the number of edges is a problem, since it makes the
layout completely incomprehensible; there are many interesting techniques for clustering nodes,
grouping related edges, and eliminating uninformative ones. Three-dimensional
visualisations are also a very interesting avenue, one which has been less
well explored and can possibly allow more of the raw complexity to be
usefully retained.
Evaluation of the effectiveness of a visualisation is slightly tricky, but there are a number of
reasonable proxy metrics such as reduction in size (number of nodes/edges in the graph), reduction
in edge crossings (in a two-dimensional graph layout), accuracy of approximation (i.e. how well your
edge grouping/clustering has preserved the information in the original graph). Difficulty: depends
on what specific project you choose, but probably moderate to hard.
-
A quantitative analysis of qualitative paging algorithms
Contact: Ripduman Sohan
The past decade has witnessed the introduction of a number of
different page replacement algorithms (2Q, ARC, MQ, etc). Although
these have been implemented in simulation and in bespoke OSes, no real
implementations exist for general-purpose OSes. Furthermore, no clear
data points exist comparing the algorithms against each other in a
systematic manner.
This project may be approached in various dimensions:
- A basic approach would implement a number of common algorithms
using a simple software framework and quantitatively compare them
against run-time traces obtained from real-world data-sets.
- A more advanced project would attempt to validate the current
Linux paging algorithm (which uses heuristics to move pages to the
front of the LRU list) against an emulator where all memory can be
tracked to determine opportunities for improvement.
- The most advanced project would attempt to change the Linux paging
subsystem to create a framework where it is possible to dynamically
change the paging algorithm in use, similarly to the way the disk
scheduler can be changed.
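The "basic approach" can start from something as small as fault counters driven by a page-reference trace; 2Q, ARC and MQ would then slot into the same harness. A sketch with only the two classic baselines shown:

```python
from collections import OrderedDict, deque

def lru_faults(trace, frames):
    """Count page faults for classic LRU replacement on a reference trace."""
    cache, faults = OrderedDict(), 0
    for page in trace:
        if page in cache:
            cache.move_to_end(page)          # refresh recency on a hit
        else:
            faults += 1
            if len(cache) == frames:
                cache.popitem(last=False)    # evict the least recently used
            cache[page] = True
    return faults

def fifo_faults(trace, frames):
    """Count page faults for FIFO replacement on the same trace."""
    q, resident, faults = deque(), set(), 0
    for page in trace:
        if page not in resident:
            faults += 1
            if len(q) == frames:
                resident.discard(q.popleft())
            q.append(page)
            resident.add(page)
    return faults
```

Feeding both functions real-world traces (and adding the published algorithms as further functions with the same signature) gives exactly the systematic head-to-head comparison the proposal says is missing.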
Prereq Knowledge: Linux, C, kernel knowledge useful.
-
Distributed shared memory over ethernet
Contact: Ripduman Sohan
The goal of this project is to provide a simple load-store abstraction
that processes can use for inter-host communication. Ideally, a process
would "export" a region of its address space, and processes running on
remote hosts could "import" this region and simply read and write to it,
giving a simple, straightforward communication mechanism. The kernel does
all the heavy lifting by intercepting and synchronising access to the
shared memory region.
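Before diving into the kernel, the import/export idea can be prototyped in user space as a tiny message protocol, with remote loads and stores becoming request/response messages. Everything below is invented for illustration (the opcode layout, function names); the real project would trap page accesses in the kernel instead of calling functions explicitly.

```python
import socket
import struct
import threading

# Illustrative wire format: op (0 = load, 1 = store), offset, length,
# then `length` payload bytes for a store.

def serve_region(sock, region):
    """Export `region` (a bytearray): answer remote load/store requests."""
    while True:
        hdr = sock.recv(9)
        if len(hdr) < 9:
            return                              # peer closed the connection
        op, off, length = struct.unpack(">BII", hdr)
        if op == 0:                             # remote load
            sock.sendall(bytes(region[off:off + length]))
        else:                                   # remote store
            data = sock.recv(length)
            region[off:off + len(data)] = data
            sock.sendall(b"\x01")               # acknowledge the write

def remote_store(sock, off, data):
    sock.sendall(struct.pack(">BII", 1, off, len(data)) + data)
    sock.recv(1)                                # wait for the ack

def remote_load(sock, off, length):
    sock.sendall(struct.pack(">BII", 0, off, length))
    return sock.recv(length)
```

The kernel version replaces the explicit `remote_load`/`remote_store` calls with page-fault interception, which is precisely what makes the abstraction transparent to applications.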
Prereq Knowledge: Linux, C, kernel knowledge useful.
A previous version of this project (2008) won the Data Connection project prize!
-
User-space process checkpointing (save/restore)
Contact: Ripduman Sohan
I would very much like to be able to checkpoint a process to disk
and restore its state at some later time. Previous projects have
achieved this but all have required some form of advanced kernel
modification. It should be possible to checkpoint entirely in
user-space for at least a subset of processes out there (with all
processes being possible with minor kernel modifications).
Prereq knowledge: C, knowledge of Linux process management useful.