Advanced Operating Systems
Principal lecturer: Dr Robert Watson
Taken by: MPhil ACS, Part III
Code: L41
Hours: 16
Prerequisites: Undergraduate Operating Systems course. See the full prerequisites on the syllabus page.
Aims
Systems research refers to the study of a broad range of behaviours arising from complex system design, including: low-level operating systems; resource sharing and scheduling; interactions between hardware and software; network-protocol design and implementation; separation of mutually distrusting parties on a common platform; and control of distributed-system behaviours such as concurrency and data replication. This module will:
- Teach systems-analysis methodology and practice through tracing and performance profiling experiments;
- Expose students to real-world systems artefacts such as OS schedulers and network stacks, and consider their hardware-software interactions with CPUs;
- Develop scientific writing skills through a series of laboratory assignments; and
- Assign a selection of original research papers to give insight into potential research topics and approaches.
The teaching style will blend lectures and hands-on labs that teach methodology, design principles, and practical skills. Students will be taught through (and assessed via) a series of lab assignments based on in- and out-of-classroom practical work. The systems studied are real, and all wires will be live.
Prerequisites
It is strongly recommended that students:
- Have previously (and successfully) completed an undergraduate operating-system course -- or have equivalent experience through project or open-source work.
- Have reasonable comfort with the C and Python programming languages. C is the primary implementation language for systems that we will analyse, requiring reading fluency; userspace C programs will also be written and extended as part of lab exercises. Python will be used as our data-processing language, and provides useful tools for data analysis and presentation.
- Review an undergraduate operating-system textbook (such as the 'Dinosaur Book') to ensure that basic OS concepts such as the process model, inter-process communication, filesystems, network stacks, and virtual memory are familiar.
Syllabus
The sessions are split up into three submodules:
- Introduction to kernels and kernel tracing/analysis
  The purpose of this submodule is to introduce students to the structure of a contemporary operating system kernel through tracing and profiling.
  - Lecture 1 part 1: Introduction: OSes, Systems Research (1h)
  - Lecture 1 part 2: TBC (1h)
  - Lecture 1 part 3: TBC (1h)
  - Lecture 2: Kernels and Tracing (1h)
  - Lab 1: I/O
  - Deliverable: Lab Assignment 1 - I/O
 
- Processors, processes, and threads
  This submodule introduces students to concrete implications of the UNIX process model: processes and threads in both userspace and kernelspace, the hardware foundations for kernel and process isolation, system calls, and traps.
  - Lecture 3: The Process Model - 1 (1h)
  - Lecture 4: The Process Model - 2 (1h)
  - Lab 2: IPC
  - Deliverable: Lab Assignment 2 - IPC
 
- TCP/IP
  This submodule introduces students to a contemporary, multithreaded, multiprocessing network stack, with a particular interest in the TCP protocol. Labs will consider the behaviour of a single TCP connection, exploring the TCP state machine, socket-buffer interactions with flow control, and TCP congestion control. Students will use DUMMYNET to simulate network latency and explore how TCP slow start and congestion avoidance respond to network conditions. The second marked lab assignment will be written as part of this submodule.
  - Lecture 5: The Network Stack (1) (1h)
  - Lecture 6: The Network Stack (2) (1h)
  - Lab 3: TCP/IP
  - Deliverable: Lab Assignment 3 - TCP/IP
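
As a rough illustration of the slow-start and congestion-avoidance behaviour mentioned in the TCP/IP submodule, the following Python sketch traces an idealised congestion window per round trip. It is a toy model under stated assumptions (no loss, window counted in whole segments, a fixed slow-start threshold), not the FreeBSD stack's implementation; the function name `cwnd_trace` and its parameters are hypothetical.

```python
def cwnd_trace(rtts, ssthresh, initial_cwnd=1):
    """Return the congestion window (in segments) at the start of each RTT.

    Idealised model, assuming no packet loss:
      - slow start: the window doubles each RTT until it reaches ssthresh;
      - congestion avoidance: the window then grows by one segment per RTT.
    """
    cwnd = initial_cwnd
    trace = []
    for _ in range(rtts):
        trace.append(cwnd)
        if cwnd < ssthresh:
            # Slow start: exponential growth, capped at the threshold.
            cwnd = min(cwnd * 2, ssthresh)
        else:
            # Congestion avoidance: linear growth.
            cwnd += 1
    return trace


if __name__ == "__main__":
    for rtt, window in enumerate(cwnd_trace(rtts=8, ssthresh=16)):
        print(f"RTT {rtt}: cwnd = {window} segments")
```

In this model, a longer round-trip time (such as one introduced with DUMMYNET) does not change the sequence of window sizes, but it stretches the wall-clock time taken to reach each one.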
 
Objectives
On completion of this module, students should:
- Have a good understanding of high-level OS kernel structure
- Have gained insight into hardware-software interactions for compute and I/O
- Have practical skills in system tracing and performance analysis
- Have been exposed to research ideas in system structure and behaviour
- Have learned how to write systems-style performance evaluations
Recommended reading
Primary module texts
Course texts provide instruction on statistics, operating-system design and implementation, and system tracing. You will be asked to read selected chapters from these, but will likely find other content in them useful as you proceed with the labs.
Marshall Kirk McKusick, George V. Neville-Neil, and Robert N. M. Watson. The Design and Implementation of the FreeBSD Operating System, 2nd Edition, Pearson Education, Boston, MA, USA, September 2014.
Brendan Gregg and Jim Mauro. DTrace: Dynamic Tracing in Oracle Solaris, Mac OS X and FreeBSD, Prentice Hall Press, Upper Saddle River, NJ, USA, April 2011.
Raj Jain. The Art of Computer Systems Performance Analysis: Techniques for Experimental Design, Measurement, Simulation, and Modeling, Wiley-Interscience, New York, NY, USA, April 1991.
Additional texts
Abraham Silberschatz, Peter Baer Galvin, and Greg Gagne. Operating System Concepts, Eighth Edition, John Wiley and Sons, Inc., New York, NY, USA, July 2008.
Brendan Gregg. Systems Performance: Enterprise and the Cloud, Prentice Hall Press, Upper Saddle River, NJ, USA, October 2013.
Research-paper readings
Research-paper readings will be announced as the term proceeds, but will likely include original papers on BPF, DTrace, OS scheduling, OS scalability, network stacks, and systems modelling.
Coursework
Students will write and submit three lab reports to be marked by the instructor. The 'Practice report' will be used to develop lab-report structure, content, and presentation, and contributes 10% to the final mark. The remaining two reports are marked and assessed, each constituting 45% of the final mark.
Practical work
Five 2-hour in-classroom labs will ask students to develop and use skills in tracing and performance analysis as applied to real-world systems artefacts. Results from these labs (and follow-up work by students outside of the classroom) will be the primary input to lab reports.
Typical labs will involve using tracing and profiling to characterise specific behaviours (e.g., file I/O in terms of system calls and traps) as well as performing root-cause analysis of application-level behaviours (e.g., exploring network clients and servers to better understand real-world TCP behaviour).
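
To give a flavour of characterising file I/O in terms of system calls, here is a minimal Python sketch that reads a file with different buffer sizes and reports how many `read` calls were issued and the resulting throughput. It is not taken from the lab materials: the helper names `timed_read` and `make_test_file` are hypothetical, a POSIX-like system is assumed, and userspace timing is only a stand-in for the kernel tracing used in the labs.

```python
import os
import tempfile
import time


def timed_read(path, bufsize):
    """Read a whole file with os.read(); return (bytes, calls, seconds)."""
    total = 0
    calls = 0
    fd = os.open(path, os.O_RDONLY)
    start = time.perf_counter()
    try:
        while True:
            chunk = os.read(fd, bufsize)  # one read system call per iteration
            calls += 1
            if not chunk:  # a zero-length read signals end of file
                break
            total += len(chunk)
    finally:
        elapsed = time.perf_counter() - start
        os.close(fd)
    return total, calls, elapsed


def make_test_file(size):
    """Create a temporary file of `size` zero bytes and return its path."""
    fd, path = tempfile.mkstemp()
    with os.fdopen(fd, "wb") as f:
        f.write(b"\0" * size)
    return path


if __name__ == "__main__":
    size = 4 * 1024 * 1024
    path = make_test_file(size)
    try:
        for bufsize in (512, 4096, 65536, 1024 * 1024):
            nbytes, calls, secs = timed_read(path, bufsize)
            rate = nbytes / max(secs, 1e-9) / 1e6  # guard against a zero interval
            print(f"bufsize={bufsize:>8} calls={calls:>6} throughput={rate:10.1f} MB/s")
    finally:
        os.unlink(path)
```

Smaller buffers mean more system calls for the same amount of data, so per-call overhead tends to dominate; the freshly written file will usually be served from the buffer cache, so this measures call overhead rather than disk performance.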
The module lecturer will give a short introductory lecture at the start of each lab, and instructors will be on-hand throughout labs to provide assistance. Lab participation is not directly included in the final mark, but lab work is a key input to lab reports that are assessed.
Assessment
Three lab reports:
- Practice report (10%)
- Report 1 (45%)
- Report 2 (45%)
Further Information
Due to COVID-19, the method of teaching for this module will be adjusted to cater for physical distancing and students who are working remotely. We will confirm precisely how the module will be taught closer to the start of term.
This course is borrowed from Part II of the Computer Science Tripos. As such, assessment will be adjusted to an appropriate level for those enrolled for Part III of the Tripos or the MPhil in Advanced Computer Science. Further information about assessment and practicals will follow at the first lecture.