Department of Computer Science and Technology

Masters

 

Course pages 2022–23

Advanced Operating Systems

Principal lecturer: Prof Robert Watson
Taken by: MPhil ACS, Part III
Code: L41
Term: Lent
Hours: 16
Format: In-person lectures
Class limit: max. 18 students
Prerequisites: Undergraduate Operating Systems course. See the full prerequisites on the syllabus page.

Aims

Systems research refers to the study of a broad range of behaviours arising from complex system design, including: low-level operating systems; resource sharing and scheduling; interactions between hardware and software; network-protocol design and implementation; separation of mutually distrusting parties on a common platform; and control of distributed-system behaviours such as concurrency and data replication. This module will:

  1. Teach systems-analysis methodology and practice through tracing and performance profiling experiments;
  2. Expose students to real-world systems artefacts such as I/O, IPC, and network-stack implementations, and consider their hardware-software interactions with CPUs;
  3. Develop scientific experimentation, analysis and presentation skills through a series of laboratory assignments; and
  4. Assign a selection of original research papers to give insight into potential research topics and approaches.

The teaching style will blend lectures and hands-on labs that teach methodology, design principles, and practical skills. Students will be taught through (and assessed via) a series of lab assignments based on practical work. The systems studied are real, and all wires will be live.

Prerequisites

It is strongly recommended that students:

  1. Have previously (and successfully) completed an undergraduate operating-system course -- or have equivalent experience through project or open-source work.
  2. Have reasonable comfort with the C and Python programming languages. C is the primary implementation language for systems that we will analyse, requiring reading fluency; userspace C programs will also be written and extended as part of lab exercises. Python will be used as our data-collection and processing language, and provides useful tools for data analysis and presentation.
  3. Review an undergraduate operating-system textbook (such as the 'Dinosaur Book') to ensure that basic OS concepts such as the process model, inter-process communication, filesystems, network stacks, and virtual memory are familiar.

Syllabus

The sessions are split up into three submodules:

  1. Introduction to kernels and kernel tracing/analysis

    The purpose of this submodule is to introduce students to the structure of a contemporary operating system kernel through tracing and profiling.

    • Lecture 1: Introduction: OSes and this course (1h)
    • Lecture 2: Kernels and Tracing (1h)
    • Lecturelet 1: I/O Lab (30m)
    • Lab 1: I/O (2x2h lab sessions, if in person; otherwise short 1:1 supervisions)
    • Deliverable: Lab Assignment 1 - I/O
  2. Processors, processes, and threads

    This submodule introduces students to concrete implications of the UNIX process model: processes and threads in both userspace and kernelspace, the hardware foundations for kernel and process isolation, system calls, and traps.

    • Lecture 3: The Process Model - 1 (1h)
    • Lecture 4: The Process Model - 2 (1h)
    • Lecturelet 2: IPC Lab (30m)
    • Lab 2: IPC (2x2h lab sessions, if in person; otherwise short 1:1 supervisions)
    • Deliverable: Lab Assignment 2 - IPC
  3. TCP/IP

    This submodule introduces students to a contemporary, multithreaded, multiprocessing network stack, with a particular focus on TCP. Labs will consider the behaviour of a single TCP connection, exploring the TCP state machine, socket-buffer interactions with flow control, and TCP congestion control. Students will use DUMMYNET to simulate network latency and explore how TCP slow start and congestion avoidance respond to network conditions. The second marked lab assignment will be written.

    • Lecture 5: The Network Stack (1) (1h)
    • Lecture 6: The Network Stack (2) (1h)
    • Lecturelet 3: TCP/IP Lab (30m)
    • Lab 3: TCP/IP (2x2h lab sessions, if in person; otherwise short 1:1 supervisions)
    • Deliverable: Lab Assignment 3 - TCP/IP
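
As a rough illustration of the slow start and congestion avoidance behaviour mentioned in the TCP/IP submodule, the sketch below models the congestion window per round trip. It is not taken from the course materials: the function name `simulate_cwnd`, its parameters, and the Tahoe-like reaction to loss (reset to the initial window) are all assumptions made for illustration, and a real stack is considerably more involved.

```python
def simulate_cwnd(rounds, ssthresh, loss_rounds=frozenset(), initial_cwnd=1):
    """Return the congestion window (in segments) at the start of each RTT.

    Simplified, hypothetical model: one update per round trip, losses
    supplied by the caller, Tahoe-like restart after a loss.
    """
    cwnd = initial_cwnd
    history = []
    for rtt in range(rounds):
        history.append(cwnd)
        if rtt in loss_rounds:
            # Loss: halve the threshold (floor of 2) and restart slow start.
            ssthresh = max(cwnd // 2, 2)
            cwnd = initial_cwnd
        elif cwnd < ssthresh:
            # Slow start: the window doubles each RTT, capped at ssthresh.
            cwnd = min(cwnd * 2, ssthresh)
        else:
            # Congestion avoidance: the window grows by one segment per RTT.
            cwnd += 1
    return history


if __name__ == "__main__":
    print(simulate_cwnd(8, ssthresh=8))
```

With `ssthresh=8` and no losses, the window doubles until it reaches the threshold and then grows linearly; passing a set of round numbers as `loss_rounds` shows the restart after a loss.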

Objectives

On completion of this module, students should:

  • Have a good understanding of high-level OS kernel structure
  • Have gained insight into hardware-software interactions for compute and I/O
  • Have practical skills in system tracing and performance analysis
  • Have been exposed to research ideas in system structure and behaviour
  • Have learned how to perform systems-style performance evaluations
  • Have learned how to present systems evaluation results

Assessment

  • Exercise 1: Getting started with tracing 10%
    • This starting exercise gets students working with DTrace on their RPi4 through some simple interactive exercises.

  • Exercise 2: I/O 20%
    • This second exercise asks students to analyse I/O performance, exploring specific performance behaviour using their newly developed tracing skills.

  • Exercise 3: IPC 70%
    • This third exercise asks students to analyse IPC performance, exploring specific performance behaviour using their newly developed analysis skills.

Recommended reading

Primary module texts

Course texts provide instruction on statistics, operating-system design and implementation, and system tracing. You will be asked to read selected chapters from these, but will likely find other content in them useful as you proceed with the labs.

Marshall Kirk McKusick, George V. Neville-Neil, and Robert N. M. Watson. The Design and Implementation of the FreeBSD Operating System, 2nd Edition, Pearson Education, Boston, MA, USA, September 2014.

Brendan Gregg and Jim Mauro. DTrace: Dynamic Tracing in Oracle Solaris, Mac OS X and FreeBSD, Prentice Hall Press, Upper Saddle River, NJ, USA, April 2011.

Additional texts

Abraham Silberschatz, Peter Baer Galvin, and Greg Gagne, Operating System Concepts, Eighth Edition, John Wiley and Sons, Inc., New York, NY, USA, July 2008.

Brendan Gregg. Systems Performance: Enterprise and the Cloud, Prentice Hall Press, Upper Saddle River, NJ, USA, October 2013.

Research-paper readings

Research-paper readings will be announced as the term proceeds, but will likely include original papers on BPF, DTrace, OS scheduling, OS scalability, network stacks, and systems modelling.

Coursework

Students will write and submit three lab reports to be marked by the instructor. The 'Practice report' will be used to develop lab-report structure, content, and presentation, and contributes 10% to the final mark. The remaining two reports are marked and assessed.

  • Exercise 1: Getting started with tracing 10%
    • This starting exercise gets students working with DTrace on their RPi4 through some simple interactive exercises.

  • Lab Report 1: I/O 20%
    • This first lab assignment asks students to analyse I/O performance, exploring specific performance behaviour using their newly developed tracing skills.

  • Lab Report 2: IPC 70%
    • This second lab assignment asks students to analyse IPC performance, exploring specific performance behaviour using their newly developed analysis skills.

Practical work

Five 2-hour in-classroom labs will ask students to develop and use skills in tracing and performance analysis as applied to real-world systems artefacts. Results from these labs (and follow-up work by students outside of the classroom) will be the primary input to lab reports.

Typical labs will involve using tracing and profiling to characterise specific behaviours (e.g., file I/O in terms of system calls and traps) as well as performing root-cause analysis of application-level behaviours (e.g., exploring network clients and servers to better understand real-world TCP behaviour).
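
To give a flavour of the kind of post-processing such a characterisation might involve, here is a minimal Python sketch that groups per-system-call latency samples and reports simple statistics. It is only an illustration under stated assumptions: the function name `summarise_latencies` and the `(syscall_name, latency_ns)` input shape are hypothetical, and the labs may collect and structure trace data quite differently.

```python
from collections import defaultdict
from statistics import median


def summarise_latencies(samples):
    """Group (syscall_name, latency_ns) pairs into per-syscall statistics.

    The input format is an assumption made for this sketch; it is not a
    format defined by the course or by any particular tracing tool.
    """
    by_call = defaultdict(list)
    for name, latency_ns in samples:
        by_call[name].append(latency_ns)
    return {
        name: {
            "count": len(values),
            "median_ns": median(values),
            "max_ns": max(values),
        }
        for name, values in by_call.items()
    }
```

Reporting the median alongside the maximum is a deliberate choice here, since latency distributions in systems measurements are often skewed and a mean alone can hide outliers.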

The introductory lectures for labs will be pre-recorded for students to watch before the lab session. During each lab, instructors will be on hand to provide assistance. Lab participation is not directly included in the final mark, but lab work is a key input to the lab reports that are assessed.

Further Information

Due to infectious respiratory diseases, the method of teaching for this module may be adjusted to cater for physical distancing and students who are working remotely. Unless otherwise advised, this module will be taught in person.

Current Cambridge undergraduate students who are continuing on to Part III or the MPhil in Advanced Computer Science may only take this module if they did NOT take it as a Unit of Assessment in Part II.

This module is shared with Part II of the Computer Science Tripos. Assessment will be adjusted for the two groups of students to be at an appropriate level for whichever course the student is enrolled on. Further information about assessment and practicals will follow at the first lecture.