Computer Laboratory

Course pages 2014–15

Advanced Operating Systems

Principal lecturer: Dr Robert Watson
Taken by: MPhil ACS, Part III
Code: L41
Hours: 16 (6 one-hour lectures, 5 two-hour practical labs)
Prerequisites: Undergraduate Operating Systems course; please see syllabus for further details

Aims

Systems research refers to the study of a broad range of behaviours arising from complex system design, including: low-level operating systems; resource sharing and scheduling; interactions between hardware and software; network-protocol design and implementation; separation of mutually distrusting parties on a common platform; and control of distributed-system behaviours such as concurrency and data replication. This module will:

  1. teach systems-analysis methodology and practice through tracing and performance profiling experiments;
  2. expose students to real-world systems artefacts such as OS schedulers and network stacks, and consider their hardware-software interactions with CPUs and network-interface cards;
  3. develop scientific writing skills through a series of laboratory reports; and
  4. assign a selection of original research papers in these areas in order to gain insight into potential research topics and approaches.

The teaching style will blend lectures and hands-on labs that teach methodology, design principles, and practical skills. Students will be taught about (and assessed via) a series of lab-report-style assignments based on in- and out-of-classroom practical work. The systems studied are real, and all wires will be live.

Prerequisites

It is strongly recommended that students:

  1. Have previously (and successfully) completed an undergraduate operating-system course with practical content -- or have equivalent experience through project or open-source work.
  2. Have reasonable comfort with the C and Python programming languages. C is the primary implementation language for systems that we will analyse, requiring reading fluency; userspace C programs will also be written and extended as part of lab exercises. Students without a Python background will wish to complete an online Python tutorial prior to term, as the language will be used in data collection, analysis, and presentation from the first lab.
  3. Review an undergraduate operating-system textbook (such as the 'Dinosaur Book') and the first four chapters of 'The Design and Implementation of the FreeBSD Operating System' prior to term. Students will also benefit from skimming the DTrace book prior to the start of the first lab.
  4. Be comfortable with the UNIX command-line environment, including compiler/debugging tools. Students without this background may wish to sit in on the undergraduate UNIX Tools lecture series in Michaelmas Term.

Syllabus

Please note: at the time of writing, this module is under active development. Some change is expected between the material and labs described here and the module to be taught in Lent Term 2015. Overall, however, the module content and approach will be as described here.

The eight-week term is split into three submodules:

Weeks 1-2: Introduction to kernels and kernel tracing/analysis

The purpose of this submodule is to introduce students to the structure of a contemporary operating system kernel through tracing and profiling. Students will gain familiarity with practical systems tools such as DTrace and hardware performance counters, as well as with data interpretation, analysis, and presentation using Python and GraphViz. They will write a first lab report which will receive feedback from the instructor, but not count towards the final mark.

Monday 19 January 10:00 FS07 Lecture: Course introduction / tracing and performance analysis (1h)
Tuesday 20 January 12:00 FS07 Lecture: The FreeBSD kernel and DTrace (1h)

Friday 23 January 11:00 - 13:00 SW02 Lab: Getting started with kernel tracing (2h)

Deliverable: Lab Report 1 - Whole-system CPU and memory profiling (due 28 January)

Weeks 2-5: Processors, processes, and threads

This submodule will introduce students to aspects of the UNIX process model (processes and threads in userspace and kernel), as well as the impact of scheduling and affinity. It will also consider local hardware-software interactions such as interrupt delivery and cache behaviour. The first marked lab report will be written.

Monday 26 January 10:00 FS07 Lecture: Processors, processes, and threads (1h)
Tuesday 27 January 12:00 FS07 Lecture: The FreeBSD process model (1h)

Friday 30 January 11:00 - 13:00 SW02 Lab: Threads, scheduling, and I/O (2h)
Monday 2 February 10:00 - 12:00 SW02 Lab: Hardware-software interactions (2h)

Deliverable: Lab Report 2 - Measuring and optimising scheduler behaviour (due 13 February)

Weeks 5-8: TCP/IP

This submodule will introduce students to a contemporary, multithreaded, multiprocessing network stack, with a particular interest in the TCP protocol. Labs will consider both the behaviour of a single TCP connection, and its interactions with other TCP connections through congestion control. Students will also look at how hardware behaviour such as caching and multiqueue work distribution affect performance. The second marked lab report will be written.

Monday 16 February 10:00 FS07 Lecture: TCP/IP (1h)
Tuesday 17 February 12:00 FS07 Lecture: The FreeBSD network stack (1h)

Friday 20 February 11:00 - 13:00 SW02 Lab: TCP single-flow behaviour (2h)
Monday 23 February 10:00 - 12:00 SW02 Lab: TCP multi-flow behaviour (2h)

Deliverable: Lab Report 3 - Analysing and optimising application-level interactions with TCP (due 6 March)

Objectives

On completion of this module, students should:

  • Have an understanding of high-level OS kernel structure
  • Have gained insight into hardware-software interactions for compute and I/O
  • Have practical skills in system tracing and performance analysis
  • Have been exposed to research ideas in system structure and behaviour
  • Have learned how to write systems-style performance evaluations

Coursework

Students will write and submit three lab reports to be marked by the instructor. The first lab report is a 'practice run' intended to help students develop analysis techniques and writing styles, and will not be included in the final mark for the module. The remaining two reports are marked and assessed, each constituting 50% of the final mark.

Practical work

Five in-classroom labs will ask students to develop and use skills in tracing and performance analysis as applied to real-world systems artefacts. Results from these labs (and follow-up work by students outside of the classroom) will be the primary input to lab reports. Typical labs will involve using tracing and profiling to characterise specific behaviours (e.g., TCP congestion control) as well as to diagnose and fix problems through modifications to application-level behaviours (e.g., modifying network clients and servers to better exploit real-world TCP behaviour). Students may wish to work in pairs within the lab, but must prepare lab reports independently.

The module lecturer will give a short introductory lecture at the start of each lab, and instructors will be on hand throughout labs to provide assistance. Lab participation is not directly included in the final mark, but lab work is a key input to the lab reports that are assessed.

Assessment

Each student will write three lab reports, roughly 5-10 pages each including several figures. The first is a 'practice run' that will be used to develop lab-report structure, content, and presentation, and will not contribute to the final mark. The remaining two lab reports will each contribute 50% to the final mark.

Recommended reading

All module texts should be in the department library; if there appear to be too few copies to go around, please let the module convener and librarian know so that additional copies can be ordered. Texts should also be available via Amazon, iBooks, etc.

Primary module texts

Marshall Kirk McKusick, George V. Neville-Neil, and Robert N. M. Watson, The Design and Implementation of the FreeBSD Operating System, 2nd Edition, Pearson Education, Boston, MA, USA, September 2014.

Brendan Gregg and Jim Mauro, DTrace: Dynamic Tracing in Oracle Solaris, Mac OS X and FreeBSD, Prentice Hall Press, Upper Saddle River, NJ, USA, April 2011.

Raj Jain, The Art of Computer Systems Performance Analysis: Techniques for Experimental Design, Measurement, Simulation, and Modeling, Wiley-Interscience, New York, NY, USA, April 1991. Available on the author's website for free.

Additional texts

Abraham Silberschatz, Peter Baer Galvin, and Greg Gagne, Operating System Concepts, Eighth Edition, John Wiley & Sons, Inc., New York, NY, USA, July 2008.

Brendan Gregg, Systems Performance: Enterprise and the Cloud, Prentice Hall Press, Upper Saddle River, NJ, USA, October 2013.

Research-paper readings

Research-paper readings will be announced as the term proceeds, but will likely include original papers on BPF, DTrace, OS scheduling, OS scalability, network stacks, and systems modelling.