Advanced Topics in Computer Systems
Schedule & Reading List
We'll meet for two hours every Thursday from 11.00 to 13.00 during Lent term, starting January 18th 2024. In each session after the first we will have 3 participant presentations, each of which should be 15 to 20 minutes long. We'll start with 2 presentations and discussion, followed by a short break, and then the final presentation and further discussion. The paper schedule for this year is given below, and the associated presentation schedule is here. Please check that you know which papers you are presenting, what flavour of presentation you should write, and in which slot (date and time) you are presenting. The slides from our first session are now available here.
Note! All Week 1 papers, and any papers marked Additional, are not for assessment, and you should not submit reviews on them.
Week 1: Reflections on Systems Design
- Hints for Computer Systems Design (Revised),
  Lampson. 1983. ACM SOSP 1983, pp.33–48.
  [ DOI (original) ]; [ URL (revised version) ]
- The role of motherhood in the pop art of system programming,
  Neumann. 1969. ACM SOSP 1969, pp.13–18.
  [ DOI ]
- Theory and practice in operating system design,
  Needham and Hartley. 1969. ACM SOSP 1969, pp.8–12.
  [ DOI ]
Week 2: OS Structure
- Exokernel: an operating system architecture for application-level resource management,
  Engler et al. 1995. ACM SOSP 1995, pp.251–266.
  [ DOI ]
- The Multikernel: A new OS architecture for scalable multicore systems,
  Baumann et al. 2009. ACM SOSP 2009, pp.29–44.
  [ DOI ]
- Unikernels: Library Operating Systems for the Cloud,
  Madhavapeddy et al. 2013. ACM ASPLOS 2013, pp.461–472.
  [ DOI ]
Additional
- The UNIX time-sharing system,
  Ritchie and Thompson. 1974. Commun. ACM 17(7):365–375.
  [ DOI ]
  Revised version of paper appearing at ACM SOSP 1973.
Week 3: Virtualisation
- Xen and the Art of Virtualization,
  Barham et al. 2003. ACM SOSP 2003, pp.164–177.
  [ DOI ]
- Container-based operating system virtualization: a scalable, high-performance alternative to hypervisors,
  Soltesz et al. 2007. ACM EuroSys 2007, pp.275–287.
  [ DOI ]
- My VM is Lighter (and Safer) than your Container,
  Manco et al. 2017. ACM SOSP 2017, pp.218–233.
  [ DOI ]
Additional
- Light-Weight Contexts: An OS Abstraction for Safety and Performance,
  Litton et al. 2016. USENIX OSDI 2016, pp.49–64.
  [ URL ]
Week 4: Distributed Consensus
- The Chubby lock service for loosely-coupled distributed systems,
  Burrows. 2006. USENIX OSDI 2006, pp.335–350.
  [ URL ]
- ZooKeeper: Wait-free coordination for Internet-scale systems,
  Hunt et al. 2010. USENIX ATC 2010.
  [ URL ]
- In Search of an Understandable Consensus Algorithm,
  Ongaro and Ousterhout. 2014. USENIX ATC 2014, pp.305–319.
  [ URL ]
Additional
- Paxos Made Moderately Complex,
  van Renesse and Altinbuken. 2015. ACM Comput. Surv. 47(3).
  [ DOI ]
Week 5: Cluster Scheduling
- Fuxi: a fault-tolerant resource management and job scheduling system at Internet scale,
  Zhang et al. 2014. Proc. VLDB Endow. 7(13):1393–1404.
  [ DOI ]
- Large-scale cluster management at Google with Borg,
  Verma et al. 2015. ACM EuroSys 2015.
  [ DOI ]
- Firmament: Fast, Centralized Cluster Scheduling at Scale,
  Gog et al. 2016. USENIX OSDI 2016, pp.99–115.
  [ URL ]
Week 6: Data Intensive Computing
- MapReduce: Simplified Data Processing on Large Clusters,
  Dean and Ghemawat. 2004. USENIX OSDI 2004.
  [ URL ]
- Dryad: Distributed Data Parallel Programming from Sequential Building Blocks,
  Isard et al. 2007. ACM EuroSys 2007, pp.59–72.
  [ DOI ]
- Pregel: A System for Large-Scale Graph Processing,
  Malewicz et al. 2010. ACM SIGMOD 2010, pp.135–146.
  [ DOI ]
Week 7: Privacy
- CryptDB: Protecting Confidentiality with Encrypted Query Processing,
  Popa et al. 2011. ACM SOSP 2011, pp.85–100.
  [ DOI ]
- Ryoan: A Distributed Sandbox for Untrusted Computation on Secret Data,
  Hunt et al. 2016. USENIX OSDI 2016, pp.533–549.
  [ URL ]
- Towards Federated Learning at Scale: System Design,
  Bonawitz et al. 2019. Proceedings of the 2nd SysML Conference, Palo Alto, CA, USA.
  [ URL ]
Additional
- Hails: Protecting Data Privacy in Untrusted Web Applications,
  Giffin et al. 2012. USENIX OSDI 2012, pp.47–60.
  [ URL ]
- SCONE: Secure Linux Containers with Intel SGX,
  Arnautov et al. 2016. USENIX OSDI 2016, pp.689–703.
  [ URL ]
- GhostRider: A Hardware-Software System for Memory Trace Oblivious Computation,
  Liu et al. 2015. ACM ASPLOS 2015, pp.87–101.
  [ DOI ]
Week 8: Verification
- IronFleet: Proving Practical Distributed Systems Correct,
  Hawblitzel et al. 2015. ACM SOSP 2015, pp.1–17.
  [ URL ]
- I4: Incremental Inference of Inductive Invariants for Verification of Distributed Protocols,
  Ma et al. 2019. ACM SOSP 2019, pp.370–384.
  [ URL ]
- Scaling symbolic evaluation for automated verification of systems code with Serval,
  Nelson et al. 2019. ACM SOSP 2019, pp.225–242.
  [ URL ]
To give you a feeling for the kinds of presentations expected, here are some more presentation examples. These are for papers not assigned this year, but you may still find them a useful resource. In some cases you will also find the original authors' conference presentations online; while potentially useful, note that these serve a different purpose to your presentations and will usually have been allocated more time.
- "EROS: a fast capability system", Shapiro et al, ACM SOSP 1999.
  EXAMPLE1, EXAMPLE2.
- "Labels and Event Processes in the Asbestos Operating System", Efstathopoulos et al, ACM SOSP 2005.
  EXAMPLE1, EXAMPLE2.
- "Managing Update Conflicts in Bayou, a Weakly Connected Replicated Storage System", Terry et al, ACM SOSP 1995.
  EXAMPLE.
- "Practical Byzantine Fault Tolerance", Castro and Liskov, USENIX OSDI 1999.
  EXAMPLE.
- "Zyzzyva: Speculative Byzantine Fault Tolerance", Kotla et al, ACM SOSP 2007.
  EXAMPLE1, EXAMPLE2.
- "DryadLINQ: A System for General-Purpose Distributed Data-Parallel Computing Using a High-Level Language", Yu et al, USENIX OSDI 2008.
  EXAMPLE.
- "Quincy: Fair Scheduling for Distributed Computing Clusters", Isard et al, ACM SOSP 2009.
  EXAMPLE1, EXAMPLE2.