Federated Learning: Theory and Practice
Principal lecturer: Dr Nic Lane
Taken by: MPhil ACS, Part III
Code: L361
Term: Lent
Hours: 16 (8h lectures, 4h lab sessions, 4h advanced material)
Format: In-person lectures
Class limit: max. 50 students
Prerequisites: It is strongly recommended that students have previously (and successfully) completed an undergraduate machine learning course, or have equivalent experience through open-source material (e.g., lectures or similar). An example course would be Part IB "Artificial Intelligence". Students should feel comfortable with SGD and the optimization methods used to train currently popular neural networks, and with simple neural network architectures.
Objectives
This course aims to extend the machine learning knowledge available to students in Part I (or present in typical undergraduate degrees in other universities), and allow them to understand how these concepts can manifest in a decentralized setting. The course will consider both theoretical (e.g., decentralized optimization) and practical (e.g., networking efficiency) aspects that combine to define this growing area of machine learning.
At the end of the course students should:
- Understand popular methods used in federated learning
- Be able to construct and scale a simple federated system
- Have gained an appreciation of the core limitations of existing methods, and the approaches available to cope with these issues
- Have developed an intuition for related technologies like differential privacy and secure aggregation, and be able to use them within typical federated settings
- Be able to reason about the privacy and security issues with federated systems
Lectures
- Course Overview. Introduction to Federated Learning.
- Decentralized Optimization.
- Statistical and Systems Heterogeneity.
- Variations of Federated Aggregation.
- Secure Aggregation.
- Differential Privacy within Federated Systems.
- Extensions to Federated Analytics.
- Applications to Speech, Video, Images and Robotics.
Lab sessions
- Federating a Centralized ML Classifier.
- Behaviour under Heterogeneity.
- Scaling a Federated Implementation.
- Exploring Privacy with Federated Settings.
Assessment
Four labs are performed during the course, and students receive 12.5% of their total grade for work done as part of each lab (for a total of 50% of the grade from lab work alone). Labs will primarily provide hands-on teaching, which is then applied in the lab assignment completed outside of the lab contact time. MPhil and Part III students will be given additional questions to answer within their version of the lab assignment, which will differ from the assignment given to Part II CST students.
The remainder of the course grade (50%) will be based on a hands-on project that applies the concepts taught in lectures and labs. This project will be assessed on a combination of source code, related documentation and a brief 8-minute pre-recorded talk that summarizes key project elements (any slides used are also submitted as part of the project). Please note that, in the case of Part II CST students, the talk itself is not examinable -- as such, it will be optional for those students.
A range of possible practical projects will be described and offered to students to select from; alternatively, students may propose their own. MPhil and Part III students will select from a project pool that is separate from the pool offered to Part II CST students. MPhil and Part III projects will place greater emphasis on a research element, and the pre-recorded talks provided by this student group will focus on this research contribution. The project will be assessed on the level of student understanding demonstrated, the degree of difficulty and the correctness of implementation -- and, for Part III/MPhil students, on the additional criteria of the quality and execution of the research methodology and the depth and quality of results analysis.
This project can be done individually or in groups -- although individual projects will be strongly encouraged. The project must be carried out using a code repository that also contains all documentation -- access to this repository will be shared with course staff (e.g., lecturer and TAs). Where needed, marks assigned to students within a group will be differentiated using this repository as an input. Furthermore, if groups are formed, members must be either entirely Part III/MPhil students or entirely Part II CST students, i.e., these two student groups should not mix to form a project group.
Projects will be made available publicly. A maximum word count for written contributions for the project will be enforced.
Recommended Reading
Readings will be assigned for each lecture. Readings will be taken from research papers, tutorials, source code or blogs that provide a more comprehensive treatment of taught concepts.
Advanced Material sessions - MPhil / Part III Students
These sessions are mandatory for the MPhil and Part III students. Part II CST students may attend if they wish.
Topics are announced in the first week of class and are selected to extend beyond the topics covered in lectures and labs. Content is a mixture of sessions run in the lecture room that are either (1) guest lectures by an invited domain expert or (2) class-wide discussions regarding one or more related academic papers. Paper discussions will require students to read the paper ahead of the lecture, and a brief discussion primer presentation will be given to students before discussions begin.
This module is shared with Part II of the Computer Science Tripos. Assessment will be adjusted for the two groups of students to be at an appropriate level for whichever course the student is enrolled on. Further information about assessment and practicals will follow at the first lecture.