Probabilistic Machine Learning
Prerequisites and related courses.
- This course covers advanced machine learning. If you are looking for an introduction to probabilistic machine learning, please see the IB Data Science course.
- There are close links to L48: Machine Learning and the Physical World. The topics covered are complementary, and the philosophy is different. The L48 lectures on Gaussian processes and probabilistic inference are especially relevant.
Practical arrangements.
- Auditing. If you wish to audit this course, all the lecture videos will be available online, as listed below, and there is no need to ask permission to use them. I regret that unregistered students are not permitted to attend in-person lectures this year.
- Lectures and videos. For topic 1, lectures will be in-person in the Computer Lab, and videos and notes will also be posted below. For topics 2 and 3, details are to be confirmed.
- Office hours will be in SS03, on Tuesdays starting 26 Oct, from 4–5pm. In exceptional cases, Thursdays 4–5pm will also be possible. You can also post questions to the Q&A forum on Moodle.
Topics covered
1. Probabilistic neural networks
Neural networks, from the perspective of probabilistic modelling. Topics covered: classifiers, generative models, recurrent networks, autoencoders, GANs.
- In-person lectures in the Computer Lab from 11 Oct to 25 Oct (timetable)
- Prerecorded videos are listed below. Where the material is the same as last year, the link goes to last year's video; where the material is new this year, I'll post a new video. I will not record live videos of this year's in-person lectures.
| Notes for sections 1 and 2; Notes for section 3 | |
| Lecture 1 [slides] | Prerequisites: review section 1 of IB Data Science. 1.1 Prediction accuracy, 1.2 Probabilistic learning, 1.3 PyTorch: video (20:27), nn.ipynb |
| Lecture 2 [slides] | Lecture 1 continued. Generative models: IB Data Science section 1.6 |
| Lecture 3 [slides] | |
| Lecture 4 [slides] | 2.1 KL divergence: video to come. 2.2 Importance sampling: video (11:17). 2.3 Bounds: video to come |
| Lecture 5 [slides] | 3.1 Generative neural networks: video (10:40). 3.2 Autoencoder in maths: video (15:15). 3.3 Autoencoder in practice: video (28:37). Reading: Auto-Encoding Variational Bayes (Kingma, Welling, 2014), Importance Weighted Autoencoders (Burda, Grosse, Salakhutdinov, 2015) |
| Lecture 6 [slides] | Latent variable models for datasets |
| Lecture 7 [slides] | |
2. TrueSkill — Graphical models and Gibbs sampling
Bayesian modelling, applied to the problem of how to rank players in a tournament.
- Delivered by the Engineering department as the second part of Part IIB course 4f13, starting 26 Oct
- Syllabus, slides, and timetable
- Videos on Moodle, and Moodle enrolment key. The Engineering department states: "Lectures will be pre-recorded in video and made accessible on [Moodle] Thursday morning". A few of last year's videos are still available.
3. Models for document collections — LDA
Another application of Bayesian modelling
- Delivered by the Engineering department, as above
Assessment
There are five pieces of coursework. Three are structured exercises designed to reinforce the lectures. Two are for an open-ended investigation of a topic that you choose from a small list, drawing on the main themes of the lecture course. Submission instructions are on Moodle.
- Exercise 1 [notebook] (10%, due 31 Oct) on probabilistic neural networks, with solution notes
- Exercise 2 [notebook] (10%, due 14 Nov) on TrueSkill
- Exercise 3 [notebook] (10%, due 3 Dec) on document collections
- Investigative project
    - Proposal (5%, due 28 Nov)
    - Group discussions (not assessed, to be arranged in week of 29 Nov)
    - Report (65%, due 19 Jan 2022)
 
You may also be interested in a structured exercise on Gaussian processes, which was used in an earlier version of this course.
- Exercise on Gaussian processes [notebook] (not part of this course; here for information only)