
Department of Computer Science and Technology

Course pages 2025–26

Data Science

Lecture notes

  • Abridged notes — These cover all examinable material, though they go into more depth and have more examples than lectures. They also include a non-examinable section on neural networks.
  • I am working on an extended version of the notes, which will be released part way through the course.
If you spot a mistake in the printed notes, let me know.

Announcements and Q&A

Moodle

Lecture schedule

This is the planned lecture schedule. It will be updated as and when the actual lectures deviate from it. Material marked * is non-examinable. Slides are uploaded the night before each lecture, and re-uploaded after the lecture with the annotations made during it.

Prerequisites
Example sheet 0 and solutions
§1–§4. Learning with probability models
Lecture 1
1. Learning with probability models
1.1 Specifying probability models
Lecture 2
1.2 Standard random variables
1.3 Maximum likelihood estimation
1.4 Numerical optimization with scipy
Lecture 3
1.5 Likelihood notation
1.6 Types of model
1.7 Supervised and unsupervised learning
Lecture 4
3. Neural networks as probability models (* non-examinable)
Lecture 5
2.1 Linear modelling
2.2 Feature design
Lecture 6
2.3 Diagnosing a linear model
2.5 The geometry of linear models
Lecture 7
2.6 Interpreting parameters
Lecture 8
2.4 Probabilistic linear modelling
Discussion of climate dataset challenge (* non-examinable)
Code snippets: fitting.ipynb and lm.ipynb
Datasets investigated: climate.ipynb and stop-and-search.ipynb
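
As a taster for sections 1.3, 1.4 and 2.4 (maximum likelihood estimation, numerical optimization with scipy, and probabilistic linear modelling), here is a minimal sketch of fitting a straight-line probability model by numerically maximizing the log-likelihood. It is not taken from the course notebooks; the synthetic dataset and parameter names are assumptions made purely for illustration.

    import numpy as np
    from scipy.optimize import minimize

    # Synthetic data for illustration: assume y ~ Normal(a + b*x, sigma).
    rng = np.random.default_rng(0)
    x = rng.uniform(0, 10, size=100)
    y = 2.0 + 0.5 * x + rng.normal(0, 1.0, size=100)

    def negloglik(theta):
        a, b, log_sigma = theta
        sigma = np.exp(log_sigma)   # optimize log(sigma) so sigma stays positive
        mu = a + b * x
        return -np.sum(-0.5 * np.log(2 * np.pi * sigma**2) - (y - mu)**2 / (2 * sigma**2))

    # Numerical maximum likelihood estimation via scipy.
    fit = minimize(negloglik, x0=[0.0, 0.0, 0.0])
    a_hat, b_hat, sigma_hat = fit.x[0], fit.x[1], np.exp(fit.x[2])
    print(a_hat, b_hat, sigma_hat)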
§5, §6, §8. Bayesian inference and Monte Carlo
Lecture 8 ctd.
8. Bayesianism
Lecture 9
5.1 Bayes's rule for random variables
6.1 Monte Carlo integration
6.2 Bayes's rule via computation
Lecture 10
5.2 Bayes's rule calculations
8.3 Finding the posterior
Lecture 11
8.1, 8.2 Bayesianism
8.4 Bayesian readouts
Video only: Mock exam question 2 and walkthrough (29:35)
Example sheet 2
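
As an illustration of sections 6.1 and 6.2 (Monte Carlo integration and Bayes's rule via computation), the sketch below approximates a posterior by weighting prior samples by the likelihood of the data. The toy coin-tossing model (7 heads in 10 tosses, with a Uniform(0,1) prior on the bias) is an assumption for illustration, not an example from the course.

    import numpy as np

    rng = np.random.default_rng(1)
    n_heads, n_tosses = 7, 10

    # Draw samples of theta from the prior, then weight each sample by the
    # likelihood of the observed data (Bayes's rule by computation).
    theta = rng.uniform(0, 1, size=100_000)
    lik = theta**n_heads * (1 - theta)**(n_tosses - n_heads)
    w = lik / lik.sum()

    # Monte Carlo readouts from the weighted samples.
    post_mean = np.sum(w * theta)            # posterior mean of theta
    post_prob = np.sum(w * (theta > 0.5))    # posterior P(theta > 0.5)
    print(post_mean, post_prob)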
§7, §9, §10. Frequentist methods: hypothesis testing and empirical evaluation
Lecture 11 ctd.
5.3 Deriving the likelihood
Lecture 12
7.1–7.2 Empirical cdf
7.3 The empirical distribution
9. Frequentism
9.1, 9.2 Resampling / confidence intervals
9.6 Non-parametric resampling
Lecture 13
9.3 Hypothesis testing
Lecture 14
9.3 Hypothesis testing (continued)
10. Holdout evaluation and the challenge of induction (* non-examinable)
Video only: Mock exam question 3 and walkthrough (18:20)
Example sheet 3
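
The following sketch illustrates sections 7.3 and 9.1, 9.2, 9.6: resampling from the empirical distribution (the non-parametric bootstrap) to obtain a confidence interval for a mean. The exponential dataset is synthetic and chosen only for illustration.

    import numpy as np

    rng = np.random.default_rng(2)
    data = rng.exponential(scale=3.0, size=200)   # stand-in for a real dataset

    # Resample with replacement from the empirical distribution and recompute
    # the statistic each time.
    boot_means = np.array([
        rng.choice(data, size=len(data), replace=True).mean()
        for _ in range(10_000)
    ])

    lo, hi = np.quantile(boot_means, [0.025, 0.975])   # 95% confidence interval
    print(data.mean(), (lo, hi))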
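
And a companion sketch in the spirit of section 9.3: a resampling-based hypothesis test (here a permutation test, which the course may set up differently) of whether two groups share the same mean. The p-value is the fraction of label-shuffled replicates whose statistic is at least as extreme as the observed one; the data are made up.

    import numpy as np

    rng = np.random.default_rng(3)
    a = rng.normal(0.0, 1.0, size=80)   # group A (synthetic)
    b = rng.normal(0.3, 1.0, size=80)   # group B (synthetic)

    observed = a.mean() - b.mean()
    pooled = np.concatenate([a, b])

    # Under the null hypothesis the group labels are exchangeable, so shuffle
    # them and recompute the test statistic many times.
    reps = []
    for _ in range(10_000):
        perm = rng.permutation(pooled)
        reps.append(perm[:len(a)].mean() - perm[len(a):].mean())
    reps = np.array(reps)

    p_value = np.mean(np.abs(reps) >= np.abs(observed))   # two-sided p-value
    print(observed, p_value)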
§11, §14. Autoregressive sequence models
Lecture 14 ctd.
11.1 Causal diagrams and joint likelihood
11.2, 11.3 Markov models
Lecture 15
11.4, 11.5 Sequence models with history: RNNs, HMMs, and Transformers
Lecture 16
14.1 Calculations with Markov chains
14.4 Stationarity and average behaviour (* non-examinable)
Example sheet 4
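
To give a flavour of section 14.1 (and the stationarity ideas in 14.4), the sketch below does two standard Markov chain calculations on a made-up 3-state transition matrix: propagating an initial distribution forward a few steps, and finding the stationary distribution from the eigenvector of the transpose of P with eigenvalue 1.

    import numpy as np

    # Made-up 3-state chain: P[i, j] = P(next state = j | current state = i).
    P = np.array([[0.7, 0.2, 0.1],
                  [0.3, 0.4, 0.3],
                  [0.2, 0.3, 0.5]])

    pi0 = np.array([1.0, 0.0, 0.0])             # start in state 0
    pi5 = pi0 @ np.linalg.matrix_power(P, 5)    # distribution after 5 steps

    # Stationary distribution: solve pi = pi P, i.e. take the eigenvector of P.T
    # with eigenvalue 1, normalized to sum to 1.
    eigvals, eigvecs = np.linalg.eig(P.T)
    k = np.argmin(np.abs(eigvals - 1))
    stationary = np.real(eigvecs[:, k])
    stationary = stationary / stationary.sum()

    print(pi5, stationary)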