
Department of Computer Science and Technology

Course pages 2025–26

Data Science

Lecture notes

  • Abridged notes — These cover all examinable material, though they go into more depth and have more examples than lectures. They also include a non-examinable section on neural networks.
  • I am working on an extended version of the notes, which will be released part way through the course.
If you spot a mistake in the printed notes, let me know.

Announcements and Q&A

Moodle

Lecture schedule

This is the planned lecture schedule. It will be updated as and when the actual lectures deviate from it. Material marked * is non-examinable. Slides are uploaded the night before each lecture, and re-uploaded after the lecture with the annotations made during it.

Prerequisites
Example sheet 0 and solutions
§1–§4. Learning with probability models
Lecture 1
1. Learning with probability models
1.1 Specifying probability models
Lecture 2
1.2 Standard random variables
1.3 Maximum likelihood estimation
1.4 Numerical optimization with scipy
Lecture 3
1.5 Likelihood notation
1.6 Types of model
1.7 Supervised and unsupervised learning
Lecture 4
3. Neural networks as probability models (* non-examinable)
Lecture 5
2.1 Linear modelling
2.2 Feature design
Lecture 6
2.3 Diagnosing a linear model
2.5 The geometry of linear models
Lecture 7
2.6 Interpreting parameters
Lecture 8
2.4 Probabilistic linear modelling
Discussion of climate dataset challenge (* non-examinable)
Code snippets: fitting.ipynb and lm.ipynb
Datasets investigated: climate.ipynb and stop-and-search.ipynb
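
As a taster for sections 1.3, 1.4 and 2.4 (maximum likelihood estimation, numerical optimization with scipy, and probabilistic linear modelling), here is a minimal sketch of fitting a straight-line probability model by numerically maximizing the log-likelihood. It is not taken from the course notebooks; the synthetic dataset and parameter names are assumptions made purely for illustration.

    import numpy as np
    from scipy.optimize import minimize

    # Synthetic data for illustration: assume y ~ Normal(a + b*x, sigma).
    rng = np.random.default_rng(0)
    x = rng.uniform(0, 10, size=100)
    y = 2.0 + 0.5 * x + rng.normal(0, 1.0, size=100)

    def negloglik(theta):
        a, b, log_sigma = theta
        sigma = np.exp(log_sigma)   # optimize log(sigma) so sigma stays positive
        mu = a + b * x
        return -np.sum(-0.5 * np.log(2 * np.pi * sigma**2) - (y - mu)**2 / (2 * sigma**2))

    # Numerical maximum likelihood estimation via scipy.
    fit = minimize(negloglik, x0=[0.0, 0.0, 0.0])
    a_hat, b_hat, sigma_hat = fit.x[0], fit.x[1], np.exp(fit.x[2])
    print(a_hat, b_hat, sigma_hat)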
§5, §6, §8. Bayesian inference and Monte Carlo
Lecture 8 ctd.
8. Bayesianism
Lecture 9
5.1 Bayes's rule for random variables
6.1 Monte Carlo integration
6.2 Bayes's rule via computation
Lecture 10
5.2 Bayes's rule calculations
8.3 Finding the posterior
Lecture 11
8.1, 8.2 Bayesianism
8.4 Bayesian readouts
Video only: Mock exam question 2 and walkthrough (29:35)
Example sheet 2
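
As an illustration of sections 6.1 and 6.2 (Monte Carlo integration and Bayes's rule via computation), the sketch below approximates a posterior by weighting prior samples by the likelihood of the data. The toy coin-tossing model (7 heads in 10 tosses, with a Uniform(0,1) prior on the bias) is an assumption for illustration, not an example from the course.

    import numpy as np

    rng = np.random.default_rng(1)
    n_heads, n_tosses = 7, 10

    # Draw samples of theta from the prior, then weight each sample by the
    # likelihood of the observed data (Bayes's rule by computation).
    theta = rng.uniform(0, 1, size=100_000)
    lik = theta**n_heads * (1 - theta)**(n_tosses - n_heads)
    w = lik / lik.sum()

    # Monte Carlo readouts from the weighted samples.
    post_mean = np.sum(w * theta)            # posterior mean of theta
    post_prob = np.sum(w * (theta > 0.5))    # posterior P(theta > 0.5)
    print(post_mean, post_prob)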
§7, §9, §10. Frequentist methods: hypothesis testing and empirical evaluation
Lecture 11 ctd.
5.3 Deriving the likelihood
Lecture 12
7.1–7.2 Empirical cdf
7.3 The empirical distribution
9. Frequentism
9.1, 9.2 Resampling / confidence intervals
9.6 Non-parametric resampling
Lecture 13
9.3 Hypothesis testing
Lecture 14
9.3 Hypothesis testing (continued)
10. Holdout evaluation and the challenge of induction (* non-examinable)
Video only: Mock exam question 3 and walkthrough (18:20)
Example sheet 3
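
The following sketch illustrates sections 7.3 and 9.1, 9.2, 9.6: resampling from the empirical distribution (the non-parametric bootstrap) to obtain a confidence interval for a mean. The exponential dataset is synthetic and chosen only for illustration.

    import numpy as np

    rng = np.random.default_rng(2)
    data = rng.exponential(scale=3.0, size=200)   # stand-in for a real dataset

    # Resample with replacement from the empirical distribution and recompute
    # the statistic each time.
    boot_means = np.array([
        rng.choice(data, size=len(data), replace=True).mean()
        for _ in range(10_000)
    ])

    lo, hi = np.quantile(boot_means, [0.025, 0.975])   # 95% confidence interval
    print(data.mean(), (lo, hi))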
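
And a companion sketch in the spirit of section 9.3: a resampling-based hypothesis test (here a permutation test, which the course may set up differently) of whether two groups share the same mean. The p-value is the fraction of label-shuffled replicates whose statistic is at least as extreme as the observed one; the data are made up.

    import numpy as np

    rng = np.random.default_rng(3)
    a = rng.normal(0.0, 1.0, size=80)   # group A (synthetic)
    b = rng.normal(0.3, 1.0, size=80)   # group B (synthetic)

    observed = a.mean() - b.mean()
    pooled = np.concatenate([a, b])

    # Under the null hypothesis the group labels are exchangeable, so shuffle
    # them and recompute the test statistic many times.
    reps = []
    for _ in range(10_000):
        perm = rng.permutation(pooled)
        reps.append(perm[:len(a)].mean() - perm[len(a):].mean())
    reps = np.array(reps)

    p_value = np.mean(np.abs(reps) >= np.abs(observed))   # two-sided p-value
    print(observed, p_value)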
§11, §14. Autoregressive sequence models
Lecture 14 ctd.
11.1 Causal diagrams and joint likelihood
11.2, 11.3 Markov models
Lecture 15
11.4, 11.5 Sequence models with history: RNNs, HMMs, and Transformers
Lecture 16
14.1 Calculations with Markov chains
14.4 Stationarity and average behaviour (* non-examinable)
Example sheet 4
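
To give a flavour of section 14.1 (and the stationarity ideas in 14.4), the sketch below does two standard Markov chain calculations on a made-up 3-state transition matrix: propagating an initial distribution forward a few steps, and finding the stationary distribution from the eigenvector of the transpose of P with eigenvalue 1.

    import numpy as np

    # Made-up 3-state chain: P[i, j] = P(next state = j | current state = i).
    P = np.array([[0.7, 0.2, 0.1],
                  [0.3, 0.4, 0.3],
                  [0.2, 0.3, 0.5]])

    pi0 = np.array([1.0, 0.0, 0.0])             # start in state 0
    pi5 = pi0 @ np.linalg.matrix_power(P, 5)    # distribution after 5 steps

    # Stationary distribution: solve pi = pi P, i.e. take the eigenvector of P.T
    # with eigenvalue 1, normalized to sum to 1.
    eigvals, eigvecs = np.linalg.eig(P.T)
    k = np.argmin(np.abs(eigvals - 1))
    stationary = np.real(eigvecs[:, k])
    stationary = stationary / stationary.sum()

    print(pi5, stationary)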