Course pages 2017–18

# Foundations of Data Science

**Principal lecturer:** Dr Damon Wischik

**Taken by:** Part IB CST 50%, Part IB CST 75%

No. of lectures and practical classes: 12 + 4

Suggested hours of supervisions: 3

Prerequisite courses:
either Mathematics for Natural Sciences, or the equivalent from the
Maths Tripos

This course is a prerequisite:
for Part IB Formal Models of Language,
and for Part II Machine Learning and Bayesian Inference,
Bioinformatics, Computer Systems Modelling, Information Theory,
Quantum Computing, Natural Language Processing, Advanced Graphics.

## Aims

This course introduces fundamental tools for describing and reasoning about data. There are two themes: describing the behaviour of random systems; and making inferences based on data generated by such systems. The course will survey a wide range of models and tools, and it will emphasize how to design a model and what sorts of questions one might ask about it.

## Lectures

**Probabilistic models.**Examples: random sample, graphical models, Markov models. Common random variables and their uses. Joint distributions, independence. Rules for expectation and variance.**Distributions of random variables.**Generating random variables. Empirical distribution. Comparing distributions. Law of large numbers, central limit theorem.**Inference.**Maximum likelihood estimation, likelihood profile. Bootstrap, hypothesis testing, confidence intervals for parameters and predictions. Bayesianism, point estimation, classification. Case study: training a naive Bayes classifier.**Feature spaces.**Vector spaces, bases, inner products, projection. Model fitting as projection; linear modeling. Orthogonalisation, and application to linear models. Dimension reduction.**Random processes.**Drift models. Markov chain convergence: notions, and calculations. Examples. Notions of estimation and control. Examples of processes with memory.

## Objectives

At the end of the course students should

- be able to formulate basic probabilistic models, including discrete
time Markov chains, graphical models, and linear models
- be familiar with common random variables and their uses, and with the
use of empirical distributions rather than formulae
- be able to use expectation and conditional expectation,
limit theorems, equilibrium distributions
- understand different types of inference about noisy data, including
model fitting, hypothesis testing, and making predictions
- understand the fundamental properties of inner product spaces and orthonormal systems, and their application to model representation

## Recommended reading

* F.M. Dekking, C. Kraaikamp, H.P. Lopuhaä, L.E. Meester (2005).
*A modern introduction to probability and statistics: understanding why and how.*
Springer.

S.M. Ross (2002).
*Probability models for computer science.*
Harcourt / Academic Press.

M. Mitzenmacher & E. Upfal (2005).
*Probability and computing: randomized algorithms and probabilistic analysis.*
Cambridge University Press.