Course pages 2018–19

# Machine Learning and Bayesian Inference

**Principal lecturer:** Dr Sean Holden

**Taken by:** Part II CST 50%, Part II CST 75%

No. of lectures: 12

Suggested hours of supervisions: 3

Prerequisite courses: Artificial Intelligence, Foundations of Data Science, Discrete Mathematics and Probability, Linear Algebra and Calculus from the NST Mathematics course.

## Aims

The Part IB course *Artificial Intelligence* introduced simple
neural networks for supervised learning, and logic-based methods for
knowledge representation and reasoning. This course has two
aims. First, to provide a rigorous introduction to machine
learning, moving beyond the supervised case and ultimately presenting
state-of-the-art methods. Second, to provide an introduction to the
wider area of probabilistic methods for representing and reasoning
with knowledge.

## Lectures

- **Introduction to learning and inference.** Supervised, unsupervised, semi-supervised and reinforcement learning. Bayesian inference in general. What the naive Bayes method actually does. Review of backpropagation. Other kinds of learning and inference. [1 lecture]
- **How to classify optimally.** Treating learning probabilistically. Bayesian decision theory and Bayes optimal classification. Likelihood functions and priors. Bayes theorem as applied to supervised learning. The maximum likelihood and maximum *a posteriori* hypotheses. What does this teach us about the backpropagation algorithm? [2 lectures]
- **Linear classifiers I.** Supervised learning via error minimization. Iterative reweighted least squares. The maximum margin classifier. [1 lecture]
- **Support vector machines (SVMs).** The kernel trick. Problem formulation. Constrained optimization and the dual problem. The SVM algorithm. [2 lectures]
- **Practical issues.** Hyperparameters. Measuring performance. Cross-validation. Experimental methods. [1 lecture]
- **Linear classifiers II.** The Bayesian approach to neural networks. [1 lecture]
- **Unsupervised learning I.** The *k*-means algorithm. Clustering as a maximum likelihood problem. [1 lecture]
- **Unsupervised learning II.** The EM algorithm and its application to clustering. [1 lecture]
- **Bayesian networks I.** Representing uncertain knowledge using Bayesian networks. Conditional independence. Exact inference in Bayesian networks. [1 lecture]
- **Bayesian networks II.** Markov random fields. Approximate inference. Markov chain Monte Carlo methods. [1 lecture]
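To give a flavour of the probabilistic view of classification covered early in the course, here is a minimal sketch of naive Bayes with maximum *a posteriori* prediction for binary features. All function names and data are invented for this illustration; Laplace smoothing is one common choice, not necessarily the one used in lectures.

```python
import numpy as np

def fit_naive_bayes(X, y):
    """Estimate class priors P(c) and per-feature likelihoods P(x_i = 1 | c)
    from binary data, with Laplace (add-one) smoothing."""
    classes = np.unique(y)
    priors = {c: np.mean(y == c) for c in classes}
    likelihoods = {c: (X[y == c].sum(axis=0) + 1) / (np.sum(y == c) + 2)
                   for c in classes}
    return priors, likelihoods

def predict(x, priors, likelihoods):
    """Return the maximum a posteriori class for a binary feature vector x,
    working in log space to avoid underflow."""
    def log_posterior(c):
        p = likelihoods[c]
        return (np.log(priors[c])
                + np.sum(x * np.log(p) + (1 - x) * np.log(1 - p)))
    return max(priors, key=log_posterior)

# Tiny invented dataset: two binary features, two classes.
X = np.array([[1, 0], [1, 1], [0, 1], [0, 0]])
y = np.array([0, 0, 1, 1])
priors, likes = fit_naive_bayes(X, y)
print(predict(np.array([1, 0]), priors, likes))  # → 0 (the MAP class)
```

The "naive" assumption is the factorisation of the likelihood over features given the class; the MAP rule then picks the class maximising the (log) posterior.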
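The *k*-means algorithm from the unsupervised learning lectures can likewise be sketched in a few lines. This is a plain illustrative implementation (names and data invented), alternating an assignment step and a mean-update step until the centres stop moving.

```python
import numpy as np

def k_means(X, k, n_iters=100, seed=0):
    """Basic k-means: returns final centres and cluster labels."""
    rng = np.random.default_rng(seed)
    # Initialise centres at k distinct data points chosen at random.
    centres = X[rng.choice(len(X), size=k, replace=False)]
    labels = np.zeros(len(X), dtype=int)
    for _ in range(n_iters):
        # Assignment step: each point goes to its nearest centre
        # (squared Euclidean distance).
        d = ((X[:, None, :] - centres[None, :, :]) ** 2).sum(axis=2)
        labels = d.argmin(axis=1)
        # Update step: each centre moves to the mean of its cluster.
        new = np.array([X[labels == j].mean(axis=0) if np.any(labels == j)
                        else centres[j] for j in range(k)])
        if np.allclose(new, centres):
            break
        centres = new
    return centres, labels

# Two well-separated invented blobs; k-means should recover them.
X = np.array([[0., 0.], [0., 1.], [1., 0.],
              [10., 10.], [10., 11.], [11., 10.]])
centres, labels = k_means(X, 2)
```

Viewing each cluster as an isotropic Gaussian makes the connection to the course's "clustering as maximum likelihood" framing: the assignment and update steps are hard-assignment analogues of the E and M steps of EM.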

## Objectives

At the end of this course students should:

- Understand how learning and inference can be captured within a
probabilistic framework, and know how probability theory can be
applied in practice as a means of handling uncertainty in AI systems.
- Understand several algorithms for machine learning and be able to apply them with proper regard for good experimental method.

## Recommended reading

If you are going to buy a single book for this course we recommend:

* Bishop, C.M. (2006). Pattern Recognition and Machine Learning.
Springer.

The course text for Artificial Intelligence I:

Russell, S. & Norvig, P. (2010). Artificial Intelligence: A Modern
Approach. Prentice Hall (3rd ed.).

covers some relevant material but often in insufficient detail. Similarly:

Mitchell, T.M. (1997). Machine Learning. McGraw-Hill.

gives a gentle introduction to some of the course material, but only an introduction. Recently a few new books have appeared that cover a lot of relevant ground well. For example:

Barber, D. (2012). Bayesian Reasoning and Machine Learning.
Cambridge University Press.

Flach, P. (2012). Machine Learning: The Art and Science of Algorithms
that Make Sense of Data. Cambridge University Press.

Murphy, K.P. (2012). Machine Learning: A Probabilistic Perspective.
MIT Press.