Course pages 2016–17

## Machine Learning and Bayesian Inference

*Lecturer: Dr S.B. Holden*

*No. of lectures:* 16

*Suggested hours of supervisions:* 4

*Prerequisite courses:* Artificial Intelligence I, Mathematical Methods for Computer Science, Discrete Mathematics and Probability, Linear Algebra and Calculus from the NST Mathematics course.

### Aims

Artificial Intelligence I introduced simple neural networks for supervised learning, and logic-based methods for knowledge representation and reasoning. This course has two aims. First, to provide a comprehensive introduction to machine learning, moving beyond the supervised case and ultimately presenting state-of-the-art methods. Second, to provide an introduction to the wider area of probabilistic methods for representing and reasoning with knowledge.

### Lectures

- **Introduction to learning and inference.** Supervised, unsupervised, semi-supervised and reinforcement learning. Bayesian inference in general. What the naive Bayes method actually does. Review of backpropagation. Other kinds of learning and inference.
- **How to classify optimally.** Treating learning probabilistically. Bayesian decision theory and Bayes optimal classification. Generative and discriminative models. Likelihood functions and priors. Bayes theorem as applied to supervised learning. The maximum likelihood and maximum a posteriori hypotheses. What does this teach us about the backpropagation algorithm?
- **Linear classifiers I.** Supervised learning via error minimization. Iterative reweighted least squares. The maximum margin classifier.
- **Support vector machines (SVMs).** The kernel trick. Problem formulation. Constrained optimization and the dual problem. SVM algorithm.
- **Practical issues.** Hyperparameters. Measuring performance. Cross-validation. Experimental methods. Multiple classes.
- **Linear classifiers II.** The Bayesian approach to neural networks.
- **Gaussian processes.** Learning and inference for regression using Gaussian process models.
- **Unsupervised learning I.** The k-means algorithm. Clustering as a maximum likelihood problem.
- **Unsupervised learning II.** The EM algorithm and its application to clustering.
- **Deep networks.** Combining unsupervised and supervised training. Convolutional networks.
- **Semi-supervised learning.**
- **Reinforcement learning I.** Learning from rewards and punishments. Markov decision processes. The problems of temporal credit assignment and exploration versus exploitation.
- **Reinforcement learning II.** Q-learning and its convergence. How to choose actions.
- **Bayesian networks I.** Representing uncertain knowledge using Bayesian networks. Conditional independence. Exact inference in Bayesian networks.
- **Bayesian networks II.** Markov random fields. Approximate inference. Markov chain Monte Carlo methods.
- **Uncertain reasoning over time.** Markov processes, transition and sensor models. Hidden Markov models (HMMs). Inference in temporal models: filtering, prediction, smoothing and finding the most likely explanation. The Viterbi algorithm.

### Objectives

At the end of this course students should:

- Understand how learning and inference can be captured within a probabilistic framework, and know how probability theory can be applied in practice as a means of handling uncertainty in AI systems.
- Understand several state-of-the-art algorithms for machine learning and be able to apply them with proper regard for good experimental practice.

### Recommended reading

If you are going to buy a single book for this course we recommend:

* Bishop, C.M. (2006). Pattern recognition and machine learning. Springer.

These cover some relevant material, but often in insufficient detail:

* Mitchell, T.M. (1997). Machine Learning. McGraw-Hill.
* Russell, S. & Norvig, P. (2010). Artificial intelligence: a modern approach. Prentice Hall (3rd ed.).

Recently a few new books have appeared that cover a lot of relevant ground well:

* Barber, D. (2012). Bayesian Reasoning and Machine Learning. Cambridge University Press.
* Flach, P. (2012). Machine Learning: The Art and Science of Algorithms that Make Sense of Data. Cambridge University Press.
* Murphy, K.P. (2012). Machine Learning: A Probabilistic Perspective. MIT Press.