# Computer Laboratory

Course pages 2016–17

## Machine Learning and Bayesian Inference

Lecturer: Dr S.B. Holden

No. of lectures: 16

Suggested hours of supervisions: 4

Prerequisite courses: Artificial Intelligence I, Mathematical Methods for Computer Science, Discrete Mathematics and Probability, Linear Algebra and Calculus from the NST Mathematics course.

### Aims

Artificial Intelligence I introduced simple neural networks for supervised learning, and logic-based methods for knowledge representation and reasoning. This course has two aims. First, to provide a comprehensive introduction to machine learning, moving beyond the supervised case and ultimately presenting state-of-the-art methods. Second, to provide an introduction to the wider area of probabilistic methods for representing and reasoning with knowledge.

### Lectures

• Introduction to learning and inference. Supervised, unsupervised, semi-supervised and reinforcement learning. Bayesian inference in general. What the naive Bayes method actually does. Review of backpropagation. Other kinds of learning and inference.
• How to classify optimally. Treating learning probabilistically. Bayesian decision theory and Bayes optimal classification. Generative and discriminative models. Likelihood functions and priors. Bayes' theorem as applied to supervised learning. The maximum likelihood and maximum a posteriori hypotheses. What does this teach us about the backpropagation algorithm?
• Linear classifiers I. Supervised learning via error minimization. Iteratively reweighted least squares. The maximum margin classifier.
• Support vector machines (SVMs). The kernel trick. Problem formulation. Constrained optimization and the dual problem. SVM algorithm.
• Practical issues. Hyperparameters. Measuring performance. Cross-validation. Experimental methods. Multiple classes.
• Linear classifiers II. The Bayesian approach to neural networks.
• Gaussian processes. Learning and inference for regression using Gaussian process models.
• Unsupervised learning I. The k-means algorithm. Clustering as a maximum likelihood problem.
• Unsupervised learning II. The EM algorithm and its application to clustering.
• Deep networks. Combining unsupervised and supervised training. Convolutional networks.
• Semi-supervised learning.
• Reinforcement learning I. Learning from rewards and punishments. Markov decision processes. The problems of temporal credit assignment and exploration versus exploitation.
• Reinforcement learning II. Q-learning and its convergence. How to choose actions.
• Bayesian networks I. Representing uncertain knowledge using Bayesian networks. Conditional independence. Exact inference in Bayesian networks.
• Bayesian networks II. Markov random fields. Approximate inference. Markov chain Monte Carlo methods.
• Uncertain reasoning over time. Markov processes, transition and sensor models. Hidden Markov models (HMMs). Inference in temporal models: filtering, prediction, smoothing and finding the most likely explanation. The Viterbi algorithm.
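
As a small illustration of the last topic in the list above, the following is a minimal sketch of the Viterbi algorithm for finding the most likely hidden state sequence in an HMM. It is not taken from the course materials: the function name `viterbi`, the dictionary-based model layout, and the toy two-state model with its probabilities are all assumptions made up for this example, and the sketch assumes a non-empty observation sequence.

```python
def viterbi(observations, states, start_p, trans_p, emit_p):
    """Return (probability, path) of the most likely hidden state sequence.

    Hypothetical interface for illustration: probabilities are plain dicts,
    and `observations` is assumed to be non-empty.
    """
    # best[t][s]: probability of the most likely path that ends in state s at time t
    best = [{s: start_p[s] * emit_p[s][observations[0]] for s in states}]
    # back[t][s]: predecessor of s on that most likely path
    back = [{}]

    for t in range(1, len(observations)):
        best.append({})
        back.append({})
        for s in states:
            # Choose the predecessor r that maximises the path probability into s
            prev, p = max(
                ((r, best[t - 1][r] * trans_p[r][s]) for r in states),
                key=lambda pair: pair[1],
            )
            best[t][s] = p * emit_p[s][observations[t]]
            back[t][s] = prev

    # Trace back from the most probable final state
    last = max(states, key=lambda s: best[-1][s])
    path = [last]
    for t in range(len(observations) - 1, 0, -1):
        path.append(back[t][path[-1]])
    path.reverse()
    return best[-1][last], path


# Toy model with made-up numbers (an assumption, for illustration only)
states = ["Healthy", "Fever"]
start_p = {"Healthy": 0.6, "Fever": 0.4}
trans_p = {
    "Healthy": {"Healthy": 0.7, "Fever": 0.3},
    "Fever": {"Healthy": 0.4, "Fever": 0.6},
}
emit_p = {
    "Healthy": {"normal": 0.5, "cold": 0.4, "dizzy": 0.1},
    "Fever": {"normal": 0.1, "cold": 0.3, "dizzy": 0.6},
}

prob, path = viterbi(["normal", "cold", "dizzy"], states, start_p, trans_p, emit_p)
print(path, prob)
```

For long sequences a practical implementation would usually work with log-probabilities to avoid numerical underflow; the sketch multiplies raw probabilities only to keep the recursion easy to read.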

### Objectives

At the end of this course students should:

• Understand how learning and inference can be captured within a probabilistic framework, and know how probability theory can be applied in practice as a means of handling uncertainty in AI systems.

• Understand several state-of-the-art algorithms for machine learning, and be able to apply them with proper regard for good experimental practice.

### Recommended reading

If you are going to buy a single book for this course we recommend:

* Bishop, C.M. (2006). Pattern recognition and machine learning. Springer.

These cover some relevant material, but often in insufficient detail:

* Mitchell, T.M. (1997). Machine Learning. McGraw-Hill.
* Russell, S. & Norvig, P. (2010). Artificial intelligence: a modern approach. Prentice Hall (3rd ed.).

Recently a few new books have appeared that cover a lot of relevant ground well:

* Barber, D. (2012). Bayesian Reasoning and Machine Learning. Cambridge University Press.
* Flach, P. (2012). Machine Learning: The Art and Science of Algorithms that Make Sense of Data. Cambridge University Press.
* Murphy, K.P. (2012). Machine Learning: A Probabilistic Perspective. MIT Press.