Machine Learning and Bayesian Inference
Principal lecturer: Dr Sean Holden
Taken by: Part II CST
Term: Lent
Hours: 16
Format: In-person lectures
Suggested hours of supervisions: 4
Prerequisites: Data Science, Discrete Mathematics
Aims
The Part IB course Artificial Intelligence introduced simple neural networks for supervised learning, and logic-based methods for knowledge representation and reasoning. This course has two aims. First, to provide a rigorous introduction to machine learning, moving beyond the supervised case and ultimately presenting state-of-the-art methods. Second, to provide an introduction to the wider area of probabilistic methods for representing and reasoning with knowledge.
Lectures
- Introduction to learning and inference. Supervised, unsupervised, semi-supervised and reinforcement learning. Bayesian inference in general. What the naive Bayes method actually does (sketched after this list). Review of backpropagation. Other kinds of learning and inference. [1 lecture]
- How to classify optimally. Treating learning probabilistically. Bayesian decision theory and Bayes optimal classification. Likelihood functions and priors. Bayes' theorem as applied to supervised learning. The maximum likelihood and maximum a posteriori hypotheses (contrasted in a sketch below). What does this teach us about the backpropagation algorithm? [2 lectures]
- Linear classifiers I. Supervised learning via error minimization. Iterative reweighted least squares (sketched below). The maximum margin classifier. [2 lectures]
- Gaussian processes. Learning and inference for regression using Gaussian process models (a minimal sketch follows the list). [2 lectures]
- Support vector machines (SVMs). The kernel trick (illustrated below). Problem formulation. Constrained optimization and the dual problem. SVM algorithm. [2 lectures]
- Practical issues. Hyperparameters. Measuring performance. Cross-validation (sketched below). Experimental methods. [1 lecture]
- Linear classifiers II. The Bayesian approach to neural networks. [1 lecture]
- Unsupervised learning I. The k-means algorithm (sketched below). Clustering as a maximum likelihood problem. [1 lecture]
- Unsupervised learning II. The EM algorithm and its application to clustering (sketched below). [1 lecture]
- Bayesian networks I. Representing uncertain knowledge using Bayesian networks. Conditional independence. Exact inference in Bayesian networks (sketched below). [2 lectures]
- Bayesian networks II. Markov random fields. Approximate inference. Markov chain Monte Carlo methods (sketched below). [1 lecture]
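Illustrative sketches
To give a flavour of the material, the short Python sketches below illustrate some of the lecture topics. They are minimal sketches rather than course material: all data, parameter values and helper names in them are made up for illustration.
What naive Bayes actually does can be stated in one line: choose the class maximising the prior times a product of per-feature likelihoods. A minimal sketch, assuming binary features, two classes and add-one smoothing (all illustrative choices):
    import numpy as np

    def fit_naive_bayes(X, y):
        """Estimate class priors and per-feature Bernoulli likelihoods,
        with add-one (Laplace) smoothing; the smoothing is an assumption."""
        classes = np.unique(y)
        priors = np.array([(y == c).mean() for c in classes])
        # theta[c, j] approximates P(x_j = 1 | class c).
        theta = np.array([(X[y == c].sum(axis=0) + 1.0) / ((y == c).sum() + 2.0)
                          for c in classes])
        return classes, priors, theta

    def nb_predict(x, classes, priors, theta):
        # Choose the class maximising log P(c) + sum_j log P(x_j | c).
        log_post = (np.log(priors)
                    + (x * np.log(theta) + (1 - x) * np.log(1 - theta)).sum(axis=1))
        return classes[np.argmax(log_post)]

    X = np.array([[1, 0, 1], [1, 1, 1], [0, 0, 0], [0, 1, 0]])  # toy data
    y = np.array([1, 1, 0, 0])
    print(nb_predict(np.array([1, 1, 1]), *fit_naive_bayes(X, y)))  # -> 1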
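The distinction between the maximum likelihood and maximum a posteriori hypotheses is easy to see for a biased coin with a Beta prior; the counts and the prior below are made-up numbers:
    heads, tails = 3, 1            # observed coin flips (illustrative)
    a, b = 2.0, 2.0                # Beta(a, b) prior pseudo-counts (an assumption)

    p_ml = heads / (heads + tails)                         # maximises the likelihood
    p_map = (heads + a - 1) / (heads + tails + a + b - 2)  # maximises the posterior

    print(p_ml)    # 0.75
    print(p_map)   # 0.666... pulled towards the prior mean of 0.5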
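Iterative reweighted least squares for logistic regression is just Newton's method applied to the log-likelihood; a sketch, with made-up (deliberately non-separable) data and a small ridge term added as an assumption for numerical stability:
    import numpy as np

    def irls(X, y, iters=10):
        w = np.zeros(X.shape[1])
        for _ in range(iters):
            p = 1.0 / (1.0 + np.exp(-X @ w))        # current predicted probabilities
            R = p * (1.0 - p)                       # diagonal of the weight matrix
            H = X.T @ (R[:, None] * X) + 1e-8 * np.eye(X.shape[1])  # Hessian + ridge
            g = X.T @ (p - y)                       # gradient of the negative log-likelihood
            w -= np.linalg.solve(H, g)              # Newton step
        return w

    X = np.array([[1.0, -2.0], [1.0, 1.0], [1.0, -1.0], [1.0, 2.0]])  # bias + feature
    y = np.array([0.0, 0.0, 1.0, 1.0])              # deliberately not separable
    print(irls(X, y))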
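Gaussian process regression has a closed-form posterior; a minimal sketch using a squared-exponential kernel, where the lengthscale, noise level and data are all illustrative assumptions:
    import numpy as np

    def rbf(A, B, ell=0.5):
        # Squared-exponential kernel k(a, b) = exp(-(a - b)^2 / (2 ell^2)).
        return np.exp(-0.5 * ((A[:, None] - B[None, :]) / ell) ** 2)

    X = np.array([-1.0, 0.0, 1.0])      # training inputs (made up)
    y = np.sin(X)                       # training targets
    Xs = np.linspace(-2.0, 2.0, 5)      # test inputs
    noise = 1e-2                        # assumed observation noise variance

    K = rbf(X, X) + noise * np.eye(len(X))
    Ks = rbf(Xs, X)
    mean = Ks @ np.linalg.solve(K, y)                    # posterior mean
    cov = rbf(Xs, Xs) - Ks @ np.linalg.solve(K, Ks.T)    # posterior covariance
    print(mean)
    print(np.sqrt(np.maximum(np.diag(cov), 0.0)))        # pointwise std. deviations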
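The kernel trick replaces inner products in a feature space by a kernel evaluation in the input space; a small numerical check for the degree-2 polynomial kernel, where phi is the standard explicit feature expansion in two dimensions:
    import numpy as np

    def phi(v):
        # Explicit feature map whose inner product equals (x.z + 1)^2 in 2-D.
        v1, v2 = v
        s = np.sqrt(2.0)
        return np.array([v1 * v1, v2 * v2, s * v1 * v2, s * v1, s * v2, 1.0])

    x = np.array([1.0, 2.0])
    z = np.array([3.0, -1.0])
    print((x @ z + 1.0) ** 2)   # kernel evaluation: no feature space needed
    print(phi(x) @ phi(z))      # identical value via the 6-D feature map
The point is that the left-hand computation never forms the six-dimensional vectors, which is what makes kernels attractive when the implicit feature space is large or infinite.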
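k-fold cross-validation is a protocol rather than a model; the sketch below therefore uses a deliberately trivial "predict the training mean" regressor so that only the protocol is on show (fold count, seed and data are assumptions):
    import numpy as np

    def k_fold_cv(y, k=5, seed=0):
        rng = np.random.default_rng(seed)
        folds = np.array_split(rng.permutation(len(y)), k)
        scores = []
        for i in range(k):
            test = folds[i]
            train = np.concatenate([folds[j] for j in range(k) if j != i])
            y_hat = y[train].mean()                         # "train" the model
            scores.append(np.mean((y[test] - y_hat) ** 2))  # held-out squared error
        return np.mean(scores)

    y = 2.0 * np.arange(20) + 1.0
    print(k_fold_cv(y))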
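The k-means algorithm alternates an assignment step and an update step until the centres stop moving; a bare-bones sketch on made-up data (the initialisation, the stopping rule and the unhandled empty-cluster case are all simplifications):
    import numpy as np

    def k_means(X, k, iters=100, seed=0):
        rng = np.random.default_rng(seed)
        centres = X[rng.choice(len(X), k, replace=False)]   # init at data points
        for _ in range(iters):
            # Assignment step: each point goes to its nearest centre.
            d = np.linalg.norm(X[:, None, :] - centres[None, :, :], axis=2)
            labels = d.argmin(axis=1)
            # Update step: each centre moves to the mean of its points
            # (empty clusters are not handled in this sketch).
            new = np.array([X[labels == j].mean(axis=0) for j in range(k)])
            if np.allclose(new, centres):
                break
            centres = new
        return centres, labels

    rng = np.random.default_rng(1)
    X = np.vstack([rng.normal(0.0, 0.3, (20, 2)), rng.normal(5.0, 0.3, (20, 2))])
    print(k_means(X, 2)[0])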
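Viewed as maximum likelihood with latent component labels, clustering leads to the EM algorithm; a compact sketch for a two-component one-dimensional Gaussian mixture (the initialisation and iteration count are arbitrary choices):
    import numpy as np

    def em_gmm(x, iters=50):
        mu = np.array([x.min(), x.max()])        # crude initial means
        var = np.array([x.var(), x.var()])
        w = np.array([0.5, 0.5])                 # mixing proportions
        for _ in range(iters):
            # E-step: responsibilities r[n, k] = P(component k | x_n).
            dens = (np.exp(-0.5 * (x[:, None] - mu) ** 2 / var)
                    / np.sqrt(2.0 * np.pi * var))
            r = w * dens
            r /= r.sum(axis=1, keepdims=True)
            # M-step: re-estimate the parameters from the weighted data.
            Nk = r.sum(axis=0)
            mu = (r * x[:, None]).sum(axis=0) / Nk
            var = (r * (x[:, None] - mu) ** 2).sum(axis=0) / Nk
            w = Nk / len(x)
        return w, mu, var

    rng = np.random.default_rng(0)
    x = np.concatenate([rng.normal(-2.0, 1.0, 100), rng.normal(3.0, 1.0, 100)])
    print(em_gmm(x))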
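Exact inference in a Bayesian network can be done by summing the joint distribution over the hidden variables; a toy three-node network in the style of the usual sprinkler example, with illustrative conditional probability table entries:
    # Network: Rain -> Sprinkler, and Rain, Sprinkler -> WetGrass.
    P_r = {True: 0.2, False: 0.8}                    # P(Rain)
    P_s = {True: {True: 0.01, False: 0.99},          # P(Sprinkler | Rain)
           False: {True: 0.4, False: 0.6}}
    P_w = {(True, True): 0.99, (False, True): 0.8,   # P(Wet | Sprinkler, Rain)
           (True, False): 0.9, (False, False): 0.0}

    def joint(r, s, w):
        pw = P_w[(s, r)]
        return P_r[r] * P_s[r][s] * (pw if w else 1.0 - pw)

    # P(Rain | Wet = true): sum out the hidden Sprinkler, then normalise.
    num = sum(joint(True, s, True) for s in (True, False))
    den = sum(joint(r, s, True) for r in (True, False) for s in (True, False))
    print(num / den)   # about 0.36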
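When exact inference is intractable, Markov chain Monte Carlo methods draw samples instead; a minimal Metropolis sampler targeting a standard normal, where the proposal width, chain length and target are all illustrative:
    import numpy as np

    def metropolis(log_p, x0=0.0, steps=10000, step=1.0, seed=0):
        rng = np.random.default_rng(seed)
        x, samples = x0, []
        for _ in range(steps):
            prop = x + step * rng.normal()           # symmetric random-walk proposal
            # Accept with probability min(1, p(prop) / p(x)).
            if np.log(rng.uniform()) < log_p(prop) - log_p(x):
                x = prop
            samples.append(x)
        return np.array(samples)

    s = metropolis(lambda v: -0.5 * v * v)   # unnormalised log-density of N(0, 1)
    print(s.mean(), s.var())                 # should be close to 0 and 1
Note that only an unnormalised density is needed, since the normalising constant cancels in the acceptance ratio; this is exactly what makes such methods usable for posterior inference.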
Objectives
At the end of this course students should:
- Understand how learning and inference can be captured within a probabilistic framework, and know how probability theory can be applied in practice as a means of handling uncertainty in AI systems.
- Understand several machine learning algorithms and be able to apply them in practice with proper regard for sound experimental method.
Recommended reading
If you are going to buy a single book for this course, I recommend:
* Bishop, C.M. (2006). Pattern Recognition and Machine Learning. Springer.
The course text for Artificial Intelligence:
Russell, S. and Norvig, P. (2010). Artificial Intelligence: A Modern Approach. Prentice Hall (3rd ed.).
covers some relevant material but often in insufficient detail. Similarly:
Mitchell, T.M. (1997). Machine Learning. McGraw-Hill.
gives a gentle introduction to some of the course material, but only an introduction.
Recently a few new books have appeared that cover a lot of relevant ground well. For example:
* Barber, D. (2012). Bayesian Reasoning and Machine Learning. Cambridge University Press.
* Flach, P. (2012). Machine Learning: The Art and Science of Algorithms that Make Sense of Data. Cambridge University Press.
* Murphy, K.P. (2012). Machine Learning: A Probabilistic Perspective. MIT Press.