Department of Computer Science and Technology

Course pages 2017–18

Machine Learning and Bayesian Inference

Lecturer: Dr S.B. Holden

No. of lectures: 16

Suggested hours of supervisions: 4

Prerequisite courses: Artificial Intelligence, Foundations of Data Science, Discrete Mathematics and Probability, Linear Algebra and Calculus from the NST Mathematics course.

Aims

The Part IB course Artificial Intelligence introduced simple neural networks for supervised learning, and logic-based methods for knowledge representation and reasoning. This course has two aims. First, to provide a comprehensive introduction to machine learning, moving beyond the supervised case and ultimately presenting state-of-the-art methods. Second, to provide an introduction to the wider area of probabilistic methods for representing and reasoning with knowledge.

Lectures

  • Introduction to learning and inference. Supervised, unsupervised, semi-supervised and reinforcement learning. Bayesian inference in general. What the naive Bayes method actually does. Review of backpropagation. Other kinds of learning and inference. [1 lecture]

  • How to classify optimally. Treating learning probabilistically. Bayesian decision theory and Bayes optimal classification. Generative and discriminative models. Likelihood functions and priors. Bayes' theorem as applied to supervised learning. The maximum likelihood and maximum a posteriori hypotheses (summarized in the equations after this list). What does this teach us about the backpropagation algorithm? [1 lecture]

  • Linear classifiers I. Supervised learning via error minimization. Iterative reweighted least squares. The maximum margin classifier. [1 lecture]

  • Support vector machines (SVMs). The kernel trick. Problem formulation. Constrained optimization and the dual problem (stated after this list). SVM algorithm. [2 lectures]

  • Practical issues. Hyperparameters. Measuring performance. Cross-validation. Experimental methods. Multiple classes. [1 lecture]

  • Linear classifiers II. The Bayesian approach to neural networks. [1 lecture]

  • Gaussian processes. Learning and inference for regression using Gaussian process models (see the predictive equations after this list). [2 lectures]

  • Deep networks. Scaling up backpropagation. Convolutional networks. [1 lecture]

  • Unsupervised learning I. The k-means algorithm (sketched in code after this list). Clustering as a maximum likelihood problem. [1 lecture]

  • Unsupervised learning II. The EM algorithm and its application to clustering. [1 lecture]

  • Reinforcement learning I. Learning from rewards and punishments. Markov decision processes. The problems of temporal credit assignment and exploration versus exploitation. [1 lecture]

  • Reinforcement learning II. Q-learning and temporal difference learning (sketched in code after this list). [1 lecture]

  • Bayesian networks I. Representing uncertain knowledge using Bayesian networks (the defining factorization is given after this list). Conditional independence. Exact inference in Bayesian networks. [1 lecture]

  • Bayesian networks II. Markov random fields. Approximate inference. Markov chain Monte Carlo methods. [1 lecture]
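
For reference, a few of the formulas and algorithms named above are collected here in summary form. First, the core identities behind the probabilistic treatment of learning, in the standard textbook formulation (the notation, writing h for a hypothesis and D for the training data, is illustrative rather than the lectures' own):

    % Bayes' theorem applied to supervised learning
    p(h \mid D) = \frac{p(D \mid h) \, p(h)}{p(D)}

    % The maximum a posteriori (MAP) and maximum likelihood (ML) hypotheses
    h_{\mathrm{MAP}} = \arg\max_h \, p(D \mid h) \, p(h)
    \qquad
    h_{\mathrm{ML}} = \arg\max_h \, p(D \mid h)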
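
The dual problem behind the SVM lectures has the following standard soft-margin form, summing over the m training examples, where the \alpha_i are Lagrange multipliers, y_i \in \{-1, +1\} are the labels, k is the kernel, and C is a regularization hyperparameter:

    \max_{\alpha} \; \sum_{i=1}^{m} \alpha_i
      - \frac{1}{2} \sum_{i=1}^{m} \sum_{j=1}^{m}
        \alpha_i \alpha_j y_i y_j \, k(\mathbf{x}_i, \mathbf{x}_j)

    \text{subject to} \quad 0 \le \alpha_i \le C
    \quad \text{and} \quad \sum_{i=1}^{m} \alpha_i y_i = 0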
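
For Gaussian process regression, the predictive distribution at a test input \mathbf{x}_* takes the standard form below, assuming a zero-mean prior, kernel matrix K over the training inputs, targets \mathbf{y}, observation noise variance \sigma^2, and \mathbf{k}_* the vector of covariances between \mathbf{x}_* and the training inputs:

    \mu(\mathbf{x}_*) = \mathbf{k}_*^{\top} (K + \sigma^2 I)^{-1} \mathbf{y}

    \mathrm{var}(\mathbf{x}_*) = k(\mathbf{x}_*, \mathbf{x}_*)
      - \mathbf{k}_*^{\top} (K + \sigma^2 I)^{-1} \mathbf{k}_*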
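
The k-means algorithm is short enough to sketch in full. Below is a minimal Python/NumPy version; the function name and interface are our own for illustration, not anything prescribed by the course:

    import numpy as np

    def kmeans(X, k, n_iters=100, seed=None):
        """Minimal k-means: alternate assignment and mean-update steps."""
        rng = np.random.default_rng(seed)
        # Initialize centroids as k distinct points drawn from the data.
        centroids = X[rng.choice(len(X), size=k, replace=False)]
        for _ in range(n_iters):
            # Assignment step: each point joins its nearest centroid.
            dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
            labels = dists.argmin(axis=1)
            # Update step: each centroid moves to the mean of its cluster
            # (an empty cluster keeps its previous centroid).
            new_centroids = np.array([
                X[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
                for j in range(k)
            ])
            if np.allclose(new_centroids, centroids):
                break  # assignments have stabilized
            centroids = new_centroids
        return centroids, labels

The alternation of a hard assignment step with a re-estimation step foreshadows the EM algorithm of the following lecture.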
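
Tabular Q-learning admits a similarly brief sketch. The environment interface here (reset() returning a state index, step(a) returning a (next_state, reward, done) triple) is a hypothetical one assumed for illustration:

    import numpy as np

    def q_learning(env, n_states, n_actions, episodes=500,
                   alpha=0.1, gamma=0.99, epsilon=0.1, seed=None):
        """Tabular Q-learning with epsilon-greedy exploration."""
        rng = np.random.default_rng(seed)
        Q = np.zeros((n_states, n_actions))
        for _ in range(episodes):
            s, done = env.reset(), False
            while not done:
                # Exploration versus exploitation: epsilon-greedy choice.
                if rng.random() < epsilon:
                    a = int(rng.integers(n_actions))
                else:
                    a = int(Q[s].argmax())
                s_next, r, done = env.step(a)
                # Temporal-difference update towards the one-step target.
                target = r + (0.0 if done else gamma * Q[s_next].max())
                Q[s, a] += alpha * (target - Q[s, a])
                s = s_next
        return Q

The difference (target - Q[s, a]) is the temporal-difference error, and the epsilon-greedy choice makes the exploration-versus-exploitation trade-off explicit.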
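
Finally, the defining property used in the Bayesian networks lectures is the standard factorization of the joint distribution according to the directed acyclic graph:

    p(x_1, \dots, x_n) = \prod_{i=1}^{n} p(x_i \mid \mathrm{parents}(x_i))

Conditional independence statements can then be read off the graph structure, and both exact and approximate inference exploit this factorization.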

Objectives

At the end of this course students should:

  • Understand how learning and inference can be captured within a probabilistic framework, and know how probability theory can be applied in practice as a means of handling uncertainty in AI systems.

  • Understand several state-of-the-art algorithms for machine learning and be able to apply those methods in practice with due regard for good experimental practice.

Recommended reading

If you are going to buy a single book for this course, we recommend:

Bishop, C.M. (2006). Pattern Recognition and Machine Learning. Springer.

The course text for Artificial Intelligence:

Russell, S. & Norvig, P. (2010). Artificial Intelligence: A Modern Approach. Prentice Hall (3rd ed.).

covers some relevant material but often in insufficient detail. Similarly:

Mitchell, T.M. (1997). Machine Learning. McGraw-Hill.

gives a gentle introduction to some of the course material, but only an introduction. A few more recent books cover much of the relevant ground well. For example:

Barber, D. (2012). Bayesian Reasoning and Machine Learning. Cambridge University Press.
Flach, P. (2012). Machine Learning: The Art and Science of Algorithms that Make Sense of Data. Cambridge University Press.
Murphy, K.P. (2012). Machine Learning: A Probabilistic Perspective. MIT Press.