Course pages 2013–14

Computer Vision

Lecture Notes (PDF, 1-up for tablets), or a 2-up version of the Lecture Notes (for laptops and monitors)
Exercises (PDF)
Exercises with Solutions (PDF)

Aims

The aims of this course are to introduce the principles, models and applications of computer vision, as well as some mechanisms used in biological visual systems that may inspire design of artificial ones. The course will cover: image formation, structure, and coding; edge and feature detection; neural operators for image analysis; texture, colour, stereo, motion; wavelet methods for visual coding and analysis; interpretation of surfaces, solids, and shapes; data fusion; probabilistic classifiers; visual inference and learning. Issues will be illustrated using the examples of optical character recognition, image retrieval, and face recognition.

Lectures

Goals of computer vision; why they are so difficult. How images are formed, and the ill-posed problem of making 3D inferences from them about objects and their properties.
Image sensing, pixel arrays, CCD cameras. Image coding and information measures. Elementary operations on image arrays.
Biological visual mechanisms from retina to cortex. Photoreceptor sampling; receptive field profiles; spike trains; channels and pathways. Neural image encoding operators.
Mathematical operations for extracting image structure. Finite differences and directional derivatives. Filters; convolution; correlation. 2D Fourier domain theorems.
Edge detection operators; the information revealed by edges. The Laplacian operator and its zero-crossings. Logan's Theorem.
Multi-scale feature detection and matching. SIFT (scale-invariant feature transform); pyramids. 2D wavelets as visual primitives. Energy-minimising snakes; active contours.
Higher level visual operations in brain cortical areas. Multiple parallel mappings; streaming and divisions of labour; reciprocal feedback through the visual system.
Texture, colour, stereo, and motion descriptors. Disambiguation and the achievement of invariances. Image and motion segmentation.
Lambertian and specular surfaces. Reflectance maps, and image formation geometry. Discounting the illuminant when inferring 3D structure and surface properties.
Perceptual psychology and visual cognition. Vision as model-building and graphics in the brain. "Learning to see."
Lessons from neurological trauma and visual deficits. Visual illusions and what they may imply about how vision works.
Bayesian inference in vision; knowledge-driven interpretations. Classifiers and probabilistic decision-making.
Model estimation. Machine learning and statistical methods in vision.
Applications of machine learning in vision: discriminative versus generative methods. Optical character recognition. Content based image retrieval.
Approaches to face detection, face recognition, and facial interpretation.
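
As a concrete illustration of the convolution and Laplacian zero-crossing topics listed above, here is a minimal sketch in plain Python. It is not taken from the lecture notes or the practicals: the function names, the 4-neighbour Laplacian kernel, and the synthetic step-edge image are all assumptions chosen for illustration.

```python
# Illustrative sketch only; names and test image are hypothetical,
# not taken from the course materials.

# A common discrete approximation to the Laplacian (4-neighbour form).
LAPLACIAN = [[0, 1, 0],
             [1, -4, 1],
             [0, 1, 0]]


def convolve2d(image, kernel):
    """'Valid' 2D convolution of a list-of-lists image with a small kernel."""
    kh, kw = len(kernel), len(kernel[0])
    h, w = len(image), len(image[0])
    out = []
    for y in range(h - kh + 1):
        row = []
        for x in range(w - kw + 1):
            acc = 0
            for j in range(kh):
                for i in range(kw):
                    # The kernel is flipped in both axes; without the flip
                    # this would be correlation rather than convolution.
                    acc += kernel[kh - 1 - j][kw - 1 - i] * image[y + j][x + i]
            row.append(acc)
        out.append(row)
    return out


def zero_crossings(response):
    """Mark positions where the sign changes towards the right or lower neighbour."""
    h, w = len(response), len(response[0])
    marks = [[False] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            if x + 1 < w and response[y][x] * response[y][x + 1] < 0:
                marks[y][x] = True
            if y + 1 < h and response[y][x] * response[y + 1][x] < 0:
                marks[y][x] = True
    return marks


# Synthetic image: a vertical step edge between columns 2 and 3.
image = [[0, 0, 0, 10, 10, 10] for _ in range(5)]

response = convolve2d(image, LAPLACIAN)   # each row is [0, 10, -10, 0]
marks = zero_crossings(response)          # True only where 10 meets -10

for row in marks:
    print("".join("#" if m else "." for m in row))
```

The Laplacian response is positive on the dark side of the step and negative on the bright side, so the sign change localises the edge. Real images are normally smoothed first (for example with a Gaussian), because second derivatives amplify noise.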

Objectives

At the end of the course students should

understand visual processing from both "bottom-up" (data oriented) and "top-down" (goals oriented) perspectives
be able to decompose visual tasks into sequences of image analysis operations, representations, specific algorithms, and inference principles
understand the roles of image transformations and their invariances in pattern recognition and classification
be able to describe and contrast techniques for extracting and representing features, edges, shapes, and textures
be able to analyse the robustness, brittleness, generalisability, and performance of different approaches in computer vision
be able to describe key aspects of how biological visual systems encode, analyse, and represent visual information
be able to think of ways in which biological visual strategies might be implemented in machine vision, despite the enormous differences in hardware
understand in depth at least one major practical application domain, such as face recognition, detection, and interpretation

Reference books

* Forsyth D A and Ponce J (2003). Computer Vision: A Modern Approach. (Prentice Hall)

Shapiro L and Stockman G (2001). Computer Vision. (Prentice Hall: ISBN 0-13-030796-3)

Duda R O, Hart P E, and Stork D G (2001). Pattern Classification, 2nd ed. (Wiley: ISBN 0-471-05669-3)

Assigned Exercises (Written and Practical)

(week of 24 Jan 2014): Exercises 1 - 5.
(week of 31 Jan 2014): Exercises 6 - 8.
Supplementary demonstrations (PPT) of mathematical operations in early vision. (Slide credits: C Town, A Torralba, D Forsyth, K Grauman, B Macq, S Seitz, L Lazebnik, M Irani, B Leibe, A Poonawala, D Lowe)
(week of 7 Feb 2014): Exercises 9 - 10.
Practical Exercises 1 & 2 on edge detection and other early vision operations (by C Richardt, T Baltrusaitis, and L Swirski).
(week of 14 Feb 2014): Exercises 11 - 12.
(week of 21 Feb 2014): Exercises 13 - 14.
Also study this compelling lightness illusion, this illustration of colour-constancy, this motion illusion, and this collection of dynamic, colour, and cognitive illusions, and try to explain them! More collections exist here and here.
(week of 28 Feb 2014): Exercise 15a.
Practical Exercises 3 (on panorama stitching, a form of the Correspondence Problem) & 4 (an OCR Bayesian classifier).
(week of 7 Mar 2014): Exercises 15b,c.
(week of 14 Mar 2014): Exercises 16 - 17.
View this 5-minute video about 3-D morphable face representations. Here is a background paper about face recognition.
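
For readers preparing for the OCR Bayesian classifier practical listed above, the following is a minimal sketch of one possible probabilistic classifier: a naive Bayes model over binary pixels. It is not the practical's actual code or method. The 3x3 glyph templates, the function names, and the Laplace smoothing parameter `alpha` are all assumptions made for illustration.

```python
import math

# Illustrative sketch only; the glyphs, names and smoothing choice are
# hypothetical and not taken from the course practicals.


def train(examples, alpha=1.0):
    """Fit a Bernoulli naive Bayes model.

    examples: list of (label, pixels), pixels being a tuple of 0/1 values.
    Returns {label: (log_prior, per-pixel probability of being 'on')}.
    """
    counts = {}
    on_counts = {}
    for label, pixels in examples:
        counts[label] = counts.get(label, 0) + 1
        acc = on_counts.setdefault(label, [0] * len(pixels))
        for i, p in enumerate(pixels):
            acc[i] += p
    total = len(examples)
    model = {}
    for label, n in counts.items():
        log_prior = math.log(n / total)
        # Laplace smoothing keeps every probability strictly between 0 and 1.
        p_on = [(c + alpha) / (n + 2 * alpha) for c in on_counts[label]]
        model[label] = (log_prior, p_on)
    return model


def classify(model, pixels):
    """Return the label with the highest log posterior (up to a constant)."""
    best_label, best_score = None, -math.inf
    for label, (log_prior, p_on) in model.items():
        score = log_prior
        for p, q in zip(pixels, p_on):
            score += math.log(q if p else 1.0 - q)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Two 3x3 binary glyphs, flattened row by row.
GLYPH_I = (0, 1, 0,
           0, 1, 0,
           0, 1, 0)
GLYPH_L = (1, 0, 0,
           1, 0, 0,
           1, 1, 1)

model = train([("I", GLYPH_I), ("L", GLYPH_L)])

# Noisy inputs: one pixel of each template has been switched off.
noisy_i = (0, 1, 0, 0, 1, 0, 0, 0, 0)
noisy_l = (1, 0, 0, 1, 0, 0, 1, 1, 0)

print(classify(model, noisy_i), classify(model, noisy_l))
```

The classifier applies Bayes' rule with the (strong) assumption that pixels are conditionally independent given the character class. Working in log probabilities avoids numerical underflow when the number of pixels is large.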

Online resources

"CVonline: The Evolving, Distributed, Non-Proprietary, On-Line Compendium of Computer Vision" (Edinburgh University; updated Nov. 2013; includes links to many Wikipedia pages)
Matlab Functions for Computer Vision and Image Processing (updated May 2013)
Annotated Computer Vision Bibliography (updated 15 Dec. 2013)

Computer Laboratory