Course pages 2011–12
No. of lectures + examples classes: 15 + 1
Prerequisite courses: Probability, Mathematical Methods for Computer Science
The aims of this course are to introduce the principles, models and applications of computer vision, as well as some mechanisms used in biological visual systems that may inspire design of artificial ones. The course will cover: image formation, structure, and coding; edge and feature detection; neural operators for image analysis; texture, colour, stereo, and motion; wavelet methods for visual coding and analysis; interpretation of surfaces, solids, and shapes; classifiers and pattern recognition; visual inference and learning. Several of these issues will be illustrated in the topic of face recognition.
- Goals of computer vision; why they are so difficult. How images are formed, and the ill-posed problem of making 3D inferences from them about objects and their properties.
- Image sensing, pixel arrays, CCD cameras. Image coding and information measures. Elementary operations on image arrays.
- Biological visual mechanisms, from retina to cortex. Photoreceptor sampling; receptive field profiles; stochastic impulse codes; channels and pathways. Neural image encoding operators.
- Mathematical operators for extracting image structure. Finite differences and directional derivatives. Filters; convolution; correlation. 2D Fourier domain theorems.
- Edge detection operators; the information revealed by edges. The Laplacian operator and its zero-crossings. Logan’s theorem.
- Multi-resolution representations. Gaussian pyramids and SIFT (scale-invariant feature transform). Active contours; energy-minimising snakes. 2D wavelets as visual primitives.
- Higher visual operations in brain cortical areas. Multiple parallel mappings; streaming and divisions of labour; reciprocal feedback across the visual system.
- Texture, colour, stereo, and motion descriptors. Disambiguation and the achievement of invariances when inferring object properties from images.
- Lambertian and specular surfaces; reflectance maps. Geometric analysis of image formation from surfaces. Discounting the illuminant when inferring 3D structure from image properties.
- Shape representation. Inferring 3D shape from shading; surface geometry. Boundary descriptors; codons. Object-centred coordinates and the “2.5-Dimensional” sketch.
- Perceptual psychology and visual cognition. Vision as model-building and graphics in the brain. Learning to see.
- Lessons from visual illusions and from neurological trauma. Visual agnosias and illusions, and what they may imply about how vision works.
- Bayesian inference in vision; knowledge-driven interpretations. Classifiers and pattern recognition. Probabilistic methods in vision.
- Vision as a set of inverse problems. Mathematical methods for solving them: energy minimisation, relaxation, regularisation. Active models.
- Applications of machine learning in computer vision. Discriminative and generative methods. Content-based image retrieval.
- Approaches to face detection, face recognition, and facial interpretation. Appearance versus model-based methods (2D and 3D approaches). Cascaded detectors.
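As an informal illustration of three of the operators named in the lecture list (convolution, the Laplacian, and its zero-crossings), the following is a minimal sketch in plain Python. It is not taken from the course materials: the function names `convolve2d` and `zero_crossings`, the 4-neighbour Laplacian kernel, and the small step-edge test image are all assumptions chosen for illustration, and a practical system would normally smooth the image first and use an array library.

```python
# Illustrative sketch only; all names and the test image are assumptions.

# A common discrete 4-neighbour approximation to the Laplacian operator.
LAPLACIAN = [[0, 1, 0],
             [1, -4, 1],
             [0, 1, 0]]


def convolve2d(image, kernel):
    """'Valid' 2D convolution of a list-of-lists image with a small kernel."""
    kh, kw = len(kernel), len(kernel[0])
    ih, iw = len(image), len(image[0])
    # Convolution flips the kernel in both axes (correlation would not).
    flipped = [row[::-1] for row in kernel[::-1]]
    out = []
    for r in range(ih - kh + 1):
        out_row = []
        for c in range(iw - kw + 1):
            acc = 0
            for i in range(kh):
                for j in range(kw):
                    acc += flipped[i][j] * image[r + i][c + j]
            out_row.append(acc)
        out.append(out_row)
    return out


def zero_crossings(response):
    """Mark pixels whose right or lower neighbour has the opposite sign."""
    h, w = len(response), len(response[0])
    marks = [[False] * w for _ in range(h)]
    for r in range(h):
        for c in range(w):
            here = response[r][c]
            if c + 1 < w and here * response[r][c + 1] < 0:
                marks[r][c] = True
            if r + 1 < h and here * response[r + 1][c] < 0:
                marks[r][c] = True
    return marks


# A 6x6 test image: dark left half (0), bright right half (10).
step_image = [[0, 0, 0, 10, 10, 10] for _ in range(6)]
response = convolve2d(step_image, LAPLACIAN)
edges = zero_crossings(response)
```

On this test image the Laplacian response changes sign exactly where the brightness steps from dark to bright, so the marked zero-crossings form a vertical line at the edge.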
At the end of the course students should:
- understand visual processing from both “bottom-up” (data-oriented) and “top-down” (goal-oriented) perspectives;
- be able to decompose visual tasks into sequences of image analysis operations, representations, specific algorithms, and inference principles;
- understand the roles of image transformations and their invariances in pattern recognition and classification;
- be able to analyse the robustness, brittleness, generalizability, and performance of different approaches in computer vision;
- be able to describe key aspects of how biological visual systems work, and be able to think of ways in which biological visual strategies might be implemented in machine vision, despite the enormous differences in hardware;
- understand the roles of machine learning in computer vision today, including probabilistic inference, discriminative and generative methods;
- understand in depth at least one major practical application problem, such as face recognition, detection, and interpretation.
* Shapiro, L. & Stockman, G. (2001). Computer vision. Prentice Hall.