Course material 2010–11
Lecturer: Dr C.P. Town
No. of lectures: 12
Prerequisite courses: Probability; Mathematical Methods for Computer Science; Artificial Intelligence I (recommended)
The aims of this course are to introduce the principles, models and applications of computer vision. The course will cover: image formation, structure, and coding; edge and feature detection; texture, colour, stereo, and motion; wavelet methods for visual coding and analysis; interpretation of surfaces, solids, and shapes; appearance modelling; pattern recognition and classification; visual inference and learning. Several of these issues will be illustrated using the examples of optical character recognition, image retrieval, and face recognition.
- Goals of computer vision; why they are so difficult. How images are formed, and the ill-posed problem of making 3D inferences from them about objects and their properties.
- Image sensing, pixel arrays, cameras. Elementary operations on image arrays; coding and information measures. Sampling and aliasing.
- Mathematical operators for extracting image structure. Finite differences and directional derivatives. Filters; convolution; correlation. Fourier and wavelet transforms.
- Edge detection operators; the information revealed by edges. The Laplacian operator and its zero-crossings. Logan’s theorem.
- Multi-scale feature detection and matching. Gaussian pyramids and SIFT (scale-invariant feature transform). Active contours; energy-minimising snakes. 2D wavelets as visual primitives.
- Texture, colour, stereo, and motion descriptors. Disambiguation and the achievement of invariances. Image and motion segmentation.
- Lambertian and specular surfaces. Reflectance maps. Image formation geometry. Discounting the illuminant when inferring 3D structure and surface properties.
- Shape representation. Inferring 3D shape from shading; surface geometry. Boundary descriptors; active appearance models; codons; superquadrics and the “2.5-Dimensional” sketch.
- Perceptual psychology and visual cognition. Vision as model-building and graphics in the brain. Learning to see. Visual illusions, and what they may imply about how vision works.
- Bayesian inference in vision; knowledge-driven interpretations. Classifiers and pattern recognition. Probabilistic methods in vision.
- Applications of machine learning in computer vision. Appearance-based and model-based representations. Discriminative and generative methods. Optical character recognition. Content-based image retrieval.
- Approaches to face detection, face recognition, and facial interpretation.
At the end of the course students should:
- understand visual processing from both “bottom-up” (data-oriented) and “top-down” (goal-oriented) perspectives;
- be able to decompose visual tasks into sequences of image analysis operations, representations, specific algorithms, and inference principles;
- understand the roles of image transformations and their invariances in pattern recognition and classification;
- be able to describe and contrast techniques for extracting and representing features, edges, shapes, and textures;
- be able to analyse the robustness, brittleness, generalizability, and performance of different approaches in computer vision;
- understand some of the major practical application problems, such as face interpretation, character recognition, and image retrieval.
* Forsyth, D.A. & Ponce, J. (2003). Computer vision: a modern approach. Prentice Hall.
Shapiro, L. & Stockman, G. (2001). Computer vision. Prentice Hall.