Computer Laboratory – Course material 2010–11

Computer Vision

Lecturer: Dr C.P. Town

No. of lectures: 12

Prerequisite courses: Probability, Mathematical Methods for Computer Science; Artificial Intelligence I (recommended)

Aims

The aims of this course are to introduce the principles, models and applications of computer vision. The course will cover: image formation, structure, and coding; edge and feature detection; texture, colour, stereo, and motion; wavelet methods for visual coding and analysis; interpretation of surfaces, solids, and shapes; appearance modelling; pattern recognition and classification; visual inference and learning. Several of these issues will be illustrated using the examples of optical character recognition, image retrieval, and face recognition.

Lectures

Goals of computer vision; why they are so difficult. How images are formed, and the ill-posed problem of making 3D inferences from them about objects and their properties.
Image sensing, pixel arrays, cameras. Elementary operations on image arrays; coding and information measures. Sampling and aliasing.
Mathematical operators for extracting image structure. Finite differences and directional derivatives. Filters; convolution; correlation. Fourier and wavelet transforms.
Edge detection operators; the information revealed by edges. The Laplacian operator and its zero-crossings. Logan’s theorem.
Multi-scale feature detection and matching. Gaussian pyramids and SIFT (scale-invariant feature transform). Active contours; energy-minimising snakes. 2D wavelets as visual primitives.
Texture, colour, stereo, and motion descriptors. Disambiguation and the achievement of invariances. Image and motion segmentation.
Lambertian and specular surfaces. Reflectance maps. Image formation geometry. Discounting the illuminant when inferring 3D structure and surface properties.
Shape representation. Inferring 3D shape from shading; surface geometry. Boundary descriptors; active appearance models; codons; superquadrics and the “2.5-Dimensional” sketch.
Perceptual psychology and visual cognition. Vision as model-building and graphics in the brain. Learning to see. Visual illusions, and what they may imply about how vision works.
Bayesian inference in vision; knowledge-driven interpretations. Classifiers and pattern recognition. Probabilistic methods in vision.
Applications of machine learning in computer vision. Appearance-based and model-based representations. Discriminative and generative methods. Optical character recognition. Content-based image retrieval.
Approaches to face detection, face recognition, and facial interpretation.

Objectives

At the end of the course students should

understand visual processing from both “bottom-up” (data-oriented) and “top-down” (goal-oriented) perspectives;
be able to decompose visual tasks into sequences of image analysis operations, representations, specific algorithms, and inference principles;
understand the roles of image transformations and their invariances in pattern recognition and classification;
be able to describe and contrast techniques for extracting and representing features, edges, shapes, and textures;
be able to analyse the robustness, brittleness, generalisability, and performance of different approaches in computer vision;
understand some of the major practical application problems, such as face interpretation, character recognition, and image retrieval.

Recommended reading

* Forsyth, D.A. & Ponce, J. (2003). Computer vision: a modern approach. Prentice Hall.
Shapiro, L. & Stockman, G. (2001). Computer vision. Prentice Hall.