Computer Laboratory

Course material 2010–11


Computer Vision

Lecturer: Dr C.P. Town

No. of lectures: 12

Prerequisite courses: Probability, Mathematical Methods for Computer Science; Artificial Intelligence I (recommended)

Aims

The aims of this course are to introduce the principles, models and applications of computer vision. The course will cover: image formation, structure, and coding; edge and feature detection; texture, colour, stereo, and motion; wavelet methods for visual coding and analysis; interpretation of surfaces, solids, and shapes; appearance modelling; pattern recognition and classification; visual inference and learning. Several of these issues will be illustrated using the examples of optical character recognition, image retrieval, and face recognition.

Lectures

  • Goals of computer vision; why they are so difficult to achieve. How images are formed, and the ill-posed problem of making 3D inferences from them about objects and their properties.

  • Image sensing, pixel arrays, cameras. Elementary operations on image arrays; coding and information measures. Sampling and aliasing.

  • Mathematical operators for extracting image structure. Finite differences and directional derivatives. Filters; convolution; correlation. Fourier and wavelet transforms.

  • Edge detection operators; the information revealed by edges. The Laplacian operator and its zero-crossings. Logan’s theorem.

  • Multi-scale feature detection and matching. Gaussian pyramids and SIFT (scale-invariant feature transform). Active contours; energy-minimising snakes. 2D wavelets as visual primitives.

  • Texture, colour, stereo, and motion descriptors. Disambiguation and the achievement of invariances. Image and motion segmentation.

  • Lambertian and specular surfaces. Reflectance maps. Image formation geometry. Discounting the illuminant when inferring 3D structure and surface properties.

  • Shape representation. Inferring 3D shape from shading; surface geometry. Boundary descriptors; active appearance models; codons; superquadrics and the “2.5-Dimensional” sketch.

  • Perceptual psychology and visual cognition. Vision as model-building and graphics in the brain. Learning to see. Visual illusions, and what they may imply about how vision works.

  • Bayesian inference in vision; knowledge-driven interpretations. Classifiers and pattern recognition. Probabilistic methods in vision.

  • Applications of machine learning in computer vision. Appearance-based and model-based representations. Discriminative and generative methods. Optical character recognition. Content-based image retrieval.

  • Approaches to face detection, face recognition, and facial interpretation.

Objectives

At the end of the course students should

  • understand visual processing from both “bottom-up” (data-oriented) and “top-down” (goal-oriented) perspectives;

  • be able to decompose visual tasks into sequences of image analysis operations, representations, specific algorithms, and inference principles;

  • understand the roles of image transformations and their invariances in pattern recognition and classification;

  • be able to describe and contrast techniques for extracting and representing features, edges, shapes, and textures;

  • be able to analyse the robustness, brittleness, generalisability, and performance of different approaches in computer vision;

  • understand some of the major practical application problems, such as face interpretation, character recognition, and image retrieval.

Recommended reading

* Forsyth, D.A. & Ponce, J. (2003). Computer vision: a modern approach. Prentice Hall.

Shapiro, L. & Stockman, G. (2001). Computer vision. Prentice Hall.