Course material 2010–11
Computer Vision
Lecturer: Dr C.P. Town
No. of lectures: 12
Prerequisite courses: Probability, Mathematical Methods for Computer Science; Artificial Intelligence I (recommended)
Aims
The aims of this course are to introduce the principles, models and applications of computer vision. The course will cover: image formation, structure, and coding; edge and feature detection; texture, colour, stereo, and motion; wavelet methods for visual coding and analysis; interpretation of surfaces, solids, and shapes; appearance modelling; pattern recognition and classification; visual inference and learning. Several of these issues will be illustrated using the examples of optical character recognition, image retrieval, and face recognition.
Lectures
- Goals of computer vision; why they are so difficult.
How images are formed, and the ill-posed problem of making 3D
inferences from them about objects and their properties.
- Image sensing, pixel arrays, cameras.
Elementary operations on image arrays; coding and information
measures. Sampling and aliasing.
- Mathematical operators for extracting image structure.
Finite differences and directional derivatives. Filters; convolution;
correlation. Fourier and wavelet transforms.
- Edge detection operators; the information revealed by edges.
The Laplacian operator and its zero-crossings. Logan’s theorem.
- Multi-scale feature detection and matching.
Gaussian pyramids and SIFT (scale-invariant feature transform).
Active contours; energy-minimising snakes. 2D wavelets as visual
primitives.
- Texture, colour, stereo, and motion descriptors.
Disambiguation and the achievement of invariances. Image and motion
segmentation.
- Lambertian and specular surfaces.
Reflectance maps. Image formation geometry. Discounting the illuminant
when inferring 3D structure and surface properties.
- Shape representation.
Inferring 3D shape from shading; surface geometry. Boundary
descriptors; active appearance models; codons; superquadrics and the
“2.5-Dimensional” sketch.
- Perceptual psychology and visual cognition.
Vision as model-building and graphics in the brain. Learning to
see. Visual illusions, and what they may imply about how vision works.
- Bayesian inference in vision; knowledge-driven interpretations.
Classifiers and pattern recognition. Probabilistic methods in vision.
- Applications of machine learning in computer vision.
Appearance and model-based representations. Discriminative and
generative methods. Optical character recognition. Content-based image
retrieval.
- Approaches to face detection, face recognition, and facial interpretation.
Objectives
At the end of the course, students should
- understand visual processing from both “bottom-up”
(data-oriented) and “top-down” (goal-oriented) perspectives;
- be able to decompose visual tasks into sequences of image analysis
operations, representations, specific algorithms, and inference principles;
- understand the roles of image transformations and their
invariances in pattern recognition and classification;
- be able to describe and contrast techniques for extracting and
representing features, edges, shapes, and textures;
- be able to analyse the robustness, brittleness, generalisability,
and performance of different approaches in computer vision;
- understand some of the major practical application problems,
such as face interpretation, character recognition, and image
retrieval.
Recommended reading
* Forsyth, D.A. & Ponce, J. (2003). Computer vision: a modern approach. Prentice Hall.
Shapiro, L. & Stockman, G. (2001). Computer vision. Prentice Hall.