Computer Laboratory

Course pages 2013–14

Information Theory and Coding

Principal lecturer: Prof John Daugman
Taken by: Part II
Past exam questions
Information for supervisors (contact lecturer for access permission)

No. of lectures: 12
Suggested hours of supervisions: 0 (two example classes will be provided)
Prerequisite courses: Probability, Mathematical Methods for Computer Science


The aims of this course are to introduce the principles and applications of information theory. The course will study how information is measured in terms of probability and entropy, and the relationships among conditional and joint entropies; how these are used to calculate the capacity of a communication channel, with and without noise; coding schemes, including error correcting codes; how discrete channels and measures of information generalize to their continuous forms; the Fourier perspective; and extensions to wavelets, complexity, compression, and efficient coding of audio-visual information.


  • Foundations: probability, uncertainty, information. How concepts of randomness, redundancy, compressibility, noise, bandwidth, and uncertainty are related to information. Ensembles, random variables, marginal and conditional probabilities. How the metrics of information are grounded in the rules of probability.

  • Entropies defined, and why they are measures of information. Marginal entropy, joint entropy, conditional entropy, and the Chain Rule for entropy. Mutual information between ensembles of random variables. Why entropy is the fundamental measure of information content.

  • Source coding theorem; prefix, variable-, and fixed-length codes. Symbol codes. The binary symmetric channel. Capacity of a noiseless discrete channel. Error correcting codes.

  • Channel types, properties, noise, and channel capacity. Perfect communication through a noisy channel. Capacity of a discrete channel as the maximum of its mutual information over all possible input distributions.

  • Continuous information; density; noisy channel coding theorem. Extensions of the discrete entropies and measures to the continuous case. Signal-to-noise ratio; power spectral density. Gaussian channels. Relative significance of bandwidth and noise limitations. The Shannon rate limit and efficiency for noisy continuous channels.

  • Fourier series, convergence, orthogonal representation. Generalized signal expansions in vector spaces. Independence. Representation of continuous or discrete data by complex exponentials. The Fourier basis. Fourier series for periodic functions. Examples.

  • Useful Fourier theorems; transform pairs. Sampling; aliasing. The Fourier transform for non-periodic functions. Properties of the transform, and examples. Nyquist’s Sampling Theorem derived, and the cause (and removal) of aliasing.

  • Discrete Fourier transform. Fast Fourier Transform algorithms. Efficient algorithms for computing Fourier transforms of discrete data. Computational complexity. Filters, correlation, modulation, demodulation, coherence.

  • The quantized degrees-of-freedom in a continuous signal. Why a continuous signal of finite bandwidth and duration has a fixed number of degrees-of-freedom. Diverse illustrations of the principle that information, even in such a signal, comes in quantized, countable, packets.

  • Gabor-Heisenberg-Weyl uncertainty relation. Optimal “Logons”. Unification of the time-domain and the frequency-domain as endpoints of a continuous deformation. The Uncertainty Principle and its optimal solution by Gabor’s expansion basis of “logons”. Multi-resolution wavelet codes. Extension to images, for analysis and compression.

  • Kolmogorov complexity. Minimal description length. Definition of the algorithmic complexity of a data sequence, and its relation to the entropy of the distribution from which the data was drawn. Fractals. Minimal description length, and why this measure of complexity is not computable.


At the end of the course students should be able to

  • calculate the information content of a random variable from its probability distribution;

  • relate the joint, conditional, and marginal entropies of variables in terms of their coupled probabilities;

  • define channel capacities and properties using Shannon’s Theorems;

  • construct efficient codes for data on imperfect communication channels;

  • generalize the discrete concepts to continuous signals on continuous channels;

  • understand Fourier Transforms and the main ideas of efficient algorithms for them;

  • describe the information resolution, compression, and efficient coding properties of wavelets.

Recommended reading

* Cover, T.M. & Thomas, J.A. (2006). Elements of information theory. New York: Wiley.