Course pages 2016–17
Information Theory
Lecturer: Professor J.G. Daugman
No. of lectures: 12
In lieu of supervisions, exercises will be set and reviewed in two Examples Classes.
Prerequisite courses: Mathematical Methods for Computer Science
Aims
The aims of this course are to introduce the principles and applications of information theory. The course covers how information is measured in terms of probability and various entropies, and how these are used to calculate the capacity of communication channels, continuous or discrete, with or without noise. Coding schemes including error correcting codes are studied along with data compression, spectral analysis, and efficient coding using wavelets. Applications of information theory are also reviewed, from bioinformatics to pattern recognition.
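To make the opening idea concrete, the following is a minimal Python sketch of how information is measured from a probability distribution, as described above; the function name and example distributions are illustrative only and not part of the course material.

```python
import math

def entropy(p):
    """Shannon entropy H(X) = -sum p(x) * log2 p(x), in bits.

    Terms with zero probability contribute nothing, by the
    convention 0 * log 0 = 0.
    """
    return -sum(px * math.log2(px) for px in p if px > 0)

# A fair coin carries 1 bit per toss; a biased coin carries less,
# because its outcome is more predictable.
print(entropy([0.5, 0.5]))   # 1.0 bit
print(entropy([0.9, 0.1]))   # about 0.469 bits
```

The second result illustrates the link between predictability and information: the more skewed the distribution, the less one learns from each observation.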
Lectures
- Foundations: probability, uncertainty, information.
How concepts of randomness, redundancy, compressibility, noise,
bandwidth, and uncertainty are related to information. Ensembles,
random variables, marginal and conditional probabilities. How the
metrics of information are grounded in the rules of probability.
- Entropies defined, and why they are measures of information.
Marginal entropy, joint entropy, conditional entropy,
and the Chain Rule for entropy. Mutual information between ensembles
of random variables. Why entropy is the fundamental measure of
information content.
- Source coding theorem; prefix, variable-, and fixed-length codes.
Markov sources. Entropy rate of a Markov process. Symbol codes.
Huffman codes and the prefix property. Binary symmetric channels.
Capacity of a noiseless discrete channel.
- Discrete channel properties, noise, and channel capacity.
Perfect communication through a noisy channel: error-correcting codes.
Capacity of a discrete channel as the maximum of its mutual information
over all possible input distributions.
- Spectral properties of continuous-time signals and channels.
Signals represented as combinations of complex exponential eigenfunctions;
channels represented as spectral filters that add noise. Applying
Fourier analysis to signal communication. Continuous versus discrete, and
periodic versus aperiodic signals and their transforms. Duality properties.
- Continuous information; density; noisy channel coding theorem.
Extensions of discrete entropies and measures to the continuous case.
Signal-to-noise ratio; power spectral density. Gaussian channels.
Relative significance of bandwidth and noise limitations.
The Shannon rate limit for noisy continuous channels.
- Signal coding and transmission schemes using Fourier theorems.
Nyquist Sampling Theorem. Aliasing and its prevention.
Modulation and shift theorems; multiple carriers; frequency and
phase modulation codes; ensembles. Filters, coherence, demodulation;
noise removal by correlation.
- The quantized degrees-of-freedom in a continuous signal.
Why a continuous signal of finite bandwidth and duration has a fixed
number of degrees-of-freedom. Diverse illustrations of the principle
that information, even in such a signal, comes in quantized, countable,
packets.
- Gabor-Heisenberg-Weyl uncertainty relation. Optimal “Logons”.
Unification of the time-domain and the frequency-domain as endpoints
of a continuous deformation. The Uncertainty Principle and its optimal
solution by Gabor’s expansion basis of “logons”. Multi-resolution
wavelet codes. Extension to images, for analysis and compression.
- Data compression codes and protocols.
Run-length coding; dictionary methods on strings; vector quantisation;
JPEG and JP2K image compression; orthogonal subspace projections;
predictive coding; the Laplacian pyramid; and wavelet scalar
quantisation.
- Kolmogorov complexity. Minimal description length.
Definition of the algorithmic complexity of a data sequence, and
its relation to the entropy of the distribution from which the data
was drawn. Fractals. Minimal description length, and why this measure
of complexity is not computable.
- Applications of information theory in other sciences.
Use of information metrics and analysis in: genomics; neuroscience;
astrophysics; noisy signal classification; and pattern recognition
including biometrics.
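Several of the lecture topics above can be illustrated together with one small worked example: the capacity of the binary symmetric channel, which is C = 1 - H2(p) bits per channel use, achieved by a uniform input distribution. The Python sketch below assumes nothing beyond the standard library; function names are illustrative.

```python
import math

def h2(p):
    """Binary entropy function H2(p), in bits."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

def bsc_capacity(p):
    """Capacity of a binary symmetric channel with crossover
    probability p: C = 1 - H2(p), the maximum of the mutual
    information over all input distributions."""
    return 1.0 - h2(p)

print(bsc_capacity(0.0))   # 1.0: a noiseless channel carries 1 bit per use
print(bsc_capacity(0.11))  # about 0.5 bits per use
print(bsc_capacity(0.5))   # 0.0: output is independent of input
```

The three cases trace the span of the noisy channel coding theorem: capacity falls from one bit per use to zero as the noise grows from none to total.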
Objectives
At the end of the course students should be able to:

- calculate the information content of a random variable
from its probability distribution;
- express the joint, conditional, and marginal entropies
of variables in terms of their coupled probabilities;
- define channel capacities and properties using Shannon’s Theorems;
- construct efficient codes for data on imperfect communication channels;
- generalize the discrete concepts to continuous signals on continuous
channels;
- understand encoding and communication schemes in terms of
the spectral properties of signals and channels;
- describe compression schemes, and efficient coding using wavelets
and other representations for data.
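The objective of generalising discrete concepts to continuous channels culminates in the Shannon rate limit for a band-limited Gaussian channel, C = B log2(1 + S/N) bits per second. A minimal sketch, with an assumed telephone-grade channel of 3 kHz bandwidth and 30 dB signal-to-noise ratio chosen purely for illustration:

```python
import math

def shannon_capacity(bandwidth_hz, snr_linear):
    """Shannon rate limit for a band-limited Gaussian channel:
    C = B * log2(1 + S/N), in bits per second.
    snr_linear is the signal-to-noise power ratio (not dB)."""
    return bandwidth_hz * math.log2(1 + snr_linear)

snr = 10 ** (30 / 10)               # 30 dB converts to a linear ratio of 1000
print(shannon_capacity(3000, snr))  # about 29.9 kbit/s
```

Note the relative significance of the two limitations mentioned in the lectures: capacity grows linearly with bandwidth but only logarithmically with signal-to-noise ratio.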
Recommended reading
* Cover, T.M. & Thomas, J.A. (2006). Elements of Information Theory. New York: Wiley.