This course is useful for the Part II courses Advanced Graphics and Digital Signal Processing.

Aims

This course has two aims: firstly to provide an introduction to
(IEEE) floating-point data representation and arithmetic; and secondly
to show, mainly by fun examples backed up by simple analysis, how naïve
implementations of obvious mathematics can go badly wrong.

Lectures

IEEE floating-point representation and arithmetic (32 and 64 bits).
Overflow, underflow, progressive loss of significance.
Rounding modes.

How floating-point computations diverge from real-number calculations.
Absolute error, relative error, machine epsilon.
Solving a quadratic.

Iteration and when to stop.
Why summing a Taylor series is problematic (loss of all precision,
range reduction, non-examinable hint at economisation).

Ill-conditioned or chaotic problems. Testing. Packages.
Non-examinable: exact real arithmetic.
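
As an illustration of the "Solving a quadratic" topic listed above, the following is a minimal sketch in Python (a language chosen here for illustration; the syllabus does not prescribe one). The function names roots_naive and roots_stable are hypothetical, and the sketch assumes a is non-zero, the roots are real, and b and c are not both zero. It contrasts the textbook formula with a rearrangement that avoids subtracting two nearly equal numbers.

```python
import math

def roots_naive(a, b, c):
    """Textbook quadratic formula. When b*b is much larger than 4*a*c,
    one of the two roots is computed by subtracting nearly equal values
    and loses most of its significant digits."""
    d = math.sqrt(b * b - 4 * a * c)
    return (-b + d) / (2 * a), (-b - d) / (2 * a)

def roots_stable(a, b, c):
    """Rearranged form (an illustrative alternative, not necessarily the
    one taught in the course). b and copysign(d, b) share a sign, so their
    sum involves no cancellation; the other root then follows from the
    product of the roots being c / a."""
    d = math.sqrt(b * b - 4 * a * c)
    q = -(b + math.copysign(d, b)) / 2
    return c / q, q / a

if __name__ == "__main__":
    # x^2 + 1e8 x + 1 = 0 has roots close to -1e-8 and -1e8.
    print(roots_naive(1.0, 1e8, 1.0))
    print(roots_stable(1.0, 1e8, 1.0))
```

With 64-bit floats, the naive version returns a small root that is off by roughly a quarter of its true value for this input, while the rearranged version recovers it to full precision.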

Objectives

At the end of the course students should

be able to convert simple decimal numbers to and from IEEE
floating-point format, and to perform simple arithmetic in that format

be able to identify problems with floating-point implementations of
simple mathematical problems

know when a problem is likely to yield incorrect solutions no
matter how it is processed numerically

know to use a professional package whenever possible
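
As a possible self-check for the first objective above (converting to and from IEEE format), the following Python sketch splits a number into the three fields of the 32-bit format and reassembles it. The helper names decode_binary32 and encode_binary32 are hypothetical; the sketch relies only on the standard struct module, whose "f" format code is the IEEE 754 binary32 representation.

```python
import struct

def decode_binary32(x):
    """Round x to IEEE 754 binary32 and return its
    (sign, biased exponent, fraction) bit fields."""
    (bits,) = struct.unpack(">I", struct.pack(">f", x))
    sign = bits >> 31                # 1 bit
    exponent = (bits >> 23) & 0xFF   # 8 bits, bias 127
    fraction = bits & 0x7FFFFF       # 23 bits, hidden leading 1 for normal numbers
    return sign, exponent, fraction

def encode_binary32(sign, exponent, fraction):
    """Reassemble the three fields into the float they represent."""
    bits = (sign << 31) | (exponent << 23) | fraction
    (x,) = struct.unpack(">f", struct.pack(">I", bits))
    return x

if __name__ == "__main__":
    # -6.5 = -1.625 * 2**2, so the biased exponent is 2 + 127 = 129
    # and the fraction bits are 101 followed by twenty zeros.
    s, e, f = decode_binary32(-6.5)
    print(s, e, format(f, "023b"))
    print(encode_binary32(s, e, f))
```

Working an example by hand first and then comparing against the printed fields gives a quick way to check a manual conversion.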