This course is useful for the Part II courses Advanced Graphics and Digital Signal Processing.
Aims
This course has two aims: firstly to provide an introduction to
(IEEE) floating-point data representation and arithmetic; and secondly
to show, mainly by fun examples backed up by simple analysis, how naïve
implementations of obvious mathematics can go badly wrong.
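As an illustration of the second aim, the sketch below sums the Taylor series for exp(x) term by term. It is not taken from the course materials; the function name `exp_taylor` and the choice of 200 terms are assumptions made for this example. For x = -30 the terms grow to roughly 10^12 before cancelling, so the rounding errors in the large terms swamp the tiny true answer.

```python
import math

def exp_taylor(x, terms=200):
    """Sum 1 + x + x^2/2! + ... naively (hypothetical helper for illustration)."""
    total, term = 1.0, 1.0
    for n in range(1, terms):
        term *= x / n      # term is now x^n / n!
        total += term
    return total

x = -30.0
naive = exp_taylor(x)            # alternating huge terms: massive cancellation
better = 1.0 / exp_taylor(-x)    # all terms positive: no cancellation
print("naive :", naive)
print("better:", better)
print("math  :", math.exp(x))
```

Summing the series for +30 and taking the reciprocal avoids the cancellation, which is why the second result agrees closely with `math.exp` while the first typically has no correct digits.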
Lectures
IEEE floating-point representation and arithmetic (32 and 64 bits).
Overflow, underflow, progressive loss of significance.
Rounding modes.
How floating-point computations diverge from real-number calculations.
Absolute error, relative error, machine epsilon.
Solving a quadratic.
Iteration and when to stop.
Why summing a Taylor series is problematic (loss of all precision,
range reduction, non-examinable hint at economisation).
Ill-conditioned or chaotic problems. Testing. Packages.
Non-examinable: exact real arithmetic.
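The "Solving a quadratic" topic above can be illustrated with a minimal sketch, assuming real distinct roots and a nonzero leading coefficient. The function names `roots_naive` and `roots_stable` are hypothetical, and the rearrangement shown is one standard way to avoid cancellation, not necessarily the one presented in lectures.

```python
import math

def roots_naive(a, b, c):
    """Textbook formula; subtracts nearly equal numbers when b*b >> 4*a*c."""
    d = math.sqrt(b * b - 4.0 * a * c)
    return (-b + d) / (2.0 * a), (-b - d) / (2.0 * a)

def roots_stable(a, b, c):
    """Adds quantities of the same sign, then uses x1 * x2 = c / a.

    Assumes real roots and q != 0 (that is, not b == 0 and c == 0 together).
    """
    d = math.sqrt(b * b - 4.0 * a * c)
    q = -0.5 * (b + math.copysign(d, b))
    return q / a, c / q

# x^2 - 1e8 x + 1 = 0 has roots close to 1e8 and 1e-8.
print(roots_naive(1.0, -1e8, 1.0))
print(roots_stable(1.0, -1e8, 1.0))
```

With these coefficients the naive formula gets the small root wrong by about 25%, because -b and the square root agree in almost all of their digits, whereas the rearranged version recovers it to nearly full precision.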
Objectives
At the end of the course students should:
be able to convert simple decimal numbers to and from IEEE
floating-point format, and to perform simple arithmetic;
be able to identify problems with floating-point implementations of
simple mathematical problems;
know when a problem is likely to yield incorrect solutions no
matter how it is processed numerically;
know to use a professional package whenever possible.
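For the first objective, hand conversions can be checked with a short sketch like the one below. It assumes the 32-bit IEEE format (1 sign bit, 8 exponent bits with bias 127, 23 fraction bits) and uses Python's standard `struct` module; the helper names `float32_fields` and `float32_from_fields` are hypothetical.

```python
import struct

def float32_fields(x):
    """Split x, rounded to IEEE single precision, into (sign, exponent, fraction)."""
    (bits,) = struct.unpack(">I", struct.pack(">f", x))
    sign = bits >> 31
    exponent = (bits >> 23) & 0xFF   # biased exponent, bias 127
    fraction = bits & 0x7FFFFF       # 23 stored fraction bits
    return sign, exponent, fraction

def float32_from_fields(sign, exponent, fraction):
    """Reassemble the three fields into the value they encode."""
    bits = (sign << 31) | (exponent << 23) | fraction
    (x,) = struct.unpack(">f", struct.pack(">I", bits))
    return x

# -6.25 = -110.01 (binary) = -1.1001 x 2^2, so the biased exponent is 2 + 127.
sign, exponent, fraction = float32_fields(-6.25)
print(sign, exponent, format(fraction, "023b"))
print(float32_from_fields(sign, exponent, fraction))
```

For normalised numbers the encoded value is (-1)^sign × 1.fraction × 2^(exponent - 127); zeros, denormals, infinities and NaNs use the reserved exponent values 0 and 255 and are not decoded specially here.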