



Next: Group Project Up: Michaelmas Term 2007: Part Previous: Elementary Use of the Contents
Floating-Point Computation
Lecturer: Professor A. Mycroft
No. of lectures: 4
This course is useful for the Part II courses Advanced Graphics and Digital Signal Processing.
Aims
This course has two aims: firstly to provide an introduction to (IEEE) floating-point data representation and arithmetic; and secondly to show, mainly by fun examples backed up by simple analysis, how naïve implementations of obvious mathematics can go badly wrong.
Lectures
- IEEE floating-point representation and arithmetic (32 and 64 bits).
Overflow, underflow, progressive loss of significance.
Rounding modes.
- How floating-point computations diverge from real-number calculations.
Absolute error, relative error, machine epsilon.
Solving a quadratic.
- Iteration and when to stop.
Why summing a Taylor series is problematic (loss of all precision,
range reduction, non-examinable hint at economisation).
- Ill-conditioned or chaotic problems. Testing. Packages. Non-examinable: exact real arithmetic.
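As an illustration of the "solving a quadratic" topic, the following is a minimal sketch (not taken from the course notes) of how the textbook formula can lose significance when b*b is much larger than 4*a*c. The function names roots_naive and roots_stable and the test coefficients are assumptions chosen for the example; both functions assume real roots and a nonzero a and b.

```python
import math

def roots_naive(a, b, c):
    """Textbook formula: both roots from -b +/- sqrt(b^2 - 4ac)."""
    d = math.sqrt(b * b - 4.0 * a * c)
    return (-b + d) / (2.0 * a), (-b - d) / (2.0 * a)

def roots_stable(a, b, c):
    """Avoid subtracting nearly equal numbers.

    Compute the larger-magnitude root first (b and copysign(d, b) have the
    same sign, so the addition cannot cancel), then recover the other root
    from the product of the roots, x1 * x2 = c / a.
    """
    d = math.sqrt(b * b - 4.0 * a * c)
    q = -0.5 * (b + math.copysign(d, b))
    return q / a, c / q

if __name__ == "__main__":
    # Hypothetical coefficients: the exact roots are close to 1e8 and 1e-8.
    a, b, c = 1.0, -1e8, 1.0
    print("naive :", roots_naive(a, b, c))
    print("stable:", roots_stable(a, b, c))
```

With IEEE double precision, the smaller root from roots_naive is expected to be wrong in its leading digits, while roots_stable should agree with 1e-8 to nearly full precision.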
Objectives
At the end of the course students should
- be able to convert simple decimal numbers to and from IEEE
floating-point format, and to perform simple arithmetic in that format
- be able to identify problems with floating-point implementations of
simple mathematical problems
- know when a problem is likely to yield incorrect solutions no
matter how it is processed numerically
- know to use a professional package whenever possible
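The first objective can be checked by machine. The sketch below, which is an illustration rather than course material, splits a number into the sign, biased exponent and fraction fields of the 32-bit IEEE format and rebuilds the value from them. The helper names float32_fields and float32_value are hypothetical, and float32_value assumes a normalised number (biased exponent strictly between 0 and 255).

```python
import struct

def float32_fields(x):
    """Return (sign, biased exponent, fraction) of x rounded to IEEE single."""
    # Pack as a big-endian 32-bit float, then reread the same bytes as an integer.
    (bits,) = struct.unpack(">I", struct.pack(">f", x))
    sign = bits >> 31                # 1 bit
    exponent = (bits >> 23) & 0xFF   # 8 bits, biased by 127
    fraction = bits & 0x7FFFFF       # 23 bits, hidden leading 1 not stored
    return sign, exponent, fraction

def float32_value(sign, exponent, fraction):
    """Rebuild a normalised single-precision value from its three fields."""
    # Assumption: 0 < exponent < 255 (no zeros, denormals, infinities or NaNs).
    return (-1.0) ** sign * (1.0 + fraction / 2.0 ** 23) * 2.0 ** (exponent - 127)

if __name__ == "__main__":
    for x in (-6.5, 0.1):
        s, e, f = float32_fields(x)
        print(x, "-> sign", s, "exponent", e, "fraction", hex(f),
              "-> value", float32_value(s, e, f))
```

A value such as -6.5 survives the round trip exactly, whereas 0.1 does not, because 0.1 has no finite binary expansion and is rounded when stored.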
Recommended reading
None.



