Computer Laboratory

Course pages 2016–17


Paper 1: Numerical Methods

Lecturer: Dr D.J. Greaves

No. of lectures: 11

Suggested hours of supervisions: 3 to 4

This course is useful for the Part II courses Advanced Graphics and Digital Signal Processing.


This course provides:

  1. an introduction to (IEEE) floating-point data representation and arithmetic;
  2. illustrations of how naïve implementations of obvious mathematics can go badly wrong;
  3. a study of several standard numerical processes, algorithms and techniques.

An overall implicit aim is to encourage caution when using any floating-point value produced by a computer program. A variety of code fragments are provided and most are available in multiple languages. Students are strongly encouraged to experiment with these fragments.

(Changes from last year: One fewer topics will be lectured. A full-text Learners’ Guide PDF will be available as well as slide hardcopies.)


  • Integer and floating-point representation and arithmetic. Signed and unsigned integers and fixed-point; arithmetic, saturating arithmetic. Long division and multiplication. Floating point I/O in ASCII. What numbers are exactly representable in bases 2 and 10. Accuracy in terms of significant figures.

  • IEEE floating-point arithmetic. Floating-point arithmetic, and the IEEE requirements. IEEE 754/854 floating point (32 and 64 bit); zeros, infinities, NaN. Overflow, underflow, progressive loss of significance. Rounding modes. Floating-point arithmetic is non-associative, and mathematical equivalences fail. Nonsensical results, e.g. sin(1e40). Difficulty in obtaining IEEE-quality in libraries.

  • How floating-point computations diverge from real-number calculations. Absolute Error, Relative Error, Machine epsilon, Unit in Last Place (ulp). Finite computation: solving a quadratic. Summing a finite series. Rounding (round-off) and truncation (discretisation) error. Numerical differentiation; determining a good step size.

  • Iteration and when to stop. Unbounded computation may produce unbounded errors. Solving equations by iteration and comparison to terminate it. Newton’s method. Order of convergence. Limit cycles. Why summing a Taylor series is problematic. Condition number, partial derivatives, backwards stability and chaos.

  • Matrix Form Simultaneous Equations. Gaussian Elimination. Stability and pivoting improvements. Positive-definite. L/U and Cholesky decompositions. Doolittle/Crout method.

  • Efficient and Approximate Implementations A subset of the following topics will we be lectured/examinable as announced on the website: Chebychev orthogonal basis (for power series economisation) Practical implementation of scientific (trig/log) functions. Splines. Comparison of Taylor, Chebychev and Cordic.

  • Finite-Difference Time-Domain Simulation. Numerical simulation of SHM, charge/discharge, waves and other various examples (such as a Moniac Simulator).

  • Fluid Flow Analysis. Using a matrix representation of a linear flow circuit (water, electricity etc) to find steady state. Extensions for non-linear and time-varying branches (as used by SPICE).

  • Adaptive Methods and Custom Encodings A subset of the following topics will we be lectured/examinable as announced on the website: Arbitrary precision floating point, adaptive floating point, interval arithmetic. Rounding errors in PCM. Logarithmic and other non-linear representations. Their use in a-posteriori decision algorithms. Eg for rapid multiplication in Viterbi/Bayes and specialist ALUs (e.g. for low-density parity). Simulated Annealing. Non-linear spatial quantisation.


At the end of the course students should

  • be able to convert simple decimal numbers to and from IEEE floating-point format, and to perform IEEE arithmetic on them;

  • be able to identify problems with floating-point implementations of simple mathematical problems and know when incorrect solution is likely;

  • be familiar with several key algorithms from the history of numerical analysis;

  • decide how and when computation energy should be traded for accuracy;

  • know to use a professionally-written package whenever possible (and still to treat claims of accuracy with suspicion).

Recommended reading

Overton, M.L. (2001). Numerical computing with IEEE floating point arithmetic. SIAM.

Further reading - goes far beyond the course

Goldberg, D. (1991). What every computer scientist should know about floating-point arithmetic. ACM Computing Surveys, vol. 23, pp. 5-48.