Course pages 2015–16

# Numerical Methods

**Principal lecturer:** Dr David Greaves**Taken by:** Part IA CST, Part IA NST, Part I PBS**Past exam questions**

No. of lectures: 11

Suggested hours of supervisions: 3 to 4

This course is useful for the Part II courses Advanced Graphics and Digital Signal Processing.

## Aims

This course provides:

- an introduction to (IEEE) floating-point data representation and arithmetic;
- illustrations of how naïve implementations of obvious mathematics can go badly wrong;
- a study of several standard numerical algorithms.

An overall implicit aim is to encourage caution when using any floating-point value produced by a computer program. A variety of code fragments are provided and most are available in multiple languages. Students are strongly encouraged to experiment with these fragments.

## Lectures

**Integer and floating-point representation and arithmetic.**Signed and unsigned integers and fixed-point; arithmetic, saturating arithmetic. Long division and multiplication. Floating point I/O in ASCII. What numbers are exactly representable in bases 2 and 10. Accuracy in terms of significant figures.**IEEE floating-point arithmetic.**Floating-point arithmetic, and the IEEE requirements. IEEE 754/854 floating point (32 and 64 bit); zeros, infinities, NaN. Overflow, underflow, progressive loss of significance. Rounding modes. Floating-point arithmetic is non-associative, and mathematical equivalences fail. Nonsensical results, e.g.`sin(1e40)`. Difficulty in obtaining IEEE-quality in libraries.**How floating-point computations diverge from real-number calculations.**Absolute Error, Relative Error, Machine epsilon, Unit in Last Place (ulp). Finite computation: solving a quadratic. Summing a finite series. Rounding (round-off) and truncation (discretisation) error. Numerical differentiation; determining a good step size.**Iteration and when to stop.**Unbounded computation may produce unbounded errors. Solving equations by iteration and comparison to terminate it. Newton’s method. Order of convergence. Limit cycles. Why summing a Taylor series is problematic. Condition number, partial derivatives, backwards stability and chaos.**Matrix Form Simultaneous Equations.**Gaussian Elimination. Stability and pivoting improvements. Positive-definite. L/U and Cholesky decompositions. Doolittle/Crout method.**Efficient Implementation**Chebychev orthogonal basis (for power series economisation) Practical implementation of scientific (trig/log) functions. Comparison of Taylor, Chebychev and Cordic.**Finite-Difference Time-Domain Simulation.**Numerical simulation of SHM, charge/discharge, waves and other various examples (such as a Moniac Simulator).**Circuit/Flow Analysis.**Using a matrix representation of a linear flow circuit (water, electricity etc) to find steady state. Extensions for non-linear and time-varying branches as used by SPICE.**Adaptive Methods and Custom Encodings**A subset of the following topics will we be lectured. Details in handout. Arbitrary precision floating point, adaptive floating point, interval arithmetic. Rounding errors in PCM. Logarithmic and other non-linear representations. Their use in a-posteriori decision algorithms. Eg for rapid multiplication in Viterbi/Bayes and specialist ALUs (e.g. for low-density parity). Simulated Annealing. Non-linear spatial quantisation.

## Objectives

At the end of the course students should

- be able to convert simple decimal numbers to and from IEEE floating-point format, and to perform IEEE arithmetic on them;
- be able to identify problems with floating-point implementations of simple mathematical problems and know when incorrect solution is likely;
- be familiar with several key algorithms from the history of numerical analysis;
- decide how and when computation energy should be traded for accuracy;
- know to use a professionally-written package whenever possible (and still to treat claims of accuracy with suspicion).

## Recommended reading

Overton, M.L. (2001). *Numerical computing with IEEE floating point arithmetic*. SIAM.

Further reading - goes far beyond the course

Goldberg, D. (1991). *What every computer scientist should know about floating-point arithmetic*. ACM Computing Surveys, vol. 23, pp. 5-48.