Up: Lent Term 2003: Part
Comparative Architectures
Lecturer: Dr I.A. Pratt
(ian.pratt@cl.cam.ac.uk)
No. of lectures: 16
Prerequisite course: Computer Design
Aims
This course examines the architecture and implementation of
state-of-the-art microprocessors and memory systems. It begins by
examining the different design goals that microprocessors are
developed for, and discusses the difficulties associated with making
objective performance comparisons.
Features of a number of popular Instruction Set Architectures are
compared and contrasted, with particular attention to their effects on
implementation and hence performance. The second half of the course
addresses micro-architecture implementation issues, examining how
Instruction Level Parallelism can be exploited through deep pipelining
and super-scalar techniques such as out-of-order execution. Issues in
memory hierarchy design are explored, along with their impact on code
optimisation. Finally, multi-processor architectures are
examined.
Lectures
- Comparing architectures.
The technology curve. System versus chip performance. Speed:
MIPS, MHz, FLOPS, SPEC. Power. Price. Compatibility.
Features. [2 lectures]
- Instruction set architecture.
Amdahl's law and RISC principles. Byte sex. Word size. Stacks,
Accumulators and GPRs. Load-store versus
register-memory. Addressing modes. Code density. Sub-word and
un-aligned loads and stores. [3 lectures]
- Advanced pipelining.
The CPU performance equation. Structural hazards: long latency
instructions. Data hazards: result forwarding and delayed loads.
Control hazards: optimising branches, and avoiding branches.
Exceptions. [3 lectures]
- Super-scalar techniques. Instruction Level Parallelism
(ILP). Statically scheduled and dynamic out-of-order execution.
Register renaming. [2 lectures]
- Beyond super-scalar. The limits of ILP. Alternative
architectures: VLIW, SMT, SCMP. [2 lectures]
- Memory hierarchy. Cache
configurations. Latency versus bandwidth. Re-ordering and
coherence. Programming for caches.
[2 lectures]
- Multi-processor systems.
Multi-processor cache coherency. Message-passing clusters. Interconnects.
Weak memory ordering.
[2 lectures]
Objectives
At the end of the course students should
- appreciate the balance between implementation and architecture
in determining performance
- understand how quantitative analysis led to the convergence
towards RISC-like designs
- comprehend the issues associated with deeply-pipelined designs
- understand the operation of processors supporting out-of-order
execution
- be able to describe the difficulties associated with building
wide-issue machines, and have a basic understanding of the
alternatives to Instruction Level Parallelism
- appreciate the tradeoffs made by architects in the design of
memory hierarchies, and be able to optimise algorithms for memory
hierarchy performance
Recommended books
Hennessy, J. & Patterson, D. (2002). Computer Architecture: a
Quantitative Approach. Morgan Kaufmann (3rd ed.) ISBN 1-55860-724-2.
(2nd edition, 1996, is also good.)
Further reading and reference:
Johnson, M. (1991). Superscalar Microprocessor Design.
Prentice-Hall.
Markstein, P. (2000). IA-64 and Elementary Functions.
Prentice-Hall.
Tanenbaum, A.S. (1990). Structured Computer Organization.
Prentice-Hall (2nd ed.).
Van Someren, A. & Atack, C. (1994). The ARM RISC Chip: a
Programmer's Guide. Addison-Wesley.
Sites, R.L. (ed.) (1992). Alpha Architecture Reference Manual.
Digital Press.
Kane, G. & Heinrich, J. (1992). MIPS RISC
Architecture. Prentice-Hall.
The CPU Info Center http://infopad.eecs.berkeley.edu/CIC/tech/
Christine Northeast
Wed Sep 4 14:43:05 BST 2002