Computer Science Syllabus - Comparative Architectures

Next: Computer Vision Up: Lent Term 2006: Part Previous: Bioinformatics   Contents

Comparative Architectures

Lecturer: Dr D.J. Greaves

No. of lectures: 16

Prerequisite course: Computer Design


This course examines the architecture and implementation of state-of-the-art microprocessors and memory systems. It begins with the different design goals for which microprocessors are developed, and discusses the difficulties associated with making objective performance comparisons.

Features of a number of popular Instruction Set Architectures are compared and contrasted, with particular attention to their effects on implementation and hence on performance. The second half of the course addresses micro-architecture implementation issues, examining how Instruction Level Parallelism can be exploited through deep pipelining and super-scalar techniques such as out-of-order execution. Issues in memory hierarchy design, and their impact on code optimisation, are then explored. Finally, multi-processor architectures are examined.


  • Comparing architectures. The technology curve. System versus chip performance. Speed: MIPS, MHz, FLOPS, SPEC. Power. Price. Compatibility. Features. [2 lectures]

  • Instruction set architecture. Amdahl's law and RISC principles. Byte order (endianness). Word size. Stacks, accumulators and GPRs. Load-store versus register-memory. Addressing modes. Code density. Sub-word and unaligned loads and stores. [3 lectures]

  • Advanced pipelining. The CPU performance equation. Structural hazards: long latency instructions. Data hazards: result forwarding and delayed loads. Control hazards: optimising branches, and avoiding branches. Exceptions. [3 lectures]

  • Super-scalar techniques. Instruction Level Parallelism (ILP). Statically scheduled and dynamic out-of-order execution. Register renaming. [2 lectures]

  • Beyond super-scalar. The limits of ILP. Alternative architectures: VLIW, SMT, SCMP. [2 lectures]

  • Memory hierarchy. Cache configurations. Latency versus bandwidth. Re-ordering and coherence. Programming for caches. [2 lectures]

  • Multi-processor systems. Multi-processor cache coherency. Message passing clusters. Interconnects. Weak memory ordering. [2 lectures]
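The benchmarking issues in the first topic can be illustrated with a small sketch. SPEC summarises a machine's results as the geometric mean of its per-benchmark ratios against a reference machine, so the summary does not depend on which machine is chosen as baseline; the benchmark names and times below are hypothetical.

```python
# Illustrative SPEC-style summary (hypothetical benchmarks and times):
# the score is the geometric mean of reference-time / measured-time ratios.
from math import prod

ref_times = {"gcc": 100.0, "mcf": 200.0, "bzip2": 50.0}  # reference machine (s)
sys_times = {"gcc": 40.0, "mcf": 160.0, "bzip2": 20.0}   # machine under test (s)

ratios = [ref_times[b] / sys_times[b] for b in ref_times]
geomean = prod(ratios) ** (1.0 / len(ratios))
print(f"SPEC-style score: {geomean:.2f}")
```

Unlike an arithmetic mean of raw times, the geometric mean of ratios gives the same relative ranking of two machines whichever of them is treated as the reference.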
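Amdahl's law, from the instruction set architecture topic, bounds the overall speedup obtainable by accelerating only part of a program; it underlies the RISC principle of making the common case fast. A minimal sketch:

```python
# Amdahl's law: overall speedup when a fraction f of execution time
# is sped up by a factor s. Even as s -> infinity, the overall
# speedup is bounded by 1 / (1 - f).
def amdahl_speedup(f, s):
    return 1.0 / ((1.0 - f) + f / s)

# Speeding up 80% of a program by 10x yields only ~3.57x overall.
print(amdahl_speedup(0.8, 10))
```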
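The CPU performance equation from the advanced pipelining topic states that execution time is the product of instruction count, cycles per instruction (CPI) and clock period. The toy numbers below are assumptions, chosen to show that a faster clock does not help if CPI rises, for example through hazards in a deeper pipeline:

```python
# CPU performance equation: time = instructions * CPI / clock_rate.
def cpu_time(instructions, cpi, clock_hz):
    return instructions * cpi / clock_hz

base = cpu_time(1e9, 1.0, 1e9)      # 1 GHz, CPI 1.0 -> 1.0 s
deeper = cpu_time(1e9, 1.4, 1.2e9)  # 1.2 GHz, but CPI 1.4 from hazards
print(base, deeper)
```

Here the nominally faster design is slower overall, which is why the course stresses all three terms of the equation rather than clock rate alone.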
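Register renaming, from the super-scalar topic, can be sketched in a few lines. This is an illustrative model, not any real micro-architecture: each write to an architectural register is mapped to a fresh physical register, which removes WAW and WAR hazards so that only true (RAW) dependences constrain out-of-order issue.

```python
# Minimal register-renaming sketch (illustrative): instructions are
# (dest, src1, src2) triples over architectural registers r0..r7.
def rename(instrs, n_arch=8):
    mapping = {f"r{i}": f"p{i}" for i in range(n_arch)}  # current map
    next_phys = n_arch
    out = []
    for dst, src1, src2 in instrs:
        s1, s2 = mapping[src1], mapping[src2]  # read current mappings
        mapping[dst] = f"p{next_phys}"         # fresh physical register
        next_phys += 1
        out.append((mapping[dst], s1, s2))
    return out

# r1 is written twice; after renaming the writes target different
# physical registers, so the second can issue without waiting.
print(rename([("r1", "r2", "r3"), ("r1", "r4", "r5")]))
```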
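For the memory hierarchy topic, the standard figure of merit is average memory access time (AMAT): hit time plus miss rate times miss penalty. The cycle counts below are illustrative assumptions, not from the syllabus:

```python
# Average memory access time: AMAT = hit_time + miss_rate * miss_penalty.
def amat(hit_time, miss_rate, miss_penalty):
    return hit_time + miss_rate * miss_penalty

# Even a 2% miss rate with a 100-cycle penalty doubles the effective
# access time of a 2-cycle L1 cache: 2 + 0.02 * 100 = 4 cycles.
print(amat(2.0, 0.02, 100.0))  # → 4.0
```

This is why the syllabus pairs cache configuration with "programming for caches": small reductions in miss rate have an outsized effect on effective latency.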


At the end of the course students should

  • appreciate the balance between implementation and architecture in determining performance

  • understand how quantitative analysis led to the convergence towards RISC-like designs

  • comprehend the issues associated with deeply pipelined designs

  • understand the operation of processors supporting out-of-order execution

  • be able to describe the difficulties associated with building wide-issue machines, and have a basic understanding of the alternatives to Instruction Level Parallelism

  • appreciate the tradeoffs made by architects in the design of memory hierarchies, and be able to optimise algorithms for memory hierarchy performance

Recommended reading

Hennessy, J. & Patterson, D. (2002). Computer architecture: a quantitative approach. Morgan Kaufmann (3rd ed.) ISBN 1-55860-724-2. (2nd edition, 1996, is also good.)

Further reading and reference:

Johnson, M. (1991). Superscalar microprocessor design. Prentice-Hall.
Markstein, P. (2000). IA-64 and elementary functions: speed and precision. Prentice-Hall.
Tanenbaum, A.S. (1990). Structured computer organization. Prentice-Hall (3rd ed.).
Van Someren, A. & Atack, C. (1994). The ARM RISC chip: a programmer's guide. Addison-Wesley.
Sites, R.L. (ed.) (1992). Alpha architecture reference manual. Digital Press.
Kane, G. & Heinrich, J. (1992). MIPS RISC architecture. Prentice-Hall.
The CPU Info Center

Christine Northeast
Sun Sep 11 15:46:50 BST 2005