Course pages 2019–20
Advanced Topics in Computer Architecture
Reading List
Note: None of the Week 1 papers are for assessment. Do not submit an essay on any of these papers.
Week 1: Trends in Computer Architecture
- Clock Rate versus IPC: The End of the Road for Conventional Microarchitectures Agarwal, Hrishikesh, Keckler and Burger, ISCA, June 2000.
[IEEE Xplore]
- Dark Silicon and the End of Multicore Scaling Esmaeilzadeh et al, IEEE Micro, 32:2, May-June 2012.
[IEEE Xplore]
- The Accelerator Wall: Limits to Chip Specialization Fuchs and Wentzlaff, HPCA 2019.
[IEEE Xplore]
Other optional material for week 1
- A New Golden Age for Computer Architecture Hennessy and Patterson, Communications of the ACM, Feb. 2019, 62(2), pp. 48-60 (Turing Lecture)
Week 2: State-of-the-art Processor Design
- BROOM: An Open-Source Out-of-Order Processor With Resilient Low-Voltage Operation in 28nm CMOS
Celio, Chiu, Asanovic, Nikolic and Patterson. Hot Chips 30, 2019.
[IEEE Xplore]
- Inside 6th-Generation Intel Core: New Microarchitecture Code-Named Skylake
Doweck et al, IEEE Micro, vol. 37, 2017
[IEEE Xplore]
- The Celerity Open-Source 511-Core RISC-V Tiered Accelerator Fabric: Fast Architectures and Design Methodologies for Fast Chips
Davidson et al, IEEE Micro, 38(2), March-April, 2018
[IEEE Xplore]
Other optional material for week 2
- Samsung M3 Processor
Rupley, Burgess, Grayson, Zuraski, IEEE Micro, 39(2), March-April, 2019
[IEEE Xplore]
Week 3: Memory system design
- Linearizing Irregular Memory Accesses for Improved Correlated Prefetching
Jain and Lin, MICRO 2013
[ACM Digital Library]
- Best-Offset Hardware Prefetching
Michaud, HPCA 2016
[IEEE Xplore] [Version with figure 6 fixed]
- Minnow: Lightweight Offload Engines for Worklist Management and Worklist-Directed Prefetching
Zhang, Ma, Thomson and Chiou, ASPLOS 2018
[ACM Digital Library]
Other optional material for week 3
- Continuous Runahead: Transparent Hardware Acceleration for Memory Intensive Workloads
Hashemi, Mutlu and Patt, MICRO 2016
[ACM Digital Library]
- Meet the Walkers: Accelerating Index Traversals for In-Memory Databases
Kocberber, Grot, Picorel, Falsafi, Lim and Ranganathan
[ACM Digital Library]
Week 4: Specification, verification and test
- ISA specification & verification:
  - Mandatory: Who Guards the Guards? Formal validation of the Arm v8-M architecture specification, OOPSLA 2017
    [ACM Digital Library]
  - ISA Semantics for ARMv8-A, RISC-V, and CHERI-MIPS, POPL 2019
    [Open Access]
  - Sail RISC-V docs:
    [GitHub]
- Instruction test generation:
  - Mandatory: Genesys-Pro: Innovations in Test Program Generation for Functional Processor Verification, IBM Research, IEEE Design and Test 2004
    [IEEE Xplore]
  - Randomised testing of a microprocessor model using SMT-solver state generation, 2015
    [Science Direct]
  - RISC-V torture tests:
    [GitHub]
- Additional material:
  - RISC-V tests:
    [GitHub]
  - RISC-V formal framework:
    [Open Access Slides]
Week 5: Hardware security (I)
- Background: An Introduction to CHERI, Technical Report UCAM-CL-TR-941, Computer Laboratory, September 2019.
[local PDF]
- Efficient Tagged Memory, ICCD 2017
[open access PDF]
- CHERI: A Hybrid Capability-System Architecture for Scalable Software Compartmentalization, SSP 2015
[open access PDF]
- CHERIvoke: Characterising Pointer Revocation using CHERI Capabilities for Temporal Memory Safety, MICRO 2019
[open access PDF]
- Further optional reading:
  - CHERI Concentrate: Practical Compressed Capabilities, IEEE Transactions on Computers 2019
    [open access PDF]
  - Capability Hardware Enhanced RISC Instructions: CHERI Instruction-Set Architecture (Version 7), Technical Report UCAM-CL-TR-927
    [local PDF]
  - CHERI publications list
Week 6: Hardware security (II)
- The attack: Spectre Attacks: Exploiting Speculative Execution
[PDF from spectreattack.com]
- Example industry response: ARM white paper: Cache Speculation Side-channels
[PDF from ARM]
- Research into hardware mitigations: MI6: Secure Enclaves in a Speculative Out-of-Order Processor
[PDF on arXiv]
- Further pointers:
[https://spectreattack.com/]
[https://meltdownattack.com/]
Week 7: Hardware reliability
- StageWeb: Interweaving Pipeline Stages into a Wearout and Variation Tolerant CMP Fabric
Gupta, Ansari, Feng and Mahlke, DSN 2010
[IEEE Xplore]
- Reunion: Complexity-Effective Multicore Redundancy
Smolens, Gold, Falsafi and Hoe, MICRO 2006
[ACM Digital Library]
- Utilizing Dynamically Coupled Cores to Form a Resilient Chip Multiprocessor
LaFrieda, İpek, Martínez and Manohar, DSN 2007
[IEEE Xplore]
Other optional material for week 7
- Architectural Core Salvaging in a Multi-Core Processor for Hard-Error Tolerance
Powell, Biswas, Gupta and Mukherjee, ISCA 2009
[ACM Digital Library]
- Relax: An Architectural Framework for Software Recovery of Hardware Faults
De Kruijf, Nomura and Sankaralingam, ISCA 2010
[ACM Digital Library]
Week 8: HW Accelerators and accelerators for machine learning
- Pushing the limits of accelerator efficiency while retaining programmability, Nowatzki, Gangadhar, Sankaralingam and Wright, HPCA 2016
- Eyeriss: A spatial architecture for energy-efficient dataflow for convolutional neural networks, Chen, Emer and Sze, ISCA 2016
- EIE: Efficient Inference Engine on Compressed Deep Neural Network,
Han, Liu, Mao, Pu, Pedram, Horowitz and Dally, ISCA 2016
[VIDEO] Song Han presenting work at ISCA
Other optional/background material for week 8
- Efficient Processing of Deep Neural Networks: A Tutorial and Survey Sze, Chen, Yang, Emer, Proceedings of the IEEE, Vol. 105, No. 12, Dec. 2017
- Roofline: An Insightful Visual Performance Model for Floating-Point Programs and Multicore Architectures Williams, Waterman and Patterson, Communications of the ACM, vol. 52, Issue 4, April 2009, pp 65-76.