Advanced Topics in Computer Architecture
Reading List
Note: None of the Week 1 papers are for assessment. You should not submit an essay on the topic of these papers.
Week 1: Trends in Computer Architecture
- Clock Rate versus IPC: The End of the Road for Conventional Microarchitectures
  Agarwal, Hrishikesh, Keckler and Burger, ISCA, June 2000.
  [IEEE Xplore]
- Dark Silicon and the End of Multicore Scaling
  Esmaeilzadeh et al, IEEE Micro, 32:2, May-June 2012.
  [IEEE Xplore]
- The Accelerator Wall: Limits to Chip Specialization
  Fuchs and Wentzlaff, HPCA 2019.
  [IEEE Xplore]
Other optional material for week 1
- A New Golden Age for Computer Architecture Hennessy and Patterson, Communications of the ACM, Feb. 2019, 62(2), pp. 48-60 (Turing Lecture)
- Sophie Wilson, "The Future of Microprocessors", 2020 Wheeler Lecture, University of Cambridge, May 2020
Week 2: State-of-the-art Processor Design
- BROOM: An Open-Source Out-of-Order Processor With Resilient Low-Voltage Operation in 28nm CMOS
  Celio, Chiu, Asanovic, Nikolic and Patterson. Hot Chips 30, 2019.
  [IEEE Xplore]
- Inside 6th-Generation Intel Core: New Microarchitecture Code-Named Skylake
  Doweck et al, IEEE Micro, vol. 37, 2017
  [IEEE Xplore]
- The Celerity Open-Source 511-Core RISC-V Tiered Accelerator Fabric: Fast Architectures and Design Methodologies for Fast Chips
  Davidson et al. IEEE Micro, 38(2), March-April, 2018
  [IEEE Xplore]
Other optional material for week 2
- Samsung M3 Processor
  Rupley, Burgess, Grayson, Zuraski, IEEE Micro, 39(2), March-April, 2019
  [IEEE Xplore]
Week 3: Memory system design
- Linearizing Irregular Memory Accesses for Improved Correlated Prefetching
  Jain and Lin, MICRO 2013
  [ACM Digital Library]
- Best-Offset Hardware Prefetching
  Michaud, HPCA 2016
  [IEEE Xplore] [Version with figure 6 fixed]
- Minnow: Lightweight Offload Engines for Worklist Management and Worklist-Directed Prefetching
  Zhang, Ma, Thomson and Chiou, ASPLOS 2018
  [ACM Digital Library]
Other optional material for week 3
- Continuous Runahead: Transparent Hardware Acceleration for Memory Intensive Workloads
  Hashemi, Mutlu and Patt, MICRO 2016
  [ACM Digital Library]
- Meet the Walkers: Accelerating Index Traversals for In-Memory Databases
  Kocberber, Grot, Picorel, Falsafi, Lim and Ranganathan, MICRO 2013
  [ACM Digital Library]
Week 4: Hardware reliability
- BlackJack: Hard Error Detection with Redundant Threads on SMT
  Schuchman and Vijaykumar, DSN 2007
  [IEEE Xplore]
- Utilizing Dynamically Coupled Cores to Form a Resilient Chip Multiprocessor
  LaFrieda, İpek, Martínez and Manohar, DSN 2007
  [IEEE Xplore]
- StageWeb: Interweaving Pipeline Stages into a Wearout and Variation Tolerant CMP Fabric
  Gupta, Ansari, Feng and Mahlke, DSN 2010
  [IEEE Xplore]
Other optional material for week 4
- Architectural Core Salvaging in a Multi-Core Processor for Hard-Error Tolerance
  Powell, Biswas, Gupta and Mukherjee, ISCA 2009
  [ACM Digital Library]
- Relax: An Architectural Framework for Software Recovery of Hardware Faults
  De Kruijf, Nomura and Sankaralingam, ISCA 2010
  [ACM Digital Library]
Week 5: Specification, verification and test
This week has two interrelated themes. I suggest that you base your essay on two papers from the same theme, though this is not mandatory. Three of the papers are marked for talks.
- Theme 1: Formal specification of ISAs:
  - Main paper (and talk): Who Guards the Guards? Formal validation of the Arm v8-m architecture specification, OOPSLA 2017
    [ACM Digital Library]
  - Alternative main paper (to compare with the above): ISA Semantics for ARMv8-A, RISC-V, and CHERI-MIPS, POPL 2019
    [Open Access]
  - Sail RISC-V docs:
    [GitHub]
- Theme 2: Instruction test generation:
  - Main paper (and talk): Genesys-Pro: Innovations in Test Program Generation for Functional Processor Verification, IBM Research, IEEE Design and Test 2004
    [IEEE Xplore]
  - Main paper (and talk): Randomised testing of a microprocessor model using SMT-solver state generation, 2015
    [Science Direct]
  - RISC-V torture tests:
    [GitHub]
Other optional material for week 5
- RISC-V tests:
  [GitHub]
- RISC-V formal framework:
  [Open Access Slides]
Week 6: Security 1: CHERI
- Background: An Introduction to CHERI
  [Technical Report UCAM-CL-TR-941]
- Efficient Tagged Memory:
  [local PDF]
- CHERI: A Hybrid Capability-System Architecture for Scalable Software Compartmentalization:
  [local PDF]
- CHERIvoke: Characterising Pointer Revocation using CHERI Capabilities for Temporal Memory Safety:
  [local PDF]
Optional material for week 6
- Cornucopia: Temporal Safety for CHERI Heaps
  [local PDF]
- CHERI Concentrate: Practical Compressed Capabilities
  [local PDF]
- Capability Hardware Enhanced RISC Instructions: CHERI Instruction-Set Architecture (Version 8)
  [Technical Report UCAM-CL-TR-927]
- [CHERI publications list]
Week 7: Security 2: Speculative execution attacks
- Spectre Attacks: Exploiting Speculative Execution
  [Open PDF]
- Example industry response: ARM white paper: Cache Speculation Side-channels
  [ARM white paper] [Base links to ARM work on speculative vulnerabilities]
- Research into hardware mitigations: MI6: Secure Enclaves in a Speculative Out-of-Order Processor
  [PDF on arXiv]
Optional material for week 7
Week 8: Hardware accelerators and accelerators for machine learning
- Pushing the limits of accelerator efficiency while retaining programmability, Nowatzki, Gangadhar, Sankaralingam and Wright, HPCA 2016
- A Software-defined Tensor Streaming Multiprocessor for Large-scale Machine Learning, Abts et al., ISCA 2022
  [ACM Digital Library]
- Plasticine: A Reconfigurable Architecture For Parallel Patterns, Prabhakar et al, ISCA 2017
Other optional/background material for week 8
You may also be interested in the approaches taken by Tenstorrent, Graphcore and Esperanto.
The following papers are perhaps a little more hardware-centric but may also be of interest:
- Eyeriss: A spatial architecture for energy-efficient dataflow for convolutional neural networks, Chen, Emer and Sze, ISCA 2016
- EIE: Efficient Inference Engine on Compressed Deep Neural Network, Han, Liu, Mao, Pu, Pedram, Horowitz and Dally, ISCA 2016
  [VIDEO] Song Han presenting work at ISCA
- NullHop: A Flexible Convolutional Neural Network Accelerator Based on Sparse Representations of Feature Maps
  Aimar et al, IEEE Trans. Neural Networks Learn. Syst. 30(3): 644-656 (2019)
- Efficient Processing of Deep Neural Networks: A Tutorial and Survey
  Sze, Chen, Yang, Emer, Proceedings of the IEEE, Vol. 105, No. 12, Dec. 2017
Lecture Slides
Seminar 1 - Trends in Computer Architecture
Seminar 2 - Superscalar Processor Design (background)
Seminar 7 - Secure processors 2: Speculative execution attacks