## COMPUTER SCIENCE TRIPOS Part II - 2018 - Paper 8

## 13 System-on-Chip Design (DJG)

- (a) At the lowest level, what is the primary consumer of electrical power in digital logic today? Give a formula for the expected energy or power use for a CMOS gate. [2 marks]
- (b) A matrix (a 2-dimensional array) is stored on-chip in static RAM. What main factors contribute to the time and energy needed to transpose it? [4 marks]
- (c) Assume now a square matrix is to be held in DRAM.
  - (i) When might it be helpful to store multiple copies of a given matrix in different DRAM banks? [1 mark]
  - (ii) When might it be helpful to store multiple copies of the matrix (or another example data structure) in one DRAM bank? [2 marks]
  - (iii) One way to avoid transposing a matrix is simply to hold an annotation that it has been transposed and then to swap over the row and column arguments for each operation. Why might physically performing the transpose ultimately benefit performance? Where would the annotation be held? [2 marks]
- (d) A computation operates on square matrices of size $10^5 \times 10^5$. The inner loop, to be accelerated in hardware, has the following basic structure:

  ```
  for (int i= ...)
    for (int j= ...)
    {
      DD[i, j] = ff(SS[i-1, j], SS[i, j-1])
    }
  ```

  - (i) Are there any loop-carried dependencies? What does this mean for performance optimisation? [1 mark]
  - (ii) If the DRAM timings are 11-11-11, meaning row activation, column activation and writeback each take 11 clock cycles, estimate roughly the minimum time for a naive implementation of the computation. Assume a simple linear data layout. State all further assumptions. [6 marks]
  - (iii) What determines whether it is possible or a good idea to perform the operation 'in place' (i.e. using the same memory for DD and SS)? [2 marks]