# System On Chip: Design & Modelling (SOC/DAM)

## Exercises

Here is the second set of exercises. These are intended to cover subject groups 5-8 of the SOC/DAM syllabus (ABD, SFT, RD, E). These questions are styled as Tripos questions, with 20 marks each, but they largely involve repeating what was lectured. Tripos exam questions this year will certainly be no harder than anything in these exercises. (In future years, questions that require a greater amount of creative thinking might be set.)

Full model answers and answer notes are available (for supervisors only).

## 1 ABD: Assertion-Based Design.

Topics: Assertion-based design. PSL/SVA assertions. Overview of temporal logic compilation to FSM. Overview of sequential equivalence checking and model checking.

#### Exercise ABD1 : Assertion-based design.

a) What is the difference between a safety assertion and a liveness assertion over the behaviour of a system ? [4 Marks]

b) How does a declarative safety assertion differ from an imperative assert statement ? [4 Marks]

c) How can a liveness assertion be checked in a simulation ? [4 Marks]

d) Give a short segment of RTL or pseudocode that contains an imperative assertion that holds and give also a pair of valid safety and liveness assertions that hold for your code. [8 Marks]

#### Exercise ABD2 : General ABD.

a) What are the benefits of the assertion-based design (ABD) methodology ? [5 Marks]

b) Illustrate how a regular expression can be used as part of a safety assertion. [5 Marks]

c) Using three or more modelling layers, describe the PSL reference model. [5 Marks]

In PSL next-cycle suffix implication uses |=> and same-cycle suffix implication uses |->.

d) Use these two different forms to give a pair of PSL expressions that have identical meaning. [6 Marks]

See http://www.esperan.com/tutorial/psl\_simple.html
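The timing difference between the two suffix implication operators can be illustrated with a small trace checker. This is only a sketch under simplifying assumptions: both the antecedent and the consequent are single-cycle Boolean conditions, the trace is finite, and an obligation that falls beyond the end of the trace is treated as not violated. The function names and the `req`/`ack` signals are hypothetical and are not part of PSL.

```python
from typing import Dict, List

# A trace is a list of clock cycles; each cycle maps signal names to values.
Trace = List[Dict[str, bool]]


def holds_same_cycle(trace: Trace, antecedent: str, consequent: str) -> bool:
    """Check `always {antecedent} |-> {consequent}` on a finite trace.

    The consequent must hold in the same cycle in which the antecedent holds.
    """
    return all(cycle[consequent] for cycle in trace if cycle[antecedent])


def holds_next_cycle(trace: Trace, antecedent: str, consequent: str) -> bool:
    """Check `always {antecedent} |=> {consequent}` on a finite trace.

    The consequent must hold one cycle after the antecedent holds. An
    antecedent in the final cycle leaves a pending obligation, which this
    sketch (an assumption) does not count as a violation.
    """
    return all(
        nxt[consequent]
        for cur, nxt in zip(trace, trace[1:])
        if cur[antecedent]
    )


# Hypothetical three-cycle trace: ack arrives one cycle after req.
trace: Trace = [
    {"req": True, "ack": False},
    {"req": False, "ack": True},
    {"req": False, "ack": False},
]

same_cycle_result = holds_same_cycle(trace, "req", "ack")
next_cycle_result = holds_next_cycle(trace, "req", "ack")
print(same_cycle_result)  # False: ack is low in the cycle where req is high
print(next_cycle_result)  # True: ack is high one cycle after req
```

On this trace the same-cycle form fails and the next-cycle form holds, which shows that the two operators place the consequent at different points relative to the end of the antecedent.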

#### Exercise ABD3 : Four-phase handshake.

a) Give a temporal logic expression that defines a four-phase handshake in PSL-like syntax. [5 Marks]

b) Give the synthesisable RTL or circuit for a monitor that checks operation of a four-phase handshake. You may assume a high-frequency clock is available that does not alias any transitions. [5 Marks]

c) Discuss whether your PSL specification and circuit are equivalent, and whether either is in any sense complete. [5 Marks]

d) What is automated stimulus generation and can it be usefully applied to interfaces such as the four-phase handshake ? [5 Marks]

#### Exercise ABD4 : Bus Monitors.

a) What is meant by the formal specification of a protocol and what is a bus monitor ? [5 Marks]

b) How are bus monitors used in ABD and what sort of error might be detected (safety or liveness etc.) ? [5 Marks]

c) How can a bus monitor be used to generate simulation stimulus ? What coverage might be possible ? [5 Marks]

d) What statistics might a bus monitor collect ? [5 Marks]

#### Exercise ABD5 : PSL Operators and Algorithm.

a) Why is it recommended to always use a PSL SERE as part of a suffix implication ? [5 Marks]

b) Describe five infix operators defined in PSL. [5 Marks]

c) Outline an algorithm for synthesising a pattern-detecting automaton from a PSL SERE. [5 Marks]

d) How can pattern detectors be combined with suffix implication operators ? [5 Marks]

#### Exercise ABD6 : ABD Methodology.

a) What is meant by 'Assertion Based Design'? [5 Marks]

b) Compare the use of assertions and yes/no test wrappers in regression testing. [5 Marks]

c) Explain how certain assertions can span layers (and others not). For example, some might be used for TLM modelling as well as for pre-synthesis and post-synthesis forms of an RTL design. [5 Marks]

d) What is meant in testing by the term 'coverage' and can this be applied to a set of assertions ? [5 Marks]

#### Exercise ABD7 : Sequential Equivalence Checker (SEC).

a) What is the combinational equivalence problem ? What is the role of don't cares in it ? [5 Marks]

b) What is meant by sequential equivalence and strong and weak bi-simulation ? [5 Marks]

c) Why might sequential equivalence be violated in a design flow (i.e. SEC gives a negative result)? [5 Marks]

d) Why might we see false negatives from a SEC ? [5 Marks]

## 2 SFT: Structure, Flow and Tools.

Topics: Bus Structures, Design Flow and Tools.

#### Exercise SFT1 : Dynamic Clock Gating.

a) What is dynamic clock gating and why is it used ? [4 Marks]

b) Compare coarse-grained manual and fine-grained automatic clock gating. [4 Marks]

c) Describe some common clock-gate insertion transformations. [6 Marks]

d) Compare dynamic clock gating with power isolation in terms of automation, scale and functionality. [6 Marks]

#### Exercise SFT2 : Cell Library.

a) Give a short list of logic cells to be found in a standard cell library. [5 Marks]

b) List five types of information that should be stored about each cell. [5 Marks]

c) How can an algorithm that chooses an assembler instruction from an instruction set in the back end of a compiler be used for choosing a cell from a cell library in the back end of a logic synthesiser ? [5 Marks]

d) Name several illustrative, specialist VLSI structures or components that cannot readily be made out of standard logic cells and explain why custom design is needed. [5 Marks]

NB: Detailed custom versus semi-custom versus FPGA/CPLD material might not be lectured in 08/09.

#### Exercise SFT3 : JTAG Port.

a) Why do ASICs commonly support special test modes? [4 Marks]

b) Define boundary scan and full scan test path, and compare them. [4 Marks]

c) Briefly describe the structure and operation of the JTAG test port used on many chips. [4 Marks]

d) How can JTAG ports be combined and is this a good idea within a single SoC ? [4 Marks]

e) What other uses can the JTAG port frequently be put to ? [4 Marks]

#### Exercise SFT4 : A Basic BVCI SoC.

a) Sketch the block diagram for a SoC with one processor, one SRAM, one ROM, one Counter/Timer block and one PIO section, all connected to a single bus without any bus bridges. [5 Marks]

b) List the (main) signals that make up the BVCI bus (or a bus of similar functionality) and explain the protocol. [6 Marks]

c) Is DMA supported in the SoC of part a and how might it be added ? [3 Marks]

d) How are interrupts delivered in your SoC of part a? [3 Marks]

e) What modifications are needed if a second processor core were to be added ? Is a second bus a good idea ? [3 Marks]

#### Exercise SFT5 : Memory Macrocell Generator.

a) What input parameters might we expect to give to a generator program that creates multi-ported SRAM memories for use in a System on Chip ? [5 Marks]

b) What output files might we expect from the memory generator program ? [5 Marks]

c) Sketch either a TLM-style or RTL-style simulation model in RTL or SystemC code for a SRAM memory with two read ports and one write port. [5 Marks]

d) What differences in terms of timing and contention might we see if a model of a memory subsystem is populated with TLM-style models of the RAMs compared with RTL-style models. [5 Marks]

*Bonus:* What problems might there be if the simulation model from part c were fed into a logic synthesiser for use on an actual ASIC or FPGA ?

#### Exercise SFT6 : Multiple Busses With Bridges.

a) In SoC terms, what is a bus and why are tri-states not often used? [2 Marks]

b) How might the output port for a transaction over such a bus switch be decided ? [2 Marks]

c) What is a bus bridge, what transactions might it support and what internal operations might it implement ? [4 Marks]

d) If a SoC is designed with a number of bridged busses, what are the main aspects that determine the allocation of initiators and targets to the busses ? [3 Marks]

e) Is there no real difference between a Network On Chip and a set of bus bridges ? [3 Marks]

f) Is temporal de-coupling needed for operations on a SoC that uses a number of bridged busses ? [3 Marks]

g) How is target contention handled in a SoC that uses a number of bridged busses compared with a NoC (network on chip) ? [3 Marks]

#### Exercise SFT7 : Network-On-Chip (NoC).

a) What is meant by the term Network-on-Chip and what are the two main differences between using a number of bus bridges and a network fabric ? [5 Marks]

b) Describe two buffering techniques that might be used in a NoC. [2 Marks]

c) Describe two flow control techniques used in a NoC. [2 Marks]

d) What can be done to avoid NoC deadlock ? How can it be detected ? What should be done when it is detected ? [6 Marks]

e) What is the flattened-butterfly NoC topology and why is it considered ? [5 Marks]

NB: Detailed NoC material might not be fully lectured in 08/09.

## 3 RD: Recent Developments.

Topics: Future languages, Recent developments, lectured as time permits: SystemVerilog, Bluespec, C-to-Gates, Kiwi, Co-Synthesis, Future Directions.

This material is not examinable in 2008/9 and may not be lectured.

#### Exercise RD1 : SystemVerilog.

a) How does SystemVerilog extend Verilog? [5 Marks]

b) Does SystemVerilog promote or discourage higher-level expression of designs ? [10 Marks]

c) In what ways are SystemVerilog designs different from C-to-gates designs ? [5 Marks]

#### Exercise RD2 : Bluespec SystemVerilog.

a) What is a Bluespec rule (guarded atomic transaction) ? [5 Marks]

b) How is parallel programming expressed in Bluespec SystemVerilog and in conventional RTL ? Explain which is considered the higher-level language and say why. [5 Marks]

c) What is the code explosion problem in high-level synthesis and how does Bluespec avoid it ? [5 Marks]

d) How does Bluespec help with timing closure ? [5 Marks]

#### Exercise RD3 : Kiwi Project (C#-to-gates via .NET).

a) What is a generate statement, as found in VHDL and Verilog, and how is the same effect achieved in Kiwi? [5 Marks]

b) What do parallel programming and hardware design have in common ? [5 Marks]

c) How have the designers of the Kiwi system exploited this ? [5 Marks]

d) What might be the interpretation of a thread fork and join in general C-to-gates flows and in Kiwi in particular ? [5 Marks]

#### Exercise RD4 : UML For VLSI Design (Marte Project).

a) What are the basic roles of a graphical cockpit, such as Eclipse or another GUI, in SoC design ? [5 Marks]

b) Give an overview of the Marte Project and explain the potential roles of UML in SoC design. [5 Marks]

c) How does UML differ from IP-XACT as used in SoC design? [5 Marks]

d) List several visualisation tools that could usefully be offered in an integrated development environment for SoC design, debugging and evaluation. [5 Marks]

#### Exercise RD5 : Glue Logic Synthesis.

a) What is the data conservation principle in interface design ? [2 Marks]

b) List the major steps in the product method for glue logic synthesis. [5 Marks]

c) Give four (or more) commonly appearing component joining paradigms. [8 Marks]

d) Using example fragments of RTL or SystemC-like glue code that joins a pair of interfaces, explain what the user needs to define and what might be synthesised. (You might choose, as your example, a duplex mailbox that offers blocking target ports on both sides.) [5 Marks]

#### Exercise RD6 : Transactor Synthesis.

a) What is an ESL transactor ? [5 Marks]

b) Explain why it might be useful for a common protocol specification to be used both to synthesise bus monitors and to synthesise transactors. [5 Marks]

c) Name three typical transactor configurations (and explain why the obvious fourth is potentially useless). [5 Marks]

d) Can glue logic be synthesised from transactor definitions? [5 Marks]

#### Exercise RD7 : IP-XACT.

a) What is the purpose of the IP-XACT specification ? [5 Marks]

b) How can device driver register definitions be automatically aligned with RTL implementations ? [5 Marks]

c) What alternatives to IP-XACT might be considered for structural netlists ? [5 Marks]

d) How might IP-XACT be used in conjunction with transactor synthesis ? [5 Marks]

## 4 E: Engineering and Physical Considerations.

Topics: Power consumption, scaling, size, logical effort and performance limits.

#### Exercise E1 : Logical Effort.

a) When sending a signal a long distance over a chip, compare using powerful drivers with a repeater arrangement that uses a larger number of less-powerful drivers. [5 Marks]

b) When building a multi-stage logic circuit, what arrangement gives least area ? [5 Marks]

c) When building a multi-stage logic circuit, what arrangement gives least power ? [5 Marks]

d) When building a multi-stage logic circuit, what arrangement gives lowest delay ? [5 Marks]

NB: Detailed material to answer this question is unlikely to be lectured in 08/09.

#### Exercise E2 : VLSI Energy Use.

For this question, use the following figures:

| Parameter               | Value      | Unit                                |
|-------------------------|------------|-------------------------------------|
| Drawn Gate Length       | 0.08       | $\mu\mathrm{m}$                     |
| Metal Layers            | 6 to 9     | layers                              |
| Gate Density            | 400 K      | $\mathrm{gates}/\mathrm{mm}^2$      |
| Track Width             | 0.25       | $\mu\mathrm{m}$                     |
| Track Spacing           | 0.25       | $\mu\mathrm{m}$                     |
| Gate Output Capacitance | 0.06       | fF                                  |
| Gate Input Capacitance  | 0.03       | fF                                  |
| Tracking Capacitance    | 1          | $\mathrm{fF}/\mathrm{mm}$           |
| Core Supply Voltage     | 0.9 to 1.4 | V                                   |
| FO4 Delay               | 51         | ps                                  |
| Leakage current         | 21         | nA/gate                             |

A processor core in the above technology uses 200k gates, excluding cache memories. It has two operating conditions: 100 MHz at 0.9 volts or 400 MHz at 1.4 volts. The average net activity ratio is negligible during halt and 0.3 when running.

Give all working and intermediate results. State any additional assumptions you need or use.
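When setting out working for power estimates of this kind, it can help to keep unit prefixes explicit. The sketch below assumes the commonly used switching-power relation (activity ratio times capacitance times supply voltage squared times clock frequency) and a leakage power equal to leakage current times supply voltage; some texts include an additional factor of one half in the switching term, so state which convention you adopt. The function names are hypothetical and the numerical values are illustrative only; they are deliberately not the figures from the table above.

```python
FEMTO = 1e-15  # femto prefix, e.g. fF -> F
NANO = 1e-9    # nano prefix, e.g. nA -> A
MEGA = 1e6     # mega prefix, e.g. MHz -> Hz


def dynamic_power_watts(activity: float, capacitance_farads: float,
                        supply_volts: float, clock_hz: float) -> float:
    """Switching power of one net, assuming P = activity * C * V^2 * f."""
    return activity * capacitance_farads * supply_volts ** 2 * clock_hz


def leakage_power_watts(leakage_amps: float, supply_volts: float) -> float:
    """Static power of one gate, assuming P = I_leak * V."""
    return leakage_amps * supply_volts


# Illustrative (hypothetical) values, NOT the exercise figures:
# activity 0.5, 2 fF load, 1.0 V supply, 500 MHz clock, 10 nA leakage.
p_dyn = dynamic_power_watts(0.5, 2.0 * FEMTO, 1.0, 500 * MEGA)
p_leak = leakage_power_watts(10 * NANO, 1.0)

print(f"dynamic: {p_dyn:.3e} W, leakage: {p_leak:.3e} W")
```

Keeping the prefixes as named constants makes each intermediate result easy to check by hand and makes the assumed formula visible in the working.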

a) Estimate the area of the processor. [2 Marks]

b) Compute the power consumed per gate at each operating condition when driving tracks of 0 mm and 1 mm. [2 Marks]

c) Estimate the power consumption of the processor core when halted and running for each operating condition. [3 Marks]

d) Compared with having the processor running at full performance all the time, how much power is saved just by halting the processor when it is idle ? [2 Marks]

e) How much power is saved by dynamic frequency scaling? [2 Marks]

f) How does dynamic frequency scaling compare with halting ? [2 Marks]

g) How much power is saved by combined dynamic voltage and frequency scaling ? [2 Marks]

h) How much power might be saved by power gating (i.e. power isolation) ? [2 Marks]

i) Estimate the relative costs of performing a 32 bit addition and sending the 32 bit result 1 mm over the chip. [3 Marks]

#### Exercise E3 : Dynamic Voltage and Frequency Scaling.

a) Give a formula for the power dissipation associated with a net on a silicon chip. [3 Marks]

b) What is meant by coarse-grained and fine-grained clock gating ? [3 Marks]

c) For a fixed supply voltage, quantify the power benefits of frequency scaling. In other words, compare computing quickly and halting with computing more slowly and finishing just in time. [3 Marks]

d) Give two ways that the supply voltage to a region may be varied. [3 Marks]

e) Using variable supply voltages, quantify the power benefits of frequency scaling. [3 Marks]

f) Sketch the architecture of an ASIC (or part of) that uses all of these techniques. [5 Marks]

#### Exercise E4 : Information Flux.

a) How many signal nets per square micron can be routed in a vertical plane in modern VLSI ? [5 Marks]

b) How does the power required to drive a signal net vary with its planar density and length ? [5 Marks]

c) What is the maximum information flux feasible in a modern silicon chip ? [5 Marks]

d) How might we use replicated computation to ameliorate this situation ? [5 Marks]

NB: Detailed material to answer this question may not have been lectured in 08/09.

#### Exercise E5 : Technology/Scaling.

a) What is meant by the term *feature size* in VLSI ? Give typical values. [5 Marks]

b) What are the main consequences of moving to a smaller feature size in VLSI fabrication ? [5 Marks]

c) What happens to the relative costs of computation and communication as features get smaller ? [5 Marks]

d) Why has parallel computation become more important than ever before ? [5 Marks]

#### Exercise E6 : Cost and Power.

a) Summarise the historical trends that affect the relative merits of FPGA and custom silicon in consumer, professional and military, mains-powered applications. [5 Marks]

b) How does the argument differ for battery-powered devices ? [5 Marks]

c) What are the main power consuming components in FPGA, embedded processors, custom silicon and programmable core silicon ? [5 Marks]

d) Discuss whether multi-core processor chips can/should take over from FPGA and custom silicon in various applications. Consider Picochip, XMOS and ARC if you are familiar with them. [5 Marks]

(C) 2008-9 DJ GREAVES.
