A Broadside Register

Symbol: Clock and an N-bit input D enter the broadside register, which drives the N-bit output Q.

Implementation: N D-type flip-flops share one Clock; flip-flop $i$ takes input $D_i$ and drives output $Q_i$, for $i = 0, 1, 2, \ldots, N-1$.
A broadside two-to-one multiplexor
A dual port register file

- Data in
- Write Address
- clock
- Data out A
- Read Address A
- Data out B
- Read Address B
Read Only Memory (ROM)

The ROM takes \( A \) address bits named \( A_0 \) to \( A_{A-1} \) and produces data words \( D \) bits wide. For example, if \( A=5 \) and \( D=8 \) then the ROM contains \( 2^5 = 32 \) locations of 8 bits each. The address lines are called \( A_0, A_1, A_2, A_3, A_4 \) and the data lines \( D_0, D_1, \ldots, D_7 \).

The ROM’s outputs are high impedance unless the enable input is asserted (low). Once the enable goes low, the output drivers turn on. Once the address has been stable for sufficiently long, valid data from that address appears on the outputs.

The ROM contents are placed inside during manufacture or field programming.
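
The read behaviour described above can be sketched as a small behavioural model. This is only an illustration: the class name `Rom`, the use of `None` to stand for high impedance, and the example contents are assumptions, and access-time delays are not modelled.

```python
class Rom:
    """Behavioural model of a ROM with an active-low enable (no timing)."""

    def __init__(self, address_bits, data_bits, contents):
        # A ROM with A address bits holds 2**A words of D bits each.
        if len(contents) != 2 ** address_bits:
            raise ValueError("contents must fill every location")
        if any(not 0 <= word < 2 ** data_bits for word in contents):
            raise ValueError("a word does not fit in the data width")
        self.address_bits = address_bits
        self.data_bits = data_bits
        self.contents = list(contents)  # fixed at 'manufacture'

    def read(self, enable_bar, address):
        # Outputs are high impedance (modelled as None) unless enable is low.
        if enable_bar != 0:
            return None
        return self.contents[address]


# Example from the text: A = 5, D = 8 gives 32 locations of 8 bits.
rom = Rom(5, 8, [(3 * i) & 0xFF for i in range(32)])
```

With the enable high, `rom.read(1, 7)` returns `None`; with the enable low, `rom.read(0, 7)` returns the word stored at address 7.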
Read Cycle - Like the ROM

- Read or write mode select
- Enable Input (active low)
- Address In
- Data Bus: **High-Z**
  - Valid data
- Data In and Out: **High-Z**

Write Cycle - Data stored internally

- Read or write mode select
- Enable Input (active low)
- Address In
- Data Bus: **High-Z**
  - Data must be valid here to be stored.
- Data In and Out: **High-Z**
Unlike the edge-triggered flip-flop, the transparent latch passes data through in a transparent way when its enable input is high. When its enable input is low, the output stays at the current value.
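
The difference between the two storage elements can be illustrated with a short behavioural sketch. The class names and the stimulus are hypothetical, and both devices are assumed to start with the output at zero.

```python
class TransparentLatch:
    """Output follows D while enable is high, holds while enable is low."""

    def __init__(self):
        self.q = 0

    def update(self, enable, d):
        if enable:
            self.q = d
        return self.q


class DFlipFlop:
    """Output takes the value of D only on a rising clock edge."""

    def __init__(self):
        self.q = 0
        self._prev_clk = 0

    def update(self, clk, d):
        if clk and not self._prev_clk:  # rising edge detected
            self.q = d
        self._prev_clk = clk
        return self.q


def run(device, stimulus):
    # stimulus is a list of (enable_or_clock, d) pairs, one per time step
    return [device.update(c, d) for c, d in stimulus]


stimulus = [(0, 1), (1, 1), (1, 0), (1, 1), (0, 0), (0, 1)]
latch_trace = run(TransparentLatch(), stimulus)
flop_trace = run(DFlipFlop(), stimulus)
```

While the control input stays high, the latch output follows every change on D, whereas the flip-flop keeps the value it captured on the rising edge.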
A DRAM has a multiplexed address bus: the address is presented in two halves, known as the row and column addresses, over the same $A$ address pins. With data words $D$ bits wide, the capacity is therefore $4^A \times D$ bits. A 4 Mbit DRAM might have $A=10$ and $D=4$.
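
As a worked check of the capacity figure, taking $A$ as the number of multiplexed address pins (so each location needs an $A$-bit row address and an $A$-bit column address):

$$
\text{locations} = 2^{A} \times 2^{A} = 4^{A}, \qquad \text{capacity} = 4^{A} \times D \ \text{bits}
$$

$$
A = 10,\ D = 4: \quad 4^{10} \times 4 = 2^{20} \times 4 = 4\,194\,304 \ \text{bits} = 4 \ \text{Mbit}
$$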

When a processor (or its cache) wishes to read many locations in sequence, only one row address need be given, and multiple column addresses can then be given quickly to access data in the same row. This is known as 'page mode' access.

EDO (extended data out) DRAM is now quite common. It guarantees that data remains valid for an extended period after CAS, which helps system timing design at high CAS rates.

**Refresh Cycle - must happen sufficiently often!**

No data enters or leaves the DRAM during refresh, so it 'eats memory bandwidth'. Typically 512 cycles of refresh must be done every 8 milliseconds.
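
The bandwidth cost of refresh can be estimated with a short calculation. The figures of 512 cycles and 8 milliseconds come from the text; the 100 ns duration of one refresh cycle is purely an assumption for illustration and is not taken from any particular device.

```python
def refresh_overhead(cycles_per_interval, interval_s, cycle_time_s):
    """Fraction of time the DRAM is busy refreshing rather than serving data."""
    return cycles_per_interval * cycle_time_s / interval_s


REFRESH_CYCLES = 512          # from the text
REFRESH_INTERVAL_S = 8e-3     # from the text
ASSUMED_CYCLE_TIME_S = 100e-9  # assumption: one refresh cycle takes 100 ns

overhead = refresh_overhead(REFRESH_CYCLES, REFRESH_INTERVAL_S, ASSUMED_CYCLE_TIME_S)

# If the refresh cycles are spread evenly, one is needed this often:
row_interval_s = REFRESH_INTERVAL_S / REFRESH_CYCLES
```

Under that assumed cycle time, refresh consumes well under one percent of the available cycles, with one refresh cycle falling due roughly every 15.6 microseconds.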
Crystal oscillator clock source

RC oscillator clock source
Clock multiplication and distribution

Figure labels: 'outside the chip' and 'inside the chip' regions; external clock input (33 MHz); PLL circuit with VCO (264 MHz) and divide-by-8 (264 MHz / 8 = 33 MHz); clock distribution tree; power-on reset circuit (Supply, Ground, Vi, Vo) with active-low reset output.
Driving a heavy current or high-voltage load
Debouncer circuit for a two-pole switch

Switch

+5Volt supply rail

Pullup Resistors

Output

Gnd

Bounces

A

B

Output
ALU and flags register

- Function code (4 bits)
- A-input (N bits)
- B-input (N bits)
- Carry in
- ALU output
- Flags register holding C, N, Z and V, clocked by the flags clock
ALU and register file

8 bit ALU

A-input

B-input

8

Function Code

A

B

Register file
16 registers of 8 bits

Q

Din

D

4 bit counter

4

Zero detect

4 bit counter

FUNCTION GEN for F code

FUNCTION GEN for A input

Clock source

Output

Carry In

Carry Out

4

4
Example of memory address decode and simple LED and switch interfacing for programmed IO (PIO) to a microprocessor.
A small computer

- **Control Unit**
- **Execution Unit + ALU**
- **Memory**
  - **Static RAM**
  - **16 kByte**
- **Register File (including PC)**
- **Data bus (8 bits)**
- **Address bus (16 bits)**
- **Data bus (8 bits)**

**UART**
- **Serial Port**
- **RS-232 Serial Connection**

**Memory Map decoder circuit**
- Often a ‘PAL’ single chip device.

**1 K Byte ROM**
- **Read Only Memory**
- **A0-9**
- **Enb**

**UART_ENABLE_BAR**

**ROM_ENABLE_BAR**

**RAM_ENABLE_BAR**

**Address bus (16 bits)**

**Clock**

**Reset**

**D0-7**
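
The job of the memory map decoder can be sketched in a few lines. The slide does not give the actual memory map, so the one used here is an assumption: the top two address bits select ROM (00), RAM (01) or UART (10), and the function name `decode` is hypothetical. The three outputs correspond to the active-low enables in the figure.

```python
def decode(address):
    """Return (rom_enable_bar, ram_enable_bar, uart_enable_bar) for a 16-bit address.

    Assumed map (not from the slide): A15:A14 = 00 selects ROM,
    01 selects RAM, 10 selects the UART, 11 selects nothing.
    All enables are active low.
    """
    top = (address >> 14) & 0b11
    rom_enable_bar = 0 if top == 0b00 else 1
    ram_enable_bar = 0 if top == 0b01 else 1
    uart_enable_bar = 0 if top == 0b10 else 1
    return rom_enable_bar, ram_enable_bar, uart_enable_bar
```

Because the 1 KByte ROM only sees A0-9, such partial decoding would make its contents repeat sixteen times across the 16 KByte region selected for it; at most one enable is ever low, so only one device drives the data bus at a time.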
Flow control: New data is not sent while the busy wire is high.
Serial Port (UART)

Flow control: New data can be sent at any time unless either:
- additional signals are used to indicate clear to send
or
- a software protocol is defined to run on top (Xon/Xoff) by reserving certain of the bytes.
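
The software option can be sketched from the sender's side. The conventional Xon and Xoff codes are the ASCII control characters DC1 (0x11) and DC3 (0x13); the class and method names below are hypothetical, and the sketch ignores escaping of payload bytes that happen to equal the reserved values.

```python
XON = 0x11   # ASCII DC1: receiver is ready again
XOFF = 0x13  # ASCII DC3: receiver asks the sender to pause


class XonXoffSender:
    """Sender that pauses when the far end sends XOFF and resumes on XON."""

    def __init__(self, payload):
        self.pending = list(payload)
        self.paused = False

    def on_receive(self, byte):
        # Called for each byte arriving on the reverse channel.
        if byte == XOFF:
            self.paused = True
        elif byte == XON:
            self.paused = False

    def next_byte(self):
        # Returns the next byte to transmit, or None if nothing may be sent.
        if self.paused or not self.pending:
            return None
        return self.pending.pop(0)
```

A typical use would call `on_receive` for every incoming byte and poll `next_byte` whenever the transmitter is idle.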

Most computers just use a 9 way connector these days.

25-Way D connector for Serial Port.
Keyboard and/or PS/2 port

PS/2 Keyboard/Mouse Cable
1. Clock
2. Ground
3. Data
4. Spare
5. Power +5Volts
6. Spare
Canonical synchronous FSM

Figure labels: Inputs (I0, I1, I2, ..., I(M-1)); Clock; state flops (Q0, Q1, Q2) with current state feedback; loop-free combinatorial logic blocks; Moore outputs; Mealy outputs.
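
The canonical structure can be sketched in software as state flops surrounded by three loop-free functions. The class name, the function arguments and the edge-detector example are all illustrative assumptions rather than part of the slide.

```python
class SynchronousFSM:
    """State flops plus loop-free logic for next state, Moore and Mealy outputs."""

    def __init__(self, next_state, moore, mealy, reset_state):
        self.next_state = next_state  # f(state, inputs) -> new state
        self.moore = moore            # f(state) -> outputs
        self.mealy = mealy            # f(state, inputs) -> outputs
        self.state = reset_state

    def outputs(self, inputs):
        # Combinational: valid between clock edges for the current inputs.
        return self.moore(self.state), self.mealy(self.state, inputs)

    def clock(self, inputs):
        # Clock edge: the state flops load the next-state value.
        self.state = self.next_state(self.state, inputs)


def run(fsm, input_sequence):
    trace = []
    for i in input_sequence:
        trace.append(fsm.outputs(i))  # sample outputs just before the edge
        fsm.clock(i)
    return trace


# Example: the state remembers the previous input; the Mealy output
# flags a rising edge in the same cycle that it occurs.
edge_detector = SynchronousFSM(
    next_state=lambda s, i: i,
    moore=lambda s: s,
    mealy=lambda s, i: int(i and not s),
    reset_state=0,
)
trace = run(edge_detector, [0, 1, 1, 0, 1])
```

The Moore output depends only on the current state, so it changes one clock after the input; the Mealy output also depends on the inputs and so responds within the same cycle.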
Timing Specifications

- Clock
- Data in
- Q output
- Setup time
- Hold time
- Propagation delay
Typical nature of a critical path

Clock

D Q

A

B

C

D

D Q

Clock

A

B

C

D

Margin

Setup

Period = 1/F
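
The labels in this figure correspond to the usual timing constraint on a path between two flip-flops. As a sketch, writing $t_{\text{prop}}$ for the clock-to-Q propagation delay of the launching flip-flop, $t_{\text{logic}}$ for the total delay through the combinational path, and $t_{\text{setup}}$ for the setup time of the receiving flip-flop (symbol names assumed here):

$$
T = \frac{1}{F} \;\ge\; t_{\text{prop}} + t_{\text{logic}} + t_{\text{setup}},
\qquad
t_{\text{margin}} = T - \left( t_{\text{prop}} + t_{\text{logic}} + t_{\text{setup}} \right)
$$

The critical path is the one with the smallest margin, and it sets the maximum clock frequency $F$.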
Johnson counters

Clock

Q1 Q2 Q3
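
The state sequence of a Johnson counter can be reproduced with a short simulation. This is a sketch assuming a shift register whose first stage is fed with the inverse of the last stage and which starts from all zeros; the function name is hypothetical.

```python
def johnson_sequence(n_bits, steps):
    """Return the first `steps` states of an n-bit Johnson counter."""
    state = [0] * n_bits
    sequence = []
    for _ in range(steps):
        sequence.append(tuple(state))
        # Shift right by one stage, feeding back the inverted last stage.
        state = [1 - state[-1]] + state[:-1]
    return sequence


three_bit = johnson_sequence(3, 7)
```

An N-stage counter cycles through 2N states, and only one output changes per clock, which makes the states easy to decode without glitches.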
Pipelining

Desired logic function

Desired logic function - pipelined version.
Cascading FSMs

Clock

Inputs

Moore Outputs
Mealy Outputs

Moore Outputs
Mealy Outputs

Moore
Mealy

Inputs

FSM

FSM

FSM
An example that uses (badly) a derived clock: a serial-to-parallel converter
A D-type with clock-enable

Clock enable

Data in

Clock

Q Output

LOGIC SYMBOL

AN EQUIVALENT CIRCUIT

Clock enable

Data in

Clock

Q Output
A Gated Clock
Clock Skew

a) A three-stage shift register with some clock skew delays.

b) System interconnection with clock skews

c) A solution for serious skew and delay problems?
Crossing an async boundary
1. The wider the bus width, N, the fewer the number of transactions per second needed and the greater the timing flexibility in reading the data from the receiving latch.

2. Make sure that the transmitter does not change the guard and the data in the same transmit clock cycle.

3. Place a second flip-flop after the receiving decision flip-flop so that on the rare occurrences when the first is metastable for a significant length of time (e.g. 1/2 a clock cycle) the second will present a good clean signal to the rest of the receiving system.
Paths between FSMs w/ derived clocks

Moore feedback to parent clock domain

Clock Input

Moore

Mealy

Feedforward of outputs to son FSM

Inputs

Moore Outputs

Mealy Outputs

Moore

Mealy

Inputs

Moore Outputs

Mealy Outputs
Dicing a wafer

(Chips are not always square)
A chip in its package, ready for bond wires

IO and power pads
### Die cost example

<table>
<thead>
<tr>
<th>Area</th>
<th>Wafer dies</th>
<th>Working dies</th>
<th>Cost per working die</th>
</tr>
</thead>
<tbody>
<tr>
<td>2</td>
<td>9000</td>
<td>8910</td>
<td>0.56</td>
</tr>
<tr>
<td>3</td>
<td>6000</td>
<td>5910</td>
<td>0.85</td>
</tr>
<tr>
<td>4</td>
<td>4500</td>
<td>4411</td>
<td>1.13</td>
</tr>
<tr>
<td>6</td>
<td>3000</td>
<td>2911</td>
<td>1.72</td>
</tr>
<tr>
<td>9</td>
<td>2000</td>
<td>1912</td>
<td>2.62</td>
</tr>
<tr>
<td>13</td>
<td>1385</td>
<td>1297</td>
<td>3.85</td>
</tr>
<tr>
<td>19</td>
<td>947</td>
<td>861</td>
<td>5.81</td>
</tr>
<tr>
<td>28</td>
<td>643</td>
<td>559</td>
<td>8.95</td>
</tr>
<tr>
<td>42</td>
<td>429</td>
<td>347</td>
<td>14.40</td>
</tr>
<tr>
<td>63</td>
<td>286</td>
<td>208</td>
<td>24.00</td>
</tr>
<tr>
<td>94</td>
<td>191</td>
<td>120</td>
<td>41.83</td>
</tr>
<tr>
<td>141</td>
<td>128</td>
<td>63</td>
<td>79.41</td>
</tr>
<tr>
<td>211</td>
<td>85</td>
<td>30</td>
<td>168.78</td>
</tr>
<tr>
<td>316</td>
<td>57</td>
<td>12</td>
<td>427.85</td>
</tr>
<tr>
<td>474</td>
<td>38</td>
<td>4</td>
<td>1416.89</td>
</tr>
</tbody>
</table>
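
The table appears to follow a simple yield model. The constants below are not stated in the slide; they are inferred from the first rows of the table (18000 area units per wafer, a wafer cost of 5000, and a 99.5% chance that each unit of area is defect free) and should be treated as assumptions.

```python
WAFER_AREA = 18000           # inferred: 9000 dies of area 2
WAFER_COST = 5000.0          # inferred: 0.56 per working die at area 2
YIELD_PER_UNIT_AREA = 0.995  # inferred: 8910 / 9000 = 0.995 ** 2


def die_cost(area):
    """Return (dies per wafer, working dies, cost per working die)."""
    dies = WAFER_AREA / area
    # A die works only if every unit of its area is defect free.
    working = dies * YIELD_PER_UNIT_AREA ** area
    return dies, working, WAFER_COST / working


for area in (2, 3, 4, 6, 9):
    dies, working, cost = die_cost(area)
    print(f"{area:4d} {dies:8.0f} {working:8.0f} {cost:10.2f}")
```

Because yield falls exponentially with area while the number of dies falls only linearly, the cost per working die rises much faster than the die area.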
A taxonomy of ICs

Integrated Circuits

- Standard Parts
  - Commodity Parts
  - Masked ASICs
    - Full Custom
    - Standard Cell
    - Gate Array
    - Semi Custom
  - General Chip Products

- Field Programmable Parts
  - FPGA
  - Array Logic (PALs)
Field Programmable Gate Arrays
A configurable logic block for a look-up-table based FPGA

General inputs

Combinatorial function generator

D Q

D Q

Clock input

First output

Programmable multiplexers

Second Output

A simple IO block FPGA

Connections to central array.

Tristate control

Programmable multiplexor

Output enable

Output buffer

Input buffer

Input

Output
The diagram shows a circuit with various input and output connections. The following points are highlighted:

- **Clock input**
- **General purpose inputs**
- **Product line**
- **Term line**
- **Output enable product line**
- **Ground pin**
- **Power supply pin**
- **Macro-cell**

The cross points in the shaded regions are programmable points. The diagram indicates the connectivity between different parts of the circuit, showing how signals flow through the various components.
Contents of the PAL macrocell
Example programming of a PAL showing only fuses for the top macrocell

```plaintext
pin 16 = o1;
pin 2 = a;
pin 3 = b;
pin 4 = c;

o1.oe = ~a;
o1 = (b & o1) | c;
```

```plaintext
x--- ----- ---- ---- ---- ----    (oe term)
--x- x--- ---- ---- ---- ----     (pin 3 and 16)
----- ----- x--- ---- ---- ----   (pin 4)

xxxx xxxx xxxx xxxx xxxx xxxx xxxx
xxxx xxxx xxxx xxxx xxxx xxxx xxxx
xxxx xxxx xxxx xxxx xxxx xxxx xxxx
xxxx xxxx xxxx xxxx xxxx xxxx xxxx
xxxx xxxx xxxx xxxx xxxx xxxx xxxx
xxxx xxxx xxxx xxxx xxxx xxxx xxxx
xxxx xxxx xxxx xxxx xxxx xxxx xxxx

x    (macrocell fuse)
```
## Delay-power style of technology comparison chart

<table>
<thead>
<tr>
<th>Technology</th>
<th>Device</th>
<th>Propagation Delay (ns)</th>
<th>Power (mW)</th>
<th>Delay-power Product (pJ)</th>
</tr>
</thead>
<tbody>
<tr>
<td>CMOS</td>
<td>74hc00</td>
<td>7 ns</td>
<td>1 mW</td>
<td>7</td>
</tr>
<tr>
<td>TTL</td>
<td>74f00</td>
<td>3.4 ns</td>
<td>5 mW</td>
<td>17</td>
</tr>
<tr>
<td>ECL</td>
<td>sp92701</td>
<td>0.8 ns</td>
<td>200 mW</td>
<td>160</td>
</tr>
</tbody>
</table>

### Diagram Details
- **ECL**: 1980
- **TTL**: 1990
- **CMOS**: 2000
- **Power per gate (mW)**
  - 100
  - 10
  - 1
- **Delay (ns)**
  - 1000
  - 100
  - 10
  - 1

**Lines of constant delay-power product**
Logic net with tracking and input load capacitances

- Logic gates with tracking and input load capacitances.
- Parasitic input capacitance proportional to total track length (area).
- Driving gates connected to substrate capacitance.
- Paralleled capacitors in the circuit.
An example cell from a manufacturer’s cell library

NAND4 Standard Cell

Library: CBG0.5um

Schematic Symbol

Simulator/HDL Call

NAND4X2(f, a, b, c, d);

Logical Function

F = NOT(a & b & c & d)

ELECTRICAL SPECIFICATION

Switching characteristics: Nominal delays (25 deg C, 5 Volt, signal rise and fall 0.5 ns)

<table>
<thead>
<tr>
<th>Inputs</th>
<th>Outputs</th>
<th>O/P Falling</th>
<th>O/P Rising</th>
</tr>
</thead>
<tbody>
<tr>
<td></td>
<td></td>
<td>(ps)</td>
<td>ps/LU</td>
</tr>
<tr>
<td>A</td>
<td>F</td>
<td>142</td>
<td>37</td>
</tr>
<tr>
<td>B</td>
<td>F</td>
<td>161</td>
<td>37</td>
</tr>
<tr>
<td>C</td>
<td>F</td>
<td>165</td>
<td>37</td>
</tr>
<tr>
<td>D</td>
<td>F</td>
<td>170</td>
<td>37</td>
</tr>
</tbody>
</table>

Min and Max delays depend upon temperature range, supply voltage, input edge speed and process spreads. The timing information is for guidance only. Accurate delays are used by the UDC.

CELL PARAMETERS

(One load unit = 49 fF)

<table>
<thead>
<tr>
<th>Parameters</th>
<th>Pin</th>
<th>Value</th>
<th>Units</th>
</tr>
</thead>
<tbody>
<tr>
<td>Input loading</td>
<td>a</td>
<td>2.1</td>
<td></td>
</tr>
<tr>
<td></td>
<td>b</td>
<td>2.1</td>
<td></td>
</tr>
<tr>
<td></td>
<td>c</td>
<td>2.1</td>
<td></td>
</tr>
<tr>
<td></td>
<td>d</td>
<td>2.0</td>
<td>Load units</td>
</tr>
<tr>
<td>Drive capability</td>
<td>f</td>
<td>35</td>
<td>Load units</td>
</tr>
</tbody>
</table>
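
A data sheet of this kind is typically used to estimate a gate delay from its fan-out. The sketch below assumes that the two numeric columns of the switching table are an intrinsic delay in ps and a load-dependent slope in ps per load unit; the extracted table headers are ambiguous, so this reading, the function name and the example fan-out are all assumptions.

```python
LOAD_UNIT_FF = 49  # from the data sheet: one load unit = 49 fF


def gate_delay_ps(intrinsic_ps, slope_ps_per_lu, load_lu, max_drive_lu=35):
    """Estimate delay as intrinsic delay plus slope times output load."""
    if load_lu > max_drive_lu:
        raise ValueError("load exceeds the cell's drive capability")
    return intrinsic_ps + slope_ps_per_lu * load_lu


# Example: input A to output F, driving three inputs of 2.1 load units each.
fanout_load_lu = 3 * 2.1
delay_ps = gate_delay_ps(142, 37, fanout_load_lu)
load_ff = fanout_load_lu * LOAD_UNIT_FF
```

Track capacitance would add further load units in a real design, so an estimate of this kind is a lower bound until layout is known.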
Comparative view of digital logic technologies

<table>
<thead>
<tr>
<th>Technology</th>
<th>Maximum clock speed</th>
<th>Maximum gate count</th>
<th>Maximum I/Os</th>
</tr>
</thead>
<tbody>
<tr>
<td>GaAs bipolar</td>
<td>100 GHz</td>
<td>500</td>
<td>30</td>
</tr>
<tr>
<td>GaAs fet</td>
<td>30 GHz</td>
<td>10K</td>
<td>300</td>
</tr>
<tr>
<td>Si ECL</td>
<td>10 GHz</td>
<td>10M</td>
<td>500</td>
</tr>
<tr>
<td>Si CMOS</td>
<td>8 GHz</td>
<td>50M</td>
<td>1000</td>
</tr>
</tbody>
</table>

Within Si CMOS

<table>
<thead>
<tr>
<th>Technology</th>
<th>Maximum clock speed</th>
<th>Maximum gate count</th>
<th>Maximum I/Os</th>
</tr>
</thead>
<tbody>
<tr>
<td>Full-custom</td>
<td>8 GHz</td>
<td>50M</td>
<td>1000</td>
</tr>
<tr>
<td>Standard cell</td>
<td>4 GHz</td>
<td>25M</td>
<td>1000</td>
</tr>
<tr>
<td>Gate array</td>
<td>2 GHz</td>
<td>5M</td>
<td>1000</td>
</tr>
<tr>
<td>FPGA</td>
<td>175 MHz</td>
<td>500K</td>
<td>800</td>
</tr>
<tr>
<td>CPLD</td>
<td>150 MHz</td>
<td>10K</td>
<td>200</td>
</tr>
<tr>
<td>PAL</td>
<td>200 MHz</td>
<td>500</td>
<td>60</td>
</tr>
</tbody>
</table>
Addition of two integers serially, l.s.b first

let Y = a * 5 in ...

Bit-serial multiplication of an integer by a hardwired constant

let y = a*5 in ...
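
Both bit-serial operations can be sketched in software, one loop iteration per clock. The helper names are hypothetical; the multiplier uses the identity a*5 = a + 4a, where 4a is simply the input stream delayed by two bit times, and results are truncated to the word width.

```python
def to_bits_lsb_first(value, width):
    return [(value >> i) & 1 for i in range(width)]


def from_bits_lsb_first(bits):
    return sum(bit << i for i, bit in enumerate(bits))


def serial_add(a_bits, b_bits):
    """One full adder plus a carry flip-flop, fed one bit pair per clock."""
    carry = 0
    out = []
    for a, b in zip(a_bits, b_bits):
        total = a + b + carry
        out.append(total & 1)   # sum bit leaves this cycle
        carry = total >> 1      # carry is held for the next cycle
    return out


def serial_times5(a_bits):
    """a*5 = a + 4*a, with 4*a formed by a two-stage delay of the input."""
    delayed = [0, 0] + a_bits[:-2]
    return serial_add(a_bits, delayed)


product = from_bits_lsb_first(serial_times5(to_bits_lsb_first(13, 8)))
```

The hardware cost is one full adder and three flip-flops regardless of word length, at the price of one clock cycle per bit.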

Design partitioning: The Cambridge Fast Ring

- DRAM (Standard Part)
- ECL CHIP
- Isolating transformers
- Host Bus
- Standard data buffers
- Address PAL
- Interrupt PAL
- CMOS CHIP
  - 12.5 MHz
  - 100 MHz
  - VCO (analogue)
  - Ring Connector
Design partitioning: An external modem
Design partitioning: A Miniature Radio Module

Multi-chip module or mini PCB

FLASH memory chip
RAM Microcontroller
Line drivers
Baseband Modem
Digital Integrated Circuit
ADC
DAC
Hop Controller

www.bluetooth.org
www.csr.com

Analog (RF) Integrated Circuit
Carrier Oscillator 2.4 GHz
IF Amps
RF Amps
Antenna

Data Interfaces
A Microcontroller

- Microprocessor (8 bit generally)
- RAM (e.g. 2 Kbytes)
- OTP EPROM (e.g. 8 Kbytes)
- Counters and Timers
- Programmable IO
- UART
- Clock
- Reset capacitor
- Power Up reset
- Internal A and D busses
- I/O wires OR external bus
- Serial TX and RX
LEDs wired in a matrix to reduce external pin count
IR Handset Internal Circuit

Scan multiplexed keyboard

Clock capacitor

Single chip containing all semiconductors

Infra-red transmit diodes

Battery
Scan multiplex logic for an LED pixel-mapped display

- **CLOCK**
- **N bit COUNTER**
- **Pixel RAM**
- **Binary to Unary Decoder**
- **Scan Multiplexed Display Matrix**

One col line is logic one at a time.

2^N col lines

Data lines (zero for on)
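
One scan of the display can be sketched as follows. The function names and the representation of the pixel RAM as a list of rows are assumptions; the decoder drives exactly one column line high per count, and a data bit of zero lights the corresponding LED, as noted above.

```python
def binary_to_unary(value, n_bits):
    """One-hot decode: exactly one of 2**n_bits column lines is logic one."""
    return [1 if i == value else 0 for i in range(2 ** n_bits)]


def scan_frame(pixel_ram, n_bits):
    """Step the counter through every column once.

    pixel_ram[count] holds the data-line pattern for column `count`.
    Returns, per column, the list of data-line indices whose LEDs are lit.
    """
    lit = []
    for count in range(2 ** n_bits):
        col_lines = binary_to_unary(count, n_bits)
        data_lines = pixel_ram[count]
        active_col = col_lines.index(1)
        # An LED is on where its data line is zero in the active column.
        lit.append((active_col, [r for r, bit in enumerate(data_lines) if bit == 0]))
    return lit


frame = scan_frame([[0, 1], [1, 1], [1, 0], [0, 0]], n_bits=2)
```

If the counter runs fast enough, persistence of vision makes all columns appear lit at once.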
Addition of pseudo dual-porting logic

Write address

Write strobe bar

Write data

Broadside tri-state buffer

Pixel RAM

MUX2

N bit COUNTER

BINARY to UNARY DECODER

SCAN MULTIPLEXED DISPLAY MATRIX

Use of a ROM as a function look-up table

A to D convertor

Look-up table ROM

D to A convertor

65536 by 16 ROM

Sample clock 44.1 kHz

12 inch speakers

Amplifier

A

16

D

16
Use of an SRAM to make the delay required for an echo unit

- A to D convertor
- D to A convertor
- Amplifier
- Static RAM 65536 by 16 bits
- Synchronous counter
- Timing generator circuit
- Derived clock, 44.1 kHz
- 88.2 kHz
- 44.1 kHz
- Read cycle
- Write cycle
- Old sample replay
- New sample write
- Counter Output
- N-1
- N
- N+1
- RAM data pins
- ADOE
- RAMWE
- RAMOE
- Clock 88.2
- Clock 44.1
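
The read-then-write pattern in the timing diagram can be sketched as a circular buffer. The class name is hypothetical and the RAM is assumed to start zeroed; each call models one 44.1 kHz sample period containing a read cycle followed by a write cycle to the same address.

```python
class EchoDelay:
    """Delay line: each sample period replays the old word, then overwrites it."""

    def __init__(self, depth=65536):
        self.ram = [0] * depth
        self.counter = 0

    def sample(self, new_sample):
        old = self.ram[self.counter]          # read cycle: old sample replay
        self.ram[self.counter] = new_sample   # write cycle: new sample write
        self.counter = (self.counter + 1) % len(self.ram)  # counter advances
        return old


SAMPLE_RATE_HZ = 44100
delay_seconds = 65536 / SAMPLE_RATE_HZ  # a full-depth delay is about 1.49 s

short = EchoDelay(depth=4)
outputs = [short.sample(x) for x in [1, 2, 3, 4, 5, 6]]
```

Each sample reappears exactly `depth` sample periods after it was written, which is why the RAM needs two cycles per sample and hence the 88.2 kHz clock.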
Merge unit block diagram

LOGIC 1
LOGIC 0

Bit spacing is reciprocal of 31.25 kbaud, which is 32 microseconds.

MIDI serial data format

9n kk vv  (note on)
8n kk vv  (note off)
9n kk 00  (note off with zero velocity)
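
Under MIDI running status, a status byte (top bit set) may be omitted when it repeats the previous one. A minimal sketch of expanding such a stream into complete three-byte messages is given below; the function name is hypothetical, and only messages with two data bytes, such as the note on and note off messages above, are handled.

```python
def remove_running_status(byte_stream):
    """Expand a MIDI byte stream into complete (status, kk, vv) messages."""
    messages = []
    status = None
    data = []
    for byte in byte_stream:
        if byte & 0x80:
            # Status bytes have the top bit set; remember it for later pairs.
            status = byte
            data = []
        elif status is not None:
            data.append(byte)
            if len(data) == 2:
                messages.append((status, data[0], data[1]))
                data = []
    return messages


stream = [0x90, 60, 100, 62, 100, 60, 0, 0x80, 62, 64]
messages = remove_running_status(stream)
```

Each resulting tuple corresponds to one 24-bit word, so messages from different inputs can be interleaved without ambiguity about which status applies.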
MIDI merge unit internal functional units

Midi In 0
Serial to par
Remove status
FIFO Queue
Merge core function

Midi In 1
Serial to par
Remove status
Queue

Par to serial
Insert running status
Queue

Merged midi output
The serial to parallel converter:

```verilog
input clk;
output [7:0] pardata; output guard;
```

The running status remover:

```verilog
input clk;
input guard_in; input [7:0] pardata_in;
output guard_out; output [23:0] pardata_out;
```

For the FIFOs:

```verilog
input clk;
input guard_in; input [23:0] pardata_in;  // 24-bit messages from the running status remover
input read; output guard_out; output [23:0] pardata_out;
```

For the merge core unit:

```verilog
input clk;
input guard_in0; input [23:0] pardata_in0; output read0;
input guard_in1; input [23:0] pardata_in1; output read1;
output guard_out; output [23:0] pardata_out;
```

The running status inserter and the parallel-to-serial converter are the reverse of their counterpart units on the input side.