The **time/space fold** and **unfold** operations trade execution time
for silcon area. A given function can be computed with fewer clocks by
'unfolding' in the the time domain, typically by loop unwinding
(and predication).

LOOPED (time) option: | UNWOUND (space) option: | for (i=0; i < 3 && i < limit; i++) | if (0 < limit) sum += data[0] * coef[j]; sum += data[i] * coef[i+j]; | if (1 < limit) sum += data[1] * coef[1+j]; | if (2 < limit) sum += data[2] * coef[2+j];

Sharing structural resources may require additional multiplexers and wiring: so not always worth it.

A good design not only balances structural resource use between clock cycles, but also timing delays.

We can **retime** a design with and without changing its state encoding.
Adding a pipeline stage can increase the amount of state without **recoding**
existing state.

36: (C) 2008-11, DJ Greaves, University of Cambridge, Computer Laboratory.