Generally we have to choose between high performance and low power.
(We shall see this at the gate level later on).
The time/space fold and unfold operations trade execution time for silicon area. A given function can be computed in fewer clock cycles by 'unfolding' it in the time domain, typically by loop unwinding (and predication).
LOOPED (time) option:                  | UNWOUND (space) option:
                                       |
for (i=0; i < 3 && i < limit; i++)     |  if (0 < limit) sum += data[0] * coef[j];
   sum += data[i] * coef[i+j];         |  if (1 < limit) sum += data[1] * coef[1+j];
                                       |  if (2 < limit) sum += data[2] * coef[2+j];

The '+=' operator is an associative reduction operator. When the only interactions between loop iterations are outputs via such an operator, the loop iterations can be executed in parallel.
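The two options above can be written as compilable C to check that they agree. This is only a sketch: the function names `mac_looped` and `mac_unwound` are hypothetical, integer data is assumed (integer addition is associative, whereas floating-point addition is not strictly so), and the unwound version stands in for three multipliers working side by side in hardware.

```c
/* Looped (time) form: one multiply-accumulate per iteration,
 * so up to three iterations are needed. */
int mac_looped(const int *data, const int *coef, int j, int limit)
{
    int sum = 0;
    for (int i = 0; i < 3 && i < limit; i++)
        sum += data[i] * coef[i + j];
    return sum;
}

/* Unwound (space) form: three predicated products that do not
 * depend on one another, so they could all be formed at once.
 * The predicate (k < limit) replaces the loop exit test. */
int mac_unwound(const int *data, const int *coef, int j, int limit)
{
    int p0 = (0 < limit) ? data[0] * coef[0 + j] : 0;
    int p1 = (1 < limit) ? data[1] * coef[1 + j] : 0;
    int p2 = (2 < limit) ? data[2] * coef[2 + j] : 0;

    /* Because + is associative, the partial products may be
     * combined in any grouping, for example as an adder tree. */
    return (p0 + p1) + p2;
}
```

Calling both functions with the same arguments should return the same sum for any value of `limit`; the unwound form uses three multipliers where the looped form reuses one.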
If one iteration stores to a variable that is read by the next iteration, or that affects the loop exit condition, then the possibilities for unwinding are reduced.
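Two small loops may help illustrate these restrictions. They are a sketch only, and the function names `recurrence` and `count_until_threshold` are hypothetical. In the first, each iteration reads the value stored by the previous one; in the second, the exit condition depends on a value computed inside the loop.

```c
/* Loop-carried dependency: acc written in iteration i is read in
 * iteration i+1, and the update is not a plain += reduction, so
 * the iterations cannot simply be run side by side. */
int recurrence(const int *data, int n, int a)
{
    int acc = 0;
    for (int i = 0; i < n; i++)
        acc = acc * a + data[i];
    return acc;
}

/* Data-dependent exit: whether iteration i+1 runs at all depends
 * on the running sum produced by iteration i. */
int count_until_threshold(const int *data, int n, int threshold)
{
    int sum = 0;
    int i = 0;
    while (i < n && sum < threshold) {
        sum += data[i];
        i++;
    }
    return i;   /* number of elements consumed */
}
```

In both cases a naive unwinding would have to wait for the previous iteration's result, so little or no time is saved unless the computation is restructured.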
We can retime a design with or without changing its state encoding.
Adding a pipeline stage can increase the amount of state without recoding existing state.
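As a rough software model of this point, the sketch below compares a multiply-accumulate unit clocked once per input with a version that has one extra pipeline register. All names here (`mac1_state`, `mac2_state`, `mac1_clock`, `mac2_clock`) are hypothetical, and each function call is assumed to stand for one clock edge.

```c
/* Unpipelined: multiply and add both happen within one clock. */
typedef struct { int acc; } mac1_state;

void mac1_clock(mac1_state *s, int a, int b)
{
    s->acc += a * b;
}

/* Pipelined: acc keeps its original meaning and encoding; prod is
 * the extra state introduced by the new pipeline stage. */
typedef struct { int acc; int prod; } mac2_state;

void mac2_clock(mac2_state *s, int a, int b)
{
    s->acc += s->prod;   /* stage 2: add the product from last clock */
    s->prod = a * b;     /* stage 1: form this clock's product */
}
```

Under these assumptions the pipelined unit produces the same accumulated result one clock later, so a final clock with zero inputs is needed to flush the last product into `acc`. The critical path per clock is now either the multiplier or the adder, not both in series.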