
Folding, Retiming & Recoding

Generally we have to choose between high performance and low power.

(We can see this also in the selection of drive strengths for standard cell gates.)

The time/space fold and unfold operations trade execution time for silicon area. A given function can be computed in fewer clock cycles by `unfolding' in the time domain, typically by loop unwinding (and predication).

  LOOPED (time) option:                | UNWOUND (space) option: 
                                       |
  for (i=0; i < 3 && i < limit; i++)   |  if (0 < limit) sum += data[0] * coef[j];
     sum += data[i] * coef[i+j];       |  if (1 < limit) sum += data[1] * coef[1+j];
                                       |  if (2 < limit) sum += data[2] * coef[2+j];

The `+=' operator is an associative reduction operator. When the only interactions between loop iterations are outputs via such an operator, the loop iterations can be executed in parallel.
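As a minimal sketch of the two options above, the following C functions compute the same result in looped and in unwound form. The function names, the `int` element type and the zero initial value of the sum are assumptions made for illustration; they are not part of the original notes. The unwound form keeps the three predicated products independent, so hardware could form them in parallel and combine them with an adder tree.

```c
/* Looped (time) form: one multiply-accumulate per iteration. */
int mac_looped(const int *data, const int *coef, int j, int limit)
{
    int sum = 0;
    for (int i = 0; i < 3 && i < limit; i++)
        sum += data[i] * coef[i + j];
    return sum;
}

/* Unwound (space) form: three predicated products with no dependency
 * between them; only the final additions combine their results. */
int mac_unwound(const int *data, const int *coef, int j, int limit)
{
    int p0 = (0 < limit) ? data[0] * coef[0 + j] : 0;
    int p1 = (1 < limit) ? data[1] * coef[1 + j] : 0;
    int p2 = (2 < limit) ? data[2] * coef[2 + j] : 0;
    return (p0 + p1) + p2;   /* regrouping is safe because + is associative */
}
```

The regrouping relies on integer addition being associative (in the absence of overflow). Floating-point addition is not strictly associative, so a tool may need permission to reorder it.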

If one iteration stores to a variable that is read by the next iteration, or that affects the loop exit condition, then the unwinding possibilities are reduced.
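Two hypothetical loops illustrate these restrictions; the function names and the choice of recurrence are assumptions for illustration only. In the first, each iteration needs the value stored by the previous one. In the second, the running sum decides whether the loop continues.

```c
/* Loop-carried dependency: iteration i reads the acc written by
 * iteration i-1, so the multiplies cannot all start at once. */
int recurrence(const int *data, int n, int a)
{
    int acc = 0;
    for (int i = 0; i < n; i++)
        acc = acc * a + data[i];
    return acc;
}

/* Exit condition depends on a stored variable: the trip count is not
 * known until the running sum has been computed. */
int count_until_threshold(const int *data, int n, int threshold)
{
    int sum = 0;
    int i = 0;
    while (i < n && sum < threshold) {
        sum += data[i];
        i++;
    }
    return i;   /* number of elements consumed */
}
```

Such loops can still be unwound, but the copies then form a serial chain: the combinational path gets longer instead of the work being done in parallel.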

We can retime a design with or without changing its state encoding.

Adding a pipeline stage can increase the amount of state without recoding existing state.
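The effect of an added pipeline stage can be sketched with a small cycle-level model in C, where each call to a step function stands for one clock edge. The multiply-add datapath, the struct layouts and all names here are hypothetical, chosen only to illustrate the point. The pipelined version gains two new registers (`prod` and `c_d`), while the existing output register `y` keeps its meaning and encoding; its values simply appear one clock later.

```c
/* Unpipelined: multiply and add share one combinational path
 * in front of the output register y. */
typedef struct { int y; } flat_t;

void flat_step(flat_t *s, int a, int b, int c)
{
    s->y = a * b + c;
}

/* Pipelined: prod and c_d are additional state that splits the path. */
typedef struct { int prod; int c_d; int y; } piped_t;

void piped_step(piped_t *s, int a, int b, int c)
{
    /* Read the old register values first, to mimic all registers
     * updating simultaneously on the clock edge. */
    s->y = s->prod + s->c_d;   /* stage 2: add, using last cycle's values */
    s->prod = a * b;           /* stage 1: multiply */
    s->c_d = c;                /* delay c so it stays aligned with prod */
}
```

Fed the same input sequence, `piped_t.y` reproduces the `flat_t.y` sequence with one cycle of extra latency, but each stage now contains only a multiplier or an adder, so the critical path is shorter.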

Note: some of this material would be better presented in the HLS section of the course, now that it exists.


43: (C) 2012-17, DJ Greaves, University of Cambridge, Computer Laboratory.