Compiling RTL to gates involves three main steps:

- convert to a pure RTL form where each variable is assigned only once,
- convert each assignment of a vector variable to a list of assignments to individual bits,
- convert the right-hand side of each bit assignment to a network of gates.

Firstly, to convert to 'pure RTL': each register needs exactly one hardware circuit driving its input, however many times it is assigned in the source. We therefore build a multiplexor expression that ranges over all of its sources and is controlled by the conditions under which each assignment occurs. In other words, for each clock domain we need a list of pairs of the form (register name, value to be assigned on the clock edge).

The conversion to pure RTL comes in two varieties, depending on whether non-blocking signal assignment or normal (blocking) variable assignment is used. The difference is simply whether variables occurring on the right-hand side of an expression must be looked up in the list of already-assigned variables: blocking assignment needs the lookup, non-blocking assignment does not. The two techniques can be mixed when both forms of assignment are present.

»Conversion to 'pure RTL' list form, ML fragment

Secondly, for each register that is more than one bit wide, we generate a separate assignment for each bit. This is colloquially known as 'bit blasting'. Logic and sub-expressions can be shared between variables and between the bit lanes of a given variable. This stage removes arithmetic operators and leaves only boolean operators.

»Conversion to Bit Blasted Form, ML fragment
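To make bit blasting concrete, here is a small Python sketch (again, not the linked ML fragment) that expands a vector addition into one boolean expression per bit, using a ripple-carry structure. The ripple-carry choice, the net naming scheme `a[0]`, `a[1]`, ... and the function names are assumptions for this example. A small evaluator is included only so that the result can be checked.

```python
def bit_blast_add(a, b, width):
    """Return per-bit boolean expressions (LSB first) for the vector sum a + b."""
    bits, carry = [], ("const", 0)
    for i in range(width):
        ai, bi = ("var", f"{a}[{i}]"), ("var", f"{b}[{i}]")
        half = ("xor", ai, bi)               # shared by the sum and carry logic
        bits.append(("xor", half, carry))
        carry = ("or", ("and", ai, bi), ("and", half, carry))
    return bits

def eval_bool(expr, env):
    """Evaluate a boolean expression tree given 0/1 values for its variables."""
    op = expr[0]
    if op == "var":
        return env[expr[1]]
    if op == "const":
        return expr[1]
    x, y = (eval_bool(arg, env) for arg in expr[1:])
    return {"and": x & y, "or": x | y, "xor": x ^ y}[op]

def add_via_bits(x, y, width):
    """Check helper: evaluate the blasted adder on integer inputs x and y."""
    env = {f"a[{i}]": (x >> i) & 1 for i in range(width)}
    env.update({f"b[{i}]": (y >> i) & 1 for i in range(width)})
    bits = bit_blast_add("a", "b", width)
    return sum(eval_bool(bit, env) << i for i, bit in enumerate(bits))
```

Only `and`, `or` and `xor` remain after this step. The `half` term and the carry chain are single Python objects referenced from several places, which mirrors the sharing of sub-expressions between bit lanes.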

Thirdly, we produce gate-level circuits for each of the expression trees with a gate builder function that recurses to the leaves of the expression and emits the gates, returning their output net name as it returns up the stack.

»gatebuilder, ML fragment
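A gate builder of this kind can be sketched in a few lines of Python. This is not the linked ML fragment: the tuple encoding of expressions, the net names `n0`, `n1`, ..., the constant nets `VDD` and `GND`, and the `memo` table for sharing identical sub-expressions are all assumptions made for illustration.

```python
def build_gates(expr, netlist, memo=None):
    """Emit gates for expr into netlist and return the net carrying its value."""
    memo = {} if memo is None else memo
    if expr[0] == "var":
        return expr[1]                       # leaf: an existing net
    if expr[0] == "const":
        return "VDD" if expr[1] else "GND"   # leaf: a constant net
    if expr in memo:
        return memo[expr]                    # reuse a gate already emitted
    inputs = [build_gates(arg, netlist, memo) for arg in expr[1:]]
    out = f"n{len(netlist)}"                 # fresh net name for this gate's output
    netlist.append((expr[0].upper(), out, inputs))
    memo[expr] = out
    return out

# Carry output of a full adder: (a & b) | ((a ^ b) & cin)
a, b, cin = ("var", "a"), ("var", "b"), ("var", "cin")
carry = ("or", ("and", a, b), ("and", ("xor", a, b), cin))
netlist, memo = [], {}
carry_net = build_gates(carry, netlist, memo)
```

Each entry in `netlist` is a (gate type, output net, input nets) triple. Passing the same `memo` to later calls lets other expression trees reuse gates that have already been emitted.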

Assignments to arrays are slightly more problematic.

The **name alias problem** is that at compile time we might not be able to determine whether a pair of subscripts will be equal at run time, and hence, for blocking variable assignment, we cannot always resolve the lookup statically. A further problem is that the restricted number of ports on the memory holding the array leads to hazards that may require the design to be re-timed.
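One way to picture the alias problem is to consider a read that follows earlier blocking writes to the same array. The Python sketch below uses a hypothetical tuple encoding and function name, and assumes that the variables in the subscript expressions are not themselves reassigned between the write and the read. It resolves the lookup at compile time when it can, and otherwise falls back to a run-time comparison of the subscripts.

```python
def array_read(name, index, pending_writes):
    """Expression for name[index] after earlier blocking writes [(idx, value), ...]."""
    result = ("read", name, index)           # default: value still held in the array
    for w_idx, w_val in pending_writes:      # oldest first, so later writes win
        if w_idx == index:
            result = w_val                   # subscripts known to be equal
        elif w_idx[0] == "const" and index[0] == "const":
            pass                             # known to differ: write cannot alias
        else:
            # Cannot decide at compile time: compare the subscripts at run time.
            result = ("mux", ("eq", w_idx, index), w_val, result)
    return result

# A[i] = x; y = A[j];
y = array_read("A", ("var", "j"), [(("var", "i"), ("var", "x"))])
```

Here `y` becomes a multiplexor between `x` and the stored `A[j]`, selected by the run-time test `i == j`. With constant subscripts the same function returns either the written value or a plain read, with no comparison hardware.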

Of course, this is a simplified approach to logic synthesis, and real tools must consider sub-expression sharing and replication depending on whether they are aiming for speed, area, power or some composite performance goal. Also, not all arithmetic units should be converted to gates: it is often better to implement them by instantiating special-purpose components. If these components are not fully pipelined then we get further hazards.

(C) 2008-10, DJ Greaves, University of Cambridge, Computer Laboratory.