We have several simple ARM-based prototyping boards (ARM PIE), each
with a closely coupled software programmable gate array board based
upon a XiLinx chip. To make a multiprocessor from these boards, a
communications unit needs to be designed for implementation on the
XiLinx chip.
Previous Work
In 1995/96 Gerald Cheung tackled this project and resolved the
low-level inter-board communications mechanism and XiLinx-to-ARM link.
However, much work remains to extend this system, e.g. designing a
router.
Notes on Requirements
Since XiLinx chips are comparatively slow, it would be impractical
to support high-speed serial communications. However, since
communication distances can be kept short, some degree of parallel
data transfer is practical.
The ARM PIE boards will operate from independent clocks.
Consequently, some form of asynchronous data transfer is required. If
a full custom design were sought then asynchronous
FIFO buffers, with carefully matched delays, could be used. However,
since it is difficult to control routing delays on XiLinx chips, a
fully delay insensitive implementation is recommended (see originator
for details). For example, a 3 of 6 encoding could be used: exactly 3
of the 6 wires go high for a valid symbol, with fewer high wires
indicating incomplete data and more indicating an error. This gives
20 possible combinations, of which 16 could be used for data and 4
for signalling (start and end of packet etc.).
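
The 3 of 6 scheme can be illustrated with a short Python sketch. It
enumerates the 20 codewords and classifies a wire state as incomplete,
valid or an error. The particular assignment of codewords to data
values and control symbols, the names SPARE0 and SPARE1, and all
function names are assumptions made for illustration; the proposal
does not fix any of them.

```python
from itertools import combinations

WIRES = 6   # number of parallel wires
HIGH = 3    # wires that must be high in a valid symbol

# Every 6-bit word with exactly three bits set: C(6,3) = 20 codewords.
CODEWORDS = sorted(
    sum(1 << w for w in combo) for combo in combinations(range(WIRES), HIGH)
)

# Assumed split: the first 16 codewords carry a 4-bit data value and the
# last 4 are control symbols. START and END follow the text; the two
# spare symbols are hypothetical.
DATA_CODES = CODEWORDS[:16]
CONTROL_CODES = dict(zip(("START", "END", "SPARE0", "SPARE1"), CODEWORDS[16:]))
CONTROL_NAMES = {code: name for name, code in CONTROL_CODES.items()}


def encode_nibble(value):
    """Return the 6-wire pattern for a 4-bit data value."""
    if not 0 <= value < 16:
        raise ValueError("data value must fit in 4 bits")
    return DATA_CODES[value]


def classify(wires):
    """Classify a wire state by counting how many wires are high."""
    high = bin(wires & 0x3F).count("1")
    if high < HIGH:
        return "incomplete"   # still waiting for more wires to rise
    if high > HIGH:
        return "error"        # too many wires high
    return "valid"


def decode(wires):
    """Decode a valid pattern into ("data", value) or ("control", name)."""
    if classify(wires) != "valid":
        raise ValueError("pattern is not a complete 3 of 6 codeword")
    if wires in CONTROL_NAMES:
        return ("control", CONTROL_NAMES[wires])
    return ("data", DATA_CODES.index(wires))
```

Because a receiver only has to count high wires, it can tell when a
symbol has fully arrived without relying on any timing relationship
between the wires, which is the property the text is after.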
There are several communication methods which could be deployed,
and hence there is more than one possible project here. For example:
switch based - offers space division multiplexing. It would
probably have four ports (north, south, east, west), allowing a grid
surface to be constructed with wraparound at the edges. A simple
relative routing scheme could be used based upon an (x,y) offset -
positive x indicating `go north', negative x `go south' etc.
ring based - offers time division multiplexing. Routing
could also be performed in a relative manner using a bit-stream - one
bit per station on the ring (we don't expect to have many). Whilst
simplistic, this scheme does allow multicast, including the choice of
whether a message is returned to the producer for checking. A
conventional slotted ring approach could be taken or, alternatively,
one could take advantage of the elastic nature of self-timed pipeline
structures to allow the number of messages in the ring to vary.
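
Both relative routing schemes can be sketched in a few lines of
Python. This is only an illustration under stated assumptions: the x
offset is resolved before the y offset, positive y is taken to mean
east and negative y west (the text only fixes the meaning of x), the
ring header is a list with one bit per station where the first bit
belongs to the next station downstream, and a ring message is removed
once no further station is addressed. All function names are
hypothetical.

```python
def switch_step(dx, dy):
    """One hop in the switch: pick an output port and update the offset.

    The offset shrinks toward (0, 0); at (0, 0) the message is delivered
    to the local ARM. Wraparound at the grid edges is a property of the
    wiring, so the switch itself needs no knowledge of the grid size.
    """
    if dx > 0:
        return "north", dx - 1, dy
    if dx < 0:
        return "south", dx + 1, dy
    if dy > 0:
        return "east", dx, dy - 1    # assumed direction for positive y
    if dy < 0:
        return "west", dx, dy + 1    # assumed direction for negative y
    return "local", 0, 0


def route(dx, dy):
    """Return the sequence of ports a message with offset (dx, dy) takes."""
    ports = []
    while True:
        port, dx, dy = switch_step(dx, dy)
        ports.append(port)
        if port == "local":
            return ports


def ring_station(header):
    """One ring station: consume the leading bit and forward the rest.

    Returns (accept, remaining_header, forward). A station takes a copy
    when its bit is 1 and forwards while any later bit is still set.
    """
    accept = header[0] == 1
    rest = header[1:]
    return accept, rest, any(rest)


def ring_deliver(header):
    """Return the hop counts (1 = next station) at which copies are taken."""
    taken = []
    hop = 0
    forward = any(header)
    while forward:
        hop += 1
        accept, header, forward = ring_station(header)
        if accept:
            taken.append(hop)
    return taken
```

With these assumptions, route(2, -1) visits the north port twice, then
west, then delivers locally. On a four-station ring, the header
[0, 1, 0, 1] multicasts to the station two hops away and also sets the
fourth bit, which brings the message back to the producer for checking.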