Components



Next: Port Controller Firmware Up: Experiences of building ATM Previous: Introduction


 

The Switching Fabric

The Fairisle switch fabric is composed of 4 by 4 crossbar elements, each implemented on a single 6400 gate equivalent Xilinx FPGA [Xilinx92]. The largest switch so far constructed is a 16 by 16, with the fabric built as a two stage delta. More commonly, two stage 8 by 8 switches are used for experimentation.

The fabric has an 8 bit data path and is clocked at 20MHz (nominally) for a raw bandwidth per port of 160Mbit/sec. The fabric is also cell-synchronous; besides the 20MHz clock signal, a further frame start signal is distributed to all port controllers, and any pending cells at the inputs are injected a defined number of clock ticks after the frame pulse.

In common with many other designs, the fabric elements are self-routeing: each cell has routeing tags prepended which indicate the requested output (at each stage) for that cell; two bits of routeing information and one priority bit are used per stage. The arbitration units in the switch elements implement round-robin-like service: the winning input in each frame is remembered in the arbitration unit and, if contention is experienced for the next cell, the ``next'' requesting input wins (``next'' in the obvious modulo 4 sense). The format of the routeing tag is given in figure 1.

  
Figure 1: Switching fabric routeing tag per stage
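The round-robin arbitration described above can be sketched as follows; this is illustrative only (the function name and the representation of the request set are not from the paper, and a real arbiter is of course combinational logic in the Xilinx device, not software):

```python
def arbitrate(requests, last_winner):
    """Round-robin arbitration for one output of a 4x4 crossbar element.

    requests: set of input port numbers (0-3) whose head-of-line cell
              requests this output in the current frame.
    last_winner: the input that won this output in the previous
              contended frame (remembered in the arbitration unit).
    Returns the winning input: the search starts at the port after the
    previous winner, modulo 4, so a contending input cannot starve.
    """
    for offset in range(1, 5):           # try last_winner+1 .. last_winner+4
        candidate = (last_winner + offset) % 4
        if candidate in requests:
            return candidate
    return None                          # no input requested this output
```

For example, if inputs 0 and 2 both request an output whose previous winner was input 0, input 2 wins this frame.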

As the fabric or output port controller may reject a cell due to contention or finding a full output buffer, an acknowledgement signal is provided to the input. At a defined point during the cell frame, this signal can be observed and indicates whether or not the cell has succeeded in traversing the fabric and is being accepted by the output - in the case of failure, it is the responsibility of the input port to decide what to do with the cell.

Since the switching fabric used in the Fairisle switch is subject to both internal and output blocking, the switch is required to be input buffered; to minimise the effect of the blocking on throughput, the fabric is run faster than the line rate (160Mbit/sec v. 100Mbit/sec), which has the side effect of also requiring a modest amount of output buffering.

  
Figure 2: Throughput against fabric, line rate ratio

Many comparisons of input and output buffered switches [Karol87] assume an internal switching rate equal to the line rate. Given the relative complexity of input and output buffering, this seems rather restrictive for the input buffered case. Indeed, our current fabric is used well below its maximum rate; the 20MHz rate currently used is dictated by the design and implementation of the clock distribution circuitry.

Figure 2 demonstrates the effect on throughput of this speed-up for uniform random traffic; the graphs are for a 16 by 16 crossbar (where only output blocking is experienced) and for the Fairisle two stage delta; in both examples, the worst case of fifo queueing at the input is used. Hence for the 160:100 ratio present in the Fairisle switch, while the fabric port utilisation is the expected 53%, the line utilisation achievable (i.e. throughput) is approximately 87% (for this oft-used hypothetical traffic).
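The qualitative effect of the speed-up can be reproduced with a small Monte Carlo sketch. The model below is an assumption-laden simplification, not the analysis behind figure 2: it simulates a single crossbar (not the two-stage delta) with FIFO input queues, uniform random destinations, and random rather than round-robin arbitration, so its numbers will not match the figure; it merely shows that running the fabric faster than the line raises achievable line utilisation.

```python
import random

def simulate_crossbar(ports=16, speedup=1.6, offered=1.0,
                      slots=20000, seed=1):
    """Sketch of an input-buffered crossbar with FIFO input queues.

    Cells arrive at `offered` load relative to the line rate, with
    uniform random destinations; the fabric runs `speedup` times faster
    than the line. Time is simulated in fabric cell slots, so the
    per-slot arrival probability is offered/speedup. Returns the
    achieved line utilisation (throughput).
    """
    rng = random.Random(seed)
    arrival_prob = offered / speedup
    queues = [[] for _ in range(ports)]   # FIFO of destination ports
    departures = 0
    for _ in range(slots):
        # arrivals at line rate, each with a uniform random destination
        for q in queues:
            if rng.random() < arrival_prob:
                q.append(rng.randrange(ports))
        # each head-of-line cell requests its output; one winner per output
        requests = {}
        for i, q in enumerate(queues):
            if q:
                requests.setdefault(q[0], []).append(i)
        for contenders in requests.values():
            winner = rng.choice(contenders)    # random arbitration sketch
            queues[winner].pop(0)
            departures += 1
    # departures per port per fabric slot, rescaled to line cell slots
    return departures * speedup / (ports * slots)
```

Running this with speedup 1.6 against speedup 1.0 shows the improvement in carried line load that the speed-up buys for FIFO input queueing.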

The work of the AN2 [Anderson93] designers has led to a reconsideration of the use of the priority bit within the Fairisle fabric. AN2 uses an iterative bipartite matching algorithm to compute a switch schedule each cell time. A key stage is the first iteration, in which reserved slots are removed from the scheduling if any data for the relevant circuit has arrived at the input buffer; this provides the mechanism for reserving bandwidth for certain channels, while allowing that capacity to be used by other channels when not required. While within Fairisle we use a self routeing switch fabric, rather than the separate routeing and scheduling mechanisms of AN2, the same ability to provide reserved slots is being implemented using the priority bit. Non-reserved slots, and reserved slots for which no reserved traffic arrives, are allocated on the strictly round robin basis described above.
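One plausible reading of this priority scheme, stated here as an assumption since the paper does not spell out the arbitration semantics, is that requests carrying the priority (reserved) bit beat non-priority requests, with round robin breaking ties within whichever class is present:

```python
def arbitrate_with_priority(requests, last_winner, ports=4):
    """Hypothetical priority-aware round-robin arbitration sketch.

    requests: dict mapping input port number -> priority bit
              (True means the cell carries reserved traffic).
    last_winner: input that won this output previously.
    Priority requests, if any, form the candidate pool; otherwise all
    requesters compete. Round robin then picks within the pool.
    """
    high = [i for i, prio in requests.items() if prio]
    pool = high if high else list(requests)
    for offset in range(1, ports + 1):
        candidate = (last_winner + offset) % ports
        if candidate in pool:
            return candidate
    return None
```

Under this reading, a reserved slot is simply a frame in which the owning input asserts the priority bit; if its reserved traffic has not arrived, it makes no priority request and the slot falls back to ordinary round-robin allocation.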

Finally, as part of an ongoing SERC funded research project to apply formal techniques to networking problems, the fabric elements and the fabric itself have been formally verified [Curzon94] using the HOL system.

The Port Controller

Each Fairisle Port Controller provides an input port and the corresponding output port for the switching fabric. The port controller also interfaces to the transmission system. A detailed description of the hardware may be found in [Orange93]. Three versions of the hardware were produced.

FPC1s have been retired; the FPC2s are in use to provide a service network; the FPC3s are used for the majority of the ongoing research work. The successive versions reflect changes in processor availability, cost savings based on volume, and increased low level functionality.

The port controller consists of three major sections: the queue manager based around an ARM RISC processor; network buffer memory and DMA engine; and transmission system. The transmission system is discussed further below. Figure 3 shows an overview of the port controller, the lower half forming the processor section, and the upper the buffer section.

  
Figure 3: Port controller schematic

The input buffer on the card consists of 128k bytes of (35ns) static RAM arranged as 2048 cell buffers of 64 bytes each - room for the cell payload, header, fabric routeing tag, VP/VC mapping and ``next'' pointer. This buffer memory resides in the address space of the processor, enabling the processor both to manipulate the control information and to send and receive cells from locally executing tasks (e.g. signalling cells). Cells can be injected into the input buffer both from the transmission line and, via a loop-back fifo, from the output section of the fabric - this enables ports on the same switch to communicate with each other easily.
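A possible layout for one such cell buffer is sketched below. The individual field sizes are not given in the text and are purely illustrative assumptions; only the 64-byte buffer size and the 2048-buffer count come from the description above.

```python
# Hypothetical layout of one 64-byte Fairisle cell buffer.
# Field sizes are assumptions for illustration, NOT from the paper.
CELL_BUFFER_SIZE = 64
LAYOUT = [
    ("payload",    48),  # ATM cell payload
    ("header",      4),  # ATM header (HEC handled in the transmission system)
    ("fabric_tag",  2),  # per-stage routeing tags and priority bits
    ("vpvc_map",    4),  # outgoing VP/VC after translation
    ("next_ptr",    2),  # index of the next cell buffer in a queue
    ("reserved",    4),  # pad to 64 bytes
]

# the assumed fields must exactly fill one buffer
assert sum(size for _, size in LAYOUT) == CELL_BUFFER_SIZE

# 128k bytes of SRAM divided into 64-byte buffers gives the 2048 stated
BUFFER_COUNT = 128 * 1024 // CELL_BUFFER_SIZE
```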

The processing section of the port controller has an ARM processor and runs the Wanda kernel [Dixon92] to provide an environment in which to implement services such as switch and network management. The standard IO bus for the ARM chip-set also allows other services to be provided; for example, a port controller with an Ethernet interface has been used as an Ethernet / ATM IP router.

The software responsible for cell queue management executes at a high interrupt priority (the so-called FIQ) and effectively runs asynchronously with respect to the Wanda kernel. It interacts with the static RAM DMA engine (also implemented using a Xilinx device) both to enqueue cells arriving from the transmission system and to dequeue cells for injection into the switch fabric.
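Given that each cell buffer carries a ``next'' pointer, the enqueue and dequeue operations plausibly maintain singly linked FIFO lists of buffer indices, as sketched below; this is an assumed detail for illustration, since the paper does not describe the queue structures:

```python
class CellQueue:
    """Sketch of FIFO queueing over fixed cell buffers, assuming the
    ``next'' pointer field links buffers into a list (hypothetical)."""

    def __init__(self):
        self.head = None      # buffer index of the oldest queued cell
        self.tail = None      # buffer index of the newest queued cell
        self.next_ptr = {}    # stands in for the next-pointer field in SRAM

    def enqueue(self, buf):
        """Append buffer `buf` to the tail of the queue."""
        self.next_ptr[buf] = None
        if self.tail is None:
            self.head = buf            # queue was empty
        else:
            self.next_ptr[self.tail] = buf
        self.tail = buf

    def dequeue(self):
        """Remove and return the head buffer, or None if the queue is empty."""
        buf = self.head
        if buf is not None:
            self.head = self.next_ptr.pop(buf)
            if self.head is None:
                self.tail = None       # queue is now empty
        return buf
```

The FIQ handler would perform the equivalent of `enqueue` as cells arrive from the transmission system and `dequeue` when handing cells to the DMA engine for injection into the fabric.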

Each port controller is a double height extended depth euro-card; a backplane connects port controllers to their associated input and output ports on the switch fabric and provides the clock and frame sync distribution from a master clock. A complete switch of 16 ports, composed of a 16 by 16 fabric, master clock board and Ethernet interface fits into a standard 19" 6U subrack (just).

Transmission System

The transmission system is based on the ATM Forum de-facto standard using AMD TAXI components at 100Mbit/sec. The majority of the physical media in use is coax - fibre is used only for long runs greater than several hundred metres, as the coax has proved very reliable for short runs. A Xilinx chip performs the framing and HEC calculation, and small fifos are used to decouple the transmission derived clocks from the internal clock domain.

The transmission system was initially implemented as a plug-in card on the first two prototype versions of the port controller, since we intended to move to SONET, which was always going to be available ``real soon now''; issue three port controllers include the TAXI transmission system on the main board.

The transmission system can be configured to interleave data symbols with various numbers of idle symbols on the line. This permits a range of line speeds (512 different values) to be emulated, from 0.4 Mbit/sec up to 100Mbit/sec. This facility was provided to enable experiments into the effect of link speed on the network. It also has a practical use: when using transmission converters, such as the TAXI to 34Mbit/sec G.703 converter used for early Super-JANET experiments, the switch output rate can be matched to the line.
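The effective line rate under idle-symbol interleaving can be modelled as below. The exact interleaving scheme and the encoding of the 512 configuration values are not given in the text, so the data-to-idle ratio model here is an assumption for illustration:

```python
def emulated_line_rate(idle_symbols, data_symbols=1, raw_rate_mbit=100.0):
    """Hypothetical model of idle-symbol interleaving.

    For every `data_symbols` data symbols sent, `idle_symbols` idle
    symbols are inserted on the line; the fraction of raw line capacity
    carrying data is then data/(data + idle).
    """
    return raw_rate_mbit * data_symbols / (data_symbols + idle_symbols)
```

Under this model, no idles gives the full 100Mbit/sec, while 249 idles per data symbol gives 100/250 = 0.4 Mbit/sec, matching the lower end of the stated range.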

Host Interfaces

A VME interface developed early in the project was shelved in favour of the Olivetti Research YES-v2 [Orange93] interfaces when the project acquired DEC Turbo-channel based systems. The switch performance results presented later were obtained using these interfaces.

Indirect Developments

The modular nature of the Fairisle switch has led to related developments which now form part of the ATM environment in which the networking experiments are being performed.

The Desk Area Network (DAN) reuses the port controllers and switch fabric of the Fairisle switch to implement a workstation in which the fabric is used as the main bus of the system. Multimedia devices (video and audio, capture and display) and processor cache/memory systems, all using the Fairisle fabric, have been built and are described elsewhere [Hayter93][McAuley93][Hayter91].

Of more direct relevance to the network research are the Null Port Controller and Multicast fabric.

Null Port Controller

The Null Port Controller is a fifo queueing port controller for the Fairisle switch. The device is extremely basic; besides the standard transmission daughter board, the port controller is composed of three components: a Xilinx 3042, a 256K by 8 bit ``triple ported'' VRAM, and a fifo memory.

The main buffering function is performed in the fifo. The Xilinx is again used for all control functions.

The VRAM is used to perform the header remapping and to prepend the fabric routeing tags. Arriving cell data is shifted into one of the serial access memory (SAM) ports of the VRAM at a tap point defined by some bits of the VP/VC. After complete reception of a cell, the payload and parts of the header are written into the main DRAM array at a row address defined by further VP/VC bits. This area of memory has been initialised during call setup with the appropriate new VCI and fabric routeing information, so that when the cell and new header are read back into the other SAM, the new cell is ready for injection into the switch fabric. 4K translations can be supported.
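Functionally, the VRAM mechanism amounts to a VP/VC-indexed translation table written at call setup and consulted per cell. The sketch below captures that behaviour only; the names, the dictionary standing in for the DRAM rows, and the cell representation are illustrative assumptions, and the real device does this with SAM shifts rather than software:

```python
TRANSLATION_BITS = 12          # 4K = 2**12 supported translations
table = {}                     # stands in for the DRAM rows

def setup_call(vpvc_index, new_vci, fabric_tag):
    """At call setup, initialise the row for this VP/VC with the
    outgoing VCI and the fabric routeing tag."""
    assert 0 <= vpvc_index < 2 ** TRANSLATION_BITS
    table[vpvc_index] = (new_vci, fabric_tag)

def remap(vpvc_index, payload):
    """Per cell: look up the row selected by the VP/VC bits and return
    the cell as injected into the fabric - routeing tag first, then the
    rewritten header, then the unchanged payload."""
    new_vci, fabric_tag = table[vpvc_index]
    return (fabric_tag, new_vci, payload)
```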

Hence the VRAM provides the header mapping, high speed double buffering, and the ability to retransmit a cell. This compares with the initial (and more obvious) designs, which use an SRAM translation table and either a fifo with a resettable read pointer or two fifos. VRAMs are cheaper.

Such a port controller clearly has minimal abilities to provide quality of service to streams (different priorities and retry counts on blocking). However, our main motivation for building the NPC was the observation that many ATM switches will effectively be acting only as low to high rate multiplexors without overload, and our desire to investigate how simple (cheap) this allowed port controllers to become. In particular, our aim was to attach 10 Ethernets to an ATM link into an ATM backbone - the maximum cell buffer occupancy at each Ethernet input port is one - so anything more complex than fifo in this circumstance is gratuitous.

The Sapphire switch [Prudence93] is an experimental ATM switch built by HP Labs based on the Null Port Controllers and the 8 by 8 Fairisle fabric. Port controllers are attached to six of the fabric ports; the seventh is used for connection of a switch management processor, and the eighth port is socketed for the addition of an optional transmission port.

Multicast Copy Fabric

The multicast copy fabric was an experimental addition to the Fairisle Switch. This was a hardware device for replicating multicast cells on the way into the switch by presenting them across multiple inputs. The design and performance of the system is described in [Doar93].






Richard Black, Ian Leslie et al.