The interrupt handler




Issues

Many of the issues which caused problems in the Xi2 design have been removed. However, these changes have introduced subtle changes to the design requirements for other parts of the system.

Delay free and Queues

The delay free problem has been completely removed by handling the free list entirely within the xilinx chip. The xilinx chip now automatically frees a cell buffer when it is actually transmitted, simply by loading the old value of TransP into TP1 on the next reload of TransP. This removes the problem of management of the free list from the software. It adds some complexity, however, in the case when the Wanda network management software needs to free a cell, because a special ``NoTX'' transmission must then be performed.

The major change is in the form of the queues. It is no longer possible to keep a dummy buffer element on each queue, as the freeing of buffers is no longer done on queue removal. Instead a queue must be able to contain zero buffers. This in turn means that the queue can no longer be designed to be safe in the presence of concurrent access, and that it is now necessary for the Wanda networking code to disable FIQs when accessing the queues. This is less important than it was on the Xi2 design because no freeing of buffers is involved.

Data Structures

For the Xi3 design there are now only two data structures maintained by the interrupt software.

Cell Queues

Cells are put on cell queues when they arrive and taken off when they are to be transmitted. Some of these queues will be both generated and consumed by the FIQ code, and some will be generated or consumed by the Wanda networking code. There must also be a queue of buffers to be pseudo-transmitted, that is, transmitted without the data being sent (``NoTX'').

The Xi3 design software still maintains each queue as a head and tail pointer, with chaining held in a map of indices. The principal change from the Xi2 software is that NULL queues are now possible. They are represented by a zero value in the head pointer.

Manipulation has changed as follows. For an addition, if the head is null then the new buffer number is written to both head and tail. This can be performed with a store multiple. Otherwise the new buffer number is written into the link entry of the previous tail, and then written as the new tail.

For a removal operation, if the head is zero then the queue is empty. There is also a special case if the head and the tail are equal, because the queue then goes empty and the head must be written with zero. Otherwise the head is written with the link entry of the previous head.

Again, no link entry of zero is ever required with this system, which aids efficiency.
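The addition and removal rules above can be sketched in C as follows. This is only an illustration under stated assumptions: the names cell_queue, link_map, queue_add and queue_remove, the value of NBUF and the 16-bit buffer numbers are all hypothetical, and the real code is FIQ assembler. Callers outside the FIQ handler would also need to disable FIQs around each call, which the sketch omits.

```c
#include <assert.h>
#include <stdint.h>

#define NBUF 256                 /* hypothetical number of cell buffers */

/* Buffer number 0 is never queued, so a zero head can mean "empty". */
typedef struct {
    uint16_t head;               /* 0 means the queue is empty */
    uint16_t tail;               /* only meaningful when head != 0 */
} cell_queue;

/* link_map[b] is the buffer that follows b on whatever queue b is on. */
static uint16_t link_map[NBUF];

/* Add buffer b at the tail of queue q. */
static void queue_add(cell_queue *q, uint16_t b)
{
    assert(b != 0 && b < NBUF);
    if (q->head == 0) {
        /* Empty queue: the same value goes to both head and tail. */
        q->head = b;
        q->tail = b;
    } else {
        /* Chain b after the previous tail, then record the new tail. */
        link_map[q->tail] = b;
        q->tail = b;
    }
}

/* Remove and return the head buffer, or 0 if the queue is empty. */
static uint16_t queue_remove(cell_queue *q)
{
    uint16_t b = q->head;
    if (b == 0)
        return 0;                /* queue already empty */
    if (b == q->tail)
        q->head = 0;             /* last buffer: the queue goes empty */
    else
        q->head = link_map[b];   /* advance to the next buffer */
    return b;
}
```

Note that the link entry of the tail buffer is never read, so it never has to be cleared to zero; stale values left there are harmless.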

The forwarding map

The forwarding map has an identical form to the case of the Xi2 design when remapping is available. The following must be encoded:

It is noted that a single bit is required to indicate which interrupt to enable. Performance is improved by encoding this bit in the same word as the queue pointer. The word stored is in fact the queue pointer shifted left by one bit with the bottom bit used to encode which interrupt to perform. A single right shift both recovers the queue address and places the controlling bit in the carry flag for the conditional branch.
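The packing of the interrupt selector into the queue pointer word can be sketched in C as below. The function names and the 32-bit word size are assumptions made for illustration, and the sketch does not say which interrupt each value of the bit selects. In C the flag has to be extracted with a mask; on the ARM a single flag-setting shift right by one (for example MOVS with LSR #1) yields the pointer and leaves the shifted-out bit in the carry flag, which is the saving the text describes.

```c
#include <assert.h>
#include <stdint.h>

/* Pack a queue address and a one-bit interrupt selector into one word.
 * The address must fit in 31 bits so that nothing is lost by the shift. */
static uint32_t fwd_encode(uint32_t queue_addr, unsigned which_interrupt)
{
    assert((queue_addr >> 31) == 0);
    return (queue_addr << 1) | (which_interrupt & 1u);
}

/* Recover the queue address: one logical shift right. */
static uint32_t fwd_queue(uint32_t word)
{
    return word >> 1;
}

/* Recover the interrupt selector: the bit that the shift discards. */
static unsigned fwd_interrupt(uint32_t word)
{
    return word & 1u;
}
```

For example, a queue address of 0x1234 with the selector bit set is stored as 0x2469.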

Algorithms

Interrupt dispatching

One of the major advantages of the Xi3 xilinx hardware is that it is designed to give the software as much information as possible when it is trying to find out the status of the chip. Previously the software had to read the IOC FIQ status register and then dispatch on various conditions of this. However, the new design allows a single slightly slower read cycle (ArmP, irqstat, reload) from the xilinx chip to do all of the following:

When one of these loads is done, a receive buffer interrupt must be dispatched before the register is read again, no matter what the other conditions are.

The dispatcher gives highest priority to nack and special condition interrupts. If a special condition interrupt is pending then the code takes a machine check (panic), just like the Xi2 software. The transmit available code takes the next priority (assuming it is enabled in the software interrupt mask) and receive takes the lowest priority.

The latched status bit is only ever looked at when there is no other work to do. If it is set, a normal IRQ is enabled, allowing the higher level sceptic code to investigate the status register for the exact condition.
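The priority order described above can be sketched as a single C function. The bit positions (the ST_ names), the action names and the idea of returning an action code are all hypothetical, since the layout of the status word is not given here. The relative order of nack and special condition is also an assumption; the text only says that both come first.

```c
#include <stdint.h>

/* Hypothetical bit assignments within the word read from the xilinx chip. */
#define ST_SPECIAL   0x01u   /* special condition interrupt pending */
#define ST_NACK      0x02u   /* nack interrupt pending */
#define ST_TX_AVAIL  0x04u   /* transmit available */
#define ST_RX_BUF    0x08u   /* receive buffer loaded by this read */
#define ST_LATCHED   0x10u   /* latched status bit */

typedef enum {
    DO_PANIC, DO_NACK, DO_TRANSMIT, DO_RECEIVE, DO_ENABLE_IRQ, DO_NOTHING
} action;

/* Choose the first piece of work for one status read.
 * sw_mask is the software interrupt mask; only the transmit bit is
 * consulted in this sketch. */
static action dispatch(uint32_t status, uint32_t sw_mask)
{
    if (status & ST_SPECIAL)
        return DO_PANIC;                     /* machine check */
    if (status & ST_NACK)
        return DO_NACK;
    if ((status & ST_TX_AVAIL) && (sw_mask & ST_TX_AVAIL))
        return DO_TRANSMIT;
    if (status & ST_RX_BUF)
        return DO_RECEIVE;
    if (status & ST_LATCHED)
        return DO_ENABLE_IRQ;                /* only when nothing else to do */
    return DO_NOTHING;
}
```

The sketch returns only the first action. In the scheme described here the status value is kept in a register, so that the transmit code can go on to the receive code without reading the chip again.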

Arriving cells

This is essentially the same as Xi2 software except the buffer number is already available from the interrupt dispatcher. The VCI is recovered from the received cell and indexed into the forwarding map. The cell is added to the specified queue and the appropriate interrupt is enabled.

Departing cells

If enabled, on a transmit available interrupt the code searches a number of queues for a cell to transmit. This could in general be any number of queues, but the current Wanda networking code uses only three: one for all traffic being cell forwarded through this node, one for management or other traffic being generated at this node, and the third for cells to be added to the free list by null transmission. If a queue is non-empty the head cell is removed and written (together with the appropriate control flags) to the Xilinx transmit queue with a strobe. This atomic operation adds the buffer for transmission and increases the count of cells to be transmitted. The data written also makes up the link in what will become the free list when the buffer is freed.

The current position within the Xilinx transmit queue is kept in a FIQ state register. The new position is easily calculated after the write as it is simply the link entry in the buffer number just written.

As an additional efficiency, the code to remove a cell from any of the three queues is implemented as an inline examination of the heads of each of the queues in turn, with conditional branching to a single section of code which actually does the removal. This ensures at most one pipeline fill for the entire transmit available servicing code.
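The shape of that search can be sketched in C: the three heads are tested in turn and every successful test falls into one shared removal path. The queue order, the names, and the FLAG_NOTX control bit are hypothetical, chosen only to make the sketch complete. Buffer numbers are assumed to be non-zero and below 0x8000 so that the flag does not collide with them.

```c
#include <stdint.h>

#define NBUF 256                       /* hypothetical number of buffers */

typedef struct { uint16_t head, tail; } cell_queue;   /* head 0 = empty */
static uint16_t link_map[NBUF];        /* next buffer on the same queue */

/* Hypothetical search order for the three transmit-side queues. */
enum { Q_FORWARD, Q_LOCAL, Q_NOTX, NQUEUES };
static cell_queue tx_queues[NQUEUES];

#define FLAG_NOTX 0x8000u              /* hypothetical "do not send data" flag */

/* Return the word to write to the Xilinx transmit queue,
 * or 0 if all three queues are empty. */
static uint16_t next_tx_word(void)
{
    cell_queue *q;
    uint16_t flags = 0;

    /* Inline examination of each head in turn. */
    if (tx_queues[Q_FORWARD].head != 0) {
        q = &tx_queues[Q_FORWARD];
    } else if (tx_queues[Q_LOCAL].head != 0) {
        q = &tx_queues[Q_LOCAL];
    } else if (tx_queues[Q_NOTX].head != 0) {
        q = &tx_queues[Q_NOTX];
        flags = FLAG_NOTX;
    } else {
        return 0;
    }

    /* Single shared removal path. */
    uint16_t b = q->head;
    q->head = (b == q->tail) ? 0 : link_map[b];
    return (uint16_t)(b | flags);
}
```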

As noted above, at the end of the transmit queue available interrupt the code must check the value read in the interrupt dispatcher (which is preserved in a register) and jump directly to the receive code if appropriate.

Blocked cells

 

The Fabric generates a NACK whenever there has been contention, either for an output port or due to blocking within the fabric, and this port controller's cell has lost. A NACK is also generated by the destination port controller if its FIFO is becoming full. On seeing a NACK the Xilinx may retry between zero and seven additional times before generating a FIQ interrupt and stopping transmission.

On Xi3 design hardware it is much easier to deal with a NACK interrupt than on Xi2. The possible options supported by the hardware are described in section 4.

The default software again implements a naive sixteen attempts for every cell. Eight of these attempts are performed on each occasion by the xilinx chip, by setting the retry value in the previous link pointer to seven. The remaining attempts therefore require a retry at the software level.

Like the Xi2 software, the Xi3 software uses a mechanism which is not perfectly accurate but which is almost so and much cheaper to implement. The number of the buffer which caused the last nack is kept in a FIQ state register. When a nack occurs the buffer number which was nacked is read from TP1. If it was not the same as the last time then the state register is updated and TransP is written with the buffer number obtained from TP1, which causes it to be retried. In either case the xilinx command register is then written with a ``GoTX'' command which restarts transmission. This is highly efficient compared with the Xi2 case.
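The nack handling just described can be sketched in C with plain variables standing in for the hardware. The names reg_TP1, reg_TransP, gotx_count, last_nacked and on_nack are hypothetical stand-ins, and the use of 0 for "no buffer nacked yet" is an assumption. The sketch leaves TransP untouched on a repeated nack of the same buffer; what the hardware then does with that cell is not modelled here.

```c
#include <stdint.h>

/* Stand-ins for the xilinx registers (hypothetical model, not real I/O). */
static uint16_t reg_TP1;       /* buffer number that was nacked */
static uint16_t reg_TransP;    /* next buffer to transmit */
static unsigned gotx_count;    /* number of ``GoTX'' commands issued */

/* FIQ state register: the buffer that caused the last nack (0 = none). */
static uint16_t last_nacked;

/* Handle one nack interrupt. */
static void on_nack(void)
{
    uint16_t b = reg_TP1;

    if (b != last_nacked) {
        /* First nack interrupt for this buffer: remember it and retry it. */
        last_nacked = b;
        reg_TransP = b;
    }
    /* Same buffer as last time: it has had its second round of attempts,
     * so TransP is not rewritten. */

    gotx_count++;              /* restart transmission in either case */
}
```

One way the scheme can be inexact: if a buffer is retried successfully and is then the very next buffer to be nacked after being reused, it matches the state register and is not retried by software a second time.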






Mark Hayter and Richard Black