Network Video and Audio Formats


Both the audio and video streams are sent encapsulated within AAL5 frames. The video and audio service specific convergence sub-layers are described below.

Figure 23: 8 Bit RGB Video Tile Encoding

Uncompressed Video Format

The AVA-200 does not transmit the digitised video in scan line format. Instead, a frame is broken down into tiles, each of which is 8x8 pixels in size. Each tile is encoded on the ATM network in scan line order, with each scan line comprising eight pixels. For an 8 bit RGB stream, each scan line therefore comprises 8 octets and each tile 64 octets. The mapping is illustrated in Figure 23.
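The tile mapping can be sketched in a few lines of Python. This is only an illustration, not AVA-200 firmware: the function name `frame_to_tiles` is hypothetical, the input is assumed to be a row-major buffer with one octet per pixel, and the frame dimensions are assumed to be multiples of eight. Tiles are emitted left to right, then row by row, as the packing order described for Figure 23 suggests.

```python
TILE = 8  # tile edge length in pixels, as stated in the text


def frame_to_tiles(frame, width, height):
    """Split a row-major 8 bit frame into 8x8 tiles.

    Returns a list of 64-octet tiles. Within a tile the pixels are in
    scan line order; the tiles themselves run left to right, then top
    to bottom. (Assumes width and height are multiples of 8.)
    """
    if width % TILE or height % TILE:
        raise ValueError("frame dimensions must be multiples of 8")
    if len(frame) != width * height:
        raise ValueError("frame length does not match dimensions")

    tiles = []
    for ty in range(0, height, TILE):          # top edge of each tile row
        for tx in range(0, width, TILE):       # left edge of each tile
            tile = bytearray()
            for row in range(ty, ty + TILE):   # eight scan lines per tile
                start = row * width + tx
                tile += frame[start:start + TILE]
            tiles.append(bytes(tile))
    return tiles
```

For a 16x16 test frame this yields four tiles; the second scan line of the first tile starts at pixel 16 of the original buffer, not pixel 8.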

The AVA-200 can pack a variable number of tiles (definable on a per stream basis) into a single AAL5 frame. Note that when asking the AVA-200 to pack multiple tiles into a single frame, the packing factor must be set such that the requested picture frame encodes to an integral number of fully packed AAL5 frames. Tiles are packed contiguously, the final one being followed by a variable length pad, an 8 octet tile trailer and finally an 8 octet AAL5 trailer. The pad ensures that the AAL5 PDU is an integral number of cell payloads (48 octets each) in length. The tile trailer contains the co-ordinates of the first tile in the frame, in tile dimension units, together with the 32 bit number of the picture frame to which the tiles belong.
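The padding arithmetic can be checked with a short Python sketch. It assumes an 8 bit RGB stream (64 octets per tile) and uses only the sizes given in the text; the function names `frame_layout` and `packing_is_valid` are hypothetical and chosen for illustration.

```python
CELL_PAYLOAD = 48   # payload octets per ATM cell
TILE_OCTETS = 64    # 8x8 pixels at one octet per pixel (8 bit RGB)
TILE_TRAILER = 8    # tile trailer length in octets, from the text
AAL5_TRAILER = 8    # AAL5 trailer length in octets


def frame_layout(packing_factor):
    """Return the size breakdown of one AAL5 frame carrying packed tiles."""
    data = packing_factor * TILE_OCTETS
    unpadded = data + TILE_TRAILER + AAL5_TRAILER
    cells = -(-unpadded // CELL_PAYLOAD)        # ceiling division
    pad = cells * CELL_PAYLOAD - unpadded
    return {"data": data, "pad": pad, "cells": cells,
            "total": cells * CELL_PAYLOAD}


def packing_is_valid(width, height, packing_factor):
    """True if the picture encodes to a whole number of fully packed frames."""
    tiles = (width // 8) * (height // 8)
    return tiles % packing_factor == 0
```

Under these assumptions a packing factor of four gives 256 octets of tile data, a 16 octet pad and six cells in total, while a packing factor of two needs no pad at all (128 + 16 = 144 octets, exactly three cells).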

Figure 23 illustrates the AAL5 frame format when an 8 bit RGB stream is encoded with a packing factor of four. As illustrated, the packing order is left to right; when one row of tiles has been encoded, the next row is started. It is acceptable for the packing factor to be set such that tiles from multiple rows are encoded in the same AAL5 PDU. Indeed, if space permits, a whole frame may be packed into a single AAL5 PDU.
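Because tiles are packed left to right and then row by row, the co-ordinates of the first tile in each AAL5 PDU follow directly from the PDU's position in the picture frame. The sketch below illustrates this; the function name `first_tile_coords` and its parameters are hypothetical, and the layout of the co-ordinate fields within the tile trailer is not specified here, so only the values are computed.

```python
def first_tile_coords(pdu_index, tiles_per_row, packing_factor):
    """Return (x, y), in tile units, of the first tile in a given PDU.

    pdu_index counts PDUs from zero within one picture frame.
    """
    first = pdu_index * packing_factor   # index of the first tile, row-major
    return first % tiles_per_row, first // tiles_per_row
```

With six tiles per row and a packing factor of four, the PDUs start at (0, 0), (4, 0) and (2, 1); the second PDU therefore carries two tiles from the first row and two from the next, which is the multiple-row case the text permits.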

Figure 24: JPEG Video Tile Encoding

JPEG Compressed Video Format

The AVA-200 supports JPEG video compression as an option. If a JPEG stream is selected from a suitably equipped AVA, then the encoding on the ATM network is slightly different from the uncompressed coding previously described. The picture frame is still split into, and transmitted as, a sequence of tiles, except that the compressed pixel data for each tile will be of varying size. The size of the compressed result depends on the composition of the original tile (typically unknown) and the JPEG compression parameters selected.

The AVA-200 JPEG video encoding is illustrated in Figure 24 (again a packing factor of four is used as an example). Note that if multiple JPEG compressed tiles are packed in a single AAL5 frame, then the receiver must unpack the tiles by using the RST Offset field in the final cell to step backwards through the AAL5 frame. Note also that the packing factor for JPEG compressed video should be set carefully. If the packing factor is set too high, then the AAL5 frame produced by the AVA-200 for a worst-case sample may exceed the maximum frame size that the sink entity can accept.

Figure 25: 16-bit Stereo Audio Encoding

Audio Format

In a manner similar to the video encoding, the AVA-200 is able to pack a variable number of samples into a single AAL5 SDU. The ATM encoding for 16-bit stereo audio from the AVA-200 is illustrated in Figure 25. For stereo encoding, the left channel data is always transferred before the right channel data. The number of bytes that the AVA-200 will buffer before generating a trailer is controllable by the remote client. Note that the total audio sample data length selected must be a multiple of 16 octets.

The figure chosen is a tradeoff between a variety of concerns. If a small number is chosen, e.g. one sample per frame, then a great number of cells will be generated, each with a high header/padding overhead. For example, 16-bit stereo sampled at 44.1 kHz will require 18.3 Mbps of network bandwidth even though the sample bandwidth is only 1.4 Mbps. A stock workstation would find great difficulty sinking such a stream, given the high rate of AAL5 frames that would have to be transferred through the operating system. A dedicated hardware sink would have less difficulty. This is analogous to the tradeoff in the AVA-200 video encoding case. Note that when multiple samples are buffered, an audio cell is sent as soon as it is full.
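The overhead in the one-sample-per-frame case can be reproduced with a rough calculation. The Python sketch below is an estimate under stated assumptions, not a description of the AVA-200 audio trailer: the function name `audio_rates` is hypothetical, `trailer_octets` defaults to the 8 octet AAL5 trailer alone (any audio-specific trailer is not counted), and `cell_wire_octets` is a parameter because the text does not say how its bandwidth figure was derived.

```python
import math

CELL_PAYLOAD = 48  # payload octets per ATM cell


def audio_rates(sample_rate_hz, samples_per_frame,
                trailer_octets=8, cell_wire_octets=53,
                octets_per_sample=4):
    """Return (sample_bps, wire_bps) for a 16-bit stereo stream.

    octets_per_sample is 4: two channels of two octets each.
    trailer_octets and cell_wire_octets are assumptions (see lead-in).
    """
    payload = samples_per_frame * octets_per_sample + trailer_octets
    cells = math.ceil(payload / CELL_PAYLOAD)      # cells per AAL5 frame
    frames_per_s = sample_rate_hz / samples_per_frame
    wire_bps = frames_per_s * cells * cell_wire_octets * 8
    sample_bps = sample_rate_hz * octets_per_sample * 8
    return sample_bps, wire_bps
```

At 44.1 kHz with one sample per frame, each frame fits in a single cell, so 44,100 cells are sent per second. The sample bandwidth comes to about 1.41 Mbps, and the wire rate to about 18.7 Mbps when counting 53 octets per cell, or about 18.3 Mbps when counting 52. Buffering 64 samples per frame (256 octets, a multiple of 16) brings the wire rate down to roughly 1.75 Mbps under the same assumptions.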

Ian Pratt and Paul Barham