

H.263+ is a new addition to the ITU H series and is aimed at extending the repertoire to Video Coding for Low Bit Rate Communication. This makes it well suited to a wide variety of Internet access line speeds, and therefore probably also reasonably friendly to many Internet Service Providers' backbone speeds.

Existing A/V standards, and the basic technology of CCD cameras, television and CRT displays in general, dictate frame grabbing at some particular resolution and rate. The choice of resolution is complex. One could fix the number of pixels and the aspect ratio, or allow a range of choices of line rate and sample rate. H.261 and MPEG chose the latter.

The picture rate (a.k.a. Picture Clock Frequency - PCF) is 30,000/1001, or about 29.97Hz, but one can also use multiples of this. The chosen resolution for H.263 is dx*dy luminance samples, with chrominance just one half of this in both dimensions. H.263+ then allows for sub-QCIF (128*96 pixels), QCIF (176*144 pixels), CIF (352*288 pixels), 4CIF (704*576 pixels; called SCIF in the INRIA Ivs tool) and 16CIF (1408*1152 pixels). The designer can also choose a pixel aspect ratio; the default is 288/3 : 352/4, which is 12:11 (as per H.261). The picture area covered by the standard formats has an aspect ratio of 4:3.
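The arithmetic in this paragraph can be checked with a few lines of Python. This is only an illustrative sketch: the names PCF, FORMATS, chroma_size and default_pixel_aspect are invented here and do not come from the standard.

```python
from fractions import Fraction

# Picture clock frequency quoted in the text: 30,000/1001 Hz.
PCF = Fraction(30000, 1001)

# Standard luminance sizes (width, height) listed in the text.
FORMATS = {
    "sub-QCIF": (128, 96),
    "QCIF": (176, 144),
    "CIF": (352, 288),
    "4CIF": (704, 576),
    "16CIF": (1408, 1152),
}


def chroma_size(width, height):
    """Chrominance is one half of the luminance size in both dimensions."""
    return width // 2, height // 2


def default_pixel_aspect():
    """Default pixel aspect ratio, computed as in the text: 288/3 : 352/4."""
    return Fraction(288, 3) / Fraction(352, 4)


if __name__ == "__main__":
    print("PCF is about %.2f Hz" % float(PCF))
    print("default pixel aspect ratio:", default_pixel_aspect())
    for name, (w, h) in FORMATS.items():
        print(name, "luma", (w, h), "chroma", chroma_size(w, h))
```

Using Fraction keeps the ratios exact, so 288/3 : 352/4 reduces to 12/11 without any rounding.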

Luminance and chrominance sample positions are as per H.261, discussed earlier in this chapter. The structure of the coder is just the same too, although there are now two additional modes called the ``slice'' and ``picture block'' modes.

A macroblock is 16*16 Y and 8*8 each of Cb and Cr. The Group of Blocks, or GOB, refers to k*16 lines; GOBs are numbered using a vertical scan starting from 0, and k depends on the number of lines in the picture. Normally, e.g. for sub-QCIF, QCIF and CIF, k is 1. The number of GOBs per picture is then 6 for sub-QCIF, 9 for QCIF, and 18 for CIF (and also for 4CIF and 16CIF, because of special rules).
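The GOB counts quoted above follow from dividing the number of lines by k*16. The sketch below reproduces them. The thresholds used to choose k (1 up to 400 lines, 2 up to 800, otherwise 4) are an assumption made for this example; they are consistent with the counts given here, but should be checked against the Recommendation itself. The function names are invented.

```python
def gob_line_factor(lines):
    """Return k, so that one GOB covers k*16 lines.

    ASSUMPTION: thresholds of 400 and 800 lines are illustrative only.
    """
    if lines <= 400:
        return 1
    if lines <= 800:
        return 2
    return 4


def gobs_per_picture(lines):
    """Number of GOBs in a picture with the given number of luminance lines."""
    gob_height = 16 * gob_line_factor(lines)
    # Ceiling division, so a partial GOB at the bottom still counts.
    return -(-lines // gob_height)


if __name__ == "__main__":
    for name, lines in [("sub-QCIF", 96), ("QCIF", 144), ("CIF", 288),
                        ("4CIF", 576), ("16CIF", 1152)]:
        print(name, "k =", gob_line_factor(lines),
              "GOBs =", gobs_per_picture(lines))
```

With these assumed thresholds, 4CIF uses k = 2 and 16CIF uses k = 4, which is how all three of the larger formats end up with 18 GOBs.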

Prediction works on Intra, Inter, B, PB, EI or EP pictures (the reference picture is smaller).

The macroblock is 16 lines of Y, and the corresponding 8 each of Cb and Cr. We can receive one motion vector per macroblock.
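As a rough illustration of what this means for the amount of motion information, the following sketch counts the macroblocks (and hence, at one vector each, the motion vectors) in a picture. The helper names are invented for the example.

```python
MB_SIZE = 16  # a macroblock covers 16*16 luminance samples


def macroblock_grid(width, height):
    """Return (columns, rows) of macroblocks for a luminance picture size.

    Assumes both dimensions are multiples of 16, as they are for the
    standard formats.
    """
    return width // MB_SIZE, height // MB_SIZE


def samples_per_macroblock():
    """16*16 Y samples plus 8*8 each of Cb and Cr."""
    return 16 * 16 + 2 * (8 * 8)


if __name__ == "__main__":
    cols, rows = macroblock_grid(176, 144)  # QCIF
    print("QCIF macroblocks:", cols * rows)
    print("samples per macroblock:", samples_per_macroblock())
```

For QCIF this gives an 11 by 9 grid, so at most 99 motion vectors per picture when one vector is sent per macroblock.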

There is some provision for other technology: we could envisage an "intelligent" device in the camera that only detects "objects" and motion. This is some way off in the future, and in any case can be done after the event by general software intelligence, after dumb capture and compression (using the compression for hints).

H.263, then, extends H.261 to lower bit rates (not just the p*64kbps design goal) and adds more features for better quality and services, but the basic ideas are the same.

There are then a number of basic enhancements in H.263, including:

Continuous Presence Multi-point and Video Multiplex mode: basically 4-in-1 sub-bit-stream transmission. This may be useful for conferences, tele-presence, surveillance and so on.
Motion vectors can point outside the picture.
Arithmetic coding as well as variable length coding (VLC).
Advanced Prediction mode, which is also known as ``Overlapped Block Motion Compensation'', uses four 8*8 blocks instead of one 16*16. This gives better detail.
PB frames, known as combined Predictive and Bi-Directional frames (like MPEG II).
FEC to help with transmission loss; Advanced Intra coding to help with interpolation; Deblocking Filter mode, to remove blocking artifacts.
Slice Structured mode (blocks are re-ordered so that a Slice layer, instead of the GOB layer, is used; this is more delay and loss tolerant for packet transport).
Supplemental Enhancement Information, Freeze/Freeze Release and Enhancement, and Chroma Key (use an external picture as merge/background etc. for mixing).
Improved PB mode, including two-way motion vectors in PB mode.
Reference Picture Selection.
Temporal, SNR and Spatial Scalability modes; this allows receivers to drop B frames, for example, which gives potential heterogeneity amongst receivers of a multicast.
Reduced Resolution Update; Independent Segment decoding; Alternate INTER VLC mode.
Modified Quantization mode (can adjust the amount of quantization up or down to give fine quality/bit-rate control).
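To make one of these enhancements concrete, here is a minimal sketch of motion compensation when a motion vector points outside the picture. It assumes the usual approach of substituting the nearest edge sample for any reference position beyond the picture boundary, and it handles whole-pixel vectors only. The function name and the list-of-rows picture representation are invented for the example.

```python
def fetch_reference_block(ref, x, y, mv_x, mv_y, size=16):
    """Fetch a size*size prediction block from a reference picture.

    ref        -- reference luminance picture as a list of rows of samples
    x, y       -- top-left corner of the current block
    mv_x, mv_y -- whole-pixel motion vector (ASSUMPTION: no half-pixel
                  interpolation in this sketch)

    Positions outside the picture are clamped to the nearest edge sample,
    so the vector may point beyond the picture boundary.
    """
    height = len(ref)
    width = len(ref[0])
    block = []
    for dy in range(size):
        yy = min(max(y + mv_y + dy, 0), height - 1)
        row = []
        for dx in range(size):
            xx = min(max(x + mv_x + dx, 0), width - 1)
            row.append(ref[yy][xx])
        block.append(row)
    return block


if __name__ == "__main__":
    # Tiny 4*4 "picture" whose sample values encode their position.
    ref = [[r * 4 + c for c in range(4)] for r in range(4)]
    print(fetch_reference_block(ref, 0, 0, -1, -1, size=2))
    print(fetch_reference_block(ref, 1, 1, 0, 0, size=2))
```

The first call asks for a block that lies partly above and to the left of the picture, so every out-of-range position resolves to the top-left sample.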

Chroma keying is a commonly used technology in TV, e.g. for picture-in-picture or superimposition, for weather presenters and so on. The idea is to define some pixels in an image as ``transparent'' or ``semi-transparent'' and, instead of showing these, to use a reference background image (c.f. transparent GIFs on the WWW). We need one octet each for Y, Cb and Cr to define the keying color. The actual choice when there isn't an exact match is implementor defined.
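Since the behaviour for inexact matches is left to the implementor, the following is just one possible sketch. It assumes a distance measure (the largest per-component difference) and two thresholds, t1 and t2, between which a pixel is treated as semi-transparent. The function names, the thresholds and their default values are all invented for illustration.

```python
def key_alpha(pixel, key, t1=8, t2=32):
    """Return opacity 0..255 for a (Y, Cb, Cr) pixel against the key color.

    ASSUMPTION: distance is the largest per-component difference, and
    t1/t2 are hypothetical thresholds. At or below t1 the pixel is fully
    transparent, at or above t2 fully opaque, and in between
    semi-transparent.
    """
    d = max(abs(p - k) for p, k in zip(pixel, key))
    if d <= t1:
        return 0
    if d >= t2:
        return 255
    return 255 * (d - t1) // (t2 - t1)


def composite(foreground, background, key):
    """Mix two equal-length lists of (Y, Cb, Cr) pixels using the key color."""
    out = []
    for f, b in zip(foreground, background):
        a = key_alpha(f, key)
        out.append(tuple((a * fc + (255 - a) * bc) // 255
                         for fc, bc in zip(f, b)))
    return out


if __name__ == "__main__":
    key = (100, 50, 200)            # one octet each for Y, Cb and Cr
    fg = [(100, 50, 200), (200, 50, 200)]
    bg = [(16, 128, 128), (16, 128, 128)]
    print(composite(fg, bg, key))
```

A foreground pixel that matches the key exactly is replaced by the background pixel, while one far from the key is shown unchanged.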
