How Big Is a Single Frame of Video?

First we compare the spatial size of analogue video with the common formats used by digital video standards. A PAL television displays video as 625 lines and an NTSC television displays 525 lines. Current televisions have an aspect ratio of 4:3, giving PAL a spatial resolution of 833 x 625 and NTSC a resolution of 700 x 525, not all of which is visible. Most common formats for digital video are related to the visible area of one of these television standards. The video sizes used with the international standard H.261 [#!h261!#] are 352 x 288 for the Common Intermediate Format (CIF), 176 x 144 for Quarter CIF (QCIF), and 704 x 576 for Super CIF (SCIF), where a CIF image is a quarter the size of the visible area of a PAL image. For NTSC-derived formats, 640 x 480, 320 x 240, and 160 x 120 are common. Figure 4.12 shows the spatial size of these common resolutions with respect to a PAL TV image.

**Figure 4.12:** The spatial size of digital video compared with a PAL TV image

It can be seen that the digital image formats are all smaller than current television sizes. Moreover, television images are significantly smaller than current workstation screens, which are commonly of the order of 1200 x 1000 pixels. Digital video therefore uses even less of a workstation screen.

Because of this significant size difference, some observers have commented that digital video often looks like "moving postage stamps" on modern workstations.
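The sizes quoted above follow from the line counts and the 4:3 aspect ratio, and the "postage stamp" effect can be put into numbers. The following is a minimal sketch, assuming square pixels and a 1200 x 1000 workstation screen; the function names are illustrative only.

```python
from fractions import Fraction

ASPECT = Fraction(4, 3)       # television aspect ratio quoted in the text
WORKSTATION = (1200, 1000)    # assumed screen size, "of the order of 1200 x 1000"

def tv_size(lines):
    """Square-pixel frame size for a given number of scan lines."""
    return (int(lines * ASPECT), lines)   # int() truncates 833.33 to 833

def screen_share(size, screen=WORKSTATION):
    """Fraction of the screen area covered by an image of the given size."""
    return (size[0] * size[1]) / (screen[0] * screen[1])

formats = {
    "PAL": tv_size(625),
    "NTSC": tv_size(525),
    "SCIF": (704, 576),
    "CIF": (352, 288),
    "QCIF": (176, 144),
}

for name, size in formats.items():
    print(f"{name:5s} {size[0]:4d} x {size[1]:<4d} {screen_share(size):6.1%} of screen")
```

Under these assumptions a CIF image covers roughly 8% of the screen area and a QCIF image about 2%.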

For digital video, as with analogue video, a new frame is required every 1/25th of a second for PAL and every 1/30th of a second for NTSC. If we assume 24 bits per pixel and 30 frames per second, the amount of disc space required for such a stream of full-motion video is shown in table 4.2. The table gives the amount of data for a given playing time and a given spatial size in pixels. The figures are in bytes: 640 x 480 pixels at 3 bytes per pixel and 30 frames per second is roughly 27 million bytes per second.

Table 4.2: The amount of data for full-motion digital video

Time v. Size	640x480	320x240	160x120
1 sec	27 Mb	6.75 Mb	1.68 Mb
1 min	1.6 Gb	400 Mb	100 Mb
1 hour	97 Gb	24 Gb	6 Gb
1000 hours	97 Tb	24 Tb	6 Tb

We can see that 1 hour of video at a resolution of 640 x 480 would consume 97 Gb of disc space, which is significantly more than most storage devices can hold. An equivalent amount of analogue video (i.e. a 1 hour video), which has a higher resolution and also contains audio, would take only a half of a 120 minute video cassette or a quarter of a 240 minute cassette. Although there are devices that can store this amount of data, there is currently no digital storage device the size of a video cassette that could hold 97 Gb in half of its capacity. The data shown in the tables was collated by Larry Rowe of the Computer Science Division - EECS, University of California at Berkeley, for his work on The Continuous Media Player [#!rowe!#].
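The arithmetic behind the uncompressed figures can be sketched as follows. This is an illustration rather than the method used to produce the table: it assumes 3 bytes per pixel, 30 frames per second, and decimal units (1 MB = 1,000,000 bytes), and the helper names are hypothetical.

```python
BYTES_PER_PIXEL = 3   # 24 bits per pixel
FPS = 30              # frames per second assumed in the text

def raw_bytes(width, height, seconds, fps=FPS, bytes_per_pixel=BYTES_PER_PIXEL):
    """Uncompressed size in bytes of `seconds` of video."""
    return width * height * bytes_per_pixel * fps * seconds

def human(n):
    """Format a byte count using decimal units (an assumption)."""
    for unit in ("bytes", "KB", "MB", "GB", "TB"):
        if n < 1000 or unit == "TB":
            return f"{n:.2f} {unit}"
        n /= 1000

durations = {"1 sec": 1, "1 min": 60, "1 hour": 3600, "1000 hours": 3_600_000}
sizes = [(640, 480), (320, 240), (160, 120)]

for label, seconds in durations.items():
    cells = [human(raw_bytes(w, h, seconds)) for w, h in sizes]
    print(f"{label:11s}", " | ".join(cells))
```

The exact products come out slightly higher than the table entries (for example about 99.5 GB rather than 97 for one hour at 640 x 480). The table values appear to have been scaled from the rounded one-second figure, since 27 x 3600 = 97,200.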

In order to reduce the amount of data used for digital video, it is common to use compression techniques, such as the international standards H.261 and MPEG [#!mpegrtp!#], or proprietary techniques such as nv encoding [#!frederick!#] or CellB [#!cellb!#]. Rowe has also estimated the amount of space needed when compression techniques are used. Table 4.3 shows the space needed when compressing video of size 640 x 480 pixels, and table 4.4 shows the space needed when compressing video of size 320 x 240 pixels. Both tables present data for a given compression factor and for a given playing time. The 97 Gb used for 1 hour of 640 x 480 video can be reduced to approximately 1 Gb when a compression factor of 100:1 is used.

Table 4.3: The amount of data for compressed video of size 640x480

Time v. Scale	None	3:1	25:1 (JPEG)	100:1 (MPEG)
1 sec	27 Mb	9 Mb	1.1 Mb	270 Kb
1 min	1.6 Gb	540 Mb	65 Mb	16 Mb
1 hour	97 Gb	32 Gb	3.9 Gb	970 Mb

Table 4.4: The amount of data for compressed video of size 320x240

Time v. Scale	None	3:1	25:1 (JPEG)	100:1 (MPEG)
1 sec	6.75 Mb	2.25 Mb	270 Kb	68 Kb
1 min	400 Mb	133 Mb	16 Mb	4 Mb
1 hour	24 Gb	8 Gb	1 Gb	240 Mb

Although the tables show compression factors for MPEG rather than H.261, the H.261 standard uses a Discrete Cosine Transform encoding function similar to that used in MPEG, so we can expect its compression ratios to be of a similar order of magnitude. In practice, the compression factor achieved on real video is not constant, because the amount of data produced by the encoder is a function of motion. However, these figures do give a reasonable estimate of what can be achieved.
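The entries in the compressed-video tables are simply the uncompressed figure divided by the compression factor. A minimal sketch for the 640 x 480 case, assuming a constant factor (which, as noted above, real encoders do not deliver) and starting from the rounded 27 Mb per second figure; the names are illustrative:

```python
RATE_MB_PER_SEC = 27.0    # uncompressed 640x480 figure taken from the table
RATIOS = {"None": 1, "3:1": 3, "25:1": 25, "100:1": 100}
DURATIONS = {"1 sec": 1, "1 min": 60, "1 hour": 3600}

def compressed_mb(seconds, ratio, rate=RATE_MB_PER_SEC):
    """Megabytes needed for `seconds` of video at a constant compression ratio."""
    return rate * seconds / ratio

print(f"{'':7s}" + "".join(f"{name:>10s}" for name in RATIOS))
for label, seconds in DURATIONS.items():
    row = "".join(f"{compressed_mb(seconds, r):10.2f}" for r in RATIOS.values())
    print(f"{label:7s}{row}  MB")
```

For one hour at 100:1 this gives 972 MB, which the table rounds to 970 Mb.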

With digital video it is also possible to reduce the amount of data generated much further by lowering the frame rate from 30 frames a second to 15 or even 2 frames a second. This can be achieved by explicitly limiting the number of frames or through a bandwidth limitation mechanism. In many multicast conferences the bandwidth used is between 15 and 64 Kbps. Although reduced frame rate video loses the quality of full-motion video, it is perfectly adequate for many situations, particularly in multimedia conferencing.
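To see how a bandwidth limit translates into a frame rate, consider the following rough sketch. It assumes a QCIF image at 24 bits per pixel, a constant 100:1 compression factor, and that 1 Kbps means 1000 bits per second; all of these are illustrative assumptions rather than measurements.

```python
def frames_per_second(width, height, bandwidth_bps, ratio, bits_per_pixel=24):
    """Frame rate sustainable within a bandwidth limit at a fixed compression ratio."""
    bits_per_frame = width * height * bits_per_pixel / ratio
    return bandwidth_bps / bits_per_frame

QCIF = (176, 144)
for kbps in (15, 64):
    fps = frames_per_second(*QCIF, bandwidth_bps=kbps * 1000, ratio=100)
    print(f"{kbps:3d} Kbps -> {fps:.1f} frames per second")
```

Under these assumptions 64 Kbps sustains roughly 10 frames per second and 15 Kbps roughly 2.5, which is in line with the reduced frame rates mentioned above.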

There are a large number of still image formats and compression schemes in use in the network today. Common schemes include:

The first two of these still image schemes are discussed elsewhere in great detail. JPEG is interesting because the same baseline technology is also used in part by several popular moving image compression schemes. The goal of the JPEG standard has been to develop a method for continuous-tone image compression for both colour and greyscale images. The standard defines four modes: sequential encoding, progressive encoding, lossless encoding and hierarchical encoding.

JPEG uses the Discrete Cosine Transform to compress spatial redundancy within an image in all of its modes apart from the lossless one, where a predictive method is used instead.
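The way the Discrete Cosine Transform exposes spatial redundancy can be illustrated with a naive 8 x 8 transform. This is a sketch of the transform step only, using the orthonormal DCT-II and a synthetic block; it omits the level shift, quantisation and entropy coding that a real JPEG encoder applies, and the threshold of 1.0 is an arbitrary choice for illustration.

```python
import math

N = 8  # JPEG operates on 8 x 8 blocks

def dct_2d(block):
    """Naive 2-D DCT-II of an N x N block with orthonormal scaling."""
    def c(k):
        return math.sqrt(1 / N) if k == 0 else math.sqrt(2 / N)

    out = [[0.0] * N for _ in range(N)]
    for u in range(N):
        for v in range(N):
            s = 0.0
            for x in range(N):
                for y in range(N):
                    s += (block[x][y]
                          * math.cos((2 * x + 1) * u * math.pi / (2 * N))
                          * math.cos((2 * y + 1) * v * math.pi / (2 * N)))
            out[u][v] = c(u) * c(v) * s
    return out

# A smooth synthetic block: a gentle ramp, typical of continuous-tone image areas.
block = [[100 + 2 * y for y in range(N)] for x in range(N)]
coeffs = dct_2d(block)

significant = sum(1 for row in coeffs for value in row if abs(value) >= 1.0)
print("DC coefficient:", round(coeffs[0][0], 1))
print("coefficients with magnitude >= 1:", significant, "of", N * N)
```

For a smooth block like this one, only a handful of the 64 coefficients are significant; the rest are close to zero and cost very little to encode.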

As JPEG was essentially designed for the compression of still images, it makes no use of temporal redundancy, which is a very important element in most video compression schemes. Thus, despite the availability of real-time JPEG video compression hardware, its use will be quite limited due to its poor video quality.
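The temporal redundancy that JPEG ignores can be illustrated with a simple frame difference. The sketch below uses two tiny synthetic greyscale frames (hypothetical data) in which a small object moves one pixel to the right. Inter-frame schemes such as H.261 and MPEG use motion-compensated prediction, which is more elaborate than this plain difference.

```python
WIDTH, HEIGHT = 16, 12   # tiny hypothetical greyscale frames

def make_frame(object_x):
    """Uniform background with a bright 4 x 4 object at column `object_x`."""
    frame = [[50] * WIDTH for _ in range(HEIGHT)]
    for y in range(4, 8):
        for x in range(object_x, object_x + 4):
            frame[y][x] = 200
    return frame

def difference(prev, curr):
    """Pixel-wise difference between two frames of equal size."""
    return [[c - p for p, c in zip(prow, crow)] for prow, crow in zip(prev, curr)]

def nonzero(frame):
    """Number of non-zero values in a frame."""
    return sum(1 for row in frame for v in row if v != 0)

frame1 = make_frame(3)
frame2 = make_frame(4)   # the object has moved one pixel to the right
delta = difference(frame1, frame2)

print("non-zero values in the second frame:", nonzero(frame2))
print("non-zero values in the difference:  ", nonzero(delta))
```

Coding each frame independently, as JPEG does, has to represent every pixel of the second frame, whereas the difference is zero everywhere except along the two edges of the moving object.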