Video Basics

Each of our eyes has 6 to 7 million receptors that transfer what we see, via one million nerves, to the brain. Although it’s virtually impossible for any video system to match this kind of quality, it can create a very effective illusion, assisted by some quirks of human physiology and psychology.

A video circuit conveys a moving image in the form of a varying analogue voltage or as a sequence of digital codes. In order to send such information over a single wire or communications channel, the picture must be broken up into transportable pieces and reconstituted at the receiving end.


The designer of any video system has to decide on what quality or resolution is required. This sets the size of the smallest component to be transmitted, often known as a picture element or pixel. In digital video systems each pixel has a square or rectangular shape, the entire picture being replicated by assembling these blocks together, both side by side and one above the other. In analogue video technology the picture is divided into lines, effectively of the same height as a pixel.

In standard definition (SD) video there are 525 or 625 lines available, although only around 500 or 600 are used for the picture itself. In high definition (HD) video, a much greater resolution is obtained by using between 1050 and 1250 lines, of which 720 or 1080 may be used for the image.

Aspect Ratios

The proportions of a picture are described by its aspect ratio. For example, a picture that’s 4 inches wide and 3 inches high has an aspect ratio of 4:3, which remains unchanged should the image be enlarged or reduced in size. As it happens, this is the standard ratio used for SD video images. This and other standard ratios are shown in the following table.

Use                     Ratio    Value  Notes
graphic print           5:7      0.71   Not used for video
graphic print           8:10     0.80   Not used for video
SD video or film        4:3      1.33   Equivalent to 12:9 aspect, square pixels
Digital image           3:2      1.5    Rectangular pixels for 4:3 aspect
SD video 'letter box'   14:9     1.56   Compromise aspect for 4:3 displays
HD video                16:10    1.60   Square pixels
Wide screen film *      16:9     1.78   'Letter box' view on other systems
scope film              2.21:1   2.21   19.89:9 aspect, roughly 20:9
scope film              2.35:1   2.35   'Letter box' view on other systems

HD video at 16:10 uses 1920 × 1200 pixels, as on the Apple 23-inch Cinema HD Display; 1920 × 1080 pixels are used for the 16:9 aspect ratio.

* Viewable on a 16:10 display, but with loss of the ends or with black bars above and below
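As a rough illustration of the black bars mentioned in the footnote, the following Python sketch works out how much of a display is left unused when a wider image is fitted to its full width. The function name `letterbox_bars` is purely illustrative, and the 1920 × 1200 display size is the one quoted above.

```python
from fractions import Fraction

def letterbox_bars(display_w, display_h, image_aspect):
    """Return (image height, height of each black bar) in pixels when an
    image of the given aspect ratio fills the full width of a display."""
    image_h = Fraction(display_w) / image_aspect
    if image_h > display_h:
        raise ValueError("image is taller than the display; bars would be at the sides")
    bar_h = (display_h - image_h) / 2
    return image_h, bar_h

# A 16:9 picture shown on a 16:10 display of 1920 x 1200 pixels
image_h, bar_h = letterbox_bars(1920, 1200, Fraction(16, 9))
print(image_h, bar_h)  # 1080 60
```

Exact fractions are used here simply to avoid rounding errors in the ratios.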

Colour Coding

Most systems use Red Green Blue (RGB) or Component Video (YUV) colour coding. RGB is the oldest and simplest method, although conveying images in this form isn’t ideal.

The range of colours available in RGB coding is known as the RGB colour space. Similarly, the range of YUV colours is known as the YUV colour space. Although most colours can be represented by both systems, several colours fall outside the range covered by such codings. The values in the two different systems are related by the following equations:-

Y = (0.299 × R) + (0.587 × G) + (0.114 × B)

U = B - Y

V = R - Y

The normalised range of Y is between 0 and 100%. With the equations given above, U lies in the range of roughly -88% to +88%, whilst V lies in the range of roughly -70% to +70%.

The value of V can be plotted against the value of U on a graph, with V on the vertical axis. The origin, where U and V are both 0%, represents unsaturated colours, namely white, grey or black, whilst points towards the extremes of the V or U axes represent pure, saturated colours such as red, magenta, blue, cyan, green or yellow.
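The relationship between the two colour spaces can be tried out with a short Python sketch. It assumes normalised R, G and B values between 0 and 1, and uses the usual luminance weighting in which the 0.114 coefficient applies to the blue component; the function name `rgb_to_yuv` is only illustrative.

```python
def rgb_to_yuv(r, g, b):
    """Convert normalised R, G, B values (0.0 to 1.0) into Y, U, V
    using unscaled colour differences: U = B - Y and V = R - Y."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    u = b - y
    v = r - y
    return y, u, v

samples = {
    "white": (1, 1, 1),
    "grey": (0.5, 0.5, 0.5),
    "blue": (0, 0, 1),
    "red": (1, 0, 0),
}
for name, rgb in samples.items():
    y, u, v = rgb_to_yuv(*rgb)
    print(f"{name:6s} Y={y:.3f} U={u:+.3f} V={v:+.3f}")
```

White and grey give U and V values of zero, whilst pure blue and pure red give the largest positive values of U and V respectively.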

Digital Compression

Component video material in digital form is often compressed, as shown in the following table:-

Notation   Meaning
4:4:4      Uncompressed, 10 bits per component
4:2:2      Cb and Cr sampled at half rate
4:1:1      Cb and Cr sampled at quarter rate
4:2:0      See below

The notation is slightly confusing: the first number, always a 4, relates to the base sampling rate, about four times the NTSC or PAL colour subcarrier frequency. The other numbers specify the Cr and Cb rates respectively, in relation to the base rate. However, in the PAL 4:2:0 standard the chroma information is sampled 360 times per line, but only on every other line of the field. The following table shows how these forms of compression relate to the usual Digital Video (DV) standards:-

Format       Coding      Data Rate (Mbit/s)
DV/DVCPRO    DV 4:1:1    25
DVCPRO50 *   DV 4:2:2    50
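The saving offered by each sampling scheme can be estimated with a little arithmetic. The Python sketch below assumes the commonly used reading of the J:a:b notation, in which a block J pixels wide and two rows high carries `a` chroma samples on its first row and `b` on its second for each of Cb and Cr. The function name is illustrative, and the 10 bits per sample is taken from the 4:4:4 entry above.

```python
def samples_per_pixel(j, a, b):
    """Average number of samples (Y + Cb + Cr) per pixel for a J:a:b
    scheme, counted over a block J pixels wide and 2 rows high."""
    luma = 2 * j            # one Y sample for every pixel in the block
    chroma = 2 * (a + b)    # Cb and Cr each contribute a + b samples
    return (luma + chroma) / (2 * j)

BITS_PER_SAMPLE = 10  # assumption: 10 bits per component, as for 4:4:4 above

for scheme in [(4, 4, 4), (4, 2, 2), (4, 1, 1), (4, 2, 0)]:
    spp = samples_per_pixel(*scheme)
    label = ":".join(str(n) for n in scheme)
    print(f"{label}  {spp:.1f} samples/pixel  {spp * BITS_PER_SAMPLE:.0f} bits/pixel")
```

On this reckoning 4:1:1 and 4:2:0 both carry half the data of 4:4:4, but discard the colour detail in different directions. The DV data rates in the table also reflect further compression that this sketch doesn't model.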

Analogue Video

In analogue video systems, as used prior to the arrival of digital technology, the image is broken up and reconstructed by scanning, which splits the picture into lines instead of pixels.

The electron beam in a traditional video camera creates a spot that moves across each line to extract the picture information. Having reached the right-hand end of a line, it moves quickly leftwards and down to the next line. A complete image is created by scanning from the top left-hand corner to the bottom right of the picture.

Traditionally, an analogue picture is created by another moving dot in a cathode ray tube (CRT), whose intensity is varied as the beam sweeps horizontally across the screen, moving slowly downwards after each line to create an image pattern known as a raster. Each complete image is known as a frame and the number of frames per second (frm/s or fps) is called the frame rate. The NTSC television system uses a rate of 30 Hz, whilst the PAL system runs at 25 Hz.

Fortunately, the human eye has persistence of vision, retaining an image on the retina for a short time after it actually disappears. CRT displays also have persistence, which allows systems to be designed with a frame rate that doesn’t produce any noticeable flicker.

Although analogue video doesn’t divide the image into true pixels, the height of each line can be considered to correspond to the height of a pixel. The equivalent ‘pixel width’ is set by the frequency response, also known as bandwidth or horizontal resolution, of the signal. On initial consideration, you might think that this resolution should equal one pixel. In practice, it can be much less, partly because of the illusory effect of the visible horizontal lines on the human eye.


Interlaced scanning, also known as interleaved scanning, is employed in traditional analogue television broadcasting. In this system, the ‘even’ lines of the image are created first, producing what is known as an even field. A scan is then made of the ‘odd’ lines, creating an odd field. Together, each pair of fields makes up the entire picture, commonly known as a frame.

This means that the ‘even’ lines 20 and 22 are in one field whilst the ‘odd’ lines 21 and 23 are in the other. Such a technique has the effect of doubling the subjective flicker rate, so that a system based on, for example, a frame rate of 25 frm/s actually appears to flicker at 50 Hz.
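The division of a frame into two fields can be sketched in a few lines of Python. Here a frame is represented simply as a list of line numbers; the function names are illustrative, and the list is assumed to start on an even-numbered line.

```python
def split_fields(lines):
    """Split a frame (a list of scan lines that starts on an even-numbered
    line) into its even field and its odd field."""
    even_field = lines[0::2]
    odd_field = lines[1::2]
    return even_field, odd_field

def weave(even_field, odd_field):
    """Rebuild the full frame by interleaving the two fields."""
    frame = [None] * (len(even_field) + len(odd_field))
    frame[0::2] = even_field
    frame[1::2] = odd_field
    return frame

lines = [20, 21, 22, 23]
even_field, odd_field = split_fields(lines)
print(even_field, odd_field)               # [20, 22] [21, 23]
print(weave(even_field, odd_field) == lines)

frame_rate = 25                            # frm/s
print(frame_rate * 2, "fields per second")
```

Since two fields are shown for every frame, the screen is refreshed twice as often as the frame rate alone would suggest.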

A simpler form of scanning, known as progressive scanning, non-interlaced scanning or non-interleaved scanning, is used in computers and more advanced video systems. This scans all of the lines in a sequential order. Fortunately, this doesn’t suffer from some of the artifacts associated with interlaced scanning, such as wheels that appear to spin backwards.

Unfortunately, traditional interlaced images aren’t compatible with computer-based media, such as CD-ROM or the World Wide Web, neither of which uses television-based interlacing. Problems can also occur when feeding standard video material into some types of computer or video capture card, although modern hardware and software can often accommodate such signals.

Difficulties also occur in the opposite direction, when a non-interlaced image from a computer is fed into video equipment that requires an interlaced signal. Typically, this causes the picture to jitter up and down at the frame rate, a highly unpleasant effect. Fortunately, this can be overcome by using modern hardware that provides a convolved video output. In this type of signal, adjacent pairs of lines are smoothed to reduce flicker, but with a corresponding reduction in vertical resolution.
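The effect of smoothing adjacent lines can be demonstrated with a very simple filter. The Python sketch below is only an assumption about how such smoothing might work, since the exact method used by any particular hardware isn’t specified here: it averages each line with the one below it. The function name `smooth_lines` is illustrative.

```python
def smooth_lines(lines):
    """Average each scan line with the line below it; the last line is
    averaged with itself and so is left unchanged."""
    smoothed = []
    for i, line in enumerate(lines):
        below = lines[i + 1] if i + 1 < len(lines) else line
        smoothed.append([(a + b) / 2 for a, b in zip(line, below)])
    return smoothed

# A white line one pixel high on a black background: on an interlaced
# display it would appear in only one field and so flicker badly.
picture = [
    [0, 0],
    [255, 255],
    [0, 0],
    [0, 0],
]
for line in smooth_lines(picture):
    print(line)
```

After smoothing, the single white line is spread at half brightness over two adjacent lines, so it appears in both fields. This reduces flicker at the cost of vertical detail.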


The video signal itself is interspersed with synchronisation pulses that keep the scanning system at the receiver in step with that used at the source. In a composite video signal a positive voltage is used for picture information, whilst the synchronisation pulses are negative, making it easy to separate the components at a receiver. The horizontal synchronisation pulse occurs at the end of each line whilst the vertical synchronisation pulse appears at the end of the last line in each field.

In reality, the vertical synchronisation ‘pulse’ consists of a train of pulses, each identical to the horizontal pulse. This rapidly repeating sequence is detected in the receiver by using a process known as integration. Such narrow pulses, effectively at a high frequency, avoid any problems that could be caused by limitations in the low-frequency performance of the communications channel.

Analogue Standards

Before working with any type of video equipment it can be useful to understand some of the variations in technical standards that exist throughout the world. The main differences between these standards lie in the method of colour-encoding, the resolution and the frame rate.

The common standards are:-

Cinema Film 24 frm/s

Traditionally, a film is viewed using a mechanical projector that shows 24 complete frames every second. A film can be converted to PAL or SECAM video by running at 25 frm/s: although this slightly increases the pitch of sounds and reduces running time, the results are often acceptable.
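The size of these effects is easily calculated. The following Python sketch, in which the function name and the 120-minute running time are merely examples, shows the change in running time and pitch when 24 frm/s film is played at 25 frm/s.

```python
import math

FILM_RATE = 24  # frm/s, cinema film
PAL_RATE = 25   # frm/s, PAL or SECAM video

def speed_up_effects(running_time_min):
    """Return (new running time in minutes, pitch rise in semitones)
    when 24 frm/s film is simply played at 25 frm/s."""
    speed = PAL_RATE / FILM_RATE
    new_time = running_time_min / speed
    semitones = 12 * math.log2(speed)
    return new_time, semitones

new_time, semitones = speed_up_effects(120)
print(f"A 120 minute film runs for {new_time:.1f} minutes")
print(f"Pitch rises by {semitones:.2f} of a semitone")
```

The film runs about 4% faster, which shortens a two-hour film by almost five minutes and raises the pitch by rather less than a semitone.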

For other TV standards, or where exact pitch is required, the film is played at the correct speed and is then scanned using a flying-spot device, also known as a telecine machine. This is arranged so as not to scan the film whenever a frame of film is being advanced forward. This prevents those slow-moving horizontal black bars that can appear when there’s a lack of picture synchronisation.

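For conversion to NTSC video (see below) a telecine machine must add extra fields to the original material, since NTSC needs roughly 30 video frames for every 24 frames of film. Typically, successive film frames are held for three fields and two fields alternately, so that each group of five video frames contains three non-interlaced frames, each taken from a single film frame, and two interlaced frames that mix adjacent film frames. This technique is commonly known as 3:2 pull-down.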
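The pull-down sequence is easier to follow when written out. In this Python sketch, which uses illustrative names throughout, four film frames labelled A to D are spread across ten fields and then paired up into five video frames.

```python
def pulldown_32(film_frames):
    """Hold successive film frames for 3 fields and 2 fields alternately,
    returning the resulting list of video fields."""
    fields = []
    for i, frame in enumerate(film_frames):
        repeat = 3 if i % 2 == 0 else 2
        fields.extend([frame] * repeat)
    return fields

fields = pulldown_32(["A", "B", "C", "D"])

# Pair up consecutive fields to form interlaced video frames
video_frames = [tuple(fields[i:i + 2]) for i in range(0, len(fields), 2)]

for pair in video_frames:
    kind = "non-interlaced" if pair[0] == pair[1] else "interlaced"
    print(pair, kind)
```

Four film frames become five video frames, three of which come from a single film frame whilst two mix adjacent film frames. The same ratio turns 24 frm/s into a nominal 30 frm/s.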

NTSC 29.97 frm/s Drop Frame

This is the American National Television System Committee (NTSC) standard for colour television: it’s sometimes referred to as ‘30 frm/s’ even though it doesn’t actually use this rate. To keep the frame count in step with real time, the first two frame numbers in every minute (but not in every tenth minute) are skipped, or ‘dropped’, in the count; no picture frames are actually discarded. This means that a total of 108 frame numbers are dropped in every hour.

The image has a standard aspect ratio and 525 horizontal lines, although only 484 lines (or fewer) are used for the image itself, with the remaining 41 lines switched to ‘black’ to create an interframe blanking period. A digitised NTSC image usually occupies 640 × 480 pixels.

NTSC 29.97 frm/s Non-Drop

An alternative version of NTSC in which one hour of video time corresponds to one hour plus 3.6 seconds of real time. This provides colour pictures without the complication of dropping frames.
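The figures quoted for the drop frame and non-drop versions can be checked with a short calculation. This Python sketch assumes the exact NTSC colour frame rate of 30000/1001 frm/s, which is usually rounded to 29.97; the function names are illustrative.

```python
from fractions import Fraction

NTSC_RATE = Fraction(30000, 1001)  # exact rate, usually quoted as 29.97 frm/s

def dropped_per_hour():
    """Count the frame numbers skipped in one hour of drop frame counting:
    two in every minute except minutes 0, 10, 20, 30, 40 and 50."""
    return sum(2 for minute in range(60) if minute % 10 != 0)

def non_drop_hour_seconds():
    """Real duration, in seconds, of 'one hour' counted at 30 frame
    numbers per second without dropping any."""
    frames_counted = 60 * 60 * 30
    return float(frames_counted / NTSC_RATE)

print(dropped_per_hour())        # 108
print(non_drop_hour_seconds())   # 3603.6
```

The 108 frame numbers dropped each hour correspond almost exactly to the 3.6 seconds by which non-drop counting falls behind real time.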

NTSC 30 frm/s

This is the form of NTSC for monochrome TV, or colour TV where precise time lock isn’t required.

PAL 25 frm/s

The Phase Alternate Line (PAL) colour television standard is used by countries in the European Broadcasting Union (EBU) and is defined by the International Radio Consultative Committee (CCIR). The image has a standard aspect ratio and contains 625 horizontal lines, although some of these are switched to ‘black’ to create the blanking period. A digitised PAL image should occupy 768 × 576 pixels, although most systems use 720 × 576 pixels.

The colour-encoding system in PAL is derived from NTSC but is modified to reverse the phase of the colour signal on alternate lines, eliminating changes in colour caused by phase shifts in the signal. For this reason engineers in Europe often refer to NTSC as Never Twice the Same Colour.

SECAM 25 frm/s

The French Sequential Colour with Memory (SECAM) system uses the same image size as PAL but employs a different colour-encoding method. A digitised image should occupy 768 × 576 pixels, although most systems use 720 × 576 pixels.

Digital Standards

Digital video (DV) provides a higher quality of reproduction than analogue technology, although most digital standards are derived from analogue systems, using either interlaced or progressive scanning. The frame rate and scanning method are given as a number and letter, as in 25p, indicating 25 frm/s with progressive scanning, or 50i, which uses a frame rate of 25 frm/s with interlacing.

The following digital standards are often encountered:-

High Definition (HD)

This appears in 24p, 25p, 30p and 50i formats in European and Australasian countries, these scanning rates relating to the standard film and analogue PAL frame rates. In addition, the USA and Japan use the 60i format, which is based on the analogue NTSC standard. However, all images consist of 1920 × 1080 pixels, ensuring easy conversion between the various systems.


NTSC DV

This system provides an image of 720 × 480 pixels, with the same number of pixels for the height as in a converted analogue NTSC image. Since square pixels would normally give an aspect ratio of 3:2, this system employs oblong pixels, narrower than they are tall, to fit the image to the required 4:3 aspect.


PAL DV

This system provides an image of 720 × 576 pixels. Since square pixels would give an aspect ratio of 5:4, this system employs oblong pixels to ‘stretch’ the image to the required 4:3 aspect.

PAL DV Wide Screen Format

This uses the same number of pixels as PAL DV, but provides an aspect ratio of 16:9 by employing pixels whose width is 1.422 times their height.
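The shape of pixel needed in each DV format follows from simple geometry. The Python sketch below assumes that the whole pixel grid fills the whole picture, which is a simplification of real systems; the function name `pixel_aspect` is illustrative.

```python
from fractions import Fraction

def pixel_aspect(width_px, height_px, display_aspect):
    """Width-to-height ratio that each pixel needs so that a grid of
    width_px x height_px pixels fills a picture of the given aspect."""
    return display_aspect / Fraction(width_px, height_px)

formats = [
    ("NTSC DV 4:3", 720, 480, Fraction(4, 3)),
    ("PAL DV 4:3", 720, 576, Fraction(4, 3)),
    ("PAL DV 16:9", 720, 576, Fraction(16, 9)),
]
for name, w, h, aspect in formats:
    par = pixel_aspect(w, h, aspect)
    print(f"{name:12s} pixel width/height = {par} = {float(par):.3f}")
```

A result below 1 indicates pixels that are narrower than they are tall, as in NTSC DV, whilst a result above 1 indicates wide pixels, as in both PAL DV formats.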

Web and CD-ROM

Material for this kind of media is normally based on analogue PAL or NTSC standards, although frequently ‘stripped down’ to reduce the amount of data. Unlike traditional analogue video signals, interlacing isn’t normally used. Common file formats include QuickTime and MPEG.

Many consumer products, including analogue television and CD-ROMs, as well as Web content, require images that are made of square pixels. This means that the oblong pixels in DV material can produce pictures of a distorted shape. As a result, special software, such as Media Cleaner Pro, must be used to convert material containing such pixels. This application can convert a DV image into standard PAL or NTSC format and can also remove the borders used in DV material.
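The image size needed after such a conversion can be worked out from the number of lines and the intended aspect ratio. This Python sketch only calculates the target size in square pixels and says nothing about how the picture is actually resampled; the function name is illustrative and isn’t taken from any particular application.

```python
from fractions import Fraction

def square_pixel_size(height_px, display_aspect):
    """Return the (width, height) in square pixels of an image that keeps
    the same number of lines and has the given aspect ratio."""
    width_px = round(height_px * display_aspect)
    return width_px, height_px

print(square_pixel_size(576, Fraction(4, 3)))   # (768, 576)
print(square_pixel_size(480, Fraction(4, 3)))   # (640, 480)
print(square_pixel_size(576, Fraction(16, 9)))  # (1024, 576)
```

The first two results match the sizes given earlier for digitised PAL and NTSC images.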

Video Cameras

The earliest cameras were based on cathode ray tube (CRT) technology. Unfortunately, such devices often demand very specific light conditions and can be damaged by mechanical shock. Most modern cameras employ solid-state technology, with charge-coupled devices (CCDs) or metal oxide semiconductor (MOS) sensors. These photoconduction devices are more robust than tube-based cameras and last longer. In addition, they produce less background ‘noise’, don’t retain an image between scans and don’t ‘overload’ when exposed to high levels of lighting.

Some colour cameras have three tubes or sensors. The light passing through the camera lens is separated into red, blue and green colours by a system of red and blue dichroic mirrors. The outputs of the three sensors are then matrixed together so as to create a composite video signal containing both luminance (brightness) and chrominance (colour hue and saturation) components.

Avoiding ‘hand wobble’ whilst using a video camera can be very difficult. Fortunately, this problem can be eliminated altogether by using a camera stand. In addition, some cameras incorporate an optical image stabilisation system (OIS) that reduces the effect of your jitters.

Display Devices

Traditionally, all televisions contained a cathode ray tube (CRT). The earliest kind of colour CRT, known as the shadow mask tube, employs ‘triplets’ of coloured phosphor dots and three electron beams. Recently, other tubes have come into favour, such as Sony’s single-gun Trinitron, which uses alternate strips of coloured phosphor. In both devices, the electron beam (or beams), whose energy is controlled by the video signal, causes the phosphors to emit coloured light.

Larger images can be produced by using a projection television, although these tend to be cumbersome. In older devices, three separate high-intensity CRTs are used, one for each primary colour, back-projected via a mirror system onto a translucent screen. However, such televisions have been largely replaced by video projectors that employ non-CRT technology and a conventional screen.

Recent television receivers and video devices employ a liquid crystal display (LCD) screen. Such a device consumes far less energy than a CRT and provides a flat display with minimum depth. Unfortunately, LCD screens can be expensive, although the prices continue to fall.

©Ray White 2004.