Most modern computers incorporate audio hardware for recording and replaying sound. Even if you don’t have a recording facility, you can play audio CDs and transfer audio tracks from such discs into a sound utility where you can manipulate the material or create sound samples.
The sound that arrives at your computer’s input is digitised and can be recorded onto your hard disk as a sound file. Once in this form, it can also be processed by means of a sound utility. When a sound is played the data is converted back into an analogue signal that appears at the output connector.
With more advanced software, normally operating in conjunction with special hardware, you can use your computer as a multi-track recording studio. The sound samples that you’ve manufactured can then be ‘fired off’ and used within a musical sequencing application.
You can convert an audio signal into a sound file with a sound recording application. The window shown below is from Coaster (Christian Roth), a neat Classic Mac OS application.
As you can see, the Sound Input is set to Built-in Mic, although some computers don’t come with this kind of microphone. The Sample Rate is set to 44100, equal to 44.1 kHz, and the Sample Size is 16, which means that 16-bit recording is to be used, thereby ensuring CD-quality sound.
You can also use rates of 22050 (22.05 kHz) or 11025 (11.025 kHz), although these progressively reduce the high-frequency response, giving an increasingly muffled sound. Similarly, you can use 8-bit recording, although this results in distortion that makes the sound ‘granular’.
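As a rough illustration of why the lower settings sound worse, the sketch below applies two standard rules of thumb that aren't stated in the text above: a sampled system can only carry frequencies up to half its sample rate, and each bit of resolution adds roughly 6 dB of dynamic range. The function names are purely illustrative.

```python
def nyquist_hz(sample_rate_hz):
    """Highest frequency that can be represented at this sample rate."""
    return sample_rate_hz / 2


def dynamic_range_db(bits):
    """Theoretical dynamic range of an ideal linear quantiser (full-scale sine)."""
    return 6.02 * bits + 1.76


# Compare the three sample rates and two sample sizes mentioned in the text.
for rate in (44100, 22050, 11025):
    print(f"{rate} Hz sampling carries frequencies up to {nyquist_hz(rate):.1f} Hz")

for bits in (16, 8):
    print(f"{bits}-bit samples give about {dynamic_range_db(bits):.0f} dB of dynamic range")
```

On these assumptions, halving the sample rate halves the usable bandwidth, whilst dropping from 16 to 8 bits roughly halves the dynamic range in decibels.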
This application includes sliders for adjusting the Input Gain and provides meters for checking the stereo input levels. Although the red record button isn’t activated, the meters still show the levels, which have been excessive on the left-hand channel, as shown by the red clipping indicator.
Once you’ve set the signal levels you can hit the record button, after which the material is saved in a sound file. In this instance, the file has been assigned a Mac OS creator code of TVOD. This means that when you double-click on the sound file it’ll be opened in QuickTime Player.
Applications such as Coaster (see above) create files that conform to the Audio Interchange File Format (AIFF) standard. When used for CD-quality sound, this kind of document is the same as that used for each track on an audio CD. In fact, AIFF files generated from any application can be easily transferred to a CD-R or CD-RW using a CD burner, so as to create your own music disc.
Unfortunately, AIFF documents use up acres of disk space, around 5 MB for every minute of recorded CD-quality sound, or, putting it another way, roughly 86 KB per second. For stereo you must multiply this by two and for a multi-track recording application you must multiply it by the number of tracks.
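The arithmetic behind these figures can be sketched as follows, assuming uncompressed 16-bit samples at 44.1 kHz and ignoring the small file header. The helper names are hypothetical.

```python
def pcm_bytes_per_second(sample_rate_hz=44100, bits=16, channels=1):
    """Data rate of uncompressed audio: rate x bytes per sample x channels."""
    return sample_rate_hz * (bits // 8) * channels


def megabytes_per_minute(bytes_per_second):
    """Convert a data rate into MB (1024 x 1024 bytes) per minute."""
    return bytes_per_second * 60 / (1024 * 1024)


mono = pcm_bytes_per_second()
stereo = pcm_bytes_per_second(channels=2)
eight_track = pcm_bytes_per_second(channels=8)

for label, rate in (("mono", mono), ("stereo", stereo), ("8 tracks", eight_track)):
    print(f"{label}: {rate} bytes/s, {megabytes_per_minute(rate):.1f} MB per minute")
```

A single channel works out at 88,200 bytes per second, so the disk must sustain that rate multiplied by the number of tracks being played or recorded at once.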
Some applications save sounds as a QuickTime movie, which is usually the same size as an AIFF unless compressed with a suitable codec. Sounds can also be stored in an MPEG-1 Layer-3 file, also known as an MP3 file. Thankfully, this kind of document is usually a tenth of the size of an AIFF, although the lossy compression that it employs can cause some degradation in sound quality.
In a multi-track sound application you can build up layers of sound in a similar way to using a multi-track tape machine. During playback the sounds from each track can be mixed together as required. The application can be set up to ‘drop’ into overdub mode at a specific time, allowing extra material to be recorded onto those tracks that you’ve already enabled for recording.
Whatever hardware is used, the number of tracks that you can play simultaneously is limited by the computer’s memory and its drive speed. All PowerPCs and later AV models support multi-track applications directly, usually with up to 32 tracks, although these limitations apply to old machines:-
If you use a stereo or multi-track audio card, or an audio adaptor box connected via a port, you’ll need driver software and compatible applications. A sound card or FireWire device that incorporates digital signal processing (DSP) lets you apply more dramatic effects to your recordings, but can only be used with compatible applications and plug-ins (see below).
The most common proprietary hardware systems include Time Division Multiplexing (TDM) and Real Time Audio System (RTAS), both used in Digidesign products, Virtual Studio Technology (VST) from Steinberg, which employs Audio Stream Input Output (ASIO) for multi-track recording, and MOTU Audio System (MAS), as used in products from Mark of the Unicorn. Some hardware also works with Core Audio in Mac OS X, which accommodates any number of audio channels and all sample rates up to 192 kHz, with 24-bit sampling and 32-bit processing.
Audio and MIDI sequencing applications often accept plug-ins that provide extra digital signal processing (DSP) capability. Traditional types include MAS, RTAS and VST, matching the hardware in your sound card or adaptor, although these formats are being overtaken by Audio Units plug-ins, which work with the Core Audio and Core MIDI elements of Mac OS X.
The following varieties of plug-ins are in common use:-
As built into Mac OS X and supported by Peak 4, Logic 6 and Digital Performer 4, as well as later versions of these applications. This format is likely to replace most of the older plug-ins for the Mac.
An updated version of TDM (see below) that also works with older applications designed for TDM plug-ins, including ProTools and some versions of Logic Audio. Unlike the original TDM format, these plug-ins use the computer’s processor for DSP, rather than the card’s hardware, and are really a hybrid of RTAS and TDM technology (see below).
This common format employs the hardware in a TDM card for signal processing. Note that some TDM plug-ins are also known as virtual instruments.
As used with products from Mark of the Unicorn, such as Digital Performer. A special MAS plug-in allows VST plug-ins to also be used with Performer, although the results can be variable.
A plug-in that employs the computer for signal processing, as used in the LE version of ProTools.
A popular plug-in, as used in Cubase VST, Nuendo and older versions of Studio Vision and Logic Audio. With Audio Units active in Mac OS X these plug-ins are disabled, although VST-AU Adaptor (FXpansion Audio) can be used to convert any Carbonised plug-in to the Audio Units format. The multi-track form of VST is known as Audio Stream Input Output (ASIO), whilst an improved form of ASIO, known as ASIO 2, reduces the time delays that can be introduced by signal processing.
A hard disk used for serious audio recording can easily become fragmented. As a result, the data for each sound file gets scattered all over the disk. During playback, the drive mechanism must be able to extract the audio data in real time, but if it’s badly fragmented this isn’t possible.
You can reduce the effects of fragmentation by regularly formatting a drive that’s used for audio recording. Since this erases the disk, most people prefer to use a disk optimisation application.
A sound utility can often be used to process the sounds in a file or to save them in another file format. Some applications can also extract sounds from various documents or generic sound files, whilst there are Classic Mac OS programs that can get sound resources out of any file, including an application.
The main window of SoundEffects (Alberto Ricci), a Classic Mac OS utility for processing sounds, is shown below. In this instance, a stereo file has been opened and a segment of one of the stereo channels is selected. The top left-hand corner gives details for the file, including its length in bytes. The central boxes give details about the whole of the selected channel while those at the top right show the position and size of the selection: the segment size shown here is measured in samples.
The waveforms show how the signal level of each channel varies with time: the left-hand channel (channel 1) is at the top and the right-hand channel (channel 2) at the bottom. Below, there are buttons to enlarge, reduce or normalise the viewing magnification. To the right of these are scrolling buttons for moving the view through the sample. The usual tape recorder buttons appear below these. The last button lets you play the file or the selected segment continuously as a loop.
In a typical utility of this kind you can usually process either a selected segment or the entire file. Special effects can include:-
|Amplify||Increases volume, possibly causing distortion|
|Backwards||Plays sound backwards|
|Channel||Moves sound to another channel|
|Chorus||Adds replica of sound slightly shifted in time|
|Dither||Adds background noise to improve subjective quality|
|Downsize||Reduces sample resolution from 16-bit to 8-bit|
|Echo||Repeats sound, fading away with time|
|Fade In||Fades volume from zero to maximum|
|Fade Out||Fades volume from maximum to zero|
|Filter||Emphasises or reduces frequency components|
|Flange||As chorus, but with feedback to create a metallic effect|
|Keyboard||For ‘playing’ the sound with a musical keyboard|
|Mono||Converts sound sampled as stereo into mono|
|Noise||Inserts white noise (hiss) in place of sound|
|Pan||Moves sound around the stereo ‘stage’|
|Pitch Bend||Modifies pitch during playback|
|Resample||Changes the sample rate (resolution)|
|Reverb||Similar to Echo|
|Reverse||Same as Backwards|
|Robotise||Removes tonal components|
|Silence||Inserts silence in place of sound|
|Smooth||Removes spiky components|
|Stereo||Converts mono samples into stereo for later processing|
|Upsize||Increases resolution from 8 to 16-bit, but not quality|
|Waveform||Inserts fixed frequency tone in place of sound|
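To give a feel for what a few of these effects do to the underlying data, here is a minimal sketch that treats a sound as a plain list of signed 16-bit sample values. It isn't how SoundEffects or any particular utility is implemented, and all of the function names are illustrative.

```python
def amplify(samples, gain, limit=32767):
    """Scale each sample; values beyond the 16-bit range are clipped (distortion)."""
    return [max(-limit - 1, min(limit, int(s * gain))) for s in samples]


def backwards(samples):
    """Reverse the order of the samples so the sound plays backwards."""
    return samples[::-1]


def fade_in(samples):
    """Ramp the volume linearly from zero at the start to maximum at the end."""
    n = len(samples)
    if n < 2:
        return list(samples)
    return [int(s * i / (n - 1)) for i, s in enumerate(samples)]


def to_mono(left, right):
    """Average the left and right channels into a single channel."""
    return [(l + r) // 2 for l, r in zip(left, right)]


print(amplify([1000, 20000, -20000], 2))
print(backwards([1, 2, 3]))
print(fade_in([100, 100, 100]))
print(to_mono([100, -100], [300, -300]))
```

Note how Amplify clips the second and third samples at the 16-bit limits, which is the distortion mentioned in the table.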
MP3 files give near-CD quality music, but are small enough to be downloaded over the Internet. They can be played on a computer using an MP3 application, such as iTunes for the Mac OS or iTunes for Windows. They can also be downloaded via a FireWire or USB port to a portable MP3 player: Apple’s iPod is ideal, as it contains a high-capacity hard disk drive.
You can locate MP3 files on the Internet by using an MP3 search engine, such as Music Seek, Lycos MP3 search or 2look4. To actually download the sound files you can use an FTP client application, although most people use a Web browser, such as Internet Explorer or Netscape.
1:3, you must upload one file for every three you download.
550: Permission denied message, indicating that the site needs a special Password and Username (or User ID) for the file. Fortunately, within your application you should find a Transcript window, containing a record of this information or telling you how to obtain it.
Not all Internet sites supply music as standard MP3 files. For example, Apple’s iTunes Music Store provides Advanced Audio Coding (AAC) files, which can be played on recent versions of iTunes, installed on up to three Mac OS machines, burnt onto a CD or transferred to a portable and AAC-compatible MP3 player, such as a ‘version 2’ iPod, or better.
Some sites, such as OD2, a UK service provided by EMI, offer Windows Media Audio (WMA) files, also known as Windows Media 9 (WM9) files. These are protected by Microsoft’s own digital rights management (DRM) system and can only normally be used with a WMA-compatible application or player.
Formats other than MP3, AAC and WMA aren’t widely supported by MP3 player applications and hardware. This is unfortunate, since the Ogg Vorbis format, in particular, provides very good sound quality.
Sony’s Connect service uses their ATRAC sound format, which only works with Sony hardware. This allows you to store 45 hours of sound on a Hi-MD MiniDisc player or up to 22 hours on a flash card player.
MP3 files can be played using any MP3 player application, although a reasonably fast computer is required for best results. In the Mac OS you could use QuickTime Player (Apple), although it’s much more sensible to use iTunes, also available as iTunes for Windows. When first used, this excellent application automatically looks for all the MP3s on your drive and presents them as shown below: it can also be used for listening to Internet radio stations.
The options for burning CDs in iTunes are set under Preferences. The Disc Format can be set to Audio CD (for playing on an audio CD player), MP3 CD (for playing on an MP3-compatible audio player or computer drive) or Data CD or DVD (for playing on a computer using various file formats, such as Audible.com (AA), AAC, AIFF or WAV). Selecting Use Sound Check when using the Audio CD format ensures a consistent volume across the entire CD.
Some albums are designed to be heard as a continuous recording. Unfortunately, setting Gap Between Songs to none in the Audio CD format still leaves a small gap between each track. To avoid this, you must initially ‘rip’ all of the material from the source CD at once by selecting the required tracks and selecting Advanced ➡ Join CD Tracks. Note, however, that this makes it impossible to select a particular track in the final recording.
You can listen to MP3s ‘on the move’ with a portable MP3 player, although you must first transfer the required tracks from your computer to the player’s memory via a FireWire or USB connection.
F. These are inside a Music folder, itself inside a folder called iPod_Control, the latter normally rendered invisible by the system.
Many CD players, in addition to playing standard audio CDs, can play MP3 or WMA tracks that have been recorded on a CD-R or CD-RW data disc. Although less convenient than a portable MP3 player, this lets you store your recordings safely outside of the player and outside of a computer. As with other players (see above), there are often limitations on the files that can be played (AAC files are particularly difficult), whilst older players can have problems with CD-RW discs.
To make MP3s you’ll need an MP3 encoder such as iTunes or MPEG Audio Creator. Both can create MP3s from a CD while the latter can record from alternative sound sources.
The following sections describe sound file formats, complete with filename extensions in order of preference and the standard Classic Mac OS type codes. Those shown with a QuickTime icon can be opened using applications in the QuickTime package. Several of the remaining icons are employed by SoundApp, an excellent Classic Mac OS application from Norman Franke.
The most common formats you’ll encounter are:-
This file format, based on MPEG-4 and capable of conveying video and text as well as audio, is used for transferring information to and from a third generation (3G) mobile phone.
A special type of file used for talking books at the audible.com Web site. This kind of document often requires matching software, although it’s supported directly in Mac OS X by iTunes 3 or later.
File size varies according to the required sound quality, involving the use of different formats numbered from 1 to 5, where 1 employs the greatest amount of compression. A medium-length book of ‘MP3’ quality can occupy around 20 MB, about one-third the size of an equivalent MP3 file.
This file format is based on the MPEG-4 standard, providing more effective compression than MP3 (see below) whilst retaining a very high quality of sound. Unfortunately, many older MP3 applications and portable MP3 players are incapable of playing files of this kind. The use of recordings in this format can be restricted by means of FairPlay Digital Rights Management (DRM).
This kind of file, used on various systems, including Apple and SGI machines, is the standard sound format for Mac OS X. Unlike the system sounds used in the Classic Mac OS, this kind of document can’t be played by simply clicking on the file: it requires a suitable sound application.
Data is stored as signed 16-bit samples in the data fork, allowing any number of channels at any sampling rate. The duration is restricted only by the maximum file size permitted by the system. The 2 GB limit in the Classic Mac OS lets you record over three hours of high-quality stereo sound.
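The three-hour figure can be checked with a short calculation, assuming CD-quality stereo (44.1 kHz, 16-bit, two channels) and ignoring the space taken by the file header. The function name is hypothetical.

```python
def max_duration_hours(limit_bytes, sample_rate_hz=44100, bits=16, channels=2):
    """Longest uncompressed recording that fits within a file size limit."""
    bytes_per_second = sample_rate_hz * (bits // 8) * channels
    return limit_bytes / bytes_per_second / 3600


two_gb = 2 * 1024 ** 3
print(f"2 GB holds about {max_duration_hours(two_gb):.2f} hours of CD-quality stereo")
```

The same function shows that a mono recording at the same quality could run for twice as long within the same limit.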
Essentially this is identical to an AIFF file but employs compression complying with the IMA 4:1, MACE 3:1, MACE 6:1 or µ-law standards. The following points refer to the Classic Mac OS:-
A special format used for mono or stereo, 8 or 16-bit sound, on Atari computers.
The content of a CD audio track is effectively the same as an AIFF or AIFF-C file. In theory, IMA, MACE or µ-law compression can be used, since complementary processing isn’t needed in the player. Unfortunately, such systems don’t give very good results, whilst more advanced codecs, such as QDesign Music 2 or Qualcomm PureVoice, aren’t accommodated by a standard CD player.
An audio CD doesn’t usually incorporate any extra information about the recordings on a disc. However, the CD Database (CDDB) Web site, developed in 1993 by Ti Kan, Graham Toal and Steve Scherf, automatically supplies such information to a CD player application such as iTunes. The site provides the current disc’s title, the artist’s name and the current track name. In the Mac OS, the filenames of the files that represent the tracks are also changed automatically.
This type of file, which is unique to the Classic Mac OS, appears as an alert sound in the system’s Sound control panel. You can hear the contents by simply double-clicking on the file.
In common with other Classic Mac OS files, the information is kept in the file’s resource fork. The resource itself has a resource code of
<space> represents a space character.
Most system sound files have a single Type 1 resource, containing a digitised sound sample or a sound generated using frequency modulation (FM) or a wave table. Older files, as designed for use with HyperCard, contain a Type 2 resource that can only contain a digitised sound.
The file format used for high-quality audio on DVD, conveying Dolby Digital 5.1 signals, also known as Left Centre Right Surround (LCRS) sound. This usually requires five loudspeakers as well as a woofer loudspeaker for bass content. A central loudspeaker is placed between the usual stereo speakers and two additional loudspeakers are positioned to the rear of the listener.
This Intel format uses a fast form of Adaptive Differential Pulse Code Modulation (ADPCM), lossy 4:1 compression, and 16-bit data. Sounds are often sampled at 8 kHz.
This file uses the European GSM 06.10 standard for speech transcoding, as required for a Global System for Mobile Communications (GSM) digital cellular phone. It employs Residual Pulse Excitation and Long Term Prediction (RPE/LTP) coding at 13 kbit/s. Being optimised for speech, this kind of file is also used by Internet phone applications.
Files with a .au.gsm extension use 33-byte frames, sampled in mono at 8 kHz, and shouldn’t be confused with a standard µ-law file, which has an .au extension (see below). The WAVE version of this GSM file uses a slightly different algorithm for mono sounds sampled at any rate.
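The 33-byte frame size and the 13 kbit/s coding rate are consistent with each other, as the sketch below shows. It assumes that each frame covers 160 samples (20 ms of speech), a figure taken from the GSM 06.10 standard rather than from the text above.

```python
import math

FRAME_BYTES = 33            # frame size quoted in the text
SAMPLE_RATE_HZ = 8000       # sample rate quoted in the text
NOMINAL_BIT_RATE = 13000    # 13 kbit/s coding rate quoted in the text
SAMPLES_PER_FRAME = 160     # assumption: one frame covers 20 ms of speech

frames_per_second = SAMPLE_RATE_HZ // SAMPLES_PER_FRAME        # frames each second
coded_bits_per_frame = NOMINAL_BIT_RATE // frames_per_second   # bits of coded speech
bytes_needed = math.ceil(coded_bits_per_frame / 8)             # padded to whole bytes

# Bit rate actually stored in the file, and the saving over 16-bit linear audio.
file_bit_rate = FRAME_BYTES * 8 * frames_per_second
ratio_vs_16_bit = (SAMPLE_RATE_HZ * 16) / file_bit_rate

print(frames_per_second, coded_bits_per_frame, bytes_needed)
print(f"{file_bit_rate} bit/s on disk, roughly {ratio_vs_16_bit:.1f}:1 against 16-bit linear")
```

The stored rate is slightly above the nominal 13 kbit/s because each frame is padded to a whole number of bytes.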
As used in Commodore Amiga computers for mono 8-bit sound at any sampling rate. Samples are encoded as signed values with optional lossy 2:1 compression using the Fibonacci delta algorithm.
This format is used for academic musical software, employing any sample rate with mono or stereo sound. The data can be 8, 16 or 32-bit, in floating point or linear form.
This kind of file, devised by the Moving Picture Experts Group (MPEG) of the International Organization for Standardization (ISO) and the International Electrotechnical Commission (IEC), gives better compression than the older MPEG-1 Layer-1 or Layer-2 systems, condensing a CD-quality audio recording to around one eleventh of its original size. Although lossy, the perceptual coding process doesn’t result in a serious loss of quality.
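The ‘one eleventh’ figure follows from comparing bit rates. The sketch below assumes an encoded rate of 128 kbit/s, a common choice for MP3 files that isn't stated in the text; other rates give different ratios.

```python
def cd_bit_rate():
    """Bit rate of CD-quality audio: 44.1 kHz x 16 bits x 2 channels."""
    return 44100 * 16 * 2


def compression_ratio(encoded_bit_rate):
    """How many times smaller the encoded stream is than CD-quality audio."""
    return cd_bit_rate() / encoded_bit_rate


# 128000 bit/s is an assumed, typical MP3 bit rate.
print(f"CD audio runs at {cd_bit_rate()} bit/s")
print(f"At 128 kbit/s the ratio is about {compression_ratio(128000):.1f}:1")
```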
An MPEG-1 Layer-3 file conveys mono or stereo at sampling rates of 32, 44.1 or 48 kHz with a 16-bit resolution. However, Layer-3 can also be used in an MPEG-2 file with rates of 16, 22.05 or 24 kHz, again using 16 bits. To add to the confusion, both MPEG-1 and MPEG-2 files, when used for Layer-3 audio, are known as MP3 files and normally have an
As used in the NeXT operating system, although usually the same as a standard µ-law file.
This file format, which is in the public domain, claims to give better results with music than the MP3 form of lossy coding (see above) and delivers real time streaming. Unfortunately, it’s not widely supported and, at the time of writing, is accommodated by very few hardware players. Based on sampling at 44.1 kHz, it employs data rates of 64 to 500 kbit/s in stereo and 32 to 256 kbit/s in mono.
As used in the Psion Series 3 and Series 5 personal organisers. The file begins with a short header followed by a-law encoded samples at 8 kHz. The Series 5 organiser uses an EPOC 32 file.
Apple’s format for fast-moving multimedia material, including any combination of movies, sounds or musical sequences. IMA 4:1, MACE 3:1, MACE 6:1 or µ-law encoding can be used if required, although modern systems such as QDesign Music 2 or Qualcomm PureVoice give better results.
As used in CD-ROM authoring applications. Each file contains audio sampled at 44.1 kHz with the low bytes of the 16-bit data sent first. This little-endian format is required by Intel processors.
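The difference between little-endian and big-endian storage can be seen by packing the same 16-bit sample both ways with Python's standard struct module. The sample value is arbitrary and chosen only to make the two bytes easy to tell apart.

```python
import struct

sample = 0x1234  # arbitrary 16-bit sample value

# '<h' packs a signed 16-bit integer with the low byte first (little-endian).
little = struct.pack("<h", sample)
# '>h' packs the same value with the high byte first (big-endian).
big = struct.pack(">h", sample)

print(little.hex(), big.hex())

# Reading the bytes back with the matching byte order recovers the sample.
restored = struct.unpack("<h", little)[0]
print(restored == sample)
```

Reading little-endian data as though it were big-endian, or the reverse, swaps the two bytes of every sample and usually produces loud noise rather than sound.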
This format is used with Sound Blaster hardware, as commonly fitted within Windows computers. The sample rate is fixed to a multiple of the clock rate of the hardware, while samples are encoded as signed values. Sounds can be segmented and looped, or silent portions can be added.
As used in the professional Mac OS sound editor of the same name. See below.
Developed from the original Sound Designer format, allowing any number of channels, rates or bits. Samples are encoded as signed values with other details kept in three
This kind of file was popular on Classic Mac OS computers prior to the introduction of system sounds. The format is also commonly known as a MacNifty, SoundMaster or SoundWave file, by virtue of the associated hardware and software. The original mono format was introduced with the SoundCap digitiser and contains sounds sampled at 5.6, 7.4, 11.1 or 22.2 kHz.
An uncompressed file uses 8-bit unsigned bytes in the data fork while the compressed variety uses Huffman coding and includes a checksum, as well as information about the sample rate. This kind of file is often known as an HCOM file, since its data begins with these four characters.
A SoundEdit file is similar to an uncompressed SoundCap file, but has information in the resource fork concerning the format, sample rates, looping segments, colours and labels. In a stereo sound file the left and right-hand samples are kept next to each other in the data fork. MACE 3:1 or MACE 6:1 compression, as well as proprietary 8:1 coding systems, can be employed in this kind of file.
Later versions of the SoundEdit application, such as SoundEdit Pro and SoundEdit 16, use files that allow samples at up to 48 kHz with 16-bit resolution.
This format, used for storing digitally sampled instruments in the Super Studio Session application, is similar to a SoundCap file but employs an extra eight-byte header and contains uncompressed sounds.
As used by Tascam in their digital 8-track tape recorder.
An ASCII file in which the first line contains the number of samples in the file. The remaining lines each contain an 8-bit sample, with a default sample rate of 22.255 kHz. This file type can be useful for transferring sounds between different types of computer.
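A file laid out in this way is simple to read and write, as the following sketch suggests. It assumes that each sample is written as a decimal integer on its own line; the text above doesn't say how the numbers are formatted or whether they're signed, so treat those details, and the function names, as hypothetical.

```python
def parse_ascii_sound(text):
    """Read a sample count from the first line, then one sample per line."""
    lines = [line.strip() for line in text.splitlines() if line.strip()]
    count = int(lines[0])
    samples = [int(value) for value in lines[1:1 + count]]
    if len(samples) != count:
        raise ValueError(f"expected {count} samples, found {len(samples)}")
    return samples


def format_ascii_sound(samples):
    """Write the sample count followed by one sample per line."""
    return "\n".join([str(len(samples))] + [str(s) for s in samples]) + "\n"


text = format_ascii_sound([128, 140, 155, 140, 128])
print(text)
print(parse_ascii_sound(text))
```

Because the file is plain text, it avoids the byte-order and header differences that complicate transfers between different types of computer.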
This format was devised by Microsoft and IBM for storing sounds on Windows computers, accommodating any number of channels, rates or bits. The samples are encoded as signed values in the little-endian format required by Intel processors.
Various optional compression algorithms are used, although Microsoft’s Adaptive Differential Pulse Code Modulation (MS ADPCM) with lossy 4:1 compression is the most common. Others include GSM 9.7:1, IMA 4:1, a-law and µ-law.
This kind of file is also known as a Sun Audio file, although it’s also used in the NeXT operating system. It accommodates any number of channels, rates or bits using linear or logarithmic (log) encoding. Note that an 8-bit sample with log encoding has the same dynamic range as a 12-bit linear sample, although log encoding can suffer from noise problems and can be slow to decompress.
When a log file is converted into another format its dynamic range can exceed that of an 8-bit system. Even with a conversion factor you’ll get quiet portions or clipping distortion in loud passages. Fortunately, both problems can be avoided by initially converting the file into a 16-bit format.
Most files employ standard µ-law encoding, conforming with conventions established in US telephone systems, providing compression of 2:1. However, the European industry was originally based on a-law encoding, which also gives 2:1 compression. Sadly, this means that a µ-law file can actually contain material with a-law encoding. Some files of this kind have an .al filename extension, implying the use of a-law encoding, unless indicated otherwise by the file’s header.
.snd, in which case most applications assume the file is sampled at 8 kHz, is mono and is µ-law encoded.
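The idea behind logarithmic encoding, and the reason for converting to a 16-bit format, can be sketched with the textbook µ-law companding curve. This is the continuous formula with µ = 255, not the bit-exact table used in real telephone systems or sound files, and the 8-bit code layout here is a simplification chosen for illustration.

```python
import math

MU = 255  # companding constant used by the textbook µ-law curve


def mu_law_compress(x):
    """Map a linear sample in the range -1..1 onto the logarithmic curve."""
    return math.copysign(math.log1p(MU * abs(x)) / math.log1p(MU), x)


def mu_law_expand(y):
    """Inverse of mu_law_compress: recover the linear sample."""
    return math.copysign(math.expm1(abs(y) * math.log1p(MU)) / MU, y)


def encode_8bit(x):
    """Quantise the companded value to a signed code in the range -127..127."""
    return round(mu_law_compress(x) * 127)


def decode_to_16bit(code):
    """Expand a code straight into a 16-bit linear sample."""
    return round(mu_law_expand(code / 127) * 32767)


for level in (0.01, 0.1, 1.0):
    code = encode_8bit(level)
    print(f"level {level}: code {code}, 16-bit value {decode_to_16bit(code)}")
```

A quiet sample at one hundredth of full scale still receives a code of 29, where plain 8-bit linear coding would leave it at 1. Expanding into 16 bits gives enough room to hold both the quiet and the loud values without clipping.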
MacWorld magazine (UK), IDG Communications, 2003-4
©Ray White 2004.