Sound

Most modern computers incorporate audio hardware for recording and replaying sound. Even if you don’t have a recording facility, you can play audio CDs and transfer audio tracks from such discs into a sound utility where you can manipulate the material or create sound samples.

The sound that arrives at your computer’s input is digitised and can be recorded onto your hard disk as a sound file. Once in this form, it can also be processed by means of a sound utility. When a sound is played the data is converted back into an analogue signal that appears at the output connector.

With more advanced software, normally operating in conjunction with special hardware, you can use your computer as a multi-track recording studio. The sound samples that you’ve manufactured can then be ‘fired off’ and used within a musical sequencing application.

  Inside a Recording Application

You can convert an audio signal into a sound file with a sound recording application. The window shown below is from Coaster (Christian Roth), a neat Classic Mac OS application.

As you can see, the Sound Input is set to Built-in Mic, although some computers don’t come with this kind of microphone. The Sample Rate is set to 44100, equal to 44.1 kHz, and the Sample Size is 16, which means that 16-bit recording is to be used, thereby ensuring CD-quality sound.

You can also use rates of 22050 (22.05 kHz) or 11025 (11.025 kHz), although these give a reduced high-frequency response, giving an increasingly muffled sound quality. Similarly, you can use 8-bit recording, although this results in distortion that makes the sound ‘granular’.
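
As a rough illustration of these trade-offs, the sketch below applies two textbook rules of thumb rather than anything specific to Coaster: the highest frequency a recording can hold is half the sample rate, and each bit of sample size adds about 6 dB of dynamic range. The function names are hypothetical, chosen only for this example.

```python
import math

def highest_frequency_hz(sample_rate_hz):
    """Upper frequency limit (the Nyquist limit) for a given sample rate."""
    return sample_rate_hz / 2

def dynamic_range_db(sample_size_bits):
    """Approximate dynamic range of linear samples: 20 * log10(2 ** bits)."""
    return 20 * math.log10(2 ** sample_size_bits)

# The three sample rates mentioned in the text
for rate in (44100, 22050, 11025):
    print(rate, "Hz sampling ->", highest_frequency_hz(rate), "Hz maximum")

# 16-bit versus 8-bit recording
for bits in (16, 8):
    print(bits, "bit ->", round(dynamic_range_db(bits), 1), "dB")
```

Halving the sample rate halves the highest frequency that survives, which is why the lower rates sound progressively duller.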

This application includes sliders for adjusting the Input Gain and provides meters for checking the stereo input levels. Although the red record button isn’t activated, the meters still show the levels, which have been excessive on the left-hand channel, as shown by the red clipping indicator.

Once you’ve set the signal levels you can hit the record button, after which the material is saved in a sound file. In this instance, the file has been assigned a Mac OS creator code of TVOD. This means that when you double-click on the sound file it’ll be opened in QuickTime Player.

Standard Sound File Formats

Applications such as Coaster (see above) create files that conform to the Audio Interchange File Format (AIFF) standard. When used for CD-quality sound, this kind of document is the same as that used for each track on an audio CD. In fact, AIFF files generated from any application can be easily transferred to a CD-R or CD-RW using a CD burner, so as to create your own music disc.

Unfortunately, AIFF documents use up acres of disk space, around 5 MB for every minute of recorded mono sound, or, putting it another way, roughly 88 KB per second (44,100 samples of two bytes each). For stereo you must multiply this by two and for a multi-track recording application you must multiply it by the number of tracks.
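
The size of uncompressed sound can be estimated by multiplying the sample rate, the bytes per sample, the number of channels and the duration. The following is a minimal sketch of that arithmetic; the function name is hypothetical and the small file header is ignored.

```python
def uncompressed_bytes(seconds, sample_rate_hz=44100, sample_size_bits=16, channels=1):
    """Size of the raw sample data in bytes, ignoring the file header."""
    bytes_per_sample = sample_size_bits // 8
    return seconds * sample_rate_hz * bytes_per_sample * channels

mono_minute = uncompressed_bytes(60)
stereo_minute = uncompressed_bytes(60, channels=2)
eight_track_minute = uncompressed_bytes(60, channels=8)

print(mono_minute / 1_000_000, "MB per minute, mono")
print(stereo_minute / 1_000_000, "MB per minute, stereo")
print(eight_track_minute / 1_000_000, "MB per minute, eight tracks")
```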

Some applications save sounds as a QuickTime movie, which is usually the same size as an AIFF unless compressed with a suitable codec. Sounds can also be stored in an MPEG-1 Layer-3 file, also known as an MP3 file. Thankfully, this kind of document is usually a tenth of the size of an AIFF, although the lossy compression that it employs can cause some degradation in sound quality.

  Multi-track Recording

In a multi-track sound application you can build up layers of sound in a similar way to using a multi-track tape machine. During playback the sounds from each track can be mixed together as required. The application can be set up to ‘drop’ into overdub mode at a specific time, allowing extra material to be recorded onto those tracks that you’ve already enabled for recording.

Whatever hardware is used, the number of tracks that you can play simultaneously is limited by the computer’s memory and its drive speed. All PowerPCs and later AV models support multi-track applications directly, usually with up to 32 tracks, although these limitations apply to old machines:-

Model               Tracks
Quadra 660AV        6
Quadra 840AV        8
PowerMac 6100/66    8-10

Software and Hardware

If you use a stereo or multi-track audio card, or an audio adaptor box connected via a port, you’ll need driver software and compatible applications. A sound card or FireWire device that incorporates digital signal processing (DSP) lets you apply more dramatic effects to your recordings, but can only be used with compatible applications and plug-ins (see below).

The most common proprietary systems include Time Division Multiplexing (TDM) and Real Time Audio Suite (RTAS), both used in Digidesign products, Virtual Studio Technology (VST) from Steinberg, which employs Audio Stream Input Output (ASIO) for multi-track recording, and MOTU Audio System (MAS), as used in products from Mark of the Unicorn. Some hardware also works with Core Audio in Mac OS X, which accommodates any number of audio channels and all sample rates up to 192 kHz, with 24-bit sampling and 32-bit processing.

Plug-ins

Audio and MIDI sequencing applications often accept plug-ins that provide extra digital signal processing (DSP) capability. Traditional types include MAS, RTAS and VST, matching the hardware in your sound card or adaptor, although these formats are being overtaken by Audio Units plug-ins, which work with the Core Audio and Core MIDI elements of Mac OS X.

The following varieties of plug-ins are in common use:-

Audio Units (AU)

As built into Mac OS X and supported by Peak 4, Logic 6 and Digital Performer 4, as well as later versions of these applications. This format is likely to replace most of the older plug-ins for the Mac.

Host Time Division Multiplexing (HTDM)

An updated version of TDM (see below) that also works with older applications designed for TDM plug-ins, including ProTools and some versions of Logic Audio. Unlike the original TDM format, these plug-ins use the computer’s processor for DSP, rather than the card’s hardware, and are really a hybrid of RTAS and TDM technology (see below).

Time Division Multiplexing (TDM)

This common format employs the hardware in a TDM card for signal processing. Note that some TDM plug-ins are also known as virtual instruments.

MOTU Audio System (MAS)

As used with products from Mark of the Unicorn, such as Digital Performer. A special MAS plug-in allows VST plug-ins to also be used with Performer, although the results can be variable.

Real Time Audio Suite (RTAS)

A plug-in that employs the computer for signal processing, as used in the LE version of ProTools.

VST

A popular plug-in, as used in Cubase VST, Nuendo and older versions of Studio Vision and Logic Audio. With Audio Units active in Mac OS X these plug-ins are disabled, although VST-AU Adaptor (FXpansion Audio) can be used to convert any Carbonised plug-in to the Audio Units format. The multi-track form of VST is known as Audio Stream Input Output (ASIO), whilst an improved form of ASIO, known as ASIO 2, reduces the time delays that can be introduced by signal processing.

  Hard Disks and Audio Recording

A hard disk used for serious audio recording can easily become fragmented. As a result, the data for each sound file gets scattered all over the disk. During playback, the drive mechanism must be able to extract the audio data in real time, but if it’s badly fragmented this isn’t possible.

You can reduce the effects of fragmentation by regularly formatting a drive that’s used for audio recording. Since this erases the disk, most people prefer to use a disk optimisation application instead.

  Inside a Sound Utility

A sound utility can often be used to process the sounds in a file or to save them in another file format. Some applications can also extract sounds from various documents or generic sound files, whilst there are Classic Mac OS programs that can get sound resources out of any file, including an application.

The main window of SoundEffects (Alberto Ricci), a Classic Mac OS utility for processing sounds, is shown below. In this instance, a stereo file has been opened and a segment of one of the stereo channels is selected. The top left-hand corner gives details for the file, including its length in bytes. The central boxes give details about the whole of the selected channel while those at the top right show the position and size of the selection: the segment size shown here is measured in samples.

The waveforms show how the signal level of each channel varies with time: the left-hand channel (channel 1) is at the top and the right-hand channel (channel 2) at the bottom. Below, there are buttons to enlarge, reduce or normalise the viewing magnification. To the right of these are scrolling buttons for moving the view through the sample. The usual tape recorder buttons appear below these. The last button lets you play the file or the selected segment continuously as a loop.

In a typical utility of this kind you can usually process either a selected segment or the entire file. Special effects can include:-

Effect        Result
Amplify       Increases volume, possibly causing distortion
Backwards     Plays sound backwards
Channel       Moves sound to another channel
Chorus        Adds replica of sound slightly shifted in time
Dither        Adds background noise to improve subjective quality
Downsize      Reduces sample resolution from 16-bit to 8-bit
Echo          Repeats sound, fading away with time
Fade In       Fades volume from zero to maximum
Fade Out      Fades volume from maximum to zero
Filter        Emphasises or reduces frequency components
Flange        As chorus, but with feedback to create a metallic effect
Keyboard      For ‘playing’ the sound with a musical keyboard
Mono          Converts sound sampled as stereo into mono
Noise         Inserts white noise (hiss) in place of sound
Pan           Moves sound around the stereo ‘stage’
Pitch Bend    Modifies pitch during playback
Resample      Changes the sample rate (resolution)
Reverb        Similar to Echo
Reverse       Same as Backwards
Robotise      Removes tonal components
Silence       Inserts silence in place of sound
Smooth        Removes spiky components
Stereo        Converts mono samples into stereo for later processing
Upsize        Increases resolution from 8 to 16-bit, but not quality
Waveform      Inserts fixed frequency tone in place of sound
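
To give a flavour of what a few of these effects do to the underlying numbers, here is a minimal sketch that treats a sound as a plain list of signed 16-bit samples. It is not taken from SoundEffects or any other utility; the function names and the simple linear fade are assumptions made purely for illustration.

```python
def amplify(samples, gain, limit=32767):
    """Scale each sample, clipping at the 16-bit limits (the cause of distortion)."""
    return [max(-limit - 1, min(limit, int(s * gain))) for s in samples]

def backwards(samples):
    """Return the samples in reverse order."""
    return samples[::-1]

def fade_in(samples):
    """Ramp the volume linearly from zero to maximum across the selection."""
    n = len(samples)
    if n < 2:
        return list(samples)
    return [int(s * i / (n - 1)) for i, s in enumerate(samples)]

def mono(left, right):
    """Average the left and right channels into a single channel."""
    return [(l + r) // 2 for l, r in zip(left, right)]

print(amplify([1000, 20000, -20000], 2))   # loud samples hit the limits
print(backwards([1, 2, 3]))
print(fade_in([100, 100, 100, 100, 100]))
print(mono([100, -100], [300, -300]))
```

A Fade Out could be produced in the same spirit by applying the fade-in ramp to the reversed samples and reversing the result.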

  MP3 Files

MP3 files give near-CD quality music, but are small enough to be downloaded over the Internet. They can be played on a computer using an MP3 application, such as iTunes for the Mac OS or iTunes for Windows. They can also be downloaded via a FireWire or USB port to a portable MP3 player: Apple’s iPod is ideal, as it contains a high-capacity hard disk drive.

MP3s on the Internet

You can locate MP3 files on the Internet by using an MP3 search engine, such as Music Seek, Lycos MP3 search or 2look4. To actually download the sound files you can use an FTP client application, although most people use a Web browser, such as Internet Explorer or Netscape.

AAC Files

Not all Internet sites supply music as standard MP3 files. For example, Apple’s iTunes Music Store provides Advanced Audio Coding (AAC) files, which can be played on recent versions of iTunes, installed on up to three Mac OS machines, burnt onto a CD or transferred to a portable and AAC-compatible MP3 player, such as a ‘version 2’ iPod, or better.

WMA Files

Some sites, such as OD2, a UK service provided by EMI, offer Windows Media Audio (WMA) files, also known as Windows Media 9 (WM9) files. These are protected by Microsoft’s own digital rights management (DRM) system and can only normally be used with a WMA-compatible application or player.

Other Formats

Formats other than MP3, AAC and WMA aren’t widely supported by MP3 player applications and hardware. This is unfortunate, since the Ogg Vorbis format, in particular, provides a particularly good quality of sound.

Sony’s Connect service uses their ATRAC sound format, which only works with Sony hardware. This allows you to store 45 hours of sound on a Hi-MD MiniDisc player or up to 22 hours on a flash card player.

Playing MP3 Files

MP3 files can be played using any MP3 player application, although a reasonably fast computer is required for best results. In the Mac OS you could use QuickTime Player (Apple), although it’s much more sensible to use iTunes, also available as iTunes for Windows. When first used, this excellent application automatically looks for all the MP3s on your drive and presents them as shown below: it can also be used for listening to Internet radio stations.

Using a Portable MP3 Player

You can listen to MP3s ‘on the move’ with a portable MP3 player, although you must first transfer the required tracks from your computer to the player’s memory via a FireWire or USB connection.

Using a CD Player and MP3 Files

Many CD players, in addition to playing standard audio CDs, can play MP3 or WMA tracks that have been recorded on a CD-R or CD-RW data disc. Although less convenient than a portable MP3 player, this lets you store your recordings safely outside of the player and outside of a computer. As with other players (see above), there are often limitations on the files that can be played (AAC files are particularly difficult), whilst older players can have problems with CD-RW discs.

Making your own MP3 Files

To make MP3s you’ll need an MP3 encoder such as iTunes or MPEG Audio Creator. Both can create MP3s from a CD while the latter can record from alternative sound sources.

Sound Files

The following sections describe sound file formats, complete with filename extensions in order of preference and the standard Classic Mac OS type codes. Those shown with a QuickTime icon can be opened using applications in the QuickTime package. Several of the remaining icons are employed by SoundApp, an excellent Classic Mac OS application from Norman Franke.

The most common formats you’ll encounter are:-

  3G Phone Protocol (3GPP)  .3gp  ????

This file format, based on MPEG-4 and capable of conveying video and text as well as audio, is used for transferring information to and from a third generation (3G) mobile phone.

  Audible Audio (AA)  .aa  ????

A special type of file used for talking books at the audible.com Web site. This kind of document often requires matching software, although it’s supported directly in Mac OS X by iTunes 3 or later.

File size varies according to the required sound quality, involving the use of different formats numbered from 1 to 5, where 1 employs the greatest amount of compression. A medium-length book of ‘MP3’ quality can occupy around 20 MB, about one-third the size of an equivalent MP3 file.

  Advanced Audio Coding (AAC)  .m4a  mpg4

This file format is based on the MPEG-4 standard, providing more effective compression than MP3 (see below) whilst retaining a very high quality of sound. Unfortunately, many older MP3 applications and portable MP3 players are incapable of playing files of this kind. The use of recordings in this format can be restricted by means of FairPlay Digital Rights Management (DRM).

  Audio Interchange File Format (AIFF)  .aif/.aiff  AIFF

This kind of file, used on various platforms, including Apple and SGI machines, is the standard sound format for Mac OS X. Unlike the system sounds used in the Classic Mac OS, this kind of document can’t be played by simply clicking on the file: it requires a suitable sound application.

Data is stored as signed 16-bit samples in the data fork, allowing any number of channels at any sampling rate. The duration is restricted only by the maximum file size permitted by the system. The 2 GB limit in the Classic Mac OS lets you record over three hours of high-quality stereo sound.
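
The recording time available under a given file size limit can be checked with a little arithmetic. The sketch below assumes CD-quality stereo (44.1 kHz, 16-bit, two channels) and takes 2 GB to mean 2 × 1024³ bytes; the function name is hypothetical.

```python
def max_duration_hours(file_limit_bytes, sample_rate_hz=44100,
                       sample_size_bits=16, channels=2):
    """Longest recording, in hours, that fits within a file size limit."""
    bytes_per_second = sample_rate_hz * (sample_size_bits // 8) * channels
    return file_limit_bytes / bytes_per_second / 3600

two_gigabytes = 2 * 1024 ** 3   # assumed meaning of the 2 GB limit
hours = max_duration_hours(two_gigabytes)
print(round(hours, 2), "hours of CD-quality stereo")
```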

  AIFF-Compressed (AIFF-C)  .aif/.aifc  AIFC/AIFF

Essentially this is identical to an AIFF file but employs compression complying with the IMA 4:1, MACE 3:1, MACE 6:1 or µ-law standards. The following points refer to the Classic Mac OS:-

  Audio Visual Research (AVR)  .avr  ????

A special format used for mono or stereo, 8 or 16-bit sound, on Atari computers.

  CD Audio Track  None  AudT

The content of a CD audio track is effectively the same as an AIFF or AIFF-C file. In theory, IMA, MACE or µ-law compression can be used, since complementary processing isn’t needed in the player. Unfortunately, such systems don’t give very good results, whilst more advanced codecs, such as QDesign Music 2 or Qualcomm PureVoice, aren’t accommodated by a standard CD player.

An audio CD doesn’t usually incorporate any extra information about the recordings on a disc. However, the CD Database (CDDB) Web site, developed in 1993 by Ti Kan, Graham Toal and Steve Scherf, automatically supplies such information to a CD player application such as iTunes. The site provides the current disc’s title, the artist’s name and the current track name. In the Mac OS, the filenames of the files that represent the tracks are also changed automatically.

  Classic Mac OS System Sound  .sfil  sfil

This type of file, which is unique to the Classic Mac OS, appears as an alert sound in the system’s Sound control panel. You can hear the contents by simply double-clicking on the file.

In common with other Classic Mac OS files, the information is kept in the file’s resource fork. The resource itself has a resource code of snd<space>, where <space> represents a space character.

Most system sound files have a single Type 1 resource, containing a digitised sound sample or a sound generated using frequency modulation (FM) or a wave table. Older files, as designed for use with HyperCard, contain a Type 2 resource that can only contain a digitised sound.

  Dolby AC-3  .ac3  ????

The file format used for high-quality audio on DVD, conveying Dolby Digital 5.1 signals, also known as Left Centre Right Surround (LCRS) sound. This usually requires five loudspeakers as well as a woofer loudspeaker for bass content. A central loudspeaker is placed between the usual stereo speakers and two additional loudspeakers are positioned to the rear of the listener.

  DVI ADPCM Sound  .adpcm  APCM

This Intel format uses a fast form of Adaptive Differential Pulse Code Modulation (ADPCM), lossy 4:1 compression, and 16-bit data. Sounds are often sampled at 8 kHz.

  GSM Sound  .gsm/.au.gsm  GSM<space>

This file uses the European GSM 06.10 standard for speech transcoding, as required for a Global System for Mobile Communications (GSM) digital cellular phone. It employs Residual Pulse Excitation and Long Term Prediction (RPE/LTP) coding at 13 kbit/s. Being optimised for speech, this kind of file is also used by Internet phone applications.

Files with a .au.gsm extension use 33-byte frames, sampled in mono at 8 kHz, and shouldn’t be confused with a standard µ-law file, which has an .au extension (see below). The WAVE version of this GSM file uses a slightly different algorithm for mono sounds sampled at any rate.

  Interchange File Format (IFF)  .iff/.8svc/.8sv/.svx  8SVX

As used in Commodore Amiga computers for mono 8-bit sound at any sampling rate. Samples are encoded as signed values with optional lossy 2:1 compression using the Fibonacci delta algorithm.

  IRCAM  .sf  IRCM

This format is used for academic musical software, employing any sample rate with mono or stereo sound. The data can be 8, 16 or 32-bit, in floating point or linear form.

  MP3  .mp3/.mpg/.mpeg  MPG3/MP3<space>/Mp3<space>/MPEG

This kind of file, devised by the Moving Picture Experts Group (MPEG) of the International Organization for Standardization (ISO) and the International Electrotechnical Commission (IEC), gives better compression than the older MPEG-1 Layer-1 or Layer-2 systems, condensing a CD-quality audio recording to around one eleventh of its original size. Although lossy, the perceptual coding process doesn’t result in a serious loss of quality.
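
The ‘one eleventh’ figure can be related to the bit rates commonly quoted for MP3 files. The sketch below works out the data rate of CD-quality stereo and divides it by eleven; the function name is hypothetical, and the result is only an estimate since real encoders work at a range of bit rates.

```python
def cd_bit_rate_kbps(sample_rate_hz=44100, sample_size_bits=16, channels=2):
    """Data rate of uncompressed audio in kilobits per second."""
    return sample_rate_hz * sample_size_bits * channels / 1000

cd_rate = cd_bit_rate_kbps()
mp3_rate = cd_rate / 11   # the compression ratio quoted in the text

print("CD-quality stereo:", cd_rate, "kbit/s")
print("One eleventh of that:", round(mp3_rate), "kbit/s")
```

The result is close to 128 kbit/s, a setting offered by many MP3 encoders.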

An MPEG-1 Layer-3 file conveys mono or stereo at sampling rates of 32, 44.1 or 48 kHz with a 16-bit resolution. However, Layer-3 can also be used in an MPEG-2 file with rates of 16, 22.05 or 24 kHz, again using 16 bits. To add to the confusion, both MPEG-1 and MPEG-2 files, when used for Layer-3 audio, are known as MP3 files and normally have an .mp3 extension.

  NeXT Sound File  .snd/.nxt  NeXT

As used in the NeXT operating system, although usually the same as a standard µ-law file.

  Ogg Vorbis  .ogg  ????

This file format, which is in the public domain, claims to give better results with music than the MP3 form of lossy coding (see above) and delivers real time streaming. Unfortunately, it’s not widely supported and, at the time of writing, is accommodated by very few hardware players. Based on sampling at 44.1 kHz, it employs data rates of 64 to 500 kbit/s in stereo and 32 to 256 kbit/s in mono.

  Psion Sound File  .wve  PSON/PSI5

As used in the Psion Series 3 and Series 5 personal organisers. The file begins with a short header followed by a-law encoded samples at 8 kHz. The Series 5 organiser uses an EPOC 32 file.

  QuickTime Movie  .mov  MooV

Apple’s format for fast-moving multimedia material, including any combination of movies, sounds or musical sequences. IMA 4:1, MACE 3:1, MACE 6:1 or µ-law encoding can be used if required, although modern systems such as QDesign Music 2 or Qualcomm PureVoice give better results.

  Raw Audio CD Data

As used in CD-ROM authoring applications. Each file contains audio sampled at 44.1 kHz with the low bytes of the 16-bit data sent first. This little-endian format is required by Intel processors.
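
The difference between ‘low byte first’ (little-endian) and ‘high byte first’ (big-endian) storage is easy to see by packing a few 16-bit samples both ways. This sketch uses Python’s standard struct module; the sample values are arbitrary.

```python
import struct

samples = [1, -2, 0x1234]

# "<" selects little-endian (low byte first), ">" selects big-endian
little = struct.pack("<3h", *samples)
big = struct.pack(">3h", *samples)

print("little-endian:", little.hex())
print("big-endian:   ", big.hex())

# Reading the data back with the matching byte order restores the samples
restored = list(struct.unpack("<3h", little))
print(restored)
```

Reading little-endian data as though it were big-endian, or vice versa, swaps the two bytes of every sample and turns the sound into noise.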

  Sound Blaster  .voc  VOC<space>

This format is used with Sound Blaster hardware, as commonly fitted within Windows computers. The sample rate is fixed to a multiple of the clock rate of the hardware, while samples are encoded as signed values. Sounds can be segmented and looped, or silent portions can be added.

  Sound Designer  .sd  SFIL

As used in the professional Mac OS sound editor of the same name. See below.

  Sound Designer II  .sd2  Sd2f

Developed from the original Sound Designer format, allowing any number of channels, rates or bits. Samples are encoded as signed values with other details kept in three STR<space> resources.

  SoundCap  .hcom  FSSD

This kind of file was popular on Classic Mac OS computers prior to the introduction of system sounds. The format is also commonly known as a MacNifty, SoundMaster or SoundWave file, by virtue of the associated hardware and software. The original mono format was introduced with the SoundCap digitiser and contains sounds sampled at 5.6, 7.4, 11.1 or 22.2 kHz.

An uncompressed file uses 8-bit unsigned bytes in the data fork while the compressed variety uses Huffman coding and includes a checksum, as well as information about the sample rate. This kind of file is often known as an HCOM file, since its data begins with these four characters.

A SoundEdit file is similar to an uncompressed SoundCap file, but has information in the resource fork concerning the format, sample rates, looping segments, colours and labels. In a stereo sound file the left and right-hand samples are kept next to each other in the data fork. MACE 3:1 or MACE 6:1 compression, as well as proprietary 4:1 and 8:1 coding systems, can be employed in this kind of file.

Later versions of the SoundEdit application, such as SoundEdit Pro and SoundEdit 16, use files that allow samples at up to 48 kHz with 16-bit resolution.

  Studio Session Instrument  .dewf  DEWF

This format, used for storing digitally sampled instruments in the Super Studio Session application, is similar to a SoundCap file but employs an extra eight-byte header and contains uncompressed sounds.

  Tascam Digital Interface File Format (TDIFF)

As used by Tascam in their digital 8-track tape recorder.

  Text File  .txt  TEXT

An ASCII file in which the first line contains the number of samples in the file. The remaining lines each contain an 8-bit sample, with a default sample rate of 22.255 kHz. This file type can be useful for transferring sounds between different types of computer.
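
A file laid out in this way is simple to read. The sketch below is only an illustration: it assumes that each value is written as a decimal integer that fits in eight bits (signed or unsigned), which the description above doesn’t actually specify, and the function names are hypothetical.

```python
DEFAULT_RATE_HZ = 22255   # the 22.255 kHz default quoted in the text

def parse_text_sound(text):
    """Read a sample count from the first line, then one sample per line."""
    lines = [line.strip() for line in text.splitlines() if line.strip()]
    count = int(lines[0])
    samples = [int(value) for value in lines[1:]]
    if len(samples) != count:
        raise ValueError("sample count does not match the first line")
    # Assumption: values are decimal and fit in eight bits
    if any(not -128 <= s <= 255 for s in samples):
        raise ValueError("sample does not fit in eight bits")
    return samples

def duration_seconds(samples, rate_hz=DEFAULT_RATE_HZ):
    """Playing time of the samples at the given sample rate."""
    return len(samples) / rate_hz

example = "4\n128\n0\n255\n128\n"
samples = parse_text_sound(example)
print(samples, duration_seconds(samples), "seconds")
```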

  Windows WAVE File  .wav  WAVE

This format was devised by Microsoft and IBM for storing sounds on Windows computers, accommodating any number of channels, rates or bits. The samples are encoded as signed values in the little-endian format required by Intel processors.
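
As an illustration, Python’s standard wave module can write and read simple uncompressed WAVE files. The sketch below builds a tiny 16-bit mono file in memory, then inspects the result; the sample values are arbitrary and the file is far too short to be audible.

```python
import io
import struct
import wave

samples = [0, 1000, -1000, 32767]

# Write a 16-bit mono file at 44.1 kHz into an in-memory buffer
buffer = io.BytesIO()
with wave.open(buffer, "wb") as out:
    out.setnchannels(1)
    out.setsampwidth(2)          # two bytes per sample = 16-bit
    out.setframerate(44100)
    out.writeframes(struct.pack("<4h", *samples))   # little-endian signed

data = buffer.getvalue()
print(data[:4], data[8:12])      # the RIFF and WAVE markers in the header

# Read the file back and recover the samples
with wave.open(io.BytesIO(data), "rb") as src:
    channels = src.getnchannels()
    rate = src.getframerate()
    restored = list(struct.unpack("<4h", src.readframes(4)))

print(channels, rate, restored)
```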

Various optional compression algorithms are used, although Microsoft’s Adaptive Differential Pulse Code Modulation (MS ADPCM) with lossy 4:1 compression is the most common. Others include GSM 9.7:1, IMA 4:1, a-law and µ-law.

  µ-Law  .au  ULAW

This kind of file is also known as a Sun Audio file, although it’s also used in the NeXT operating system. It accommodates any number of channels, rates or bits using linear or logarithmic (log) encoding. Note that an 8-bit sample with log encoding has the same dynamic range as a 12-bit linear sample, although log encoding can suffer from noise problems and can be slow to decompress.
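
The idea behind log encoding can be sketched with the continuous µ-law companding formula, using µ = 255 as in US telephony. This is only the mathematical curve: real telephone codecs use a piecewise approximation that works on integers, and the function names here are hypothetical.

```python
import math

MU = 255   # value used for µ-law in US telephone systems

def compress(x):
    """Map a linear sample in the range -1..1 onto the logarithmic µ-law curve."""
    return math.copysign(math.log1p(MU * abs(x)) / math.log1p(MU), x)

def expand(y):
    """Inverse of compress(): recover the linear sample."""
    return math.copysign(math.expm1(abs(y) * math.log1p(MU)) / MU, y)

# A quiet sample at 1% of full scale is lifted well up the coded range
quiet = 0.01
coded = compress(quiet)
print(quiet, "->", round(coded, 3), "->", round(expand(coded), 6))
```

Because quiet signals occupy a much larger share of the coded range than they would with linear coding, eight bits can cover a wider dynamic range than an 8-bit linear sample.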

When a log file is converted into another format its dynamic range can exceed that of an 8-bit system. Even with a conversion factor you’ll get quiet portions or clipping distortion in loud passages. Fortunately, both problems can be avoided by initially converting the file into a 16-bit format.

Most files employ standard µ-law encoding, conforming with conventions established in US telephone systems, providing compression of 2:1. However, the European industry was originally based on a-law encoding, which also gives 2:1 compression. Sadly, this means that a µ-law file can actually contain material with a-law encoding. Some files of this kind have an .al filename extension, implying the use of a-law encoding, unless indicated otherwise by the file’s header.

Reference

MacWorld magazine (UK), IDG Communications, 2003-4

©Ray White 2004.