File Basics

The hard disk in a computer can be compared to the filing cabinet in an office, where a computer file equates to a file within such a cabinet. However, in a computer, the file is really an illusion, created for our convenience by the computer’s operating system.

The Disk Catalogue

The information on your hard disk consists of a large amount of data, organised into sectors of a workable size and assembled into files by an invisible file known as a catalogue, also known as a file allocation table (FAT) in some operating systems. In the Classic Mac OS extra invisible files provide a database of information about all of the files on a drive to be used by the Finder.

The location of each file on a computer is commonly defined as a path, starting at the root node (the directory or top level of the drive), passing down levels of index nodes or sub-directories (corresponding to folders), finally reaching the leaf node, the physical location of the file. The main catalogue on a disk is commonly organised as a balanced tree (B-Tree), so called because every leaf node lies at the same depth below the root, which keeps the path to any file at roughly the same length.

Ideally, every file should use sectors in an orderly manner across the disk. In practice, as files are recorded and then removed, some documents employ sectors that are interspersed between those of other files. Such files are then said to be fragmented. This needn’t be a problem, although a separate extents B-tree catalogue is usually required to keep track of all the fragments.
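As a rough sketch of the idea, a file’s place on the disk can be modelled as a list of extents, each being a run of consecutive sectors. The Extent type and the is_fragmented and sector_list helpers are hypothetical names chosen for illustration; they don’t correspond to any real catalogue format, and the sketch assumes that adjacent runs have already been merged:-

```python
from typing import List, NamedTuple


class Extent(NamedTuple):
    """A run of consecutive sectors (hypothetical model, not a real on-disk layout)."""
    start: int   # number of the first sector in the run
    count: int   # how many sectors the run contains


def is_fragmented(extents: List[Extent]) -> bool:
    """A file is treated as fragmented when its sectors form more than one run."""
    return len(extents) > 1


def sector_list(extents: List[Extent]) -> List[int]:
    """Expand the extents into the ordered list of sectors that make up the file."""
    sectors: List[int] = []
    for extent in extents:
        sectors.extend(range(extent.start, extent.start + extent.count))
    return sectors


# One orderly file and one whose sectors are split by those of another file.
tidy_file = [Extent(start=100, count=4)]
scattered_file = [Extent(start=100, count=2), Extent(start=250, count=2)]
```

Here tidy_file occupies a single run, whilst scattered_file needs two entries, which is the kind of record an extents catalogue has to keep.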

File Identification

A computer usually uses numerous kinds of files, making some form of identification essential. Although a file can contain useful information within its header (the first few bytes of data), this isn’t accessible to the user and, being non-standardised, isn’t recognised by some applications. Worse still, if an application is used to open an inappropriate document, it can cause the computer to crash.

Filenames

Depending on your computer’s operating system, there can be restrictions on the number of characters used in a filename or the actual characters employed in a name. Generally speaking, you can use most printable characters in filenames, apart from those used as separators in file paths (see below). In practice, this usually means avoiding : (colon), \ (backslash) or / (forward slash) in your filenames.

Fortunately, your computer will stop you from using unsuitable characters. For example, in the Classic Mac OS, if you enter a : (colon) into a filename it’s automatically replaced by a - (hyphen). In a similar way, Mac OS X forbids you from using a / (forward slash), since this character is used as a separator for file paths in the Unix operating system and also on the Internet.

In the Classic Mac OS, you can also paste special characters, such as a space or carriage return, onto the front of a name so as to organise it in a menu or list, although the effect of such characters in an Open or Save dialogue can be disconcerting. By using special software, you can insert control codes into names, some of which don’t appear as visible characters in the menu, although the actual results vary with the version of Mac OS and the system font that’s in use.

CD Filenames

The filenames on a CD conforming to ISO 9660 are limited to eight characters and a three-character filename extension, as decreed by the rules of MS-DOS. To accommodate the longer names in Windows, discs are normally prepared using a variation of ISO 9660 known as the Joliet format. Such CDs can be viewed on other machines, although only ‘short’ names may appear.

Sadly, the Classic Mac OS doesn’t support the Joliet format. This means that you can only see ‘short’ names, causing most links required for HTML content to fail. To correct this problem you can install Joliet Volume Access (Thomas Tempelmann).

You may also encounter the Mac-extended version of ISO 9660. This allows up to 255 characters in a filename, although such names won’t be recognised by a computer that doesn’t use the Mac OS.

File Paths

In the Classic Mac OS a path is written in the following form, with each item delimited by a colon:-

Macintosh HD:Applications:ClarisWorks:MyBook

where Macintosh HD is the directory, Applications and ClarisWorks are sub-directories and MyBook is the filename of the file. Since the colon is used as a separator it can’t be used in the filename itself. The system for Mac OS X is similar, except that a forward slash separator is used, as in:-

Library/Desktop Pictures/Aqua Blue.jpg

In MS-DOS, as employed on older PC machines, a backslash is used as the separator, as in:-

\DOCS\LETTERS\FD010708.TXT

This absolute path begins with a backslash, starting at the outermost directory, the top level of the current disk drive. A relative path, on the other hand, begins with a directory name, indicating that the path starts in the current directory. Once again, the separator can’t be used in a filename.
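The way a separator divides a path into its levels can be sketched in a few lines of Python. The function names split_classic_mac and split_dos are hypothetical, and the sketch deliberately ignores details such as MS-DOS drive letters:-

```python
from typing import List, Tuple


def split_classic_mac(path: str) -> List[str]:
    """Split a colon-delimited Classic Mac OS path into its levels."""
    return [part for part in path.split(":") if part]


def split_dos(path: str) -> Tuple[bool, List[str]]:
    """Split a backslash-delimited MS-DOS path.

    Returns a flag saying whether the path is absolute (it begins with a
    backslash), followed by the list of levels.
    """
    is_absolute = path.startswith("\\")
    parts = [part for part in path.split("\\") if part]
    return is_absolute, parts


mac_levels = split_classic_mac("Macintosh HD:Applications:ClarisWorks:MyBook")
dos_absolute = split_dos("\\DOCS\\LETTERS\\FD010708.TXT")
dos_relative = split_dos("LETTERS\\FD010708.TXT")
```

Because each function splits on the separator, a filename that contained that character would be cut into two levels, which is why the separator can’t appear inside a name.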

On the Internet a similar system is used for a uniform resource locator (URL), but in this instance a forward slash separator is used, as shown in the following example:-

file:///Macintosh%20HD/Docs/HTML%20Reference/index.html

As you can see, each space in the original filename has been encoded. All characters of this kind are replaced by a % followed by the corresponding hex code, so, for example, # (hash) becomes %23. Once again, the separator character can’t be used inside a filename.
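This encoding can be tried out with the quote and unquote functions in Python’s standard urllib.parse module. The sketch below is only an illustration; the path is the one from the example above and the file:/// prefix is simply added by hand:-

```python
from urllib.parse import quote, unquote

path = "Macintosh HD/Docs/HTML Reference/index.html"

# quote() leaves the forward slash alone by default, so the separators
# survive whilst each space becomes %20.
encoded = quote(path)
url = "file:///" + encoded

# A single character can be encoded in the same way.
hash_code = quote("#")

# unquote() reverses the process.
decoded = unquote(encoded)
```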

Filename Extensions

The most common form of file identification consists of a filename extension. This involves using a filename that ends with a full-stop (period) followed by a recognised set of characters.

In early operating systems, such as MS-DOS on a PC, an uppercase filename of eight characters or fewer, plus a three-character filename extension, is used. So a typical file might have the name:-

MY_FILE.TXT

where the TXT extension indicates that this is a text file. MS-DOS doesn’t permit a name to contain a space, so these are replaced by a _ (underscore character), as shown above. More recent operating systems, such as Windows, allow both spaces and mixed capitalisation, together with longer names (up to 255 characters) and longer extensions, as in:-

My HTML Document.html

where the html extension indicates that this is an HTML file. However, if you wanted to put this file onto a Web page on the Internet you would still have to avoid spaces. Also, since the Internet often disregards the case of letters, it’s convenient to change the name to:-

my_html_document.html

Furthermore, if you want your page to work via an old MS-DOS Web server (they still exist), you must revert to the old ‘eight plus three’ format, usually with lowercase characters, as in:-

my_doc.htm

Whatever server is used, you should only use letters, numbers, +, -, . and _ in filenames. In addition, you shouldn’t start filenames with a full-stop (period). Other characters are best avoided, either because they have special meaning or because they must be encoded within a URL.
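These rules are simple enough to check automatically. The following sketch uses two hypothetical helpers, web_safe_name and is_eight_plus_three; the second is a simplified test that accepts only letters, numbers, underscores and hyphens, which is stricter than MS-DOS itself:-

```python
import re

# Anything other than lowercase letters, numbers, +, -, . and _ is unwanted.
UNWANTED = re.compile(r"[^a-z0-9+\-._]")

# Up to eight characters, optionally followed by a period and up to three more.
EIGHT_PLUS_THREE = re.compile(r"[A-Za-z0-9_\-]{1,8}(\.[A-Za-z0-9_\-]{1,3})?")


def web_safe_name(name: str) -> str:
    """Lowercase a name, turn spaces into underscores and drop other characters."""
    lowered = name.lower().replace(" ", "_")
    cleaned = UNWANTED.sub("", lowered)
    # A filename shouldn't start with a full-stop.
    return cleaned.lstrip(".")


def is_eight_plus_three(name: str) -> bool:
    """Report whether a name fits the old 'eight plus three' format."""
    return EIGHT_PLUS_THREE.fullmatch(name) is not None


safe = web_safe_name("My HTML Document.html")
```

Note that web_safe_name only tidies a name; shortening it to fit the ‘eight plus three’ format still has to be done by hand.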

Classic Mac OS File Identification

The Classic Mac OS uses its own system of file identification to automatically connect each document to its parent application. To understand this we must look more closely at the sophisticated workings of the system. Firstly, you should appreciate that each file in the Classic Mac OS can consist of up to two files, both of which have the same name, although the user only sees one document.

The first of these files, the resource fork, contains standardised ‘building blocks’ of 680x0 executable code, as well as other components used in applications, such as images and sounds. These elements are known as resources, each of which is identified by a four-letter code.

The second file, known as the data fork, is usually similar to the data files used on other computer platforms. Most Mac documents use only the data fork whilst most application programs employ only a resource fork, although some files use both types of fork.

The Desktop File

Most applications incorporate a bundle resource, identified by the BNDL code, which identifies the documents and icons related to the application. When such an application is first installed, this information is transferred to a set of files collectively known as the desktop file. The Finder can then rapidly identify a document or related icon without having to open the application’s file.

On a standard hard disk, the desktop file actually consists of two invisible documents called Desktop DB and Desktop DF, whilst a diskette contains a single file called Desktop. The letters DB stand for Desktop BNDL, which means that this file stores all of the relationships between applications and associated documents, whilst the Desktop DF file contains a complete record of all the files on the disk.

These files provide a database of the files on a drive, allowing the Finder to correctly display each item, and also to provide information in each file’s Get Info window. The latter gives the file’s Name, Size, Where (its location on the disk), Date Created, Date Modified, Version, Label and Comments, as well as settings for the Locked and Stationery Pad check boxes.

The desktop files also keep a record of the four-letter type codes and creator codes that identify each kind of file, although such information isn’t usually accessible unless the file is examined with FileTyper or a similar utility. It’s these codes that are used to ‘link’ each file to its parent application.

Rebuilding the Desktop

Although newly-installed applications are automatically added to the Desktop database, they aren’t removed after you trash them. This means that you should regularly update the database by rebuilding the desktop, a process usually implemented by selecting Restart in the Finder and holding down Command-Option as the machine starts up. On completion the database should take account of all available applications.

Type Codes

In the Classic Mac OS each file usually has a four-letter type code that describes the type of data that it contains, allowing the Finder to identify the applications that can open the file. For example, a standard text file created by any application has a type code of TEXT.

The following files are used by the Classic system or operate directly from within its Finder:-

Type  File
APPL  Application
cdev  Control panel (old)
APPC  Control panel (modern)
dfil  Desk accessory (old)
APPD  Desk accessory (modern)
FNDR  Finder
FFIL  Font suitcase
shlb  Library
INIT  System extension
appe  System extension (application)
sfil  System sound
zsys  System suitcase

The following examples of generic documents can be opened from within various applications:-

Type  File
EPSF  Encapsulated PostScript File
GIFf  Graphic Interchange File
PICT  Picture (QuickDraw)
ttro  SimpleText Read-only document
TEXT  Text document
TIFF  Tagged Image File Format

whilst the next group are usually opened using a specific application:-

Type  File
CWDB  AppleWorks database
CWGR  AppleWorks drawing
CWPT  AppleWorks painting
CWSS  AppleWorks spreadsheet
CWWP  AppleWorks WP document
XLS5  Excel 5.x worksheet
FMPR  FileMaker Pro document
AGD1  FreeHand document
EPSF  Illustrator document •
DRWG  MacDraw II document
PNTG  MacPaint document
WORD  MacWrite document
ALB6  PageMaker 6.x document
8BPS  Photoshop document
rsrc  ResEdit file
SIT5  StuffIt 5 archive
W8BN  Word 98 document
.WP6  WordPerfect 6.x document

• Some Illustrator files use a non-standard variation of EPSF

Creator Codes

Each document’s four-letter creator code, also known as an application’s signature, links the file to its parent application. It’s also used to identify the application file itself. Hence the ClarisWorks application and all its associated files are assigned a creator code of BOBO.

The Finder in the Classic Mac OS uses the creator code to give each document a correct icon and to ensure that the relevant application is launched when you double-click the file. You can change the code of a document by means of a file utility such as FileTyper. Having done this, the Finder treats the file as belonging to the application with the new code, usually giving it a different icon. The appropriate application will then be launched if you double-click the document.
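The effect of a creator code can be imitated with a simple lookup table. The sketch below is not how the Finder is actually written: APPLICATIONS is a hypothetical stand-in for the desktop database, FileInfo and application_for are invented names, and ttxt is assumed here to be the signature of SimpleText:-

```python
from typing import Dict, NamedTuple, Optional


class FileInfo(NamedTuple):
    """The identifying details of a document (hypothetical structure)."""
    name: str
    type_code: str      # four-letter code describing the data
    creator_code: str   # four-letter signature of the parent application


# Hypothetical stand-in for the desktop database: creator code -> application.
APPLICATIONS: Dict[str, str] = {
    "BOBO": "ClarisWorks",
    "ttxt": "SimpleText",   # assumed signature, used for illustration only
}


def application_for(document: FileInfo) -> Optional[str]:
    """Return the application launched by a double-click, or None if unknown."""
    return APPLICATIONS.get(document.creator_code)


book = FileInfo(name="MyBook", type_code="CWWP", creator_code="BOBO")

# Changing the creator code, as a utility such as FileTyper would,
# hands the document over to a different application.
retagged = book._replace(creator_code="ttxt")
```

Only the creator code is consulted in this sketch; whether the newly chosen application can really read the data is a separate matter, governed by the type code and the contents of the file.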

The following creator codes are used by the operating system:-

CreatorFiles
MACSSystem software
movrFonts, desk accessories, sound files

Multiple Applications

More than one application with the same creator can reside on a drive at once, although this can cause confusion if the applications contain BNDL resources that point to other files. In most instances the last installed application takes precedence, even though this may not be the latest version.

Version Codes

In most instances, different versions of the same application use an identical creator code. However, some products use specific codes for given versions of a program. For example, PageMaker 5 has a code of ALD5 but PageMaker 6 uses ALD6. Although this adds a complication, it allows two versions of an application to be used at once without confusion over file identity. Creator codes that identify a specific version often end in a number, although this needn’t correspond to the actual version number. Worse still, some older versions of an application may use an entirely different code.

Most modern applications can open older files that possess a recognised code, although this isn’t always the case. In files created by an application the creator code always matches the contents of the file. However, if the code has been changed manually, it may indicate a different version to the actual contents of the file. The effect of this is dependent on the application: if the document contains a header that defines the file’s content the application may ignore the code and open the file anyway. However, if the application relies entirely on the creator code the whole thing can go horribly wrong.

Codes and Filename Extensions

The creator codes in the Classic Mac OS aren’t recognised by non-Macintosh computers. As a result, such codes are lost when documents are transferred to other computers, usually via networks or over the Internet. In these circumstances, only the filename extension identifies each file. Unfortunately, this is rather too easy for the user to change. To avoid such problems, you should always add the correct filename extension to your files, even if they’re used only on a Mac OS computer.

Although at the time of writing Mac OS X supports creator codes, they’re not compulsory, which means that filename extensions should always be used. Fortunately, applications often append them automatically, while OS X can hide them away, greatly improving the working environment. In fact, the correct use of extensions has made creator codes obsolete: the complications caused by conflicts between extensions and creator codes can be avoided by simply not using them.

File Dates

All modern computer platforms record the date on which each file was created and the date on which it was last modified. This information is stored in the disk drive’s catalogue, or in the desktop files of the Classic Mac OS.

Mac OS Dates

In the Mac OS, the two dates, known as Date Created and Date Modified, can be viewed from within the Finder’s Get Info window, whilst Date Modified also appears in file lists. The original system also recorded the Date Backed Up value, although this is now rarely used.

The Classic Mac OS measures date and time as the number of seconds that have elapsed since midnight on January 1st, 1904. Dividing this count by 86,400, the number of seconds in a day, gives the number of whole days that have elapsed, so giving the date, whilst the remainder indicates the actual time of day. This numbering system was designed to last until 2040 but has been extended in recent versions of the Mac OS.
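The arithmetic can be sketched in Python, assuming that the count of seconds is held as an unsigned 32-bit number, which is what sets the limit in 2040. The names MAC_EPOCH, mac_seconds_to_datetime and days_and_time are hypothetical, and time zones are ignored:-

```python
from datetime import datetime, timedelta
from typing import Tuple

MAC_EPOCH = datetime(1904, 1, 1)   # midnight on January 1st, 1904
SECONDS_PER_DAY = 86_400


def mac_seconds_to_datetime(seconds: int) -> datetime:
    """Convert a count of seconds since the 1904 epoch into a date and time."""
    return MAC_EPOCH + timedelta(seconds=seconds)


def days_and_time(seconds: int) -> Tuple[int, int]:
    """Split the count into whole days elapsed and seconds into the final day."""
    return divmod(seconds, SECONDS_PER_DAY)


# The largest value that fits in an unsigned 32-bit number (an assumption
# about the storage size, as noted above).
last_moment = mac_seconds_to_datetime(2**32 - 1)
```

Under that assumption, last_moment falls on February 6th, 2040, at 06:28:15.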

When required, this form of date and time is converted into a localised format, usually based on the Gregorian calendar, consisting of years, months, days, hours, minutes and seconds. In Mac OS 8.1 or higher, disks are usually prepared using HFS+ (Mac OS Extended) formatting. This mechanism stores the dates and times for each file in terms of Greenwich Mean Time (GMT), employing the settings in the Date & Time control panel to generate the localised dates and times. Unfortunately, the displayed values change if the Time Zone settings are altered in the Date & Time control panel. If this goes wrong, some backup applications and other software can behave in unexpected ways.

Windows Dates

In a PC, each file is given a time stamp, accommodating dates from January 1st, 1980 to the end of the year 2099. The time stamp is created using the following equations:-

Date = day number + (32 × month number) + (512 × (year − 1980))
Time = (seconds/2) + (32 × minutes) + (2048 × hours)
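A time stamp of this kind can be built and taken apart as shown in the following sketch. It assumes the usual layout of two 16-bit values, in which the day and the halved seconds each occupy five bits, the month four bits and the minutes six bits, so that the hours are multiplied by 2048. The function names are hypothetical:-

```python
from typing import Tuple


def pack_dos_date(year: int, month: int, day: int) -> int:
    """Combine a date into a single number: day, then month, then year."""
    return day + 32 * month + 512 * (year - 1980)


def pack_dos_time(hours: int, minutes: int, seconds: int) -> int:
    """Combine a time into a single number; seconds are stored in units of two."""
    return seconds // 2 + 32 * minutes + 2048 * hours


def unpack_dos_date(stamp: int) -> Tuple[int, int, int]:
    """Recover (year, month, day) from a packed date."""
    day = stamp & 0x1F             # lowest five bits
    month = (stamp >> 5) & 0x0F    # next four bits
    year = (stamp >> 9) + 1980     # remaining bits
    return year, month, day


def unpack_dos_time(stamp: int) -> Tuple[int, int, int]:
    """Recover (hours, minutes, seconds) from a packed time."""
    seconds = (stamp & 0x1F) * 2   # lowest five bits, doubled
    minutes = (stamp >> 5) & 0x3F  # next six bits
    hours = stamp >> 11            # remaining bits
    return hours, minutes, seconds


date_stamp = pack_dos_date(2001, 7, 8)
time_stamp = pack_dos_time(13, 45, 30)
```

Since the seconds are halved before being stored, an odd number of seconds can’t be recorded: 23:59:59 comes back as 23:59:58.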

File Attributes

A disk’s catalogue, or the desktop files in the Mac OS, retains additional information about each file. This consists mainly of single bits of data, also known as flags, that set the file’s attributes.

The following attributes apply to the Classic Mac OS, although other platforms use similar flags:-

File Locked

Prevents the file from being modified. This can be set from the Finder’s Get Info dialogue box.

Name Locked

Prevents you from changing the file’s name.

Is Invisible

Makes the file invisible, although it still works normally. This can be useful if a file has to be in a specific location but you don’t actually want to see it or do anything to it.

Has Bundle

A Classic Mac OS flag that indicates that the file can create files that have the same creator code as itself or has a special icon. This usually means that the file is an application program.

Has Been Inited

Indicates that the file is known to the Classic Mac OS Finder, which checks this attribute itself whenever it calls on the item. If you change other attributes, such as Invisible, you should also uncheck this attribute, forcing the Finder to look at it, although some utilities do this automatically.

Has No INITS

Indicates that the file doesn’t contain elements that modify the Classic Mac OS at startup. This attribute should be checked if the file is not a system extension or control panel.

Is Shared

Shows that the file is being shared over a network.

Is Alias

Indicates that the file is an alias.

Is Stationery

Indicates that the file is a stationery document. This means that you can open it as if it were a template, but when you save your completed work the original stationery file remains unaltered. Some Mac OS applications can automatically create a stationery file from within a Save dialogue, although the attribute can also be set directly from within the Finder’s Get Info window.

Has Custom Icon

Shows that a Classic Mac OS custom icon is in use. This attribute is automatically checked whenever you paste graphics into the icon box in the Get Info window. Note that some graphics applications automatically create a special ‘preview icon’ when any file is saved.
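Since each attribute is a single bit, a whole set of them fits into one number. The sketch below models this with Python’s IntFlag; the FileFlags class and the make_invisible helper are hypothetical, and the bit positions are purely illustrative rather than those actually used by the Mac OS:-

```python
from enum import IntFlag, auto


class FileFlags(IntFlag):
    """Illustrative attribute flags; the bit values are NOT those of the Mac OS."""
    FILE_LOCKED = auto()
    NAME_LOCKED = auto()
    IS_INVISIBLE = auto()
    HAS_BUNDLE = auto()
    HAS_BEEN_INITED = auto()
    HAS_NO_INITS = auto()
    IS_SHARED = auto()
    IS_ALIAS = auto()
    IS_STATIONERY = auto()
    HAS_CUSTOM_ICON = auto()


def make_invisible(flags: FileFlags) -> FileFlags:
    """Set Is Invisible and clear Has Been Inited, so the Finder looks again."""
    return (flags | FileFlags.IS_INVISIBLE) & ~FileFlags.HAS_BEEN_INITED


before = FileFlags.FILE_LOCKED | FileFlags.HAS_BEEN_INITED
after = make_invisible(before)
```

Setting a flag uses a bitwise OR and clearing one uses an AND with the inverted flag, leaving all the other attributes as they were.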

©Ray White 2004.