A text file, which is usually identified by an extension of .txt, frequently contains plain text in the form of ASCII characters that anyone can read. However, it’s possible to convey additional information in a text file by means of a markup language. The most common language is HyperText Markup Language (HTML), which is dealt with elsewhere in these guides.
All markup languages employ special groups of characters, usually known as tags, which convey the extra information, such as text formatting and layout. The result, which can be tricky to read in a standard text editor, is normally viewed in a suitable browser application.
The document itself can be a plain ASCII text file or one encoded in a Unicode Transformation Format (UTF), such as UTF-8 or UTF-16, although some applications, including older Web browsers, can’t accommodate every kind of UTF file.
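As a rough illustration of how an application might tell these kinds of file apart, the following Python sketch looks for a byte-order mark at the start of the data. The function names guess_encoding and read_text are hypothetical, and the fallback to ASCII when no mark is found is an assumption made for simplicity. Many UTF-8 files carry no mark at all, so real applications use further checks.

```python
import codecs

# Byte-order marks, checked longest first because the UTF-32 LE mark
# begins with the same two bytes as the UTF-16 LE mark.
_BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32"),
    (codecs.BOM_UTF32_BE, "utf-32"),
    (codecs.BOM_UTF8, "utf-8-sig"),
    (codecs.BOM_UTF16_LE, "utf-16"),
    (codecs.BOM_UTF16_BE, "utf-16"),
]

def guess_encoding(data: bytes) -> str:
    """Return a Python codec name based on a leading byte-order mark."""
    for bom, name in _BOMS:
        if data.startswith(bom):
            return name
    # Assumption: a file without a mark is treated as plain ASCII.
    return "ascii"

def read_text(data: bytes) -> str:
    """Decode the bytes using the guessed codec, dropping any mark."""
    return data.decode(guess_encoding(data))
```

With this approach an unmarked file that contains non-ASCII bytes raises a UnicodeDecodeError rather than being silently misread.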
The creation of a text file containing a markup language can be approached in two ways. By using an advanced text-processor, such as BBEdit, you can work directly on the raw text, although this can be time-consuming and difficult. Alternatively, you can use a WYSIWYG editor suited to your chosen markup language, allowing you to use intuitive methods to create text and other elements.
Traditional markup languages have been developed from the Standard Generalised Markup Language (SGML), using tags to convey text style, formatting and other information. Of these, the most common variations are listed below. In the Classic Mac OS all these files have a type code of TEXT, although they should always be identified by the filename extensions shown here.
.htm/.html
As used to create Web pages on the World Wide Web, employing Uniform Resource Locators (URLs) for links to other documents or graphics, the latter supplied as GIF, PNG or JPEG files. The formatting of a group of pages, or even an entire Web site, can be defined using a Cascading Style Sheet (CSS). If necessary, different CSSs can be used for specific pages.
.htm/.html
A variation of HTML that accommodates animated layers, behaviours and style sheets by using JavaScript to manipulate a CSS. This can be exploited in recent Web browsers, such as Explorer and Netscape, when working with active channels. Since this kind of coding actually contains ‘executables’ there’s some possibility of conveying a virus.
.htm/.html
An improved version of HTML that uses XML tag methods for HTML. As in XML, lowercase tags have to be used, opening and closing tags must be included and all parameters must be inside quotes.
.xml
Unlike HTML, this describes the content of the text, not its actual formatting. Once coded in XML, the material can be converted, as required, into HTML by means of a suitable application. The conversion relies on the XML tags that have been designed for the content, as specified in a Document Type Definition (DTD) file or in a standalone document declaration (SDD) in the XML file itself. The appearance of a document can be modified by using a CSS file or by using another file written in Extensible Stylesheet Language (XSL), itself a variation of XML. Further information about elements can be contained in an Element Definition Document (EDD).
Used for getting access to databases over the Internet, including the SQL variety.
.wml
This is a variation of HTML that’s specially designed to suit the small displays used on Internet-equipped mobile phones, commonly known as WAP phones. All files, both text and graphics, must contain less than 1,400 bytes and all images have to be in black-and-white form, not greyscale.
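The XHTML rules mentioned above (closing tags, quoted parameters and lowercase tag names) can be checked roughly with Python’s standard XML parser. This is only a sketch: the function names are hypothetical, and it tests general XML well-formedness rather than validating a page against the XHTML DTD.

```python
import xml.etree.ElementTree as ET

def is_well_formed(markup: str) -> bool:
    """True if every tag is closed and every parameter is quoted."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        return False

def uses_lowercase_tags(markup: str) -> bool:
    """True if all element names are lowercase (markup must be well formed)."""
    root = ET.fromstring(markup)
    return all(element.tag == element.tag.lower() for element in root.iter())
```

For instance, is_well_formed('<p class=a>Hello</p>') is rejected because the parameter isn’t quoted, while '<p class="a">Hello<br/></p>' is accepted.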
Other languages, often based on one of the above or XML, include:-
.htm/.html
An expanded version of HTML, as used in Web-creation applications such as Claris Home Page and FileMaker Pro, employing tags of the form <X-CLARIS...>, which are ignored by a Web browser.
.cfm/.cfml
Devised by Allaire for use with a Web site that has dynamic page content and Java Database Connectivity (JDBC), allowing anyone who visits the site to retrieve database information.
.htm/.html
A variation of HTML that’s used for i-mode devices.
Another markup language based on XML.
Based on XML, but specially designed for creating, viewing and entering data into complex forms, including legal contracts.
Extensible Style Language .xml
Extensible Style Sheet Language (XSL) .xsl
Extensible Style Sheet Language Template (XSLT) .xslt
More languages based on XML:
.fdml
A variant of HTML that uses special tags, such as [FMP-Record], to link a FileMaker database into a Web page.
An XML-based language, not intended for the Web, but designed for plugging variable data from a database, such as text or images, into a printed document that has fixed elements. The effect is similar to the mail merge feature found in AppleWorks and other general-purpose applications.
Based on PostScript, this language was developed from XML by Adobe for sending vector graphics over the Web. It can be used as an alternative to Flash, which is a popular binary format.
.sgm/.sgml
A general-purpose markup language used for publishing various kinds of documents.
.smil
This XML-based language is understood by RealPlayer G2 and QuickTime 4.1 or later. It can convey synchronised sound and video over the Web, as well as text, images and Flash animations.
.vml
Also based on XML, this alternative to Flash carries vector graphics.
.ivr/.vrml/.wrl/.wrz
This language is used by three-dimensional (3D) Web sites, but also supports ordinary two-dimensional vector images. It can be decoded by Apple’s QuickDraw 3D software.
Although not usually considered a markup language, RTF has many similar characteristics. It’s commonly used with word processors, usually when interchanging documents via Microsoft’s Word application. Each file consists of normal text interspersed with special strings of characters that represent information about font styles and formatting. RTFs are also supported by numerous other applications, including ClarisWorks, MacWrite II, Works and WriteNow.
.rtf
There are several variations in the RTF standard, causing some applications to reject specific files. Information about the contents of a document can be gleaned by examining it with a text editor such as BBEdit. The type of file is indicated in the first line of text, sometimes known as a file header. For example, a document that uses the Windows (ANSI) character set usually begins with:-
{\rtf1\ansi
while one that uses the Mac OS character set should begin with:-
{\rtf1\mac
As this implies, all RTF commands are preceded by a \ (backslash) whilst tabular lists and other data related to commands are held in groups of { and } brackets. Each document begins with a list of fonts, followed by stylesheet data, paper size and margin information. Here’s an example paragraph, with the text written on new lines to make it easier to read:-
\pard…
{\f21 This is }
{\f21 \b bold}
{\f21 now }
{\f21 \i italic}
{\f21 now }
{\f21 \ul underline}
{\f21 .}
\par
where … actually consists of a string of other commands, \f21 relates to the font in the font list at the beginning of the document and \b, \i and \ul indicate font styles. The result is rendered as:-
This is bold now italic now underline.
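To show how this structure can be picked apart, here’s a minimal Python sketch that reports the character set named in a file header and reduces a simple run of groups to plain text. The function names are hypothetical, and the header check assumes that the character-set command (\ansi, \mac, \pc or \pca) follows straight after \rtf1. The sample text is a shortened copy of the paragraph above with a space added before the word now, since only one space after a command acts as a delimiter. A real RTF reader also has to cope with font tables, escaped characters and nested groups, all of which are ignored here.

```python
import re

# A control word: backslash, letters, optional numeric parameter,
# then an optional single space that only acts as a delimiter.
_CONTROL_WORD = re.compile(r"\\[A-Za-z]+-?\d*[ ]?")

def rtf_charset(header: str) -> str:
    """Name the character set declared straight after the version number."""
    match = re.match(r"\{\\rtf\d*\\(ansi|mac|pc|pca)(?![a-z])", header)
    return match.group(1) if match else "unknown"

def strip_simple_rtf(fragment: str) -> str:
    """Reduce a simple run of RTF groups to its plain text."""
    # Raw line breaks carry no meaning in RTF, so drop them first.
    text = fragment.replace("\r", "").replace("\n", "")
    text = _CONTROL_WORD.sub("", text)
    return text.replace("{", "").replace("}", "")

sample = "\n".join([
    r"{\f21 This is }",
    r"{\f21 \b bold}",
    r"{\f21  now }",
    r"{\f21 \i italic}",
    r"{\f21 .}",
])
print(strip_simple_rtf(sample))    # This is bold now italic.
print(rtf_charset(r"{\rtf1\mac"))  # mac
```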
Mac OS X lets you create RTF files using the TextEdit application. A TextEdit document is often created in a special folder, known as a bundle or package, that behaves as a single file. This folder, also containing the PICT image files for the document, has a filename extension of .rtfd.
When viewed in a text editor, a TextEdit RTF file begins with a header that includes a \cocoartf command, reminding us that Cocoa is the native programming environment of OS X. Such documents aren’t acceptable to some applications, although the RTF translator supplied with later versions of MacLinkPlus Deluxe works well. However, embedded graphics, linked to PICT files in the same folder as the RTF, aren’t always understood. As seen in the raw data, these appear as:-
{{\NeXTGraphic __RES1000__.pict}¨}
where __RES1000__.pict is the name of the appropriate graphic file. This kind of line, which is really in a non-standard form, also clearly betrays the NeXT origins of Mac OS X.
This proprietary markup language is used to create documents for the Palm OS. A document containing PML can be created in a text editor such as BBEdit or by creating a word-processing file in the Word application and applying a special macro known as word2pml.
A PML file is usually given an extension of .txt. However, there’s no problem with using .pml, especially in Mac OS X where this can properly identify the file.
The completed PML file can be dropped onto DropBook, a special application that converts the content into an electronic book (eBook). The resultant file, which is identified by a .pdb extension, can be viewed using Palm Reader (Palm), either on your Palm organiser or on a standard computer.
Palm Reader can also be used to view documents that have been created in DOC format, a standard type of file found in the Palm OS. Various applications can create these files, such as Pordible, which converts standard .txt files to .pdb DOC files and vice versa.
A DOC file of this kind has an extension of .pdb. Note that the .prc extension, which is invariably used for Palm OS applications, should be avoided in connection with documents.
PML is very simple and is similar in some ways to RTF. The commands are preceded by a \ (backslash) and, where a parameter is needed, followed by = (equals) and the parameters within straight quotes.
The following table summarises the standard commands:-
Command | Meaning |
---|---|
\p | Page |
\x | New |
\Xn | New |
\c | Center |
\r | Right-align |
\i | Italic |
\u | Underline |
\o | Overstrike |
\v | Invisible |
\t | Indent |
\T="50%" | Indent |
\W="50%" | Embed |
\n | Normal |
\s | Standard |
\b | Bold |
\l | Large |
\axxx | Non-ASCII |
\m="image | Image |
\q="#link | Link |
\Q="link | Anchor |
\- | Soft |
\B | Bold |
\Sp | Superscript |
\Sb | Subscript |
\Fn="foot | Footnote |
\Sd="side | Sidebar |
\Cn="Chapter | Chapter |
Many of these tags are used at the beginning and end of the required effect. So, for example, if you want the word bold in the following to be presented using a bold font you must use:-
but if you want to employ the current font in a bold style you must use:-
Footnote and sidebar content is specified in XML form at the end of the document, as in:-
<sidebar id="sidebar1">
Here's the \itext\i
</sidebar>
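The idea of placing the same tag at the beginning and end of an effect can be sketched in a few lines of Python. The helper names wrap and sidebar are hypothetical, and the sketch simply assumes that any tag passed in is one of the paired commands from the table, such as i, u, b or B.

```python
def wrap(text: str, tag: str) -> str:
    """Surround text with the same PML tag at its start and end."""
    return f"\\{tag}{text}\\{tag}"

def sidebar(sidebar_id: str, content: str) -> str:
    """Build the XML-style block that holds a sidebar's content."""
    return f'<sidebar id="{sidebar_id}">\n{content}\n</sidebar>'

# Rebuilds the sidebar example given above.
print(sidebar("sidebar1", "Here's the " + wrap("text", "i")))
```

Calling wrap("bold", "b") or wrap("bold", "B") gives the two bold forms listed in the table.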
Any ASCII text file can be converted to PML using the Find & Replace feature found in a text editing application such as BBEdit. This can be applied to your own work or to non-copyright material, such as that published by Project Gutenberg. The following procedure can be used:-
1. Remove any - (hyphen) characters from the ends of lines but not a -- (double hyphen).
2. Deal with CR (carriage return) and/or LF (line feed) codes by replacing, in order:-
   - double line endings by •••.
   - single line endings by a space.
   - •••, in this case, by double line endings.
3. Replace ` (backquote) characters by a ' (straight quote).
4. Replace "' (a double-quote followed by a single quote) by "\a160', where \a160 is a special PML code representing NBSP (non-breaking space).
5. Replace '" (a single-quote followed by a double quote) by '\a160".
6. Replace ... (run of periods) by \a133, the latter representing an ellipsis in PML.
7. Place any text that needs emphasis within \i codes, which makes the text italic.
8. Replace text marked by _ (underscore) by normal characters contained within \u codes, which makes the text underlined.
Note that the replacement of single line endings by a space, as shown in the line-ending step, may not always be appropriate, as, for example, in poetic verse.
Peanut Press website at www.peanutpress.com.
©Ray White 2004.