The process of reducing an HTML file to the smallest possible size is known as optimisation. This can be achieved by removing unwanted characters or elements, or by using tricks with graphics, all of which are described below. The job itself can be done by hand or by using an optimisation application, the latter working on all the HTML files used in your website.
Some optimisation applications recognise special tags, such as <noop> and </noop>, which can be placed around areas of HTML ‘code’ that you don’t want to be optimised.
Although spaces, tabs and CR (carriage return) or LF (line feed) characters are useful during the creation of a Web page, they’re of no use to a browser. Hence text in the hierarchic format of:-
<title>
	My Web Page
</title>
can be reduced to a plain format, such as:-
<title>My Web Page</title>
or even to a compact format, in which the line breaks between tags are removed as well.
However, a file without any CR or LF characters can produce odd effects in a browser when you select View ➡ Source. Some browsers appear to freeze (although the problem goes away if you wait long enough) while others exhibit cosmetic peculiarities.
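As an illustration of the difference between these formats, here is a hypothetical head section (the surrounding head element is an assumption added for the example) shown first in hierarchic form and then in compact form:

```html
<!-- Hierarchic format: easy to read, but every tab and line break adds bytes -->
<head>
	<title>
		My Web Page
	</title>
</head>

<!-- Compact format: the same markup on a single line -->
<head><title>My Web Page</title></head>
```

Whitespace inside pre and textarea elements is significant, so such areas should be left alone when compacting a file.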
Unnecessary spaces can also be removed from other statements, so that a line padded with extra spaces is reduced to its shortest equivalent form.
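To illustrate the kind of saving involved, the sketch below uses an invented image tag (the file name and alt text are assumptions, not taken from a real page). HTML permits spaces around the equals sign of an attribute and any number of spaces between attributes, so both lines should behave identically in a browser:

```html
<!-- Before: extra spaces between and around the attributes -->
<img   src = "images/logo.gif"    alt = "Logo"   />

<!-- After: a single space between attributes and none around the equals signs -->
<img src="images/logo.gif" alt="Logo" />
```

Spaces inside a quoted attribute value are part of that value and shouldn’t be touched.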
Some tags don’t actually contribute to the content of a Web page, despite the fact that they contain useful information about the document. For example, many pages don’t include a DOCTYPE declaration, since a browser will display a page without one. However, such lines indicate the form of HTML used in the file: without this information it’s impossible to validate the contents (see below). Bear in mind that many browsers also use the declaration to choose between standards-compliant and ‘quirks’ rendering, so removing it can change a page’s appearance.
Some of the meta tags found in the head of a document are unnecessary, although, once again, they can contain useful information. The following tags can usually be removed without any problems:-
author: Contains the name of the author who created the page, which should be retained in copyright material.
content-type: Indicates the kind of content in the Web page and the character set.
description: Describes the content of the document, which can be of use to a search engine.
generator: Contains the name of the application that created the document, although this isn’t usually required.
keywords: Provides a list of key words for search engines. You should use 75 characters or fewer for this purpose.
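For reference, the meta tags described above would typically look something like the following in the head of a document. The values shown are placeholders invented for this sketch, not taken from a real page:

```html
<head>
<meta name="author" content="A. N. Author" />
<meta http-equiv="content-type" content="text/html; charset=iso-8859-1" />
<meta name="description" content="A short summary of the page" />
<meta name="generator" content="Example Editor 1.0" />
<meta name="keywords" content="html, optimisation, validation" />
<title>My Web Page</title>
</head>
```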
There are also proprietary tags, such as <x-claris...>, which are created by Web-authoring applications to store extra information. Such tags are best removed before finally publishing your website.
Finally, there are those tags that are produced in error by a Web-authoring application. Some such programs produce unwanted internal tags, while others produce empty tags that are entirely redundant.
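Empty tags of this kind might look like the following sketch, in which the element names and text are invented for illustration. The empty b and font elements enclose nothing, so removing them should leave the displayed page unchanged:

```html
<!-- Before: redundant empty elements left behind by an editing application -->
<p><b></b>Welcome to my <font size="2"></font>Web page</p>

<!-- After: the same paragraph with the empty elements removed -->
<p>Welcome to my Web page</p>
```

An empty element that carries a name or id attribute may be the target of a link, so it’s worth checking for these before deleting anything.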
The XHTML 1.0 standard demands that every opening tag has a corresponding closing tag. This means that the following lines:-
<p>First line<p>Second line
must be replaced by:-
<p>First line</p><p>Second line</p>
The older HTML 4.0 specification permits some closing tags to be omitted, including those used for the option elements. Unfortunately, some browsers interpret the junctions of such elements in different ways, meaning that leaving out these tags can significantly change the appearance of a Web page. As a rule, closing tags should always be included.
The closing tags for html, although part of the XHTML and HTML standards, aren’t needed by browsers such as Internet Explorer or Netscape. Even so, little is gained by omitting them.
Some tag attributes can influence the speed of page loading. For example, the use of width and height attributes within an img (image) tag can improve performance, since the browser can lay out the page before the graphic arrives, while deprecated attributes such as hspace can slow things down.
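A minimal sketch of an image tag with explicit dimensions follows. The file name, alt text and pixel sizes are assumptions made for the example, and the values should match the real size of your graphic:

```html
<!-- The browser can reserve a 159 by 120 pixel space before the file arrives -->
<img src="images/photo.gif" width="159" height="120" alt="Holiday photo" />
```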
The repeated use of attributes can also make a page sluggish. For example, you should avoid using valign="top" for every td entry in a table. Instead, you should employ <tr valign="top"> at the start of each row. Similarly, you shouldn’t use deprecated tags, especially the font tag, as in <font face="times">, but should use stylesheets wherever possible.
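These two suggestions can be sketched as follows. The table contents and the style rule are invented for illustration: the first part moves valign from the cells to the row, and the second replaces font tags with a single stylesheet rule.

```html
<!-- Before: the attribute is repeated in every cell -->
<table>
<tr><td valign="top">One</td><td valign="top">Two</td><td valign="top">Three</td></tr>
</table>

<!-- After: the attribute appears once per row -->
<table>
<tr valign="top"><td>One</td><td>Two</td><td>Three</td></tr>
</table>

<!-- Instead of a font tag around each paragraph, a single rule
     placed in the head applies the typeface everywhere -->
<style type="text/css">
p { font-family: times, serif; }
</style>
<p>This paragraph is set in Times without any font tags.</p>
```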
If you’re really desperate to reduce the size of a document you can replace long tags with shorter ones. For example, you could consider replacing all the cite (citation) tags with i (italic) tags. In practice, however, the saving in space is small. In addition, the cite tag conveys a real meaning with regard to the nature of the text, rather than just information about the printed style of the material.
Data can also be minimised by using small graphics. Alternatively, if your page contains repetitive graphics, you can build up parts of a picture within a table element. For example, a framed picture can be created using four images for the frame, which are assembled around the picture in the following table:-
<table border="0" width="159" cellspacing="0" cellpadding="0">
<tr><td colspan="3"><img src="images/title_bar.gif" alt="" /></td></tr>
<tr>
<td><img src="images/left_edge.gif" alt="" /></td>
<td><br /><br /><br />
<img src="images/trash.gif" alt="" /></td>
<td><img src="images/right_edge.gif" alt="" /></td>
</tr>
<tr><td colspan="3"><img src="images/base_bar.gif" alt="" /></td></tr>
</table>
Another small graphic file, which you can call spacer.gif, can be added to ‘fine tune’ the positioning of any of the parts.
Irrespective of whether a page has been optimised or not, it should conform to an appropriate HTML standard. You can check your material using a process known as validation. There are two ways of doing this: you can visit a special website or employ a suitable HTML text-editing application, such as BBEdit. The latter is preferable if you don’t want to spend too much time on the Internet.
Whatever method you choose, your pages must have a suitable DOCTYPE declaration. Failure to include such a declaration can cause your validation software to assume, perhaps incorrectly, that you’re using a specific version of HTML.
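For example, a page written to the XHTML 1.0 Transitional standard would begin with the first declaration shown below, while an HTML 4.01 Strict page would use the second. These are the standard public identifiers published by the W3C; which one is appropriate depends on the form of HTML you’ve actually used:

```html
<!-- XHTML 1.0 Transitional -->
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
	"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">

<!-- HTML 4.01 Strict -->
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
	"http://www.w3.org/TR/html4/strict.dtd">
```

Only one declaration should appear in a real page, as its very first line; the comments are included here purely as labels.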
The causes of validation failure are too numerous to describe here. Suffice to say, most problems are due to tags being in the wrong order (often as a result of producing pages in a WYSIWYG Web-authoring application) or incorrectly encompassing other tags. For example, the inline span tag (as well as the deprecated font element) can’t enclose block-level elements such as p, so each one must open and close within a single paragraph. If such tags cover several paragraphs the Web page concerned is bound to fail its validation test.
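This nesting problem can be illustrated with a short sketch, in which the class name is hypothetical. In the first fragment a single span encloses two paragraphs, which should fail validation; in the second, each paragraph has its own span:

```html
<!-- Invalid: an inline span wrapped around block-level p elements -->
<span class="note"><p>First paragraph</p><p>Second paragraph</p></span>

<!-- Valid: the span is opened and closed inside each paragraph -->
<p><span class="note">First paragraph</span></p>
<p><span class="note">Second paragraph</span></p>
```

If the same style applies to several paragraphs, a block-level div element can enclose them instead.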
The implications of failure aren’t obvious: a page with hundreds of errors can appear perfect because modern browsers are tolerant of such mistakes. However, some browsers, perhaps on another computer platform, can make different assumptions about your errors, possibly giving an unexpected result. So, to avoid problems, your Web pages should always conform to the relevant standards.
©Ray White 2004.