Entities

Ideally, an HTML text file should only contain ASCII characters, as represented by the codes from 0 to 127. Despite this, most modern Web browsers can display all the codes up to 255, although the results can be inconsistent, since most computer platforms use their own character set. Although many browsers let you choose a specific character set, some characters may be rendered incorrectly. Worse still, the fonts on some platforms can lack a few of the necessary characters.

For all of these reasons you should avoid using codes from 128 to 255. Having said this, you can force a browser to use a given character set by including a meta tag in the head of a page, such as:-

<meta http-equiv="content-type" content="text/html;charset=iso-8859-1" />

which indicates that the ISO 8859-1 (Latin-1) character set is to be used. Other character sets can be imposed using other content values, such as text/html;charset=x-mac-roman for the Mac OS Roman character set or text/html;charset=windows-1252 for the Windows 1252 set.
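To see why the declared character set matters, the following Python sketch decodes the same byte under each of the three character sets named above. The codec names latin-1, mac_roman and cp1252 are Python's own labels and are assumed here to correspond to the iso-8859-1, x-mac-roman and windows-1252 values; the function name and the sample bytes are chosen purely for illustration.

```python
# Decode a single byte under each character set mentioned above.
# Assumption: Python's codec names stand in for the charset labels
# iso-8859-1, x-mac-roman and windows-1252.
CHARSETS = ["latin-1", "mac_roman", "cp1252"]


def decode_byte(value):
    """Return {codec name: decoded character} for one byte value."""
    raw = bytes([value])
    return {name: raw.decode(name) for name in CHARSETS}


if __name__ == "__main__":
    for value in (0xAE, 0x99):
        print(hex(value), decode_byte(value))
```

Byte 0xAE is a registered trade mark in Latin-1 and Windows 1252 but a different letter in Mac OS Roman, which is the kind of inconsistency the meta tag is meant to prevent.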

Understandably, some Mac OS browsers can’t display characters that aren’t part of the Mac’s character set, while others show them badly, often in the form of bitmap images. These include the following:-

¦  Ð  ð  ½  ¼  ¾  š  Š  ¹  ²  ³  Þ  þ  ý  Ý
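A rough way to check which characters fall outside a given platform's set is to try encoding them. The sketch below uses Python's mac_roman codec as a stand-in for the Mac's character set; this is an assumption, since the codec reflects one particular revision of that set, and the helper name unsupported_in is hypothetical.

```python
def unsupported_in(text, codec="mac_roman"):
    """Return the characters of text that the codec cannot encode."""
    missing = []
    for ch in text:
        try:
            ch.encode(codec)
        except UnicodeEncodeError:
            # The character has no position in this character set.
            missing.append(ch)
    return missing


if __name__ == "__main__":
    print(unsupported_in("é½Ð©š"))
```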

Introduction

In theory, the character set problem is solved by encoding non-ASCII characters as a special string of characters known as an HTML entity. In reality, some browsers don’t understand every entity.

The following table shows several ways in which a ® (registered trade mark) can be represented:-

Entity Type | Entity
Named | &reg;
Decimal Character Code | &#174;
Hex Character Code | &#xAE;

As you can see, an entity starts with & (ampersand) and ends with ; (semicolon). Although some browsers accept entities without semicolons, they must be included to ensure compatibility.
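As a quick check that the three forms are equivalent, the following Python sketch decodes each of them with the standard library's html.unescape function. The names FORMS and decode_all are illustrative only.

```python
import html

# The three representations of the registered trade mark sign.
FORMS = ["&reg;", "&#174;", "&#xAE;"]


def decode_all(forms):
    """Decode each entity string to the character it stands for."""
    return [html.unescape(form) for form in forms]


if __name__ == "__main__":
    print(decode_all(FORMS))
```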

Named entities are easy to understand when viewing the source of HTML code and are accepted by most browsers. Unfortunately, they don’t always work in older browsers. For example, Netscape 4.7 doesn’t recognise the following named entities and shows them as a character string:-

dagger, bull, trade, ne, infin, le, ge, part, sum, prod, pi, int, Omega, radic, fnof, asymp, hellip, OElig, oelig, ndash, mdash, ldquo, rdquo, lsquo, rsquo, loz, Yuml, frasl, lsaquo, rsaquo, Dagger, sbquo, bdquo, permil, circ, tilde.
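One possible workaround for browsers that don't recognise a named entity is to swap it for its decimal character code before publishing. The sketch below does this with Python's html.entities.name2codepoint table; the function name named_to_decimal and the choice to leave the four basic entities untouched are assumptions made for this example.

```python
import re
from html.entities import name2codepoint

# Entities that are left alone in this sketch.
KEEP = {"amp", "lt", "gt", "quot"}
_ENTITY = re.compile(r"&([A-Za-z][A-Za-z0-9]*);")


def named_to_decimal(markup):
    """Replace named entities with decimal character codes."""
    def swap(match):
        name = match.group(1)
        if name in KEEP or name not in name2codepoint:
            return match.group(0)  # leave basic or unknown names as-is
        return f"&#{name2codepoint[name]};"

    return _ENTITY.sub(swap, markup)


if __name__ == "__main__":
    print(named_to_decimal("5 &le; 6 &amp; 7 &mdash; 8"))
```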

A character code uses a number assigned to the character instead of a name and is suitable where content is automatically generated, or for non-Western languages where named entities have no meaning. Character codes can also be used for characters without a named entity, such as the codes from 128 to 159, which are outside the ISO 8859-1 (Latin-1) character set. Unfortunately, some older browsers don’t understand every character code, especially those in hexadecimal (hex) form.
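Where content is generated automatically, character codes can be produced mechanically. A minimal Python sketch follows, assuming the goal is to keep the output pure ASCII; the function names are hypothetical, and neither function escapes the special ASCII characters such as the ampersand.

```python
def to_decimal_refs(text):
    """Replace every non-ASCII character with a decimal character code."""
    return text.encode("ascii", "xmlcharrefreplace").decode("ascii")


def to_hex_refs(text):
    """Replace every non-ASCII character with a hex character code."""
    return "".join(
        ch if ord(ch) < 128 else f"&#x{ord(ch):X};" for ch in text
    )


if __name__ == "__main__":
    print(to_decimal_refs("café ™"))
    print(to_hex_refs("café ™"))
```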

Special ASCII Entities

Since the & (ampersand) identifies the start of each entity it must itself be presented as an entity when used in the contents of a Web page, even though it’s a standard ASCII character. This means that the text three pounds & ten pence must be entered into a page as:-

three pounds &amp; ten pence

In fact, all the special ASCII characters used in HTML must be encoded, as shown below:-

Description | Char | HTML Function | Name | Decimal | Hex
Double quote | " | Encloses value | &quot; | &#34; | &#x22;
Apostrophe | ' | • XHTML | &apos; | &#39; | &#x27;
Ampersand | & | Starts entity | &amp; | &#38; | &#x26;
Less-than | < | Starts tag | &lt; | &#60; | &#x3C;
Greater-than | > | Ends tag | &gt; | &#62; | &#x3E;
• XHTML named entity, not recognised by older browsers
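In generated pages this encoding is usually left to a library routine. The sketch below wraps Python's html.escape; the two wrapper names are illustrative. Note that html.escape writes the apostrophe as a hex character code rather than as the named entity.

```python
import html


def escape_text(text):
    """Encode &, < and > for use in the body of a page."""
    return html.escape(text, quote=False)


def escape_attribute(value):
    """Also encode double quotes and apostrophes for attribute values."""
    return html.escape(value, quote=True)


if __name__ == "__main__":
    print(escape_text("three pounds & ten pence"))
    print(escape_attribute('say "hi"'))
```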

The Codes in Detail

The tables in this section give details about the entities used for common characters. Unfortunately, some browsers don’t render them correctly, even with the decimal codes shown in the Char columns. You can check the presentation of codes in your browser by clicking on one of the links below. When you’ve finished looking at a table simply click on your browser’s Back button to return to this page.

Codes 0 to 1023

Codes 7876 to 8899

Codes 8900 to 9923

Codes 9924 to 10947

Although most browsers accept most of the codes in the first two groups, some don’t accommodate those characters that use higher codes. Any codes not included in the above are mainly used for non-Roman languages.

Codes 0 to 127: ASCII Characters

These raw HTML codes can be used to force a browser to use specified characters. Most of these entities are very rarely used, apart from the special cases described above. The remaining characters don’t have a named entity so you must use numerical codes to represent them.

Description | Char | Name | Decimal | Hex
Control codes | - |  | &#00; - 08; | &#x00; - 08;
Horizontal tab | HT |  | &#09; | &#x09;
Line feed | LF |  | &#10; | &#x0A;
Control codes | - |  | &#11; - 12; | &#x0B; - 0C;
Carriage return | CR |  | &#13; | &#x0D;
Control codes | - |  | &#14; - 31; | &#x0E; - 1F;
Space |  |  | &#32; | &#x20;
Exclamation mark | ! |  | &#33; | &#x21;
Double quote | " | &quot; | &#34; | &#x22;
Number sign or hash | # |  | &#35; | &#x23;
Dollar | $ |  | &#36; | &#x24;
Percent | % |  | &#37; | &#x25;
Ampersand | & | &amp; | &#38; | &#x26;
Apostrophe • | ' | &apos; | &#39; | &#x27;
Left parenthesis | ( |  | &#40; | &#x28;
Right parenthesis | ) |  | &#41; | &#x29;
Asterisk | * |  | &#42; | &#x2A;
Plus | + |  | &#43; | &#x2B;
Comma | , |  | &#44; | &#x2C;
Hyphen | - |  | &#45; | &#x2D;
Period | . |  | &#46; | &#x2E;
Solidus or slash | / |  | &#47; | &#x2F;
Digits 0 - 9 | 0 - 9 |  | &#48; - 57; | &#x30; - 39;
Colon | : |  | &#58; | &#x3A;
Semicolon | ; |  | &#59; | &#x3B;
Less-than | < | &lt; | &#60; | &#x3C;
Equals | = |  | &#61; | &#x3D;
Greater-than | > | &gt; | &#62; | &#x3E;
Question mark | ? |  | &#63; | &#x3F;
Commercial at | @ |  | &#64; | &#x40;
Letters A - Z | A - Z |  | &#65; - 90; | &#x41; - 5A;
Left square bracket | [ |  | &#91; | &#x5B;
Backslash | \ |  | &#92; | &#x5C;
Right square bracket | ] |  | &#93; | &#x5D;
Caret | ^ |  | &#94; | &#x5E;
Horizontal bar or underbar | _ |  | &#95; | &#x5F;
Grave accent | ` |  | &#96; | &#x60;
Letters a - z | a - z |  | &#97; - 122; | &#x61; - 7A;
Left curly brace | { |  | &#123; | &#x7B;
Vertical bar | | |  | &#124; | &#x7C;
Right curly brace | } |  | &#125; | &#x7D;
Tilde | ~ |  | &#126; | &#x7E;
Unused |  |  | &#127; | &#x7F;
• XHTML named entity, not recognised by older browsers
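The decimal and hex columns are simply the character's code written in base 10 and base 16, so either form can be derived from the character itself. A small Python sketch, with a hypothetical helper name and with two-digit zero padding chosen to match the layout of the table:

```python
def refs_for(ch):
    """Return the (decimal, hex) character codes for one character."""
    code = ord(ch)
    return f"&#{code:02d};", f"&#x{code:02X};"


if __name__ == "__main__":
    for ch in ("'", "~", "\t"):
        print(repr(ch), refs_for(ch))
```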

Codes 128 to 159: Special Characters and Punctuation

According to the ISO 8859-1 (Latin-1) standard these values are not assigned to characters that can be displayed, meaning they shouldn’t be used for characters in HTML. In reality, the codes shown in the following table are widely recognised and correspond to codes in the Windows character set. Note that these codes don’t have named entities, which means only numerical codes can be used. Once again, the characters shown in the Char column are as interpreted by your browser, so they may appear differently to the intended character detailed in the Description column. In fact, some browsers don’t show any of these characters, instead displaying a ? or a keyboard button symbol.

Description | Char | Decimal | Hex
Unused |  | &#128; | &#x80;
Unused |  | &#129; | &#x81;
Single low-9 quote • | ‚ | &#130; | &#x82;
Small hook f * | ƒ | &#131; | &#x83;
Double low-9 quote • | „ | &#132; | &#x84;
Ellipsis | … | &#133; | &#x85;
Dagger • | † | &#134; | &#x86;
Double dagger • | ‡ | &#135; | &#x87;
Letter circumflex * | ˆ | &#136; | &#x88;
Per mille • | ‰ | &#137; | &#x89;
S caron * | Š | &#138; | &#x8A;
Left small brace | ‹ | &#139; | &#x8B;
OE ligature * | Œ | &#140; | &#x8C;
Unused |  | &#141; | &#x8D;
Unused or Z caron * | Ž | &#142; | &#x8E;
Unused |  | &#143; | &#x8F;
Unused |  | &#144; | &#x90;
Left single quote • | ‘ | &#145; | &#x91;
Right single quote • | ’ | &#146; | &#x92;
Left double quote • | “ | &#147; | &#x93;
Right double quote • | ” | &#148; | &#x94;
Bullet | • | &#149; | &#x95;
En-dash • | – | &#150; | &#x96;
Em-dash • | — | &#151; | &#x97;
Small tilde | ˜ | &#152; | &#x98;
Trade mark | ™ | &#153; | &#x99;
s caron * | š | &#154; | &#x9A;
Right small brace | › | &#155; | &#x9B;
oe ligature * | œ | &#156; | &#x9C;
Unused |  | &#157; | &#x9D;
Unused or z caron * | ž | &#158; | &#x9E;
Y diaeresis * | Ÿ | &#159; | &#x9F;

* Similar characters available in codes 256 to 912

• Similar characters available in codes 8192 to 12287

In theory, it’s preferable to use the codes for the similar Unicode characters that appear later in this document. Unfortunately, some older browsers simply don’t support these newer codes and even the behaviour of browsers with older codes can be unpredictable.
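The Unicode code for one of these older codes can be looked up through the Windows character set. The Python sketch below assumes that Python's cp1252 codec is an acceptable model of that set; it reflects a later revision, so a code listed above as unused may still map to a character there. The function name is hypothetical.

```python
def unicode_code_for(legacy_code):
    """Map a code from 128 to 159 to the Unicode code of the
    Windows-1252 character at that position, or None if unassigned."""
    if not 128 <= legacy_code <= 159:
        raise ValueError("expected a code from 128 to 159")
    try:
        return ord(bytes([legacy_code]).decode("cp1252"))
    except UnicodeDecodeError:
        # Python's cp1252 table leaves this position undefined.
        return None


if __name__ == "__main__":
    for code in (150, 153, 129):
        print(code, unicode_code_for(code))
```

For example, 153 (trade mark) maps to 8482 and 150 (en-dash) maps to 8211, matching the entries in the tables for codes 8192 to 12287.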

Codes 160 to 255: Unicode and ISO Latin-1 Characters

The ISO 8859-1 (Latin-1) characters are used in languages that employ a Roman or Latin script, as in most of western Europe and other western countries. Either named entities or character codes can be used.

Unfortunately, the following Mac OS characters, as well as the standard Apple symbol, aren’t available in the Latin-1 set:-

≠  ∞  ≤  ≥  ∂  ∑  ∏  π  ∫  Ω  √  ≈  ∆  ◊  fi  fl  ı  ˘  ˙  ˚  ˝  ˛  ˇ

The Latin-1 set, as shown below, is also part of the Unicode standard, although many browsers don’t recognise the less common characters, showing instead a ? or a keyboard button symbol. The first block of characters is mainly concerned with punctuation and internationally-recognised symbols:-

Description | Char | Name | Decimal | Hex
Non-breaking space |  | &nbsp; | &#160; | &#xA0;
Inverted exclamation mark | ¡ | &iexcl; | &#161; | &#xA1;
Cent | ¢ | &cent; | &#162; | &#xA2;
Pound | £ | &pound; | &#163; | &#xA3;
Currency | ¤ | &curren; | &#164; | &#xA4;
Yen | ¥ | &yen; | &#165; | &#xA5;
Broken vertical bar | ¦ | &brvbar; | &#166; | &#xA6;
Section | § | &sect; | &#167; | &#xA7;
Spacing diaeresis | ¨ | &uml; | &#168; | &#xA8;
Copyright | © | &copy; | &#169; | &#xA9;
Feminine ordinal | ª | &ordf; | &#170; | &#xAA;
Left double angle quote | « | &laquo; | &#171; | &#xAB;
Not | ¬ | &not; | &#172; | &#xAC;
Soft hyphen |  | &shy; | &#173; | &#xAD;
Registered trade mark | ® | &reg; | &#174; | &#xAE;
Macron | ¯ | &macr; | &#175; | &#xAF;
Degree | ° | &deg; | &#176; | &#xB0;
Plus-or-minus | ± | &plusmn; | &#177; | &#xB1;
Superscript 2 | ² | &sup2; | &#178; | &#xB2;
Superscript 3 | ³ | &sup3; | &#179; | &#xB3;
Spacing acute | ´ | &acute; | &#180; | &#xB4;
Micro | µ | &micro; | &#181; | &#xB5;
Paragraph | ¶ | &para; | &#182; | &#xB6;
Middle dot | · | &middot; | &#183; | &#xB7;
Spacing cedilla | ¸ | &cedil; | &#184; | &#xB8;
Superscript 1 | ¹ | &sup1; | &#185; | &#xB9;
Masculine ordinal | º | &ordm; | &#186; | &#xBA;
Right double angle quote | » | &raquo; | &#187; | &#xBB;
One quarter | ¼ | &frac14; | &#188; | &#xBC;
One half | ½ | &frac12; | &#189; | &#xBD;
Three quarters | ¾ | &frac34; | &#190; | &#xBE;
Inverted question mark | ¿ | &iquest; | &#191; | &#xBF;

The next set consists of accented or special uppercase characters:-

Description | Char | Name | Decimal | Hex
A grave | À | &Agrave; | &#192; | &#xC0;
A acute | Á | &Aacute; | &#193; | &#xC1;
A circumflex | Â | &Acirc; | &#194; | &#xC2;
A tilde | Ã | &Atilde; | &#195; | &#xC3;
A diaeresis | Ä | &Auml; | &#196; | &#xC4;
A ring | Å | &Aring; | &#197; | &#xC5;
AE ligature | Æ | &AElig; | &#198; | &#xC6;
C cedilla | Ç | &Ccedil; | &#199; | &#xC7;
E grave | È | &Egrave; | &#200; | &#xC8;
E acute | É | &Eacute; | &#201; | &#xC9;
E circumflex | Ê | &Ecirc; | &#202; | &#xCA;
E diaeresis | Ë | &Euml; | &#203; | &#xCB;
I grave | Ì | &Igrave; | &#204; | &#xCC;
I acute | Í | &Iacute; | &#205; | &#xCD;
I circumflex | Î | &Icirc; | &#206; | &#xCE;
I diaeresis | Ï | &Iuml; | &#207; | &#xCF;
ETH | Ð | &ETH; | &#208; | &#xD0;
N tilde | Ñ | &Ntilde; | &#209; | &#xD1;
O grave | Ò | &Ograve; | &#210; | &#xD2;
O acute | Ó | &Oacute; | &#211; | &#xD3;
O circumflex | Ô | &Ocirc; | &#212; | &#xD4;
O tilde | Õ | &Otilde; | &#213; | &#xD5;
O diaeresis | Ö | &Ouml; | &#214; | &#xD6;
Multiplication | × | &times; | &#215; | &#xD7;
O slash | Ø | &Oslash; | &#216; | &#xD8;
U grave | Ù | &Ugrave; | &#217; | &#xD9;
U acute | Ú | &Uacute; | &#218; | &#xDA;
U circumflex | Û | &Ucirc; | &#219; | &#xDB;
U diaeresis | Ü | &Uuml; | &#220; | &#xDC;
Y acute | Ý | &Yacute; | &#221; | &#xDD;
THORN | Þ | &THORN; | &#222; | &#xDE;
sz ligature | ß | &szlig; | &#223; | &#xDF;

while the final group contains special or accented lowercase letters:-

Description | Char | Name | Decimal | Hex
a grave | à | &agrave; | &#224; | &#xE0;
a acute | á | &aacute; | &#225; | &#xE1;
a circumflex | â | &acirc; | &#226; | &#xE2;
a tilde | ã | &atilde; | &#227; | &#xE3;
a diaeresis | ä | &auml; | &#228; | &#xE4;
a ring | å | &aring; | &#229; | &#xE5;
ae ligature | æ | &aelig; | &#230; | &#xE6;
c cedilla | ç | &ccedil; | &#231; | &#xE7;
e grave | è | &egrave; | &#232; | &#xE8;
e acute | é | &eacute; | &#233; | &#xE9;
e circumflex | ê | &ecirc; | &#234; | &#xEA;
e diaeresis | ë | &euml; | &#235; | &#xEB;
i grave | ì | &igrave; | &#236; | &#xEC;
i acute | í | &iacute; | &#237; | &#xED;
i circumflex | î | &icirc; | &#238; | &#xEE;
i diaeresis | ï | &iuml; | &#239; | &#xEF;
eth | ð | &eth; | &#240; | &#xF0;
n tilde | ñ | &ntilde; | &#241; | &#xF1;
o grave | ò | &ograve; | &#242; | &#xF2;
o acute | ó | &oacute; | &#243; | &#xF3;
o circumflex | ô | &ocirc; | &#244; | &#xF4;
o tilde | õ | &otilde; | &#245; | &#xF5;
o diaeresis | ö | &ouml; | &#246; | &#xF6;
Division | ÷ | &divide; | &#247; | &#xF7;
o slash | ø | &oslash; | &#248; | &#xF8;
u grave | ù | &ugrave; | &#249; | &#xF9;
u acute | ú | &uacute; | &#250; | &#xFA;
u circumflex | û | &ucirc; | &#251; | &#xFB;
u diaeresis | ü | &uuml; | &#252; | &#xFC;
y acute | ý | &yacute; | &#253; | &#xFD;
thorn | þ | &thorn; | &#254; | &#xFE;
y diaeresis | ÿ | &yuml; | &#255; | &#xFF;

Codes 256 to 912: Unicode Special Characters and Accents

These entities correspond to modern Unicode character assignments, the most useful being:-

Description | Char | Name | Decimal | Hex
OE ligature | Œ | &OElig; | &#338; | &#x152;
oe ligature | œ | &oelig; | &#339; | &#x153;
S caron | Š | &Scaron; | &#352; | &#x160;
s caron | š | &scaron; | &#353; | &#x161;
Y diaeresis | Ÿ | &Yuml; | &#376; | &#x178;
Small hook f | ƒ | &fnof; | &#402; | &#x192;
Letter circumflex | ˆ | &circ; | &#710; | &#x2C6;
Small tilde | ˜ | &tilde; | &#732; | &#x2DC;

Codes 913 to 982: Unicode Greek Characters

These characters, as used in mathematics, aren’t supported by some older browsers.

Description | Char | Name | Decimal | Hex
Capital alpha | Α | &Alpha; | &#913; | &#x391;
Capital beta | Β | &Beta; | &#914; | &#x392;
Capital gamma | Γ | &Gamma; | &#915; | &#x393;
Capital delta | Δ | &Delta; | &#916; | &#x394;
Capital epsilon | Ε | &Epsilon; | &#917; | &#x395;
Capital zeta | Ζ | &Zeta; | &#918; | &#x396;
Capital eta | Η | &Eta; | &#919; | &#x397;
Capital theta | Θ | &Theta; | &#920; | &#x398;
Capital iota | Ι | &Iota; | &#921; | &#x399;
Capital kappa | Κ | &Kappa; | &#922; | &#x39A;
Capital lambda | Λ | &Lambda; | &#923; | &#x39B;
Capital mu | Μ | &Mu; | &#924; | &#x39C;
Capital nu | Ν | &Nu; | &#925; | &#x39D;
Capital xi | Ξ | &Xi; | &#926; | &#x39E;
Capital omicron | Ο | &Omicron; | &#927; | &#x39F;
Capital pi | Π | &Pi; | &#928; | &#x3A0;
Capital rho | Ρ | &Rho; | &#929; | &#x3A1;
Capital sigma | Σ | &Sigma; | &#931; | &#x3A3;
Capital tau | Τ | &Tau; | &#932; | &#x3A4;
Capital upsilon | Υ | &Upsilon; | &#933; | &#x3A5;
Capital phi | Φ | &Phi; | &#934; | &#x3A6;
Capital chi | Χ | &Chi; | &#935; | &#x3A7;
Capital psi | Ψ | &Psi; | &#936; | &#x3A8;
Capital omega | Ω | &Omega; | &#937; | &#x3A9;
Small alpha | α | &alpha; | &#945; | &#x3B1;
Small beta | β | &beta; | &#946; | &#x3B2;
Small gamma | γ | &gamma; | &#947; | &#x3B3;
Small delta | δ | &delta; | &#948; | &#x3B4;
Small epsilon | ε | &epsilon; | &#949; | &#x3B5;
Small zeta | ζ | &zeta; | &#950; | &#x3B6;
Small eta | η | &eta; | &#951; | &#x3B7;
Small theta | θ | &theta; | &#952; | &#x3B8;
Small iota | ι | &iota; | &#953; | &#x3B9;
Small kappa | κ | &kappa; | &#954; | &#x3BA;
Small lambda | λ | &lambda; | &#955; | &#x3BB;
Small mu | μ | &mu; | &#956; | &#x3BC;
Small nu | ν | &nu; | &#957; | &#x3BD;
Small xi | ξ | &xi; | &#958; | &#x3BE;
Small omicron | ο | &omicron; | &#959; | &#x3BF;
Small pi | π | &pi; | &#960; | &#x3C0;
Small rho | ρ | &rho; | &#961; | &#x3C1;
Small final sigma | ς | &sigmaf; | &#962; | &#x3C2;
Small sigma | σ | &sigma; | &#963; | &#x3C3;
Small tau | τ | &tau; | &#964; | &#x3C4;
Small upsilon | υ | &upsilon; | &#965; | &#x3C5;
Small phi | φ | &phi; | &#966; | &#x3C6;
Small chi | χ | &chi; | &#967; | &#x3C7;
Small psi | ψ | &psi; | &#968; | &#x3C8;
Small omega | ω | &omega; | &#969; | &#x3C9;
Small theta symbol | ϑ | &thetasym; | &#977; | &#x3D1;
Upsilon with hook | ϒ | &upsih; | &#978; | &#x3D2;
Pi symbol | ϖ | &piv; | &#982; | &#x3D6;

Codes 8192 to 12287: Other Unicode Characters

These codes replicate many of the items contained in codes 128 to 159, although some browsers don’t support these newer codes, meaning that some of the older codes are still used.

Special Characters and Punctuation

Description | Char | Name | Decimal | Hex
En-space |  | &ensp; | &#8194; | &#x2002;
Em-space |  | &emsp; | &#8195; | &#x2003;
Thin space |  | &thinsp; | &#8201; | &#x2009;
Zero width non-joiner |  | &zwnj; | &#8204; | &#x200C;
Zero width joiner |  | &zwj; | &#8205; | &#x200D;
Left-to-right mark |  | &lrm; | &#8206; | &#x200E;
Right-to-left mark |  | &rlm; | &#8207; | &#x200F;
En-dash | – | &ndash; | &#8211; | &#x2013;
Em-dash | — | &mdash; | &#8212; | &#x2014;
Left single quote | ‘ | &lsquo; | &#8216; | &#x2018;
Right single quote | ’ | &rsquo; | &#8217; | &#x2019;
Single low 9 quote | ‚ | &sbquo; | &#8218; | &#x201A;
Left double quote | “ | &ldquo; | &#8220; | &#x201C;
Right double quote | ” | &rdquo; | &#8221; | &#x201D;
Double low 9 quote | „ | &bdquo; | &#8222; | &#x201E;
Dagger | † | &dagger; | &#8224; | &#x2020;
Double dagger | ‡ | &Dagger; | &#8225; | &#x2021;
Bullet | • | &bull; | &#8226; | &#x2022;
Horizontal ellipsis | … | &hellip; | &#8230; | &#x2026;
Per mille | ‰ | &permil; | &#8240; | &#x2030;
Prime | ′ | &prime; | &#8242; | &#x2032;
Double prime | ″ | &Prime; | &#8243; | &#x2033;
Left single angle quote | ‹ | &lsaquo; | &#8249; | &#x2039;
Right single angle quote | › | &rsaquo; | &#8250; | &#x203A;
Overline | ‾ | &oline; | &#8254; | &#x203E;
Fraction slash | ⁄ | &frasl; | &#8260; | &#x2044;
Euro | € | &euro; | &#8364; | &#x20AC;

Letter-like Symbols

Description | Char | Name | Decimal | Hex
Script capital P | ℘ | &weierp; | &#8472; | &#x2118;
Imaginary part (I) | ℑ | &image; | &#8465; | &#x2111;
Real part (R) | ℜ | &real; | &#8476; | &#x211C;
Trade mark sign | ™ | &trade; | &#8482; | &#x2122;
Alef symbol | ℵ | &alefsym; | &#8501; | &#x2135;

Arrows

Description | Char | Name | Decimal | Hex
Left arrow | ← | &larr; | &#8592; | &#x2190;
Up arrow | ↑ | &uarr; | &#8593; | &#x2191;
Right arrow | → | &rarr; | &#8594; | &#x2192;
Down arrow | ↓ | &darr; | &#8595; | &#x2193;
Left right arrow | ↔ | &harr; | &#8596; | &#x2194;
Carriage return arrow | ↵ | &crarr; | &#8629; | &#x21B5;
Left double arrow | ⇐ | &lArr; | &#8656; | &#x21D0;
Up double arrow | ⇑ | &uArr; | &#8657; | &#x21D1;
Right double arrow | ⇒ | &rArr; | &#8658; | &#x21D2;
Down double arrow | ⇓ | &dArr; | &#8659; | &#x21D3;
Left right double arrow | ⇔ | &hArr; | &#8660; | &#x21D4;

Mathematical Operators

Description | Char | Name | Decimal | Hex
For all | ∀ | &forall; | &#8704; | &#x2200;
Partial differential | ∂ | &part; | &#8706; | &#x2202;
There exists | ∃ | &exist; | &#8707; | &#x2203;
Empty set | ∅ | &empty; | &#8709; | &#x2205;
Nabla/backward difference | ∇ | &nabla; | &#8711; | &#x2207;
Element of | ∈ | &isin; | &#8712; | &#x2208;
Not an element of | ∉ | &notin; | &#8713; | &#x2209;
Contains as member | ∋ | &ni; | &#8715; | &#x220B;
N-ary product | ∏ | &prod; | &#8719; | &#x220F;
N-ary summation | ∑ | &sum; | &#8721; | &#x2211;
Minus sign | − | &minus; | &#8722; | &#x2212;
Asterisk operator | ∗ | &lowast; | &#8727; | &#x2217;
Square root/radical | √ | &radic; | &#8730; | &#x221A;
Proportional to | ∝ | &prop; | &#8733; | &#x221D;
Infinity | ∞ | &infin; | &#8734; | &#x221E;
Angle | ∠ | &ang; | &#8736; | &#x2220;
Logical AND | ∧ | &and; | &#8743; | &#x2227;
Logical OR | ∨ | &or; | &#8744; | &#x2228;
Intersection cap | ∩ | &cap; | &#8745; | &#x2229;
Union cup | ∪ | &cup; | &#8746; | &#x222A;
Integral | ∫ | &int; | &#8747; | &#x222B;
Therefore | ∴ | &there4; | &#8756; | &#x2234;
Similar to | ∼ | &sim; | &#8764; | &#x223C;
Approximately equal | ≅ | &cong; | &#8773; | &#x2245;
Asymptotic to | ≈ | &asymp; | &#8776; | &#x2248;
Not equal | ≠ | &ne; | &#8800; | &#x2260;
Identical to | ≡ | &equiv; | &#8801; | &#x2261;
Less-than or equal | ≤ | &le; | &#8804; | &#x2264;
Greater-than or equal | ≥ | &ge; | &#8805; | &#x2265;
Subset of | ⊂ | &sub; | &#8834; | &#x2282;
Superset of | ⊃ | &sup; | &#8835; | &#x2283;
Not a subset of | ⊄ | &nsub; | &#8836; | &#x2284;
Subset of or equal to | ⊆ | &sube; | &#8838; | &#x2286;
Superset of or equal to | ⊇ | &supe; | &#8839; | &#x2287;
Circled plus | ⊕ | &oplus; | &#8853; | &#x2295;
Circled times | ⊗ | &otimes; | &#8855; | &#x2297;
Perpendicular | ⊥ | &perp; | &#8869; | &#x22A5;
Dot operator | ⋅ | &sdot; | &#8901; | &#x22C5;

Miscellaneous Technical

Description | Char | Name | Decimal | Hex
Left ceiling/APL upstile | ⌈ | &lceil; | &#8968; | &#x2308;
Right ceiling | ⌉ | &rceil; | &#8969; | &#x2309;
Left floor/APL downstile | ⌊ | &lfloor; | &#8970; | &#x230A;
Right floor | ⌋ | &rfloor; | &#8971; | &#x230B;
Left angle bracket | 〈 | &lang; | &#9001; | &#x2329;
Right angle bracket | 〉 | &rang; | &#9002; | &#x232A;

Geometric Shapes and Other Symbols

Description | Char | Name | Decimal | Hex
Lozenge | ◊ | &loz; | &#9674; | &#x25CA;
Black spade | ♠ | &spades; | &#9824; | &#x2660;
Black club | ♣ | &clubs; | &#9827; | &#x2663;
Black heart | ♥ | &hearts; | &#9829; | &#x2665;
Black diamond | ♦ | &diams; | &#9830; | &#x2666;

©Ray White 2004.