Normal 8-bit coding provides 256 characters in a standard character set, which is insufficient for all the special symbols, punctuation and accented characters used in various languages. And it’s certainly inadequate for the vast range of symbols used in pictographic languages such as Chinese or Japanese.
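Unicode, which is kept in step with the ISO/IEC 10646 standard, was originally designed around 16-bit codes that define up to 65,536 characters. These codes, now known as the Basic Multilingual Plane, are grouped as shown in the following table:-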
| From | To | Usage |
|---|---|---|
| 0 | 8191 | Alphabetic characters (0-255 as ISO 8859-1) |
| 8192 | 12287 | Alphabetic punctuation, symbols, dingbats |
| 12288 | 16383 | Pictographic, auxiliary alphabets, punctuation |
| 16384 | 59391 | Pictographic characters |
| 59392 | 65024 | Special |
| 65025 | 65535 | Software development |
Although all of these Unicode codes are fully standardised, many applications or computer operating systems are limited to showing only some of the characters.
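As a rough illustration of the grouping, the following Python sketch looks up the group for a given code. The ranges are copied from the table above; the names `GROUPS` and `group_for` are hypothetical and exist only for this example.

```python
# Ranges copied from the table above: (first code, last code, usage).
GROUPS = [
    (0, 8191, "Alphabetic characters"),
    (8192, 12287, "Alphabetic punctuation, symbols, dingbats"),
    (12288, 16383, "Pictographic, auxiliary alphabets, punctuation"),
    (16384, 59391, "Pictographic characters"),
    (59392, 65024, "Special"),
    (65025, 65535, "Software development"),
]


def group_for(code):
    """Return the usage group for a 16-bit code (hypothetical helper)."""
    for first, last, usage in GROUPS:
        if first <= code <= last:
            return usage
    raise ValueError(f"{code} is outside the 16-bit range 0-65535")


# ord() gives the decimal code of a character.
for ch in "A€中":
    print(ch, ord(ch), group_for(ord(ch)))
```

The lookup is a simple linear scan, which is adequate for six ranges.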
Further details regarding some of these groups appear in the following sections.
The characters generated by codes 0 to 255 are identical to those defined by the ISO 8859-1 standard, also known as Latin-1, and the first 128 of them also match the ASCII standard. This makes it easy to convert material that’s coded using a western-based character set, often known as Roman, into Unicode form.
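The conversion can be sketched in Python, whose built-in `latin-1` codec turns each byte into the Unicode code of the same value. The sample bytes are an assumption chosen for illustration.

```python
# Bytes as they might be stored in an ISO 8859-1 (Latin-1) file: "café".
latin1_bytes = bytes([0x63, 0x61, 0x66, 0xE9])

# Decoding Latin-1 maps every byte to the Unicode code of the same value.
text = latin1_bytes.decode("latin-1")
codes = [ord(ch) for ch in text]

print(text)
print(codes)
# The Unicode codes and the original byte values are the same numbers.
print(codes == list(latin1_bytes))
```

Because the values match one for one, no lookup table is needed for this particular conversion.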
The ISO 8859-1 character set is as follows:-
| Hex | Dec | 00 | 01 | 02 | 03 | 04 | 05 | 06 | 07 | 08 | 09 | 0A | 0B | 0C | 0D | 0E | 0F |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| | | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 | 14 | 15 |
| 00 | 0 | NUL | SOH | STX | ETX | EOT | ENQ | ACK | BEL | BS | HT | LF | VT | FF | CR | SO | SI |
| 10 | 16 | DLE | DC1 | DC2 | DC3 | DC4 | NAK | SYN | ETB | CAN | EM | SUB | ESC | FS | GS | RS | US |
| 20 | 32 | SP | ! | " | # | $ | % | & | ' | ( | ) | * | + | , | - | . | / |
| 30 | 48 | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | : | ; | < | = | > | ? |
| 40 | 64 | @ | A | B | C | D | E | F | G | H | I | J | K | L | M | N | O |
| 50 | 80 | P | Q | R | S | T | U | V | W | X | Y | Z | [ | \ | ] | ^ | _ |
| 60 | 96 | ` | a | b | c | d | e | f | g | h | i | j | k | l | m | n | o |
| 70 | 112 | p | q | r | s | t | u | v | w | x | y | z | { | \| | } | ~ | DEL |
| 80 | 128 | | | | | | | | | | | | | | | | |
| 90 | 144 | | | | | | | | | | | | | | | | |
| A0 | 160 | NBSP | ¡ | ¢ | £ | ¤ | ¥ | ¦ | § | ¨ | © | ª | « | ¬ | SHY | ® | ¯ |
| B0 | 176 | ° | ± | ² | ³ | ´ | µ | ¶ | · | ¸ | ¹ | º | » | ¼ | ½ | ¾ | ¿ |
| C0 | 192 | À | Á | Â | Ã | Ä | Å | Æ | Ç | È | É | Ê | Ë | Ì | Í | Î | Ï |
| D0 | 208 | Ð | Ñ | Ò | Ó | Ô | Õ | Ö | × | Ø | Ù | Ú | Û | Ü | Ý | Þ | ß |
| E0 | 224 | à | á | â | ã | ä | å | æ | ç | è | é | ê | ë | ì | í | î | ï |
| F0 | 240 | ð | ñ | ò | ó | ô | õ | ö | ÷ | ø | ù | ú | û | ü | ý | þ | ÿ |
The codes from 128 to 159 should be avoided: the standard reserves them for control codes, and many systems use them for non-standard characters.
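A simple check for these codes might look like the following Python sketch. The function name `risky_codes` and the sample string are hypothetical, chosen only to show the idea.

```python
def risky_codes(text):
    """Return (position, code) pairs for characters coded 128 to 159."""
    return [(i, ord(ch)) for i, ch in enumerate(text) if 128 <= ord(ch) <= 159]


# A made-up string containing three codes from the range to be avoided.
sample = "price: 5\x80 \x93ok\x94"
print(risky_codes(sample))
```

An empty list means the text contains none of the codes in question.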
This area, starting at code 256, is used for less common accented characters, many of which appear with an accent separate from the letter itself, as shown below. Characters that can’t be displayed by your browser appear as a ? (query) or as a keyboard button symbol.
| Hex | Dec | 00 | 01 | 02 | 03 | 04 | 05 | 06 | 07 | 08 | 09 | 0A | 0B | 0C | 0D | 0E | 0F |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| | | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 | 14 | 15 |
| 100 | 256 | Ā | ā | Ă | ă | Ą | ą | Ć | ć | Ĉ | ĉ | Ċ | ċ | Č | č | Ď | ď |
| 110 | 272 | Đ | đ | Ē | ē | Ĕ | ĕ | Ė | ė | Ę | ę | Ě | ě | Ĝ | ĝ | Ğ | ğ |
| 120 | 288 | Ġ | ġ | Ģ | ģ | Ĥ | ĥ | Ħ | ħ | Ĩ | ĩ | Ī | ī | Ĭ | ĭ | Į | į |
| 130 | 304 | İ | ı | Ĳ | ĳ | Ĵ | ĵ | Ķ | ķ | ĸ | Ĺ | ĺ | Ļ | ļ | Ľ | ľ | Ŀ |
| 140 | 320 | ŀ | Ł | ł | Ń | ń | Ņ | ņ | Ň | ň | ŉ | Ŋ | ŋ | Ō | ō | Ŏ | ŏ |
| 150 | 336 | Ő | ő | Œ | œ | Ŕ | ŕ | Ŗ | ŗ | Ř | ř | Ś | ś | Ŝ | ŝ | Ş | ş |
| 160 | 352 | Š | š | Ţ | ţ | Ť | ť | Ŧ | ŧ | Ũ | ũ | Ū | ū | Ŭ | ŭ | Ů | ů |
| 170 | 368 | Ű | ű | Ų | ų | Ŵ | ŵ | Ŷ | ŷ | Ÿ | Ź | ź | Ż | ż | Ž | ž | ſ |
Of these, and other codes in this group, the following are the most useful:-
| Hex | Dec | Description | Char | Hex | Dec | Description | Char | |
|---|---|---|---|---|---|---|---|---|
| 152 | 338 | OE ligature | Œ | 2C6 | 710 | Letter circumflex | ˆ | |
| 153 | 339 | oe ligature | œ | 2D8 | 728 | Breve accent | ˘ | |
| 160 | 352 | S caron | Š | 2D9 | 729 | Dot accent | ˙ | |
| 161 | 353 | s caron | š | 2DA | 730 | Ring accent | ˚ | |
| 178 | 376 | Y diaeresis | Ÿ | 2DB | 731 | Ogonek | ˛ | |
| 192 | 402 | Small hook f | ƒ | 2DC | 732 | Small tilde | ˜ |
The remaining codes in this block are assigned to other obscure characters.
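The hex and decimal columns in these tables are two ways of writing the same code. As a minimal sketch, the Python below converts a hex code to decimal, produces the character, and looks up its official name with the standard `unicodedata` module. The helper name `describe` is hypothetical.

```python
import unicodedata


def describe(hex_code):
    """Return (hex, decimal, character, official name) for a hex code string."""
    code = int(hex_code, 16)  # hex string to decimal code
    ch = chr(code)            # decimal code to character
    return hex_code.upper(), code, ch, unicodedata.name(ch, "unnamed")


for h in ("152", "160", "178", "192", "2C6"):
    print(describe(h))
```

The official names returned by `unicodedata` are more formal than the short descriptions used in the table above.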
The Greek characters in this block are often used in maths and other applications. For simplicity, unassigned codes have been omitted from the following table.
| Hex | Dec | Description | Char | Hex | Dec | Description | Char | |
|---|---|---|---|---|---|---|---|---|
| 391 | 913 | Capital alpha | Α | 3B4 | 948 | Small delta | δ | |
| 392 | 914 | Capital beta | Β | 3B5 | 949 | Small epsilon | ε | |
| 393 | 915 | Capital gamma | Γ | 3B6 | 950 | Small zeta | ζ | |
| 394 | 916 | Capital delta | Δ | 3B7 | 951 | Small eta | η | |
| 395 | 917 | Capital epsilon | Ε | 3B8 | 952 | Small theta | θ | |
| 396 | 918 | Capital zeta | Ζ | 3B9 | 953 | Small iota | ι | |
| 397 | 919 | Capital eta | Η | 3BA | 954 | Small kappa | κ | |
| 398 | 920 | Capital theta | Θ | 3BB | 955 | Small lambda | λ | |
| 399 | 921 | Capital iota | Ι | 3BC | 956 | Small mu | μ | |
| 39A | 922 | Capital kappa | Κ | 3BD | 957 | Small nu | ν | |
| 39B | 923 | Capital lambda | Λ | 3BE | 958 | Small xi | ξ | |
| 39C | 924 | Capital mu | Μ | 3BF | 959 | Small omicron | ο | |
| 39D | 925 | Capital nu | Ν | 3C0 | 960 | Small pi | π | |
| 39E | 926 | Capital xi | Ξ | 3C1 | 961 | Small rho | ρ | |
| 39F | 927 | Capital omicron | Ο | 3C2 | 962 | Small final sigma | ς | |
| 3A0 | 928 | Capital pi | Π | 3C3 | 963 | Small sigma | σ | |
| 3A1 | 929 | Capital rho | Ρ | 3C4 | 964 | Small tau | τ | |
| 3A3 | 931 | Capital sigma | Σ | 3C5 | 965 | Small upsilon | υ | |
| 3A4 | 932 | Capital tau | Τ | 3C6 | 966 | Small phi | φ | |
| 3A5 | 933 | Capital upsilon | Υ | 3C7 | 967 | Small chi | χ | |
| 3A6 | 934 | Capital phi | Φ | 3C8 | 968 | Small psi | ψ | |
| 3A7 | 935 | Capital chi | Χ | 3C9 | 969 | Small omega | ω | |
| 3A8 | 936 | Capital psi | Ψ | 3D1 | 977 | Small theta symbol | ϑ | |
| 3A9 | 937 | Capital omega | Ω | 3D2 | 978 | Upsilon with hook symbol | ϒ | |
| 3B1 | 945 | Small alpha | α | 3D5 | 981 | Phi symbol | ϕ | |
| 3B2 | 946 | Small beta | β | 3D6 | 982 | Pi symbol | ϖ | |
| 3B3 | 947 | Small gamma | γ |
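One way to leave out unassigned codes when building a table like this is to ask Python's `unicodedata` module whether a code has a name. This is only a sketch under that assumption, and the helper name `assigned_in` is hypothetical.

```python
import unicodedata


def assigned_in(first, last):
    """Return (hex, decimal, character) rows for assigned codes only."""
    rows = []
    for code in range(first, last + 1):
        ch = chr(code)
        # unicodedata.name() returns the default (None) for unassigned codes.
        if unicodedata.name(ch, None) is not None:
            rows.append((f"{code:X}", code, ch))
    return rows


# The capital letters run from hex 391 to 3A9, with 3A2 unassigned.
capitals = assigned_in(0x391, 0x3A9)
for row in capitals:
    print(row)
```

The gap at hex 3A2 explains why the table above jumps from capital rho to capital sigma.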
The codes from 8192 onwards are used for rather less common characters and punctuation. The following table only shows the more usual characters, with numerous rows omitted for clarity. Some codes don’t appear to create any visible character but are in fact used for a range of different types of spaces. Those characters that can’t be displayed by your browser are indicated by a ? or by a keyboard button symbol.
| Hex | Dec | 00 | 01 | 02 | 03 | 04 | 05 | 06 | 07 | 08 | 09 | 0A | 0B | 0C | 0D | 0E | 0F |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| | | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 | 14 | 15 |
| 2000 | 8192 | | | | | | | | | | | | | | | | |
| 2010 | 8208 | ‐ | ‑ | ‒ | – | — | ― | ‖ | ‗ | ‘ | ’ | ‚ | ‛ | “ | ” | „ | ‟ |
| 2020 | 8224 | † | ‡ | • | ‣ | ․ | ‥ | … | ‧ | | | | | | | | |
| 20A0 | 8352 | ₠ | ₡ | ₢ | ₣ | ₤ | ₥ | ₦ | ₧ | ₨ | ₩ | ₪ | ₫ | € | ₭ | ₮ | ₯ |
| 2110 | 8464 | ℐ | ℑ | ℒ | ℓ | ℔ | ℕ | № | ℗ | ℘ | ℙ | ℚ | ℛ | ℜ | ℝ | ℞ | ℟ |
| 2120 | 8480 | ℠ | ℡ | ™ | ℣ | ℤ | ℥ | Ω | ℧ | ℨ | ℩ | K | Å | ℬ | ℭ | ℮ | ℯ |
| 2130 | 8496 | ℰ | ℱ | Ⅎ | ℳ | ℴ | ℵ | ℶ | ℷ | ℸ | ℹ | ℺ | ℻ | ℼ | ℽ | ℾ | ℿ |
| 2190 | 8592 | ← | ↑ | → | ↓ | ↔ | ↕ | ↖ | ↗ | ↘ | ↙ | ↚ | ↛ | ↜ | ↝ | ↞ | ↟ |
| 21B0 | 8624 | ↰ | ↱ | ↲ | ↳ | ↴ | ↵ | ↶ | ↷ | ↸ | ↹ | ↺ | ↻ | ↼ | ↽ | ↾ | ↿ |
| 21D0 | 8656 | ⇐ | ⇑ | ⇒ | ⇓ | ⇔ | ⇕ | ⇖ | ⇗ | ⇘ | ⇙ | ⇚ | ⇛ | ⇜ | ⇝ | ⇞ | ⇟ |
| 2210 | 8720 | ∐ | ∑ | − | ∓ | ∔ | ∕ | ∖ | ∗ | ∘ | ∙ | √ | ∛ | ∜ | ∝ | ∞ | ∟ |
| 2230 | 8752 | ∰ | ∱ | ∲ | ∳ | ∴ | ∵ | ∶ | ∷ | ∸ | ∹ | ∺ | ∻ | ∼ | ∽ | ∾ | ∿ |
| 2240 | 8768 | ≀ | ≁ | ≂ | ≃ | ≄ | ≅ | ≆ | ≇ | ≈ | ≉ | ≊ | ≋ | ≌ | ≍ | ≎ | ≏ |
| 2260 | 8800 | ≠ | ≡ | ≢ | ≣ | ≤ | ≥ | ≦ | ≧ | ≨ | ≩ | ≪ | ≫ | ≬ | ≭ | ≮ | ≯ |
| 2280 | 8832 | ⊀ | ⊁ | ⊂ | ⊃ | ⊄ | ⊅ | ⊆ | ⊇ | ⊈ | ⊉ | ⊊ | ⊋ | ⊌ | ⊍ | ⊎ | ⊏ |
| 2290 | 8848 | ⊐ | ⊑ | ⊒ | ⊓ | ⊔ | ⊕ | ⊖ | ⊗ | ⊘ | ⊙ | ⊚ | ⊛ | ⊜ | ⊝ | ⊞ | ⊟ |
| 22A0 | 8864 | ⊠ | ⊡ | ⊢ | ⊣ | ⊤ | ⊥ | ⊦ | ⊧ | ⊨ | ⊩ | ⊪ | ⊫ | ⊬ | ⊭ | ⊮ | ⊯ |
| 22B0 | 8880 | ⊰ | ⊱ | ⊲ | ⊳ | ⊴ | ⊵ | ⊶ | ⊷ | ⊸ | ⊹ | ⊺ | ⊻ | ⊼ | ⊽ | ⊾ | ⊿ |
| 22C0 | 8896 | ⋀ | ⋁ | ⋂ | ⋃ | ⋄ | ⋅ | ⋆ | ⋇ | ⋈ | ⋉ | ⋊ | ⋋ | ⋌ | ⋍ | ⋎ | ⋏ |
| 22D0 | 8912 | ⋐ | ⋑ | ⋒ | ⋓ | ⋔ | ⋕ | ⋖ | ⋗ | ⋘ | ⋙ | ⋚ | ⋛ | ⋜ | ⋝ | ⋞ | ⋟ |
| 22E0 | 8928 | ⋠ | ⋡ | ⋢ | ⋣ | ⋤ | ⋥ | ⋦ | ⋧ | ⋨ | ⋩ | ⋪ | ⋫ | ⋬ | ⋭ | ⋮ | ⋯ |
| 22F0 | 8944 | ⋰ | ⋱ | ⋲ | ⋳ | ⋴ | ⋵ | ⋶ | ⋷ | ⋸ | ⋹ | ⋺ | ⋻ | ⋼ | ⋽ | ⋾ | ⋿ |
| 2300 | 8960 | ⌀ | ⌁ | ⌂ | ⌃ | ⌄ | ⌅ | ⌆ | ⌇ | ⌈ | ⌉ | ⌊ | ⌋ | ⌌ | ⌍ | ⌎ | ⌏ |
| 2310 | 8976 | ⌐ | ⌑ | ⌒ | ⌓ | ⌔ | ⌕ | ⌖ | ⌗ | ⌘ | ⌙ | ⌚ | ⌛ | ⌜ | ⌝ | ⌞ | ⌟ |
| 25C0 | 9664 | ◀ | ◁ | ◂ | ◃ | ◄ | ◅ | ◆ | ◇ | ◈ | ◉ | ◊ | ○ | ◌ | ◍ | ◎ | ● |
| 2660 | 9824 | ♠ | ♡ | ♢ | ♣ | ♤ | ♥ | ♦ | ♧ | ♨ | ♩ | ♪ | ♫ | ♬ | ♭ | ♮ | ♯ |
Of these, the following are commonly used:-
| Hex | Dec | Description | Char | Hex | Dec | Description | Char | |
|---|---|---|---|---|---|---|---|---|
| 2002 | 8194 | N-space | | 2205 | 8709 | Empty set | ∅ | |
| 2003 | 8195 | M-space | | 2207 | 8711 | Nabla | ∇ | |
| 2009 | 8201 | Thin space | | 2208 | 8712 | Element of | ∈ | |
| 200C | 8204 | Zero width non-joiner | | 2209 | 8713 | Not an element | ∉ | |
| 200D | 8205 | Zero width joiner | | 220B | 8715 | Contains as member | ∋ | |
| 200E | 8206 | Left-to-right mark | | 220F | 8719 | Product | ∏ | |
| 200F | 8207 | Right-to-left mark | | 2211 | 8721 | Sum | ∑ | |
| 2013 | 8211 | N-dash | – | 2212 | 8722 | Minus | − | |
| 2014 | 8212 | M-dash | — | 2217 | 8727 | Low asterisk | ∗ | |
| 2018 | 8216 | Left quote | ‘ | 221A | 8730 | Radical or square root | √ | |
| 2019 | 8217 | Right quote | ’ | 221D | 8733 | Proportional | ∝ | |
| 201A | 8218 | Single low-9 quote | ‚ | 221E | 8734 | Infinity | ∞ | |
| 201C | 8220 | Left double quote | “ | 2220 | 8736 | Angle | ∠ | |
| 201D | 8221 | Right double quote | ” | 2227 | 8743 | Logical AND | ∧ | |
| 201E | 8222 | Double low-9 quote | „ | 2228 | 8744 | Logical OR | ∨ | |
| 2020 | 8224 | Dagger | † | 2229 | 8745 | Cap | ∩ | |
| 2021 | 8225 | Double dagger | ‡ | 222A | 8746 | Cup | ∪ | |
| 2022 | 8226 | Bullet | • | 222B | 8747 | Integral | ∫ | |
| 2026 | 8230 | Horizontal ellipsis | … | 2234 | 8756 | Therefore | ∴ | |
| 2030 | 8240 | Per mille sign | ‰ | 223C | 8764 | Similar to | ∼ | |
| 2032 | 8242 | Prime | ′ | 2245 | 8773 | Approximately equal | ≅ | |
| 2033 | 8243 | Double prime | ″ | 2248 | 8776 | Asymptotic | ≈ | |
| 2039 | 8249 | Single left angle quote | ‹ | 2260 | 8800 | Not equal | ≠ | |
| 203A | 8250 | Single right angle quote | › | 2261 | 8801 | Equivalent | ≡ | |
| 203E | 8254 | Overline | ‾ | 2264 | 8804 | Less-than or equal | ≤ | |
| 2044 | 8260 | Fraction slash | ⁄ | 2265 | 8805 | Greater-than or equal | ≥ | |
| 20AC | 8364 | Euro symbol | € | 2282 | 8834 | Subset | ⊂ | |
| 2111 | 8465 | Imaginary part | ℑ | 2283 | 8835 | Superset | ⊃ | |
| 2118 | 8472 | Weierstrass p | ℘ | 2284 | 8836 | Not subset | ⊄ | |
| 211C | 8476 | Real part | ℜ | 2286 | 8838 | Subset or equal | ⊆ | |
| 2122 | 8482 | Trade mark | ™ | 2287 | 8839 | Superset or equal | ⊇ | |
| 2135 | 8501 | Alef symbol | ℵ | 2295 | 8853 | Circled plus | ⊕ | |
| 2190 | 8592 | Left arrow | ← | 2297 | 8855 | Circled times | ⊗ | |
| 2191 | 8593 | Up arrow | ↑ | 22A5 | 8869 | Perpendicular | ⊥ | |
| 2192 | 8594 | Right arrow | → | 22C5 | 8901 | Dot operator | ⋅ | |
| 2193 | 8595 | Down arrow | ↓ | 2308 | 8968 | Left ceiling | ⌈ | |
| 2194 | 8596 | Left right arrow | ↔ | 2309 | 8969 | Right ceiling | ⌉ | |
| 21B5 | 8629 | Carriage return arrow | ↵ | 230A | 8970 | Left floor | ⌊ | |
| 21D0 | 8656 | Left double arrow | ⇐ | 230B | 8971 | Right floor | ⌋ | |
| 21D1 | 8657 | Up double arrow | ⇑ | 2329 | 9001 | Left angle bracket | 〈 | |
| 21D2 | 8658 | Right double arrow | ⇒ | 232A | 9002 | Right angle bracket | 〉 | |
| 21D3 | 8659 | Down double arrow | ⇓ | 25CA | 9674 | Lozenge | ◊ | |
| 21D4 | 8660 | Left right double arrow | ⇔ | 2660 | 9824 | Black spades | ♠ | |
| 2200 | 8704 | For all | ∀ | 2663 | 9827 | Black clubs | ♣ | |
| 2202 | 8706 | Partial differential | ∂ | 2665 | 9829 | Black hearts | ♥ | |
| 2203 | 8707 | There exists | ∃ | 2666 | 9830 | Black diamonds | ♦ |
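Several codes in this range produce nothing visible, so it can help to check what kind of character a code represents. The Python sketch below uses the general category reported by the standard `unicodedata` module, where `Zs` marks a space and `Cf` a formatting control such as a joiner or direction mark. The helper name `invisible_kind` and its labels are hypothetical.

```python
import unicodedata


def invisible_kind(code):
    """Classify a code as a space, a formatting control, or something else."""
    category = unicodedata.category(chr(code))
    if category == "Zs":
        return "space"
    if category == "Cf":
        return "format control"
    return "visible or other"


for code in (0x2002, 0x2003, 0x2009, 0x200C, 0x200E, 0x2014):
    print(f"{code:X}", invisible_kind(code))
```

The third label is deliberately loose, since the sketch only separates out the two invisible kinds discussed here.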
©Ray White 2004.