HTML & CSS Tips and Tricks
UNICODE: NICKNAMES OR NUMBERS?
Various bloggers maintain that, when specifying a Unicode special character,
only the numeric code should be used. My response to that is:
<b></u>ll<s>h<i>t!
An entity reference, or named character reference as it is called
in HTML5, is a mnemonic equivalent, meaning that an English-speaking person
can recognize it for what it is without having to memorize a number.
There are entity references for more than a thousand Unicode characters, presumably because they are the ones most commonly used. Virtually every coder utilizes the following four mnemonics a lot, possibly without even thinking of them as such:
&amp; | & | ampersand |
&lt; | < | less than |
&gt; | > | greater than |
&quot; | " | quotation mark |
In fact, according to my reading, those four, along with &apos; for the apostrophe,
are the only entity references that XML predefines; any other name must be declared
in a DTD before an XML document can use it. If you write nothing but XML, you might as well
skip the rest of this page; otherwise, I daresay that you have occasion to use certain
other non-keyboard characters with some frequency. Which ones they are
will depend upon the typical content of your own pages.
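These same mnemonics are what an escaping routine produces when it makes text safe to place inside markup. A minimal sketch, using Python's standard html module purely as an illustration; the helper name and the sample string are made up for this example:

```python
import html


def escape_markup(text: str) -> str:
    """Replace &, <, > and " with their named references.

    html.escape also rewrites the apostrophe, but as the numeric
    reference &#x27; rather than a name; the sample below has none.
    """
    return html.escape(text, quote=True)


# Hypothetical sample text, chosen to contain all four characters.
sample = '<a href="x">Tom & Jerry</a>'
print(escape_markup(sample))
```

Running html.unescape on the result gives back the original string, so nothing is lost in the round trip.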
The special character that I use most frequently is the non-breaking space
(&nbsp;), mostly because I learned in grade school to put
two spaces between my sentences. If I had to type &#160;
all the time, I would go mad. Another frequently used character is the
em-dash (—). When I type in &mdash;,
I can see what is going on in my text; whereas the equivalent &#8212;
would just clutter the page (and also make me crazy). Excepting the examples
on this page, there is not a single numeric character code to be found on this website,
and I intend to keep it that way.
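For anyone who doubts that the nickname and the number really are interchangeable, here is a small check. It is only a sketch: it leans on Python's html.unescape as a stand-in for what a browser does when it decodes a reference, and the function name is my own invention.

```python
import html


def same_character(named: str, numeric: str) -> bool:
    """Return True when both references decode to the same character."""
    return html.unescape(named) == html.unescape(numeric)


# Pairs of equivalent spellings: the mnemonic first, the number second.
pairs = [
    ("&nbsp;", "&#160;"),
    ("&mdash;", "&#8212;"),
    ("&deg;", "&#176;"),
    ("&frac12;", "&#189;"),
]

for named, numeric in pairs:
    print(named, numeric, same_character(named, numeric))
```

Either form ends up as the same character on the page; the only difference is how readable the source is.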
There is a sizable collection of spacing characters; three of them are far more useful than the others:
&nbsp; | space |
&emsp; | em space |
&thinsp; | thin space |
&thinsp; is useful for separating italicized text from the normal text that follows it, because the slant of the italics usually eats into the standard spacing, causing the words to 'run together'. (Admittedly, I might be the only one who cares.)
Some other nicknames most frequently used on these pages:
&mdash; | — | em-dash |
&ndash; | – | en-dash |
&bull; | • | bullet |
&middot; | · | smaller than a bullet |
&hArr; | ⇔ | horizontal double-arrow |
&deg; | ° | degrees |
&sect; | § | section delimiter |
These entity references satisfy most mathematical requirements:
&times; | × | multiplication |
&divide; | ÷ | division |
&plusmn; | ± | plus or minus |
&ge; | ≥ | greater than or equal |
&le; | ≤ | less than or equal |
&ne; | ≠ | not equal |
&asymp; | ≈ | approximately equal |
&sup2; | ² | second power |
&sup3; | ³ | third power |
The invisible soft-hyphen lets a browser line-wrap long words without
otherwise displaying a hyphen on the page:
&shy; | optional hyphenation |
These are useful for writing such terms as "déjà vu":
&eacute; | é | French accents |
&agrave; | à | " |
There is nothing wrong with using the apostrophe or standard double-quote
character on the keyboard, except that some applications, particularly word processors,
tend to hijack them and replace them with what they deem to be better-looking
characters. (That is why many web pages and emails have some utter garbage
characters on them even though those functions can be adjusted in the writer's software
settings; but that's another topic.) These two can 'pretty-up'
some quotes:
&ldquo; | “ | left double-quote |
&rdquo; | ” | right double-quote |
Fractions:
&frac14; | ¼ | one-quarter |
&frac12; | ½ | one-half |
&frac34; | ¾ | three-quarters |
&frac18; | ⅛ | one-eighth |
&frac38; | ⅜ | three-eighths |
&frac58; | ⅝ | five-eighths |
&frac78; | ⅞ | seven-eighths |
&frac13; | ⅓ | one-third |
&frac23; | ⅔ | two-thirds |
&frac15; | ⅕ | one-fifth |
&frac25; | ⅖ | two-fifths |
&frac35; | ⅗ | three-fifths |
&frac45; | ⅘ | four-fifths |
&frac16; | ⅙ | one-sixth |
&frac56; | ⅚ | five-sixths |
Miscellaneous items:
&copy; | © | copyright |
&reg; | ® | registered |
&cent; | ¢ | cent |
&euro; | € | euro |
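Should you ever need the number that hides behind one of these nicknames, it can be looked up rather than memorized. The sketch below assumes Python's html.entities.html5 table, which maps each HTML5 name (with its trailing semicolon) to its character; the function name is hypothetical.

```python
from html.entities import html5


def numeric_reference(name: str) -> str:
    """Return the decimal numeric reference for a named reference.

    `name` is given without the leading & and trailing ; (e.g. "mdash").
    A few HTML5 names stand for two code points, hence the join.
    An unknown name raises KeyError.
    """
    characters = html5[name + ";"]
    return "".join(f"&#{ord(ch)};" for ch in characters)


for name in ("mdash", "hArr", "frac18", "euro"):
    print(f"&{name}; = {numeric_reference(name)}")
```

The lookup runs one way only, from nickname to number, which is the direction in which I would rather not rely on my memory.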
Finally, for what it's worth, I would share a largely forgotten shortcut
that was popular in the days of MS-DOS. Any character in the PC's 8-bit
character set (code page 437 on most US systems, often loosely called
"extended ASCII") can be displayed on the screen or placed
into a text document simply by holding down an <Alt>
key while typing its decimal value on the numeric keypad!
For example, <Alt>171 will produce the
½-character, and <Alt>172 begets
the ¼-symbol. Although those two numbers were
committed to memory a long time ago, I cannot remember all the
values; so I still use &deg; for degrees instead of
<Alt>248, and &divide; instead of
<Alt>246 for division.
Be aware that Windows and many word-processing programs also accept
four-digit numbers with a leading zero. These refer to the Windows character set
(Windows-1252 on Western systems) and don't match the older
codes; so, for example, one enters <Alt>0189 for the
½-symbol.
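Only some of the symbols listed here can be typed this way. The double-arrow,
the em and thin spaces, and the eighths, thirds, fifths and sixths, for example,
have no code in either 8-bit character set. Using this method with values greater than 255
will produce unexpected results.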
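The two numbering schemes can be compared side by side. This is a rough sketch under two assumptions that will not hold on every machine: that the codes without a leading zero index code page 437, and that the codes with a leading zero index Windows-1252. The function name is made up.

```python
def alt_code(keys: str) -> str:
    """Return the character that a given <Alt> key sequence would produce.

    Assumption: digits without a leading zero are looked up in code
    page 437, digits with a leading zero in Windows-1252.
    """
    value = int(keys)
    if not 0 <= value <= 255:
        raise ValueError("only values from 0 to 255 are meaningful here")
    leading_zero = len(keys) > 1 and keys.startswith("0")
    encoding = "cp1252" if leading_zero else "cp437"
    # A handful of Windows-1252 positions are unassigned; decoding one
    # of those raises UnicodeDecodeError.
    return bytes([value]).decode(encoding)


for keys in ("171", "172", "246", "248", "0189", "0176"):
    print(f"<Alt>{keys} -> {alt_code(keys)}")
```

Note that the same number can mean different characters in the two schemes, which is exactly why the leading zero matters.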