Ted's Computer World HTML & CSS
Tips and Tricks

UNICODE:  NICKNAMES OR NUMBERS?

Various bloggers maintain that, when specifying a Unicode special character, only the numeric code should be used.  My response to that is:  <b></u>ll<s>h<i>t!

An entity reference, or named character reference as it is called in HTML5, is a mnemonic equivalent, meaning that an English-speaking person can recognize it for what it is without having to memorize a number.

There are entity references for more than 1,000 Unicode characters, presumably because they are the ones most commonly used.  Virtually every coder utilizes the following four mnemonics a lot, possibly without even thinking of them as such:

& &amp; ampersand
< &lt; less than
> &gt; greater than
" &quot; quotation mark

In fact, according to my reading, those four, together with &apos; for the apostrophe, are the only entity references predefined in XML documents.  If you are writing XML, you might as well skip the rest of this page; otherwise, I daresay that you have occasion to use certain other non-keyboard characters with some frequency.  Which ones they are will depend upon the typical content of your own pages.
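For anyone who generates markup from a script, here is a minimal Python sketch of how those reserved characters can be swapped for their mnemonics.  It relies on the standard library's html.escape and html.unescape; the sample sentence is made up purely for illustration.

```python
import html

# Sample text containing the four reserved characters (made up for illustration).
raw = 'Tom & Jerry say "2 < 3 > 1"'

# html.escape replaces &, < and >, and by default quotation marks as well.
escaped = html.escape(raw)
print(escaped)

# html.unescape turns the references back into the original characters.
restored = html.unescape(escaped)
print(restored == raw)
```

Note that html.escape also converts an apostrophe, but it emits a numeric reference for that one rather than a name.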

The special character that I use most frequently is the non-breaking space (&nbsp;), mostly because I learned in grade school to put two spaces between my sentences.  If I had to type &#160; all the time, I would go mad.  Another character that I use a great deal is the em-dash (—).  When I type in &mdash;, I can see what is going on in my text; whereas the equivalent &#8212; would just clutter the page (and also make me crazy).
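That a nickname and its number really do stand for the same character is easy to check.  The following is only a sketch, using Python's standard html.unescape to decode both forms of the two references mentioned above:

```python
import html

# Each pair holds a named reference and the numeric reference quoted in the text.
pairs = [("&nbsp;", "&#160;"), ("&mdash;", "&#8212;")]

results = []
for named, numeric in pairs:
    # html.unescape decodes either form to the actual character.
    same = html.unescape(named) == html.unescape(numeric)
    results.append(same)
    print(named, numeric, same)
```

Both forms decode to the identical character, so the choice between them affects only how readable the source text is.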

These entity references satisfy most mathematical requirements:

× &times; multiplication
÷ &divide; division
± &plusmn; plus or minus
≥ &ge; greater than or equal
≤ &le; less than or equal
≠ &ne; not equal
≈ &asymp; approximately equal
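Should you ever need the number behind one of these nicknames, there is no need to memorize it.  As a rough sketch, Python's standard html.entities module carries a lookup table of the HTML 4 names; the helper numeric_reference below is a hypothetical name of my own, not part of any library.

```python
from html.entities import name2codepoint

# The mnemonic names from the list above, without the & and ; delimiters.
names = ["times", "divide", "plusmn", "ge", "le", "ne", "asymp"]

def numeric_reference(name: str) -> str:
    """Return the decimal numeric reference for an HTML 4 entity name.

    Hypothetical helper for illustration; raises KeyError for unknown names.
    """
    return f"&#{name2codepoint[name]};"

for name in names:
    # Print the nickname, its numeric equivalent, and the character itself.
    print(f"&{name};", numeric_reference(name), chr(name2codepoint[name]))
```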

The invisible soft-hyphen lets a browser line-wrap long words without otherwise displaying a hyphen on the page:

­ &shy; optional hyphenation

These are useful for writing such terms as "déjà vu":

é &eacute; French accents
à &agrave;       "

There is nothing wrong with using the apostrophe or standard double-quote character on the keyboard, except that some applications, particularly word processors, tend to hijack them and replace them with what they deem to be better-looking characters.  (That is why many web pages and emails have some utter garbage characters on them even though those functions can be adjusted in the software settings; but that's another topic.)  These two can "pretty-up" some quotes:

“ &ldquo; left double-quote
” &rdquo; right double-quote

I use these mostly in my hiking journals, but they have other uses as well:

⇔ &hArr; horizontal double-arrow
° &deg; degrees
§ &sect; section delimiter
· &middot; smaller than a bullet

Miscellaneous items:

– &ndash; n-dash
• &bull; bullet
© &copy; copyright
® &reg; registered
¢ &cent; cent
€ &euro; euro
¼ &frac14; one-quarter
½ &frac12; one-half
¾ &frac34; three-quarters
⅛ &frac18; one-eighth
⅜ &frac38; three-eighths
⅝ &frac58; five-eighths
⅞ &frac78; seven-eighths
⅓ &frac13; one-third
⅔ &frac23; two-thirds

Similar nicknames have been established for fifths and sixths; but those characters are rather ugly-looking, so I avoid them.  Examples:

⅙ &frac16; one-sixth
⅗ &frac35; three-fifths

Now, for what it's worth, I would share a mostly forgotten method that was popular in the days of MS-DOS.  In fact, any of the printable characters in the old IBM PC character set (the so-called extended ASCII, code page 437) can be displayed on the screen or placed into a text document simply by holding down an <Alt> key while typing its decimal value on the numeric keypad!

For example, <Alt>171 will produce the ½-character, and <Alt>172 begets the ¼-symbol.  Although those two numbers were committed to memory a long time ago, I cannot remember all the values; so I still use &deg; for degrees instead of <Alt>248 (or <Alt>0176), and &divide; instead of <Alt>246 for division.
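Those keypad values can be checked without touching the keypad.  The sketch below assumes that a plain three-digit Alt code is looked up in code page 437 and that a code typed with a leading zero is looked up in Windows-1252; the actual behavior depends on how a given system is configured.  The function alt_char is a hypothetical helper of my own.

```python
def alt_char(code: int, codepage: str = "cp437") -> str:
    """Decode one byte value in the given code page (assumed Alt-code behavior)."""
    return bytes([code]).decode(codepage)

# Plain Alt codes, assumed to refer to code page 437.
for code in (171, 172, 246, 248, 176):
    print(f"<Alt>{code}", alt_char(code))

# With a leading zero, the value is assumed to refer to Windows-1252 instead.
print("<Alt>0176", alt_char(176, "cp1252"))
```

Under these assumptions the value 176 yields a shaded block in code page 437, while the degree sign sits at 248 there and at 176 only in Windows-1252.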

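Strictly speaking, none of the special symbols listed here is a true ASCII character, since ASCII stops at 127; the Alt-key values come from the 256-character IBM PC set (code page 437).  That set includes ½, ¼, ÷, ± and °, but not the soft-hyphen, the double-arrow, ©, ® or €.  Using this method with values greater than 255 will produce unexpected results.  The choice is yours.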
