Ted's Computer World HTML & CSS
Tips and Tricks

UNICODE:  NICKNAMES OR NUMBERS?

Various bloggers maintain that, when specifying a Unicode special character, only the numeric code should be used.  My response to that is:  <b></u>ll<s>h<i>t!

An entity reference, or named character reference as it is called in HTML5, is a mnemonic equivalent, meaning that an English-speaking person can recognize it for what it is without having to memorize a number.

There are entity references for more than a thousand Unicode characters, presumably because they are the ones most commonly used.  Virtually every coder utilizes the following four mnemonics a lot, possibly without even thinking of them as such:

& &amp; ampersand
< &lt; less than
> &gt; greater than
" &quot; quotation mark

In fact, according to my reading, those four, together with &apos; for the apostrophe, are the only entity references that are predefined in XML documents.  If you are writing XML, you might as well skip the rest of this page; otherwise, I daresay that you have occasion to use certain other non-keyboard characters with some frequency.  Which ones they are will depend upon the typical content of your own pages.
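For anyone who wants to check the XML behaviour rather than take my reading for it, here is a minimal sketch in Python.  It assumes the standard-library xml.etree.ElementTree parser with no DTD supplied; the helper name parses_as_xml is made up for this illustration.  XML itself predefines five names (the four above plus &apos;), and anything else has to be declared or written as a number.

```python
import xml.etree.ElementTree as ET


def parses_as_xml(fragment: str) -> bool:
    """Return True if the fragment is well-formed for a parser given no DTD."""
    try:
        ET.fromstring(fragment)
        return True
    except ET.ParseError:
        return False


# The predefined XML entities need no declaration.
predefined_ok = parses_as_xml("<p>&amp; &lt; &gt; &quot; &apos;</p>")

# An HTML nickname such as &nbsp; is an undefined entity in plain XML ...
nbsp_named_ok = parses_as_xml("<p>a&nbsp;b</p>")

# ... while the numeric reference for the same character is accepted.
nbsp_numeric_ok = parses_as_xml("<p>a&#160;b</p>")

print(predefined_ok, nbsp_named_ok, nbsp_numeric_ok)
```

Other XML parsers, or documents that declare their own entities in a DTD, may well behave differently.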

The special character that I use most frequently is the non-breaking space (&nbsp;), mostly because I learned in grade school to put two spaces between my sentences.  If I had to type &#160; all the time, I would go mad.  Another frequently used character is the em-dash (—).  When I type in &mdash;, I can see what is going on in my text; whereas the equivalent &#8212; would just clutter the page (and also make me crazy).  Excepting the examples on this page, there is not a single numeric character code to be found on this website, and I intend to keep it that way.
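To show that the nickname and the number really are two spellings of the same character, here is a small sketch using Python's standard html module.  It is only an illustration: a browser does its own decoding, and the variable names are mine.

```python
import html
from html.entities import name2codepoint

# html.unescape decodes both named and numeric character references.
named = html.unescape("&nbsp;&mdash;")
numeric = html.unescape("&#160;&#8212;")
same_characters = named == numeric

# name2codepoint maps a nickname (without & and ;) to its number.
nbsp_number = name2codepoint["nbsp"]
mdash_number = name2codepoint["mdash"]

print(same_characters, nbsp_number, mdash_number)  # True 160 8212
```

Either spelling ends up as the same character on the page, so the choice comes down to what is easier to read in the source.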

There is a sizable collection of spacing characters; three of them are far more useful than the others:

  &nbsp; non-breaking space
&emsp; em space
&thinsp; thin space

&thinsp; is useful for separating italicized text from subsequent normal text, because the standard spacing usually has been compromised, causing the words to 'run together'.  (Admittedly, I might be the only one who cares.)

Some other nicknames most frequently used on these pages:

— &mdash; em-dash
– &ndash; en-dash
• &bull; bullet
· &middot; smaller than a bullet
⇔ &hArr; horizontal double-arrow
° &deg; degrees
§ &sect; section delimiter

These entity references satisfy most mathematical requirements:

× &times; multiplication
÷ &divide; division
± &plusmn; plus or minus
≥ &ge; greater than or equal
≤ &le; less than or equal
≠ &ne; not equal
≈ &asymp; approximately equal
² &sup2; second power
³ &sup3; third power

The invisible soft-hyphen lets a browser line-wrap long words without otherwise displaying a hyphen on the page:

­ &shy; optional hyphenation

These are useful for writing such terms as "déjà vu":

é &eacute; French accents
à &agrave;       "

There is nothing wrong with using the apostrophe or standard double-quote character on the keyboard, except that some applications, particularly word processors, tend to hijack them and replace them with what they deem to be better-looking characters.  (That is why many web pages and emails contain utter garbage characters, even though the substitution can be turned off in the writer's software settings; but that's another topic.)  These two can 'pretty-up' some quotes:

“ &ldquo; left double-quote
” &rdquo; right double-quote

Fractions:

¼ &frac14; one-quarter
½ &frac12; one-half
¾ &frac34; three-quarters
⅛ &frac18; one-eighth
⅜ &frac38; three-eighths
⅝ &frac58; five-eighths
⅞ &frac78; seven-eighths
⅓ &frac13; one-third
⅔ &frac23; two-thirds
⅕ &frac15; one-fifth
⅖ &frac25; two-fifths
⅗ &frac35; three-fifths
⅘ &frac45; four-fifths
⅙ &frac16; one-sixth
⅚ &frac56; five-sixths
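
If you would like to confirm that each fraction nickname decodes to the value its name suggests, the following sketch does so with Python's standard html and unicodedata modules.  The list and the helper name fraction_value are my own, for illustration only.

```python
import html
import unicodedata

# The fraction nicknames listed above, without the leading & and trailing ;
FRACTION_NAMES = [
    "frac14", "frac12", "frac34",
    "frac18", "frac38", "frac58", "frac78",
    "frac13", "frac23",
    "frac15", "frac25", "frac35", "frac45",
    "frac16", "frac56",
]


def fraction_value(name: str) -> float:
    """Decode &name; to its character and return that character's numeric value."""
    char = html.unescape(f"&{name};")
    return unicodedata.numeric(char)


values = {name: fraction_value(name) for name in FRACTION_NAMES}

print(values["frac38"], values["frac12"])  # 0.375 0.5
```

A nickname that did not exist would come back from html.unescape unchanged, and unicodedata.numeric would then raise an error, so a clean run suggests that all fifteen names are recognized.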

Miscellaneous items:

© &copy; copyright
® &reg; registered
¢ &cent; cent
€ &euro; euro

Finally, for what it's worth, I would share a largely forgotten shortcut that was popular in the days of MS-DOS.  Any of the printable characters in the extended ASCII set (values up to 255) can be displayed on the screen or placed into a text document simply by holding down an <Alt> key while typing its numeric value on the numeric keypad!

For example, <Alt>171 will produce the ½-character, and <Alt>172 begets the ¼-symbol.  Although those two numbers were committed to memory a long time ago, I cannot remember all the values; so I still use &deg; for degrees instead of <Alt>248, and &divide; instead of <Alt>246 for division.

Be aware that certain operating systems and word-processing programs utilize a shortcut system involving four-digit numbers that don't match the older method; so, for example, one enters <Alt>0189 for the ½-symbol.

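Strictly speaking, only the first four symbols on this page are true ASCII characters; the rest of the <Alt> values come from 8-bit extensions of ASCII (typically code page 437 for the older three-digit method and Windows-1252 for the four-digit one).  Several of the symbols listed here, such as the double-arrow, the not-equal sign, and the fractions other than ¼, ½ and ¾, have no <Alt> value in either set.  Using this method with values greater than 255 will produce unexpected results.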
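
The two numbering schemes can be modeled with a short Python sketch.  It rests on assumptions that may not hold on your machine: a Windows system with Western settings, where three-digit values are looked up in code page 437 and values typed with a leading zero are looked up in Windows-1252.  The function name alt_code is hypothetical.

```python
def alt_code(digits: str) -> str:
    """Character that <Alt>+digits would produce under the assumptions above."""
    value = int(digits)
    if not 32 <= value <= 255:
        # Values below 32 and above 255 behave differently and are not modeled.
        raise ValueError("this sketch only models values 32 to 255")
    # A leading zero selects the newer (Windows-1252) numbering.
    encoding = "cp1252" if digits.startswith("0") else "cp437"
    # Note: a few Windows-1252 values (such as 0129) are unassigned and
    # raise UnicodeDecodeError here.
    return bytes([value]).decode(encoding)


for digits in ["171", "172", "246", "0189", "0176"]:
    print(digits, alt_code(digits))
```

Under those assumptions, 171 and 0189 both give the one-half symbol, which is why the same character can have two different numbers depending on the method used.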
