by Markus Kuhn
Microsoft extended the ISO 8859-1 character set with 27 additional characters in the range 0x80 to 0x9f and called the result Code Page 1252. These characters are important and widely used in high-quality typesetting of English and other European languages. The proper way to use them in HTML documents is through their Unicode equivalents, either encoded directly in UTF-8 or written as numeric character references (NCRs).
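As a minimal sketch of the two options, the following Python snippet decodes a short CP1252 byte string and re-expresses it once as UTF-8 and once as decimal NCRs. The sample string and the variable names are illustrative assumptions, not part of the original page; the codec name `cp1252` and the `xmlcharrefreplace` error handler are standard Python.

```python
# Sample CP1252 bytes (an assumption for illustration):
# 0x93 and 0x94 are curly double quotes, 0x96 is an en dash.
legacy = b"\x93Hello\x94 \x96 world"

# Decode the legacy bytes into Unicode text.
text = legacy.decode("cp1252")

# Option 1: store the characters directly as UTF-8.
utf8_bytes = text.encode("utf-8")

# Option 2: keep the file pure ASCII and write every non-ASCII
# character as a decimal numeric character reference.
ncr_bytes = text.encode("ascii", errors="xmlcharrefreplace")

print(utf8_bytes)
print(ncr_bytes.decode("ascii"))  # &#8220;Hello&#8221; &#8211; world
```

Either output can be placed in an HTML document; the UTF-8 form additionally requires that the document declares UTF-8 as its encoding.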
The following table lists all 27 characters that CP1252 adds, identified by their Unicode code points and names, each shown as a hexadecimal NCR, a decimal NCR, and an HTML character entity:
Unicode | Character name | Hex NCR | Dec NCR | HTML Char Entity |
---|---|---|---|---|
U+0152 | LATIN CAPITAL LIGATURE OE | Œ | Œ | Œ |
U+0153 | LATIN SMALL LIGATURE OE | œ | œ | œ |
U+0160 | LATIN CAPITAL LETTER S WITH CARON | Š | Š | Š |
U+0161 | LATIN SMALL LETTER S WITH CARON | š | š | š |
U+0178 | LATIN CAPITAL LETTER Y WITH DIAERESIS | Ÿ | Ÿ | Ÿ |
U+017D | LATIN CAPITAL LETTER Z WITH CARON | Ž | Ž | Ž |
U+017E | LATIN SMALL LETTER Z WITH CARON | ž | ž | ž |
U+0192 | LATIN SMALL LETTER F WITH HOOK | ƒ | ƒ | ƒ |
U+02C6 | MODIFIER LETTER CIRCUMFLEX ACCENT | ˆ | ˆ | ˆ |
U+02DC | SMALL TILDE | ˜ | ˜ | ˜ |
U+2013 | EN DASH | – | – | – |
U+2014 | EM DASH | — | — | — |
U+2018 | LEFT SINGLE QUOTATION MARK | ‘ | ‘ | ‘ |
U+2019 | RIGHT SINGLE QUOTATION MARK | ’ | ’ | ’ |
U+201A | SINGLE LOW-9 QUOTATION MARK | ‚ | ‚ | ‚ |
U+201C | LEFT DOUBLE QUOTATION MARK | “ | “ | “ |
U+201D | RIGHT DOUBLE QUOTATION MARK | ” | ” | ” |
U+201E | DOUBLE LOW-9 QUOTATION MARK | „ | „ | „ |
U+2020 | DAGGER | † | † | † |
U+2021 | DOUBLE DAGGER | ‡ | ‡ | ‡ |
U+2022 | BULLET | • | • | • |
U+2026 | HORIZONTAL ELLIPSIS | … | … | … |
U+2030 | PER MILLE SIGN | ‰ | ‰ | ‰ |
U+2039 | SINGLE LEFT-POINTING ANGLE QUOTATION MARK | ‹ | ‹ | ‹ |
U+203A | SINGLE RIGHT-POINTING ANGLE QUOTATION MARK | › | › | › |
U+20AC | EURO SIGN | € | € | € |
U+2122 | TRADE MARK SIGN | ™ | ™ | ™ |
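The table can be regenerated programmatically. The sketch below assumes Python's built-in `cp1252` codec and its HTML 4 entity table (`html.entities.codepoint2name`); the function name `cp1252_additions` is hypothetical. In that HTML 4 table, U+017D and U+017E have no named entity, so the last field is `None` for those two rows.

```python
import unicodedata
from html.entities import codepoint2name  # HTML 4 named entities only


def cp1252_additions():
    """Return (code point, name, hex NCR, dec NCR, entity) for each
    character that CP1252 defines in the range 0x80 to 0x9f."""
    rows = []
    for byte in range(0x80, 0xA0):
        try:
            char = bytes([byte]).decode("cp1252")
        except UnicodeDecodeError:
            # Python's codec leaves five positions in this range undefined.
            continue
        cp = ord(char)
        entity = codepoint2name.get(cp)  # None if HTML 4 has no name for it
        rows.append((
            cp,
            unicodedata.name(char),
            f"&#x{cp:04X};",
            f"&#{cp};",
            f"&{entity};" if entity else None,
        ))
    # Order by Unicode code point, as in the table above.
    return sorted(rows, key=lambda row: row[0])


for cp, name, hex_ncr, dec_ncr, entity in cp1252_additions():
    print(f"U+{cp:04X} | {name} | {hex_ncr} | {dec_ncr} | {entity or '(none)'}")
```

Unlike the rendered table, this prints the reference strings themselves (for example `&#x20AC;`) rather than the characters they produce.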
Some example usages:
Markus Kuhn
created 2000-02-02 -- last modified 2000-05-15 -- http://www.cl.cam.ac.uk/~mgk25/ucs/CP1252.html