CP1252 Support Test

by Markus Kuhn

Microsoft extended the ISO 8859-1 character set by 27 additional characters in the range 0x80 to 0x9F and called the result Code Page 1252 (CP1252). These characters are important and widely used in high-quality typesetting of English and other European languages. The proper way to use them in HTML documents is via their Unicode equivalents, either encoded directly in UTF-8 or written as numeric character references (NCRs).
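As a rough illustration of the two options just described, the following Python sketch converts CP1252-encoded bytes either to UTF-8 or to ASCII text with NCRs. The function names cp1252_to_utf8 and cp1252_to_ncr are hypothetical and chosen only for this example. It relies on Python's built-in "cp1252" codec and the standard "xmlcharrefreplace" error handler, and it does not escape "&", "<" or ">", which a real HTML generator would also need to handle.

```python
def cp1252_to_utf8(raw: bytes) -> bytes:
    """Re-encode CP1252 bytes as UTF-8 (hypothetical helper)."""
    return raw.decode("cp1252").encode("utf-8")


def cp1252_to_ncr(raw: bytes, use_hex: bool = False) -> str:
    """Return ASCII text in which every non-ASCII character is an NCR."""
    text = raw.decode("cp1252")
    if not use_hex:
        # The standard error handler emits decimal NCRs such as &#8364;
        return text.encode("ascii", "xmlcharrefreplace").decode("ascii")
    # Hexadecimal NCRs have to be built by hand.
    return "".join(
        ch if ord(ch) < 0x80 else f"&#x{ord(ch):X};" for ch in text
    )


if __name__ == "__main__":
    sample = b"\x93Hi\x94 \x96 \x80 5"  # curly quotes, en dash, euro sign
    print(cp1252_to_ncr(sample))
    print(cp1252_to_ncr(sample, use_hex=True))
    print(cp1252_to_utf8(sample))
```

Note that Python's codec rejects the five byte values in 0x80 to 0x9F that CP1252 leaves unassigned, so input containing them raises UnicodeDecodeError in this sketch.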

The following table lists all 27 characters that CP1252 adds, each shown as its corresponding Unicode character:

Code point  Unicode name                                Hex NCR  Dec NCR  HTML Char Entity
U+0152      LATIN CAPITAL LIGATURE OE                   Œ        Œ        Œ
U+0153      LATIN SMALL LIGATURE OE                     œ        œ        œ
U+0160      LATIN CAPITAL LETTER S WITH CARON           Š        Š        Š
U+0161      LATIN SMALL LETTER S WITH CARON             š        š        š
U+0178      LATIN CAPITAL LETTER Y WITH DIAERESIS       Ÿ        Ÿ        Ÿ
U+017D      LATIN CAPITAL LETTER Z WITH CARON           Ž        Ž        Ž
U+017E      LATIN SMALL LETTER Z WITH CARON             ž        ž        ž
U+0192      LATIN SMALL LETTER F WITH HOOK              ƒ        ƒ        ƒ
U+02C6      MODIFIER LETTER CIRCUMFLEX ACCENT           ˆ        ˆ        ˆ
U+02DC      SMALL TILDE                                 ˜        ˜        ˜
U+2013      EN DASH                                     –        –        –
U+2014      EM DASH                                     —        —        —
U+2018      LEFT SINGLE QUOTATION MARK                  ‘        ‘        ‘
U+2019      RIGHT SINGLE QUOTATION MARK                 ’        ’        ’
U+201A      SINGLE LOW-9 QUOTATION MARK                 ‚        ‚        ‚
U+201C      LEFT DOUBLE QUOTATION MARK                  “        “        “
U+201D      RIGHT DOUBLE QUOTATION MARK                 ”        ”        ”
U+201E      DOUBLE LOW-9 QUOTATION MARK                 „        „        „
U+2020      DAGGER                                      †        †        †
U+2021      DOUBLE DAGGER                               ‡        ‡        ‡
U+2022      BULLET                                      •        •        •
U+2026      HORIZONTAL ELLIPSIS                         …        …        …
U+2030      PER MILLE SIGN                              ‰        ‰        ‰
U+2039      SINGLE LEFT-POINTING ANGLE QUOTATION MARK   ‹        ‹        ‹
U+203A      SINGLE RIGHT-POINTING ANGLE QUOTATION MARK  ›        ›        ›
U+20AC      EURO SIGN                                   €        €        €
U+2122      TRADE MARK SIGN                             ™        ™        ™
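The list of 27 characters can be cross-checked against a CP1252 implementation. The sketch below assumes Python's built-in "cp1252" codec and the standard unicodedata module. The function name cp1252_additions is hypothetical. It walks the byte range 0x80 to 0x9F, skips the byte values the codec treats as unassigned, and returns the byte value, Unicode code point and Unicode name for each remaining character.

```python
import unicodedata


def cp1252_additions():
    """List (byte, code point, name) for each CP1252 character in 0x80..0x9F."""
    rows = []
    for byte in range(0x80, 0xA0):
        try:
            ch = bytes([byte]).decode("cp1252")
        except UnicodeDecodeError:
            continue  # byte value is unassigned in this codec
        rows.append((byte, ord(ch), unicodedata.name(ch)))
    return rows


if __name__ == "__main__":
    # Sort by code point to match the order used in the table.
    for byte, cp, name in sorted(cp1252_additions(), key=lambda r: r[1]):
        print(f"0x{byte:02X}  U+{cp:04X}  {name}  &#x{cp:X};  &#{cp};")
```

With Python's codec this yields 27 entries, and five byte values in the range (0x81, 0x8D, 0x8F, 0x90, 0x9D) are skipped as unassigned. Other implementations may map those five bytes differently.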

Some example usages:

Markus Kuhn
created 2000-02-02 -- last modified 2000-05-15 -- http://www.cl.cam.ac.uk/~mgk25/ucs/CP1252.html