Encoding

From WHATWG Wiki

This page tracks notes related to the Encoding Standard. See Web Encodings for some historical data with respect to encodings and their labels.

Implementations

Legacy implementations

Gecko
http://mxr.mozilla.org/mozilla-central/source/intl/uconv/
Chromium
http://src.chromium.org/svn/trunk/deps/third_party/icu46/README.chromium
http://src.chromium.org/svn/trunk/deps/third_party/icu46/source/data/mappings/convrtrs.txt
http://src.chromium.org/svn/trunk/deps/third_party/icu46/source/data/mappings/ucmlocal.mk

Japanese encodings

Got these links after the standard was written:

Sniffing

Gecko notes

  • Thai detection was not enabled before June 2009
  • Encodings not in the spec: iso-2022-cn, euc-tw
  • GB2312 is supported, but its supersets gbk and gb18030 are not
  • UTF-16 sniffing is HTML specific

Misc

Labels

Labels in Opera that are not in the spec: 'ansi_x3.4-1986', 'cn-gb', 'cp367', 'cp50220', 'cp51932', 'cp932', 'cp936', 'csascii', 'cscp50220', 'cscp51932', 'csinvariant', 'csiso646basic1983', 'csunicode', 'csunicode11', 'csunicode11utf7', 'csunicodeascii', 'csunicodejapanese', 'csunicodelatin1', 'csviscii', 'cswindows31j', 'euc-cn', 'euc-tw', 'extended_unix_code_packed_format_for_japanese', 'ibm367', 'invariant', 'iso-10646', 'iso-10646-j-1', 'iso-10646-ucs-2', 'iso-10646-ucs-basic', 'iso-10646-unicode-latin1', 'iso-2022-cn', 'iso-2022-jp-1', 'iso-celtic', 'iso-ir-199', 'iso-ir-226', 'iso-ir-6', 'iso646-us', 'iso8859-16', 'iso885916', 'iso_646.basic:1983', 'iso_646.irv:1991', 'iso_8859-10:1992', 'iso_8859-14', 'iso_8859-14:1998', 'iso_8859-16', 'iso_8859-16:2001', 'iso_8859-6-e', 'iso_8859-6-i', 'iso_8859-8-e', 'iso_8859-8-i', 'l10', 'l8', 'latin-9', 'latin10', 'latin8', 'microsoft-cp1250', 'microsoft-cp1251', 'microsoft-cp1252', 'microsoft-cp1253', 'microsoft-cp1254', 'microsoft-cp1255', 'microsoft-cp1256', 'microsoft-cp1257', 'microsoft-cp1258', 'ms932', 'ms936', 'ref', 'tis-620-2533', 'unicode-1-1', 'unicode-1-1-utf-7', 'us', 'utf-7', 'viscii', 'windows-936', 'x-mac-ce', 'x-mac-greek', 'x-mac-turkish', 'x-user-defined'
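
A minimal sketch of how a list like the one above could be checked against a label table. It assumes the Encoding Standard's label matching rule (strip leading and trailing ASCII whitespace, then compare ASCII case-insensitively). The table here is a tiny illustrative subset, not the full table from the standard, and the names LABELS, get_encoding and labels_not_in_table are hypothetical helpers, not part of any implementation.

```python
# Illustrative subset only (an assumption for this sketch); the Encoding
# Standard defines the full label-to-encoding table.
LABELS = {
    "utf-8": "UTF-8",
    "utf8": "UTF-8",
    "unicode-1-1-utf-8": "UTF-8",
    "ascii": "windows-1252",
    "us-ascii": "windows-1252",
    "latin1": "windows-1252",
    "iso-8859-1": "windows-1252",
    "windows-1252": "windows-1252",
}

# ASCII whitespace: TAB, LF, FF, CR, SPACE.
ASCII_WHITESPACE = "\t\n\f\r "

# Lowercase only A-Z, so non-ASCII characters are never folded into ASCII.
_ASCII_LOWER = {ord(c): ord(c) + 32 for c in "ABCDEFGHIJKLMNOPQRSTUVWXYZ"}


def get_encoding(label, table=LABELS):
    """Return the encoding name for a label, or None if the label is unknown."""
    key = label.strip(ASCII_WHITESPACE).translate(_ASCII_LOWER)
    return table.get(key)


def labels_not_in_table(candidates, table=LABELS):
    """Return the candidate labels that do not resolve to any encoding."""
    return [c for c in candidates if get_encoding(c, table) is None]


if __name__ == "__main__":
    print(labels_not_in_table(["cp367", "us-ascii", "utf-7"]))
```

With the full table from the standard substituted for LABELS, the same helper could be used to recompute which of a browser's labels fall outside the spec.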