Character Encoding Detection
Lachlan Hunt (talk | contribs) (Added some Mozilla and IE observations)
Latest revision as of 13:49, 9 July 2013
This document is obsolete.
For the current specification, see: http://encoding.spec.whatwg.org/
This page is for documenting the way browsers handle character encoding detection.
Mozilla Observations
- When there is a BOM, it parses the first 2048 bytes.
  - If a complete meta element is found, that encoding is used.
  - If no meta element is found, UTF-8 is used.
- When there is no BOM, it parses the document.
  - Upon encountering a meta element, the document is re-parsed using the declared encoding if all of the following hold:
    - The declared encoding is not compatible with the default (i.e. it is anything but US-ASCII, ISO-8859-1 or Windows-1252).
    - Non-US-ASCII characters have already been detected.
    - The declared encoding is compatible with what has already been seen.
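The Mozilla observations above can be sketched roughly as follows. This is only an illustration of the listed rules, not Mozilla's actual code. The function names, the regular expression used to spot a complete meta element, and the reading of "compatible with what's already been seen" as "the bytes seen so far decode without error" are all assumptions made for this sketch.

```python
import re

# Encodings the observations treat as compatible with the default.
DEFAULT_COMPATIBLE = {"us-ascii", "iso-8859-1", "windows-1252"}

# Assumption: a "complete" meta element is one whose closing ">" has been seen.
META_CHARSET = re.compile(
    rb"<meta[^>]*charset\s*=\s*[\"']?([A-Za-z0-9_.:-]+)[^>]*>",
    re.IGNORECASE,
)


def encoding_when_bom_present(data: bytes) -> str:
    """BOM case: scan only the first 2048 bytes for a meta declaration."""
    match = META_CHARSET.search(data[:2048])
    if match:
        return match.group(1).decode("ascii").lower()
    return "utf-8"  # no meta element found in the prefix


def should_reparse(seen: bytes, declared: str) -> bool:
    """No-BOM case: decide whether to re-parse once a meta element is found.

    `seen` is the byte content parsed before the meta element.
    """
    declared = declared.strip().lower()
    if declared in DEFAULT_COMPATIBLE:
        return False  # declared encoding is compatible with the default
    if not any(byte > 0x7F for byte in seen):
        return False  # no non-US-ASCII bytes detected so far
    try:
        # Assumed meaning of "compatible with what's already been seen".
        seen.decode(declared)
    except (LookupError, UnicodeDecodeError):
        return False
    return True
```

For example, `should_reparse(b"caf\xc3\xa9", "utf-8")` satisfies all three conditions, whereas the same call with `b"plain text"` does not, because no non-US-ASCII byte has been seen yet.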
IE Observations
- The BOM is authoritative.
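A minimal sketch of what "authoritative" could mean in practice: if a BOM is present, it decides the encoding and any other declaration is ignored. Which BOMs IE recognises is not recorded here, so the three listed below, the `pick_encoding` name and the `windows-1252` fallback are assumptions for illustration only.

```python
import codecs
from typing import Optional

# Assumed set of recognised BOMs; the observation above does not list them.
BOM_ENCODINGS = (
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
)


def pick_encoding(data: bytes, declared: Optional[str] = None,
                  fallback: str = "windows-1252") -> str:
    """Return the BOM's encoding if a BOM is present, else the declared one."""
    for bom, name in BOM_ENCODINGS:
        if data.startswith(bom):
            return name  # the BOM wins over any declared encoding
    return declared or fallback
```

With this sketch, `pick_encoding(codecs.BOM_UTF8 + b"<html>", declared="shift_jis")` yields `"utf-8"`, because the BOM overrides the declaration.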