Character Encoding Detection

From WHATWG Wiki

Latest revision as of 13:49, 9 July 2013

This document is obsolete.

For the current specification, see: http://encoding.spec.whatwg.org/

This page is for documenting the way browsers handle character encoding detection.

Mozilla Observations

  • When there is a BOM, Mozilla parses the first 2048 bytes.
    • If a complete meta element is found, the encoding it declares is used.
    • If no meta element is found, UTF-8 is used.
  • When there is no BOM, it parses the document.
    • Upon encountering a meta element, the document is re-parsed using the declared encoding if all of the following hold:
      • The declared encoding is not compatible with the default (i.e. it is anything but US-ASCII, ISO-8859-1 or Windows-1252).
      • Non-US-ASCII characters have been detected.
      • The declared encoding is compatible with what has already been seen.
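
The observations above can be read as a small decision procedure. The following Python sketch is only an illustration of that reading, not Mozilla's actual code. The function names, the simplified meta-matching regular expression, and the interpretation of "compatible with what's already been seen" as "the bytes seen so far decode without error under the declared encoding" are all assumptions made for this example.

```python
import re

# Number of bytes examined when a BOM is present (from the observations above).
SNIFF_LIMIT = 2048

# Encodings treated as compatible with the default (from the observations above).
DEFAULT_COMPATIBLE = {"us-ascii", "iso-8859-1", "windows-1252"}

# Hypothetical, simplified matcher for a complete meta element that declares
# a charset. A real browser uses a proper prescan, not a regular expression.
META_CHARSET = re.compile(
    rb'<meta[^>]*charset\s*=\s*["\']?([A-Za-z0-9_.:\-]+)[^>]*>',
    re.IGNORECASE,
)


def find_meta_charset(data):
    """Return the lowercased charset from the first complete meta element, or None."""
    match = META_CHARSET.search(data)
    return match.group(1).decode("ascii").lower() if match else None


def encoding_with_bom(data):
    """BOM case: use a meta declaration found in the first 2048 bytes, else UTF-8."""
    return find_meta_charset(data[:SNIFF_LIMIT]) or "utf-8"


def should_reparse(seen_bytes, declared):
    """No-BOM case: True only if all three observed conditions hold."""
    # 1. The declared encoding must differ from the default-compatible set.
    if declared.lower() in DEFAULT_COMPATIBLE:
        return False
    # 2. Non-US-ASCII bytes must already have been seen.
    if all(byte < 0x80 for byte in seen_bytes):
        return False
    # 3. Assumption: "compatible" means the bytes seen so far decode cleanly.
    try:
        seen_bytes.decode(declared)
    except (UnicodeDecodeError, LookupError):
        return False
    return True
```

For instance, `should_reparse("é".encode("utf-8"), "utf-8")` meets all three conditions, while a document that has contained only ASCII bytes so far does not trigger a re-parse under this reading.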

IE Observations

  • The BOM is authoritative.
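
As a rough illustration of what "authoritative" means here, the sketch below lets a BOM override any other declared encoding. It is an assumption-laden example rather than IE's implementation: the set of BOMs checked, the function names, and the `windows-1252` fallback are hypothetical choices for this sketch and are not taken from the observations.

```python
import codecs

# Assumed set of recognised BOMs; the page does not say which ones IE honours.
BOM_TABLE = (
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
)


def bom_encoding(data):
    """Return the encoding implied by a leading BOM, or None if there is none."""
    for bom, name in BOM_TABLE:
        if data.startswith(bom):
            return name
    return None


def pick_encoding(data, declared=None, fallback="windows-1252"):
    """A BOM wins over a declared encoding; the fallback is a hypothetical default."""
    return bom_encoding(data) or declared or fallback
```

With this sketch, `pick_encoding(codecs.BOM_UTF8 + b"<html>", declared="iso-8859-1")` selects UTF-8 because the BOM takes precedence over the declaration.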