Changes from HTML4

From WHATWG Wiki
Revision as of 13:05, 5 April 2007 by Annevk (→Syntax)

Global overview

HTML5 differs from HTML4 in that it addresses both document and application semantics, which makes it better suited to the web applications being built today. Where implementations differ from HTML4, HTML5 reflects what they actually do, so that the language is implementable and compatible with the web. Inspired by the forward-compatible error handling in CSS, HTML5 defines detailed processing models where necessary, so that implementations become interoperable and the language stays extensible.

HTML5 also integrates DOM Level 2 HTML, so the element-specific APIs are defined along with the rest of the language. Because the language is mostly defined in terms of the DOM, an XML serialization follows easily. This XML serialization is called XHTML5 and is essentially an update to XHTML 1.x.
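
As a rough illustration of why a DOM-based definition makes an XML serialization nearly free, the sketch below builds a small tree through generic DOM calls and then writes it out as XML. It uses Python's xml.dom.minidom purely as a stand-in for a browser DOM. The document content and variable names are made up for the example, and setting the xmlns attribute by hand is a minidom detail, not something this page prescribes.

```python
from xml.dom.minidom import getDOMImplementation

XHTML_NS = "http://www.w3.org/1999/xhtml"

# Build a tiny document through DOM calls only (no markup strings).
doc = getDOMImplementation().createDocument(XHTML_NS, "html", None)
root = doc.documentElement
# minidom does not emit namespace declarations on its own, so add one explicitly.
root.setAttribute("xmlns", XHTML_NS)

body = doc.createElementNS(XHTML_NS, "body")
root.appendChild(body)

paragraph = doc.createElementNS(XHTML_NS, "p")
paragraph.appendChild(doc.createTextNode("Line one"))
paragraph.appendChild(doc.createElementNS(XHTML_NS, "br"))
paragraph.appendChild(doc.createTextNode("Line two"))
body.appendChild(paragraph)

# Serializing the tree as XML yields well-formed, XHTML-style markup.
xhtml = root.toxml()
print(xhtml)
```

The same tree could equally be written out with HTML syntax rules; only the serializer changes, not the elements or their APIs.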


Writing HTML5
HTML5 specifies its own syntax rules that authors have to follow. These syntax rules are compatible with the XHTML syntax rules, although this does not imply that parsing such a document with an HTML parser will give the same result as parsing it with an XML parser.
Parsing HTML5
HTML5 defines its own parsing rules (including "error correction") for text/html resources and no longer assumes that SGML features are supported.
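
The two points above can be sketched with Python's standard library. Note the assumption: html.parser is a lenient tokenizer, not an implementation of the HTML5 parsing rules, so it only approximates how a text/html parser recovers from errors. The helper names (TagCollector, html_tags, xml_tags) and the sample strings are invented for this example.

```python
from html.parser import HTMLParser
import xml.etree.ElementTree as ET


class TagCollector(HTMLParser):
    """Records start-tag names in the order the tokenizer reports them."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        self.tags.append(tag)


def html_tags(markup):
    """Tag names as seen by the lenient HTML tokenizer."""
    collector = TagCollector()
    collector.feed(markup)
    collector.close()
    return collector.tags


def xml_tags(markup):
    """Tag names as seen by an XML parser, or None if the input is not well-formed."""
    try:
        return [element.tag for element in ET.fromstring(markup).iter()]
    except ET.ParseError:
        return None


sloppy = "<P>first<br>second"        # unclosed elements
tidy = "<P>first<br/>second</P>"     # well-formed XML

for label, markup in (("sloppy", sloppy), ("tidy", tidy)):
    print(label, "html:", html_tags(markup), "xml:", xml_tags(markup))
```

The HTML-style tokenizer accepts both inputs and lowercases the tag names, while the XML parser rejects the unclosed markup outright and preserves case for the well-formed one. So even markup that both parsers accept does not necessarily produce the same result.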

New elements

Document Structure

  • article
  • aside
  • dialog
  • figure
  • footer
  • header
  • nav
  • section


  • audio
  • embed
  • m
  • meter
  • source
  • time
  • video


  • canvas
  • command
  • datagrid
  • details
  • datalist (Web Forms 2)
  • event-source
  • output (Web Forms 2)
  • progress

Changed elements

These elements have a new meaning in HTML5 that is incompatible with HTML4. The new meaning either better reflects the way they are used on the web or gives them a purpose so that authors can start using them.

  • b: represents a span of text to be stylistically offset from the normal prose without conveying any extra importance, such as key words in a document abstract, product names in a review, or other spans of text whose typical typographic presentation is bold
  • hr: represents a paragraph-level thematic break
  • i: represents a span of text in an alternate voice or mood, or otherwise offset from the normal prose, such as a taxonomic designation, a technical term, an idiomatic phrase from another language, a thought, a ship name, or some other prose whose typical typographic presentation is italic
  • menu: redefined to be useful for actual menus
  • small: represents small print (for side comments and legal print)
  • strong: represents importance rather than strong emphasis

Dropped elements

Dropping these elements means that authors are no longer allowed to use them. User agents will still have to support them, and HTML5 will probably gain a rendering section in due course that specifies exactly how. (isindex, for instance, is already supported by the parser.)

  • acronym (use abbr instead)
  • applet (use object instead)
  • basefont
  • big
  • center
  • dir
  • font (allowed when inserted by WYSIWYG editors)
  • frame
  • frameset
  • isindex
  • noframes
  • noscript (only dropped in XHTML5)
  • s
  • strike
  • tt
  • u
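
Since the list above is a rule for authors rather than for user agents, one way to apply it is a small authoring check. The following is a minimal sketch, assuming Python's html.parser as the tokenizer. The names DroppedElementFinder and find_dropped and the sample markup are hypothetical. The exception for font inserted by WYSIWYG editors is not modeled, and noscript is only flagged when checking XHTML5.

```python
from html.parser import HTMLParser

# Elements listed as dropped in the section above.
DROPPED = frozenset({
    "acronym", "applet", "basefont", "big", "center", "dir", "font",
    "frame", "frameset", "isindex", "noframes", "s", "strike", "tt", "u",
})
# noscript is listed as dropped only in XHTML5.
DROPPED_IN_XHTML_ONLY = frozenset({"noscript"})
# Replacements named in the list; no replacement is listed for the others.
REPLACEMENTS = {"acronym": "abbr", "applet": "object"}


class DroppedElementFinder(HTMLParser):
    """Collects (tag, suggested replacement or None) for each dropped element."""

    def __init__(self, xhtml=False):
        super().__init__()
        self.dropped = (DROPPED | DROPPED_IN_XHTML_ONLY) if xhtml else DROPPED
        self.found = []

    def handle_starttag(self, tag, attrs):
        # html.parser reports tag names in lowercase.
        if tag in self.dropped:
            self.found.append((tag, REPLACEMENTS.get(tag)))


def find_dropped(markup, xhtml=False):
    finder = DroppedElementFinder(xhtml=xhtml)
    finder.feed(markup)
    finder.close()
    return finder.found


sample = "<center><acronym title='x'>HTML</acronym> <noscript>off</noscript></center>"
for tag, replacement in find_dropped(sample):
    hint = f" (use {replacement} instead)" if replacement else ""
    print(f"dropped element: {tag}{hint}")
```

Calling find_dropped(sample, xhtml=True) additionally reports noscript, matching the note in the list.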