Changes from HTML4: Difference between revisions

From WHATWG Wiki
(Add links and fix some editorial stuff)
Line 3: Line 3:
HTML5 is different from HTML4 in a way that it addresses both document and application semantics making it more suitable for the web applications created today. HTML5 also reflects implementations better where they differ from HTML4 to ensure the language is implementable and compatible with the web. Inspired by the forward compatible error handling in CSS HTML5 defines detailed processing models where necessary to ensure that implementations become interoperable and that the language stays extensible in the future.
HTML5 is different from HTML4 in a way that it addresses both document and application semantics making it more suitable for the web applications created today. HTML5 also reflects implementations better where they differ from HTML4 to ensure the language is implementable and compatible with the web. Inspired by the forward compatible error handling in CSS HTML5 defines detailed processing models where necessary to ensure that implementations become interoperable and that the language stays extensible in the future.


HTML5 also integrates DOM Level 2 HTML so the element specific APIs are defined along with the rest of the language. Because the language is mostly defined in terms of the DOM it's very easy to get an XML serialization as well. This XML serialization is called XHTML5 and is basically an update to XHTML1.x.
HTML5 also integrates a new version of DOM Level 2 HTML so the element-specific APIs are defined along with the rest of the language. Because the language is mostly defined in terms of the DOM it's very easy to get an XML serialization as well. This XML serialization is called XHTML5 and is basically an update to XHTML1.x.


== Syntax ==
== Syntax ==


; Writing HTML5: HTML5 specifies its own syntax rules authors have to follow. These syntax rules are compatible with the XHTML syntax rules althoug this does not imply that parsing such a document with an HTML parser will give the same result as parsing it with an XML parser.
; Writing HTML5: HTML5 specifies its own syntax rules authors have to follow. HTML5 documents can be written in a way that looks exactly like XHTML although this does not imply that parsing such a document with an HTML parser will give the same result as parsing it with an XML parser.
; Parsing HTML5: HTML5 defines its own parsing rules (including "error correction") for text/html resources and no longer assumes SGML features are supported.
; Parsing HTML5: HTML5 defines its own parsing rules (including "error correction") for text/html resources and no longer pretends that HTML is an application of SGML.


== New elements ==
== New elements ==
Line 14: Line 14:
=== Document Structure ===
=== Document Structure ===


* article
* [http://www.whatwg.org/specs/web-apps/current-work/#the-article article]
* aside
* [http://www.whatwg.org/specs/web-apps/current-work/#the-aside aside]
* dialog
* [http://www.whatwg.org/specs/web-apps/current-work/#the-dialog dialog]
* figure
* [http://www.whatwg.org/specs/web-apps/current-work/#the-figure figure]
* footer
* [http://www.whatwg.org/specs/web-apps/current-work/#the-footer footer]
* header
* [http://www.whatwg.org/specs/web-apps/current-work/#the-header header]
* nav
* [http://www.whatwg.org/specs/web-apps/current-work/#the-nav nav]
* section
* [http://www.whatwg.org/specs/web-apps/current-work/#the-section section]


=== Data ===
=== Data ===


* audio
* [http://www.whatwg.org/specs/web-apps/current-work/#audio audio]
* embed
* [http://www.whatwg.org/specs/web-apps/current-work/#the-embed embed]
* m
* [http://www.whatwg.org/specs/web-apps/current-work/#the-m m]
* meter
* [http://www.whatwg.org/specs/web-apps/current-work/#the-meter meter]
* source
* [http://www.whatwg.org/specs/web-apps/current-work/#the-source source]
* time
* [http://www.whatwg.org/specs/web-apps/current-work/#the-time time]
* video
* [http://www.whatwg.org/specs/web-apps/current-work/#the-video video]


=== Applications ===
=== Applications ===


* canvas
* [http://www.whatwg.org/specs/web-apps/current-work/#the-canvas canvas]
* command
* [http://www.whatwg.org/specs/web-apps/current-work/#the-command command]
* datagrid
* [http://www.whatwg.org/specs/web-apps/current-work/#the-datagrid datagrid]
* details
* [http://www.whatwg.org/specs/web-apps/current-work/#the-details details]
* datalist (Web Forms 2)
* [http://www.whatwg.org/specs/web-forms/current-work/#the-datalist datalist] (Web Forms 2)
* event-source
* [http://www.whatwg.org/specs/web-apps/current-work/#the-event-source event-source]
* output (Web Forms 2)
* [http://www.whatwg.org/specs/web-forms/current-work/#the-output output] (Web Forms 2)
* progress
* [http://www.whatwg.org/specs/web-apps/current-work/#the-progress progress]


== Changed elements ==
== Changed elements ==


These elements have a new meaning in HTML5 which is incompatible with HTML4. The new meaning better reflects the way they are used on the web or gives them a purpose so people can start using them.
These elements have new meanings in HTML5 which are incompatible with HTML4. The new meanings better reflects the way they are used on the Web or gives them a purpose so people can start using them.


; a: The a element without an href attribute represents a "placeholder link".
; [http://www.whatwg.org/specs/web-apps/current-work/#the-a a]: The a element without an href attribute represents a "placeholder link".
; address: The address element is now scoped by the new concept of sectioning.
; [http://www.whatwg.org/specs/web-apps/current-work/#the-address address]: The address element is now scoped by the new concept of sectioning.
; b: The b element now represents a span of text to be stylistically offset from the normal prose without conveying any extra importance, such as key words in a document abstract, product names in a review, or other spans of text whose typical typographic presentation is boldened
; [http://www.whatwg.org/specs/web-apps/current-work/#the-b b]: The b element now represents a span of text to be stylistically offset from the normal prose without conveying any extra importance, such as key words in a document abstract, product names in a review, or other spans of text whose typical typographic presentation is boldened
; hr: The hr element now represents a paragraph-level thematic break
; [http://www.whatwg.org/specs/web-apps/current-work/#the-hr hr]: The hr element now represents a paragraph-level thematic break
; i: The i element now represents a span of text in an alternate voice or mood, or otherwise offset from the normal prose, such as a taxonomic designation, a technical term, an idiomatic phrase from another language, a thought, a ship name, or some other prose whose typical typographic presentation is italicized
; [http://www.whatwg.org/specs/web-apps/current-work/#the-i i]: The i element now represents a span of text in an alternate voice or mood, or otherwise offset from the normal prose, such as a taxonomic designation, a technical term, an idiomatic phrase from another language, a thought, a ship name, or some other prose whose typical typographic presentation is italicized
; menu: The menu element is redefined to be useful for actual menus
; [http://www.whatwg.org/specs/web-apps/current-work/#the-menu menu]: The menu element is redefined to be useful for actual menus
; small: The small element now represents small print (for side comments and legal print)
; [http://www.whatwg.org/specs/web-apps/current-work/#the-small small]: The small element now represents small print (for side comments and legal print)
; strong: The strong element now represents importance rather than strong emphasis
; [http://www.whatwg.org/specs/web-apps/current-work/#the-strong strong]: The strong element now represents importance rather than strong emphasis


== Dropped Elements ==
== Dropped Elements ==
Line 61: Line 61:
That these elements are dropped means that authors are no longer allowed to use them. User agents will still have to support them and HTML5 will probably get a rendering section in due course that says exactly how. (isindex for instance is already supported by the parser.)
That these elements are dropped means that authors are no longer allowed to use them. User agents will still have to support them and HTML5 will probably get a rendering section in due course that says exactly how. (isindex for instance is already supported by the parser.)


* acronym (use abbr instead)
* acronym (use [http://www.whatwg.org/specs/web-apps/current-work/#the-abbr abbr] instead)
* applet (use object instead)  
* applet (use [http://www.whatwg.org/specs/web-apps/current-work/#the-object object] instead)  
* basefont  
* basefont  
* big  
* big  
* center  
* center  
* dir
* dir
* font (allowed when inserted by WYSIWYG editors)
* [http://www.whatwg.org/specs/web-apps/current-work/#the-font font] (allowed when inserted by WYSIWYG editors)
* frame  
* frame  
* frameset  
* frameset  
* isindex  
* isindex  
* noframes  
* noframes  
* noscript (only dropped in XHTML5)  
* [http://www.whatwg.org/specs/web-apps/current-work/#the-noscript noscript] (only dropped in XHTML5)  
* s  
* s  
* strike  
* strike  
Line 85: Line 85:
! Element !! Attributes
! Element !! Attributes
|-
|-
| a || rev, charset
| [http://www.whatwg.org/specs/web-apps/current-work/#the-a a] || rev, charset
|-
|-
| area || nohref
| [http://www.whatwg.org/specs/web-apps/current-work/#the-area area] || nohref
|-
|-
| form || target
| [http://www.whatwg.org/specs/web-apps/current-work/#the-form form] || target
|-
|-
| head || profile
| [http://www.whatwg.org/specs/web-apps/current-work/#the-head head] || profile
|-
|-
| html || version
| [http://www.whatwg.org/specs/web-apps/current-work/#the-html html] || version
|-
|-
| link || rev, target, charset
| [http://www.whatwg.org/specs/web-apps/current-work/#the-link link] || rev, target, charset
|-
|-
| meta || scheme
| [http://www.whatwg.org/specs/web-apps/current-work/#the-meta meta] || scheme
|-
|-
| object || archive, standby
| [http://www.whatwg.org/specs/web-apps/current-work/#the-object object] || archive, standby
|-
|-
| param || valuetype
| [http://www.whatwg.org/specs/web-apps/current-work/#the-param param] || valuetype
|-
|-
| script || charset
| [http://www.whatwg.org/specs/web-apps/current-work/#the-script script] || charset
|-
|-
| table || summary
| [http://www.whatwg.org/specs/web-apps/current-work/#the-table table] || summary
|-
|-
| td, th || headers, axis
| [http://www.whatwg.org/specs/web-apps/current-work/#the-td td], [http://www.whatwg.org/specs/web-apps/current-work/#the-th th] || headers, axis
|}
|}


Line 116: Line 116:
HTML5 introduces a number of APIs that should help in creating web applications. These can be used together with the new elements introduced for applications:
HTML5 introduces a number of APIs that should help in creating web applications. These can be used together with the new elements introduced for applications:


* 2D drawing API which can be used with the new canvas element
* [http://www.whatwg.org/specs/web-apps/current-work/#the-2d 2D drawing API] which can be used with the new [http://www.whatwg.org/specs/web-apps/current-work/#the-canvas canvas] element
* API for playing of video and audio which can be used with the new video and audio elements
* [http://www.whatwg.org/specs/web-apps/current-work/#media API for playing of video and audio] which can be used with the new [http://www.whatwg.org/specs/web-apps/current-work/#video video] and [http://www.whatwg.org/specs/web-apps/current-work/#audio audio] elements
* Persistent storage
* [http://www.whatwg.org/specs/web-apps/current-work/#storage Persistent storage]
* Online / offline events
* [http://www.whatwg.org/specs/web-apps/current-work/#offline Online / offline events]
* Editing API in combination with a new global contenteditable attribute
* [http://www.whatwg.org/specs/web-apps/current-work/#editing Editing API] in combination with a new global [http://www.whatwg.org/specs/web-apps/current-work/#contenteditable0 contenteditable] attribute
* Drag & drop API in combination with a draggable attribute.
* [http://www.whatwg.org/specs/web-apps/current-work/#dnd Drag & drop API] in combination with a [http://www.whatwg.org/specs/web-apps/current-work/#draggable draggable] attribute.
* Network API
* [http://www.whatwg.org/specs/web-apps/current-work/#network Network API]
* API that exposes the history and allows pages to add to it to prevent breaking the back button.
* [http://www.whatwg.org/specs/web-apps/current-work/#history API that exposes the history] and allows pages to add to it to prevent breaking the back button.
* Cross document messaging
* [http://www.whatwg.org/specs/web-apps/current-work/#crossDocumentMessages Cross document messaging]
* Listening to server sent events
* [http://www.whatwg.org/specs/web-apps/current-work/#server-sent-events Listening to server sent events]


== Character Encoding ==
== Character Encoding ==

Revision as of 17:42, 5 April 2007

Global overview

HTML5 differs from HTML4 in that it addresses both document and application semantics, making it more suitable for the web applications created today. Where implementations differ from HTML4, HTML5 follows the implementations, to ensure the language is implementable and compatible with the web. Inspired by the forward-compatible error handling in CSS, HTML5 defines detailed processing models where necessary to ensure that implementations become interoperable and that the language stays extensible in the future.

HTML5 also integrates a new version of DOM Level 2 HTML, so the element-specific APIs are defined along with the rest of the language. Because the language is mostly defined in terms of the DOM, it's very easy to get an XML serialization as well. This XML serialization is called XHTML5 and is basically an update to XHTML 1.x.

Syntax

Writing HTML5
HTML5 specifies its own syntax rules that authors have to follow. HTML5 documents can be written in a way that looks exactly like XHTML, although this does not imply that parsing such a document with an HTML parser will give the same result as parsing it with an XML parser.
Parsing HTML5
HTML5 defines its own parsing rules (including "error correction") for text/html resources and no longer pretends that HTML is an application of SGML.

New elements

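Document Structure

  • article
  • aside
  • dialog
  • figure
  • footer
  • header
  • nav
  • section

Data

  • audio
  • embed
  • m
  • meter
  • source
  • time
  • video

Applications

  • canvas
  • command
  • datagrid
  • details
  • datalist (Web Forms 2)
  • event-source
  • output (Web Forms 2)
  • progress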

Changed elements

These elements have new meanings in HTML5 which are incompatible with HTML4. The new meanings better reflect the way they are used on the Web or give them a purpose so people can start using them.

a
The a element without an href attribute represents a "placeholder link".
address
The address element is now scoped by the new concept of sectioning.
b
The b element now represents a span of text to be stylistically offset from the normal prose without conveying any extra importance, such as key words in a document abstract, product names in a review, or other spans of text whose typical typographic presentation is bold.
hr
The hr element now represents a paragraph-level thematic break.
i
The i element now represents a span of text in an alternate voice or mood, or otherwise offset from the normal prose, such as a taxonomic designation, a technical term, an idiomatic phrase from another language, a thought, a ship name, or some other prose whose typical typographic presentation is italicized.
menu
The menu element is redefined to be useful for actual menus.
small
The small element now represents small print (for side comments and legal print).
strong
The strong element now represents importance rather than strong emphasis.

Dropped Elements

That these elements are dropped means that authors are no longer allowed to use them. User agents will still have to support them and HTML5 will probably get a rendering section in due course that says exactly how. (isindex for instance is already supported by the parser.)

  • acronym (use abbr instead)
  • applet (use object instead)
  • basefont
  • big
  • center
  • dir
  • font (allowed when inserted by WYSIWYG editors)
  • frame
  • frameset
  • isindex
  • noframes
  • noscript (only dropped in XHTML5)
  • s
  • strike
  • tt
  • u

Dropped Attributes

Some attributes that were defined in HTML4 are not included in HTML5. Here's a current list (subject to change, see the spec):

Element    Attributes
a          rev, charset
area       nohref
form       target
head       profile
html       version
link       rev, target, charset
meta       scheme
object     archive, standby
param      valuetype
script     charset
table      summary
td, th     headers, axis

In addition, HTML5 has none of the presentational attributes that were in HTML4 (including those on <table>). Any attributes defined on elements that are not in HTML5 are (obviously) also not in HTML5.

APIs

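HTML5 introduces a number of APIs that should help in creating web applications. These can be used together with the new elements introduced for applications:

  • 2D drawing API which can be used with the new canvas element
  • API for playing of video and audio which can be used with the new video and audio elements
  • Persistent storage
  • Online / offline events
  • Editing API in combination with a new global contenteditable attribute
  • Drag & drop API in combination with a draggable attribute
  • Network API
  • API that exposes the history and allows pages to add to it to prevent breaking the back button
  • Cross document messaging
  • Listening to server sent events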

Character Encoding

The character encoding can be declared using the meta element, but the syntax of the meta element has changed. In HTML 4.01 and earlier, the declaration was written as:

<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">

In HTML5, the syntax was simplified to remove the unnecessary markup while remaining compatible with the encoding detection implemented in most existing browsers:

<meta charset="UTF-8">

HTML 4 Algorithm

Source: section 5.2.2, "Specifying the character encoding", HTML 4.01 Specification.

  1. An HTTP "charset" parameter in a "Content-Type" field.
  2. A META declaration with "http-equiv" set to "Content-Type" and a value set for "charset".
  3. The charset attribute set on an element that designates an external resource.
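
As a rough illustration, the list above can be read as a priority order, highest first. The sketch below assumes that reading; the function name html4_charset and its three parameters are hypothetical and only stand in for the three sources listed.

```python
from typing import Optional


def html4_charset(
    http_charset: Optional[str],
    meta_charset: Optional[str],
    link_charset: Optional[str],
) -> Optional[str]:
    """Return the first declared encoding, in the order of the list above.

    Assumption: the three sources are consulted from highest to lowest
    priority, and an empty or missing value means "not declared".
    """
    for candidate in (http_charset, meta_charset, link_charset):
        if candidate:
            return candidate
    return None  # nothing declared; the user agent falls back to a default
```

For example, a page served with an HTTP charset of ISO-8859-1 that also contains a META declaration for UTF-8 would be treated as ISO-8859-1 under this reading.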

HTML5 Algorithm

The exact algorithm that browsers must follow in order to determine the character encoding is specified in HTML5. The basic algorithm works as follows:

  1. If the transport layer specifies an encoding (e.g. the HTTP Content-Type header), use that and abort these steps.
  2. Read the first 512 bytes of the file, or the whole file if it is shorter than that.
  3. If the file starts with a UTF-8, UTF-16 or UTF-32 BOM, then use that and abort these steps.
  4. Otherwise use the special algorithm to search the first 512 bytes for a meta element that declares the encoding. The algorithm is relatively lenient in what it will detect, though since it doesn't use the normal parsing algorithm, there are some restrictions.
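
The four steps above can be sketched roughly as follows. This is a simplified illustration, not the algorithm from the spec: the function name sniff_encoding is hypothetical, and the regular expression is only a stand-in for the special meta-scanning algorithm, which has more rules than are shown here.

```python
import codecs
import re
from typing import Optional

# BOMs are checked longest-first so that the UTF-32 LE BOM (FF FE 00 00)
# is not mistaken for the UTF-16 LE BOM (FF FE).
_BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
]

# Assumption: a lenient pattern that finds "charset=..." inside a <meta> tag.
# It matches both the HTML 4.01 form and the short HTML5 form.
_META_CHARSET = re.compile(
    rb"<meta\b[^>]*?charset\s*=\s*[\"']?\s*([A-Za-z0-9_.:-]+)",
    re.IGNORECASE,
)


def sniff_encoding(data: bytes, transport_encoding: Optional[str] = None) -> Optional[str]:
    """Guess the encoding of an HTML byte stream, following the steps above."""
    # Step 1: the transport layer (e.g. HTTP Content-Type) wins outright.
    if transport_encoding:
        return transport_encoding.lower()

    # Step 2: only the first 512 bytes are examined.
    head = data[:512]

    # Step 3: a byte order mark decides the encoding.
    for bom, name in _BOMS:
        if head.startswith(bom):
            return name

    # Step 4: look for a meta declaration within those 512 bytes.
    match = _META_CHARSET.search(head)
    if match:
        return match.group(1).decode("ascii").lower()

    return None  # no declaration found; the caller picks a default
```

Because the scan is limited to the first 512 bytes, a meta declaration placed later in the file is not seen by this sketch, which is one reason to put the declaration as early in the document as possible.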