Rationale

This document serves as a rationale for various parts of the HTML5 specification. Over time, this page is intended to become a complete rationale document.

General Rationale

One Vendor, One Veto

Part of the goal of the WHATWG is to document how web browsers actually handle HTML. As such, browser vendors already have veto power simply by not following the standard. The W3C and WHATWG do not have any enforcement power and can only write what browsers are willing to implement. Keeping features in the HTML standard that at least one browser vendor has stated it is unwilling to implement would cause the HTML spec to document reality inaccurately.^[1]^[2] The veto isn’t a power that we grant browsers; it’s a right that they earn on their own by virtue of having users. The minimum market share for a veto is somewhere around 1%.^[3]

Using elements where scripts "work"

Arguments were made that JavaScript-based implementations of details suffer from problems and limitations. Scripting behavior may be inconsistent across browsers, or even unavailable in some contexts. Accessibility is "bolted on", allowing more opportunity for author error, even when using libraries. The data model is not exposed in a consistent way in the markup. And matching native appearance and behavior across a range of platforms may be impractical.^[4]

It isn't just about web browsers

Web browsers are not the only programs that use HTML. Sometimes elements and features are needed even when browsers won't use them in any meaningful way. Document authoring tools, validators, search engines, screen readers, outliners, researchers, etc. all need and can use more information than a browser can. Furthermore, providing more information than browsers currently use opens up room for innovation.

Experimenting with features

New, unknown, and untested features are unlikely to be accepted into the WHATWG spec. Browsers and browser extensions (like Google Gears) are expected to first establish use cases and implementation possibilities before the spec is changed.^[5]

Versioning the spec

Most authors don't care about whether or not an implementation supports an entire, full specification; they just want to know "Can I use this feature in this browser?" So saying that all major implementations support much of CSS 2 to a high degree of correctness is useless for knowing if, say, the author can use display: run-in. In other words, the feature tables are what web authors would actually use in real life.^[6]

Modifying existing semantics

Some elements have different semantics than what HTML4 users would expect. Semantic markup isn't very useful if most pages use elements in a manner that conflicts with the defined semantics. For example, if a search engine treated dd as enclosing a term being defined, for the purposes of searching for definitions (or excluding defining occurrences from results), it would not find many definitions, and it would misclassify things.^[7]

Specific Elements

`DOCTYPE`

Since HTML has moved to an unversioned model, the DOCTYPE does not have a version number. It is still necessary because legacy browsers operate in quirks mode (a non-spec-compliant rendering mode) if a DOCTYPE is absent.
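As a minimal sketch, a page using the unversioned DOCTYPE could start as follows. The title and body text are placeholders chosen for illustration.

```html
<!DOCTYPE html>
<html lang="en">
 <head>
  <meta charset="utf-8">
  <title>Sample page</title>
 </head>
 <body>
  <!-- With the DOCTYPE present, browsers do not fall back to quirks mode. -->
  <p>Hello, world.</p>
 </body>
</html>
```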

Sections

`hgroup` and other heading elements

The point of hgroup is to hide the subtitle from the outlining algorithm.
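A rough illustration, assuming the outlining behavior described on this page: the h1 below would be the heading that appears in the outline, while the h2 subtitle is grouped with it and hidden from the outline. The titles are made up for the example.

```html
<hgroup>
 <!-- Contributes the heading for the section's outline entry. -->
 <h1>Annual Report</h1>
 <!-- Subtitle: grouped with the h1, not a separate outline entry. -->
 <h2>Results for the past twelve months</h2>
</hgroup>
```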

`header` and `footer`

The primary purpose of these elements is merely to help the author write self-explanatory markup that is easy to maintain and style; they are not intended to impose specific structures on authors.^[8]
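One possible arrangement is sketched below; the content is illustrative only, and nothing requires an author to use this particular structure.

```html
<article>
 <header>
  <h1>Release notes</h1>
  <p>Posted by the maintainers</p>
 </header>
 <p>Main content of the article.</p>
 <footer>
  <p>Comments are closed for this entry.</p>
 </footer>
</article>
```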

Grouping content

`blockquote`

Attributions and inline citations do not belong inside the blockquote element because the specification does not consider them to be part of the block quote proper.^[9] In other words, the blockquote element represents only the quote itself.
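Under that reading, one way to mark up a quote with its attribution is to place the attribution in a sibling element after the blockquote. This is only a sketch; the quoted text and the title are placeholders.

```html
<blockquote>
 <p>The quoted passage goes here.</p>
</blockquote>
<!-- The attribution sits outside the blockquote, since it is not part of the quote. -->
<p>Source: <cite>Title of the Quoted Work</cite></p>
```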

Text-level semantics

`b`, `i`, `em`, `strong`, and `mark`

em is meant to indicate that some text is emphasized. strong is meant to confer importance upon text. b is meant for text that is stylistically offset from the rest of the text. Finally, i is used to indicate that some text is meant to be read in an alternate mood.

For example:

   Cats are <em>cute</em> animals.

could mean that cats are specifically cute.

   Cats are <strong>cute</strong> animals.

could mean that the word cute is in some way important.

   Cats are <b>cute</b> animals.

could mean that the word cute is a new word (perhaps in a language lesson) but is not important.

   Cats are <i>cute</i> animals.

could mean that the word cute is meant to be read in a different tone (sarcastically, for example).

   Cats are <mark>cute</mark> animals.

means that the sentence is to be read normally but the word “cute” should be highlighted or marked in some way. This could be used for search terms on the page or alterations to an original text.

Embedded content

On the status of `image`

The image element is treated as an alternate (but invalid) name for img. This is because some sites (around 0.2%^[10]) make this mistake. It is already treated as an image by most major browsers.

The `img` element and alternate (`alt`) text

On certain types of pages, adding alternate text (with the alt attribute) is impossible (e.g., sites where users can upload images but have no mechanism to supply a description). Because of this, the alt attribute is optional.^[11]^[12]^[13] A longdesc attribute is not needed.^[14]

Forms

`textarea`

The textarea element defaults to soft wrapping. The wrap attribute can have one of the following values: soft, hard, or off.^[15] "off" is considered a non-conforming value because it appears to have no purpose other than a visual presentational effect.^[16]^[17]
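The two conforming values can be sketched as follows. The form action and the control names are hypothetical; note that the HTML specification requires the cols attribute when hard wrapping is used.

```html
<form action="/submit" method="post">
 <!-- Soft wrapping (the default): text wraps on screen, but no line breaks are added to the submitted value. -->
 <textarea name="notes" wrap="soft"></textarea>

 <!-- Hard wrapping: line breaks are inserted into the submitted value so that no line exceeds cols characters. -->
 <textarea name="message" wrap="hard" cols="40"></textarea>
</form>
```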

`meter` and `progress` (are not the same thing)

meter is not just a special case of progress. The meter element represents a scalar measurement within a known range, such as storage quota usage, a relative popularity rating or relevance indicator. The control allows for the indication of high and low ranges, or minimum, maximum and optimal levels.

The progress element, on the other hand, represents the completion progress of a task. This could be a real-time indicator for a background processing task (e.g. using Web Workers or a file upload). progress elements can also be in the indeterminate state, indicating that something is in progress, but its completion progress is unknown.^[18]
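The difference can be illustrated with a short sketch. The numbers and fallback text are invented for the example; the attributes are those defined for the two elements.

```html
<!-- A scalar measurement within a known range: 75 of 100 GB of quota used.
     low and high mark the boundaries of the low and high ranges. -->
<meter min="0" max="100" low="25" high="80" value="75">75 GB of 100 GB</meter>

<!-- Completion progress of a task: 30 of 100 units done. -->
<progress max="100" value="30">30%</progress>

<!-- Indeterminate state: no value attribute, so the completion progress is unknown. -->
<progress>Working...</progress>
```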

The default rendering for a meter element could look something like the following:

[Figure: default rendering of a meter element]

Whereas the default rendering for the progress element could look like this:

[Figure: default rendering of a progress element]

Alternatively, an indeterminate progress bar could also be styled as a throbber, which indicates progress without any indication of the remaining progress:

[Figure: indeterminate progress bar styled as a throbber]

See Re: <progress> draft for details.

Interactive elements

`details` element

The details element is needed to provide an accessible way of reflecting a common application widget in HTML-based applications without requiring authors to use extensive scripting, ARIA, and platform-specific CSS to get the same effect.^[19]^[20]
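A minimal sketch of the widget, with no script, ARIA, or CSS involved. The summary text and the form control are placeholders.

```html
<details>
 <!-- The summary is always visible and toggles the rest of the content. -->
 <summary>Advanced options</summary>
 <p><label><input type="checkbox" name="verbose"> Verbose logging</label></p>
</details>
```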

HTML parsing

script element

Why the restrictions for contents of script elements? Why the complicated parsing rules for script elements?

See http://lists.w3.org/Archives/Public/public-html-comments/2010Mar/0017.html

@DEFER and @ASYNC

async tells the browser to fetch and run the script in parallel with the processing of the content that follows it (that is, asynchronously). defer tells the browser to run the script later and to process the following content first: the browser will not run the script until the page has finished parsing.^[21]
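The two attributes might be used as in the following sketch. The script file names are hypothetical.

```html
<!-- async: fetched while parsing continues; runs as soon as it is available. -->
<script src="analytics.js" async></script>

<!-- defer: fetched while parsing continues; runs only after the document has been parsed. -->
<script src="enhance.js" defer></script>

<p>This content is parsed without waiting for either script.</p>
```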

quirks mode

The HTML parser has the following behavior difference in quirks mode:

A start tag whose tag name is "table"
If the Document is not set to quirks mode, and the stack of open elements has a p element in scope, then act as if an end tag with the tag name "p" had been seen.

Why? See http://hsivonen.iki.fi/last-html-quirk/
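To illustrate the rule quoted above, consider the following fragment (the text is a placeholder). The comments describe the tree that results in each mode according to that rule.

```html
<p>Intro text
<table>
 <tr><td>Cell</td></tr>
</table>
<!-- Not in quirks mode: the table start tag implies </p>, so the table
     becomes a sibling that follows the p element. -->
<!-- In quirks mode: the p element stays open, so the table becomes a
     child of the p element. -->
```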

ignored white space before head

White space before the <head> tag is ignored. The main reason is that, given the markup

<!DOCTYPE html>
<html>
 <head>
  <title>Sample page</title>
...,

some people expect

document.documentElement.firstChild

to return the head element.^[22]

Failed proposals

A dedicated element for comments

There is no dedicated comment element (e.g., <comment>) for marking up content composed by a user in response to a newspaper or magazine article, blog entry, discussion topic, status update, image, video, etc. The spec suggests using nested articles instead.
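The nested-article approach could look roughly like this. The headings, names, and text are placeholders.

```html
<article>
 <h1>Blog entry title</h1>
 <p>Text of the entry.</p>

 <!-- Each user comment is an article nested inside the article it responds to. -->
 <article>
  <footer><p>Posted by a reader</p></footer>
  <p>Text of the comment.</p>
 </article>
</article>
```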

Several arguments have been put forth in favor of adding a comment element to the spec.^[23] Arguments and counterarguments are as follows:

Argument: Comments are not articles according to the commonly understood (dictionary) definition of “article.” Articles are relatively “long” pieces of writing, and they are not responses to what others have written. It’s clear that comments are not articles.

Counterargument: The element name “article” isn’t intended to carry the same meaning as its corresponding dictionary entry or any colloquial understanding of the term. A composition’s length and whether or not it’s authored by a site’s owner/staff or by its readers are completely irrelevant.

Argument: Comments are not articles according to the specification’s definition because, in general, comments are clearly not complete, self-contained, independently distributable or reusable pieces of writing. They are dependent on the context of what they are responding to. For example, “LOL” or “Yeah, especially when talking about your lobbyist friends!” are, on their own, unintelligible.

Counterargument: The definition of article does not require that a piece of writing be fully intelligible on its own. The terms “complete,” “self-contained,” and “independent” are really meant to convey the idea of separateness. Comments are separate from what they are commenting on—they are not part of the piece of writing they are referring to. Therefore they can be independently distributed or reused. An example of this is the website reddit (consider, for example, http://www.reddit.com/r/bestof/, where every post is an example of a comment that was independently reused and syndicated). It’s largely a matter of the author’s intent as to whether or not a piece of writing is something unto itself or part of a larger whole.

Argument: Comments can appear in reference to things that are not articles, such as blog posts, forum topics, social network status updates, images, videos, links, etc., which should not have to be marked up as articles just so the comments can be marked up as nested articles.

Counterargument: All of the content types mentioned are in fact articles according to the spec’s definition.

Argument: Robots and plugins can extract comments from web pages more easily if they have their own element. Comments can then be more easily syndicated, displayed, hidden, styled, etc.

Counterargument: There’s no compelling argument that a separate element would make this meaningfully easier than nested elements.

Argument: Comments sometimes appear in a different region of the page than the item that they are referencing, hence the markup for comments should not have to be contained within the markup of the item.

Counterargument: No evidence has been brought forward that this is a significant authorship issue.

A dedicated element for advertisements

There is no dedicated advertisement element (<ad>, or <advert>, or the like) because it would give users a relatively easy method of hiding or otherwise disabling ads (and therefore the element would very likely end up not being used by content authors).^[24]^[25]

sandbox attribute on the html element

HTML is the wrong level for disabling scripts or other features. This is the kind of thing we should do at the HTTP layer.^[26]^[27]

feature queries

Various proposals have come up with the idea of being able to determine if a certain feature is available.^[28] These fail for a variety of reasons. Part of the problem is that browser vendors will be economical with the truth. Marketing people always have an over-optimistic view of the compliance of their product, and will always give themselves the benefit of the doubt in borderline cases. Also, changing the compliance statement, to remove false claims that are exposed, is likely to be a very low priority for the developers.^[29] With regard to CSS feature compliance: remember that CSS provides hints; implementations don't have to accept those hints, and hardware may sometimes prevent their being implemented.^[30] Some other reasons can be found in the footnotes.^[31]^[32]

custom HTML elements

Custom elements make it impossible for search engines, developers, and browsers to understand the semantics of a page.^[33]

Other Pages

References

[1] http://lists.w3.org/Archives/Public/public-html/2009Jul/0257.html -- Re: Codecs for <video> and <audio>

[2] http://lists.w3.org/Archives/Public/www-archive/2009Jul/0075.html -- Formal Objection to One vendor, One Veto

[3] http://lists.whatwg.org/htdig.cgi/whatwg-whatwg.org/2010-June/026897.html

[4] http://lists.w3.org/Archives/Public/public-html/2010Jun/att-0659/issue-93-decision.html

[5] http://www.mail-archive.com/[email protected]/msg22577.html

[6] http://www.mail-archive.com/[email protected]/msg23306.html

[7] http://lists.whatwg.org/htdig.cgi/help-whatwg.org/2010-October/000668.html

[8] http://developers.whatwg.org/sections.html#the-footer-element

[9] http://developers.whatwg.org/grouping-content.html#the-blockquote-element

[10] Email from Ian Hickson; comment in spec source

[11] http://www.paciellogroup.com/resources/articles/altinhtml5.html

[12] http://juicystudio.com/article/requiring-alt-attribute-html5.php

[13] http://lists.w3.org/Archives/Public/public-html/2007Jun/0393.html

[14] http://juicystudio.com/article/html5-image-element-no-alt.php

[15] http://www.whatwg.org/specs/web-apps/current-work/#the-textarea-element-0

[16] http://lists.whatwg.org/htdig.cgi/whatwg-whatwg.org/2009-August/022022.html

[17] http://www.mail-archive.com/[email protected]/msg22660.html

[18] http://html5doctor.com/your-questions-answered-11/

[19] http://www.w3.org/Bugs/Public/show_bug.cgi?id=8379#c13

[20] http://www.w3.org/html/wg/wiki/ChangeProposals/removedetails

[21] http://www.mail-archive.com/[email protected]/msg22436.html

[22] [whatwg] several messages about the tree construction stage of HTML parsing

[23] http://lists.w3.org/Archives/Public/public-whatwg-archive/2012Jan/0226.html

[24] http://lists.whatwg.org/htdig.cgi/whatwg-whatwg.org/2008-February/013939.html

[25] http://lists.w3.org/Archives/Public/public-whatwg-archive/2012Jan/0226.html

[26] http://www.w3.org/Bugs/Public/show_bug.cgi?id=8849

[27] https://wiki.mozilla.org/Security/CSP

[28] http://lists.w3.org/Archives/Public/www-style/2009Dec/0130.html

[29] http://lists.w3.org/Archives/Public/www-style/2010Jul/0097.html

[30] http://lists.w3.org/Archives/Public/www-style/2003Nov/0000.html

[31] http://lists.w3.org/Archives/Public/www-style/2003Oct/0074.html

[32] http://lists.w3.org/Archives/Public/www-style/2004Mar/0282.html

[33] http://html5doctor.com/your-questions-13/?utm_source=feedburner&utm_medium=feed&utm_campaign=Feed%3A+html5doctor+%28HTML5doctor%29
