Difference between revisions of "Rationale"

From WHATWG Wiki

Revision as of 15:54, 12 July 2010

This document serves as a rationale document for various parts of the HTML5 specification. Over time this page will become a complete rationale document.

General Rationale

One Vendor, One Veto

Part of the goal of the WHATWG is to document how web browsers actually handle HTML. As such, browser vendors already have veto power: they can simply not follow the standard. The W3C and WHATWG do not have any enforcement power and can only write what browsers are willing to implement. Keeping features in the HTML standard that at least one browser vendor has stated they are unwilling to implement causes the HTML spec to not accurately document reality.[1][2] The veto isn't a power that we grant browsers, it's a right that they earn on their own by virtue of having users. The minimum market share for a veto is somewhere around 1%.[3]

Using elements where scripts "work"

In addition, arguments were made that JavaScript-based implementations of <details> suffer from problems and limitations. Scripting behavior may be inconsistent across browsers, or even unavailable in some contexts. Accessibility is "bolted on", allowing more opportunity for author error, even when using libraries. The data model is not exposed in a consistent way in the markup. And matching native appearance and behavior across a range of platforms may be impractical.[4]

Specific Elements

Plaintext

The <plaintext> element is an obsolete precursor to the <pre> element.[5] It is now in the HTML5 spec as a method of stopping all further HTML token parsing. It lacks an end tag and just emits the rest of the page as plain text. It throws a parse error upon reaching the end of the document, as it is not considered a valid element (and it is missing an end tag).
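
The effect on parsing can be sketched as follows. This is a simplified model rather than the spec's tokenizer: the function name splitAtPlaintext is hypothetical, and it assumes the start tag is written without attributes.

```javascript
// Simplified model: everything after the first <plaintext> start tag
// is treated as literal text and is never parsed as markup again.
// Assumption: the tag carries no attributes.
function splitAtPlaintext(source) {
  const match = /<plaintext>/i.exec(source);
  if (match === null) {
    return { markup: source, plainText: null };
  }
  return {
    markup: source.slice(0, match.index),
    plainText: source.slice(match.index + match[0].length),
  };
}
```

Note that in this model even a later "</plaintext>" ends up in plainText, which matches the description above: the element has no end tag.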

Image

The <image> element is treated as an alternate (but invalid) name for <img>. This is because some sites (around 0.2%[6]) make this mistake, and most major browsers already treat it as an image.
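
A minimal sketch of that aliasing, assuming a parser that works on tag names as strings. The function name normalizeStartTagName and the returned object shape are hypothetical.

```javascript
// Simplified model: a start tag named "image" is flagged as an error,
// and the parser carries on as if the author had written "img".
function normalizeStartTagName(tagName) {
  const name = tagName.toLowerCase();
  if (name === "image") {
    return { tagName: "img", parseError: true };
  }
  return { tagName: name, parseError: false };
}
```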

Meter and Progress (are not the same thing)

<meter> is not just a special case of <progress>. The meter element represents a scalar measurement within a known range, such as storage quota usage, a relative popularity rating or relevance indicator. The control allows for the indication of high and low ranges, or minimum, maximum and optimal levels.

The progress element, on the other hand, represents the completion progress of a task. This could be a real-time indicator for a background processing task (e.g. using Web Workers or a file upload). Progress elements can also be in the indeterminate state, indicating that something is in progress but its completion progress is unknown.

The default rendering for a meter element could look something like the following:

[Image: example of proper rendering for the meter element]

Whereas, the default rendering for the progress element could look like this:

Alternatively, an indeterminate progress bar could also be styled as a throbber, which indicates progress without any indication of the remaining progress:

[Image: the default Apple throbber]

See Re: <progress> draft for details.
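
The distinction between the two elements can be modelled in a few lines. This is only a sketch: the function names and the region labels ("low", "normal", "high") are hypothetical, and the functions work on plain numbers rather than on DOM elements.

```javascript
// Model of a progress bar. With no value it is indeterminate:
// something is happening, but how much is left is unknown.
// Returning -1 for that case is a convention chosen for this sketch.
function progressPosition(value, max = 1) {
  if (value === null || value === undefined) {
    return -1;
  }
  const clamped = Math.min(Math.max(value, 0), max);
  return clamped / max;
}

// Model of a meter: a measurement within a known range, classified
// against author-supplied low and high boundaries.
function meterRegion(value, { min = 0, max = 1, low = min, high = max } = {}) {
  const clamped = Math.min(Math.max(value, min), max);
  if (clamped < low) return "low";
  if (clamped > high) return "high";
  return "normal";
}
```

A meter always has a value to classify, whereas a progress bar may legitimately have none, which is why only the progress model needs an indeterminate case.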

B, I, EM, STRONG, and MARK

<em> is meant to indicate that some text is emphasized. <strong> is meant to confer importance upon text. <b> is meant for text that is stylistically offset from the rest of the text. Finally <i> is used to indicate that some text is meant to be read in an alternate mood.

For example

   Cats are <em>cute</em> animals.

could mean that cats are specifically cute.

   Cats are <strong>cute</strong> animals.

could mean that the word cute is in some way important.

   Cats are <b>cute</b> animals.

could mean that the word cute is a new word (perhaps in a language lesson) but is not important.

   Cats are <i>cute</i> animals.

could mean that the word cute is meant to be read in a different tone (sarcastically, for example).

   Cats are <mark>cute</mark> animals.

means that the sentence is to be read normally but the word "cute" should be highlighted or marked in some way. This could be used for search terms on the page or alterations to an original text.
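
The search-term use of <mark> could be produced by a script along these lines. This is a sketch under assumptions: the helper names escapeHtml and markSearchTerm are hypothetical, and matching is a plain case-insensitive substring search.

```javascript
// Escape text so it can be placed safely inside HTML.
function escapeHtml(text) {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;");
}

// Wrap every occurrence of a search term in <mark>, leaving the
// surrounding text unchanged apart from HTML escaping.
function markSearchTerm(text, term) {
  if (term === "") return escapeHtml(text);
  // Escape regular-expression metacharacters in the term.
  const pattern = new RegExp(term.replace(/[.*+?^${}()|[\]\\]/g, "\\$&"), "gi");
  let output = "";
  let lastIndex = 0;
  for (const match of text.matchAll(pattern)) {
    output += escapeHtml(text.slice(lastIndex, match.index));
    output += "<mark>" + escapeHtml(match[0]) + "</mark>";
    lastIndex = match.index + match[0].length;
  }
  return output + escapeHtml(text.slice(lastIndex));
}
```

Calling markSearchTerm("Cats are cute animals.", "cute") yields the <mark> example shown above.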

IMG tag & alt text

On certain types of pages, adding alt text is impossible (for example, sites where users can upload images but do not supply descriptions). Because of this, the alt attribute is optional.[7][8][9] A longdesc attribute is not needed.[10]

textarea

The textarea element defaults to soft wrapping. The wrap attribute can have one of the following values: soft, hard, or off.[11] The value "off" is considered non-conforming because it appears to have no purpose other than a visual presentational effect.[12]
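
The practical difference between soft and hard is in the submitted value: hard wrapping adds line breaks to it, soft wrapping does not. The sketch below is a naive model of that, not what any browser actually does. The function names are hypothetical, and breaking strictly every cols characters is an assumption, since real user agents choose their own break points.

```javascript
// Naive model of wrap="hard": break any line longer than `cols`
// characters into pieces of at most `cols` characters.
function hardWrap(value, cols) {
  if (!(cols > 0)) return value; // nothing sensible to do without a width
  return value
    .split("\n")
    .map((line) => {
      const pieces = [];
      for (let i = 0; i < line.length; i += cols) {
        pieces.push(line.slice(i, i + cols));
      }
      return pieces.join("\n");
    })
    .join("\n");
}

// With soft wrapping the submitted value is left as the user typed it.
function submittedValue(value, wrap, cols) {
  return wrap === "hard" ? hardWrap(value, cols) : value;
}
```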

hgroup and other heading elements

The point of <hgroup> is to hide the subtitle from the outlining algorithm.

details element

The <details> element is needed to provide an accessible way of reflecting a common application widget in HTML-based applications without requiring authors to use extensive scripting, ARIA, and platform-specific CSS to get the same effect.[13][14]

HTML parsing

script element

Why the restrictions for contents of script elements? Why the complicated parsing rules for script elements?

See http://lists.w3.org/Archives/Public/public-html-comments/2010Mar/0017.html

quirks mode

The HTML parser has the following behavior difference in quirks mode:

A start tag whose tag name is "table"
If the Document is not set to quirks mode, and the stack of open elements has a p element in scope, then act as if an end tag with the tag name "p" had been seen.

Why? See http://hsivonen.iki.fi/last-html-quirk/
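
The rule quoted above can be sketched as a function over the stack of open elements. This is a simplified model: the function name openTable is hypothetical, and "has a p element in scope" is approximated by checking whether a p is anywhere on the stack, whereas the real definition of scope stops at certain boundary elements.

```javascript
// Simplified model of handling a "table" start tag.
// The stack is an array of tag names, innermost element last.
function openTable(stackOfOpenElements, quirksMode) {
  const stack = [...stackOfOpenElements];
  if (!quirksMode && stack.includes("p")) {
    // Act as if </p> had been seen: pop up to and including the p.
    while (stack.pop() !== "p") {
      // keep popping until the p element itself has been removed
    }
  }
  stack.push("table");
  return stack;
}
```

With a stack of html, body, p, the table becomes a sibling of the paragraph in standards mode, but a child of it in quirks mode.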

ignored white space before head

White space before the <head> tag is ignored. The main reason is that, given the markup

<!DOCTYPE html>
<html>
 <head>
  <title>Sample page</title>
...,

some people expect

document.documentElement.firstChild

to return the head element.[15]
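
A sketch of why that expectation holds, using a hypothetical token shape rather than real parser tokens. It assumes the markup writes the head start tag explicitly, as in the sample above, and it does not model implied tags.

```javascript
// Simplified model: given the tokens that follow the <html> start tag,
// return the name of the first node that ends up under the root element.
// White space tokens seen before <head> are dropped, not inserted.
function firstChildOfHtml(tokensAfterHtmlTag) {
  for (const token of tokensAfterHtmlTag) {
    if (token.type === "whitespace") {
      continue; // ignored before <head>
    }
    return token.name;
  }
  return null;
}

// Tokens for the sample markup: a line break and indentation, then <head>.
const sampleTokens = [
  { type: "whitespace", data: "\n " },
  { type: "startTag", name: "head" },
];
```

If the white space were kept, the first child would be a text node and document.documentElement.firstChild would not be the head element.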

Failed proposals

An "advert" tag for advertisements

There is no advert tag because, if users had an easy method of disabling all ads from downloading or appearing, content authors would cease to use the tag.[16]

sandbox attribute on the html element

HTML is the wrong level for disabling scripts or other features. This is the kind of thing we should do at the HTTP layer.[17][18]

feature queries

Various proposals have come up with the idea of being able to determine whether a certain feature is available.[19] These fail for a variety of reasons. Part of the problem is that browser vendors will be economical with the truth: marketing people always have an over-optimistic view of the compliance of their product, and will always give themselves the benefit of the doubt in borderline cases. Also, changing the compliance statement to remove false claims that are exposed is likely to be a very low priority for the developers.[20] With regard to CSS feature compliance, remember that CSS provides hints; implementations don't have to accept those hints, and hardware may sometimes prevent their being implemented.[21] Some other reasons can be found in the footnotes.[22][23]

Other Pages

References

  1. http://lists.w3.org/Archives/Public/public-html/2009Jul/0257.html -- Re: Codecs for <video> and <audio>
  2. http://lists.w3.org/Archives/Public/www-archive/2009Jul/0075.html -- Formal Objection to One vendor, One Veto
  3. http://lists.whatwg.org/htdig.cgi/whatwg-whatwg.org/2010-June/026897.html
  4. http://lists.w3.org/Archives/Public/public-html/2010Jun/att-0659/issue-93-decision.html
  5. http://www.w3.org/MarkUp/draft-ietf-iiir-html-01.txt
  6. Email from Ian Hickson; comment in spec source
  7. http://www.paciellogroup.com/resources/articles/altinhtml5.html
  8. http://juicystudio.com/article/requiring-alt-attribute-html5.php
  9. http://lists.w3.org/Archives/Public/public-html/2007Jun/0393.html
  10. http://juicystudio.com/article/html5-image-element-no-alt.php
  11. http://www.whatwg.org/specs/web-apps/current-work/#the-textarea-element-0
  12. http://lists.whatwg.org/htdig.cgi/whatwg-whatwg.org/2009-August/022022.html
  13. http://www.w3.org/Bugs/Public/show_bug.cgi?id=8379#c13
  14. http://www.w3.org/html/wg/wiki/ChangeProposals/removedetails
  15. http://lists.whatwg.org/pipermail/whatwg-whatwg.org/2008-March/014148.html -- [whatwg] several messages about the tree construction stage of HTML parsing
  16. http://lists.whatwg.org/htdig.cgi/whatwg-whatwg.org/2008-February/013939.html
  17. http://www.w3.org/Bugs/Public/show_bug.cgi?id=8849
  18. https://wiki.mozilla.org/Security/CSP
  19. http://lists.w3.org/Archives/Public/www-style/2009Dec/0130.html
  20. http://lists.w3.org/Archives/Public/www-style/2010Jul/0097.html
  21. http://lists.w3.org/Archives/Public/www-style/2003Nov/0000.html
  22. http://lists.w3.org/Archives/Public/www-style/2003Oct/0074.html
  23. http://lists.w3.org/Archives/Public/www-style/2004Mar/0282.html