FAQ

What is the WHATWG and why did it form?

In 2004, after a W3C workshop, Apple, Mozilla and Opera were becoming increasingly concerned about the W3C’s direction with XHTML, its lack of interest in HTML and its apparent disregard for the needs of real-world authors. In response, these organisations set out with a mission to address these concerns, and the Web Hypertext Application Technology Working Group was born.

These days, the WHATWG is a growing community of browser vendors, web developers, and other people interested in the development of the next generation of HTML and related technologies, specifically designed to allow authors to write and deploy applications over the World Wide Web.

What are “Web Applications”?

The term “Web Application” in this context refers to applications accessed over the World Wide Web by using a Web browser. This group is not attempting to describe APIs for writing high-end sophisticated programs such as office productivity suites, graphics manipulation packages, or 3D games.

Some of the most famous examples of Web applications currently deployed are eBay and Amazon.

Aren’t “Web Applications” already possible?

Yes. This working group aims to make their development easier, and hopes to specify new technologies that allow much prettier and more usable interfaces, with less dependence on complex scripts and server-generated pages, and a more seamless user experience.

For example, HTML forms currently provide no way to mark a control as required, meaning that it must be filled in before submission; such features have to be scripted explicitly.
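As a rough illustration of the kind of feature meant here, the sketch below uses the `required` attribute drafted in Web Forms 2.0. The form's `action` URL and the field name are hypothetical, and the behaviour depends on browser support; browsers without that support simply ignore the attribute and submit the form as before.

```html
<!-- Sketch only: "/subscribe" and the field name "email" are made up for illustration -->
<form action="/subscribe" method="post">
  <!-- "required" asks the browser to block submission while the control is empty -->
  <label>Email address: <input type="text" name="email" required></label>
  <input type="submit" value="Subscribe">
</form>
```

Without such an attribute, the same check has to be written by hand in a submit handler, which is the explicit scripting the answer above refers to.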

What exactly are you working on?

The work is currently split between three specifications although the main focus is on the HTML 5 draft.

Web Forms 2.0 is a superset of the HTML 4 and XHTML 1.0 form chapters. More advanced controls such as RTF controls, menus and toolbars are the domain of HTML 5, which is a rewrite of HTML 4, XHTML 1.0 and DOM Level 2 HTML. These drafts are in active development. Web Forms 2.0 is the most mature and will in due course be integrated into HTML 5.

Web Controls 1.0 is intended to add functionality to JavaScript and CSS that aids the creation of custom widgets. However, this will be influenced by the design and implementations of XBL2, and so will not be available in the near future.

What is (X)HTML 5?

[Web Applications 1.0, Web Forms 2.0…]

What about XHTML 2.0?

[Find a simple, diplomatic way to talk about the status of XHTML 2.0. See Web Apps notes about XHTML2]

What about XForms?

[see Web Forms notes about XForms]

Why do we need both HTML 5 and XHTML 2.0?

We don’t. What the wider Web community needs is a language which, if implemented by a Web browser, will result in a browser that can render all the existing content on the Web, and which will have new features to make the Web a better place. XHTML 2.0 is not such a language: a browser that only supports XHTML 2.0 could not render existing Web content correctly.

Why improve HTML?

[Because that’s what most authors are using and all browsers support.]

Why not improve XHTML instead?

[HTML 5 is actually making improvements to both HTML and XHTML…]

Will (X)HTML 5 finally put an end to the XHTML as `text/html` debate?

Yes. Unlike in HTML 4.01 and XHTML 1.0, the choice between HTML and XHTML depends solely on the MIME type, not on the DOCTYPE. See [/html-vs-xhtml HTML vs. XHTML].

Is XHTML better than HTML?

[Some people say XHTML is better because… But see the next question.]

Is HTML better than XHTML?

[Some people say HTML is better because… But see the previous question.]

What will the DOCTYPE be?

In HTML: <!doctype html>. The only reason there is one for HTML is to trigger standards mode in browsers.

In XHTML: no DOCTYPE. You may include one if you wish, though this is not recommended, because a DOCTYPE is only relevant to a validating parser, which web browsers do not use.
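For the HTML case, a minimal document using this DOCTYPE might look like the sketch below. The title and paragraph text are placeholders chosen for illustration.

```html
<!doctype html>
<!-- The doctype above is the whole declaration: no public identifier, no DTD URL -->
<html>
  <head>
    <title>Example page</title>
  </head>
  <body>
    <p>Hello, world.</p>
  </body>
</html>
```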

If there is no DTD, how can I validate my page?

With a conformance checker.

What is an HTML Serialisation?

The HTML serialisation refers to the syntax of an HTML document defined in HTML5. The syntax is inspired by the SGML syntax from earlier versions of HTML, but it is defined to be more compatible with the way browsers actually handle HTML.

Any document whose MIME type is determined to be text/html is considered to be an HTML serialisation, even if the author has tried to use XML syntax.

What is an XML (or XHTML) Serialisation?

The XML serialisation refers to the syntax defined by XML 1.0 and Namespaces in XML 1.0. A resource that has an XML MIME type, such as application/xhtml+xml or application/xml, is an XML document, and if it uses elements in the HTML namespace, it contains XHTML. If the root element is “html” in the HTML namespace, the document is referred to as an XHTML document.

What MIME type does HTML5 use?

The HTML serialisation must be served using the text/html MIME type.

The XHTML serialisation must be served using an XML MIME type, such as application/xhtml+xml or application/xml. Unlike XHTML 1.0, XHTML 5 must not be served as text/html.

Using the incorrect MIME type (text/html) for XHTML will cause the document to be parsed according to the parsing requirements for HTML. In other words, it will be treated as tag soup. Using an XML MIME type is the only way to ensure that browsers handle the document as XML.

Browsers don’t even fully support HTML 4.01 and XHTML 1.0 yet, so why create a new version?

[HTML 4.01 and XHTML 1.0 are underspecified. In its current state, it is not possible for browsers to interoperably implement HTML 4.01 while still remaining compatible.]

HTML isn’t broken, why fix it?

[HTML is actually extremely broken. (explain…)]

Which browser vendors support the development of HTML 5?

Apple, Mozilla and Opera.

Which browser vendors support the development of XHTML 2.0?

None that we are aware of.

What about Microsoft and Internet Explorer?

HTML 5 is being developed with IE compatibility in mind. Support for many features can be simulated using JavaScript.

When will we be able to start using these new features?

As soon as browsers begin to support them. You do not need to wait till HTML5 becomes a recommendation, because that can’t happen until after the implementations are completely finished. [Need to expand on this answer]

Is HTML 5 backwards compatible?

Yes. (explain)

How can I get involved?

There are lots of ways you can get involved. Take a look and see what you can do!

Do I have to pay a membership fee to participate?

No, participation is open to everyone. You may easily subscribe to the WHATWG mailing lists. You may also join the W3C’s new HTMLWG by going through the slightly longer application process.

How can I keep track of changes to the spec?

The specification is available in the Subversion repository. You may use any svn client to check out the latest version and use your client’s diff tools to compare revisions and see what has changed. You may also use the online (X)HTML5 Tracker Tool, which provides an online interface for selecting and comparing revisions of the spec.

What sort of classes and ids are commonly placed on DIV elements?

[Link to related research on this issue. Need to search the archives for previous discussion of this.]

When will HTML 5 be finished?

It is estimated that HTML5 will reach W3C Recommendation status in 2022 or later. That would amount to approximately 18-20 years of development, counting from mid-2004.

For a spec to become a REC, it requires two 100% complete and fully interoperable implementations, which is proven by each successfully passing literally thousands of test cases (20,000 tests for the whole spec would probably be a conservative estimate). When you consider how long it takes to write that many test cases and how long it takes to implement each feature, you’ll begin to understand why the time frame seems so long.

However, the WHATWG recognises and understands the problem with this: different parts of the specification are at different maturity levels. Some sections are already relatively stable and there are implementations that are already quite close to completion, and those features can be used today (e.g. <canvas>). But other sections are still being actively worked on and changed regularly, or not even written yet.

The details are still being worked out, but the plan is to indicate the maturity level on a per-section basis. Sections like Link Types, which are relatively simple, aren’t going to take long to become interoperably implemented. In fact, Mozilla is already implementing the new autodiscovery features for Firefox 3.0, and it shouldn’t take long for places like Technorati, Bloglines, etc. to follow.

Once a section is interoperably implemented, it’s quite stable and unlikely to change significantly. Any changes to such a section would most likely only be editorial in nature, particularly if the feature is already in widespread use (as autodiscovery already is today).

The point of all this is that you shouldn’t place too much weight on the status of the specification as a whole. You need to consider the stability and maturity level of each section individually.

Should I close empty elements with `/>` or `>`?

Void elements in HTML (the new name for empty elements) do not require a trailing slash. For example, instead of writing <br />, you only need to write <br>. This applies to all void elements, including img, input, etc.

However, due to the widespread attempts to use XHTML 1.0, there are a significant number of pages using the trailing slash. Because of this, the syntax has been permitted (though it is not recommended) in order to ease migration from XHTML 1.0 to HTML5.
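The two spellings can be compared side by side. In the sketch below, the file name, alt text and field name are placeholders; both groups are expected to produce the same elements in an HTML document.

```html
<!-- Recommended in HTML: void elements written without a trailing slash -->
<p>First line<br>second line</p>
<img src="photo.png" alt="A photo">
<input type="text" name="query">

<!-- Permitted to ease migration from XHTML 1.0: the slash has no effect in HTML -->
<p>First line<br />second line</p>
<img src="photo.png" alt="A photo" />
<input type="text" name="query" />
```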

It is important to realise that this syntax serves no purpose in HTML; it is simply ignored by browsers. Although it is based on XML syntax, that does not mean HTML documents can be parsed with XML tools. HTML and XHTML are separate serialisations, and each must be processed using tools designed for that format.

If I’m careful with the syntax I use in my HTML document, can I process it with an XML parser?

No. HTML and XML have [#differences many significant differences], particularly in their parsing requirements, and you cannot process one using tools designed for the other. However, since HTML5 is defined in terms of the DOM, in most cases there are both HTML and XHTML serialisations available that can represent the same document. There are, however, a few differences, explained later, that make it impossible to represent some HTML documents accurately as XHTML and vice versa.

If you wish to process an HTML document as XHTML, you must first convert it into XHTML; likewise, to process XHTML as HTML, you must first convert it into HTML.
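To make the gap concrete, here is a small sketch of a document that uses only syntax the HTML serialisation allows, yet is not well-formed XML. The class name and text are placeholders. An XML parser would be expected to stop at the lower-case doctype, the unquoted attribute value, or the first unclosed element.

```html
<!doctype html>
<!-- The html, head and body tags are omitted, which the HTML syntax allows -->
<title>Valid HTML, not well-formed XML</title>
<ul class=menu>           <!-- unquoted attribute value -->
  <li>First item          <!-- end tag omitted -->
  <li>Second item<br>with a line break
</ul>
```

Converting this to XHTML means adding the omitted tags, quoting the attribute, closing every element and declaring the namespace, which is why a conversion step is needed rather than a change of parser.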

What is the namespace declaration?

In XHTML, you are required to specify the namespace. (need to find a simple explanation for what namespaces are for)

<html xmlns="http://www.w3.org/1999/xhtml">

In HTML, the xmlns attribute is only allowed on the html element, and only if it has the value “http://www.w3.org/1999/xhtml”. It doesn’t do anything at all; it is merely allowed to ease migration from XHTML 1.0. It is not actually a namespace declaration in HTML, because HTML doesn’t support namespaces.

Will there be support for namespaces in HTML?

HTML5 is being defined in terms of the DOM and all HTML elements will exist in the HTML namespace: http://www.w3.org/1999/xhtml. However, unlike the XHTML serialisation, there is no real namespace syntax available in the HTML serialisation (see previous question). In other words, you do not need to declare the namespace in your HTML markup, like you do in XHTML.

[#namespace-decl XHTML requires the namespace to be declared] appropriately using the xmlns attribute. The namespace declaration is unnecessary in HTML because browsers will already know it’s an HTML document based on the [#mime-type MIME type] (text/html).

There have been proposals for introducing MathML markup into HTML, which would exist in the MathML namespace. However, if that proposal is accepted, it will be supported in a way that does not require explicit namespace declaration in the markup.

How do I specify the character encoding?

For HTML, it is strongly recommended that you specify the encoding using the HTTP Content-Type header. If you are unable to configure your server to send the correct headers, then you may use the meta element. The meta element used for this purpose must occur as the first element in the head (even before the title), and within the first 512 bytes of the file.

<meta charset="UTF-8">

Note that this meta element is different from HTML 4, though it is compatible with many browsers because of the way encoding detection has been implemented.
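Putting the placement rules above together, the start of an HTML document might look like the following sketch. The title and body text are placeholders; the point is only that the meta element comes first in the head, ahead of the title, so that it falls early in the file.

```html
<!doctype html>
<html>
  <head>
    <!-- Encoding declaration first, before the title and any other head content -->
    <meta charset="UTF-8">
    <title>Example page</title>
  </head>
  <body>
    <p>Text in any script can follow once the encoding is declared.</p>
  </body>
</html>
```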

In XHTML, XML rules for determining the character encoding apply. You may use either the HTTP Content-Type header or the XML declaration to specify the encoding.

<?xml version="1.0" encoding="UTF-8"?>

Otherwise, you must use the default of UTF-8 or UTF-16. It is recommended that you use UTF-8.

What are the differences between HTML and XHTML?

See the list of differences between HTML and XHTML in the wiki.

Does HTML5 support `href` on any element like XHTML 2.0?

No. Supporting href on any element has several problems that make it difficult to adopt in HTML5:

It isn’t backwards compatible with existing browsers.
It adds no new functionality that can’t already be achieved using the a element.
It doesn’t make sense for all elements, such as interactive elements like input and button, where the use of href would interfere with their normal function.
Browser vendors have reported that implementing it would be extremely complex.

The only advantage it seems to add is that it reduces typing for authors in some cases, but that is not a strong enough reason to support it in light of the other reasons.
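As an illustration of the second point in the list above, the sketch below shows an XHTML 2.0 style link (kept inside a comment, since it is not HTML5) next to its equivalent using the a element. The file name contents.html is hypothetical.

```html
<!-- XHTML 2.0 style, not supported in HTML5: href placed directly on the list item -->
<!-- <li href="contents.html">Contents</li> -->

<!-- HTML5: the same link expressed with the a element -->
<ul>
  <li><a href="contents.html">Contents</a></li>
</ul>
```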

Why does HTML5 legitimise tag soup?

Actually it doesn’t. This is a misconception that comes from the confusion between conformance requirements for documents, and the requirements for user agents.

Due to the fundamental design principle of supporting existing content, the spec must define how to handle all HTML, regardless of whether documents are conforming or not. Therefore, the spec defines (or will define) precisely how to handle and recover from erroneous markup, much of which would be considered tag soup.

For example, the spec defines algorithms for dealing with syntax errors such as misnested tags, which will ensure that a well structured DOM tree can be produced.
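A small sketch of what this means in practice: the first line below contains misnested b and i tags, and the second line is well-nested markup describing the kind of tree a browser following such an algorithm would build. The result shown is an illustration of that recovery, not text quoted from the spec.

```html
<!-- Erroneous input: the b element is closed while the i element is still open -->
<p>1<b>2<i>3</b>4</i>5</p>

<!-- Well-nested equivalent of the resulting DOM: the i element is split in two -->
<p>1<b>2<i>3</i></b><i>4</i>5</p>
```

The input remains non-conforming even though every browser is expected to recover from it in the same way.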

Defining that is essential for one day achieving interoperability between browsers and reducing the dependence upon reverse engineering each other.

However, the conformance requirements for authors are defined separately from the processing requirements. Just because browsers are required to handle erroneous content, it does not make such markup conforming.

For example, user agents will be required to support the marquee element, but authors must not use the marquee element in conforming documents.

It is important to make the distinction between the rules that apply to user agents and the rules that apply to authors for producing conforming documents. They are completely orthogonal.
