Common Subset

The common subset of HTML5 and XHTML5 is a subset of both syntaxes meant for creating documents that can be processed as either. The common subset is only implicitly defined by the HTML and XHTML specifications, because the two syntaxes have many constructs in common. A document is said to use the common subset when it parses correctly with both the XML parser and the HTML parser.
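As a rough illustration of the "parses with both parsers" criterion, the following Python sketch feeds one document to an XML parser and to an HTML tokenizer and compares the element names each one reports. The sample document and all helper names are made up for this example. Python's html.parser is a lenient tokenizer, not the HTML5 parsing algorithm, so agreement here is only a hint and not proof of conformance.

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser

# Hypothetical sample document written to work in both syntaxes.
DOC = """<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml">
<head><title>Example</title></head>
<body><p>Hello<br/>world</p></body>
</html>"""


def xml_element_names(text):
    """Return element names in document order, or None if not well-formed XML."""
    try:
        root = ET.fromstring(text)
    except ET.ParseError:
        return None
    # ElementTree reports tags as "{namespace}name"; keep only the local name.
    return [el.tag.rsplit("}", 1)[-1] for el in root.iter()]


class _StartTagCollector(HTMLParser):
    """Records the name of every start tag the HTML tokenizer sees."""

    def __init__(self):
        super().__init__()
        self.names = []

    def handle_starttag(self, tag, attrs):
        # Also called for "<br/>", via the default handle_startendtag.
        self.names.append(tag)


def html_element_names(text):
    collector = _StartTagCollector()
    collector.feed(text)
    collector.close()
    return collector.names


def same_elements_in_both(text):
    """True if the text is well-formed XML and both parsers see the same elements."""
    xml_names = xml_element_names(text)
    return xml_names is not None and xml_names == html_element_names(text)
```

A document that relies on HTML-only shortcuts, such as an unclosed `<br>`, fails the XML half of this check, which is usually the stricter of the two.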

A document using the conforming common subset conforms to the specification whether it is interpreted as HTML or XHTML. The conforming common subset rejects any element that is non-conforming in either of the two DOM variants.

A document using the common subset can be served as HTML (text/html media type) or as XHTML (with an XML media type). The media type is what the browser uses to decide whether the document is parsed as HTML or XHTML, and which variant of the DOM is used.
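The media type dispatch can be sketched as follows. This is a simplified assumption about how a user agent chooses a parser: real browsers also apply content sniffing and other rules that are not modelled here, and parser_for is a hypothetical name.

```python
def parser_for(content_type):
    """Guess which parser a Content-Type header value would select.

    Simplified sketch: ignores content sniffing and any browser-specific rules.
    """
    # Drop parameters such as ";charset=utf-8" and normalise case.
    media_type = content_type.split(";", 1)[0].strip().lower()
    if media_type == "text/html":
        return "html"
    if media_type in ("text/xml", "application/xml") or media_type.endswith("+xml"):
        return "xml"
    return "other"
```

Under this sketch, the same bytes sent as text/html and as application/xhtml+xml select different parsers, which is why a common-subset document has to satisfy both.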

Common Syntax

[TBD]

Limitations from HTML

[TBD]

Limitations from XHTML

[TBD]

Markup Issues and Workarounds

Base URI

HTML:

<base href="uri">

XML / XHTML:

<html xml:base="uri">

Workaround: HTTP Content-Location header:

Content-Location: uri

HTML 4 Spec
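Whichever mechanism declares it, the base URI has the same job: relative references in the document are resolved against it. A minimal sketch using Python's urllib.parse.urljoin, where the base URI and the relative references are invented for illustration:

```python
from urllib.parse import urljoin

# Hypothetical base URI, standing in for the "uri" placeholder above.
BASE = "http://example.com/docs/index.html"


def resolve(reference, base=BASE):
    """Resolve a relative reference against the base URI."""
    return urljoin(base, reference)


print(resolve("img/logo.png"))   # http://example.com/docs/img/logo.png
print(resolve("../about.html"))  # http://example.com/about.html
print(resolve("/top"))           # http://example.com/top
```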

Character Set

HTML

<meta http-equiv="Content-Type" content="text/html;charset=utf-8">

XHTML / XML

<?xml version="1.0" encoding="utf-8"?>

Workaround: HTTP Content-Type header with encoding specified:

Content-Type: text/html;charset=utf-8
Content-Type: application/xhtml+xml;charset=utf-8
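Because the encoding travels in the HTTP header, it is declared the same way for both media types. A minimal sketch that builds the two header values above and reads the charset back with Python's standard email.message module; header_value and charset_of are hypothetical helper names:

```python
from email.message import Message


def header_value(as_xhtml, charset="utf-8"):
    """Build a Content-Type value for the HTML or the XHTML media type."""
    media_type = "application/xhtml+xml" if as_xhtml else "text/html"
    return f"{media_type};charset={charset}"


def charset_of(content_type):
    """Return the charset parameter of a Content-Type value, or None if absent."""
    msg = Message()
    msg["Content-Type"] = content_type
    return msg.get_content_charset()
```

Calling charset_of on either header value gives the same encoding, so the document body needs neither the meta element nor the XML declaration to convey it.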

Language

HTML

<html lang="en">

XML / XHTML

<html xml:lang="en">

Workaround: HTTP Content-Language header:

Content-Language: en

HTML 5 Spec, HTML 4 Spec

Note that there is no conforming workaround for switching language for different parts of a document. One method does work in practice, however: if you use HTML's lang attribute instead of the conformant xml:lang, browsers will correctly deduce the language of the element. But this makes the document non-conforming when it is served with an XML media type and interpreted as XHTML.
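To see why the two attributes are not interchangeable under an XML parser, the sketch below shows that lang and xml:lang arrive as distinct attributes, one in no namespace and one in the XML namespace. It uses Python's ElementTree on a made-up fragment, so it demonstrates XML parsing behaviour only, not what any particular browser does with the result.

```python
import xml.etree.ElementTree as ET

# Namespace that the predefined "xml" prefix is bound to.
XML_NS = "http://www.w3.org/XML/1998/namespace"

# Hypothetical fragment mixing both ways of marking a language change.
FRAGMENT = (
    '<p xmlns="http://www.w3.org/1999/xhtml" xml:lang="en">'
    'Hello, <span lang="fr">bonjour</span>, <span xml:lang="de">hallo</span>'
    "</p>"
)


def lang_attributes(element):
    """Return the values of lang and xml:lang on one element (None if absent)."""
    return {
        "lang": element.get("lang"),
        "xml:lang": element.get(f"{{{XML_NS}}}lang"),
    }


root = ET.fromstring(FRAGMENT)
spans = list(root)  # the two <span> children of <p>
for span in spans:
    print(span.text, lang_attributes(span))
```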
