Change Proposal for ISSUE-120
Simplify the specification by removing features that are documented to be confusing to users.
The premise of this rationale, which is argued in detail below, is that mechanisms that bind arbitrary strings ("prefixes") to other arbitrary strings ("bases"), which can then be used in conjunction with a third set of arbitrary strings ("values") to form identifiers ("terms") that are never explicitly stated in the source, are a language design anti-pattern in the context of technology intended for broad Web deployment (e.g. in text/html).
In the context of RDFa, there are a number of mechanisms to define the mappings of "prefixes" and "bases": the prefix="" attribute, the profile="" attribute (which has an additional layer of indirection), and for legacy reasons the xmlns="" attribute. This change proposal proposes to remove all three of these features based on the same rationale. (At a high level, these features are essentially equivalent, being little more than syntactic sugar for each other.) As a result, the places where RDFa accepts a "value" that could have used one of these arbitrary "prefixes" to create a "term" can no longer do so, and is limited to either giving the "term", or using a predefined syntax with known prefixes (specifically, the empty prefix ":foo" which is short for "http://www.w3.org/1999/xhtml/vocab#foo", and the bnode syntax "_:foo").
Why arbitrary prefix mechanisms are bad
Copy and paste
Copy-and-paste of the source becomes very brittle when two separate parts of a document are needed to make sense of the content. Copy-and-paste is how the Web evolved, so I think it is important to keep it functional and easy.
Prefixes are notoriously hard for authors to understand. The most widely known arbitrary prefix mechanism on the Web is the XML namespaces feature, which is very similar to the three prefix mechanisms in RDFa. It can thus be used as a case study for the problem.
As far back as 2004, Micah wrote "As the author of an O'Reilly book on XForms, I can report that 90% of the technical questions from readers involve confusion related to namespaces".
Parand Darugar has said similar things: "Experience shows XML namespaces can be a common cause of confusion and a major complicating factor in XML adoption."
Derek Denny-Brown, who had been the lead developer of MSXML and System.Xml: "If there is any one of the W3C's family of XML specifications, that has caused me the most grief, XML Namespaces is probably it."
Fundamentally, prefixes are an indirection model. Indirection models are very, very hard for people to understand.
Maciej has also said things to this effect: "Namespaces are an example of the Fundamental Software Engineering Error, which is that something too terrible to actually use can be fixed by adding a level of indirection. Sometimes that is true but software engineers try to do it even when it clearly is not."
Questions about namespaces come up again and again, over many years:
Prefixes are notoriously hard for implementors to get right:
(This covers bugs by such vendors as Sun, Google, Yahoo!, MySpace — and these aren't just bugs that don't affect end-users, like forgetting to quote attributes in HTML.)
Prefixes are notoriously hard for implementors to document:
Prefixes have been notoriously hard for even people in the standards community to understand:
- http://lists.w3.org/Archives/Public/public-rdf-in-xhtml-tf/2009Aug/0085.html ([initially suspected to be sarcasm http://lists.w3.org/Archives/Public/public-rdf-in-xhtml-tf/2009Aug/0086.html]!)
Arbitrary prefixes in dynamically changing content (like HTML) are even worse because they require than an observing software agent not only track the value that they are concerned about, but also all possible ways for the value's prefixes to change meaning. So for instance, here:
<test prefixes="a=http://example.com/"> <foo> <bar> <baz content="a:b"/> </bar> </foo> </test>
...if a software agent wants to see when <baz>'s content="" attribute changes to include the value http://example.net/b, he has to not only watch the content="" attribute, but also the prefixes="" attribute of all ancestor elements up the tree, just in case they redefine the prefix "a" to mean "http://example.net/".
This change proposal describes changes to this draft, relative to the specification as it stood on January 11th 2011: http://dev.w3.org/html5/rdfa/
- In "2.2 RDFa Processor Conformance", add to the first bullet point an exception for features overridden by the Extensions section.
- In the "3. Extensions to RDFa Core 1.1" section, add a section requiring that when processing CURIEs in attributes on elements is the HTML namespace, the set of mappings from prefixes to URIs must always be treated as empty.
- In the "3. Extensions to RDFa Core 1.1" section, add a section requiring that the "prefix" attribute not be used in conforming documents and add a section requiring that user agents ignore prefix="" attributes on elements in the HTML namespace.
- In the "3. Extensions to RDFa Core 1.1" section, add a section requiring that the "profile" attribute not be used in conforming documents and add a section requiring that user agents ignore profile="" attributes on elements in the HTML namespace.
- In the "3. Extensions to RDFa Core 1.1" section, remove all the sections that refer to the "xmlns:" attributes and replace them with a single section saying that HTML+RDFa does not use the "xmlns:" attributes and that user agents must not let their processing model be affected by "xmlns:" attributes in no namespace (such as those found in text/html).
- Updates examples and other text accordingly, at the editor's discretion.