Why no namespaces

From WHATWG Wiki

Latest revision as of 03:26, 6 February 2015

In most of computer science, namespaces are a wonderful invention and greatly simplify maintenance.

XML namespaces were intended to provide these advantages to markup languages, and they have had some success within XML, perhaps because its draconian error-handling encourages careful use.

Unfortunately, XML namespaces don't work so well in HTML -- in practice, the web is more like a bunch of piles than like a well-organized filing cabinet.


Maciej Stachowiak explained some of the problems very well in http://lists.w3.org/Archives/Public/public-html/2009Jul/0919.html

An unfairly short summary is that:

Short names like "rdf:" or "foaf:" often work OK, if they are treated as globally unique. But by definition, they aren't globally unique, and they really can't be. In some cases, HTML has opted to define prefixes itself (such as aria-*), but the general XML solution is to use a URL.

Long namespace names, including URLs, are a pain to type, make the code unreadable, and are magnets for cut-and-paste editing, which sometimes ends up invalidating the use anyhow.
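
The point about short prefixes can be made concrete with a small sketch. It assumes Python's standard xml.etree.ElementTree parser; the two documents and the http://example.com/not-foaf URI are invented for illustration and are not taken from the mailing-list post. Both documents write foaf:name, yet a namespace-aware parser sees two unrelated elements, because the prefix is only a local alias for whatever URI it happens to be bound to.

```python
import xml.etree.ElementTree as ET

# Two hypothetical documents that both use the "foaf:" prefix.
# Only the first binds it to the real FOAF namespace URI.
doc_a = '<r xmlns:foaf="http://xmlns.com/foaf/0.1/"><foaf:name/></r>'
doc_b = '<r xmlns:foaf="http://example.com/not-foaf"><foaf:name/></r>'


def expanded_child_tag(xml_text):
    """Return the first child's name in ElementTree's "{uri}local" form."""
    root = ET.fromstring(xml_text)
    return root[0].tag


name_a = expanded_child_tag(doc_a)
name_b = expanded_child_tag(doc_b)

# The source text is identical ("foaf:name"), but the parsed names differ.
print(name_a)
print(name_b)
print(name_a == name_b)
```

Treating "foaf:" as globally unique would amount to comparing the prefixes and ignoring the bound URIs, which is exactly the shortcut the summary above says often works in practice but cannot be relied on.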


Tab Atkins also gave his own take at https://lists.w3.org/Archives/Public/public-webapps/2015JanMar/0512.html:

  • URLs are not a good fit for namespaces. Humans make a number of assumptions about how URLs can be changed (capitalization, trailing /, http vs https, www or not, etc.) which are often true for real URLs due to nice server software, but are not true for namespaces, which are opaque strings.
  • There's no consistency in the URL structure used: some namespaces end in a word, some in a slash, some in a hash, etc.
  • You can't actually fetch namespace URLs. Again, they're opaque strings, not URLs, so there's no guarantee or expectation that there's anything useful on the other side, or that what is on the other side is parseable in any way. As a given XML namespace becomes more popular, fetching the namespace URL constitutes a DDoS attack; the W3C, for example, has to employ sophisticated caching to prevent namespace URL requests from taking down their website.
  • URLs contain a bunch of extra typing baggage that doesn't serve to uniquify anything and just makes them longer to type. The "http://" prefix, for example, is identical for all namespaces (and if it's not, it's one more hurdle for authors to run into). Using a string with a higher information content is better for authors.
  • Domain names don't mean much. For example, Dublin Core's namespace starts with "http://purl.org/", which is effectively meaningless.
  • Similarly, path components often exist which are worthless and just lengthen the namespace for no uniquifying gain, such as the SVG namespace http://www.w3.org/2000/svg, which contains /2000/ for historical reasons (it was minted in 2000, and at the time the W3C put the year in most URLs). (Note the use of www in this URL, compared to no www in the DC namespace. Inconsistency!)
  • The ability to redefine namespaces at various points in the tree makes generic processing far more complicated than it should be, as <foo:bar> can refer to two completely different elements in different parts of the tree.
  • The namespace prefix is actually significant: you can't just refer to an element by its expanded name, you must stuff the namespace into a prefix and use it. Again, this is hard for generic processing. It's impossible to just move an element from one part of the tree to another, because its prefix may have been redefined to mean something else, and you can't just expand away the prefix to make it unambiguous. Instead, you have to maintain logic that checks the prefixes in use on the element (and all of its descendants) against those in effect in the new location and, if there are any conflicts, renames the conflicting ones on the element (and its descendants) to new unique prefixes and associates those prefixes with the namespaces in question.
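
Two of the points above, opaque-string comparison and prefix redefinition, can be sketched in a few lines. This is an illustration only, assuming Python's standard xml.etree.ElementTree; the list of URL variants, the namespace_of helper, and the http://example.com/one and http://example.com/two namespaces are made up for the example.

```python
import xml.etree.ElementTree as ET

SVG_NS = "http://www.w3.org/2000/svg"

# Spellings a person might treat as "the same URL". As namespace names
# they are compared character by character, so only the first matches.
variants = [
    "http://www.w3.org/2000/svg",
    "http://www.w3.org/2000/svg/",   # trailing slash
    "https://www.w3.org/2000/svg",   # https instead of http
    "http://w3.org/2000/svg",        # no www
    "http://www.w3.org/2000/SVG",    # different capitalization
]


def namespace_of(tag):
    """Extract the URI from ElementTree's "{uri}local" tag notation."""
    return tag[1:].split("}", 1)[0] if tag.startswith("{") else None


matches = []
for uri in variants:
    root = ET.fromstring(f'<svg xmlns="{uri}"/>')
    matches.append(namespace_of(root.tag) == SVG_NS)
print(matches)

# Prefix redefinition: the same source text "foo:bar" names two
# different elements depending on where it sits in the tree.
doc = (
    '<root xmlns:foo="http://example.com/one">'
    '<foo:bar/>'
    '<inner xmlns:foo="http://example.com/two"><foo:bar/></inner>'
    '</root>'
)
tree_root = ET.fromstring(doc)
outer_bar = tree_root[0].tag
inner_bar = tree_root[1][0].tag
print(outer_bar)
print(inner_bar)
```

The "{uri}local" form used here is ElementTree's in-memory notation, not something that can be written in an XML document. In serialized XML the prefix remains mandatory, which is why moving the inner foo:bar next to the outer one would require the prefix-conflict bookkeeping described in the last bullet.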



See also: Namespace confusion