From WHATWG Wiki
Revision as of 02:08, 23 August 2010 by Ocolon (talk | contribs) (canonical-wwwnone: added a question on handling different values in different files under one domain)


Memorandum of understanding

This Web page and associated Web services, if any, are provided with the understanding that providing them is cheap and relatively reliable, and that the URIs for these resources will not change after HTML5's completion, unless circumstances outside of the administrator's control force such a change.

Organisations that believe these pages to be of vital importance are encouraged to put aside funds for the defence of this page should the need arise.

--Hixie 23:40, 10 December 2007 (UTC)

XFN noise

Could XFN relations be separated? I'm under the impression that they inflate the list.

I think that they should be removed, or moved to a microformat list where they would be named, e.g., xfn.friend. I also want to emphasize that I dislike XFN because it implies too complex a relation, one that can be hard to resolve. For example, "the maintainer of that document had that relation to the maintainer of the referenced document when the document was last edited/created" is not a good example of a relation. If we assume one knows the maintainer of the current document, how is one to know who the maintainer of the referenced document is? Someone could also be confused by the name of the relation and think that the current resource is a friend of the author…

Re: shortlink

shortlink has also been proposed (on the blagosphere) as shorter and shorturl

On canonical-domain, canonical-wwwnone, and canonical-first

Responding to your summary on the last reversion:

Two were analogous services, as noted, not the exact services. The concepts were discussed by Google and were justified there.

For canonical-wwwnone, for the concept underlying it, Google says this at the linked page under "Set your preferred domain": "Setting your preferred domain tells Google which version of your site's URL (http://www.example.com or http://example.com) you prefer. If you set your preferred domain as http://example.com, we'll treat links to http://www.example.com exactly the same as links to your preferred domain."

Since Google supports this by requiring the site owner to tell Google through a tool apparently not on the website owner's site, I proposed a method apropos to all search engines that wish to implement it.

For canonical-domain, while the same Google page supports only intradomain canonicalization, it does support that (see under "Specify the canonical link for each version of the page"). I extended their concept to interdomain canonicalization, because many institutions have multiple domains for identical content or the same website. An owner of one website with multiple domains who wants to do well in search engine results would likely prefer that all external inbound links point to one domain but can't ask everyone else to edit their links without risking links being deleted and a lowering in search results, and just asking everyone is a bunch of work. My proposal reduces the workload and eliminates that risk while raising site credibility that helps with positioning in search engine results.

Examples of domains that probably share sites (although I didn't drill down to check beyond the home pages and didn't have the plugin to view the home page/s on the first pair): <http://esbnyc.com/> and <http://empirestatebuilding.com/>, which might redirect, and <http://www.networksolutions.com/> and <netsol.com>, as I typed it, which redirects.

While redirection is an easy solution, the site owner might prefer to display a message to users of one domain and not the other as part of educating the general public about the preferred domain (as used to be the case at yaho.com (one "o")). In that case, canonical-domain would inform search engines who wish to implement it while allowing the site owner to educate the part of the public that types the wrong name. For example, in New York City the agency that intervenes for abused children is still widely known by initials that have been abolished for 40 years (I heard it in conversation on the 15th of this month).

However, I do see one defect with my proposal. Where a rel link works, a rev link can usually be expected to work, and rev for canonical-domain could be used to steal traffic from a competitor or other site. So I will add that rev must be meaningless.

I will do the same for canonical-wwwnone, since I understand that it's possible, albeit unusual, to have separate websites for www and bare forms of the same domain, depending on the host's architecture, which means it's possible, albeit very unlikely, that the two websites could be under separate managers, and perhaps separate owners, who bitterly compete. So I will add that rev must be meaningless for that keyword, too.

For canonical-first, I did not cite Google. Because it is possible to have a canonical set of pages and one noncanonical page in situations where the keywords alternate and print wouldn't be technically adequate or semantically apt, and the canonical set of pages might not fully occupy a directory, it might be more convenient to have canonical-first as shorthand. That's easier because only one page needs coding rather than two or more. Because it is shorthand, however, it is more dispensable, if people would like to reject it. I'm putting it back for the time being, i.e., for decision, since it seems to have been removed on the basis of the Google position, but I hadn't asserted that Google supported it. I will also add that rev must be meaningless for this keyword, too.
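To illustrate what "rev must be meaningless" could mean for a consumer, here is a minimal Python sketch. It assumes the keywords are carried on link elements and that a consumer only ever reads rel; the class name CanonicalLinkCollector and the host evil.example are hypothetical, and nothing here is taken from an existing search engine implementation.

```python
from html.parser import HTMLParser

# The three proposed keywords discussed above.
CANONICAL_KEYWORDS = {"canonical-domain", "canonical-wwwnone", "canonical-first"}


class CanonicalLinkCollector(HTMLParser):
    """Collect (keyword, href) pairs from <link rel=...>; rev is never read."""

    def __init__(self):
        super().__init__()
        self.found = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attr = dict(attrs)
        href = attr.get("href")
        # Only rel is consulted. A rev attribute carrying the same keyword
        # is ignored, so a third party cannot claim someone else's traffic.
        rel_tokens = (attr.get("rel") or "").lower().split()
        for token in rel_tokens:
            if token in CANONICAL_KEYWORDS and href:
                self.found.append((token, href))


collector = CanonicalLinkCollector()
collector.feed(
    '<link rel="canonical-domain" href="http://bigfabulouswidgetstore.example/">'
    '<link rev="canonical-domain" href="http://evil.example/">'
)
print(collector.found)
# [('canonical-domain', 'http://bigfabulouswidgetstore.example/')]
```

The rev link is parsed but contributes nothing, which is the behavior the added wording is meant to require.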

Thank you. Nick 20:10, 20 September 2009 (UTC)

On footnote, note, and jump

The footnote and jump both seem useful if a browser opens an additional window so that the main text or prejump text, respectively, remains visible, especially useful for endnotes after long texts. This is analogous to the behavior for sidebar (HTML5, working draft of 8-25-09, section To that end, I'm renaming this proposed keyword note, as a more generic name, and adding footnote and endnote to the description. I don't know what a sidenote is and, while I can guess, I don't recall it being used or mentioned in literature, a local central public librarian doesn't remember ever seeing one, my OOo 3.0.0 spellchecker rejects it, and a callout isn't a sidenote, although I suppose this link pointing to a callout would be okay. While I'm taking sidenote out of the synonymy, which is only for legacy support (I gather it wasn't a past rel keyword), I'm still adding it to the description, in case, to preserve the original proposer's intent.

Thanks. Nick 00:33, 8 October 2009 (UTC)


Error in the description

Currently it says:

The rel value is the form preferred for indexing, e.g., rel="http://example.net".

I am pretty sure this should be:

The href value is the form preferred for indexing, e.g., href="http://example.net".
Ocolon 09:40, 21 August 2010 (UTC)


Nick, my impression is that your proposal canonical-domain already covers everything your proposal canonical-wwwnone achieves. Wouldn't it be better to concentrate on the more powerful canonical-domain and drop canonical-wwwnone? Ocolon 09:40, 21 August 2010 (UTC)



You're right about href; that's my stupid error. I corrected it.

Usage conflict

Since canonical-domain is for the case where someone owns widgetstore.example and bigfabulouswidgetstore.example and would rather everyone find the latter in search engine results, folding canonical-wwwnone into canonical-domain raises a question: Could there be a case (often enough) in which the site owner would not want a preference between the www and bare forms of bigfabulouswidgetstore.example? I understand that there are unusual but real cases in which an owner has separate sites for www and no-www.

I assume there is a realistic although small frequency, so I'd leave the range of rel values in their hands, plus keep the clarity for page authors of there being separate rel values for somewhat different purposes. But if the case is too obscure and too rare, maybe it should be folded in. In that case, the multi-domain-name site owner will either not pick canonicality at all (unlikely except for amateurs) or will have to choose either bare or www when doing so.

One possible use case would be measuring marketing success by comparing sales by what people typed, thus what advertising they saw. Getting people to say what ad they saw so often fails that companies use many department numbers, 800 phone numbers, domain names, and so on. Bare-or-www in a log might serve that same purpose.

(I'm using the .example TLD per RFC 2606, akin to example.net, in case subdomains of the latter are considered a special case.)

Thanks. Nick 17:42, 22 August 2010 (UTC)

Thank you for your examples – I see that cases like the ones you describe can occur. However, bigfabulouswidgetstore.example and www.bigfabulouswidgetstore.example are de facto two different domains, because technically "www" is a subdomain like "wiki" or any other prefix would be. Therefore canonical-wwwnone is already covered by canonical-domain (it doesn't have to be explicitly folded into it anymore), unless it is explicitly defined as not being implied. I don't think such a definition would be justified by the given example.
Let's have a closer look at two cases that are similar to the scenario you described:
  • Case 1: A website owner wants to name bigfabulouswidgetstore.example without www as canonical-domain for widgetstore.example. He also wants to name www.bigfabulouswidgetstore.example with www as canonical-domain for appstore.example. And he doesn't want to define a general preference for bigfabulouswidgetstore.example or www.bigfabulouswidgetstore.example.
    • Does this work when canonical-domain includes the power of canonical-wwwnone as well ("www-sensitive")? Yes. That's because a general preference for using bigfabulouswidgetstore.example with or without www, meaning one is the canonical domain of the other, would only be defined if canonical-domain was set with one of these values at (www.)bigfabulouswidgetstore.example. Setting different values at appstore.example and widgetstore.example is perfectly possible and only affects appstore.example and widgetstore.example, not (www.)bigfabulouswidgetstore.example.
    • Does this work if canonical-domain does not differentiate between a domain with or without www ("www-neutral")? No. The websites appstore.example and widgetstore.example can then only name an unspecific (www.)bigfabulouswidgetstore.example as the canonical domain for both or, with the addition of canonical-wwwnone at (www.)bigfabulouswidgetstore.example, a general preference.
It turns out that the "www-sensitive" canonical-domain alone offers more possibilities in this case than canonical-wwwnone and/or a "www-neutral" canonical-domain.
  • Case 2: A website owner wants to specify a canonical-domain at widgetstore.example, namely (www.)bigfabulouswidgetstore.example, without specifying whether bigfabulouswidgetstore.example or www.bigfabulouswidgetstore.example should be the canonical domain for widgetstore.example. I'm not sure what benefit could arise from this, but let's consider it anyway:
    • This is clearly not possible with a single "www-sensitive" canonical-domain. It would be possible to use two different "www-sensitive" canonical-domains, though. This would look weird and I wouldn't recommend allowing it, but ultimately that's what the domain owner wants in this scenario: two canonical domains, with and without www.
    • Does this make more sense with a "www-neutral" canonical-domain? I don't think it does. While it doesn't look as weird, it does the same thing: it assigns two domains, with and without www. However, there's a side effect here: greatfruitstore.example and www.greatfruitstore.example become impossible canonical-domain values if these sites don't share the same content (or else "half of the canonical domain" would be defunct or misleading).
It seems that, in exchange for handling a rare case more elegantly, "www-neutral" canonical-domains could only be used when the www and non-www versions of the href value both exist and share the same content.
So, although you are right that a "www-sensitive" canonical-domain doesn't satisfy all possible scenarios, I believe it works in more scenarios than a "www-neutral" canonical-domain would.
It follows from keeping canonical-domain "www-sensitive" that canonical-wwwnone doesn't provide extra functionality and could be dropped (it could still be supported as a special case of canonical-domain for historical reasons if it is used somewhere, but it shouldn't be used in new webpages).
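
The difference between the two readings in Case 1 can be sketched in a few lines of Python. This is only an illustration under my own assumptions: the function names host_of, strip_www, and same_target are hypothetical, and "www-neutral" is modelled simply as dropping one leading www. label before comparing hosts.

```python
from urllib.parse import urlsplit


def host_of(url: str) -> str:
    """Return the lower-cased hostname of an absolute URL."""
    return (urlsplit(url).hostname or "").lower()


def strip_www(host: str) -> str:
    """Drop a single leading 'www.' label, if present."""
    return host[4:] if host.startswith("www.") else host


def same_target(a: str, b: str, www_sensitive: bool) -> bool:
    """Do two canonical-domain href values name the same canonical host?"""
    ha, hb = host_of(a), host_of(b)
    if www_sensitive:
        return ha == hb
    return strip_www(ha) == strip_www(hb)


# Case 1: widgetstore.example and appstore.example declare different forms.
from_widgetstore = "http://bigfabulouswidgetstore.example/"
from_appstore = "http://www.bigfabulouswidgetstore.example/"

print(same_target(from_widgetstore, from_appstore, www_sensitive=True))   # False
print(same_target(from_widgetstore, from_appstore, www_sensitive=False))  # True
```

Under the "www-sensitive" reading the two declarations stay distinct, so the owner's intent in Case 1 survives; under the "www-neutral" reading they collapse into one target and the distinction is lost.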

I'm sorry these case explanations became so long. Might look complicated. Actually your proposal canonical-domain is simple yet powerful. Google webmaster tools show there's a need for this. I like this one very much. Ocolon 00:48, 23 August 2010 (UTC)

Handling different values in different files under one domain

Another thought on canonical-domain/canonical-wwwnone: Is this to be handled individually for each file or domain-wide? If this should be interpreted domain-wide, how should conflicting values in different files be handled?


  • http://foo.example/001.html contains <link rel="canonical-domain" href="http://one.example/">
  • http://foo.example/002.html contains <link rel="canonical-domain" href="http://two.example/">

On the one hand, it seems more appropriate to apply this to each file individually (a link tag in a file usually adds meta information just to this file; that's why it's in this file's head). On the other hand, applying this to one file only would "degrade" canonical-domain/canonical-wwwnone to something very similar to canonical, which also works on a per-file basis.
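
If the domain-wide reading were chosen, a consumer would at least have to notice the conflict before deciding how to resolve it. A rough Python sketch of that detection step follows; the function name find_conflicts and the input shape (pairs of page URL and declared href) are my own assumptions, and how a conflict should then be resolved is exactly the open question.

```python
from collections import defaultdict
from urllib.parse import urlsplit


def find_conflicts(declarations):
    """Group declared canonical hosts by source host and report disagreements.

    declarations: iterable of (page_url, canonical_domain_href) pairs.
    Returns {source_host: [sorted distinct target hosts]} for every source
    host whose pages name more than one canonical host.
    """
    targets_by_source = defaultdict(set)
    for page_url, href in declarations:
        source = urlsplit(page_url).hostname
        target = urlsplit(href).hostname
        targets_by_source[source].add(target)
    return {
        source: sorted(targets)
        for source, targets in targets_by_source.items()
        if len(targets) > 1
    }


pages = [
    ("http://foo.example/001.html", "http://one.example/"),
    ("http://foo.example/002.html", "http://two.example/"),
]
print(find_conflicts(pages))
# {'foo.example': ['one.example', 'two.example']}
```

With the per-file reading no such step is needed, since each declaration only describes the page that carries it.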

Considering this, it might even be cleverer to do this with robots.txt? Ocolon 02:08, 23 August 2010 (UTC)