From WHATWG Wiki
Revision as of 03:26, 29 October 2009 by Nick (talk | contribs) (rights: why reversion)

"description" meta name

I think the description name should be added to the HTML 5 specification. Yes, search engines have made the keywords name obsolete. However, search engines are not that good. Only the document author can provide a reliable, short description of the document's contents. I think there should be constraints on the description's length, content, and structure. Short sentences and plain-English descriptions would be best.

It looks like "description" is in the latest specification. Rfc2549 01:46, 10 October 2008 (UTC)
Keywords still work. Descriptions are a good idea. However, HTML5 should not constrain search engines by specifying short sentences or any particular structure. HTML5 and this MetaExtensions Wiki should be permissive, especially for websites that depend on specialized or foreign search engines that may have other rank-driving preferences for meta tags. Rather, page authors should consider short sentences and the plain language you suggest because when they appear in search results they attract visitors. However, a site meant for physicians might have a different idea on what language is plain for their readers, so HTML can't usefully define that. Advice on good drafting is the province of search engines and various websites that report or advise. Nick 08:45, 20 April 2009 (UTC)

keywords and description should not be unendorsed, should they?

Why are keywords and description unendorsed? Yahoo and Google do use them for searches, if not as much as when they began offering search services. And there are other search engines around the world, which might well support both of these. The unendorsing of cache is explained; what are the explanations for the other two? I see keywords, but not description, was unendorsed from the beginning, which suggests a misclassification by the original proposer. I think someone should reclassify both as proposals. Would it be okay if I did? Nick 08:55, 20 April 2009 (UTC)

On description: resolved. Ian Hickson reports that it's in HTML5, and it is, in section (I had forgotten). On keywords: he didn't mention them in his reply but moved them within the Wiki from unendorsed to failed, and I've submitted a bug report for reconsideration of that decision: W3C Bug 6853. Nick 09:35, 29 April 2009 (UTC)

rights: why reversion

I'm reverting for these reasons:

The revision's added content was largely redundant with the link element rel="license" that, per <http://wiki.whatwg.org/wiki/RelExtensions>, is already in HTML5 (see section of the W3C Working Draft of 8-25-09). The link uses URLs instead of standardized strings, but those URLs can be used the same way as the proposed strings, i.e., a search engine or UA can recognize the URLs as equivalent to the strings without repeatedly fetching them. And the set of URLs is essentially extensible without requiring registration here, which simplifies the tag's use.

The proposed string "coprYYYY" is probably not legally sufficient notice even in the U.S., and around the world what that notice must state may vary. It's too much for us to list all the possibilities. Let the page author write such notice as they see fit and let the search engines store or refer to it as they see fit without abbreviating it.

I doubt the proposed code/content distinction can be legally defined in this way and the distinction recognized in a court of law applying copyright law. Most judges and lawyers probably don't know how Internet standards are promulgated. For example, arrangement is copyrightable and a judge might conclude that in copyright law code is one thing and arrangement of the code is another. I haven't researched case law on point, but doing so is pointless when many nations each have their own copyright laws. My proposal is more flexible and thus better able to meet page authors' needs.

The legal needs require a way of writing arbitrary or free-form text (not arbitrary as lawyers use that term).

The revision is more complicated to implement.

The revision doesn't cover multiple licensing for one work. The HTML5 working draft carries an assertion of two licenses. So does Perl, the language. When two licenses are said to apply, a relationship has to be defined, perhaps that the user can choose which to apply, perhaps one applies for commercial use and the other for noncommercial use, or whatever. The revision apparently required multiple meta tags but didn't propose that. If a search engine might take multiple meta rights tags as conflicting, things can get confusing and legal rights may be lost.

The revision doesn't cover multiple licensing for multiple works on one page. Free-form text in the meta tag would allow that. For example, "The photograph may be available from the Permissions Department at . . .; the text is licensed under . . .; the arrangement is the property of . . .; and the music is licensed through BMI."
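
As a rough illustration of the first reason above (recognizing license URLs without fetching them), here is a minimal Python sketch. The lookup table, the labels, and the names LicenseLinkCollector and recognize_licenses are all made up for this example; nothing here is specified by HTML5 or by any search engine.

```python
from html.parser import HTMLParser

# Illustrative lookup table (an assumption, not a registry): license URLs a
# tool might already know, mapped to short labels of its own choosing.
KNOWN_LICENSES = {
    "http://creativecommons.org/licenses/by/3.0/": "CC BY 3.0",
    "http://www.gnu.org/licenses/gpl.html": "GNU GPL",
}


class LicenseLinkCollector(HTMLParser):
    """Collects the href of every <link rel="license"> found in a page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attr = dict(attrs)
        # rel holds space-separated tokens, so split before comparing.
        rel_tokens = (attr.get("rel") or "").lower().split()
        if "license" in rel_tokens and attr.get("href"):
            self.hrefs.append(attr["href"])


def recognize_licenses(html_text):
    """Return (url, label) pairs; label is None for URLs not in the table."""
    collector = LicenseLinkCollector()
    collector.feed(html_text)
    return [(url, KNOWN_LICENSES.get(url)) for url in collector.hrefs]
```

A page carrying two such links yields two pairs, and an unknown URL still comes back with a label of None, so no registration is needed before an author uses it. What relationship holds between two licenses on one page is not something this sketch can determine.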

Thank you for the character entity edit.

Perhaps another meta keyword should be proposed, perhaps rights-standard, to use what was proposed? Would it serve a purpose that link wouldn't?

Thanks. Nick 07:58, 28 October 2009 (UTC)

I see what you're saying -- my point of view would be that it would be beneficial for a machine-readable/parsable value for use in searching, cataloging, etc. Free-form text makes this very difficult, if not impossible, to apply to any degree. If the page author is providing a legal notice of copyright, then they must provide the country-appropriate copyright notice visible to the page viewer (in the content itself). Copyright Notice, Deposit, and Registration, 17 U.S.C. ch. 4, § 401 (2007)
My fear is without caselaw to guide us on the enforceability or applicability of this new way of embedding copyright in the code, we are setting up a standard that will not last past the first court case. Am I confused about the rationale behind this tag? What does it accomplish that a notice in the page's viewable content area wouldn't?
>Perhaps another meta keyword should be proposed, perhaps rights-standard, to use what was proposed? I'll do just that -- thank you. BryanH 16:47, 28 October 2009 (UTC)
Notice calculated to be seen by most Web visitors will be visible in the UA's (viz., browser's) window or canvas, so it should be written to be visible there. But code or markup is not seen there. That's seen when a UA user either looks in the source code via the UA or receives it separately and doesn't expose it to the UA. The code will often include not the word "Copyright" or a "C" in a circle but a character entity that mostly only programmers will recognize when it is not interpreted. Without interpretation, "&copy; 2009 Lois Ng" does not meet U.S. legal requirements for a copyright notice.
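
To make the interpreted/uninterpreted difference concrete, here is a small Python sketch using the standard library's html.unescape. The function name notice_views is made up for this example, and the check for a literal "Copyright" or "©" is only a crude stand-in for illustration, not a legal test.

```python
import html


def notice_views(raw_markup):
    """Return the notice as a source reader sees it and as a UA renders it."""
    rendered = html.unescape(raw_markup)  # what the browser window shows
    return {
        "raw": raw_markup,
        "rendered": rendered,
        # Crude illustrative check only: does the raw text itself carry a
        # humanly recognizable copyright word or symbol?
        "raw_is_readable": "Copyright" in raw_markup or "\u00a9" in raw_markup,
    }


views = notice_views("&copy; 2009 Lois Ng")
print(views["raw"])       # the entity, uninterpreted
print(views["rendered"])  # the same notice after a UA interprets the entity
```

The raw string keeps the entity name, while the rendered string carries the circled C, which is the gap described above.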
Therefore, I argue that a copyright notice calculated to be seen by someone looking at source code without benefit of a UA must be written to be visible there and humanly understandable in raw form. I use comments for that purpose, in addition to what's to be visible in the browser's window. But comments have a drawback of being usually not parseable by search engines (except for scripts, which use specially formatted comments). Since search engines copy our content, we need a way of embedding a copyright notice they have some way of identifying as being a copyright notice. But because legal requirements vary around the world, over time, and according to what is potentially subject to copyright, not to mention that some of us feel compelled to be extra cautious (either to capture all sorts of rights or because overclaiming can jeopardize an entire claim), there's no way to settle on a single form or a finite list for a copyright notice. A few forms are far more common than others, but it would take a while to research the form of choice to be used in Malawi next year for a sound recording by Madonna on tour if she's donating her work to a local school she's building (e.g., which nation's law applies). Thus, the string must be free-form.
But that limits what search engines can do with the string. With the meta rights proposal, likely the only parsing a machine can do is to recognize that it is a rights statement because it is in a tag and with a keyword that says so in accordance with the spec (which in turn supports the MetaExtensions page) and to recognize the string's length so that it can choose to either copy and display the string or only refer to it but in neither case truncate or edit it.
If a search engine programmer wants to try more sophisticated textual analysis, they'll succeed some of the time, but this very limited parseability is enough to flag a page as having an assertion of a right, which a human might choose to read, and if the human does not read it and infringes the copyright it'll be harder to pretend they infringed innocently, which is relevant to the remedy a U.S. court may apply.
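
The very limited parsing described above might look something like this minimal Python sketch. It assumes the proposed keyword is spelled "rights"; the display budget MAX_DISPLAY_LENGTH and the names RightsMetaCollector and plan_rights_display are hypothetical, and no search engine is known to work this way.

```python
from html.parser import HTMLParser

# Hypothetical display budget; a real search engine would pick its own.
MAX_DISPLAY_LENGTH = 200


class RightsMetaCollector(HTMLParser):
    """Collects the content of every <meta name="rights"> in a page."""

    def __init__(self):
        super().__init__()
        self.statements = []

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        if (attr.get("name") or "").strip().lower() == "rights":
            content = attr.get("content")
            if content:
                self.statements.append(content)


def plan_rights_display(html_text, limit=MAX_DISPLAY_LENGTH):
    """Decide per statement whether to display it whole or only refer to it.

    The statement text is never truncated or edited in either case.
    """
    collector = RightsMetaCollector()
    collector.feed(html_text)
    plans = []
    for text in collector.statements:
        action = "display" if len(text) <= limit else "refer"
        plans.append((action, text))
    return plans
```

The only things the sketch learns are that a rights statement exists and how long it is, which is enough to flag the page for a human reader.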
The U.S. law you linked to on form of notice does not control other nations. While it says "or elsewhere", the government's legal authority outside the U.S. is severely limited regardless of what Congress passes and other jurisdictions may enact their own requirements. There is also, within the U.S., state law on copyright, rarely invoked, generally common law, and generally on work that is not in a fixed form; I don't know what notice requirement, if any, applies there.
Possibly a court will reject anything we design, but generally courts require us to conform to already-existing law. Federal U.S. courts do not give advisory opinions and declaratory relief is expensive, unusual, and difficult to get, so we need to use the best judgment we can marshal now. If we want the benefits of intellectual property law, and it requires that we as owners and licensees give notice, we need to design a means for doing so without waiting for a court ruling on the specific design. Often pages have no rights assertion inside a page and suitable for code readers, and that's simply a deficiency. Thus, this tag.
This tag also can apply to other intellectual property rights. Software is, in some places, subject to patents and patent rights may be granted by their holder and trademark rights may apply, too.
Thanks for creating a rights-standard proposal. You might find more licenses available, such as, perhaps, the MIT license, the GPL, the LGPL, the FreeBSD license, and who knows what else that might be applied to pages. While some licenses were explicitly designed for texts, others may be applicable to texts even if that wasn't in the license drafters' original intentions.
Nick 03:26, 29 October 2009 (UTC)