Talk:MetaExtensions: Difference between revisions
(→FSDateCreation and FSDatePublish have exact same meaning: new section)
(→geo.position and icbm: Replied.)
(4 intermediate revisions by the same user not shown)
Line 328:
I'd like to depropose dir-content-pointer (which [https://wiki.whatwg.org/index.php?title=MetaExtensions&diff=3727&oldid=3705 I originally proposed]), because HTML has acquired more elements and they can handle the same function, and handle it better. It doesn't look like it would be a good synonym for anything. Should I move it into the Failed Proposals table? or just delete it? [[User:Nick|Nick]] ([[User talk:Nick|talk]]) 05:37, 3 December 2017 (UTC)
:Done. I moved it without deproposing. [[User:Nick|Nick]] ([[User talk:Nick|talk]]) 04:32, 7 January 2018 (UTC)
== want to fail bot-. . . as useless and as a namespace ==
I'd like to depropose bot-. . ., which [https://wiki.whatwg.org/index.php?title=MetaExtensions&diff=3727&oldid=3705 I originally proposed]. I don't see a use for it or for developing it. Should I move it into the Failed Proposals table? [[User:Nick|Nick]] ([[User talk:Nick|talk]]) 05:46, 3 December 2017 (UTC)
:Done. I moved it. [[User:Nick|Nick]] ([[User talk:Nick|talk]]) 04:26, 7 January 2018 (UTC)
== twitter:* spec deleted, unsure if found ==
Line 352:
[[User:Nick|Nick]] ([[User talk:Nick|talk]]) 06:38, 3 December 2017 (UTC)
:With respect to publisher: Done. Added as synonym for dcterms.publisher only.
:No change with respect to geographic-coverage.
:[[User:Nick|Nick]] ([[User talk:Nick|talk]]) 04:43, 7 January 2018 (UTC)
== between dc.* and dcterms.*, maybe same specs and definitions and maybe one set should be synonyms for the other ==
Line 363:
The keyword icbm is a synonym for geo.position and geo.position is a synonym for icbm. Both can't be true, but each synonym is listed with "(different value syntax)". So, perhaps neither should be a synonym for the other, but maybe be moved into descriptions as cross-references. What should we do? [[User:Nick|Nick]] ([[User talk:Nick|talk]]) 06:52, 3 December 2017 (UTC)
:Done. Resolved by delisting each as synonym of other and listing each in description as near-duplicate of other. [[User:Nick|Nick]] ([[User talk:Nick|talk]]) 04:51, 7 January 2018 (UTC)
== ambivalence of synonymy for 2 geo.* keywords ==
Line 379:
FSDateCreation and FSDatePublish, in the Meta Name Extensions table, have the same description: "[m]entions the date when this web page was created". The spec gives the same meaning for both: "[t]he date when this web page was created". Presumably, one description and one meaning are wrong, rather than that one name is a synonym for the other. I have emailed the organization behind the spec, asking whether FSDatePublish should be defined as "[t]he date when this web page was published" (and, by implication, whether the description at MetaExtensions should be "[m]entions the date when this web page was published" for FSDatePublish to avoid redundancy with FSDateCreation). I've invited their response here or at the MetaExtensions page and anyone else may wish to respond as well. [[User:Nick|Nick]] ([[User talk:Nick|talk]]) 07:04, 3 December 2017 (UTC)
== copyright in (non-WHATWG-wiki) spec copying content from WHATWG Wiki ==
In writing specifications that are copied significantly from the WHATWG Wiki (I had written significantly into the wiki, which is under CC0 or (possibly for some older portions by other contributors) the MIT license), I think the law requires that the spec also be under CC0 or the MIT license. A short phrase is not copyrightable, so a short phrase can be copied without invoking this duty. Some other kinds of content are not copyrightable and thus can be freely copied. Fair use may or may not apply. Whether CC0 or the MIT license would apply to the spec depends on whether CC0 or the MIT license, respectively, applied to what is being copied from the wiki. (Identifying MIT-licensed content is complicated by the need to identify its author and then whether the author replaced the MIT license with CC0.) This is regardless of where the spec is published or hosted or by whom the spec is created. This may be true only in the U.S. or in other nations; I don't know about other nations' laws. This does not apply to an underlying concept or information but to the expression of it, so writing in original words generally allows other treatment of copyright. While CC0 might not have to apply to portions of the spec that were not copied from CC0 material, the MIT license might have to be applied to the entire spec. In general, CC0 or the MIT license would have to apply to some or all of an off-Wiki spec that is liberally copied from the Wiki, even to portions that were not themselves copied. (If that were not true, then I could photocopy the original works of Shakespeare and Mozart and, with or without making a few modern changes, secure a new copyright for myself on what they wrote, making a fortune from the process.) [[User:Nick|Nick]] ([[User talk:Nick|talk]]) 07:29, 3 December 2017 (UTC)
Latest revision as of 04:51, 7 January 2018
"description" meta name
I think the description name should be added to the HTML 5 specifications. Yes, search engines have made the keywords name obsolete. However, search engines are not that good. It is still only the document author who can provide a reliable, short description of the document's contents. I think there should be constraints on how long the description is, what it contains, and how it is structured. Short sentences and plain English descriptions would be best.
- It looks like "description" is in the latest specification. Rfc2549 01:46, 10 October 2008 (UTC)
- Keywords still work. Descriptions are a good idea. However, HTML5 should not constrain search engines by specifying short sentences or any particular structure. HTML5 and this MetaExtensions Wiki should be permissive, especially for websites that depend on specialized or foreign search engines that may have other rank-driving preferences for meta tags. Rather, page authors should consider short sentences and the plain language you suggest because when they appear in search results they attract visitors. However, a site meant for physicians might have a different idea on what language is plain for their readers, so HTML can't usefully define that. Advice on good drafting is the province of search engines and various websites that report or advise. Nick 08:45, 20 April 2009 (UTC)
- The HTML "title" meta name is not validating in our department's webpages, even though it is included as a synonym under dcterms.title. We have no trouble with the HTML "description", which validates properly and is also listed as a synonym under dcterms.description. Does anyone know why our HTML "title" gets an error message? Thanks.
- The HTML "title" referred to in the text is not an extension of the <meta> element, but instead the proper <title> HTML tag. According to the HTML specification, the title element «represents the document's title or name. Authors should use titles that identify their documents even when they are used out of context». That's why it serves the same purpose as <meta name="dcterms.title" /> and it belongs to the metadata content category. (--Andy sky (talk) 14:59, 11 September 2013 (UTC))
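To make the distinction in the reply above concrete, here is a minimal sketch in Python using only the standard library's html.parser. The sample page, the class name HeadMetadata and the helper read_head are invented for illustration and are not taken from the department's pages or from any validator. It shows that the document title comes from the title element, while dcterms.title and description are ordinary meta names, so a page never needs a meta tag whose name is "title".

```python
from html.parser import HTMLParser


class HeadMetadata(HTMLParser):
    """Collect the <title> text and every <meta name=... content=...> pair."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.meta = {}
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        name = attrs.get("name")
        if tag == "title":
            self._in_title = True
        elif tag == "meta" and name:
            # Assumption: names are normalised to lower case for lookup.
            self.meta[name.lower()] = attrs.get("content") or ""

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


# Hypothetical page: the title lives in <title>, not in a meta tag.
PAGE = """<!DOCTYPE html>
<html><head>
<title>Department of Examples</title>
<meta name="dcterms.title" content="Department of Examples">
<meta name="description" content="Staff, courses and contact details.">
</head><body><p>Welcome.</p></body></html>"""


def read_head(source):
    """Return (title text, dict of meta name -> content) for a page."""
    parser = HeadMetadata()
    parser.feed(source)
    parser.close()
    return parser.title, parser.meta


title, meta = read_head(PAGE)
print(title)
print(sorted(meta))
```

Whether a given validator accepts a particular meta name depends on its own registry, which this sketch does not model.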
keywords and description should not be unendorsed, should they?
Why are keywords and description unendorsed? Yahoo and Google do use them for searches, if not as much as when they began offering search services. And there are other search engines around the world, which might well support both of these. The unendorsing of cache is explained; what are the explanations for the other two? I see keywords, but not description, was unendorsed from the beginning, which suggests a misclassification by the original proposer. I think someone should reclassify both as proposals. Would it be okay if I did? Nick 08:55, 20 April 2009 (UTC)
- On description: resolved. Ian Hickson reports that it's in HTML5, and it is, in section 4.2.5.1 (I had forgotten). On keywords: he didn't mention them in his reply but moved them within the Wiki from unendorsed to failed, and I've submitted a bug report for reconsideration of that decision: W3C Bug 6853. Nick 09:35, 29 April 2009 (UTC)
rights: why reversion
I'm reverting for these reasons:
The revision's added content was largely redundant of the link element rel="license" that, per <http://wiki.whatwg.org/wiki/RelExtensions>, is already in HTML5 (see section 6.12.3.9 of the W3C Working Draft of 8-25-09). The link uses URLs instead of standardized strings, but those URLs can be used the same way as the proposed strings, i.e., a search engine or UA can recognize the URLs without repeatedly fetching as equivalent to the strings. And the set of URLs is essentially extensible without requiring registration here, which simplifies the tag's use.
The proposed string "coprYYYY" is probably not legally sufficient notice even in the U.S., and around the world what that notice must state may vary. It's too much for us to list all the possibilities. Let the page author write such notice as they see fit and let the search engines store or refer to it as they see fit without abbreviating it.
I doubt the proposed code/content distinction can be legally defined in this way and the distinction recognized in a court of law applying copyright law. Most judges and lawyers probably don't know how Internet standards are promulgated. For example, arrangement is copyrightable and a judge might conclude that in copyright law code is one thing and arrangement of the code is another. I haven't researched case law on point but doing so is pointless when copyright law is in many nations that each have their own laws. My proposal is more flexible and thus better able to meet page authors' needs.
The legal needs require a way of writing arbitrary or free-form text (not arbitrary as lawyers use that term).
The revision is more complicated to implement.
The revision doesn't cover multiple licensing for one work. The HTML5 working draft carries an assertion of two licenses. So does Perl, the language. When two licenses are said to apply, a relationship has to be defined, perhaps that the user can choose which to apply, perhaps one applies for commercial use and the other for noncommercial use, or whatever. The revision apparently required multiple meta tags but didn't propose that. If a search engine might take multiple meta rights tags as conflicting, things can get confusing and legal rights may be lost.
The revision doesn't cover multiple licensing for multiple works on one page. Free-form text in the meta tag would allow that. For example, "The photograph may be available from the Permissions Department at . . .; the text is licensed under . . .; the arrangement is the property of . . .; and the music is licensed through BMI."
Thank you for the character entity edit.
Perhaps another meta keyword should be proposed, perhaps rights-standard, to use what was proposed? Would it serve a purpose that link wouldn't?
Thanks. Nick 07:58, 28 October 2009 (UTC)
- I see what you're saying -- my point of view is that a machine-readable/parsable value would be beneficial for searching, cataloging, etc. Free-form text makes this very difficult, if not impossible, to apply to any degree. If the page author is providing a legal notice of copyright, then they must provide the country-appropriate copyright notice visible to the page viewer (in the content itself). Copyright Notice, Deposit, and Registration, 17 U.S.C. § 401 (2007)
- My fear is that, without caselaw to guide us on the enforceability or applicability of this new way of embedding copyright in the code, we are setting up a standard that will not last past the first court case. Am I confused about the rationale behind this tag? What does it accomplish that a notice in the page's viewable content area wouldn't?
- >Perhaps another meta keyword should be proposed, perhaps rights-standard, to use what was proposed? I'll do just that -- thank you. BryanH 16:47, 28 October 2009 (UTC)
- Notice calculated to be seen by most Web visitors will be visible in the UA's (viz., browser's) window or canvas, so it should be written to be visible there. But code or markup is not seen there. That's seen when a UA user either looks in the source code via the UA or receives it separately and doesn't expose it to the UA. The code will often include not the word "Copyright" or a "C" in a circle but a character entity that mostly only programmers will recognize when it is not interpreted. Without interpretation, "&copy; 2009 Lois Ng" does not meet U.S. legal requirements for a copyright notice.
- Therefore, I argue that a copyright notice calculated to be seen by someone looking at source code without benefit of a UA must be written to be visible there and humanly understandable in raw form. I use comments for that purpose, in addition to what's to be visible in the browser's window. But comments have a drawback of being usually not parseable by search engines (except for scripts, which use specially formatted comments). Since search engines copy our content, we need a way of embedding a copyright notice they have some way of identifying as being a copyright notice. But because legal requirements vary around the world, over time, and according to what is potentially subject to copyright, not to mention that some of us feel compelled to be extra cautious (either to capture all sorts of rights or because overclaiming can jeopardize an entire claim), there's no way to settle on a single form or a finite list for a copyright notice. A few forms are far more common than others, but it would take a while to research the form of choice to be used in Malawi next year for a sound recording by Madonna on tour if she's donating her work to a local school she's building (e.g., which nation's law applies). Thus, the string must be free-form.
- But that limits what search engines can do with the string. With the meta rights proposal, likely the only parsing a machine can do is to recognize that it is a rights statement because it is in a tag and with a keyword that says so in accordance with the spec (which in turn supports the MetaExtensions page) and to recognize the string's length so that it can choose to either copy and display the string or only refer to it but in neither case truncate or edit it.
- If a search engine programmer wants to try more sophisticated textual analysis, they'll succeed some of the time, but this very limited parseability is enough to flag a page as having an assertion of a right, which a human might choose to read, and if the human does not read it and infringes the copyright it'll be harder to pretend they infringed innocently, which is relevant to the remedy a U.S. court may apply.
- The U.S. law you linked to on form of notice does not control other nations. While it says "or elsewhere", the government's legal authority outside the U.S. is severely limited regardless of what Congress passes and other jurisdictions may enact their own requirements. There is also, within the U.S., state law on copyright, rarely invoked, generally common law, and generally on work that is not in a fixed form; for which I don't know what the notice requirement, if any, is.
- Possibly a court will reject anything we design, but generally courts require us to conform to already-existing law. Federal U.S. courts do not give advisory opinions and declaratory relief is expensive, unusual, and difficult to get, so we need to use the best judgment we can marshal now. If we want the benefits of intellectual property law and when it requires that we as owners and licensees give notice, we need to design a means for doing so without waiting for a court ruling on the specific design. Often pages have no rights assertion inside a page and suitable for code readers, and that's simply a deficiency. Thus, this tag.
- This tag also can apply to other intellectual property rights. Software is, in some places, subject to patents and patent rights may be granted by their holder and trademark rights may apply, too.
- Thanks for creating a rights-standard proposal. You might find more licenses available, such as, perhaps, the MIT license, the GPL, the LGPL, the FreeBSD license, and who knows what else that might be applied to pages. While some licenses were explicitly designed for texts, others may be applicable to texts even if that wasn't in the license drafters' original intentions.
- Nick 03:26, 29 October 2009 (UTC)
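The limited parsing described in this thread can be sketched as follows. This is only an illustration under stated assumptions: the meta name "rights" is the proposal being discussed, the sample notice reuses the "Lois Ng" example, and the class RightsFinder and function rights_statements are hypothetical names. The sketch recognises a rights statement by its keyword, keeps the string whole without truncating or editing it, and shows that the raw source carries the character entity while a parser yields the interpreted character.

```python
import html
from html.parser import HTMLParser

# Hypothetical markup: a free-form rights statement in a meta tag.
RAW = '<meta name="rights" content="&copy; 2009 Lois Ng. All rights reserved.">'


class RightsFinder(HTMLParser):
    """Collect the content of every <meta name="rights"> tag, unmodified."""

    def __init__(self):
        super().__init__()
        self.statements = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "meta" and (attrs.get("name") or "").lower() == "rights":
            # Keep the whole string: no truncation and no rewriting.
            self.statements.append(attrs.get("content") or "")


def rights_statements(source):
    """Return all rights statements found in the page source, in order."""
    finder = RightsFinder()
    finder.feed(source)
    finder.close()
    return finder.statements


statements = rights_statements(RAW)

# Someone reading the raw source sees the entity, not the symbol.
print("&copy;" in RAW)
# The parser interprets the entity when it reads the attribute value.
print(statements[0])
# html.unescape performs the same interpretation on a bare string.
print(html.unescape("&copy; 2009 Lois Ng"))
```

Because the function returns a list, several rights tags on one page are kept as separate statements rather than merged; how a consumer should relate them to each other is left open here, as it is in the discussion.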
Based upon the discussion, I realize that trying to be too encompassing is/was too complicated. I've narrowed the scope to only media (i.e., images, video and other objects) and realigned the purpose: to enable search engines/crawlers to know if a page's objects have special rights for cataloging purposes. — BryanH 17:54, 13 July 2011 (UTC)
- Is this media rights stuff broadly used or did you make it up just now? If you just made it up, you are duplicating functionality from the work licensing microdata vocabulary. Could you, please, check if the microdata vocab solves your problem and not reinvent yet another syntax if it does? hsivonen 07:25, 14 July 2011 (UTC)
Meta versus content
Why is content being put into the meta tag that only search engines can view? Wouldn't it be more useful to have this information in regular HTML code that is viewable to the end user while at the same time easily findable to search engines?
I will need to think about the details, but here is a start.
There is an HTML tag called "base"
<BASE HREF="http://www.example.com/">
That is the foundation of the website, or of part of it. All other parts build on that base. By creating standard pages that are declared on the page -- Title Page A goes with content page A -- it will be easier for both search engines and the end reader to find the information.
<TITLEPAGE HREF="titlepage.html">
On the titlepage, that is associated with that specific web page, one would expect to find all of the information that one would traditionally find on a title of a print source similar to a book or magazine.
There could be standard XML notation for this info: <TITLE> <SUBTITLE> <AUTHOR> <EDITOR> <PUBLISHER> <ILLUSTRATOR> <PUBLISH-ADDRESS> <COPYRIGHT> <DDC> (Dewey Decimal Classification) <LLC> (Library of Congress Classification) <ISBN> - maybe start a standard where people could register a website, and that organization verifies that the website's content match the info on the titlepage.
Whatever other names are standard in the industry.
<TOC-COMPACT HREF="toc-compact.html">
It points to the TOC of the site. Similar to a traditional TOC.
<TOC HREF="toc.html">
A more detailed TOC.
<ABOUTAUTHOR HREF="about_author.html">
An about the author page.
<INTRODUCTION HREF="introduction.html">
Information generally found in an introduction. Who is the target audience of the website, what are the goals of the website.
<PREVIOUS HREF="previous.html">
If the website is intended to be read like a book, what is the previous page to read?
<NEXT HREF="next.html">
What is the next web page to read?
keywords and descriptions
Personally, I don't understand why the keywords and descriptions are in the header as opposed to the main content. It is content. All content should be viewable to the end reader without having to look at the source code. Hidden titles, descriptions, and keywords would be the same as subliminal messages on a TV. They are being shown, but the user does not know what they are.
Also, by bringing them out into the open part of a web page, there is less likelihood of abuse.
Maybe things should change so that instead of a search engine just reading the "head" part, they also read the new HTML5 tag "header" and "footer" as well.
Mnewman 17:08, 12 February 2010 (UTC)
- > Why is content being put into the meta tag that only search engines can view?
- > Wouldn't it be more useful to have this information in regular HTML code that
- > is viewable to the end user while at the same time easily findable to search engines?
- > . . . . .
- > Personally, I don't understand why the keywords and descriptions are in the header
- > as opposed to the main content. It is content.
- Yes, in general, but there are legitimate exceptions. Google and probably most search engines recommend exactly what you suggest, even recommending that the primary subject be obvious in the lead paragraphs, in headlines marked with h1, h2, and similar elements, and in page titles. However, this also constrains writing to a style suited to search engine algorithmic extractions. Not all writers want to write that way and not all audiences need it; some prefer or demand other writing styles. Longer writings often begin with background that is not the central subject. Scholarship often distinguishes what will not be discussed and search engines can easily misunderstand the presence of a word as indicative of content rather than of noncontent. And the description meta tag serves the specific purpose of supplying the blurb that search engines display in results. Since a page author often knows what would better describe the page in two lines than would a search engine's extractive algorithm, the page author can provide the description meta tag and give searchers a better idea of what they'll find if they click the result for that page.
- > Hidden titles, descriptions, and keywords would be the same as subliminal messages
- > on a TV. They are being shown, but the user does not know what they are.
- They're not subliminal; they're not visible at all. The head (I don't think you meant header) is not displayed unless the page author made a mistake or something's wrong with a browser or a computer. Only parts of the body should be visible.
- As to subliminalities in the body, including in the header, Google and other search engines discourage the use of on-screen text that's too small or too low-contrast to be humanly readable under normal conditions and they do not meet accessibility standards. A website using that kind of styling is either an art site or abusive.
- > Maybe things should change so that instead of a search engine
- > just reading the "head" part, they also read the new HTML5 tag
- > "header" and "footer" as well.
- Search engines already read head and body and doubtless read header and footer as part of the body. They probably read everything in a page file, although they may discard parts for their permanent indexes.
- > . . . . . On the titlepage, that is associated with that specific web page, one would
- > expect to find all of the information that one would traditionally find on a title of
- > a print source similar to a book or magazine.
- Creating a page to bibliographically describe another page is already supported using rel="" and rev="" (see HTML5 and http://wiki.whatwg.org/wiki/RelExtensions) and, alternatively, the same bibliographic information can be put on the page with the content using various elements. Some of the meta tags proposed are reusing those previously in use under HTML 4.01. Perhaps the Dublin Core system would help, although I've had difficulty implementing it in (X)HTML and I don't know if search engines use DC.
- Generally, there's a preference for reusing existing technology rather than inventing a new solution to an already-solved problem, so see if existing elements would already solve the problem you perceive or please identify a problem that has not been solved and then design a solution to fit that.
- > There could be standard XML notation for this info . . . .
- You can be inspired by XML, but for HTML5 try to use HTML (in a way that's compatible with XHTML) before going outside to XML. XML is very good for intranets (including intranets that can receive and process external submissions) and for industries and endeavors that can maintain their own standards, such as for chemistry, math, and site maps, but I'm not sure XML elements should be reserved for a purpose without a clearly recognizable administrative body to maintain that particular subset of elements in a standard. Since there are various kinds of publishing, several noncomputer standards would likely have to be combined. I don't know if it's worth the work.
- The Dewey Decimal system is not of much use since it is copyrighted and permission is required for any nonlibrary use, including website classification.
- The Universal system may not have much use. At least, I asked a librarian at a major library if she knew of anyone using it and she didn't. One librarian is not a large sample, so maybe it should be investigated further.
- The Library of Congress Subject Headings system is good but I don't know what other nations use.
- > maybe start a standard where people could register a website,
- > and that organization verifies that the website's content match
- > the info on the titlepage.
- Sounds expensive. How much are reviewers' salaries and who writes the checks?
- Thanks for the thoughts.
- Nick 03:14, 18 February 2010 (UTC)
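The role of the description meta tag as an author-supplied blurb, as described in the reply above, can be sketched like this. It is a toy model, not a description of how any real search engine works: the class SnippetSource, the function result_blurb, the 160-character limit and both sample pages are assumptions made for illustration. The sketch prefers the author's description when one is present and otherwise falls back to text extracted from the body.

```python
from html.parser import HTMLParser


class SnippetSource(HTMLParser):
    """Gather the description meta tag and the visible body text."""

    def __init__(self):
        super().__init__()
        self.description = None
        self.body_text = []
        self._in_body = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "body":
            self._in_body = True
        elif tag == "meta" and (attrs.get("name") or "").lower() == "description":
            self.description = attrs.get("content") or None

    def handle_endtag(self, tag):
        if tag == "body":
            self._in_body = False

    def handle_data(self, data):
        # Text in the head (for example the title) is not body content.
        if self._in_body and data.strip():
            self.body_text.append(data.strip())


def result_blurb(source, limit=160):
    """Prefer the author's description; otherwise extract from the body."""
    parser = SnippetSource()
    parser.feed(source)
    parser.close()
    if parser.description:
        return parser.description
    # Crude extraction: the opening text, cut at an arbitrary limit.
    return " ".join(parser.body_text)[:limit]


# Hypothetical pages whose opening line says what will NOT be discussed.
WITH_DESCRIPTION = (
    '<html><head><meta name="description" content="A guide to meta keywords.">'
    "</head><body><p>This essay will not discuss cooking.</p></body></html>"
)
WITHOUT_DESCRIPTION = (
    "<html><head><title>Essay</title></head>"
    "<body><p>This essay will not discuss cooking.</p></body></html>"
)

print(result_blurb(WITH_DESCRIPTION))
print(result_blurb(WITHOUT_DESCRIPTION))
```

The second page shows the risk raised in the reply: the extracted blurb mentions cooking, which is exactly what the page says it does not cover.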
Mnewman's Feb. 12, 2010, deleted comments
I prepared a reply, since I thought they might have been deleted by accident, but they haven't come back, so probably not. If there's interest in these, feel free to indicate while I have the reply. Thanks. Nick 03:30, 18 February 2010 (UTC)
Re: Proposed 'creator' MetaExtension
I really don't see the purpose behind this proposal. If anything, this should be marked as a synonym for the 'author' MetaExtension already defined in HTML if it is found that this particular extension is already seeing use. Otherwise, it should be marked as unendorsed.
--Codeguru413 18:27, 2 February 2011 (UTC)
- They're different. The author is the author of the Web page. The creator is the creator of content that was independent of the Web until an author authored a Web page with the content. Thus, if you put Shakespeare's plays onto the Web you're the author (page author) and he was the creator. These are often different people and, when they are, they usually need separate identifications. Nick 03:37, 5 February 2011 (UTC)
Re: Proposed 'format-print' MetaExtension
I question the need for this as well. CSS already provides mechanisms by which to set the page size and such through the @media mechanism. However, it's an interesting proposal in that a value such as this (e.g. A4, Letter, etc.) is easily recognizable and has a meaning that provides values not only to OS's (or, more accurately, UAs), but also end-users. Perhaps this is more something that should be added to CSS as an alternative to explicitly setting the page size? Anyway, I'd be interested to hear more comments on this proposal.
--Codeguru413 18:36, 2 February 2011 (UTC)
Dublin Core metadata
Dublin Core (DC) metadata is in the list, but it needs work. Only two of the 15 DC elements are listed (although these are listed as DCTERMS, and these should be synonyms). The DC elements have been defined in IETF RFC 5013 [RFC5013], ANSI/NISO Standard Z39.85-2007 [NISOZ3985] and ISO Standard 15836:2009 [ISO15836], so I don't see how they can be left out.
Also, the Dublin Core Administrative Components (AC) are not listed; these are specified at http://biblstandard.dk/ac/. A couple of these AC elements, handling and action, have their own schemes, and I am not sure how to handle these. If these are enumerated then it makes for a lot more keywords; for example, AC.handling becomes AC.handling.harvest, AC.handling.public, AC.handling.manual, AC.handling.keep, and AC.handling.mail.
What do people think about this? Should the full sets of DC elements, DCTERMS, and AC elements be included? Ian Hickson has suggested to me that what is more important than whether there are standards that define them, is whether there is any software that consumes them in a useful manner. I agree, but do not have this information. Although I have used AC and DC elements for years, it does not mean that others have. Martin.leese 03:01, 15 July 2012 (UTC)
Later. For full sets of these, I count there would be 55 DCTERMS (currently 55), 15 DC elements (currently 2), and 31 AC elements (currently zero). The high numbers are because Dublin Core is a comprehensive system for metadata. Given this many keywords, would it be more convenient for them to be in a separate table (or tables)? Martin.leese 20:35, 16 July 2012 (UTC)
Even later. A list of projects that use Dublin Core metadata is maintained here. Note that Dublin Core encourages the use of DCTERMS elements over DC (here), so the DC elements could be omitted. Finally, there is currently no specification that specifies the AC elements as HTML meta keywords, so they will have to wait. Martin.leese 04:04, 20 July 2012 (UTC)
Structured Data proposal
It is worth noting that the Dublin Core initiative started with RDF and it now seems suitable to move towards a Structured Data RDFa implementation. I totally agree with this, because it is much more systematic and syntactically correct. However, it involves several changes in the way DC metadata are expressed. Below is a comparison between the two ways to express it.
| | Pure HTML syntax | HTML+RDFa syntax |
|---|---|---|
| Namespace declaration | a <link> sibling element: | the attribute prefix in the parent element, with this pattern: |
| Property declaration | name attribute | property attribute |
| Property name | [prefix]. (dot) | [prefix]: (colon) |
| Namespace-prefix association | Normative specs declaration | Conventional, suggested by common use |
| Requires standardisation as enumerated | Yes | No - only DublinCore definition is required |
| Suitable elements | <meta>; <link> elements never standardised. | Any |
| Value type | Formally, property in DC namespace identify resources, while DCTERMS subproperty allow literals/literal surrogate. Substantially, only literals are allowed. Resource defined via URIs would require the use of a | Literals, literal/non-literal surrogates, resources, URIs, specific datatypes (language, datetime), formatted text (XML only). The RDFa syntax allows a broader variety of datatypes, each specific for the element on which the property is specified |
Even if this page is dedicated to MetaExtensions, I think that authors should know which is the most modern and useful way to provide metadata. Notify me if there are errors, or if a further discussion is required. --Andy sky (talk) 13:12, 28 October 2013 (UTC)
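To make the dot-versus-colon difference in the comparison concrete, here is a minimal sketch that reads a Dublin Core title from either syntax using only Python's standard `html.parser`. The sample markup is an assumption: the `schema.DCTERMS` link and the `prefix` attribute follow common Dublin Core and RDFa usage and are not quoted from this thread. The class name `DublinCoreExtractor` and the function `extract_dcterms` are hypothetical.

```python
from html.parser import HTMLParser


class DublinCoreExtractor(HTMLParser):
    """Collect DCTERMS values from pure HTML and HTML+RDFa markup."""

    def __init__(self):
        super().__init__()
        self.values = {}

    def handle_starttag(self, tag, attrs):
        a = dict(attrs)
        # Pure HTML syntax: name="DCTERMS.title" (prefix, then a dot).
        key, sep = a.get("name"), "."
        if key is None:
            # HTML+RDFa syntax: property="dcterms:title" (prefix, then a colon).
            key, sep = a.get("property"), ":"
        if key is None or sep not in key:
            return
        prefix, term = key.split(sep, 1)
        if prefix.lower() != "dcterms":
            return
        # Simplification: only values carried in content= are read here.
        if "content" in a:
            self.values.setdefault(term, []).append(a["content"])


def extract_dcterms(html):
    parser = DublinCoreExtractor()
    parser.feed(html)
    return parser.values


# Assumed sample markup for each syntax (illustrative only).
PURE_HTML = (
    '<head><link rel="schema.DCTERMS" href="http://purl.org/dc/terms/">'
    '<meta name="DCTERMS.title" content="Hamlet"></head>'
)
RDFA = (
    '<head prefix="dcterms: http://purl.org/dc/terms/">'
    '<meta property="dcterms:title" content="Hamlet"></head>'
)

print(extract_dcterms(PURE_HTML))
print(extract_dcterms(RDFA))
```

Both calls yield the same term-to-values mapping, which is the point of the comparison: the two syntaxes differ in how the property is named and declared, not in what it says.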
Property list revision
After several studies of use cases and tests, the list of Dublin Core Metadata Initiative properties listed in this table has been reviewed according to DCMI documentation material and specifications (see references above and in text). As a result:
- elements with DC. prefix have been checked. Some of them had been inserted in this list, despite having been defined in the /terms/ namespace, and thus they have been removed. In addition to this, a reference to specific /terms/ subproperty has been listed in the Synonyms column as well as condition for DC/DCTERMS disambiguation.
- removed reference to DCTERMS.collection as it doesn't belong to the /terms/ namespace, but it rather constitutes a type definition.
- a reference has been added to DCTERMS properties, some of which can have non-literal values, to use <link rel="DCTERMS.property"> instead, to represent a non-literal value reference.
- properties in the /terms/ namespace which are "intended to be used with non-literal values as defined in the DCMI Abstract Model (http://dublincore.org/documents/abstract-model/)" have been removed from the list, as a <meta> element is not suitable to represent non-literal value surrogates.
- such properties have not been removed from /elements/1.1/ list, as this properties collection was created before the introduction of the "range" notion, thus these properties have no preferred value type. However, as stated above and in the table itself, /elements/1.1/ namespace has been maintained for legacy compatibility and the /terms/ revision is to be preferred when properties are used along with the correct range.
--AndySky21 (talk) 22:35, 12 March 2015 (UTC)
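The literal versus non-literal rule described in the list above could be sketched as follows. This is only an illustration: the helper `dcterms_element` and its `is_literal` flag are hypothetical, and the caller is assumed to know the range of each property from the DCMI documentation. The example URL is a placeholder.

```python
from html import escape


def dcterms_element(term, value, is_literal):
    """Return <meta> for a literal value, <link> for a resource URI."""
    name = "DCTERMS." + term
    if is_literal:
        # Literal values fit the content attribute of <meta>.
        return '<meta name="%s" content="%s">' % (escape(name), escape(value))
    # Non-literal values are referenced by URI through <link>.
    return '<link rel="%s" href="%s">' % (escape(name), escape(value))


print(dcterms_element("title", "Hamlet", True))
print(dcterms_element("license", "https://example.org/license", False))
```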
About the "properties... have been removed" part, I'm going to revert all the changes you made that removed anything, because you can't be unilaterally removing values that other people have registered. Other people may be using those values in documents, and you have no way of knowing whether they are or not, and your removal may cause their existing documents to unexpectedly start failing validation.
In fact, it caused the vnu/Nu HTML Checker regression test suite to start failing.
So, if you want to do some cleanup and normalization of DC stuff, that's fine, but you absolutely must not remove any values that anybody else had already registered. Or even ones that you've registered previously yourself. At least not without taking it to the whatwg list and/or public-html list to get agreement first.
MikeSmith (talk) 06:36, 15 March 2015 (UTC)
- Believe it or not, I felt that something could be wrong about it, and I hoped that someone could help me, so thank you about the notes on how to discuss it. The fact is, I don't know who inserted the wrong values, as I know that it was me who inserted some entries about Dublin Core. I'll write a message to the lists as advised. --AndySky21 (talk) 12:49, 15 March 2015 (UTC)
- Sounds good MikeSmith (talk) 14:16, 15 March 2015 (UTC)
- Done (HTML) and done (WHATWG). I'm just waiting for someone to reply. --AndySky21 (talk) 11:45, 18 March 2015 (UTC)
- EDIT: MikeSmith, please, if mailing list threads receive no response/review, I need your help to have this stuff corrected. Thanks for your patience.--AndySky21 (talk) 00:58, 25 March 2015 (UTC)
Highwire Press/Google Scholar meta names
I have added several citation_* meta names ("citation_" is essentially the namespace prefix) to the table. This includes those mentioned explicitly in Google Scholar's indexing guidelines, as well as those in widespread usage by academic journal publishers, which are supported by Google Scholar but not mentioned in the documentation.
Some of the citation_* meta names are also covered by Dublin Core, but are in many cases preferred, as Dublin Core does not cover the whole range of metadata needed for description of an academic paper and its publication venue.
(This thread was created by Aeaton (talk) 14:45 (hour estimated from zone conversion), 15 December 2014 (UTC). Nick (talk) 20:11, 30 January 2016 (UTC))
lots more meta names for scholarship
A whole slew of meta tag names need to be added in connection with Google Scholar, though few or none (that haven't already been added) are from Google itself. I don't have time to add them myself or even to compile them. Maybe their proprietors/promoters should be contacted with an explanation of the reason for adding them to here and an invitation to add them. Apparently, they're already in use. Examples include eprints.*. See a Webmasters Stack Exchange thread and Google Scholar documentation.
Possibly, the people offering them consider them proprietary to their systems and for their products and not meant for general public use and don't care about anyone else's parsers. If so, either we should discourage that practice, register them as if use proprietorship is to be ignored, or create a subnamespace, like "x-[proprietor-string]-*", for which only the subnamespace and not each name in it would be registered here. Registering a subnamespace might be a way to earn WHATWG or W3C a fee.
Nick (talk) 19:53, 30 January 2016 (UTC)
subject-datetime replaces two sets of names
I created subject-datetime as a replacement for page-datetime (which was my past creation) because I thought page-datetime as a name would confuse too many page authors into thinking it's about when a page was authored rather than what was intended, the date of the subject of the page.
Subject-datetime also replaces datetime-coverage, datetime-coverage-start, datetime-coverage-end, and datetime-coverage-vague because the single name comes with more powerful coding possibilities.
All of the replaced names become synonyms of the new value.
Nick (talk) 02:53, 3 December 2017 (UTC)
want to delete dir-content-pointer as now useless
I'd like to depropose dir-content-pointer (which I originally proposed), because HTML has acquired more elements and they can handle the same function, and handle it better. It doesn't look like it would be a good synonym for anything. Should I move it into the Failed Proposals table? or just delete it? Nick (talk) 05:37, 3 December 2017 (UTC)
want to fail bot-. . . as useless and as a namespace
I'd like to depropose bot-. . ., which I originally proposed. I don't see a use for it or for developing it. Should I move it into the Failed Proposals table? Nick (talk) 05:46, 3 December 2017 (UTC)
twitter:* spec deleted, unsure if found
Maybe the spec link used to work, but it doesn't now. Possibly the new specs are at https://developer.twitter.com/en/docs/tweets/optimize-with-cards/guides/getting-started and/or at https://developer.twitter.com/en/docs/tweets/optimize-with-cards/overview/markup (both as accessed 11-30-17) but I'm not sure. I guess I could email Twitter, but I'm not in Twitter (I'm not a member or whatever they're called) and I suspect a Twitter participant would get a more meaningful response. Nick (talk) 06:13, 3 December 2017 (UTC)
twitter:domain description clarification rewrite
The description for twitter:domain used to say, "the domain of the website (added w/ API 1.1)". I thought that meant example.com for a website at http://example.com and surely Twitter has the technology to parse a URL to extract the host domain of anyone's website. But it turns out that the meta name may be for a different purpose. See https://twittercommunity.com/t/twitter-domain/64463 (as accessed 11-29-17). I rewrote accordingly but without much detail and not based on a spec. I did not find a spec for that name. When I sampled some dates for that link's URL at archive.org for twitter:domain I did not find the meta name there and probably wouldn't want to use a deleted spec anyway. Nick (talk) 06:26, 3 December 2017 (UTC)
I doubt one synonym can be a synonym for more than one keyword
Two situations:
There was a metatag name "publisher". I proposed it. It was deleted. I have written a spec which is more precise than that for dc.publisher, dcterms.publisher, or citation_publisher, but now that spec looks unnecessary. I would like to put the word back into MetaExtensions, but as a synonym, which doesn't need a spec. This could be a synonym for more than one present-day keyword. Should I add it to all of them or only to one and, if only to one, to which one? Possibilities are citation_publisher, dc.publisher, and dcterms.publisher. My guess is the last, but that's a flimsy guess.
The metatag name "geographic-coverage", which I introduced but which did not meet the requirements for registration and should not be developed further, could be a synonym for any value in the geo.* enumerated names or for dcterms.spatial. I chose the last. If there's another preference, please edit or reply.
Nick (talk) 06:38, 3 December 2017 (UTC)
- With respect to publisher: Done. Added as synonym for dcterms.publisher only.
- No change with respect to geographic-coverage.
- Nick (talk) 04:43, 7 January 2018 (UTC)
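One way to picture the constraint discussed in this thread is a lookup table in which each synonym points to exactly one keyword. The sketch below is hypothetical (the names `SYNONYMS` and `canonical_name` are not from any spec or checker); the entries reflect the choices reported on this page: publisher to dcterms.publisher, geographic-coverage to dcterms.spatial, and the older datetime names to subject-datetime.

```python
# Each synonym maps to a single keyword; a dict cannot hold two targets
# for one key, which mirrors the "one synonym, one keyword" concern.
SYNONYMS = {
    "publisher": "dcterms.publisher",
    "geographic-coverage": "dcterms.spatial",
    "page-datetime": "subject-datetime",
    "datetime-coverage": "subject-datetime",
}


def canonical_name(name):
    """Return the keyword a meta name resolves to (itself if not a synonym)."""
    key = name.strip().lower()
    return SYNONYMS.get(key, key)


print(canonical_name("Publisher"))
print(canonical_name("dcterms.spatial"))
```

Several synonyms may share one target, but a synonym that needed two targets (for example dc.publisher as well as dcterms.publisher) would not fit this structure, which is why a single target had to be chosen.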
between dc.* and dcterms.*, maybe same specs and definitions and maybe one set should be synonyms for the other
In the Meta Name Extensions table, entries for dc.publisher and dcterms.publisher link to the same spec; likewise for dc.created and dcterms.created and dc.creator and dcterms.creator. That's enough sampling. I thought there's an error in one set of links (for either dc.* or dcterms.*), but then I read what the proponent of both sets (DCMI (Dublin Core)) said at an archived page (https://github.com/dcmi/repository/blob/master/mediawiki_wiki/FAQ/DC_and_DCTERMS_Namespaces.md as accessed 11-25-17 ("[i]mplementers may freely choose to use these fifteen properties either in their original dc: variant (e.g., http://purl.org/dc/elements/1.1/creator) or in the dcterms: variant (e.g., http://purl.org/dc/terms/creator) depending on application requirements")) and I'm confused by the page. It looks like, for WHATWG purposes, one set of keywords should be made into synonyms for the other, except that DCMI supports using either set for new pages and not just for legacy content. I've had problems with DCMI's lack of clarity before, on various aspects, and I don't think emailing them will help, although they've changed some of their people over the years. What do you think? Nick (talk) 06:42, 3 December 2017 (UTC)
geo.position and icbm
The keyword icbm is a synonym for geo.position and geo.position is a synonym for icbm. Both can't be true, but each synonym is listed with "(different value syntax)". So, perhaps neither should be a synonym for the other, but maybe be moved into descriptions as cross-references. What should we do? Nick (talk) 06:52, 3 December 2017 (UTC)
- Done. Resolved by delisting each as synonym of other and listing each in description as near-duplicate of other. Nick (talk) 04:51, 7 January 2018 (UTC)
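The "(different value syntax)" note can be illustrated with a small parser. This is a sketch under an assumption that does not come from this thread: that geo.position separates latitude and longitude with a semicolon while icbm separates them with a comma, as is common in published examples. The function name `parse_coordinates` is hypothetical.

```python
def parse_coordinates(name, content):
    """Parse a latitude/longitude pair from a geo.position or icbm value."""
    key = name.strip().lower()
    if key == "geo.position":
        parts = content.split(";")  # assumed form: "lat;lon"
    elif key == "icbm":
        parts = content.split(",")  # assumed form: "lat, lon"
    else:
        raise ValueError("unsupported meta name: %r" % name)
    if len(parts) != 2:
        raise ValueError("expected two coordinates in %r" % content)
    lat, lon = (float(p.strip()) for p in parts)
    return lat, lon


print(parse_coordinates("geo.position", "48.17;16.33"))
print(parse_coordinates("ICBM", "48.17, 16.33"))
```

Under that assumption the two names carry the same information but cannot share a value verbatim, which fits describing them as near-duplicates rather than synonyms.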
ambivalence of synonymy for 2 geo.* keywords
Two names in geo.* seem to be only ambivalent synonyms:
- "geo.region": "Superseded by either geo.country alone or geo.country plus geo.a1. Name of geographic region to which the page is related. Content is specified by ISO-3166." Therefore, geo.region is a synonym, but of what?
- "geo.placename": "Superseded by geo.lmk. Name of geographic place to which the page is related."/"<meta name='geo.placename' content='London, Ontario'>" Therefore, geo.placename is a synonym, but the example seems to make it a synonym of geo.a3, not of geo.lmk.
What should we do?
Nick (talk) 06:57, 3 December 2017 (UTC)
FSDateCreation and FSDatePublish have exact same meaning
FSDateCreation and FSDatePublish, in the Meta Name Extensions table, have the same description: "[m]entions the date when this web page was created". The spec gives the same meaning for both: "[t]he date when this web page was created". Presumably, one description and one meaning are wrong, rather than that one name is a synonym for the other. I have emailed the organization behind the spec, asking whether FSDatePublish should be defined as "[t]he date when this web page was published" (and, by implication, whether the description at MetaExtensions should be "[m]entions the date when this web page was published" for FSDatePublish to avoid redundancy with FSDateCreation). I've invited their response here or at the MetaExtensions page and anyone else may wish to respond as well. Nick (talk) 07:04, 3 December 2017 (UTC)
copyright in (non-WHATWG-wiki) spec copying content from WHATWG Wiki
In writing specifications that are copied significantly from the WHATWG Wiki (I had written significantly into the wiki, which is under CC0 or (possibly for some older portions by other contributors) the MIT license), I think the law requires that the spec also be under CC0 or the MIT license. A short phrase is not copyrightable, so a short phrase can be copied without invoking this duty. Some other kinds of content are not copyrightable and thus can be freely copied. Fair use may or may not apply. Whether CC0 or the MIT license would apply to the spec depends on whether CC0 or the MIT license, respectively, applied to what is being copied from the wiki. (Identifying MIT-licensed content is complicated by the need to identify its author and then whether the author replaced the MIT license with CC0.) This is regardless of where the spec is published or hosted or by whom the spec is created. This may be true only in the U.S. or in other nations; I don't know about other nations' laws. This does not apply to an underlying concept or information but to the expression of it, so writing in original words generally allows other treatment of copyright. While CC0 might not have to apply to portions of the spec that were not copied from CC0 material, the MIT license might have to be applied to the entire spec. In general, CC0 or the MIT license would have to apply to some or all of an off-Wiki spec that is liberally copied from the Wiki, even to portions that were not themselves copied. (If that were not true, then I could photocopy the original works of Shakespeare and Mozart and, with or without making a few modern changes, secure a new copyright for myself on what they wrote, making a fortune from the process.) Nick (talk) 07:29, 3 December 2017 (UTC)