The purpose of this page is to summarize information and arguments relevant to whether it makes sense for web specification copyright licenses to permit forking.
For our purposes, there are two different approaches to forking.
- Language forking: An alternative development path for the language, created without using existing specification text as the basis of the work.
- Specification forking: An alternative development path that uses existing specification text as the basis of a derivative work.
Because a language fork is not a derivative work under copyright law, it can be made without the permission of the copyright holder. A specification fork, however, requires a permissive license or explicit permission from the copyright holder.
This page focuses on whether the W3C's HTML5 specification should allow forks, as the WHATWG version always has.
Existing forks of HTML
(based on IRC comment by Maciej; possibly not everything here is strictly a fork, could use classification and explanation work)
- XHTML Basic
- HTML 4 Mobile
- XHTML 2
- HTML5 Edition for Web Authors
- HTML: The Markup Language
W3C-published but disavowed forks
- ISO HTML: Diff spec.
- XHTML-MP: Apparently no free download. Reportedly not widely used.
- WML: Apparently no free download. Reportedly very widely used and extremely harmful.
- WTVML: Apparently no free download.
- WHATWG HTML: No comment.
- CE-HTML: Apparently only a preview is available for free, which contains almost nothing beyond the table of contents.
- EPUB: Diff spec.
- HTML 4.1: Accessibility-oriented fork. Seems to attempt to respec from scratch, but it doesn't sound like it has anyone actually implementing it.
Not really forks
- HTML5 for Web developers: A subset of WHATWG HTML.
- Philip Taylor's annotated canvas spec: No normative differences, or at least isn't supposed to have normative differences.
One argument presented in favor of allowing forks is that if the W3C ever makes poor decisions that compromise the quality of its standards, other organizations should have the right to issue competing standards, with implementers agreeing to follow the better standard. When the W3C owns the rights to large, established specifications and does not permit others to fork them, this becomes harder. Looking at cases where standards authors have abandoned an existing standards group to form their own should give an idea of whether this tends to be a good or bad thing.
W3C competing with IETF
The W3C itself was founded at least partly because Tim Berners-Lee felt that standardization at the IETF wasn't working well. As he writes in his book, Weaving the Web (pp. 62-3):
Progress in the [IETF's] URI working group was slow, partly due to the number of endless philosophical rat holes down which technical conversations would disappear. . . . Sometimes there was a core philosophy being argued, and from my point of view that was not up for compromise. Sometimes there was a basically arbitrary decision (like which punctuation characters to use) that I had already made, and changing it would only mean that millions of Web browsers and existing links would have to be changed.
In practice, the W3C has wound up cooperating with the IETF more than competing.
WHATWG competing with W3C
After HTML 4.01 was finalized in 1999, no new features were added to the HTML markup language other than in XHTML variants that browsers didn't implement. Thus HTML as a standard markup language did not progress at all between about 1999 and 2004. In 2004, Mozilla and Opera requested permission from the W3C to work on improving non-XML-based versions of HTML, and they were denied permission. Apple, Mozilla, and Opera then founded the WHATWG, which began work on a new version of the HTML specification outside the W3C. In a couple of years, the WHATWG rewrote the HTML standard from scratch, made it drastically more precise, and added many new highly demanded features (such as video and canvas) that were previously only available through proprietary plugins. In 2007, the W3C formed an HTML Working Group again to work on non-XML-based versions of HTML, based on and in conjunction with the work of the WHATWG.
XHTML Modularization has the goal of "providing a means for subsetting and extending XHTML, a feature needed for extending XHTML's reach onto emerging platforms". Although this is not technically a fork (in the sense that it does not imply taking the W3C spec text and rewriting portions of it), the ability to create new HTML-derived specs that add or replace functionality carries many of the same risks that are brought up in the context of forking.
Supposed risks of forking
- "A national government could create its own intentionally incompatible national version of the html specification in order to prevent general Web access from within that country": This argument is wrong at multiple levels:
- A national government could exempt itself from copyright anyway. Even the US government does not consider itself bound by copyright law in all cases.
- A specification is only needed if there is a desire for multiple independent interoperable implementations, i.e., if competition is being encouraged. But what government would on the one hand encourage competition amongst browser vendors and on the other hand prevent those browser vendors from implementing other versions of HTML?
- In practice, it would be economically impractical to control the Web by developing a parallel HTML that is similar enough that the W3C spec could be used as a basis, but different enough that it is incompatible with the Web's HTML. In reality, countries that want such control use content-filtering software (e.g., China).
- "Other undesirable forks could be for device specific variants of specs where it would be better for those groups to come into W3C [...] than to splinter html": When the W3C forks HTML (as it has in the past, see the list above), it is just as bad as when anyone else does. There is nothing special about the W3C here.
- In practice, vendors who want to add device-specific features commonly just add new features without forking the spec. For instance, Apple has made up its own proprietary features such as <link rel="apple-touch-icon"> and <meta name="apple-mobile-web-app-capable"> to allow web pages to integrate better with the iOS browser. Forking the whole HTML spec would be a waste of time.
Advantages of allowing forking
- It encourages the W3C to do a good job, because if the W3C does not do a good job, it risks losing relevance. This has already been shown to be beneficial both to the W3C and the Web at large with HTML5 itself: the W3C tried to strong-arm the Web into abandoning HTML, and when the WHATWG worked on it instead, the W3C changed its mind. This demonstrates the benefit of allowing forking (though in this case, it turns out copyright restrictions were no barrier to forking, because HTML4 wasn't good enough to be a useful starting point and we instead started from scratch).
Reasons preventing forking is not necessary
- If the W3C is the best place to work on specs, then people will not need to fork; they will just work on them at the W3C.