Validator.nu Web Service Interface


Revision as of 20:11, 26 November 2007

This is an inline-commentable, updated wiki copy of the original article.


First, I assume there is some level of interest in doing RELAX NG / Schematron validation and HTML5 conformance checking. Next, it would be nice to enable applications that deal with documents to make these checks automatically in addition to having the functionality available for human operators as a Web app. For example, a content management system might check the input it is given.

Java apps could just integrate a private copy of the Free Software back end of the validation / conformance checking service. However, non-Java apps would benefit from having the validation / conformance checking service running out of process and having an interface for talking to the out-of-process Java service. The service instance could be hosted publicly or as a local copy. Even some Java developers would elect to use such a service instead of integrating the back end as part of their own app.

Input Modes

The schemas are expected to be relatively static. Therefore, I think preloading them into the service or letting the service retrieve them is sufficient. Identification by URI works in both cases.

What needs different input modes is the document that is checked.

I think the following modes would make sense:

  • Document URI as a GET parameter; the service retrieves the document by URI (already implemented).
  • Document in a data: URI as a GET parameter.
  • Document POSTed as the HTTP entity body (the preferred Web service mode; already implemented).
  • Document POSTed as an application/x-www-form-urlencoded form field value.
  • Document POSTed as a multipart/form-data file.

In the first three modes, additional parameters would be communicated in the URI query string. In the last two modes, additional parameters would be communicated the same way the corresponding form fields are, that is, as application/x-www-form-urlencoded or multipart/form-data fields.
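As a rough illustration of the first three modes, a client might build its requests as sketched below. The endpoint address and the parameter name doc are assumptions made for this example, not names defined in this article, and the requests are only constructed, not sent.

```python
import base64
from urllib.parse import urlencode
from urllib.request import Request

# Hypothetical endpoint; substitute the address of a real service instance.
ENDPOINT = "https://validator.example/"


def get_by_uri(document_uri, **params):
    """Mode 1: the service retrieves the document named by a GET parameter."""
    # "doc" is an assumed parameter name used only for this sketch.
    query = urlencode({"doc": document_uri, **params})
    return Request(f"{ENDPOINT}?{query}", method="GET")


def get_by_data_uri(document, media_type="text/html", **params):
    """Mode 2: the document bytes travel inside a data: URI."""
    payload = base64.b64encode(document).decode("ascii")
    return get_by_uri(f"data:{media_type};base64,{payload}", **params)


def post_entity_body(document, media_type="text/html", **params):
    """Mode 3: the document is the entity body; parameters stay in the query string."""
    url = f"{ENDPOINT}?{urlencode(params)}" if params else ENDPOINT
    return Request(url, data=document, method="POST",
                   headers={"Content-Type": media_type})
```

In all three cases any extra parameters end up in the query string, which is what keeps the entity body free to carry nothing but the document in the third mode.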

I don’t particularly like the last two modes, but they are needed to address feature requests and for parity with other services. Also, unlike the first three modes, the last two modes need companion UI changes, which is not nice. As a further complication, the last two don’t come naturally with a Content-Type for dispatching to an HTML5 parser or to an XML parser.
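To make the parser dispatch concrete, here is a minimal sketch of choosing a parser from a Content-Type value. The function name and the exact set of media types are assumptions for illustration; the article does not specify them. The None result shows the problem with the form-based modes: without a usable Content-Type, the service has nothing to dispatch on.

```python
def choose_parser(content_type):
    """Return "html5" or "xml" for a Content-Type value, or None if undecidable."""
    if not content_type:
        # Form-based submissions may arrive without a per-document type.
        return None
    # Drop parameters such as "; charset=utf-8" and normalize case.
    media_type = content_type.split(";", 1)[0].strip().lower()
    if media_type == "text/html":
        return "html5"
    if media_type in ("application/xml", "text/xml") or media_type.endswith("+xml"):
        return "xml"
    return None
```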

All these input modes would share the same “service endpoint URI” (and the same servlet class). The different cases can be distinguished by the HTTP method and, in the POST cases, by the Content-Type request header.
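The distinction could be sketched roughly as follows. This is an illustration in Python rather than the servlet itself; the mode labels and the doc parameter name are assumptions made for the example.

```python
def classify_input_mode(method, content_type=None, query=None):
    """Map an incoming request to one of the five input modes."""
    query = query or {}
    method = method.upper()
    # Ignore parameters such as "; boundary=..." when comparing media types.
    media_type = (content_type or "").split(";", 1)[0].strip().lower()

    if method == "GET":
        # "doc" is an assumed parameter name for the document URI.
        doc = query.get("doc", "")
        return "data-uri" if doc.lower().startswith("data:") else "document-uri"
    if method == "POST":
        if media_type == "application/x-www-form-urlencoded":
            return "form-field"
        if media_type == "multipart/form-data":
            return "file-upload"
        # Any other Content-Type is taken to describe the document itself.
        return "entity-body"
    raise ValueError(f"unsupported HTTP method: {method}")
```

Treating every other POST Content-Type as the document's own type is one possible design choice; it keeps the entity-body mode open to any document format the service can check.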

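Output Modes

A Web service probably calls for an XML output format for maximal tool chain integration even though the current HTML output format makes sense for browsers and can carry all the necessary data.

I think the following modes could make sense:

  • HTML with microformat-style class annotations (default output).
  • XHTML with microformat-style class annotations (append &out=xhtml to URL).
  • XML format designed specifically for Validator.nu (append &out=xml to URL).
  • JSON (append &out=json to URL).
  • Human-readable plain text (append &out=text to URL).

Not Implemented

  • GNU error format (needs a better spec)
  • Relaxed-compatible (lacks a spec)
  • Unicorn-compatible (hoping that Unicorn changes instead)
  • W3C Validator-compatible SOAP (legacy)
  • EARL (domain modeling mismatch)

For the HTML and XHTML output formats, there could be an option for suppressing the input form. The output default should be HTML for the browser-targeted input formats. However, the custom XML format might be a reasonable default when the input document was POSTed as the entity body.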
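A client could select an implemented output mode by adding an out parameter to the request URL (out=xhtml, out=xml, out=json or out=text), leaving it off for the default HTML. The helper below is a minimal sketch of that; the function name and the example endpoint are hypothetical.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Implemented output modes; "html" is the default and needs no parameter.
OUTPUT_MODES = {"html", "xhtml", "xml", "json", "text"}


def with_output_mode(url, mode):
    """Return url with its out= parameter set for the requested output mode."""
    if mode not in OUTPUT_MODES:
        raise ValueError(f"unknown or unimplemented output mode: {mode}")
    parts = urlsplit(url)
    # Drop any existing out= value so the parameter is never duplicated.
    pairs = [(key, value)
             for key, value in parse_qsl(parts.query, keep_blank_values=True)
             if key != "out"]
    if mode != "html":
        pairs.append(("out", mode))
    return urlunsplit(parts._replace(query=urlencode(pairs)))
```

For example, with_output_mode("https://validator.example/?doc=x", "json") yields the same URL with &out=json appended.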