Validator.nu Full-Stack Tests
Validator.nu has a framework for doing full-stack HTML5 validator testing in an implementation-independent manner. Currently, the framework is lacking tests.
The framework implements the design discussed at the HTML WG unconference session on validator testing at TPAC 2007.
The Front End
The front end for the system is the script named
validator-tester.py in the
The script is documented on a separate page.
The General Idea
The idea is to test the full validator through a Web service API, so that the tests exercise the whole aggregation of software components running together in a realistic configuration. Testing merely the parser or the validation layer risks testing them in a configuration that differs from what gets deployed, and without a real HTTP client connecting to a real HTTP server.
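As a rough illustration of going through the Web service rather than an internal component, the sketch below builds a request URL and pulls the first error out of a JSON response. It assumes the service accepts `doc` and `out=json` query parameters and returns a `messages` list whose entries carry `type`, `lastLine`, `firstColumn` and `lastColumn`. The endpoint constant, the function names and the sample response are illustrative only, and this is not a description of how validator-tester.py is written.

```python
import json
from urllib.parse import urlencode

# Assumed endpoint; substitute the service actually under test.
SERVICE = "https://validator.nu/"


def build_request_url(doc_uri, service=SERVICE):
    """Return a URL asking the service to validate doc_uri and reply in JSON."""
    return service + "?" + urlencode({"doc": doc_uri, "out": "json"})


def first_error(response_text):
    """Return the first message of type 'error', or None if there is none."""
    for message in json.loads(response_text).get("messages", []):
        if message.get("type") == "error":
            return message
    return None


# Made-up response for illustration only; not real validator output.
SAMPLE = json.dumps({"messages": [
    {"type": "info", "message": "note"},
    {"type": "error", "lastLine": 7, "firstColumn": 4, "lastColumn": 19,
     "message": "Bad value"},
]})
```

Fetching the URL with a real HTTP client (for example `urllib.request.urlopen`) and passing the body to `first_error` would exercise the deployed stack end to end.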
The tests are intended to be implementation-independent for two reasons:
- to make the tests reusable for different products
- to avoid clamping down the implementation details within a product
There's no cross-product way to give identifiers to HTML5 errors. For example, the identification of errors pertaining to element nesting would be different in a grammar-based implementation and in an assertion-based implementation. Moreover, with grammar-based implementations, only the first error is reliable.
Therefore, the testing framework does not test for error identity. It only tests that the first error elicited by a test case falls within a specified source code character range. (This assumes that implementations can report error location, which is a bad assumption for validators that validate the DOM inside a browser, but we’d be left with no useful assumptions without this one.)
Thus, a test suite consists of files on a public HTTP server and a reference database of URIs pointing to the tests and expected locations of the first error for each URI.
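The comparison described above can be sketched as follows. The reference-database layout (a mapping from URI to a line/column range), the `Range` type, the example URIs and the function name are all hypothetical; the format actually used by the framework may differ. For simplicity the sketch treats the reported error location as a single (line, column) position.

```python
from typing import NamedTuple, Optional, Tuple


class Range(NamedTuple):
    """Inclusive source range, given as (line, column) pairs."""
    start: Tuple[int, int]
    end: Tuple[int, int]


# Hypothetical reference database: test URI -> expected range of the first
# error, or None when no error is expected (an assumption of this sketch).
REFERENCE = {
    "http://example.org/tests/lang-en-UK.html": Range((7, 4), (7, 19)),
    "http://example.org/tests/valid.html": None,
}


def passes(expected: Optional[Range], actual: Optional[Tuple[int, int]]) -> bool:
    """Check a reported first-error position against the expected range.

    Error identity and message text are deliberately ignored.
    """
    if expected is None or actual is None:
        # Pass only if no error was expected and none was reported.
        return expected is None and actual is None
    # Tuples compare lexicographically: line first, then column.
    return expected.start <= actual <= expected.end
```

Because only the position is compared, two implementations that describe the same problem with entirely different messages can both pass the same test.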
Writing a Test Eliciting an Error
Identifying a Testable Conformance Criterion
First, a testable assertion needs to be identified. For example: “The lang attribute specifies the primary language for the element's contents and for any of the element's attributes that contain text. Its value must be a valid RFC 3066 language code, or the empty string.”
Writing a Test Violating the Conformance Criterion
The test document should have a violation of the previously identified conformance criterion as its first error. Preferably, this error should be the only error in the document.
We can read the successor of that RFC, find that en-UK is not a valid language code, and come up with a test document like this:
<!DOCTYPE html>
<html>
  <head>
    <title>en-UK</title>
  </head>
  <body>
    <p lang='en-UK'>en-UK</p>
  </body>
</html>
Note that a test that isn’t testing the doctype or tag inference should have the doctype and explicit tags for
body in order to avoid accidentally testing something related to tag inference.
Placing the Test Case on a Server
Next, the test file needs to be put on an HTTP server. If you are not testing the internal encoding declaration, the absence of an encoding declaration, or a particular encoding, you should serve the file with the header Content-Type: text/html; charset=utf-8 (or Content-Type: application/xhtml+xml for XHTML5 tests).
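For trying a test case locally before publishing it, one way to control the header is a small server built on Python's standard `http.server` module. This is only a sketch: the class and function names, the port, and the use of the `.xhtml` extension for XHTML5 tests are assumptions, and a public HTTP server would normally be configured through its own settings instead.

```python
from http.server import SimpleHTTPRequestHandler, ThreadingHTTPServer


class TestCaseHandler(SimpleHTTPRequestHandler):
    """Serve files from the current directory with explicit Content-Type values."""

    # Override the types for the two extensions used by test cases and keep
    # the handler's defaults for everything else.
    extensions_map = {
        **SimpleHTTPRequestHandler.extensions_map,
        ".html": "text/html; charset=utf-8",
        ".xhtml": "application/xhtml+xml",
    }


def serve(port=8000):
    """Serve the current directory on localhost until interrupted."""
    ThreadingHTTPServer(("127.0.0.1", port), TestCaseHandler).serve_forever()
```

Calling `serve()` from the directory that holds the test files makes them available on the chosen local port with the headers shown above.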
In this case, the sample test has the URI
Generating a Reference Result
If this is a regression test (i.e. a check for the error has already been implemented in Validator.nu), the reference result can be generated using Validator.nu.