Validator.nu Full-Stack Tests

From WHATWG Wiki

Validator.nu has a framework for doing full-stack HTML5 validator testing in an implementation-independent manner. Currently, the framework is lacking tests.

The framework implements the design discussed at the HTML WG unconference session on validator testing at TPAC 2007.

The Front End

The front end for the system is the script named validator-tester.py in the test-harness/ directory.

The script is documented on a separate page.

The General Idea

The idea is to test the full validator through a Web service API, so that the software components are exercised together in a realistic configuration. Testing merely the parser or the validation layer risks testing them in a configuration different from the one that gets deployed, and without a real HTTP client connecting to a real HTTP server.

The tests are intended to be implementation-independent for two reasons:

  1. to make the tests reusable for different products
  2. to avoid constraining the implementation details within a product

There's no cross-product way to give identifiers to HTML5 errors. For example, the identification of errors pertaining to element nesting would be different in a grammar-based implementation and in an assertion-based implementation. Moreover, with grammar-based implementations, only the first error is reliable.

Therefore, the testing framework does not test for error identity. It only tests that the first error elicited by a test case falls within a specified source code character range. (This assumes that implementations can report error location, which is a bad assumption for validators that validate the DOM inside a browser, but we’d be left with no useful assumptions without this one.)

Thus, a test suite consists of files on a public HTTP server and a reference database of URIs pointing to the tests and expected locations of the first error for each URI.
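The comparison described above can be sketched roughly as follows. This is only an illustration, not the harness's actual code: the function name first_error_in_range is hypothetical, the key names are borrowed from the JSON dump shown later on this page, and reading "falls within" as "the reported range lies inside the expected range" is an assumption.

```python
def first_error_in_range(reported_errors, expected):
    """Return True if the first reported error lies inside the expected range.

    reported_errors: list of dicts with firstLine/firstColumn/lastLine/lastColumn.
    expected: one dict with the same four keys (the reference result).
    Only the first error is looked at; later errors are ignored.
    """
    if not reported_errors:
        # A test that should elicit an error but elicits none fails.
        return False
    first = reported_errors[0]
    # Compare positions as (line, column) tuples, which order lexicographically.
    start = (first["firstLine"], first["firstColumn"])
    end = (first["lastLine"], first["lastColumn"])
    expected_start = (expected["firstLine"], expected["firstColumn"])
    expected_end = (expected["lastLine"], expected["lastColumn"])
    return expected_start <= start and end <= expected_end
```

Error identity and message text play no part in the comparison, which is what keeps the check implementation-independent.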

Writing a Test Eliciting an Error

Identifying a Testable Conformance Criterion

First, a testable assertion needs to be identified. For example: “The ampersand must be followed by one of the names given in the named character references section, using the same case.”

Writing a Test Violating the Conformance Criterion

The test document should have a violation of the previously identified conformance criterion as its first error. Preferably, this error should be the only error in the document.

We can go read the list of named characters and come up with a test document like this:

<!DOCTYPE html>
<html>
<head>
<title>Unescaped ampersand</title>
</head>
<body>
<p>&amgueao</p>
</body>
</html>

Note that a test that isn’t testing the doctype or tag inference should have the doctype and explicit tags for html, head, title and body in order to avoid accidentally testing something related to tag inference. It is a good idea to make the content of title hint at what is being tested.
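As a rough aid for test authors, the check below reports which of the recommended explicit constructs are missing from a test document. It is a sketch under assumptions: the names ExplicitMarkupCollector and missing_explicit_markup are hypothetical and not part of the Validator.nu tooling. It relies on Python's html.parser, which reports only the tags actually present in the source and does not infer missing ones.

```python
from html.parser import HTMLParser

# Start tags the text above recommends writing out explicitly.
REQUIRED_TAGS = ("html", "head", "title", "body")


class ExplicitMarkupCollector(HTMLParser):
    """Records whether a doctype and which start tags appear in the source."""

    def __init__(self):
        super().__init__()
        self.has_doctype = False
        self.start_tags = set()

    def handle_decl(self, decl):
        # For "<!DOCTYPE html>" the parser passes "DOCTYPE html".
        if decl.strip().lower().startswith("doctype"):
            self.has_doctype = True

    def handle_starttag(self, tag, attrs):
        self.start_tags.add(tag)


def missing_explicit_markup(source):
    """Return a list of recommended constructs absent from the source."""
    collector = ExplicitMarkupCollector()
    collector.feed(source)
    collector.close()
    missing = [] if collector.has_doctype else ["doctype"]
    missing += [tag for tag in REQUIRED_TAGS if tag not in collector.start_tags]
    return missing
```

An empty list means the document spells out the doctype and all four tags. A test that deliberately targets the doctype or tag inference would of course be expected to show omissions here.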

Placing the Test Case on a Server

Next, the test file needs to be put on an HTTP server. If you are not testing the internal encoding declaration, the absence of an encoding declaration or a particular encoding, you should serve the file with the header Content-Type: text/html; charset=utf-8 (or Content-Type: application/xhtml+xml for XHTML5 tests).
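For trying a test locally before uploading it to a public server, the headers above could be produced with Python's standard http.server module. This is a minimal sketch, not part of the test framework: the names CONTENT_TYPES, content_type_for, TestCaseHandler and serve are hypothetical, as are the file extension mapping and the port number.

```python
import os
from http.server import HTTPServer, SimpleHTTPRequestHandler

# Assumed mapping from file extension to the Content-Type values named above.
CONTENT_TYPES = {
    ".html": "text/html; charset=utf-8",
    ".xhtml": "application/xhtml+xml",
}


def content_type_for(path):
    """Return the header value for a test file, or None for other files."""
    extension = os.path.splitext(path)[1].lower()
    return CONTENT_TYPES.get(extension)


class TestCaseHandler(SimpleHTTPRequestHandler):
    """Serves files from the current directory with the headers above."""

    def guess_type(self, path):
        # Fall back to the default guess for anything that is not a test file.
        return content_type_for(path) or super().guess_type(path)


def serve(port=8000):
    """Serve the current directory until interrupted (port is an assumption)."""
    HTTPServer(("", port), TestCaseHandler).serve_forever()
```

Calling serve() from the directory that holds the test files makes them available on the local machine only. The reference database still needs URIs on a public HTTP server.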

In this case, the sample test has the URI http://hsivonen.iki.fi/test/moz/unescaped-ampersand.html.

Generating a Reference Result

If this is a regression test (i.e. a check for the error has already been implemented in Validator.nu), the reference result can be generated using Validator.nu.

First, use the HTML UI to check that the right error is given and that the right piece of source is highlighted.

Then, use validator-tester.py to dump the result in a JSON file:

python validator-tester.py dumpuri http://hsivonen.iki.fi/test/moz/unescaped-ampersand.html temp.json

You can then open the temporary file in your favorite text editor:

edit temp.json

You should see contents like this:

{
  "http://hsivonen.iki.fi/test/moz/unescaped-ampersand.html": [
    {
      "firstColumn": 7,
      "firstLine": 7,
      "lastColumn": 7,
      "lastLine": 7,
      "message": "Text after \u201c&\u201d did not match an entity name. Probable cause: \u201c&\u201d should have been escaped as \u201c&amp;\u201d."
    }
  ]
}

The outermost JSON object has a single key-value pair with the URI as the key and an array of errors as the value. In this case, the array has only one element. If it had more, the remaining elements would be merely informative for a human reader and would be ignored by the test harness when comparing results.

The error is a JSON object with five keys: four for the error range and one for the message. The message is merely informative for humans and is not compared by the test harness.
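To make the structure concrete, the following sketch reads a dump in this format and keeps only what matters for comparison: the range of the first error for each URI. The function name first_error_ranges is hypothetical. The sketch assumes nothing about the file beyond the layout shown above.

```python
import json


def first_error_ranges(dump_text):
    """Map each URI in a dump to the range of its first error.

    The range is a (firstLine, firstColumn, lastLine, lastColumn) tuple,
    or None if the array of errors for that URI is empty. The message and
    any errors after the first are deliberately left out.
    """
    dump = json.loads(dump_text)
    ranges = {}
    for uri, errors in dump.items():
        if not errors:
            ranges[uri] = None
            continue
        first = errors[0]
        ranges[uri] = (
            first["firstLine"],
            first["firstColumn"],
            first["lastLine"],
            first["lastColumn"],
        )
    return ranges
```

With the contents of temp.json read into a string, the function returns a dictionary keyed by the test URI.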