Validator.nu Full-Stack Tests

From WHATWG Wiki

Revision as of 13:24, 9 September 2008

Validator.nu has a framework for doing full-stack HTML5 validator testing in an implementation-independent manner. Currently, the framework is lacking tests.

The framework implements the design discussed at the HTML WG unconference session on validator testing at TPAC 2007.

The Front End

The front end for the system is the script named validator-tester.py in the test-harness/ directory.

The script is documented on a separate page.

The General Idea

The idea is to test the full validator through a Web service API in order to test the aggregation of software components running together with a realistic configuration. Testing merely the parser or the validation layer risks testing them in a different configuration than what gets deployed, and without a real HTTP client connecting to a real HTTP server.

The tests are intended to be implementation-independent for two reasons:

  1. to make the tests reusable for different products
  2. to avoid clamping down the implementation details within a product

There's no cross-product way to give identifiers to HTML5 errors. For example, the identification of errors pertaining to element nesting would be different in a grammar-based implementation and in an assertion-based implementation. Moreover, with grammar-based implementations, only the first error is reliable.

Therefore, the testing framework does not test for error identity. It only tests that the first error elicited by a test case falls within a specified source code character range. (This assumes that implementations can report error location, which is a bad assumption for validators that validate the DOM inside a browser, but we’d be left with no useful assumptions without this one.)

Thus, a test suite consists of files on a public HTTP server and a reference database of URIs pointing to the tests and expected locations of the first error for each URI.
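The comparison described above can be sketched in Python. This is a minimal illustration and not the logic of validator-tester.py: the function name first_error_in_range, the containment rule, and the dictionary layout (borrowed from the JSON dump shown later on this page) are all assumptions.

```python
def _start(position):
    # (line, column) tuples compare lexicographically, which matches
    # the order of positions in a source file.
    return (position["firstLine"], position["firstColumn"])


def _end(position):
    return (position["lastLine"], position["lastColumn"])


def first_error_in_range(reported_errors, expected):
    """Return True if the first reported error lies inside the expected range.

    Only the first error is examined; any later errors are ignored.
    The message text is never compared.
    """
    if not reported_errors:
        return False
    first = reported_errors[0]
    return _start(expected) <= _start(first) and _end(first) <= _end(expected)


# Hypothetical expected range covering line 7, columns 4 to 11.
expected = {"firstLine": 7, "firstColumn": 4, "lastLine": 7, "lastColumn": 11}
reported = [
    {"firstLine": 7, "firstColumn": 7, "lastLine": 7, "lastColumn": 7,
     "message": "implementation-specific text"},
]
result = first_error_in_range(reported, expected)
```

Because only positions are compared, two validators that word the same error differently can still pass the same test.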

Writing a Test Eliciting an Error

Identifying a Testable Conformance Criterion

First, a testable assertion needs to be identified. For example: “The lang attribute specifies the primary language for the element's contents and for any of the element's attributes that contain text. Its value must be a valid RFC 3066 language code, or the empty string.”

Writing a Test Violating the Conformance Criterion

The test document should have a violation of the previously identified conformance criterion as its first error. Preferably, this error should be the only error in the document.

We can read the list of named character references, pick a name that is not on it (here, amgueao), and come up with a test document like this:

<!DOCTYPE html>
<html>
<head>
<title>Unescaped ampersand</title>
</head>
<body>
<p>&amgueao</p>
</body>
</html>

Note that a test that isn’t testing the doctype or tag inference should have the doctype and explicit tags for html, head, title and body in order to avoid accidentally testing something related to tag inference.

Placing the Test Case on a Server

Next, the test file needs to be put on an HTTP server. If you are not testing the internal encoding declaration, the absence of an encoding declaration or a particular encoding, you should serve the file with the header Content-Type: text/html; charset=utf-8 (or Content-Type: application/xhtml+xml for XHTML5 tests).
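For trying out the headers locally, a small server can be sketched with Python's standard http.server module. This is only an illustration under assumptions: the names content_type_for, TestCaseHandler and serve are hypothetical, and so is the mapping of the .html and .xhtml extensions to the two header values. A server on your own machine is not reachable by a public validator, so real tests still need a public HTTP server.

```python
from http.server import HTTPServer, SimpleHTTPRequestHandler

HTML_TYPE = "text/html; charset=utf-8"
XHTML_TYPE = "application/xhtml+xml"


def content_type_for(path):
    """Pick the Content-Type header value for a test file by extension."""
    lower = path.lower()
    if lower.endswith(".xhtml"):
        return XHTML_TYPE
    if lower.endswith(".html"):
        return HTML_TYPE
    return "application/octet-stream"


class TestCaseHandler(SimpleHTTPRequestHandler):
    """Serves files from the current directory with the headers above."""

    def guess_type(self, path):
        # SimpleHTTPRequestHandler uses guess_type() to fill in the
        # Content-Type header, so overriding it is enough here.
        return content_type_for(str(path))


def serve(port=8000):
    """Serve the current directory until interrupted."""
    HTTPServer(("", port), TestCaseHandler).serve_forever()
```

Calling serve() from the directory that holds the test files starts the server; tests of a particular encoding or of a missing encoding declaration would need a different header and are not covered by this sketch.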

In this case, the sample test has the URI http://hsivonen.iki.fi/test/moz/unescaped-ampersand.html.

Generating a Reference Result

If this is a regression test (i.e. a check for the error has already been implemented in Validator.nu), the reference result can be generated using Validator.nu.

First, use the HTML UI to check that the right error is given and that the right piece of source is highlighted.
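The address for this check can be built by percent-encoding the test URI into the doc query parameter, which is the form the HTML UI link on the original wiki page uses. The sketch below assumes that form; the helper name html_ui_url is hypothetical.

```python
from urllib.parse import quote

VALIDATOR_UI = "http://html5.validator.nu/"


def html_ui_url(test_uri):
    """Build the HTML UI address that checks the given test URI."""
    # safe="" makes quote() escape ":" and "/" as well, so the whole
    # test URI travels as a single query parameter value.
    return VALIDATOR_UI + "?doc=" + quote(test_uri, safe="")


url = html_ui_url("http://hsivonen.iki.fi/test/moz/unescaped-ampersand.html")
```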

Then, use validator-tester.py to dump the result in a JSON file:

python validator-tester.py dumpuri http://hsivonen.iki.fi/test/moz/unescaped-ampersand.html temp.json

You can then open the temporary file in your favorite text editor:

edit temp.json

You should see contents like this:

{
  "http://hsivonen.iki.fi/test/moz/unescaped-ampersand.html": [
    {
      "firstColumn": 7,
      "firstLine": 7,
      "lastColumn": 7,
      "lastLine": 7,
      "message": "Text after \u201c&\u201d did not match an entity name. Probable cause: \u201c&\u201d should have been escaped as \u201c&amp;\u201d."
    }
  ]
}

The outermost JSON object has a single key-value pair with the URI as the key and an array of errors as the value. In this case, the array has only one element. If it had more, the remaining elements would be merely informative for a human and would be ignored by the test harness when comparing results.

The error is a JSON object that has five keys: four for the error range and one for the message. The message is merely informative for humans; it isn’t compared by the test harness.
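To make the structure concrete, here is a short Python sketch that reads such a dump and keeps only the range of the first error for each URI. It is an illustration, not code from validator-tester.py: the function name first_error_ranges, the tuple order, and the use of None for an empty error array are assumptions, and the message in the sample is abbreviated.

```python
import json

# Abbreviated copy of the dump above, kept as raw text so that the
# \u201c escapes are decoded by the JSON parser, not by Python.
SAMPLE_DUMP = r'''
{
  "http://hsivonen.iki.fi/test/moz/unescaped-ampersand.html": [
    {
      "firstColumn": 7,
      "firstLine": 7,
      "lastColumn": 7,
      "lastLine": 7,
      "message": "Text after \u201c&\u201d did not match an entity name."
    }
  ]
}
'''

RANGE_KEYS = ("firstLine", "firstColumn", "lastLine", "lastColumn")


def first_error_ranges(dump_text):
    """Map each URI in a dump to the range of its first error, or None."""
    dump = json.loads(dump_text)
    ranges = {}
    for uri, errors in dump.items():
        if errors:
            first = errors[0]  # later errors are informative only
            ranges[uri] = tuple(first[key] for key in RANGE_KEYS)
        else:
            ranges[uri] = None
    return ranges


ranges = first_error_ranges(SAMPLE_DUMP)
```

The message key is deliberately left out of the result, mirroring the fact that it plays no part in the comparison.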