Validator.nu GNU Output

From WHATWG Wiki

Revision as of 09:16, 7 February 2008

This format is an adaptation of the GNU error format.

Media Type

This format has semantics beyond the semantics of text/plain. However, for compatibility and given the lack of a specific media type, this format uses the media type text/plain; charset=utf-8.

Character Encoding

This format is defined in terms of Unicode characters. For transport as bytes, the Unicode characters are encoded as UTF-8.

General Format

The format consists of messages represented as text lines.

Each line consists of the URI of the file that the message pertains to, U+003A COLON, optionally a position descriptor, U+003A COLON if there was a position descriptor, U+0020 SPACE, type descriptor, U+003A COLON, U+0020 SPACE, message and U+000A LINE FEED.

When there are no lines, there are no characters, not even a single U+000A LINE FEED.
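As an illustration of the line layout described above, here is a minimal Python sketch of a serializer. The names `format_line` and `serialize` and their parameters are hypothetical, chosen for this example; they are not part of Validator.nu.

```python
def format_line(uri, supertype, message, subtype=None, position=None):
    """Build one message line in the field order given above."""
    # The position descriptor and its trailing colon appear only together.
    head = f"{uri}:" if position is None else f"{uri}:{position}:"
    type_descriptor = supertype if subtype is None else f"{supertype} {subtype}"
    return f"{head} {type_descriptor}: {message}\n"


def serialize(lines):
    """Join message lines and encode them as UTF-8 for transport."""
    # With no lines the result is zero bytes, not a lone line feed.
    return "".join(lines).encode("utf-8")


example = format_line("http://example.org/", "error", "Bad value.",
                      position="3.5-3.12")
```

Here `example` holds a single line ending in U+000A LINE FEED, and `serialize([])` yields an empty byte string.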

URI of the File

The URI of the file is its IRI converted to the URI form or the empty string if the IRI of the document is not available. Note that the URI of the file may contain a colon (and often does). The GNU format does not specify what to do if the filename is not available or if it contains a colon.

Position Descriptor

The position descriptor indicates the source position that the message pertains to in terms of lines and columns. The first line is line number 1. The first character on a line occupies column number 1. Columns are counted as UTF-16 code units without tab expansion. (The GNU spec doesn't specify how non-ASCII is counted and specifies tab expansion to stops at every 8 columns.)
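Counting columns in UTF-16 code units means that a character outside the Basic Multilingual Plane occupies two columns, while a tab occupies exactly one. A minimal Python sketch of that counting rule follows; the helper name `utf16_column` is hypothetical, and it assumes the line is held as a Python string without lone surrogates.

```python
def utf16_column(line_text, char_index):
    """Return the 1-based column of line_text[char_index] in UTF-16 code units."""
    preceding = line_text[:char_index]
    # Each UTF-16 code unit is two bytes in the little-endian encoding.
    units_before = len(preceding.encode("utf-16-le")) // 2
    return units_before + 1
```

For example, in the string `"a\U0001D11Eb"` the final `b` is the third character but sits in column 4, because U+1D11E is stored as a surrogate pair.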

The position descriptor takes one of these formats:

  • line number
  • line number, U+002E FULL STOP, column number
  • start line number, U+002D HYPHEN-MINUS, end line number
  • start line number, U+002E FULL STOP, start column number, U+002D HYPHEN-MINUS, end line number, U+002E FULL STOP, end column number

Start and end are inclusive. The numbers consist of one or more characters in the range from U+0030 DIGIT ZERO to U+0039 DIGIT NINE interpreted as a decimal number.
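The four position descriptor formats can be recognized with one regular expression plus a consistency check. The following Python sketch is one possible reading of the rules above; `parse_position` and its tuple return shape are assumptions made for this example.

```python
import re

# ASCII digits only (U+0030 to U+0039), as the format requires.
_POSITION_RE = re.compile(
    r"([0-9]+)(?:\.([0-9]+))?(?:-([0-9]+)(?:\.([0-9]+))?)?"
)


def parse_position(text):
    """Return (start_line, start_column, end_line, end_column).

    Parts that are absent from the descriptor are returned as None.
    """
    match = _POSITION_RE.fullmatch(text)
    if match is None:
        raise ValueError(f"not a position descriptor: {text!r}")
    start_line, start_col, end_line, end_col = match.groups()
    # A range gives columns on both ends or on neither.
    if end_line is not None and (start_col is None) != (end_col is None):
        raise ValueError(f"mixed range form: {text!r}")
    parts = (start_line, start_col, end_line, end_col)
    return tuple(None if part is None else int(part) for part in parts)
```

A descriptor such as `7.3-9` is rejected because it matches none of the four listed formats.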

Type Descriptor

The type descriptor consists of a supertype descriptor optionally followed by U+0020 SPACE and a subtype descriptor.

The supertype descriptor denotes the general class of the message. The permissible values are info, error and non-document-error.

info means an informational message or warning that does not affect the validity of the document being checked. error signifies a problem that causes the validation/checking to fail. non-document-error signifies an error that causes the checking to end in an indeterminate state because the document being validated could not be examined to the end. Examples of such errors include broken schemas, bugs in the validator and IO errors. (Note that when a schema has parse errors, they are first reported as errors and then a catch-all non-document-error is also emitted.)

When the supertype descriptor is info the permissible value for the subtype descriptor is warning, which means that the message seeks to warn the user about a formally conforming but in some way questionable issue. Otherwise, the message is taken to be generally informative.

When the supertype descriptor is error the permissible value for the subtype descriptor is fatal, which means that the error is an XML well-formedness error or, in the case of HTML, a condition that the implementor has opted to treat analogously to XML well-formedness errors (e.g. due to usability or performance considerations). Further errors are suppressed after a fatal error. In the absence of the subtype descriptor, an error message means a spec violation in general.

When the supertype descriptor is non-document-error the permissible values for the subtype descriptor are io (signifies an input/output error), schema (indicates that initializing a schema-based validator failed) and internal (indicates that the validator/checker found a bug in itself, ran out of memory, etc., but was still able to emit a message). In the absence of the subtype descriptor, a non-document-error message means a problem external to the document in general.


Message

The message is a human-readable string that does not contain U+000A LINE FEED or U+000D CARRIAGE RETURN. It may be the empty string.
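Because the URI of the file may itself contain colons, a consumer cannot simply split a line on the first colon. The Python sketch below is one possible approach, not the reference parser: it relies on the assumption that a URI never contains U+0020 SPACE and that a subtype descriptor contains neither spaces nor colons. The name `parse_line` is hypothetical.

```python
import re

_LINE_RE = re.compile(
    r"(?P<uri>[^ ]*?):"
    r"(?:(?P<position>[0-9]+(?:\.[0-9]+)?(?:-[0-9]+(?:\.[0-9]+)?)?):)?"
    r" (?P<supertype>info|error|non-document-error)"
    r"(?: (?P<subtype>[^ :]+))?"
    r": (?P<message>[^\n\r]*)\n"
)


def parse_line(line):
    """Split one message line (including its trailing LINE FEED) into fields."""
    match = _LINE_RE.fullmatch(line)
    if match is None:
        raise ValueError("line is not in the expected format")
    # Optional fields that are absent come back as None.
    return match.groupdict()


fields = parse_line(
    "http://example.org/a:b:12.4: error fatal: Unclosed element: div.\n"
)
```

One limitation of this sketch is that a URI ending in a colon followed by digits, such as one with an explicit port and no path, cannot be told apart from a URI followed by a line number; the sketch reads the digits as a position descriptor.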

Processing Model

Clients that consume the message format are referred to as processors.

If the input contains a line that is not in the format described above, the input is deemed to be in an unknown format and not processable according to this processing model.

For forward compatibility, processors must treat unknown subtype descriptors as if there were no subtype descriptor when deciding the semantics according to the previous paragraphs.
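The forward-compatibility rule can be sketched in Python as a lookup against the subtype descriptors defined above. The names `KNOWN_SUBTYPES` and `effective_subtype` are hypothetical and exist only for this illustration.

```python
# Subtype descriptors defined for each supertype descriptor.
KNOWN_SUBTYPES = {
    "info": {"warning"},
    "error": {"fatal"},
    "non-document-error": {"io", "schema", "internal"},
}


def effective_subtype(supertype, subtype):
    """Return the subtype if it is defined for this supertype, else None.

    An unknown subtype is treated exactly like a missing one.
    """
    if subtype in KNOWN_SUBTYPES.get(supertype, set()):
        return subtype
    return None
```

With this helper, a line typed `error recoverable` (a made-up subtype) is handled as a plain `error`.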

Processors must process the lines in a way that is consistent with the semantics of the lines.

Determining Outcome

The outcome of the validation process may be success, failure or indeterminate.

  1. If there are one or more non-document-error messages, the outcome is indeterminate.
  2. Else if there are one or more error messages, the outcome is failure.
  3. Else the outcome is success.

Note that info messages can be suppressed by setting the input parameter level to error, in which case success is equivalent to this format containing no lines.
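The three outcome rules can be sketched in Python as follows. The function name `determine_outcome` is hypothetical; it takes the supertype descriptors of all messages in the output, in any order.

```python
def determine_outcome(supertypes):
    """Map the supertype descriptors of all messages to an outcome."""
    seen = set(supertypes)
    # Rule 1 takes precedence over rule 2, which takes precedence over rule 3.
    if "non-document-error" in seen:
        return "indeterminate"
    if "error" in seen:
        return "failure"
    return "success"
```

An empty sequence of supertypes yields success, which matches the case of output containing no lines.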

See also

  • Validator.nu Web Service Interface