Latest revision as of 04:26, 29 December 2016
This document is obsolete.
For the current specification, see: https://github.com/validator/validator/wiki/Output-»-JSON
Italicized words, such as object, refer to JSON data types. “The "foo" datatype” refers to an object of type datatype that is the value associated with the key "foo" in the parent object.
Media Type
The Internet media type for this format is application/json. (Unless the callback extension is used, in which case the media type is application/javascript.)
Root Object
The root object is a JSON object. It has one mandatory key, "messages", and three optional keys, "url", "source" and "parseTree".
The values for these keys are described below.
The "messages" array
This array is an ordered collection of zero or more message objects.
Message objects
A message object has one mandatory key, "type", and seven optional keys, "subtype", "message", "extract", "offset", "url", "line" and "column".
The "type" string
The "type" string denotes the general class of the message. The permissible values are "info", "error" and "non-document-error".
"info" means an informational message or warning that does not affect the validity of the document being checked. "error" signifies a problem that causes the validation/checking to fail. "non-document-error" signifies an error that causes the checking to end in an indeterminate state because the document being validated could not be examined to the end. Examples of such errors include broken schemas, bugs in the validator and IO errors. (Note that when a schema has parse errors, they are first reported as errors and then a catch-all non-document-error is also emitted.)
The "subtype" string
The permissible value with "type":"info" is "warning", which means that the message seeks to warn the user about a formally conforming but in some way questionable issue. Otherwise, the message is taken to be generally informative.
The permissible value with "type":"error" is "fatal", which means that the error is an XML well-formedness error or, in the case of HTML, a condition that the implementor has opted to treat analogously to XML well-formedness errors (e.g. due to usability or performance considerations). Further errors are suppressed after a fatal error. In the absence of the "subtype" key, a "type":"error" message means a spec violation in general.
Permissible values with "type":"non-document-error" are: "io" (signifies an input/output error), "schema" (indicates that initializing a schema-based validator failed) and "internal" (indicates that the validator/checker found a bug in itself, ran out of memory, etc., but was still able to emit a message). In the absence of the "subtype" key, a "type":"non-document-error" message means a problem external to the document in general.
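The type and subtype combinations above can be read as a lookup table. The following Python sketch is illustrative only: the names PERMISSIBLE_SUBTYPES and classify are hypothetical and do not come from the validator. It returns the general meaning of a type (subtype None) whenever the subtype is absent or not permissible for that type.

```python
# Permissible "subtype" values for each "type", as listed in this section.
PERMISSIBLE_SUBTYPES = {
    "info": {"warning"},
    "error": {"fatal"},
    "non-document-error": {"io", "schema", "internal"},
}


def classify(message):
    """Return (type, subtype) for a message object (hypothetical helper).

    The subtype is None when it is absent or not permissible for the type.
    The result is None when "type" is missing or not a permissible value.
    """
    msg_type = message.get("type")
    if not isinstance(msg_type, str) or msg_type not in PERMISSIBLE_SUBTYPES:
        return None
    subtype = message.get("subtype")
    if not isinstance(subtype, str) or subtype not in PERMISSIBLE_SUBTYPES[msg_type]:
        subtype = None  # fall back to the general meaning of the type
    return (msg_type, subtype)
```

For example, classify({"type": "error"}) yields ("error", None), which this section describes as a spec violation in general.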
The "message" string
The "message" string represents a paragraph of text (suitable for rendering to the user as plain text without further processing) that is the message stated succinctly in natural language.
The "extract" string
The "extract" string represents an extract of the document source from around the point in source designated for the message by the "line" and "column" numbers.
The "offset" number
The "offset" number is a UTF-16 code unit index into the "extract" string. The index identifies the same UTF-16 code unit in the extract that the "line" and "column" numbers identify in the full source. The first code unit has the index 0.
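Because the offset counts UTF-16 code units, a client whose strings are indexed by code point has to translate it before slicing the extract. A minimal Python sketch follows; the function name utf16_offset_to_index is hypothetical. It assumes that an offset pointing into the middle of a surrogate pair should resolve to the following character, and that an offset past the end resolves to the end of the string. This specification does not define either case.

```python
def utf16_offset_to_index(text, offset):
    """Convert a UTF-16 code unit offset into a Python string index."""
    units = 0
    for index, char in enumerate(text):
        if units >= offset:
            return index
        # Characters outside the Basic Multilingual Plane take two code units.
        units += 2 if ord(char) > 0xFFFF else 1
    return len(text)
```

For an extract that contains only Basic Multilingual Plane characters, the returned index equals the offset.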
The "url" string
The "url" string, if present, must contain the URI (not IRI) of the resource with which the message is associated or the literal string “data:…” (the last character is U+2026) to signify that the message is associated with a data URI resource but the exact URI has been omitted. (If a client application wishes to show IRIs to human users, it is up to the client application to convert the URI into an IRI.)
If the "url" string is absent on the message object but present on the root object, the message is considered to be associated with the resource designated by the "url" string on the root object.
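The fallback rule can be sketched in a few lines of Python. The function name message_url is hypothetical, and returning None when neither object carries a "url" is an assumption of this sketch rather than something this specification states.

```python
def message_url(root, message):
    """Return the URL a message is associated with, or None if unknown."""
    url = message.get("url")
    if url is None:
        # Fall back to the "url" string on the root object.
        url = root.get("url")
    return url
```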
The "firstLine", "firstColumn", "lastLine" and "lastColumn" numbers
The "firstLine", "firstColumn", "lastLine" and "lastColumn" numbers indicate a range of source code associated with the message. The line and column numbers are one-based. The first line is line 1. The first column is column 1. Columns are counted by UTF-16 code units. A line break is considered to occupy the last column on the line it terminates.
The source lines and columns are approximate. For example, if a message is related to an attribute, the line and column may point to the first character of the start tag, the character after the start tag or to the attribute inside the tag depending on implementation. If a message is related to character data, the line and column may be inaccurate within a run of text e.g. due to buffering.
The "lastLine" number indicates the last line (inclusive) onto which the source range associated with the message falls.
The "firstLine" number indicates the first line onto which the source range associated with the message falls. If the key is missing, it is assumed to have the same value as "lastLine".
The "lastColumn" number indicates the last column (inclusive) onto which the source range associated with the message falls on the last line onto which it falls.
The "firstColumn" number indicates the first column onto which the source range associated with the message falls on the first line onto which it falls.
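The defaulting rule for "firstLine" can be sketched in Python as follows. The function name source_range is hypothetical. Returning None when "lastLine" is absent, and leaving a missing column as None, are assumptions of this sketch; this specification only defines the default for "firstLine".

```python
def source_range(message):
    """Return (first_line, first_column, last_line, last_column), or None.

    All values are one-based; columns count UTF-16 code units.
    """
    last_line = message.get("lastLine")
    if last_line is None:
        return None  # assumption: no range without a last line
    # A missing "firstLine" takes the same value as "lastLine".
    first_line = message.get("firstLine", last_line)
    return (
        first_line,
        message.get("firstColumn"),
        last_line,
        message.get("lastColumn"),
    )
```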
The "url" string
The "url" string, if present, must contain the URI (not IRI) of the document being checked or the literal string “data:…” (the last character is U+2026) to signify that the message is associated with a data URI resource but the exact URI has been omitted. (If a client application wishes to show IRIs to human users, it is up to the client application to convert the URI into an IRI.)
The "type" string
"success", "failure" or "indeterminate" for valid, invalid or inability to finish, respectively.
The "message" string
A human-readable string describing the result.
The "source" object
A "source" object has one mandatory key, "code", and two optional keys, "type" and "encoding".
The "code" string
The "code" string represents the source of the checked document as decoded to Unicode, with lone surrogates replaced with the REPLACEMENT CHARACTER and with line breaks replaced with U+000A LINE FEED.
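A minimal Python sketch of this normalisation is shown below. The function name normalize_source is hypothetical, and the sketch assumes that “line breaks” means CR LF pairs and lone CR characters; this specification does not enumerate them. In a Python string every surrogate code point is a lone surrogate, so a single character class is enough to find them.

```python
import re

# Any surrogate code point in a Python str is unpaired by definition.
_LONE_SURROGATE = re.compile(r"[\ud800-\udfff]")
# Assumption: CR LF and lone CR are the line breaks to normalise.
_LINE_BREAK = re.compile(r"\r\n|\r")


def normalize_source(text):
    """Replace lone surrogates with U+FFFD and line breaks with U+000A."""
    text = _LONE_SURROGATE.sub("\ufffd", text)
    return _LINE_BREAK.sub("\n", text)
```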
The "type" string
The "type" string represents the media type of the input without parameters.
The "encoding" string
The "encoding" string represents the charset media type parameter of the input.
The "parseTree" object
What to put here? SDF?
Example
{
  "url": "http://example.org/",
  "messages": [
    {
      "type" : "info",
      "subtype": "warning",
      "lastLine" : 20,
      "lastColumn" : 15,
      "url" : "http://example.com/",
      "message": "Trailing slash for void element",
      "extract": "<br/>",
      "hiliteStart" : 3,
      "hiliteLength" : 1
    },
    {
      "type" : "error",
      "subtype": "fatal",
      "lastLine" : 42,
      "lastColumn" : 17,
      "url" : "http://example.com/",
      "message": "Missing end tag for the “foo” element"
    }
  ],
  "source": {
    "code" : "...",
    "type" : "text/html",
    "encoding": "UTF-8"
  },
  "parseTree": {
    ...
  }
}
Processing Model
Clients that consume the message format are referred to as processors. They must use a parser conforming to RFC 4627 to parse the format.
If the root is not an object with the key "messages", the JSON text is deemed to be in an unknown format and not processable according to this processing model.
If the processor encounters a key–value pair in an object with a known key and an unknown value where a value enumerated in this specification is expected, the processor must ignore the key–value pair. If a processor encounters an object that is missing a required key (possibly because it was ignored under the previous rule), the processor must ignore the entire object. If a message object does not have a "line" number with a permissible value, a "column" number on the object must be ignored if present.
Processors must process the items in a way that is consistent with the semantics of the items.
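The ignore rules above could be applied roughly as in the following Python sketch. All names (load_messages, clean_message, PERMISSIBLE_SUBTYPES) are hypothetical. The sketch assumes that a permissible "line" value is an integer of at least 1, that "messages" must be an array, and that non-object items in it are skipped; none of this is spelled out here. Python's json module is used as the parser.

```python
import json

# Enumerated values from this specification: "type" -> permissible "subtype"s.
PERMISSIBLE_SUBTYPES = {
    "info": {"warning"},
    "error": {"fatal"},
    "non-document-error": {"io", "schema", "internal"},
}


def _is_one_of(value, allowed):
    return isinstance(value, str) and value in allowed


def clean_message(message):
    """Apply the ignore rules to one message object; None means "ignore it"."""
    cleaned = dict(message)
    # An unknown "type" value is ignored, which leaves the required key
    # missing, so the entire object is ignored.
    if not _is_one_of(cleaned.get("type"), PERMISSIBLE_SUBTYPES):
        return None
    # An unknown "subtype" value is ignored on its own.
    allowed = PERMISSIBLE_SUBTYPES[cleaned["type"]]
    if "subtype" in cleaned and not _is_one_of(cleaned["subtype"], allowed):
        del cleaned["subtype"]
    # Without a usable "line", any "column" is ignored.
    line = cleaned.get("line")
    if not (isinstance(line, int) and not isinstance(line, bool) and line >= 1):
        cleaned.pop("column", None)
    return cleaned


def load_messages(json_text):
    """Parse a response and return the message objects that survive the rules."""
    root = json.loads(json_text)
    if not isinstance(root, dict) or not isinstance(root.get("messages"), list):
        raise ValueError("unknown format: root object has no \"messages\" array")
    result = []
    for item in root["messages"]:
        cleaned = clean_message(item) if isinstance(item, dict) else None
        if cleaned is not None:
            result.append(cleaned)
    return result
```

The function copies each message before modifying it, so the parsed input is left untouched.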
Determining Outcome
The outcome of the validation process may be success, failure or indeterminate.
- If there are one or more non-document-error messages, the outcome is indeterminate.
- Else if there are one or more error messages, the outcome is failure.
- Else the outcome is success.
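The steps above translate directly into code. In this Python sketch the function name determine_outcome is hypothetical, and messages is assumed to be a list of message objects already parsed into dictionaries.

```python
def determine_outcome(messages):
    """Return "indeterminate", "failure" or "success" for a list of messages."""
    if any(m.get("type") == "non-document-error" for m in messages):
        return "indeterminate"
    if any(m.get("type") == "error" for m in messages):
        return "failure"
    return "success"
```

Note that "info" messages, including warnings, never change the outcome.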
Callback
The format described here may optionally be wrapped in a JavaScript function call.