CDATA Escapes: Difference between revisions

From WHATWG Wiki
Revision as of 11:49, 12 August 2009

Requirements

Hard Requirements

  • It must be possible to have the string "</script>" in a string literal in inline JavaScript without having to use JS-level escapes. (This possibility may be limited to scripts that use the <!-- ... --> "Hide from old browsers" pattern.)
  • It must be possible to have "<!--" and "-->" in string literals in inline JavaScript without having to use JS-level escapes.
  • The parser must not rewind and reparse the input with different rules.

Medium Requirements

  • It should be possible to have the string "<!--" in xmp without the rest of the page getting eaten up into the xmp element.
  • It should be possible to have "<!--" near the start of a script or style element without a matching "-->", and the trailing part of the page still shouldn't get eaten up into the script or style element.
  • Pages authored naively for HTML5-parsing-enabled UAs shouldn't be XSS risks in legacy UAs.
  • When the author uses comment-like syntax in the fallback markup in iframe, noembed or noframes, the comment-like syntax should span the same character run that it would if it were parsed as markup.

Nice to Have Requirements

  • It would be nice for the rest of the page not to get eaten up when the author omits </title> accidentally or mistypes it as <title>.

Proposal

  • Remove <!-- ... --> escapes from title, textarea and xmp.
  • Make the closing condition for <!-- ... --> in iframe, noembed and noframes match the comment closing conditions exactly.
  • Remove <!-- ... --> escapes from script and style and introduce a novel string literal detector heuristic.

String Literal Detector Heuristic

CDATA

  • < → TAG_OPEN_NON_PCDATA with CDATA as return state
  • / → CDATA_SLASH
  • " → CDATA_DOUBLE_QUOTED
  • ' → CDATA_SINGLE_QUOTED
  • Anything else → Stay

CDATA_SLASH

  • < → TAG_OPEN_NON_PCDATA with CDATA as return state
  • / → CDATA_LINE_COMMENT
  • * → CDATA_COMMENT
  • Anything else → CDATA

CDATA_DOUBLE_QUOTED

  • " or Line feed → CDATA
  • \ → CDATA_DOUBLE_QUOTED_BACKSLASH
  • Anything else → Stay

CDATA_SINGLE_QUOTED

  • ' or Line feed → CDATA
  • \ → CDATA_SINGLE_QUOTED_BACKSLASH
  • Anything else → Stay

CDATA_DOUBLE_QUOTED_BACKSLASH

  • Line feed → CDATA
  • Anything else → CDATA_DOUBLE_QUOTED

CDATA_SINGLE_QUOTED_BACKSLASH

  • Line feed → CDATA
  • Anything else → CDATA_SINGLE_QUOTED

CDATA_LINE_COMMENT

  • Line feed → CDATA
  • < → TAG_OPEN_NON_PCDATA with CDATA_LINE_COMMENT as return state
  • Anything else → Stay

CDATA_COMMENT

  • * → CDATA_COMMENT_ASTERISK
  • < → TAG_OPEN_NON_PCDATA with CDATA_COMMENT as return state
  • Anything else → Stay

CDATA_COMMENT_ASTERISK

  • / → CDATA
  • < → TAG_OPEN_NON_PCDATA with CDATA_COMMENT as return state
  • Anything else → CDATA_COMMENT
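
The states above can be sketched as a table-driven scanner. The following is a minimal illustration, not a tokenizer implementation. The transition table is transcribed literally from the lists above. TAG_OPEN_NON_PCDATA is not defined in this section, so the helper looks_like_end_tag is a hypothetical stand-in for it: it assumes that "<" ends the element only when followed by "/", the element name, and then whitespace, "/" or ">", and that the scanner otherwise resumes in the return state. The names TABLE, OTHER, looks_like_end_tag and find_end_tag are invented for this sketch.

```python
OTHER = "anything else"  # sentinel; never equal to a single input character
TAG_OPEN = "TAG_OPEN_NON_PCDATA"

# (TAG_OPEN, state) encodes "TAG_OPEN_NON_PCDATA with <state> as return state".
TABLE = {
    "CDATA": {
        "<": (TAG_OPEN, "CDATA"),
        "/": "CDATA_SLASH",
        '"': "CDATA_DOUBLE_QUOTED",
        "'": "CDATA_SINGLE_QUOTED",
        OTHER: "CDATA",
    },
    "CDATA_SLASH": {
        "<": (TAG_OPEN, "CDATA"),
        "/": "CDATA_LINE_COMMENT",
        "*": "CDATA_COMMENT",
        OTHER: "CDATA",
    },
    "CDATA_DOUBLE_QUOTED": {
        '"': "CDATA",
        "\n": "CDATA",
        "\\": "CDATA_DOUBLE_QUOTED_BACKSLASH",
        OTHER: "CDATA_DOUBLE_QUOTED",
    },
    "CDATA_SINGLE_QUOTED": {
        "'": "CDATA",
        "\n": "CDATA",
        "\\": "CDATA_SINGLE_QUOTED_BACKSLASH",
        OTHER: "CDATA_SINGLE_QUOTED",
    },
    "CDATA_DOUBLE_QUOTED_BACKSLASH": {"\n": "CDATA", OTHER: "CDATA_DOUBLE_QUOTED"},
    "CDATA_SINGLE_QUOTED_BACKSLASH": {"\n": "CDATA", OTHER: "CDATA_SINGLE_QUOTED"},
    "CDATA_LINE_COMMENT": {
        "\n": "CDATA",
        "<": (TAG_OPEN, "CDATA_LINE_COMMENT"),
        OTHER: "CDATA_LINE_COMMENT",
    },
    "CDATA_COMMENT": {
        "*": "CDATA_COMMENT_ASTERISK",
        "<": (TAG_OPEN, "CDATA_COMMENT"),
        OTHER: "CDATA_COMMENT",
    },
    "CDATA_COMMENT_ASTERISK": {
        "/": "CDATA",
        "<": (TAG_OPEN, "CDATA_COMMENT"),
        OTHER: "CDATA_COMMENT",
    },
}


def looks_like_end_tag(text, lt_index, name):
    """Hypothetical stand-in for TAG_OPEN_NON_PCDATA (not defined in this section).

    Assumption: "<" starts an end tag only when followed by "/", the element
    name (ASCII case-insensitive), and then whitespace, "/" or ">".
    """
    name_end = lt_index + 2 + len(name)
    if text[lt_index + 1:name_end].lower() != "/" + name.lower():
        return False
    following = text[name_end:name_end + 1]
    return following != "" and following in " \t\n\f/>"


def find_end_tag(text, name="script"):
    """Return the index of the "<" that ends the element, or None if there is none."""
    state = "CDATA"
    for i, ch in enumerate(text):
        row = TABLE[state]
        target = row.get(ch, row[OTHER])
        if isinstance(target, tuple):
            # "<" seen in a state that hands off to TAG_OPEN_NON_PCDATA.
            _, return_state = target
            if looks_like_end_tag(text, i, name):
                return i
            state = return_state  # assumed: not an end tag, so resume
        else:
            state = target
    return None
```

Under these assumptions, find_end_tag('var s = "</script>"; </script>') skips the "</script>" inside the string literal and reports the one after it, because the quoted states have no "<" transition. A "</script>" inside a line comment or block comment still ends the element, since those states do hand "<" to TAG_OPEN_NON_PCDATA. An unterminated string literal falls back to CDATA at the next line feed, so a missing closing quote does not eat up the rest of the page.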