<?xml version="1.0"?>
<feed xmlns="http://www.w3.org/2005/Atom" xml:lang="en">
	<id>https://wiki.whatwg.org/api.php?action=feedcontributions&amp;feedformat=atom&amp;user=Mrball</id>
	<title>WHATWG Wiki - User contributions [en]</title>
	<link rel="self" type="application/atom+xml" href="https://wiki.whatwg.org/api.php?action=feedcontributions&amp;feedformat=atom&amp;user=Mrball"/>
	<link rel="alternate" type="text/html" href="https://wiki.whatwg.org/wiki/Special:Contributions/Mrball"/>
	<updated>2026-04-30T06:43:41Z</updated>
	<subtitle>User contributions</subtitle>
	<generator>MediaWiki 1.39.3</generator>
	<entry>
		<id>https://wiki.whatwg.org/index.php?title=Talk:Sanitization_rules&amp;diff=2478</id>
		<title>Talk:Sanitization rules</title>
		<link rel="alternate" type="text/html" href="https://wiki.whatwg.org/index.php?title=Talk:Sanitization_rules&amp;diff=2478"/>
		<updated>2007-08-23T08:29:57Z</updated>

		<summary type="html">&lt;p&gt;Mrball: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;== Is the data URI scheme safe? ==&lt;br /&gt;
&lt;br /&gt;
* Rob Sayre says no and refers to a wikipedia article; however, I cannot see anything in the [http://en.wikipedia.org/wiki/Data:_URI_scheme article] that indicates the scheme is not safe.&lt;br /&gt;
** Looking at that wikipedia page, &amp;lt;code&amp;gt;data&amp;lt;/code&amp;gt; could only be added if it were followed by an asterisk, kinda like the 756* that I see popping up all over the place these days.  In particular, I don&#039;t see the use case which would justify the investment in sanitizing &amp;lt;code&amp;gt;text/html&amp;lt;/code&amp;gt; encoded as a data URI.  Not that it would be difficult, just hard to justify.  Perhaps a section could be added which lists safe content types when included in data URIs. -- [[User:Rubys|Rubys]] 03:48, 9 August 2007 (UTC)&lt;br /&gt;
* Data URIs should be sanitizable on a per-MIME-type basis.  Until a vulnerability is found for the text/plain MIME type, text/plain data URIs should be allowed, but other MIME types should be disallowed by default.  Other, safer types could then be allowed via a whitelist. -- [[User:Enricopulatzo|Enricopulatzo]] 16:49, 9 August 2007 (UTC)&lt;br /&gt;
** The word &amp;quot;default&amp;quot; puzzles me here.  The common use case here is small GIFs, JPEGs, and PNGs to be directly embedded in places like CSS and &amp;lt;img&amp;gt; tags.  If the associated MIME-types were to be white listed, under what condition would they &#039;&#039;&#039;not&#039;&#039;&#039; be allowed through? -- [[User:Rubys|Rubys]] 10:30, 10 August 2007 (UTC)&lt;br /&gt;
*** One would assume you&#039;d get yourself an actual checker that understands the PNG, JPEG or GIF format, decode the base64 payload, run it through the checker, and allow it through only if it is in a valid format. This is not strictly necessary: the browser will do this too and fail to display an image if it&#039;s not in a valid format. &amp;amp;mdash; &amp;lt;span style=&amp;quot;font-variant:small-caps;font-family:sans-serif;&amp;quot;&amp;gt;[[User:Edward Z. Yang|Edward Z. Yang]]&amp;lt;/span&amp;gt;&amp;lt;sup style=&amp;quot;font-family:serif;&amp;quot;&amp;gt;([[User talk:Edward Z. Yang|Talk]])&amp;lt;/sup&amp;gt; 18:57, 19 August 2007 (UTC)&lt;br /&gt;
** Whitelisting data url content-types seems like a good idea.  Whether to apply sanitization to the encoded content is up to the sanitizer.  White-listed content-types that may require additional sanitization could be flagged somehow.  [[User:JamesMSnell|JamesMSnell]]&lt;br /&gt;
&lt;br /&gt;
== Regarding the CSS &amp;lt;code&amp;gt;url()&amp;lt;/code&amp;gt; ==&lt;br /&gt;
&lt;br /&gt;
As I understand the proposal, all &amp;lt;code&amp;gt;url()&amp;lt;/code&amp;gt; properties are stripped or ignored.  Why is this important?  If it&#039;s only to keep people from linking to malicious scripts, then you&#039;ve made it difficult for designers to link in background images.&lt;br /&gt;
&lt;br /&gt;
Could we not dereference the URI to determine whether it&#039;s safe (i.e., a valid image, not a script)?  &amp;quot;Safe&amp;quot; files would then be stored on the server doing the sanitization, preventing users from swapping the innocent resource for a malicious one.&lt;br /&gt;
&lt;br /&gt;
--[[User:Roberthahn|Roberthahn]] 12:55, 10 August 2007 (UTC)&lt;br /&gt;
&lt;br /&gt;
: As far as I know, not even that&#039;s necessary. Most exploits involving &amp;lt;code&amp;gt;url()&amp;lt;/code&amp;gt; involve some variant of &amp;lt;code&amp;gt;url(&amp;quot;expression:alert(&#039;foo!&#039;);&amp;quot;)&amp;lt;/code&amp;gt;; simply pointing to a JavaScript file, such as &amp;lt;code&amp;gt;url(&amp;quot;http://example.com/evil.js&amp;quot;)&amp;lt;/code&amp;gt;, should not cause problems: the browser will download the file normally, figure out it&#039;s not a valid image, and not do anything else. There is, of course, the risk of external resource retrieval, but it&#039;s applicable to &amp;lt;code&amp;gt;img&amp;lt;/code&amp;gt; tags too. &amp;amp;mdash; &amp;lt;span style=&amp;quot;font-variant:small-caps;font-family:sans-serif;&amp;quot;&amp;gt;[[User:Edward Z. Yang|Edward Z. Yang]]&amp;lt;/span&amp;gt;&amp;lt;sup style=&amp;quot;font-family:serif;&amp;quot;&amp;gt;([[User talk:Edward Z. Yang|Talk]])&amp;lt;/sup&amp;gt; 18:54, 19 August 2007 (UTC)&lt;br /&gt;
&lt;br /&gt;
== Issues ==&lt;br /&gt;
&lt;br /&gt;
Here are a number of issues I see in these rules, based on my experiences with HTML Purifier:&lt;br /&gt;
&lt;br /&gt;
* Form elements are listed as acceptable. In certain contexts, they are, but they can also be used for the dark side: phishing and such. For example, the [http://it.slashdot.org/article.pl?sid=06/11/21/2319243&amp;amp;from=rss Mozilla Firefox Password Manager Bug] will automatically populate a form with login credentials even if the fields are hidden and the form points to another website. To be on the safe side, I would argue they are dangerous.&lt;br /&gt;
* Attributes really need to be paired up with their elements. &amp;lt;code&amp;gt;&amp;amp;lt;b summary=&amp;quot;&amp;quot; readonly ismap&amp;amp;gt;&amp;lt;/code&amp;gt; is harmless enough, but semantically it makes no sense and it will cause the page to stop validating.&lt;br /&gt;
* Attribute values need to be validated. Example: &amp;lt;code&amp;gt;id=&amp;quot;m*##$ASD83@&amp;quot;&amp;lt;/code&amp;gt;&lt;br /&gt;
* CSS property values need to be paired up with the appropriate properties. &amp;lt;code&amp;gt;overflow:aqua;&amp;lt;/code&amp;gt;&lt;br /&gt;
&lt;br /&gt;
If we take sanitization in the strictest sense of the word, that is, removing objectionable features, only my first point is valid. However, if you want standards-compliant output, all of these bases need to be covered. &amp;amp;mdash; &amp;lt;span style=&amp;quot;font-variant:small-caps;font-family:sans-serif;&amp;quot;&amp;gt;[[User:Edward Z. Yang|Edward Z. Yang]]&amp;lt;/span&amp;gt;&amp;lt;sup style=&amp;quot;font-family:serif;&amp;quot;&amp;gt;([[User talk:Edward Z. Yang|Talk]])&amp;lt;/sup&amp;gt; 15:20, 19 August 2007 (UTC)&lt;br /&gt;
&lt;br /&gt;
* CSS rules: should negative lengths be allowed? For example: &amp;lt;code&amp;gt;margin-top: -100px;&amp;lt;/code&amp;gt;&lt;br /&gt;
--[[User:Mrball|Mrball]] 08:29, 23 August 2007 (UTC)&lt;/div&gt;</summary>
		<author><name>Mrball</name></author>
	</entry>
</feed>