URL

From WHATWG Wiki
Revision as of 15:20, 10 November 2012 by Annevk (talk | contribs) (add IDNA notes)

This documents research and notes around URLs for the URL standard.

Implementations

Tests

Variants of the following code (runs in Live DOM Viewer) are useful to test which code points are URL escaped in browsers:

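<!DOCTYPE html>
<script>
// w() is the output function available in Live DOM Viewer.
var a = document.createElement("a")

var i = 0
var cp = 0x100 // exclusive upper bound of the code points to test

while (i < cp) {
  // Place the code point before the "@" and let the browser reserialize.
  var input = "http://x" + String.fromCharCode(i) + "@x/"
  a.href = input
  // A change in length means the code point was escaped or otherwise altered.
  if (a.href.length != input.length) {
    w(a.href)
  }
  i++
}
</script>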
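
Since variants of this test mostly differ in where the code point is placed, the loop can be factored into a helper. The following is only a sketch: findChangedCodePoints and its parameters (prefix, suffix, serialize, limit) are hypothetical names, not part of the original test, and the caller supplies the serializer.

```javascript
// Hypothetical helper (not from the original test): report every code point
// below `limit` whose presence changes the length of the serialized URL.
// `serialize` is caller-supplied and takes and returns a string.
function findChangedCodePoints(prefix, suffix, serialize, limit) {
  var changed = []
  for (var i = 0; i < limit; i++) {
    var input = prefix + String.fromCharCode(i) + suffix
    var output
    try {
      output = serialize(input)
    } catch (e) {
      // Assumption: a serializer that throws is recorded with a null output.
      changed.push({ codePoint: i, output: null })
      continue
    }
    // Same heuristic as the original test: a length change signals escaping.
    if (output.length !== input.length) {
      changed.push({ codePoint: i, output: output })
    }
  }
  return changed
}
```

In Live DOM Viewer the serializer could be function (s) { a.href = s; return a.href }, and a different prefix and suffix (for example "http://x/?" and "") would probe another part of the URL.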

Parsing

JavaScript libraries

For improving the API we might want to take inspiration from:

Schemes

Currently the parser does not separate out the query. This could be problematic for about and maybe mailto.

  • data
  • javascript
  • mailto
  • about (uselessly defined in RFC 6694)

IDNA

In the notes below, IDNA2003 means IDNA2003 applied with an updated version of Unicode (in theory, IDNA2003 restricts Unicode to version 3.2?).

What algorithms do we need?

  • ToLabels(domain string) -> list of labels (trailing dot) or failure
  • ToASCII(label) -> ASCII label
  • ToUnicode(label) -> Unicode label

ToLabels should also do validation and the like. ToASCII and ToUnicode ideally never fail, because ToLabels has already ensured validity.
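
To make the proposed split concrete, here is a minimal sketch of ToLabels and ToASCII. It is not the URL standard's algorithm. The validation in toLabels is an assumption: it lowercases as a stand-in for IDNA mapping, drops one trailing dot, and rejects empty labels. toASCII uses the Punycode encoding from RFC 3492 without overflow checks. ToUnicode would be the matching decoder and is left out for brevity.

```javascript
// ToLabels: returns a list of labels, or null for failure.
// Assumed validation only; real IDNA mapping and checks are omitted.
function toLabels(domain) {
  var labels = domain.toLowerCase().split(".")
  // A single trailing dot produces a final empty label; drop it.
  if (labels.length > 1 && labels[labels.length - 1] === "") {
    labels.pop()
  }
  for (var i = 0; i < labels.length; i++) {
    if (labels[i] === "") {
      return null
    }
  }
  return labels
}

// Punycode digit: 0..25 map to a..z, 26..35 map to 0..9.
function digit(d) {
  return String.fromCharCode(d < 26 ? 97 + d : 22 + d)
}

// Bias adaptation from RFC 3492 (base 36, tmin 1, tmax 26, skew 38, damp 700).
function adapt(delta, numPoints, firstTime) {
  delta = firstTime ? Math.floor(delta / 700) : delta >> 1
  delta += Math.floor(delta / numPoints)
  var k = 0
  while (delta > 455) {
    delta = Math.floor(delta / 35)
    k += 36
  }
  return k + Math.floor(36 * delta / (delta + 38))
}

// ToASCII: never fails for a label that toLabels accepted.
function toASCII(label) {
  var input = Array.from(label, function (c) { return c.codePointAt(0) })
  var basic = input.filter(function (c) { return c < 128 })
  if (basic.length === input.length) {
    return label // already ASCII
  }
  var output = String.fromCharCode.apply(null, basic)
  if (basic.length > 0) {
    output += "-"
  }
  var n = 128, delta = 0, bias = 72, handled = basic.length
  while (handled < input.length) {
    // Next smallest code point that has not been encoded yet.
    var m = Math.min.apply(null, input.filter(function (c) { return c >= n }))
    delta += (m - n) * (handled + 1)
    n = m
    input.forEach(function (c) {
      if (c < n) {
        delta++
      }
      if (c === n) {
        // Emit delta as a variable-length base-36 integer.
        var q = delta
        for (var k = 36; ; k += 36) {
          var t = k <= bias ? 1 : k >= bias + 26 ? 26 : k - bias
          if (q < t) break
          output += digit(t + (q - t) % (36 - t))
          q = Math.floor((q - t) / (36 - t))
        }
        output += digit(q)
        bias = adapt(delta, handled + 1, handled === basic.length)
        delta = 0
        handled++
      }
    })
    delta++
    n++
  }
  return "xn--" + output
}
```

A caller would run toLabels once, stop on null, and otherwise map each label, for example toLabels(domain).map(toASCII).join(".").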