URL

The contents of this page, URL, and all edits made to this page in its history, are hereby released under the CC0 Public Domain Dedication, as described in WHATWG Wiki:Copyrights.

This page documents research and notes on URLs for the URL Standard.

Implementations

Tests

Variants of the following code (which runs in the Live DOM Viewer) are useful for testing which code points browsers URL-escape:

<!DOCTYPE html>
<script>
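// Test anchor; assigning to href runs the browser's URL parser.
var a = document.createElement("a")

var i = 0
var cp = 0x100 // exclusive upper bound on the code points tested

while ( i < cp ) {
  a.href = "http://x" + String.fromCharCode(i) + "@x/"
  // A different length means the code point was escaped, dropped, or moved.
  if(a.href.length != "http://x)@x/".length) {
    w(a.href) // w() is the Live DOM Viewer's logging function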
  }
  i++
}
</script>
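
As one possible variant, the same length comparison can be sketched against the URL API instead of an a element, which also lets it run outside a browser. The helper name changedUserinfoCodePoints and the shape of its return value are assumptions made for this illustration; results depend on the URL parser of the environment it runs in.

```javascript
// Hypothetical helper (not part of the original test): collects every code
// unit below `limit` whose presence changes the length of the serialized URL.
function changedUserinfoCodePoints(limit) {
  const expectedLength = "http://x)@x/".length;
  const changed = [];
  for (let i = 0; i < limit; i++) {
    let href;
    try {
      href = new URL("http://x" + String.fromCharCode(i) + "@x/").href;
    } catch (e) {
      href = null; // the parser rejected the input outright
    }
    if (href === null || href.length !== expectedLength) {
      changed.push({ codePoint: i, href: href });
    }
  }
  return changed;
}

// Print each affected code point in hexadecimal next to the serialized URL.
for (const entry of changedUserinfoCodePoints(0x100)) {
  console.log(entry.codePoint.toString(16), entry.href);
}
```

As in the original test, a length difference does not by itself distinguish escaping from other changes, such as a code point being stripped or acting as a delimiter.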

Parsing

JavaScript libraries

For improving the API we might want to take inspiration from:

Schemes

Apart from the scheme types listed below, the URL Standard identifies "relative schemes", which are used when parsing a URL into a parsed URL.

Purpose-specific schemes

URL schemes are purpose-specific schemes if they only work in one context. These only work for WebSocket:

  • ws
  • wss

Fetch schemes

URL schemes are fetch schemes if fetching the URL results in either a network error or a resource with an associated MIME type (potentially sniffed).

ftp
http
https 
These can all be used by the corresponding protocol directly.
file 
Needs platform-specific interpretation and mapping to a resource on the local file system.
data 
Needs its resource and MIME type information retrieved from its scheme data/query.
blob
about 
The resource is effectively the result of passing scheme data to a hash table (not sure if case-sensitive or not; definitely no percent decoding). Query and fragment can be used by the resource.

(The same-origin definition should maybe account for about/blob/data.)

Navigate schemes

  • The "fetch schemes" -> use "fetch"
  • javascript
  • Not the "purpose-specific schemes" -> error
  • All other schemes (including "external schemes")

External schemes

Depending on the context, schemes not listed above will either launch an external application or result in a network error. Examples:

  • mailto
  • skype
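
The scheme categories above can be sketched as a small lookup. This is only an illustration of the notes on this page, not an algorithm from the URL Standard: the function name navigateDisposition and the returned strings are assumptions, and the "other" outcome stands for the context-dependent case (external application or network error).

```javascript
// Scheme lists copied from the sections above.
const PURPOSE_SPECIFIC_SCHEMES = new Set(["ws", "wss"]);
const FETCH_SCHEMES = new Set(["ftp", "http", "https", "file", "data", "blob", "about"]);

// Hypothetical helper: reports what navigating to `input` would do,
// following the "Navigate schemes" list.
function navigateDisposition(input) {
  // URL.protocol is lowercased and includes the trailing ":", so strip it.
  const scheme = new URL(input).protocol.slice(0, -1);
  if (FETCH_SCHEMES.has(scheme)) return "fetch";
  if (scheme === "javascript") return "javascript";
  if (PURPOSE_SPECIFIC_SCHEMES.has(scheme)) return "error";
  return "other"; // includes external schemes such as mailto and skype
}

console.log(navigateDisposition("https://example.com/"));
console.log(navigateDisposition("wss://example.com/socket"));
console.log(navigateDisposition("mailto:user@example.com"));
```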

IDNA

Definitions

  • IDNA2003+: IDNA2003 with Unicode updated to the latest version. (So not NFKC from Unicode 3.2, although Python might do that...) Restrictions on display might be in place.
  • IDNA2008+: IDNA2008 with RFC 5895 section 2 mapping and IDNA2003 domain label separators. Display is restricted to IDNA2008, lookup is unrestricted (everything gets Punycoded).

Implementations

Tests

Algorithms

  • ToLabels(domain string) -> ASCII-label list (empty label at the end signifies trailing dot) or failure.
  • ToASCII(Unicode-label) -> ASCII-label.
  • ToUnicode(ASCII-label) -> Unicode label.

(For convenience maybe ToASCII and ToUnicode should accept lists too.)
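
A rough sketch of what ToLabels could look like, assuming the platform URL parser is an acceptable stand-in for the mapping and Punycode steps. The name toLabels, the use of null for failure, and the up-front character check are all assumptions made for this illustration; the URL parser also applies host-specific behavior (for example, treating numeric input as an IPv4 address), so this is not a faithful implementation of either IDNA variant defined above.

```javascript
// Hypothetical helper: returns a list of ASCII labels, or null on failure.
// An empty final label signifies a trailing dot.
function toLabels(domain) {
  // Reject input the URL parser would not treat purely as a host.
  if (domain === "" || /[\/\\?#@:\[\]\s]/.test(domain)) {
    return null;
  }
  let host;
  try {
    // The parser lowercases the host and converts non-ASCII labels to Punycode.
    host = new URL("http://" + domain + "/").hostname;
  } catch (e) {
    return null;
  }
  return host.split(".");
}

console.log(toLabels("example.com."));
console.log(toLabels("bücher.example"));
```

Accepting a list in ToASCII and ToUnicode, as suggested above, would then amount to mapping the per-label function over the result.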

UI

Note that this has potential security implications too, but does not matter for interoperability.

Notes