A user account is required in order to edit this wiki, but we've had to disable public user registrations due to spam.

To request an account, ask an autoconfirmed user on Chat (such as one of these permanent autoconfirmed members).

URL: Difference between revisions

From WHATWG Wiki
Jump to navigation Jump to search
Line 35: Line 35:
==Parsing==
==Parsing==


parse (urlstr, optional baseURL)
See https://github.com/annevk/url/blob/master/url.js for now.
  url = new URL
  tokenize(urlstr)
  SCHEME START
    if char is in ALPHA
      buffer += char
      -> SCHEME
    else
      unconsume char
      -> NO SCHEME
  SCHEME
    if char is in ALPHA / DIGIT / "+" / "-" / "."
      buffer += char
      -> continue
    elif char is ":"
      url.scheme = buffer.toASCIILowercase()
      buffer = ""
      if url.scheme is not hierarchical (data:)
        -> PATH
      elif baseURL and url.scheme is baseURL.scheme (http:?test)
        -> HIERARCHICAL
      else  (https://test.com/)
        -> AUTHORITY START
    else:
      input.reset()
      -> NO SCHEME
  NO SCHEME
    if not baseURL or baseURL.scheme is not hierarchical
      url.invalid = true
      return url
    else
      -> HIERARCHICAL
  HIERARCHICAL
    if char is EOI (end-of-input)
      url = baseURL
      url.fragment = null
      exit
    elif char is "/" or char is "\"
      if next char "/" or next char is "\"
        url.scheme = baseURL.scheme
        -> AUTHORITY START
      else
        url.scheme = baseURL.scheme
        url.authority = baseURL.authority
        -> HIERARCHICAL PATH
    elif char is "?"
        url.scheme = baseURL.scheme
        url.authority = baseURL.authority
        url.path = baseURL.path
        -> QUERY
    elif char is "#"
        url.scheme = baseURL.scheme
        url.authority = baseURL.authority
        url.path = baseURL.path
        url.query = baseURL.query
        -> FRAGMENT
    else
      url.scheme = baseURL.scheme
      url.authority = baseURL.authority
      prepend input by baseURL.path up to the last /
      -> HIERARCHICAL PATH
  AUTHORITY START
    if char is "/" or char is "\"
      -> continue
    else
      -> AUTHORITY
  AUTHORITY
    ...
  PATH (has no query)
    if char is "#"
      FRAGMENT
    else
      path += char
  HIERARCHICAL PATH
    if char is "?"
      -> QUERY
    if char is "#"
      -> FRAGMENT
    else
      buffer += char
  QUERY
    if char is "#"
      -> FRAGMENT
  FRAGMENT
    ...


[[Category:Spec coordination]]
[[Category:Spec coordination]]

Revision as of 15:10, 17 September 2012

This documents research and notes around the URL specification.

Implementations

Tests

Terminology

URL string
What you find in attribute values, property values, method parameters, etc.
parse a URL string url using base URL base
Turning a URL string into a URL by using a base URL.
URL
An in-memory representation of a URL with various properties as elaborated on by model below.
URL interface/object
JavaScript representation of a URL.

Model

URL (.href)
- invalid?
- scheme (.protocol)
- authority
  - username (proposed .username)
  - password (proposed .password)
  - ip/host (.hostname)
  - port (.port)
- path (.pathname)
- query (.search)
- fragment (.hash)

Parsing

See https://github.com/annevk/url/blob/master/url.js for now.