A user account is required in order to edit this wiki, but we've had to disable public user registrations due to spam.

To request an account, ask an autoconfirmed user on Chat (such as one of these permanent autoconfirmed members).

URL: Difference between revisions

From WHATWG Wiki
Jump to navigation Jump to search
No edit summary
(put parser sketch online)
Line 7: Line 7:
* http://trac.webkit.org/browser/trunk/Source/WebCore/platform/KURLGoogle.cpp
* http://trac.webkit.org/browser/trunk/Source/WebCore/platform/KURLGoogle.cpp
* http://trac.webkit.org/browser/trunk/Source/WebCore/platform/network/DataURL.cpp (data URLs)
* http://trac.webkit.org/browser/trunk/Source/WebCore/platform/network/DataURL.cpp (data URLs)
==Model==
URL (.href)
- invalid?
- scheme (.protocol)
- authority
  - username (proposed .username)
  - password (proposed .password)
  - ip/host (.hostname)
  - port (.port)
- path (.pathname)
- query (.search)
- fragment (.hash)
==Parsing==
parse (urlstr, optional baseURL)
  url = new URL
  SCHEME-OR-RELATIVE
    FIRST SCHEME CHARACTER
      if ...
        -> REMAINING SCHEME CHARACTERS
        -> NO SCHEME
    REMAINING SCHEME CHARACTERS
      if curChar is ":"
        -> SCHEME
      ...
        -> NO SCHEME
        -> REMAINING SCHEME CHARACTERS
  SCHEME
    if url.scheme is not hierarchical  (data:)
      -> NON-HIERARCHICAL
    if url.scheme is hierarchical and url.scheme is baseURL.scheme (http:?test)
      -> RELATIVE
    if url.scheme is hierarchical (https://test.com/)
      -> AUTHORITY
  NO SCHEME
    if baseURL.scheme is not hierarchical
      url.invalid = true
      return url
    else
      -> RELATIVE
  NON-HIERARCHICAL
    if curChar is "#"
      FRAGMENT
    else
      ...
  RELATIVE
    if urlstr is empty
      url = baseURL
      url.fragment = null
      return url
    if curChar is either "/" or "\"
      if urlstr second character is either "/" or "\"
        url.scheme = baseURL.scheme
        AUTHORITY
      else
        url.scheme = baseURL.scheme
        url.authority = baseURL.authority
        PATH
    if curChar is "?"
        url.scheme = baseURL.scheme
        url.authority = baseURL.authority
        url.path = baseURL.path
        QUERY
    if curChar is "#"
        url.scheme = baseURL.scheme
        url.authority = baseURL.authority
        url.path = baseURL.path
        url.query = baseURL.query
        FRAGMENT
    else
      url.scheme = baseURL.scheme
      url.authority = baseURL.authority
      prepend input by baseURL.path up to the last /
      PATH
  AUTHORITY
    if "/" or "\"
      AUTHORITY
    else
      AUTHORITY-AFTER-SLASHES
  AUTHORITY-AFTER-SLASHES
    ...
  PATH
    if curChar is "?"
      QUERY
    if curChar is "#"
      FRAGMENT
  QUERY
    if curChar is "#"
      FRAGMENT
  FRAGMENT
    ...


[[Category:Spec coordination]]
[[Category:Spec coordination]]

Revision as of 12:02, 15 June 2012

This documents research and notes around the URL specification.

Implementations

Model

URL (.href)
- invalid?
- scheme (.protocol)
- authority
  - username (proposed .username)
  - password (proposed .password)
  - ip/host (.hostname)
  - port (.port)
- path (.pathname)
- query (.search)
- fragment (.hash)

Parsing

parse (urlstr, optional baseURL)
 url = new URL

 SCHEME-OR-RELATIVE
   FIRST SCHEME CHARACTER
     if ...
       -> REMAINING SCHEME CHARACTERS
       -> NO SCHEME
   REMAINING SCHEME CHARACTERS
     if curChar is ":"
       -> SCHEME
     ...
       -> NO SCHEME
       -> REMAINING SCHEME CHARACTERS

 SCHEME
   if url.scheme is not hierarchical  (data:)
     -> NON-HIERARCHICAL
   if url.scheme is hierarchical and url.scheme is baseURL.scheme (http:?test)
     -> RELATIVE
   if url.scheme is hierarchical (https://test.com/)
     -> AUTHORITY

 NO SCHEME
   if baseURL.scheme is not hierarchical
     url.invalid = true
     return url
   else
     -> RELATIVE

 NON-HIERARCHICAL
   if curChar is "#"
     FRAGMENT
   else
     ...

 RELATIVE
   if urlstr is empty
     url = baseURL
     url.fragment = null
     return url

   if curChar is either "/" or "\"
     if urlstr second character is either "/" or "\"
       url.scheme = baseURL.scheme
       AUTHORITY
     else
       url.scheme = baseURL.scheme
       url.authority = baseURL.authority
       PATH

   if curChar is "?"
       url.scheme = baseURL.scheme
       url.authority = baseURL.authority
       url.path = baseURL.path
       QUERY

   if curChar is "#"
       url.scheme = baseURL.scheme
       url.authority = baseURL.authority
       url.path = baseURL.path
       url.query = baseURL.query
       FRAGMENT

   else
     url.scheme = baseURL.scheme
     url.authority = baseURL.authority
     prepend input by baseURL.path up to the last /
     PATH

 AUTHORITY
   if "/" or "\"
     AUTHORITY
   else
     AUTHORITY-AFTER-SLASHES

 AUTHORITY-AFTER-SLASHES
   ...

 PATH
   if curChar is "?"
     QUERY
   if curChar is "#"
     FRAGMENT

 QUERY
   if curChar is "#"
     FRAGMENT

 FRAGMENT
   ...