A user account is required in order to edit this wiki, but we've had to disable public user registrations due to spam.
To request an account, ask an autoconfirmed user on Chat (such as one of these permanent autoconfirmed members).
URL: Difference between revisions
Jump to navigation
Jump to search
(→Parsing: update some states with new findings based on offline experiments in javascript) |
m (→Parsing) |
||
Line 57: | Line 57: | ||
-> PATH | -> PATH | ||
elif baseURL and url.scheme is baseURL.scheme (http:?test) | elif baseURL and url.scheme is baseURL.scheme (http:?test) | ||
-> | -> HIERARCHICAL | ||
else (https://test.com/) | else (https://test.com/) | ||
-> AUTHORITY START | -> AUTHORITY START | ||
Line 69: | Line 69: | ||
return url | return url | ||
else | else | ||
-> | -> HIERARCHICAL | ||
HIERARCHICAL | |||
if char is EOI (end-of-input) | if char is EOI (end-of-input) | ||
url = baseURL | url = baseURL | ||
Line 114: | Line 114: | ||
... | ... | ||
PATH ( | PATH (has no query) | ||
if | if char is "#" | ||
FRAGMENT | FRAGMENT | ||
else | else | ||
path += char | |||
HIERARCHICAL PATH | HIERARCHICAL PATH |
Revision as of 11:01, 18 July 2012
This documents research and notes around the URL specification.
Implementations
- http://trac.webkit.org/browser/trunk/Source/WebCore/platform/KURL.cpp
- http://trac.webkit.org/browser/trunk/Source/WebCore/platform/KURLWTFURL.cpp
- http://trac.webkit.org/browser/trunk/Source/WebCore/platform/KURLGoogle.cpp
- http://trac.webkit.org/browser/trunk/Source/WebCore/platform/network/DataURL.cpp (data URLs)
Tests
Terminology
- URL string
- What you find in attribute values, property values, method parameters, etc.
- parse a URL string url using base URL base
- Turning a URL string into a URL by using a base URL.
- URL
- An in-memory representation of a URL with various properties as elaborated on by model below.
- URL interface/object
- JavaScript representation of a URL.
Model
URL (.href) - invalid? - scheme (.protocol) - authority - username (proposed .username) - password (proposed .password) - ip/host (.hostname) - port (.port) - path (.pathname) - query (.search) - fragment (.hash)
Parsing
parse (urlstr, optional baseURL) url = new URL tokenize(urlstr) SCHEME START if char is in ALPHA buffer += char -> SCHEME else unconsume char -> NO SCHEME SCHEME if char is in ALPHA / DIGIT / "+" / "-" / "." buffer += char -> continue elif char is ":" url.scheme = buffer.toASCIILowercase() buffer = "" if url.scheme is not hierarchical (data:) -> PATH elif baseURL and url.scheme is baseURL.scheme (http:?test) -> HIERARCHICAL else (https://test.com/) -> AUTHORITY START else: input.reset() -> NO SCHEME NO SCHEME if not baseURL or baseURL.scheme is not hierarchical url.invalid = true return url else -> HIERARCHICAL HIERARCHICAL if char is EOI (end-of-input) url = baseURL url.fragment = null exit elif char is "/" or char is "\" if next char "/" or next char is "\" url.scheme = baseURL.scheme -> AUTHORITY START else url.scheme = baseURL.scheme url.authority = baseURL.authority -> HIERARCHICAL PATH elif char is "?" url.scheme = baseURL.scheme url.authority = baseURL.authority url.path = baseURL.path -> QUERY elif char is "#" url.scheme = baseURL.scheme url.authority = baseURL.authority url.path = baseURL.path url.query = baseURL.query -> FRAGMENT else url.scheme = baseURL.scheme url.authority = baseURL.authority prepend input by baseURL.path up to the last / -> HIERARCHICAL PATH AUTHORITY START if char is "/" or char is "\" -> continue else -> AUTHORITY AUTHORITY ... PATH (has no query) if char is "#" FRAGMENT else path += char HIERARCHICAL PATH if char is "?" -> QUERY if char is "#" -> FRAGMENT else buffer += char QUERY if char is "#" -> FRAGMENT FRAGMENT ...