Autocomplete Types
Latest revision as of 19:47, 27 September 2014
Many user agents provide autofill functionality, but there is not currently a good way for site authors to directly specify field types for autofill. Herein is a lightweight proposal to enable authors to provide these field type hints for autofill agents.
- 1 Use Case Description
- 2 Proposed Solutions
- 2.1 Extending the autocomplete Attribute for Form Fields
- 2.1.1 Mechanics/Model
- 2.1.2 Alternatives Considered
- 2.1.3 Adoption
- 2.1.4 Limitations
- 2.1.5 Internationalization
- 2.1.6 Extensibility
- 2.1.7 Security and Privacy Implications
- 2.1.8 Experimental Implementation in Chrome
Use Case Description
Many user agents provide functionality to quickly fill frequently used form data – address and contact information, for example. For the purposes of this document, we will refer to this functionality as "autofill" functionality, and will refer to user agents that provide such functionality as "autofill agents". Autofill agents save users' time, and help site authors convert users in purchase and registration flows. Autofill works best when site authors are able to directly provide hints to autofill agents as to what data belongs in each field.
Current autofill products primarily rely on contextual clues to determine the type of data that should be filled into form elements. Examples of such contextual clues include the name of an input element, the text surrounding the element, and any placeholder text.
We have discussed the shortcomings of these ad hoc approaches with developers of several autofill products, and all have been interested in a solution that would let website authors classify their form fields themselves. While current methods of field classification work in general, for many cases they are unreliable or ambiguous due to the many variations and conventions used by web developers when creating their forms.
- Ambiguity - Fields named "name" can mean a variety of things, including given name, surname, full name, username, or others. Similar confusion can occur among other fields, such as email address and street address.
- Internationalization - Recognizing field names and context clues for all the world’s languages is impractical, time-intensive, and error-prone (as good context clues in one language may mean something else in another language).
- Unrelated Naming - Due to backend requirements (such as a framework that a developer is working within), developers may be constrained in what they can name their fields. As such, the name of a field may be unrelated to the data it contains.
We believe that website authors have strong incentive to facilitate autofill on their forms to help convert users in purchase and registration flows. Additionally, this assists users by streamlining their experience.
Current Usage and Workarounds
As mentioned above, current autofill products primarily rely on contextual clues. Thus, site authors who wish to "play nicely" with these autofill agents must reverse-engineer each agent's heuristics, and design their web site to match.
There has been a previous standard suggested in this space: RFC 3106, a.k.a. ECML. This standard has been largely unused, we believe for (at least) two reasons:
- RFC 3106 requires websites to conform to a set of input naming standards, effectively co-opting the name attribute, which has other, sometimes conflicting, uses. For existing websites, changing the name attribute is an onerous task: since this attribute serves as a key for parsing submitted form data, updating a field's name requires coordinated front-end and back-end changes. Even for new websites, the name attribute might have back-end restrictions conflicting with RFC 3106. For example, the SourceForge registration page appears to use the name attribute as a way to provide an extra security token. Based on research done in the summer of 2011, we believe that this is the primary reason that RFC 3106 has been largely unused.
- RFC 3106 lacks current user agent support. To the best of our knowledge, it is currently supported only by Google Toolbar. There is a bit of a chicken-and-egg problem here: on the one hand, site authors are hesitant to use RFC 3106 due to minimal user agent support. On the other hand, user agents are hesitant to support the RFC due both to minimal usage in the wild and to the aforementioned inconvenience to site authors caused by co-opting the name attribute. Any new standard will have to face a similar hurdle of bolstering initial adoption; but based on discussion with developers of several autofill products, we believe that many autofill agents would be happy to support a cleaner standard.
Proposed Solutions
Extending the autocomplete Attribute for Form Fields
We propose extending the current autocomplete attribute to optionally specify field types, in addition to the existing values of "on", "off", and "default", in order to eliminate ambiguity from the process of determining input data types.
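As a rough illustration of the idea, the markup below sketches a form whose name attributes are dictated by a back end and carry no meaning, while the proposed autocomplete values state the field types. The field names, labels, and action URL are invented for this example; the tokens are taken from the token table in this proposal.

```html
<!-- Hypothetical form: the name values and the action URL are made up for illustration. -->
<form action="/register" method="post">
  <label>First name: <input type="text" name="f_0193" autocomplete="given-name"></label>
  <label>Last name: <input type="text" name="f_0194" autocomplete="family-name"></label>
  <label>Email: <input type="email" name="f_0207" autocomplete="email"></label>
</form>
```

Because the hints live in autocomplete rather than in name, the back end can keep parsing the submitted data under its existing keys.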
User agents sometimes have features for helping users fill forms in, for example prefilling the user's address based on earlier user input. The autocomplete attribute is an ordered set of space-separated tokens. The attribute implies one of two autocompletion states for the input element: on or off. The "on" keyword maps to the on state, and the "off" keyword maps to the off state. The attribute may also be omitted, or may provide a field data type hint, as described in the section "Specifying field data type hints".
(Begin unchanged snippet)
The off state indicates either that the control's input data is particularly sensitive (for example the activation code for a nuclear weapon); or that it is a value that will never be reused (for example a one-time-key for a bank login) and the user will therefore have to explicitly enter the data each time, instead of being able to rely on the UA to prefill the value for him; or that the document provides its own autocomplete mechanism and does not want the user agent to provide autocompletion values.
Conversely, the on state indicates that the value is not particularly sensitive and the user can expect to be able to rely on his user agent to remember values he has entered for that control.
(End unchanged snippet)
If the attribute is omitted, the user agent is to use the autocomplete attribute on the element's form owner instead. (By default, the autocomplete attribute of form elements is in the on state.)
When an input element is in one of the following conditions, the input element's resulting autocompletion state is on; otherwise, the input element's resulting autocompletion state is off:
- Its autocomplete attribute is specified, has a non-empty value, and the value is not "off".
- Its autocomplete attribute is omitted, and the element has no form owner.
- Its autocomplete attribute is omitted, and the element's form owner's autocomplete attribute is in the on state.
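The conditions above can be read off a small example. The following is only a sketch of how the rules would apply; the form, its action URL, and the field names are hypothetical.

```html
<!-- The form owner is in the off state. -->
<form action="/login" method="post" autocomplete="off">
  <!-- Attribute omitted: the form owner's off state applies, so the result is off. -->
  <input type="text" name="account">
  <!-- Specified, non-empty, and not "off": the result is on despite the form owner. -->
  <input type="text" name="display-name" autocomplete="on">
  <!-- A field data type hint is also a non-empty value other than "off": the result is on. -->
  <input type="email" name="contact" autocomplete="email">
</form>

<!-- Attribute omitted and no form owner: the result is on. -->
<input type="search" name="q">
```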
(Begin unchanged snippet)
When an input element's resulting autocompletion state is on, the user agent may store the value entered by the user so that if the user returns to the page, the UA can prefill the form. Otherwise, the user agent should not remember the control's value, and should not offer past values to the user.
In addition, if the resulting autocompletion state is off, values are reset when traversing the history.
The autocompletion mechanism must be implemented by the user agent acting as if the user had modified the element's value, and must be done at a time where the element is mutable (e.g. just after the element has been inserted into the document, or when the user agent stops parsing).
Banks frequently do not want UAs to prefill login information:
<label>Account: <input type="text" name="ac" autocomplete="off"></label>
<label>PIN: <input type="password" name="pin" autocomplete="off"></label>
A user agent may allow the user to override the resulting autocompletion state and set it to always on, always allowing values to be remembered and prefilled, or always off, never remembering values. However, user agents should not allow users to trivially override the resulting autocompletion state to on, as there are significant security implications for the user if all values are always remembered, regardless of the site's preferences.
(End unchanged snippet)
Specifying field data type hints
The autocomplete attribute can also provide a field data type hint to the user agent. If a field data type hint is specified, the input element's autocompletion state is on. User agents do not have to respect the field type specified in the autocomplete attribute, but it may be used as a hint.
The attribute’s value, if specifying a field data type hint, must be an ordered set of unique space-separated tokens, each of which indicates the data type of the input element. If the user agent supports type-specific autocomplete (a.k.a. "autofill") and is designed to follow the autocomplete field data type hints, it should iterate over the tokens from left to right and use as the data type the first token that it recognizes (with the exception of section tokens, as defined below).
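To make the left-to-right rule concrete, here is a minimal sketch of a field that lists a more specific token first and a more general one as a fallback. Both tokens come from the token table in this proposal; the label and the name value are made up.

```html
<!-- A user agent that recognizes additional-name-initial uses it. One that does
     not skips it and uses additional-name, the next token it recognizes. -->
<label>Middle name or initial:
  <input type="text" name="mid" autocomplete="additional-name-initial additional-name">
</label>
```

Listing tokens in order of preference lets an author adopt newer or experimental tokens without losing support in user agents that only know the older ones.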
In either of the following cases, the user agent should not autocomplete the field based on the field's data type, though non-datatype specific autocomplete may still be invoked; otherwise, the user agent may fall back on alternative means for detecting the input data type:
- The user agent does not recognize any of the non-section tokens specified in the autocomplete attribute.
- The field's autocomplete attribute is not specified, empty, or set to "on"; and there is at least one other field in the form that specifies a field data type hint using the autocomplete attribute.
In practice, this allows website authors to disable datatype-specific autocomplete for an entire form by setting the autocomplete attribute on one form element to something unrecognized by all browsers. This would still allow autocomplete through other methods (for example, by using the user’s form field history), which distinguishes it from autocomplete="off".
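A sketch of this opt-out follows. The token x-no-typed-autofill is invented for this example and is assumed to be unrecognized by every user agent; the form, its action URL, and the field names are likewise hypothetical.

```html
<form action="/survey" method="post">
  <!-- Unrecognized token (made up for this example): no data-type-based
       autofill for this field. -->
  <input type="text" name="q1" autocomplete="x-no-typed-autofill">
  <!-- No attribute, but another field in the form specifies a hint, so the
       user agent should not guess a data type for these fields either. -->
  <input type="text" name="q2">
  <input type="text" name="q3">
</form>
```

All three fields remain in the on state, so a user agent may still offer values from the user's form field history.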
There is no comprehensive list of tokens, as the number of possible input data types is many and ever-increasing. However, at least the following set of tokens should be recognized by user agents, if the user agent’s autofill feature is capable of filling the corresponding data type.
|Token||Description|
Names
|name||full name|
|honorific-prefix||prefix or title (Mr., Mrs., Dr., etc.)|
|given-name||given or first name|
|additional-name||additional or middle name|
|additional-name-initial||additional or middle name initial|
|family-name||family name, surname, or last name|
|honorific-suffix||suffix (Jr., II, etc.)|
|nickname||nickname|
Addresses
|street-address||full street address condensed into one line|
|address-line1||first line of street address|
|address-line2||second line of street address|
|address-line3||third line of street address|
|locality||locality or city|
|city||same as locality|
|administrative-area||administrative area, state, province, or region|
|state||same as administrative-area|
|province||same as administrative-area|
|region||same as administrative-area|
|postal-code||postal or ZIP code|
|country-name||country name|
Contact Information
|email||email address|
|tel||full phone number, including country code|
|tel-country-code||international country code|
|tel-national||national phone number: full number minus country code|
|tel-area-code||area code|
|tel-local||local phone number: full number minus country and area codes|
|tel-local-prefix||first part of local phone number (not recommended, see note 1 below)|
|tel-local-suffix||second part of local phone number (not recommended, see note 1 below)|
|tel-extension||phone extension number|
|fax||full fax number, including country code|
|fax-country-code||international country code|
|fax-national||national fax number: full number minus country code|
|fax-area-code||area code|
|fax-local||local fax number: full number minus country and area codes|
|fax-local-prefix||first part of local fax number (not recommended, see note 1 below)|
|fax-local-suffix||second part of local fax number (not recommended, see note 1 below)|
|fax-extension||fax extension number|
Credit Cards
|cc-name||full name, as it appears on credit card|
|cc-given-name||given or first name, as it appears on credit card (not recommended, see note 2 below)|
|cc-additional-name||additional or middle name (or initial), as it appears on credit card (not recommended, see note 2 below)|
|cc-family-name||family name, surname, or last name, as it appears on credit card (not recommended, see note 2 below)|
|cc-number||credit card number|
|cc-exp-month||month of expiration of credit card|
|cc-exp-year||year of expiration of credit card (see note 3 below about formatting)|
|cc-exp||date of expiration of credit card (see note 4 below about formatting)|
|cc-csc||credit card security code|
Other
|language||preferred language|
|bday||birthday (see note 4 below about formatting)|
|bday-year||year of birthday (see note 3 below about formatting)|
|bday-month||month of birthday|
|bday-day||day of birthday|
|org||company or organization|
|organization-title||user's position or title within company or organization|
|sex||sex or gender|
|gender-identity||gender identity|
|url||Website URL|
|photo||photo or avatar|
|section-*****||used to group forms together (see note 5 below)|
Notes on tokens:
1. The tokens tel-local-prefix, tel-local-suffix, fax-local-prefix, and fax-local-suffix are included to support phone and fax formats where the local number is split into two parts (as in the US, for example). However, it is recommended that forms be constructed with no separation to maximize international support.
2. The tokens cc-given-name, cc-additional-name, and cc-family-name are included to support forms where the name on the credit card has been split into several form fields. However, it is recommended that forms be constructed with no separation.
3. For the tokens cc-exp-year and bday-year, the element’s maxlength attribute should be used as a hint to the formatting of the year. For example, maxlength="2" indicates a 2-digit year format. Beyond this hint, the user agent may fall back on other heuristics to determine the data format.
4. For the tokens cc-exp and bday, it is recommended to use the HTML5 attribute values type="month" and type="date" to distinguish fields requesting year and month from those requesting year, month, and day. In these cases the data should be formatted according to the proper formats for those fields. In other cases, the element's maxlength attribute can be used as a hint to proper formatting. For example, maxlength="7" indicates that a 2-digit month, 4-digit year, and 1-digit separator should be used. Beyond this hint, the user agent may fall back on other heuristics to determine the data format.
5. To facilitate classification of logical groups of form fields, developers can use tokens that begin with "section-" to denote such sections. This is described in more detail below.
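The formatting hints in these notes might look as follows in markup. This is a sketch only: the name values are hypothetical, the token names follow the token table above, and type="month" is the HTML5 control for a year and month.

```html
<!-- Split expiration date: maxlength="2" on the year hints at a 2-digit year. -->
<input type="text" name="exp-mm" maxlength="2" autocomplete="cc-exp-month">
<input type="text" name="exp-yy" maxlength="2" autocomplete="cc-exp-year">

<!-- Single text field: maxlength="7" hints at a 2-digit month, a separator,
     and a 4-digit year. -->
<input type="text" name="exp" maxlength="7" autocomplete="cc-exp">

<!-- Alternatively, the control type states the format: year and month only. -->
<input type="month" name="exp-ym" autocomplete="cc-exp">

<!-- Full date (year, month, and day) for a birthday. -->
<input type="date" name="dob" autocomplete="bday">

<!-- Recommended: one unsplit phone field and one unsplit cardholder name field. -->
<input type="tel" name="phone" autocomplete="tel">
<input type="text" name="card-name" autocomplete="cc-name">
```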
Form fields are often grouped into logical sections, such as shipping and billing addresses. This semantic information is useful to user agents with autofill capabilities. Web developers may specify this sectioning by a token beginning with "section-".
There may be zero or one section tokens in the autocomplete value. If there is one, it must be the first token in the list. Any characters may follow "section-" so long as the token remains a valid token. All fields in a logical grouping (such as shipping or billing addresses) should have the same section token (such as section-shipping or section-billing).
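For instance, a checkout form with separate shipping and billing addresses could be marked up roughly as below. The form structure, action URL, and name values are assumptions for illustration; the section tokens are the ones suggested in the text, and each appears first in its attribute value.

```html
<form action="/checkout" method="post">
  <fieldset>
    <legend>Shipping address</legend>
    <input type="text" name="s1" autocomplete="section-shipping street-address">
    <input type="text" name="s2" autocomplete="section-shipping locality">
    <input type="text" name="s3" autocomplete="section-shipping postal-code">
  </fieldset>
  <fieldset>
    <legend>Billing address</legend>
    <input type="text" name="b1" autocomplete="section-billing street-address">
    <input type="text" name="b2" autocomplete="section-billing locality">
    <input type="text" name="b3" autocomplete="section-billing postal-code">
  </fieldset>
</form>
```

With the grouping made explicit, a user agent can fill each group from a different stored address instead of repeating one address in both.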
We considered endorsing input naming standards. Web developers would name their forms according to a set of naming standards, such as RFC 3106, a.k.a. ECML. While this might be adopted by new websites, it would force developers of existing websites to change their naming conventions on both the front- and back-end of their websites. Adoption of these standards would therefore be slow at best, and likely never catch on (as has been the case for ECML).
The better solution therefore seems to be one that does not alter the name attribute of input elements, and instead standardizes labels or placeholder text. However, these texts are visible to users. Web developers should have full control over what is displayed in order to provide the best user experience (especially in cases where the web site is in a foreign language) and as such labels and placeholder texts are inappropriate for this purpose.
To avoid co-opting input element names and user-facing text, we considered using custom data attributes, which are new to the HTML5 specification. However, these are to be used for within-site data. The specification explicitly states that custom data attributes "are not a generic extension mechanism for publicly-usable metadata," which is exactly what we are attempting to do.
We believe this addition to the specification to be the best solution because:
- It is simple to add both to new and to existing web forms.
- It does not require web developers to alter backend code.
- It does not alter the display of forms or user-facing text.
- There is precedent for this type of attribute in the autocomplete attribute.
- It is extensible to future or experimental input data types.
- It allows web developers to provide multiple input data types to fall back on.
- It allows user agents to fall back to alternate heuristics if the attribute is not provided.
- User agents that do not recognize the attribute will simply ignore it.
The main drawback to this solution is that unless approved as a part of the HTML specification, a website that specifies field types using the autocomplete attribute would not be detected as valid HTML by most HTML validators. However, it is not uncommon to use experimental elements or attributes for new features.
We hope that this attribute will be accepted into the HTML5 specification, eliminating this drawback.
The token names were chosen to support internationalization. While it is extremely difficult to develop a schema that works for every case, we believe these tokens cover the needs of the majority of users. In addition, the extensibility of the attribute allows other tokens to be used that are specific to different locales.
To encourage adoption, we included aliases for common terms in the US. For example, in addition to locality and administrative-area, we have included the aliases city and state. This introduces redundancy and increases the number of tokens, but we view it as necessary for adoption in the US. The extensibility of the attribute similarly allows for additional tokens that are specific to other locales.
Security and Privacy Implications
When dealing with users’ personal information, extra care must be taken to ensure that the data is protected and only transmitted with the user’s consent. This proposal improves the accuracy with which autofill products classify form elements, which could potentially assist malicious sites in identifying and extracting private user data. These vulnerabilities need to be addressed in the autofill products themselves, as any autofill product would be equally at risk of privacy violations with or without explicit author-specified field types, whether specified via the extended autocomplete attribute or otherwise.
Experimental Implementation in Chrome
As of Chrome 15, this extension has been implemented under the experimental attribute x-autocompletetype. The experimental implementation supports a slightly different set of token names, reproduced below. We anticipate that the attribute’s success in improving autofill products will encourage other autofill solutions to implement the attribute. Additionally, we hope it will strengthen our proposal to add the attribute to the HTML5 specification.
|given-name||given or first name|
|surname||surname or last name|
|name-prefix||prefix or title (Mr., Mrs., Dr., etc.)|
|name-suffix||suffix (Jr., II, etc.)|
|street-address||full street address condensed into one line|
|address-line1||first line of street address|
|address-line2||second line of street address|
|address-line3||third line of street address|
|locality||locality or city|
|city||same as locality|
|administrative-area||administrative area, state, province, or region|
|state||same as administrative-area|
|province||same as administrative-area|
|region||same as administrative-area|
|postal-code||postal or ZIP code|
|phone-full||full phone number, including country code|
|phone-country-code||international country code|
|phone-national||national phone number: full number minus country code|
|phone-local||local phone number: full number minus country and area codes|
|phone-local-prefix||first part of local phone number|
|phone-local-suffix||second part of local phone number|
|phone-extension||phone extension number|
|fax-full||full fax number, including country code|
|fax-country-code||international country code|
|fax-national||national fax number: full number minus country code|
|fax-local||local fax number: full number minus country and area codes|
|fax-local-prefix||first part of local fax number|
|fax-local-suffix||second part of local fax number|
|fax-extension||fax extension number|
|cc-full-name||full name, as it appears on credit card|
|cc-given-name||given name, as it appears on credit card|
|cc-middle-name||middle name or initial, as it appears on credit card|
|cc-surname||surname, as it appears on credit card|
|cc-number||credit card number|
|cc-exp-month||month of expiration of credit card|
|cc-exp-year||year of expiration of credit card|
|cc-exp||date of expiration of credit card|
|cc-csc||credit card security code|
|birthday-month||month of birthday|
|birthday-year||year of birthday|
|birthday-day||day of birthday|
|organization||company or organization|
|organization-title||user's position or title within company or organization|
|section-*****||used to group forms together|
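A form targeting the experimental implementation might look like the sketch below. It assumes that x-autocompletetype takes a token from the table above as its value, in the same way the proposed autocomplete extension does; the form, its action URL, and the name values are hypothetical.

```html
<!-- Sketch only: experimental attribute with the experimental token names. -->
<form action="/order" method="post">
  <input type="text" name="fn" x-autocompletetype="given-name">
  <input type="text" name="ln" x-autocompletetype="surname">
  <input type="text" name="ph" x-autocompletetype="phone-full">
  <input type="text" name="zip" x-autocompletetype="postal-code">
</form>
```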