<?xml version="1.0"?>
<feed xmlns="http://www.w3.org/2005/Atom" xml:lang="en">
	<id>https://wiki.whatwg.org/api.php?action=feedcontributions&amp;feedformat=atom&amp;user=Jsbell</id>
	<title>WHATWG Wiki - User contributions [en]</title>
	<link rel="self" type="application/atom+xml" href="https://wiki.whatwg.org/api.php?action=feedcontributions&amp;feedformat=atom&amp;user=Jsbell"/>
	<link rel="alternate" type="text/html" href="https://wiki.whatwg.org/wiki/Special:Contributions/Jsbell"/>
	<updated>2026-05-27T07:06:27Z</updated>
	<subtitle>User contributions</subtitle>
	<generator>MediaWiki 1.39.3</generator>
	<entry>
		<id>https://wiki.whatwg.org/index.php?title=StringEncoding&amp;diff=8554</id>
		<title>StringEncoding</title>
		<link rel="alternate" type="text/html" href="https://wiki.whatwg.org/index.php?title=StringEncoding&amp;diff=8554"/>
		<updated>2012-10-19T16:21:04Z</updated>

		<summary type="html">&lt;p&gt;Jsbell: Obsoleted&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;Proposed Text Encoding Web API for Typed Arrays&lt;br /&gt;
&lt;br /&gt;
= This Document Is Obsolete =&lt;br /&gt;
&lt;br /&gt;
&amp;lt;font size=&amp;quot;6&amp;quot;&amp;gt;See [http://encoding.spec.whatwg.org/#api Encoding Standard - API] instead.&amp;lt;/font&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Editors ==&lt;br /&gt;
&lt;br /&gt;
* Joshua Bell (Google, Inc)&lt;br /&gt;
&lt;br /&gt;
== Abstract ==&lt;br /&gt;
&lt;br /&gt;
This specification defines an API for encoding strings to binary data, and decoding strings from binary data.&lt;br /&gt;
&lt;br /&gt;
NOTE: This specification intentionally does not address the opposite scenario of encoding binary data as strings and decoding binary data from strings, for example using Base64 encoding.&lt;br /&gt;
&lt;br /&gt;
Discussion on this topic has so far taken place on the [http://www.khronos.org/webgl/public-mailing-list/ public_webgl@khronos.org] mailing list. See http://www.khronos.org/webgl/public-mailing-list/archives/1111/msg00017.html for the initial discussion thread.&lt;br /&gt;
&lt;br /&gt;
Discussion has since moved to the [http://www.whatwg.org/mailing-list#specs WHATWG spec discussion] mailing list. See  http://lists.whatwg.org/htdig.cgi/whatwg-whatwg.org/2012-March/035038.html for the latest discussion thread.&lt;br /&gt;
&lt;br /&gt;
== Open Issues ==&lt;br /&gt;
&lt;br /&gt;
* Should the &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; attribute return the &amp;lt;var&amp;gt;Name&amp;lt;/var&amp;gt; of the encoding or the name that was passed in?&lt;br /&gt;
&lt;br /&gt;
== Notes to Implementers ==&lt;br /&gt;
&lt;br /&gt;
* Streaming decode/encode requires retaining partial buffers between calls.&lt;br /&gt;
** Some encode/decode algorithms require adjusting the &amp;lt;var&amp;gt;code point pointer&amp;lt;/var&amp;gt; or &amp;lt;var&amp;gt;byte pointer&amp;lt;/var&amp;gt; by a negative amount, and this can occur across &amp;quot;chunk&amp;quot; boundaries. When the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag is set on an encoder/decoder, the last N elements of the stream must therefore be saved and used as a prefix for the stream on the next call. N is defined by the specific encoding algorithm.&lt;br /&gt;
** This is not yet implemented in the [http://code.google.com/p/stringencoding/ ECMAScript shim]&lt;br /&gt;
&lt;br /&gt;
== Resolved Issues ==&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dl&amp;gt;&lt;br /&gt;
&amp;lt;dt&amp;gt;Encode as many characters as possible into a fixed-size buffer for transmission, and repeat starting with next unencoded character&lt;br /&gt;
&amp;lt;dd&amp;gt;Resolution: not for &amp;quot;v1&amp;quot; - can be implemented using this API, and breaking the string down by &amp;quot;character&amp;quot; is unlikely to be as obvious as it sounds - surrogate pairs, combining sequences, etc&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dt&amp;gt;Support two versions of encode; one which takes target buffer, and one which creates/returns a right-sized buffer&lt;br /&gt;
&amp;lt;dd&amp;gt;Resolution: See above, and wait for developer feedback.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dt&amp;gt;Allow arbitrary end byte sequences (e.g. &amp;lt;code&amp;gt;0xFF&amp;lt;/code&amp;gt; for UTF-8 strings)&lt;br /&gt;
&amp;lt;dd&amp;gt;Resolution: Add &amp;lt;code&amp;gt;indexOf&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;ArrayBufferView&amp;lt;/code&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dt&amp;gt;Remove &#039;&#039;binary&#039;&#039; encoding? &#039;&#039;(a proposed 8-bit-clean encoding for interop with legacy binary data stored in ECMAScript strings)&#039;&#039;&lt;br /&gt;
&amp;lt;dd&amp;gt;The only real use is with &amp;lt;code&amp;gt;atob()&amp;lt;/code&amp;gt;/&amp;lt;code&amp;gt;btoa()&amp;lt;/code&amp;gt;; a better API would be Base64 directly in/out of Typed Arrays. Consensus on WHATWG is to add better APIs e.g. &lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
partial interface ArrayBufferView {&lt;br /&gt;
    DOMString toBase64();&lt;br /&gt;
};&lt;br /&gt;
&lt;br /&gt;
partial interface ArrayBuffer {&lt;br /&gt;
    static ArrayBuffer fromBase64(DOMString string);&lt;br /&gt;
};&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dt&amp;gt;Should this be a standalone API (as written), or live on e.g. DataView, or on e.g. String?&lt;br /&gt;
&amp;lt;dd&amp;gt;There seems to be pretty strong consensus that streaming, stateful coding is high priority and that an object-oriented API is cleanest, and shoe-horning those onto existing objects would be messy.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dt&amp;gt;Should legacy encodings be supported?&lt;br /&gt;
&amp;lt;dd&amp;gt;Resolution: Consensus on the [http://lists.whatwg.org/htdig.cgi/whatwg-whatwg.org/2012-August/036825.html WHATWG mailing list] - support legacy encodings for decode, and only &#039;utf-8&#039;, &#039;utf-16&#039; and &#039;utf-16be&#039; for encode.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dt&amp;gt;What to do on encoding errors? If non-UTF encodings are supported then we may want to allow substitution (e.g. ASCII &#039;?&#039;) or a script callback (for arbitrary escaping).&lt;br /&gt;
&amp;lt;dd&amp;gt;Resolution: not for &amp;quot;v1&amp;quot; (see above)&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dt&amp;gt;How are byte order marks handled? &lt;br /&gt;
&amp;lt;dd&amp;gt;BOM is respected if and only if the requested encoding is a case-insensitive match for &amp;quot;&amp;lt;code&amp;gt;utf-16&amp;lt;/code&amp;gt;&amp;quot;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/dl&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== API ==&lt;br /&gt;
&lt;br /&gt;
=== TextEncoder ===&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;WebIDL&#039;&#039;&#039;&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
dictionary TextEncodeOptions {&lt;br /&gt;
  boolean stream = false;&lt;br /&gt;
};&lt;br /&gt;
&lt;br /&gt;
[Constructor(optional DOMString encoding)]&lt;br /&gt;
interface TextEncoder {&lt;br /&gt;
  readonly attribute DOMString encoding;&lt;br /&gt;
  Uint8Array encode(DOMString? string, optional TextEncodeOptions options);&lt;br /&gt;
};&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The constructor runs the following steps:&lt;br /&gt;
# If the constructor is called with no arguments, let &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt; be &amp;quot;&amp;lt;code&amp;gt;utf-8&amp;lt;/code&amp;gt;&amp;quot;. Otherwise, let &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt; be the value of the &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; argument.&lt;br /&gt;
# Run the &#039;&#039;&#039;steps to get an encoding&#039;&#039;&#039; from [http://encoding.spec.whatwg.org/ Encoding], with &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt; as &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt;.&lt;br /&gt;
#* If the steps result in failure, throw an &amp;quot;&amp;lt;code&amp;gt;EncodingError&amp;lt;/code&amp;gt;&amp;quot; exception and terminate these steps.&lt;br /&gt;
#* Otherwise, if the &amp;lt;var&amp;gt;Name&amp;lt;/var&amp;gt; of the returned encoding is not one of &amp;quot;&amp;lt;code&amp;gt;utf-8&amp;lt;/code&amp;gt;&amp;quot;, &amp;quot;&amp;lt;code&amp;gt;utf-16&amp;lt;/code&amp;gt;&amp;quot;, or &amp;quot;&amp;lt;code&amp;gt;utf-16be&amp;lt;/code&amp;gt;&amp;quot; throw an &amp;quot;&amp;lt;code&amp;gt;EncodingError&amp;lt;/code&amp;gt;&amp;quot; exception and terminate these steps.&lt;br /&gt;
#* Otherwise, set the encoder object&#039;s internal &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; property to the returned encoding.&lt;br /&gt;
# Initialize the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag of the encoder object to false. &lt;br /&gt;
# Initialize the internal &amp;lt;var&amp;gt;encoding algorithm state&amp;lt;/var&amp;gt; to the default values for the encoding &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dl&amp;gt;&lt;br /&gt;
&amp;lt;dt&amp;gt;&amp;lt;code&amp;gt;encoding&amp;lt;/code&amp;gt; of type DOMString, readonly&lt;br /&gt;
&amp;lt;dd&amp;gt;Returns the &amp;lt;var&amp;gt;Name&amp;lt;/var&amp;gt; of the encoder object&#039;s &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;, per [http://encoding.spec.whatwg.org/ Encoding]. &lt;br /&gt;
&lt;br /&gt;
:Note that this may differ from the name of the encoding specified during the call to the constructor. For example, if the constructor is called with &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; of &amp;lt;code&amp;gt;&amp;quot;ascii&amp;quot;&amp;lt;/code&amp;gt; the &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; attribute of the &#039;&#039;encoder object&#039;&#039; would have the value &amp;lt;code&amp;gt;&amp;quot;windows-1252&amp;quot;&amp;lt;/code&amp;gt; as &amp;lt;code&amp;gt;&amp;quot;ascii&amp;quot;&amp;lt;/code&amp;gt; is a label for that encoding.&lt;br /&gt;
&amp;lt;/dl&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dl&amp;gt;&lt;br /&gt;
&amp;lt;dt&amp;gt;&amp;lt;code&amp;gt;encode&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;dd&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The &amp;lt;code&amp;gt;encode&amp;lt;/code&amp;gt; method runs these steps:&lt;br /&gt;
&lt;br /&gt;
# If the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag is not set, then reset the &amp;lt;var&amp;gt;encoding algorithm state&amp;lt;/var&amp;gt; to the default values for &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;. Otherwise, the &amp;lt;var&amp;gt;encoding algorithm state&amp;lt;/var&amp;gt; is re-used from the previous call to &amp;lt;code&amp;gt;encode&amp;lt;/code&amp;gt; on this object.&lt;br /&gt;
# If the &amp;lt;var&amp;gt;options&amp;lt;/var&amp;gt; parameter is specified and the &amp;lt;code&amp;gt;stream&amp;lt;/code&amp;gt; option is &#039;&#039;&#039;true&#039;&#039;&#039;, then the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag is set; otherwise the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag is cleared.&lt;br /&gt;
# Run the steps of the &amp;lt;var&amp;gt;encoding algorithm&amp;lt;/var&amp;gt;:&lt;br /&gt;
#* The input to the algorithm is a &amp;lt;var&amp;gt;stream of code points&amp;lt;/var&amp;gt;. The stream is composed of the Unicode code points for the Unicode characters produced by following the [http://dev.w3.org/2006/webapi/WebIDL/#dfn-obtain-unicode steps to convert a DOMString to a sequence of Unicode characters] in [http://dev.w3.org/2006/webapi/WebIDL/ WebIDL] with &amp;lt;var&amp;gt;string&amp;lt;/var&amp;gt; as the input. If &amp;lt;var&amp;gt;string&amp;lt;/var&amp;gt; is null, the &amp;lt;var&amp;gt;stream of code points&amp;lt;/var&amp;gt; is empty.&lt;br /&gt;
#* If the &amp;lt;var&amp;gt;options&amp;lt;/var&amp;gt; parameter is not specified or the &amp;lt;code&amp;gt;stream&amp;lt;/code&amp;gt; option is false, then the &#039;&#039;&#039;EOF code point&#039;&#039;&#039; is yielded after the final code point is yielded by the stream.&lt;br /&gt;
#* The output of the algorithm is a &amp;lt;var&amp;gt;sequence of emitted bytes&amp;lt;/var&amp;gt;.&lt;br /&gt;
# Return a &amp;lt;code&amp;gt;Uint8Array&amp;lt;/code&amp;gt; object wrapping an &amp;lt;code&amp;gt;ArrayBuffer&amp;lt;/code&amp;gt; containing the &amp;lt;var&amp;gt;sequence of emitted bytes&amp;lt;/var&amp;gt; produced by the encoding algorithm.&lt;br /&gt;
&lt;br /&gt;
:&#039;&#039;NOTE: Because only UTF encodings are supported, and because of the use of the [http://dev.w3.org/2006/webapi/WebIDL/#dfn-obtain-unicode steps to convert a DOMString to a sequence of Unicode characters] and thus Unicode code points, no input can cause the encoding process to emit an encoder error.&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/dl&amp;gt;&lt;br /&gt;
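&lt;br /&gt;
The following is a minimal sketch of the &amp;lt;code&amp;gt;encode&amp;lt;/code&amp;gt; behaviour described above. It assumes a runtime that provides the &amp;lt;code&amp;gt;TextEncoder&amp;lt;/code&amp;gt; later standardized in [http://encoding.spec.whatwg.org/ Encoding], which encodes UTF-8 only and has no &amp;lt;code&amp;gt;stream&amp;lt;/code&amp;gt; option, so it does not exercise the draft API exactly as written here. The helper name &amp;lt;code&amp;gt;toHex&amp;lt;/code&amp;gt; is hypothetical and exists only for illustration.&lt;br /&gt;
&lt;br /&gt;
```javascript
// Assumption: a global TextEncoder as shipped in current browsers and Node.js
// (UTF-8 only). This is not the draft API with the stream option.
const encoder = new TextEncoder();

// Hypothetical helper: render bytes as space-separated hex for display.
function toHex(bytes) {
  return Array.from(bytes, (b) => b.toString(16).padStart(2, "0")).join(" ");
}

// U+20AC EURO SIGN takes three bytes in UTF-8.
const euro = toHex(encoder.encode("\u20AC"));

// A lone surrogate is not a Unicode character; the DOMString conversion
// replaces it with U+FFFD, so encoding never raises an encoder error.
const lone = toHex(encoder.encode("\uD800"));

console.log(encoder.encoding, euro, lone);
```
&lt;br /&gt;
The second call illustrates the note above: unpaired surrogates are replaced before the encoding algorithm runs, so no input string can produce an encoder error.&lt;br /&gt;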
&lt;br /&gt;
=== TextDecoder ===&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;WebIDL&#039;&#039;&#039;&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
dictionary TextDecoderOptions {&lt;br /&gt;
  boolean fatal = false;&lt;br /&gt;
};&lt;br /&gt;
&lt;br /&gt;
dictionary TextDecodeOptions {&lt;br /&gt;
  boolean stream = false;&lt;br /&gt;
};&lt;br /&gt;
&lt;br /&gt;
[Constructor(optional DOMString encoding, optional TextDecoderOptions options)]&lt;br /&gt;
interface TextDecoder {&lt;br /&gt;
  readonly attribute DOMString encoding;&lt;br /&gt;
  DOMString decode(optional ArrayBufferView view, optional TextDecodeOptions options);&lt;br /&gt;
};&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The constructor runs the following steps:&lt;br /&gt;
# The constructor creates a &#039;&#039;decoder object&#039;&#039;. It has the internal properties &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;, a &amp;lt;var&amp;gt;fatal&amp;lt;/var&amp;gt; flag which is initially unset,  a &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag which is initially unset, an &amp;lt;var&amp;gt;offset&amp;lt;/var&amp;gt; pointer which is initially &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt;, and a &amp;lt;var&amp;gt;useBOM&amp;lt;/var&amp;gt; flag which is initially unset.&lt;br /&gt;
# If called without an &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; argument, let &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt; be &amp;quot;&amp;lt;code&amp;gt;utf-8&amp;lt;/code&amp;gt;&amp;quot;. Otherwise, let &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt; be the value of the &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; argument.&lt;br /&gt;
# If &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt; is a case-insensitive match for &amp;quot;&amp;lt;code&amp;gt;utf-16&amp;lt;/code&amp;gt;&amp;quot; then set the internal &amp;lt;var&amp;gt;useBOM&amp;lt;/var&amp;gt; flag.&lt;br /&gt;
# Run the &#039;&#039;&#039;steps to get an encoding&#039;&#039;&#039; from [http://encoding.spec.whatwg.org/ Encoding], with &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt; as &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt;. &lt;br /&gt;
## If the steps result in failure, throw an &amp;quot;&amp;lt;code&amp;gt;EncodingError&amp;lt;/code&amp;gt;&amp;quot; exception and terminate these steps.&lt;br /&gt;
## Otherwise, set the &#039;&#039;decoder object&#039;s&#039;&#039; internal &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; property to the returned encoding.&lt;br /&gt;
# If the constructor is called with an &amp;lt;var&amp;gt;options&amp;lt;/var&amp;gt; argument, and the &amp;lt;var&amp;gt;fatal&amp;lt;/var&amp;gt; property of the dictionary is set, set the internal &amp;lt;var&amp;gt;fatal&amp;lt;/var&amp;gt; flag of the &#039;&#039;decoder object&#039;&#039;.&lt;br /&gt;
# Initialize the internal &amp;lt;var&amp;gt;encoding algorithm state&amp;lt;/var&amp;gt; to the default values for the encoding &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;.&lt;br /&gt;
# Return the &#039;&#039;decoder object&#039;&#039;.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dl&amp;gt;&lt;br /&gt;
&amp;lt;dt&amp;gt;&amp;lt;code&amp;gt;encoding&amp;lt;/code&amp;gt; of type DOMString, readonly&lt;br /&gt;
&amp;lt;dd&amp;gt;&lt;br /&gt;
Returns the &amp;lt;var&amp;gt;Name&amp;lt;/var&amp;gt; of the decoder object&#039;s &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;, per [http://encoding.spec.whatwg.org/ Encoding].&lt;br /&gt;
&lt;br /&gt;
:Note that this may differ from the name of the encoding specified during the call to the constructor. For example, if the constructor is called with &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; of &amp;lt;code&amp;gt;&amp;quot;ascii&amp;quot;&amp;lt;/code&amp;gt; the &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; attribute of the &#039;&#039;decoder object&#039;&#039; would have the value &amp;lt;code&amp;gt;&amp;quot;windows-1252&amp;quot;&amp;lt;/code&amp;gt; as &amp;lt;code&amp;gt;&amp;quot;ascii&amp;quot;&amp;lt;/code&amp;gt; is a label for that encoding.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dt&amp;gt;&amp;lt;code&amp;gt;decode&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;dd&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The &amp;lt;code&amp;gt;decode&amp;lt;/code&amp;gt; method runs the following steps:&lt;br /&gt;
&lt;br /&gt;
# If &amp;lt;var&amp;gt;view&amp;lt;/var&amp;gt; is not specified, let &amp;lt;var&amp;gt;view&amp;lt;/var&amp;gt; be a &amp;lt;code&amp;gt;Uint8Array&amp;lt;/code&amp;gt; of length &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt;.&lt;br /&gt;
# If the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag of the &#039;&#039;decoder object&#039;&#039; is not set, then reset the &amp;lt;var&amp;gt;encoding algorithm state&amp;lt;/var&amp;gt; to the default values for encoding &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; and set &amp;lt;var&amp;gt;offset&amp;lt;/var&amp;gt; to &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt;. Otherwise, the &amp;lt;var&amp;gt;encoding algorithm state&amp;lt;/var&amp;gt; is re-used from the previous call to &amp;lt;code&amp;gt;decode&amp;lt;/code&amp;gt; on this object.&lt;br /&gt;
# If the &amp;lt;var&amp;gt;options&amp;lt;/var&amp;gt; parameter is specified and the &amp;lt;var&amp;gt;stream&amp;lt;/var&amp;gt; option is &#039;&#039;&#039;true&#039;&#039;&#039;, then the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag is set. Otherwise the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag is cleared.&lt;br /&gt;
# Run or resume the decoder algorithm of the &#039;&#039;decoder object&#039;s&#039;&#039; encoding, with the following additions:&lt;br /&gt;
#* The input to the algorithm is a &amp;lt;var&amp;gt;byte stream&amp;lt;/var&amp;gt;.&lt;br /&gt;
#** If &amp;lt;var&amp;gt;offset&amp;lt;/var&amp;gt; is greater than &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt;, the bytes in &amp;lt;var&amp;gt;byte stream&amp;lt;/var&amp;gt; for positions less than &amp;lt;var&amp;gt;offset&amp;lt;/var&amp;gt; are provided by buffer(s) passed in to previous calls to &amp;lt;code&amp;gt;decode()&amp;lt;/code&amp;gt;.&lt;br /&gt;
#** The bytes in &amp;lt;var&amp;gt;byte stream&amp;lt;/var&amp;gt; for positions &amp;lt;var&amp;gt;offset&amp;lt;/var&amp;gt; through &amp;lt;code&amp;gt;&amp;lt;var&amp;gt;offset&amp;lt;/var&amp;gt; + &amp;lt;var&amp;gt;view.byteLength&amp;lt;/var&amp;gt; - 1&amp;lt;/code&amp;gt; are provided by the bytes in &amp;lt;var&amp;gt;view.buffer&amp;lt;/var&amp;gt; starting at offset &amp;lt;var&amp;gt;view.byteOffset&amp;lt;/var&amp;gt;.&lt;br /&gt;
#** When accessing the byte in &amp;lt;var&amp;gt;byte stream&amp;lt;/var&amp;gt; at position &amp;lt;code&amp;gt;&amp;lt;var&amp;gt;offset&amp;lt;/var&amp;gt; + &amp;lt;var&amp;gt;view.byteLength&amp;lt;/var&amp;gt;&amp;lt;/code&amp;gt;:&lt;br /&gt;
#*** If the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag is not set, then yield the &#039;&#039;&#039;EOF byte&#039;&#039;&#039;.&lt;br /&gt;
#*** Otherwise, set &amp;lt;var&amp;gt;offset&amp;lt;/var&amp;gt; to &amp;lt;code&amp;gt;&amp;lt;var&amp;gt;offset&amp;lt;/var&amp;gt; + &amp;lt;var&amp;gt;view.byteLength&amp;lt;/var&amp;gt;&amp;lt;/code&amp;gt; and suspend the steps of the decoder algorithm until a subsequent call to &amp;lt;code&amp;gt;decode()&amp;lt;/code&amp;gt;&lt;br /&gt;
#* If encoding is one of &amp;quot;&amp;lt;code&amp;gt;utf-8&amp;lt;/code&amp;gt;&amp;quot;, &amp;quot;&amp;lt;code&amp;gt;utf-16&amp;lt;/code&amp;gt;&amp;quot; or &amp;quot;&amp;lt;code&amp;gt;utf-16be&amp;lt;/code&amp;gt;&amp;quot; and &amp;lt;var&amp;gt;offset&amp;lt;/var&amp;gt; is &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt;, then prior to running the steps of the decoder algorithm:&lt;br /&gt;
#** If the internal &amp;lt;var&amp;gt;useBOM&amp;lt;/var&amp;gt; flag is set, then:&lt;br /&gt;
#*** If less than two bytes are present in the stream and the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag is true, suspend this algorithm and return the empty string.&lt;br /&gt;
#*** If the next two bytes of the stream are &#039;&#039;&#039;0xFF 0xFE&#039;&#039;&#039; then set &amp;lt;var&amp;gt;offset&amp;lt;/var&amp;gt; to &amp;lt;code&amp;gt;2&amp;lt;/code&amp;gt;, and clear the &amp;lt;var&amp;gt;utf-16be&amp;lt;/var&amp;gt; flag of the decoder state.&lt;br /&gt;
#*** If the next two bytes of the stream are &#039;&#039;&#039;0xFE 0xFF&#039;&#039;&#039; then set &amp;lt;var&amp;gt;offset&amp;lt;/var&amp;gt; to &amp;lt;code&amp;gt;2&amp;lt;/code&amp;gt;, and set the &amp;lt;var&amp;gt;utf-16be&amp;lt;/var&amp;gt; flag of the decoder state.&lt;br /&gt;
#** If encoding is &amp;quot;&amp;lt;code&amp;gt;utf-8&amp;lt;/code&amp;gt;&amp;quot;, then:&lt;br /&gt;
#*** If less than three bytes are present in the stream and the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag is true, suspend this algorithm and return the empty string.&lt;br /&gt;
#*** If the next three bytes of the stream are &#039;&#039;&#039;0xEF 0xBB 0xBF&#039;&#039;&#039; then set &amp;lt;var&amp;gt;offset&amp;lt;/var&amp;gt; to &amp;lt;code&amp;gt;3&amp;lt;/code&amp;gt;.&lt;br /&gt;
#** If encoding is &amp;quot;&amp;lt;code&amp;gt;utf-16&amp;lt;/code&amp;gt;&amp;quot;, then:&lt;br /&gt;
#*** If less than two bytes are present in the stream and the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag is true, suspend this algorithm and return the empty string.&lt;br /&gt;
#*** If the next two bytes of the stream are &#039;&#039;&#039;0xFF 0xFE&#039;&#039;&#039; then set &amp;lt;var&amp;gt;offset&amp;lt;/var&amp;gt; to &amp;lt;code&amp;gt;2&amp;lt;/code&amp;gt;.&lt;br /&gt;
#** If encoding is &amp;quot;&amp;lt;code&amp;gt;utf-16be&amp;lt;/code&amp;gt;&amp;quot;, then:&lt;br /&gt;
#*** If less than two bytes are present in the stream and the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag is true, suspend this algorithm and return the empty string.&lt;br /&gt;
#*** If the next two bytes of the stream are &#039;&#039;&#039;0xFE 0xFF&#039;&#039;&#039; then set &amp;lt;var&amp;gt;offset&amp;lt;/var&amp;gt; to &amp;lt;code&amp;gt;2&amp;lt;/code&amp;gt;.&lt;br /&gt;
#* If the internal &amp;lt;var&amp;gt;fatal&amp;lt;/var&amp;gt; flag of the &#039;&#039;decoder object&#039;&#039; is set, then a &#039;&#039;&#039;decoder error&#039;&#039;&#039; causes a &amp;lt;code&amp;gt;DOMException&amp;lt;/code&amp;gt; of type &amp;lt;code&amp;gt;EncodingError&amp;lt;/code&amp;gt; to be thrown rather than emitting a fallback code point.&lt;br /&gt;
#* The output of the algorithm is a &amp;lt;var&amp;gt;sequence of emitted code points&amp;lt;/var&amp;gt;.&lt;br /&gt;
# Return an IDL &amp;lt;code&amp;gt;DOMString&amp;lt;/code&amp;gt; value that represents the sequence of code units resulting from encoding the &amp;lt;var&amp;gt;sequence of emitted code points&amp;lt;/var&amp;gt; as UTF-16.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/dl&amp;gt;&lt;br /&gt;
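&lt;br /&gt;
The streaming, byte order mark and &amp;lt;var&amp;gt;fatal&amp;lt;/var&amp;gt; behaviour described above can be sketched as follows. This assumes a runtime that provides the &amp;lt;code&amp;gt;TextDecoder&amp;lt;/code&amp;gt; later standardized in [http://encoding.spec.whatwg.org/ Encoding]; that version reports fatal decoder errors with a &amp;lt;code&amp;gt;TypeError&amp;lt;/code&amp;gt; rather than the &amp;lt;code&amp;gt;EncodingError&amp;lt;/code&amp;gt; named in this draft, so the sketch catches any exception without checking its type.&lt;br /&gt;
&lt;br /&gt;
```javascript
// Assumption: a global TextDecoder as shipped in current browsers and Node.js.
// U+20AC EURO SIGN is the UTF-8 byte sequence E2 82 AC, split across two chunks.
const firstChunk = new Uint8Array([0xe2, 0x82]);
const secondChunk = new Uint8Array([0xac]);

const decoder = new TextDecoder("utf-8");
// With stream set, the incomplete sequence is held back rather than replaced.
const partial = decoder.decode(firstChunk, { stream: true });
// The final call omits stream, so the decoder flushes and resets its state.
const complete = decoder.decode(secondChunk);

// A leading UTF-8 byte order mark (EF BB BF) is consumed, not emitted.
const withBom = new TextDecoder("utf-8").decode(
  new Uint8Array([0xef, 0xbb, 0xbf, 0x41])
);

// Non-fatal decoding substitutes U+FFFD; fatal decoding throws instead.
const replaced = new TextDecoder("utf-8").decode(new Uint8Array([0xff]));
let threw = false;
try {
  new TextDecoder("utf-8", { fatal: true }).decode(new Uint8Array([0xff]));
} catch (e) {
  threw = true;
}

console.log(JSON.stringify([partial, complete, withBom, replaced, threw]));
```
&lt;br /&gt;
The first call returns the empty string because the two buffered bytes do not yet form a complete character; the retained bytes act as the prefix for the next call.&lt;br /&gt;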
&lt;br /&gt;
== Examples ==&lt;br /&gt;
&lt;br /&gt;
=== Example #1 - encoding strings ===&lt;br /&gt;
&lt;br /&gt;
The following example uses the API to encode an array of strings into an ArrayBuffer. The resulting buffer contains the number of strings (as a Uint32), followed by the length of the first string (as a Uint32), the UTF-8 encoded string data, the length of the second string (as a Uint32), the string data, and so on. The Uint32 values are written big-endian, which is the DataView default.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
function encodeArrayOfStrings(strings, encoding) {&lt;br /&gt;
  var encoder, encoded, len, i, bytes, view, offset;&lt;br /&gt;
&lt;br /&gt;
  // Create a single encoder and reuse it for every string.&lt;br /&gt;
  encoder = new TextEncoder(encoding);&lt;br /&gt;
  encoded = [];&lt;br /&gt;
&lt;br /&gt;
  // First pass: encode each string and total the bytes needed.&lt;br /&gt;
  len = Uint32Array.BYTES_PER_ELEMENT;&lt;br /&gt;
  for (i = 0; i &amp;lt; strings.length; i += 1) {&lt;br /&gt;
    len += Uint32Array.BYTES_PER_ELEMENT;&lt;br /&gt;
    encoded[i] = encoder.encode(strings[i]);&lt;br /&gt;
    len += encoded[i].byteLength;&lt;br /&gt;
  }&lt;br /&gt;
&lt;br /&gt;
  bytes = new Uint8Array(len);&lt;br /&gt;
  view = new DataView(bytes.buffer);&lt;br /&gt;
  offset = 0;&lt;br /&gt;
&lt;br /&gt;
  view.setUint32(offset, strings.length);&lt;br /&gt;
  offset += Uint32Array.BYTES_PER_ELEMENT;&lt;br /&gt;
  for (i = 0; i &amp;lt; encoded.length; i += 1) {&lt;br /&gt;
    len = encoded[i].byteLength;&lt;br /&gt;
    view.setUint32(offset, len);&lt;br /&gt;
    offset += Uint32Array.BYTES_PER_ELEMENT;&lt;br /&gt;
    bytes.set(encoded[i], offset);&lt;br /&gt;
    offset += len;&lt;br /&gt;
  }&lt;br /&gt;
  return bytes.buffer;&lt;br /&gt;
}&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Example #2 - decoding strings ===&lt;br /&gt;
&lt;br /&gt;
The following example decodes an ArrayBuffer containing data encoded in the format produced by the previous example back into an array of strings.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
function decodeArrayOfStrings(buffer, encoding) {&lt;br /&gt;
  var decoder, view, offset, num_strings, strings, i, len;&lt;br /&gt;
&lt;br /&gt;
  decoder = new TextDecoder(encoding);&lt;br /&gt;
  view = new DataView(buffer);&lt;br /&gt;
  offset = 0;&lt;br /&gt;
  strings = [];&lt;br /&gt;
&lt;br /&gt;
  num_strings = view.getUint32(offset);&lt;br /&gt;
  offset += Uint32Array.BYTES_PER_ELEMENT;&lt;br /&gt;
  for (i = 0; i &amp;lt; num_strings; i += 1) {&lt;br /&gt;
    len = view.getUint32(offset);&lt;br /&gt;
    offset += Uint32Array.BYTES_PER_ELEMENT;&lt;br /&gt;
    strings[i] = decoder.decode(&lt;br /&gt;
      new DataView(view.buffer, offset, len));&lt;br /&gt;
    offset += len;&lt;br /&gt;
  }&lt;br /&gt;
  return strings;&lt;br /&gt;
}&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
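&lt;br /&gt;
As a rough check of the length-prefixed format used in the two examples, the following self-contained sketch packs and unpacks a small array in one pass. It assumes the UTF-8-only &amp;lt;code&amp;gt;TextEncoder&amp;lt;/code&amp;gt; and the &amp;lt;code&amp;gt;TextDecoder&amp;lt;/code&amp;gt; that later shipped per [http://encoding.spec.whatwg.org/ Encoding]; the function names &amp;lt;code&amp;gt;packStrings&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;unpackStrings&amp;lt;/code&amp;gt; are hypothetical and are not part of this proposal.&lt;br /&gt;
&lt;br /&gt;
```javascript
// Assumption: global TextEncoder and TextDecoder (UTF-8 only for encoding).
// packStrings and unpackStrings are hypothetical names used for illustration.
function packStrings(strings) {
  const encoder = new TextEncoder();
  const encoded = strings.map((s) => encoder.encode(s));
  // One Uint32 for the count, then one Uint32 length prefix per string.
  const total = encoded.reduce((n, e) => n + 4 + e.byteLength, 4);
  const bytes = new Uint8Array(total);
  const view = new DataView(bytes.buffer);
  let offset = 0;
  view.setUint32(offset, encoded.length); // DataView defaults to big-endian
  offset += 4;
  for (const e of encoded) {
    view.setUint32(offset, e.byteLength);
    offset += 4;
    bytes.set(e, offset);
    offset += e.byteLength;
  }
  return bytes.buffer;
}

function unpackStrings(buffer) {
  const decoder = new TextDecoder("utf-8");
  const view = new DataView(buffer);
  const strings = [];
  let offset = 4;
  // Count down from the stored number of strings.
  for (let remaining = view.getUint32(0); remaining !== 0; remaining -= 1) {
    const len = view.getUint32(offset);
    offset += 4;
    strings.push(decoder.decode(new Uint8Array(buffer, offset, len)));
    offset += len;
  }
  return strings;
}

const input = ["hello", "\u20AC", ""];
const packed = packStrings(input);
const output = unpackStrings(packed);
console.log(packed.byteLength, output);
```
&lt;br /&gt;
One caveat of this sketch: because the decoder consumes a leading byte order mark, a string that begins with U+FEFF would not survive the round trip unchanged.&lt;br /&gt;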
&lt;br /&gt;
== Encodings ==&lt;br /&gt;
&lt;br /&gt;
Encodings are defined and implemented per [http://encoding.spec.whatwg.org/ Encoding]. This implicitly includes the steps to &amp;lt;em&amp;gt;get an encoding&amp;lt;/em&amp;gt; from a string, and the logic for label matching and case-insensitivity.&lt;br /&gt;
&lt;br /&gt;
User agents MUST NOT support any other encodings or labels than those defined in [http://encoding.spec.whatwg.org/ Encoding], and MUST support all encodings and labels defined in that specification, with the additions defined below.&lt;br /&gt;
&lt;br /&gt;
:&#039;&#039;NOTE: In [http://encoding.spec.whatwg.org/ Encoding], &amp;quot;ascii&amp;quot; is a label for &#039;&#039;&#039;windows-1252&#039;&#039;&#039;; there is no 7-bit-or-raise-exception encoding. Applications that are required to restrict the content of decoded strings should implement validation after decoding.&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
:&#039;&#039;NOTE: Unicode normalization forms are outside the scope of this specification. No normalization is done prior to encoding or after decoding.&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
:&#039;&#039;NOTE: Handling of encoding-specific issues, e.g. over-long UTF-8 encodings, byte order marks, unmatched surrogate pairs, and so on is defined by [http://encoding.spec.whatwg.org/ Encoding].&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
== References ==&lt;br /&gt;
&lt;br /&gt;
* WebIDL http://dev.w3.org/2006/webapi/WebIDL&lt;br /&gt;
* Encoding http://encoding.spec.whatwg.org/&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== Acknowledgements ==&lt;br /&gt;
&lt;br /&gt;
* Alan Chaney&lt;br /&gt;
* Ben Noordhuis&lt;br /&gt;
* Glenn Maynard&lt;br /&gt;
* John Tamplin&lt;br /&gt;
* Kenneth Russell (Google, Inc)&lt;br /&gt;
* Robert Mustacchi&lt;br /&gt;
* Ryan Dahl&lt;br /&gt;
* Anne van Kesteren&lt;br /&gt;
* Cameron McCormack&lt;br /&gt;
&lt;br /&gt;
== Appendix ==&lt;br /&gt;
&lt;br /&gt;
A &amp;quot;shim&amp;quot; implementation in ECMAScript (that may not fully match the current version of the spec) plus some initial unit tests can be found at:&lt;br /&gt;
&lt;br /&gt;
:http://code.google.com/p/stringencoding/&lt;br /&gt;
&lt;br /&gt;
[[Category:Proposals]]&lt;/div&gt;</summary>
		<author><name>Jsbell</name></author>
	</entry>
	<entry>
		<id>https://wiki.whatwg.org/index.php?title=StringEncoding&amp;diff=8501</id>
		<title>StringEncoding</title>
		<link rel="alternate" type="text/html" href="https://wiki.whatwg.org/index.php?title=StringEncoding&amp;diff=8501"/>
		<updated>2012-09-21T20:27:32Z</updated>

		<summary type="html">&lt;p&gt;Jsbell: Encoding standard URL update&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;Proposed Text Encoding Web API for Typed Arrays&lt;br /&gt;
&lt;br /&gt;
== Editors ==&lt;br /&gt;
&lt;br /&gt;
* Joshua Bell (Google, Inc)&lt;br /&gt;
&lt;br /&gt;
== Abstract ==&lt;br /&gt;
&lt;br /&gt;
This specification defines an API for encoding strings to binary data, and decoding strings from binary data.&lt;br /&gt;
&lt;br /&gt;
NOTE: This specification intentionally does not address the opposite scenario of encoding binary data as strings and decoding binary data from strings, for example using Base64 encoding.&lt;br /&gt;
&lt;br /&gt;
Discussion on this topic has so far taken place on the [http://www.khronos.org/webgl/public-mailing-list/ public_webgl@khronos.org] mailing list. See http://www.khronos.org/webgl/public-mailing-list/archives/1111/msg00017.html for the initial discussion thread.&lt;br /&gt;
&lt;br /&gt;
Discussion has since moved to the [http://www.whatwg.org/mailing-list#specs WHATWG spec discussion] mailing list. See  http://lists.whatwg.org/htdig.cgi/whatwg-whatwg.org/2012-March/035038.html for the latest discussion thread.&lt;br /&gt;
&lt;br /&gt;
== Open Issues ==&lt;br /&gt;
&lt;br /&gt;
* Should the &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; attribute return the &amp;lt;var&amp;gt;Name&amp;lt;/var&amp;gt; of the encoding or the name that was passed in?&lt;br /&gt;
&lt;br /&gt;
== Notes to Implementers ==&lt;br /&gt;
&lt;br /&gt;
* Streaming decode/encode requires retaining partial buffers between calls.&lt;br /&gt;
** Some encode/decode algorithms require adjusting the &amp;lt;var&amp;gt;code point pointer&amp;lt;/var&amp;gt; or &amp;lt;var&amp;gt;byte pointer&amp;lt;/var&amp;gt; by a negative amount, and this can occur across &amp;quot;chunk&amp;quot; boundaries. When the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag is set on an encoder/decoder, the last N elements of the stream must therefore be saved and used as a prefix for the stream on the next call. N is defined by the specific encoding algorithm.&lt;br /&gt;
** This is not yet implemented in the [http://code.google.com/p/stringencoding/ ECMAScript shim]&lt;br /&gt;
&lt;br /&gt;
== Resolved Issues ==&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dl&amp;gt;&lt;br /&gt;
&amp;lt;dt&amp;gt;Encode as many characters as possible into a fixed-size buffer for transmission, and repeat starting with next unencoded character&lt;br /&gt;
&amp;lt;dd&amp;gt;Resolution: not for &amp;quot;v1&amp;quot; - can be implemented using this API, and breaking the string down by &amp;quot;character&amp;quot; is unlikely to be as obvious as it sounds - surrogate pairs, combining sequences, etc&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dt&amp;gt;Support two versions of encode; one which takes target buffer, and one which creates/returns a right-sized buffer&lt;br /&gt;
&amp;lt;dd&amp;gt;Resolution: See above, and wait for developer feedback.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dt&amp;gt;Allow arbitrary end byte sequences (e.g. &amp;lt;code&amp;gt;0xFF&amp;lt;/code&amp;gt; for UTF-8 strings)&lt;br /&gt;
&amp;lt;dd&amp;gt;Resolution: Add &amp;lt;code&amp;gt;indexOf&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;ArrayBufferView&amp;lt;/code&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dt&amp;gt;Remove &#039;&#039;binary&#039;&#039; encoding? &#039;&#039;(a proposed 8-bit-clean encoding for interop with legacy binary data stored in ECMAScript strings)&#039;&#039;&lt;br /&gt;
&amp;lt;dd&amp;gt;The only real use is with &amp;lt;code&amp;gt;atob()&amp;lt;/code&amp;gt;/&amp;lt;code&amp;gt;btoa()&amp;lt;/code&amp;gt;; a better API would be Base64 directly in/out of Typed Arrays. Consensus on WHATWG is to add better APIs e.g. &lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
partial interface ArrayBufferView {&lt;br /&gt;
    DOMString toBase64();&lt;br /&gt;
};&lt;br /&gt;
&lt;br /&gt;
partial interface ArrayBuffer {&lt;br /&gt;
    static ArrayBuffer fromBase64(DOMString string);&lt;br /&gt;
};&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dt&amp;gt;Should this be a standalone API (as written), or live on e.g. DataView, or on e.g. String?&lt;br /&gt;
&amp;lt;dd&amp;gt;There seems to be pretty strong consensus that streaming, stateful coding is high priority and that an object-oriented API is cleanest, and shoe-horning those onto existing objects would be messy.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dt&amp;gt;Should legacy encodings be supported?&lt;br /&gt;
&amp;lt;dd&amp;gt;Resolution: Consensus on the [http://lists.whatwg.org/htdig.cgi/whatwg-whatwg.org/2012-August/036825.html WHATWG mailing list] - support legacy encodings for decode, and only &#039;utf-8&#039;, &#039;utf-16&#039; and &#039;utf-16be&#039; for encode.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dt&amp;gt;What to do on encoding errors? If non-UTF encodings are supported then we may want to allow substitution (e.g. ASCII &#039;?&#039;) or a script callback (for arbitrary escaping).&lt;br /&gt;
&amp;lt;dd&amp;gt;Resolution: not for &amp;quot;v1&amp;quot; (see above)&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dt&amp;gt;How are byte order marks handled? &lt;br /&gt;
&amp;lt;dd&amp;gt;BOM is respected if and only if the requested coding is a case-insensitive match for &amp;quot;&amp;lt;code&amp;gt;utf-16&amp;lt;/code&amp;gt;&amp;quot;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/dl&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== API ==&lt;br /&gt;
&lt;br /&gt;
=== TextEncoder ===&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;WebIDL&#039;&#039;&#039;&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
dictionary TextEncodeOptions {&lt;br /&gt;
  boolean stream = false;&lt;br /&gt;
};&lt;br /&gt;
&lt;br /&gt;
[Constructor(optional DOMString encoding)]&lt;br /&gt;
interface TextEncoder {&lt;br /&gt;
  readonly attribute DOMString encoding;&lt;br /&gt;
  Uint8Array encode(DOMString? string, optional TextEncodeOptions options);&lt;br /&gt;
};&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The constructor runs the following steps:&lt;br /&gt;
# If the constructor is called with no arguments, let &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt; be &amp;quot;&amp;lt;code&amp;gt;utf-8&amp;lt;/code&amp;gt;&amp;quot;. Otherwise, let &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt; be the value of the &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; argument.&lt;br /&gt;
# Run the &#039;&#039;&#039;steps to get an encoding&#039;&#039;&#039; from [http://encoding.spec.whatwg.org/ Encoding], with &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt; as &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt;.&lt;br /&gt;
#* If the steps result in failure, throw an &amp;quot;&amp;lt;code&amp;gt;EncodingError&amp;lt;/code&amp;gt;&amp;quot; exception and terminate these steps.&lt;br /&gt;
#* Otherwise, if the &amp;lt;var&amp;gt;Name&amp;lt;/var&amp;gt; of the returned encoding is not one of &amp;quot;&amp;lt;code&amp;gt;utf-8&amp;lt;/code&amp;gt;&amp;quot;, &amp;quot;&amp;lt;code&amp;gt;utf-16&amp;lt;/code&amp;gt;&amp;quot;, or &amp;quot;&amp;lt;code&amp;gt;utf-16be&amp;lt;/code&amp;gt;&amp;quot; throw an &amp;quot;&amp;lt;code&amp;gt;EncodingError&amp;lt;/code&amp;gt;&amp;quot; exception and terminate these steps.&lt;br /&gt;
#* Otherwise, set the encoder object&#039;s internal &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; property to the returned encoding.&lt;br /&gt;
# Initialize the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag of the encoder object to false. &lt;br /&gt;
# Initialize the internal &amp;lt;var&amp;gt;encoding algorithm state&amp;lt;/var&amp;gt; to the default values for the encoding &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dl&amp;gt;&lt;br /&gt;
&amp;lt;dt&amp;gt;&amp;lt;code&amp;gt;encoding&amp;lt;/code&amp;gt; of type DOMString, readonly&lt;br /&gt;
&amp;lt;dd&amp;gt;Returns the &amp;lt;var&amp;gt;Name&amp;lt;/var&amp;gt; of the encoder object&#039;s &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;, per [http://encoding.spec.whatwg.org/ Encoding]. &lt;br /&gt;
&lt;br /&gt;
:Note that this may differ from the name of the encoding specified during the call to the constructor. For example, if the constructor is called with &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; of &amp;lt;code&amp;gt;&amp;quot;ascii&amp;quot;&amp;lt;/code&amp;gt; the &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; attribute of the &#039;&#039;encoder object&#039;&#039; would have the value &amp;lt;code&amp;gt;&amp;quot;windows-1252&amp;quot;&amp;lt;/code&amp;gt; as &amp;lt;code&amp;gt;&amp;quot;ascii&amp;quot;&amp;lt;/code&amp;gt; is a label for that encoding.&lt;br /&gt;
&amp;lt;/dl&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dl&amp;gt;&lt;br /&gt;
&amp;lt;dt&amp;gt;&amp;lt;code&amp;gt;encode&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;dd&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The &amp;lt;code&amp;gt;encode&amp;lt;/code&amp;gt; method runs these steps:&lt;br /&gt;
&lt;br /&gt;
# If the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag is not set, then reset the &amp;lt;var&amp;gt;encoding algorithm state&amp;lt;/var&amp;gt; to the default values for &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;. Otherwise, the &amp;lt;var&amp;gt;encoding algorithm state&amp;lt;/var&amp;gt; is re-used from the previous call to &amp;lt;code&amp;gt;encode&amp;lt;/code&amp;gt; on this object.&lt;br /&gt;
# If the &amp;lt;var&amp;gt;options&amp;lt;/var&amp;gt; parameter is specified and the &amp;lt;code&amp;gt;stream&amp;lt;/code&amp;gt; option is &#039;&#039;&#039;true&#039;&#039;&#039;, then the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag is set; otherwise the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag is cleared.&lt;br /&gt;
# Run the steps of the &amp;lt;var&amp;gt;encoding algorithm&amp;lt;/var&amp;gt;:&lt;br /&gt;
#* The input to the algorithm is a &amp;lt;var&amp;gt;stream of code points&amp;lt;/var&amp;gt;. The stream is composed of the Unicode code points for the Unicode characters produced by following the [http://dev.w3.org/2006/webapi/WebIDL/#dfn-obtain-unicode steps to convert a DOMString to a sequence of Unicode characters] in [http://dev.w3.org/2006/webapi/WebIDL/ WebIDL] with &amp;lt;var&amp;gt;string&amp;lt;/var&amp;gt; as the input. If &amp;lt;var&amp;gt;string&amp;lt;/var&amp;gt; is null, the &amp;lt;var&amp;gt;stream of code points&amp;lt;/var&amp;gt; is empty.&lt;br /&gt;
#* If the &amp;lt;var&amp;gt;options&amp;lt;/var&amp;gt; parameter is not specified or the &amp;lt;code&amp;gt;stream&amp;lt;/code&amp;gt; option is false, then after the final code point is yielded by the stream, the &#039;&#039;&#039;EOF code point&#039;&#039;&#039; is yielded.&lt;br /&gt;
#* The output of the algorithm is a &amp;lt;var&amp;gt;sequence of emitted bytes&amp;lt;/var&amp;gt;.&lt;br /&gt;
# Return a &amp;lt;code&amp;gt;Uint8Array&amp;lt;/code&amp;gt; object wrapping an &amp;lt;code&amp;gt;ArrayBuffer&amp;lt;/code&amp;gt; containing the &amp;lt;var&amp;gt;sequence of emitted bytes&amp;lt;/var&amp;gt; produced by the encoding algorithm.&lt;br /&gt;
&lt;br /&gt;
:&#039;&#039;NOTE: Because only UTF encodings are supported, and because of the use of the [http://dev.w3.org/2006/webapi/WebIDL/#dfn-obtain-unicode steps to convert a DOMString to a sequence of Unicode characters] and thus Unicode code points, no input can cause the encoding process to emit an encoder error.&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/dl&amp;gt;&lt;br /&gt;
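&lt;br /&gt;
The encoder behavior described above can be sketched as follows. This is a non-normative illustration that assumes a runtime providing &amp;lt;code&amp;gt;TextEncoder&amp;lt;/code&amp;gt;; the helper &amp;lt;code&amp;gt;toHex&amp;lt;/code&amp;gt; is hypothetical and exists only to make the emitted bytes readable.&lt;br /&gt;
&lt;br /&gt;
```javascript
// Non-normative sketch, assuming a runtime that provides TextEncoder.
const encoder = new TextEncoder(); // no argument: the label defaults to "utf-8"
const encodingName = encoder.encoding;

// U+00E9 encodes to the two bytes 0xC3 0xA9 in UTF-8.
const accented = encoder.encode("h\u00E9");

// A lone surrogate is not a Unicode character; the string-to-code-point
// conversion replaces it with U+FFFD (0xEF 0xBF 0xBD) before encoding,
// so the encoder itself never sees invalid input.
const lone = encoder.encode("\uD800");

// Hypothetical helper for display: format bytes as lowercase hex.
function toHex(bytes) {
  return Array.from(bytes, (b) => b.toString(16).padStart(2, "0")).join(" ");
}
```
&lt;br /&gt;
With these inputs, &amp;lt;code&amp;gt;toHex(accented)&amp;lt;/code&amp;gt; gives &amp;lt;code&amp;gt;68 c3 a9&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;toHex(lone)&amp;lt;/code&amp;gt; gives &amp;lt;code&amp;gt;ef bf bd&amp;lt;/code&amp;gt;, which is why no input string can trigger an encoder error.&lt;br /&gt;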
&lt;br /&gt;
=== TextDecoder ===&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;WebIDL&#039;&#039;&#039;&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
dictionary TextDecoderOptions {&lt;br /&gt;
  boolean fatal = false;&lt;br /&gt;
};&lt;br /&gt;
&lt;br /&gt;
dictionary TextDecodeOptions {&lt;br /&gt;
  boolean stream = false;&lt;br /&gt;
};&lt;br /&gt;
&lt;br /&gt;
[Constructor(optional DOMString encoding, optional TextDecoderOptions options)]&lt;br /&gt;
interface TextDecoder {&lt;br /&gt;
  readonly attribute DOMString encoding;&lt;br /&gt;
  DOMString decode(optional ArrayBufferView view, optional TextDecodeOptions options);&lt;br /&gt;
};&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The constructor runs the following steps:&lt;br /&gt;
# The constructor creates a &#039;&#039;decoder object&#039;&#039;. It has the internal properties &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;, a &amp;lt;var&amp;gt;fatal&amp;lt;/var&amp;gt; flag which is initially unset,  a &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag which is initially unset, an &amp;lt;var&amp;gt;offset&amp;lt;/var&amp;gt; pointer which is initially &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt;, and a &amp;lt;var&amp;gt;useBOM&amp;lt;/var&amp;gt; flag which is initially unset.&lt;br /&gt;
# If called without an &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; argument, let &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt; be &amp;quot;&amp;lt;code&amp;gt;utf-8&amp;lt;/code&amp;gt;&amp;quot;. Otherwise, let &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt; be the value of the &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; argument.&lt;br /&gt;
# If &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt; is a case-insensitive match for &amp;quot;&amp;lt;code&amp;gt;utf-16&amp;lt;/code&amp;gt;&amp;quot; then set the internal &amp;lt;var&amp;gt;useBOM&amp;lt;/var&amp;gt; flag.&lt;br /&gt;
# Run the &#039;&#039;&#039;steps to get an encoding&#039;&#039;&#039; from [http://encoding.spec.whatwg.org/ Encoding], with &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt; as &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt;. &lt;br /&gt;
## If the steps result in failure, throw an &amp;quot;&amp;lt;code&amp;gt;EncodingError&amp;lt;/code&amp;gt;&amp;quot; exception and terminate these steps.&lt;br /&gt;
## Otherwise, set the &#039;&#039;decoder object&#039;s&#039;&#039; internal &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; property to the returned encoding.&lt;br /&gt;
# If the constructor is called with an &amp;lt;var&amp;gt;options&amp;lt;/var&amp;gt; argument, and the &amp;lt;var&amp;gt;fatal&amp;lt;/var&amp;gt; property of the dictionary is set, set the internal &amp;lt;var&amp;gt;fatal&amp;lt;/var&amp;gt; flag of the &#039;&#039;decoder object&#039;&#039;.&lt;br /&gt;
# Initialize the internal &amp;lt;var&amp;gt;encoding algorithm state&amp;lt;/var&amp;gt; to the default values for the encoding &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;.&lt;br /&gt;
# Return the &#039;&#039;decoder object&#039;&#039;.&lt;br /&gt;
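&lt;br /&gt;
The constructor steps above can be sketched as follows. This is a non-normative illustration that assumes a runtime providing &amp;lt;code&amp;gt;TextDecoder&amp;lt;/code&amp;gt;; the helper name &amp;lt;code&amp;gt;resolveEncodingName&amp;lt;/code&amp;gt; is hypothetical. The sketch catches any exception rather than testing for a specific type, since the exception thrown for an unknown label may differ between implementations.&lt;br /&gt;
&lt;br /&gt;
```javascript
// Non-normative sketch, assuming a runtime that provides TextDecoder.
// resolveEncodingName is a hypothetical helper: it returns the encoding
// name for a label, or null if the constructor rejects the label.
function resolveEncodingName(label) {
  try {
    return new TextDecoder(label).encoding;
  } catch (e) {
    return null;
  }
}

// With no argument the label defaults to "utf-8".
const defaultName = new TextDecoder().encoding;

// Labels are matched case-insensitively.
const upperName = resolveEncodingName("UTF-8");

// An unrecognized label causes the constructor to throw.
const unknownName = resolveEncodingName("no-such-encoding");
```
&lt;br /&gt;
On an implementation that supports the legacy encodings, &amp;lt;code&amp;gt;resolveEncodingName(&amp;quot;ascii&amp;quot;)&amp;lt;/code&amp;gt; would be expected to return &amp;lt;code&amp;gt;&amp;quot;windows-1252&amp;quot;&amp;lt;/code&amp;gt;, since the attribute reports the name of the encoding rather than the label passed in.&lt;br /&gt;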
&lt;br /&gt;
&amp;lt;dl&amp;gt;&lt;br /&gt;
&amp;lt;dt&amp;gt;&amp;lt;code&amp;gt;encoding&amp;lt;/code&amp;gt; of type DOMString, readonly&lt;br /&gt;
&amp;lt;dd&amp;gt;&lt;br /&gt;
Returns the &amp;lt;var&amp;gt;Name&amp;lt;/var&amp;gt; of the decoder object&#039;s &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;, per [http://encoding.spec.whatwg.org/ Encoding].&lt;br /&gt;
&lt;br /&gt;
:Note that this may differ from the name of the encoding specified during the call to the constructor. For example, if the constructor is called with &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; of &amp;lt;code&amp;gt;&amp;quot;ascii&amp;quot;&amp;lt;/code&amp;gt; the &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; attribute of the &#039;&#039;decoder object&#039;&#039; would have the value &amp;lt;code&amp;gt;&amp;quot;windows-1252&amp;quot;&amp;lt;/code&amp;gt; as &amp;lt;code&amp;gt;&amp;quot;ascii&amp;quot;&amp;lt;/code&amp;gt; is a label for that encoding.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dt&amp;gt;&amp;lt;code&amp;gt;decode&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;dd&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The &amp;lt;code&amp;gt;decode&amp;lt;/code&amp;gt; method runs the following steps:&lt;br /&gt;
&lt;br /&gt;
# If &amp;lt;var&amp;gt;view&amp;lt;/var&amp;gt; is not specified, let &amp;lt;var&amp;gt;view&amp;lt;/var&amp;gt; be a &amp;lt;code&amp;gt;Uint8Array&amp;lt;/code&amp;gt; of length &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt;.&lt;br /&gt;
# If the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag of the &#039;&#039;decoder object&#039;&#039; is not set, then reset the &amp;lt;var&amp;gt;encoding algorithm state&amp;lt;/var&amp;gt; to the default values for encoding &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; and set &amp;lt;var&amp;gt;offset&amp;lt;/var&amp;gt; to &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt;. Otherwise, the &amp;lt;var&amp;gt;encoding algorithm state&amp;lt;/var&amp;gt; is re-used from the previous call to &amp;lt;code&amp;gt;decode&amp;lt;/code&amp;gt; on this object.&lt;br /&gt;
# If the &amp;lt;var&amp;gt;options&amp;lt;/var&amp;gt; parameter is specified and the &amp;lt;var&amp;gt;stream&amp;lt;/var&amp;gt; option is &#039;&#039;&#039;true&#039;&#039;&#039;, then the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag is set. Otherwise the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag is cleared.&lt;br /&gt;
# Run or resume the decoder algorithm of the &#039;&#039;decoder object&#039;s&#039;&#039; encoding, with the following additions:&lt;br /&gt;
#* The input to the algorithm is a &amp;lt;var&amp;gt;byte stream&amp;lt;/var&amp;gt;.&lt;br /&gt;
#** If &amp;lt;var&amp;gt;offset&amp;lt;/var&amp;gt; is greater than &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt;, the bytes in &amp;lt;var&amp;gt;byte stream&amp;lt;/var&amp;gt; for positions less than &amp;lt;var&amp;gt;offset&amp;lt;/var&amp;gt; are provided by buffer(s) passed in to previous calls to &amp;lt;code&amp;gt;decode()&amp;lt;/code&amp;gt;.&lt;br /&gt;
#** The bytes in &amp;lt;var&amp;gt;byte stream&amp;lt;/var&amp;gt; for positions &amp;lt;var&amp;gt;offset&amp;lt;/var&amp;gt; through &amp;lt;code&amp;gt;&amp;lt;var&amp;gt;offset&amp;lt;/var&amp;gt; + &amp;lt;var&amp;gt;view.byteLength&amp;lt;/var&amp;gt; - 1&amp;lt;/code&amp;gt; are provided by the bytes in &amp;lt;var&amp;gt;view.buffer&amp;lt;/var&amp;gt; starting at offset &amp;lt;var&amp;gt;view.byteOffset&amp;lt;/var&amp;gt;.&lt;br /&gt;
#** When accessing the byte in &amp;lt;var&amp;gt;byte stream&amp;lt;/var&amp;gt; at position &amp;lt;code&amp;gt;&amp;lt;var&amp;gt;offset&amp;lt;/var&amp;gt; + &amp;lt;var&amp;gt;view.byteLength&amp;lt;/var&amp;gt;&amp;lt;/code&amp;gt;:&lt;br /&gt;
#*** If the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag is not set, then yield the &#039;&#039;&#039;EOF byte&#039;&#039;&#039;.&lt;br /&gt;
#*** Otherwise, set &amp;lt;var&amp;gt;offset&amp;lt;/var&amp;gt; to &amp;lt;code&amp;gt;&amp;lt;var&amp;gt;offset&amp;lt;/var&amp;gt; + &amp;lt;var&amp;gt;view.byteLength&amp;lt;/var&amp;gt;&amp;lt;/code&amp;gt; and suspend the steps of the decoder algorithm until a subsequent call to &amp;lt;code&amp;gt;decode()&amp;lt;/code&amp;gt;&lt;br /&gt;
#* If encoding is one of &amp;quot;&amp;lt;code&amp;gt;utf-8&amp;lt;/code&amp;gt;&amp;quot;, &amp;quot;&amp;lt;code&amp;gt;utf-16&amp;lt;/code&amp;gt;&amp;quot; or &amp;quot;&amp;lt;code&amp;gt;utf-16be&amp;lt;/code&amp;gt;&amp;quot; and &amp;lt;var&amp;gt;offset&amp;lt;/var&amp;gt; is &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt;, then prior to running the steps of the decoder algorithm:&lt;br /&gt;
#** If the internal &amp;lt;var&amp;gt;useBOM&amp;lt;/var&amp;gt; flag is set, then:&lt;br /&gt;
#*** If less than two bytes are present in the stream and the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag is true, suspend this algorithm and return the empty string.&lt;br /&gt;
#*** If the next two bytes of the stream are &#039;&#039;&#039;0xFF 0xFE&#039;&#039;&#039; then set &amp;lt;var&amp;gt;offset&amp;lt;/var&amp;gt; to &amp;lt;code&amp;gt;2&amp;lt;/code&amp;gt;, and clear the &amp;lt;var&amp;gt;utf-16be&amp;lt;/var&amp;gt; flag of the decoder state.&lt;br /&gt;
#*** If the next two bytes of the stream are &#039;&#039;&#039;0xFE 0xFF&#039;&#039;&#039; then set &amp;lt;var&amp;gt;offset&amp;lt;/var&amp;gt; to &amp;lt;code&amp;gt;2&amp;lt;/code&amp;gt;, and set the &amp;lt;var&amp;gt;utf-16be&amp;lt;/var&amp;gt; flag of the decoder state.&lt;br /&gt;
#** If encoding is &amp;quot;&amp;lt;code&amp;gt;utf-8&amp;lt;/code&amp;gt;&amp;quot;, then:&lt;br /&gt;
#*** If less than three bytes are present in the stream and the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag is true, suspend this algorithm and return the empty string.&lt;br /&gt;
#*** If the next three bytes of the stream are &#039;&#039;&#039;0xEF 0xBB 0xBF&#039;&#039;&#039; then set &amp;lt;var&amp;gt;offset&amp;lt;/var&amp;gt; to &amp;lt;code&amp;gt;3&amp;lt;/code&amp;gt;.&lt;br /&gt;
#** If encoding is &amp;quot;&amp;lt;code&amp;gt;utf-16&amp;lt;/code&amp;gt;&amp;quot;, then:&lt;br /&gt;
#*** If less than two bytes are present in the stream and the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag is true, suspend this algorithm and return the empty string.&lt;br /&gt;
#*** If the next two bytes of the stream are &#039;&#039;&#039;0xFF 0xFE&#039;&#039;&#039; then set &amp;lt;var&amp;gt;offset&amp;lt;/var&amp;gt; to &amp;lt;code&amp;gt;2&amp;lt;/code&amp;gt;.&lt;br /&gt;
#** If encoding is &amp;quot;&amp;lt;code&amp;gt;utf-16be&amp;lt;/code&amp;gt;&amp;quot;, then:&lt;br /&gt;
#*** If less than two bytes are present in the stream and the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag is true, suspend this algorithm and return the empty string.&lt;br /&gt;
#*** If the next two bytes of the stream are &#039;&#039;&#039;0xFE 0xFF&#039;&#039;&#039; then set &amp;lt;var&amp;gt;offset&amp;lt;/var&amp;gt; to &amp;lt;code&amp;gt;2&amp;lt;/code&amp;gt;.&lt;br /&gt;
#* If the internal &amp;lt;var&amp;gt;fatal&amp;lt;/var&amp;gt; flag of the &#039;&#039;decoder object&#039;&#039; is set, then a &#039;&#039;&#039;decoder error&#039;&#039;&#039; causes a &amp;lt;code&amp;gt;DOMException&amp;lt;/code&amp;gt; of type &amp;lt;code&amp;gt;EncodingError&amp;lt;/code&amp;gt; to be thrown rather than emitting a fallback code point.&lt;br /&gt;
#* The output of the algorithm is a &amp;lt;var&amp;gt;sequence of emitted code points&amp;lt;/var&amp;gt;.&lt;br /&gt;
# Return an IDL &amp;lt;code&amp;gt;DOMString&amp;lt;/code&amp;gt; value that represents the sequence of code units resulting from encoding the &amp;lt;var&amp;gt;sequence of emitted code points&amp;lt;/var&amp;gt; as UTF-16.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/dl&amp;gt;&lt;br /&gt;
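&lt;br /&gt;
The byte order mark and &amp;lt;var&amp;gt;fatal&amp;lt;/var&amp;gt; flag handling in the &amp;lt;code&amp;gt;decode&amp;lt;/code&amp;gt; steps above can be sketched as follows. This is a non-normative illustration that assumes a runtime providing &amp;lt;code&amp;gt;TextDecoder&amp;lt;/code&amp;gt; with the &amp;lt;code&amp;gt;fatal&amp;lt;/code&amp;gt; option. It catches any exception rather than a specific type, since the exception thrown may differ between implementations.&lt;br /&gt;
&lt;br /&gt;
```javascript
// Non-normative sketch, assuming a runtime that provides TextDecoder.

// A leading UTF-8 byte order mark (0xEF 0xBB 0xBF) is skipped, not decoded.
const bomBytes = new Uint8Array([0xEF, 0xBB, 0xBF, 0x41]);
const withoutBom = new TextDecoder("utf-8").decode(bomBytes);

// 0xFF never appears in well-formed UTF-8, so it triggers a decoder error.
const invalid = new Uint8Array([0x41, 0xFF]);

// fatal flag unset (the default): a fallback code point (U+FFFD) is emitted.
const lenient = new TextDecoder("utf-8").decode(invalid);

// fatal flag set: the decoder error surfaces as an exception instead.
let threw = false;
try {
  new TextDecoder("utf-8", { fatal: true }).decode(invalid);
} catch (e) {
  threw = true;
}
```
&lt;br /&gt;
Applications that need to detect corrupt input would set &amp;lt;code&amp;gt;fatal&amp;lt;/code&amp;gt;; applications that prefer best-effort output would leave it unset and accept the fallback code point.&lt;br /&gt;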
&lt;br /&gt;
== Examples ==&lt;br /&gt;
&lt;br /&gt;
=== Example #1 - encoding strings ===&lt;br /&gt;
&lt;br /&gt;
The following example uses the API to encode an array of strings into an ArrayBuffer. The result is an ArrayBuffer containing the number of strings (as a Uint32), followed by the length of the first string (as a Uint32), the UTF-8 encoded string data, the length of the second string (as a Uint32), the string data, and so on. Each Uint32 is written in big-endian byte order, the DataView default.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
function encodeArrayOfStrings(strings, encoding) {&lt;br /&gt;
  var encoder, encoded, len, i, bytes, view, offset;&lt;br /&gt;
&lt;br /&gt;
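  // The interface is a constructor, so it is invoked with new.&lt;br /&gt;
  encoder = new TextEncoder(encoding);&lt;br /&gt;
  encoded = [];&lt;br /&gt;
&lt;br /&gt;
  // Reserve space for the string count, then for each length prefix and payload.&lt;br /&gt;
  len = Uint32Array.BYTES_PER_ELEMENT;&lt;br /&gt;
  for (i = 0; i &amp;lt; strings.length; i += 1) {&lt;br /&gt;
    len += Uint32Array.BYTES_PER_ELEMENT;&lt;br /&gt;
    // Reuse the single encoder created above for every string.&lt;br /&gt;
    encoded[i] = encoder.encode(strings[i]);&lt;br /&gt;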
    len += encoded[i].byteLength;&lt;br /&gt;
  }&lt;br /&gt;
&lt;br /&gt;
  bytes = new Uint8Array(len);&lt;br /&gt;
  view = new DataView(bytes.buffer);&lt;br /&gt;
  offset = 0;&lt;br /&gt;
&lt;br /&gt;
  view.setUint32(offset, strings.length);&lt;br /&gt;
  offset += Uint32Array.BYTES_PER_ELEMENT;&lt;br /&gt;
  for (i = 0; i &amp;lt; encoded.length; i += 1) {&lt;br /&gt;
    len = encoded[i].byteLength;&lt;br /&gt;
    view.setUint32(offset, len);&lt;br /&gt;
    offset += Uint32Array.BYTES_PER_ELEMENT;&lt;br /&gt;
    bytes.set(encoded[i], offset);&lt;br /&gt;
    offset += len;&lt;br /&gt;
  }&lt;br /&gt;
  return bytes.buffer;&lt;br /&gt;
}&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Example #2 - decoding strings ===&lt;br /&gt;
&lt;br /&gt;
The following example decodes an ArrayBuffer containing data encoded in the format produced by the previous example back into an array of strings.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
function decodeArrayOfStrings(buffer, encoding) {&lt;br /&gt;
  var decoder, view, offset, num_strings, strings, i, len;&lt;br /&gt;
&lt;br /&gt;
  decoder = new TextDecoder(encoding);&lt;br /&gt;
  view = new DataView(buffer);&lt;br /&gt;
  offset = 0;&lt;br /&gt;
  strings = [];&lt;br /&gt;
&lt;br /&gt;
  num_strings = view.getUint32(offset);&lt;br /&gt;
  offset += Uint32Array.BYTES_PER_ELEMENT;&lt;br /&gt;
  for (i = 0; i &amp;lt; num_strings; i += 1) {&lt;br /&gt;
    len = view.getUint32(offset);&lt;br /&gt;
    offset += Uint32Array.BYTES_PER_ELEMENT;&lt;br /&gt;
    strings[i] = decoder.decode(&lt;br /&gt;
      new DataView(view.buffer, offset, len));&lt;br /&gt;
    offset += len;&lt;br /&gt;
  }&lt;br /&gt;
  return strings;&lt;br /&gt;
}&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Encodings ==&lt;br /&gt;
&lt;br /&gt;
Encodings are defined and implemented per [http://encoding.spec.whatwg.org/ Encoding]. This implicitly includes the steps to &amp;lt;em&amp;gt;get an encoding&amp;lt;/em&amp;gt; from a string, and the logic for label matching and case-insensitivity.&lt;br /&gt;
&lt;br /&gt;
User agents MUST NOT support any other encodings or labels than those defined in [http://encoding.spec.whatwg.org/ Encoding], and MUST support all encodings and labels defined in that specification, with the additions defined below.&lt;br /&gt;
&lt;br /&gt;
:&#039;&#039;NOTE: In [http://encoding.spec.whatwg.org/ Encoding], &amp;quot;ascii&amp;quot; is a label for &#039;&#039;&#039;windows-1252&#039;&#039;&#039;; there is no 7-bit-or-raise-exception encoding. Applications that are required to restrict the content of decoded strings should implement validation after decoding.&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
:&#039;&#039;NOTE: Unicode normalization forms are outside the scope of this specification. No normalization is done prior to encoding or after decoding.&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
:&#039;&#039;NOTE: Handling of encoding-specific issues, e.g. over-long UTF-8 encodings, byte order marks, unmatched surrogate pairs, and so on is defined by [http://encoding.spec.whatwg.org/ Encoding].&#039;&#039;&lt;br /&gt;
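&lt;br /&gt;
The first note above suggests validating after decoding when an application needs 7-bit-only content. A minimal, non-normative sketch of that approach follows. The helper names &amp;lt;code&amp;gt;isSevenBitAscii&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;decodeAsciiOnly&amp;lt;/code&amp;gt; are hypothetical and not part of this proposal, and a runtime providing &amp;lt;code&amp;gt;TextDecoder&amp;lt;/code&amp;gt; is assumed.&lt;br /&gt;
&lt;br /&gt;
```javascript
// Non-normative sketch: validate decoded text instead of relying on a
// 7-bit-or-raise-exception encoding. Both helper names are hypothetical.
function isSevenBitAscii(text) {
  // True only if every code unit is in the range U+0000 to U+007F.
  return /^[\x00-\x7F]*$/.test(text);
}

function decodeAsciiOnly(view, label) {
  const text = new TextDecoder(label).decode(view);
  if (!isSevenBitAscii(text)) {
    throw new Error("decoded text contains non-ASCII characters");
  }
  return text;
}

const plainOk = isSevenBitAscii("hello");
const accentedOk = isSevenBitAscii("h\u00E9llo");
const greeting = decodeAsciiOnly(new Uint8Array([0x68, 0x69]), "utf-8");
```
&lt;br /&gt;
The check runs on the decoded string, so it works the same way regardless of which encoding the label resolves to.&lt;br /&gt;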
&lt;br /&gt;
== References ==&lt;br /&gt;
&lt;br /&gt;
* WebIDL http://dev.w3.org/2006/webapi/WebIDL&lt;br /&gt;
* Encoding http://encoding.spec.whatwg.org/&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== Acknowledgements ==&lt;br /&gt;
&lt;br /&gt;
* Alan Chaney&lt;br /&gt;
* Ben Noordhuis&lt;br /&gt;
* Glenn Maynard&lt;br /&gt;
* John Tamplin&lt;br /&gt;
* Kenneth Russell (Google, Inc)&lt;br /&gt;
* Robert Mustacchi&lt;br /&gt;
* Ryan Dahl&lt;br /&gt;
* Anne van Kesteren&lt;br /&gt;
* Cameron McCormack&lt;br /&gt;
&lt;br /&gt;
== Appendix ==&lt;br /&gt;
&lt;br /&gt;
A &amp;quot;shim&amp;quot; implementation in ECMAScript (that may not fully match the current version of the spec) plus some initial unit tests can be found at:&lt;br /&gt;
&lt;br /&gt;
:http://code.google.com/p/stringencoding/&lt;br /&gt;
&lt;br /&gt;
[[Category:Proposals]]&lt;/div&gt;</summary>
		<author><name>Jsbell</name></author>
	</entry>
	<entry>
		<id>https://wiki.whatwg.org/index.php?title=StringEncoding&amp;diff=8491</id>
		<title>StringEncoding</title>
		<link rel="alternate" type="text/html" href="https://wiki.whatwg.org/index.php?title=StringEncoding&amp;diff=8491"/>
		<updated>2012-09-17T21:06:11Z</updated>

		<summary type="html">&lt;p&gt;Jsbell: HTML typo&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;Proposed Text Encoding Web API for Typed Arrays&lt;br /&gt;
&lt;br /&gt;
== Editors ==&lt;br /&gt;
&lt;br /&gt;
* Joshua Bell (Google, Inc)&lt;br /&gt;
&lt;br /&gt;
== Abstract ==&lt;br /&gt;
&lt;br /&gt;
This specification defines an API for encoding strings to binary data, and decoding strings from binary data.&lt;br /&gt;
&lt;br /&gt;
NOTE: This specification intentionally does not address the opposite scenario of encoding binary data as strings and decoding binary data from strings, for example using Base64 encoding.&lt;br /&gt;
&lt;br /&gt;
Discussion on this topic has so far taken place on the [http://www.khronos.org/webgl/public-mailing-list/ public_webgl@khronos.org] mailing list. See http://www.khronos.org/webgl/public-mailing-list/archives/1111/msg00017.html for the initial discussion thread.&lt;br /&gt;
&lt;br /&gt;
Discussion has since moved to the [http://www.whatwg.org/mailing-list#specs WHATWG spec discussion] mailing list. See  http://lists.whatwg.org/htdig.cgi/whatwg-whatwg.org/2012-March/035038.html for the latest discussion thread.&lt;br /&gt;
&lt;br /&gt;
== Open Issues ==&lt;br /&gt;
&lt;br /&gt;
* Should the &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; attribute return the &amp;lt;var&amp;gt;Name&amp;lt;/var&amp;gt; of the encoding or the name that was passed in?&lt;br /&gt;
&lt;br /&gt;
== Notes to Implementers ==&lt;br /&gt;
&lt;br /&gt;
* Streaming decode/encode requires retaining partial buffers between calls.&lt;br /&gt;
** Some encode/decode algorithms require adjusting the &amp;lt;var&amp;gt;code point pointer&amp;lt;/var&amp;gt; or &amp;lt;var&amp;gt;byte pointer&amp;lt;/var&amp;gt; by a negative amount. This could occur across &amp;quot;chunk&amp;quot; boundaries, which implies that when the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag is set on an encoder/decoder, the last N elements of the stream are saved and used as a prefix for the stream on the next call. N is defined by the specific encoding algorithm.&lt;br /&gt;
** This is not yet implemented in the [http://code.google.com/p/stringencoding/ ECMAScript shim]&lt;br /&gt;
&lt;br /&gt;
== Resolved Issues ==&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dl&amp;gt;&lt;br /&gt;
&amp;lt;dt&amp;gt;Encode as many characters as possible into a fixed-size buffer for transmission, and repeat starting with next unencoded character&lt;br /&gt;
&amp;lt;dd&amp;gt;Resolution: not for &amp;quot;v1&amp;quot; - can be implemented using this API, and breaking the string down by &amp;quot;character&amp;quot; is unlikely to be as obvious as it sounds - surrogate pairs, combining sequences, etc&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dt&amp;gt;Support two versions of encode; one which takes target buffer, and one which creates/returns a right-sized buffer&lt;br /&gt;
&amp;lt;dd&amp;gt;Resolution: See above, and wait for developer feedback.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dt&amp;gt;Allow arbitrary end byte sequences (e.g. &amp;lt;code&amp;gt;0xFF&amp;lt;/code&amp;gt; for UTF-8 strings)&lt;br /&gt;
&amp;lt;dd&amp;gt;Resolution: Add &amp;lt;code&amp;gt;indexOf&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;ArrayBufferView&amp;lt;/code&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dt&amp;gt;Remove &#039;&#039;binary&#039;&#039; encoding? &#039;&#039;(a proposed 8-bit-clean encoding for interop with legacy binary data stored in ECMAScript strings)&#039;&#039;&lt;br /&gt;
&amp;lt;dd&amp;gt;The only real use is with &amp;lt;code&amp;gt;atob()&amp;lt;/code&amp;gt;/&amp;lt;code&amp;gt;btoa()&amp;lt;/code&amp;gt;; a better API would be Base64 directly in/out of Typed Arrays. Consensus on WHATWG is to add better APIs e.g. &lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
partial interface ArrayBufferView {&lt;br /&gt;
    DOMString toBase64();&lt;br /&gt;
};&lt;br /&gt;
&lt;br /&gt;
partial interface ArrayBuffer {&lt;br /&gt;
    static ArrayBuffer fromBase64(DOMString string);&lt;br /&gt;
};&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dt&amp;gt;Should this be a standalone API (as written), or live on e.g. DataView, or on e.g. String?&lt;br /&gt;
&amp;lt;dd&amp;gt;There seems to be pretty strong consensus that streaming, stateful coding is high priority and that an object-oriented API is cleanest, and shoe-horning those onto existing objects would be messy.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dt&amp;gt;Should legacy encodings be supported?&lt;br /&gt;
&amp;lt;dd&amp;gt;Resolution: Consensus on the [http://lists.whatwg.org/htdig.cgi/whatwg-whatwg.org/2012-August/036825.html WHATWG mailing list] - support legacy encodings for decode, and only &#039;utf-8&#039;, &#039;utf-16&#039; and &#039;utf-16be&#039; for encode.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dt&amp;gt;What to do on encoding errors? If non-UTF encodings are supported then we may want to allow substitution (e.g. ASCII &#039;?&#039;) or a script callback (for arbitrary escaping).&lt;br /&gt;
&amp;lt;dd&amp;gt;Resolution: not for &amp;quot;v1&amp;quot; (see above)&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dt&amp;gt;How are byte order marks handled? &lt;br /&gt;
&amp;lt;dd&amp;gt;BOM is respected if and only if the requested coding is a case-insensitive match for &amp;quot;&amp;lt;code&amp;gt;utf-16&amp;lt;/code&amp;gt;&amp;quot;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/dl&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== API ==&lt;br /&gt;
&lt;br /&gt;
=== TextEncoder ===&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;WebIDL&#039;&#039;&#039;&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
dictionary TextEncodeOptions {&lt;br /&gt;
  boolean stream = false;&lt;br /&gt;
};&lt;br /&gt;
&lt;br /&gt;
[Constructor(optional DOMString encoding)]&lt;br /&gt;
interface TextEncoder {&lt;br /&gt;
  readonly attribute DOMString encoding;&lt;br /&gt;
  Uint8Array encode(DOMString? string, optional TextEncodeOptions options);&lt;br /&gt;
};&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The constructor runs the following steps:&lt;br /&gt;
# If the constructor is called with no arguments, let &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt; be &amp;quot;&amp;lt;code&amp;gt;utf-8&amp;lt;/code&amp;gt;&amp;quot;. Otherwise, let &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt; be the value of the &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; argument.&lt;br /&gt;
# Run the &#039;&#039;&#039;steps to get an encoding&#039;&#039;&#039; from [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding], with &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt; as &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt;.&lt;br /&gt;
#* If the steps result in failure, throw an &amp;quot;&amp;lt;code&amp;gt;EncodingError&amp;lt;/code&amp;gt;&amp;quot; exception and terminate these steps.&lt;br /&gt;
#* Otherwise, if the &amp;lt;var&amp;gt;Name&amp;lt;/var&amp;gt; of the returned encoding is not one of &amp;quot;&amp;lt;code&amp;gt;utf-8&amp;lt;/code&amp;gt;&amp;quot;, &amp;quot;&amp;lt;code&amp;gt;utf-16&amp;lt;/code&amp;gt;&amp;quot;, or &amp;quot;&amp;lt;code&amp;gt;utf-16be&amp;lt;/code&amp;gt;&amp;quot; throw an &amp;quot;&amp;lt;code&amp;gt;EncodingError&amp;lt;/code&amp;gt;&amp;quot; exception and terminate these steps.&lt;br /&gt;
#* Otherwise, set the encoder object&#039;s internal &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; property to the returned encoding.&lt;br /&gt;
# Initialize the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag of the encoder object to false. &lt;br /&gt;
# Initialize the internal &amp;lt;var&amp;gt;encoding algorithm state&amp;lt;/var&amp;gt; to the default values for the encoding &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dl&amp;gt;&lt;br /&gt;
&amp;lt;dt&amp;gt;&amp;lt;code&amp;gt;encoding&amp;lt;/code&amp;gt; of type DOMString, readonly&lt;br /&gt;
&amp;lt;dd&amp;gt;Returns the &amp;lt;var&amp;gt;Name&amp;lt;/var&amp;gt; of the encoder object&#039;s &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;, per [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding]. &lt;br /&gt;
&lt;br /&gt;
:Note that this may differ from the name of the encoding specified during the call to the constructor. For example, if the constructor is called with &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; of &amp;lt;code&amp;gt;&amp;quot;utf8&amp;quot;&amp;lt;/code&amp;gt; the &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; attribute of the &#039;&#039;encoder object&#039;&#039; would have the value &amp;lt;code&amp;gt;&amp;quot;utf-8&amp;quot;&amp;lt;/code&amp;gt; as &amp;lt;code&amp;gt;&amp;quot;utf8&amp;quot;&amp;lt;/code&amp;gt; is a label for that encoding.&lt;br /&gt;
&amp;lt;/dl&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dl&amp;gt;&lt;br /&gt;
&amp;lt;dt&amp;gt;&amp;lt;code&amp;gt;encode&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;dd&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The &amp;lt;code&amp;gt;encode&amp;lt;/code&amp;gt; method runs these steps:&lt;br /&gt;
&lt;br /&gt;
# If the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag is not set, then reset the &amp;lt;var&amp;gt;encoding algorithm state&amp;lt;/var&amp;gt; to the default values for &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;. Otherwise, the &amp;lt;var&amp;gt;encoding algorithm state&amp;lt;/var&amp;gt; is re-used from the previous call to &amp;lt;code&amp;gt;encode&amp;lt;/code&amp;gt; on this object.&lt;br /&gt;
# If the &amp;lt;var&amp;gt;options&amp;lt;/var&amp;gt; parameter is specified and the &amp;lt;code&amp;gt;stream&amp;lt;/code&amp;gt; option is &#039;&#039;&#039;true&#039;&#039;&#039;, then the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag is set; otherwise the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag is cleared.&lt;br /&gt;
# Run the steps of the &amp;lt;var&amp;gt;encoding algorithm&amp;lt;/var&amp;gt;:&lt;br /&gt;
#* The input to the algorithm is a &amp;lt;var&amp;gt;stream of code points&amp;lt;/var&amp;gt;. The stream is composed of the Unicode code points for the Unicode characters produced by following the [http://dev.w3.org/2006/webapi/WebIDL/#dfn-obtain-unicode steps to convert a DOMString to a sequence of Unicode characters] in [http://dev.w3.org/2006/webapi/WebIDL/ WebIDL] with &amp;lt;var&amp;gt;string&amp;lt;/var&amp;gt; as the input. If &amp;lt;var&amp;gt;string&amp;lt;/var&amp;gt; is null, the &amp;lt;var&amp;gt;stream of code points&amp;lt;/var&amp;gt; is empty.&lt;br /&gt;
#* If the &amp;lt;var&amp;gt;options&amp;lt;/var&amp;gt; parameter is not specified or the &amp;lt;code&amp;gt;stream&amp;lt;/code&amp;gt; option is false, then after the final code point is yielded by the stream, the &#039;&#039;&#039;EOF code point&#039;&#039;&#039; is yielded.&lt;br /&gt;
#* The output of the algorithm is a &amp;lt;var&amp;gt;sequence of emitted bytes&amp;lt;/var&amp;gt;.&lt;br /&gt;
# Return a &amp;lt;code&amp;gt;Uint8Array&amp;lt;/code&amp;gt; object wrapping an &amp;lt;code&amp;gt;ArrayBuffer&amp;lt;/code&amp;gt; containing the &amp;lt;var&amp;gt;sequence of emitted bytes&amp;lt;/var&amp;gt; produced by the encoding algorithm.&lt;br /&gt;
&lt;br /&gt;
:&#039;&#039;NOTE: Because only UTF encodings are supported, and because of the use of the [http://dev.w3.org/2006/webapi/WebIDL/#dfn-obtain-unicode steps to convert a DOMString to a sequence of Unicode characters] and thus Unicode code points, no input can cause the encoding process to emit an encoder error.&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/dl&amp;gt;&lt;br /&gt;
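&lt;br /&gt;
The encode steps above can be illustrated with a minimal sketch. It assumes a runtime that exposes a global &amp;lt;code&amp;gt;TextEncoder&amp;lt;/code&amp;gt; in the shape later standardized (UTF-8 only, no &amp;lt;code&amp;gt;stream&amp;lt;/code&amp;gt; option), which may differ from this draft. The variable names are illustrative only.&lt;br /&gt;

```javascript
// Sketch only: assumes a global TextEncoder as found in current browsers and Node.js.
const encoder = new TextEncoder();            // no argument: the label defaults to utf-8
const encodingName = encoder.encoding;        // the Name of the encoding, not the label passed in

// EURO SIGN (U+20AC) occupies three bytes in UTF-8.
const euroBytes = Array.from(encoder.encode('\u20AC'));

// A lone surrogate is converted to U+FFFD before encoding, so no encoder error occurs.
const loneSurrogateBytes = Array.from(encoder.encode('\uD800'));
```

The lone surrogate case corresponds to the note above: converting the DOMString to Unicode characters happens before encoding, so the encoder never sees an unencodable code point.&lt;br /&gt;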
&lt;br /&gt;
=== TextDecoder ===&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;WebIDL&#039;&#039;&#039;&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
dictionary TextDecoderOptions {&lt;br /&gt;
  boolean fatal = false;&lt;br /&gt;
};&lt;br /&gt;
&lt;br /&gt;
dictionary TextDecodeOptions {&lt;br /&gt;
  boolean stream = false;&lt;br /&gt;
};&lt;br /&gt;
&lt;br /&gt;
[Constructor(optional DOMString encoding, optional TextDecoderOptions options)]&lt;br /&gt;
interface TextDecoder {&lt;br /&gt;
  readonly attribute DOMString encoding;&lt;br /&gt;
  DOMString decode(optional ArrayBufferView view, optional TextDecodeOptions options);&lt;br /&gt;
};&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The constructor runs the following steps:&lt;br /&gt;
# The constructor creates a &#039;&#039;decoder object&#039;&#039;. It has the internal properties &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;, a &amp;lt;var&amp;gt;fatal&amp;lt;/var&amp;gt; flag which is initially unset,  a &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag which is initially unset, an &amp;lt;var&amp;gt;offset&amp;lt;/var&amp;gt; pointer which is initially &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt;, and a &amp;lt;var&amp;gt;useBOM&amp;lt;/var&amp;gt; flag which is initially unset.&lt;br /&gt;
# If called without an &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; argument, let &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt; be &amp;quot;&amp;lt;code&amp;gt;utf-8&amp;lt;/code&amp;gt;&amp;quot;. Otherwise, let &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt; be the value of the &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; argument.&lt;br /&gt;
# If &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt; is a case-insensitive match for &amp;quot;&amp;lt;code&amp;gt;utf-16&amp;lt;/code&amp;gt;&amp;quot; then set the internal &amp;lt;var&amp;gt;useBOM&amp;lt;/var&amp;gt; flag.&lt;br /&gt;
# Run the &#039;&#039;&#039;steps to get an encoding&#039;&#039;&#039; from [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding], with &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt; as &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt;. &lt;br /&gt;
## If the steps result in failure, throw an &amp;quot;&amp;lt;code&amp;gt;EncodingError&amp;lt;/code&amp;gt;&amp;quot; exception and terminate these steps.&lt;br /&gt;
## Otherwise, set the &#039;&#039;decoder object&#039;s&#039;&#039; internal &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; property to the returned encoding.&lt;br /&gt;
# If the constructor is called with an &amp;lt;var&amp;gt;options&amp;lt;/var&amp;gt; argument, and the &amp;lt;var&amp;gt;fatal&amp;lt;/var&amp;gt; property of the dictionary is set, set the internal &amp;lt;var&amp;gt;fatal&amp;lt;/var&amp;gt; flag of the &#039;&#039;decoder object&#039;&#039;.&lt;br /&gt;
# Initialize the internal &amp;lt;var&amp;gt;encoding algorithm state&amp;lt;/var&amp;gt; to the default values for the encoding &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;.&lt;br /&gt;
# Return the &#039;&#039;decoder object&#039;&#039;.&lt;br /&gt;
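&lt;br /&gt;
The constructor steps above can be sketched as follows. This assumes a runtime with a global &amp;lt;code&amp;gt;TextDecoder&amp;lt;/code&amp;gt; in its later standardized form. Two details are assumptions that go beyond this draft: the &amp;lt;code&amp;gt;fatal&amp;lt;/code&amp;gt; attribute is readable on the object, and the exception thrown for an unknown label may not be an &amp;lt;code&amp;gt;EncodingError&amp;lt;/code&amp;gt;, so the sketch only records that something was thrown.&lt;br /&gt;

```javascript
// Sketch only: assumes a global TextDecoder as found in current browsers and Node.js.
const defaultDecoder = new TextDecoder();          // no label: defaults to utf-8
const labelled = new TextDecoder('UTF8');          // labels match case-insensitively
const strict = new TextDecoder('utf-8', { fatal: true });

// An unrecognised label makes the constructor throw.
let unknownLabelThrew = false;
try {
  new TextDecoder('no-such-encoding');
} catch (e) {
  unknownLabelThrew = true;
}
```

Here &amp;lt;code&amp;gt;labelled.encoding&amp;lt;/code&amp;gt; reports the Name of the encoding rather than the label that was passed in.&lt;br /&gt;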
&lt;br /&gt;
&amp;lt;dl&amp;gt;&lt;br /&gt;
&amp;lt;dt&amp;gt;&amp;lt;code&amp;gt;encoding&amp;lt;/code&amp;gt; of type DOMString, readonly&lt;br /&gt;
&amp;lt;dd&amp;gt;&lt;br /&gt;
Returns the &amp;lt;var&amp;gt;Name&amp;lt;/var&amp;gt; of the decoder object&#039;s &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;, per [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding].&lt;br /&gt;
&lt;br /&gt;
:Note that this may differ from the name of the encoding specified during the call to the constructor. For example, if the constructor is called with &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; of &amp;lt;code&amp;gt;&amp;quot;ascii&amp;quot;&amp;lt;/code&amp;gt; the &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; attribute of the &#039;&#039;decoder object&#039;&#039; would have the value &amp;lt;code&amp;gt;&amp;quot;windows-1252&amp;quot;&amp;lt;/code&amp;gt; as &amp;lt;code&amp;gt;&amp;quot;ascii&amp;quot;&amp;lt;/code&amp;gt; is a label for that encoding.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dt&amp;gt;&amp;lt;code&amp;gt;decode&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;dd&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The &amp;lt;code&amp;gt;decode&amp;lt;/code&amp;gt; method runs the following steps:&lt;br /&gt;
&lt;br /&gt;
# If &amp;lt;var&amp;gt;view&amp;lt;/var&amp;gt; is not specified, let &amp;lt;var&amp;gt;view&amp;lt;/var&amp;gt; be a &amp;lt;code&amp;gt;Uint8Array&amp;lt;/code&amp;gt; of length &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt;.&lt;br /&gt;
# If the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag of the &#039;&#039;decoder object&#039;&#039; is not set, then reset the &amp;lt;var&amp;gt;encoding algorithm state&amp;lt;/var&amp;gt; to the default values for encoding &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; and set &amp;lt;var&amp;gt;offset&amp;lt;/var&amp;gt; to &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt;. Otherwise, the &amp;lt;var&amp;gt;encoding algorithm state&amp;lt;/var&amp;gt; is re-used from the previous call to &amp;lt;code&amp;gt;decode&amp;lt;/code&amp;gt; on this object.&lt;br /&gt;
# If the &amp;lt;var&amp;gt;options&amp;lt;/var&amp;gt; parameter is specified and the &amp;lt;var&amp;gt;stream&amp;lt;/var&amp;gt; option is &#039;&#039;&#039;true&#039;&#039;&#039;, then the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag is set. Otherwise the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag is cleared.&lt;br /&gt;
# Run or resume the decoder algorithm of the &#039;&#039;decoder object&#039;s&#039;&#039; encoding, with the following additions:&lt;br /&gt;
#* The input to the algorithm is a &amp;lt;var&amp;gt;byte stream&amp;lt;/var&amp;gt;.&lt;br /&gt;
#** If &amp;lt;var&amp;gt;offset&amp;lt;/var&amp;gt; is greater than &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt;, the bytes in &amp;lt;var&amp;gt;byte stream&amp;lt;/var&amp;gt; for positions less than &amp;lt;var&amp;gt;offset&amp;lt;/var&amp;gt; are provided by buffer(s) passed in to previous calls to &amp;lt;code&amp;gt;decode()&amp;lt;/code&amp;gt;.&lt;br /&gt;
#** The bytes in &amp;lt;var&amp;gt;byte stream&amp;lt;/var&amp;gt; for positions &amp;lt;var&amp;gt;offset&amp;lt;/var&amp;gt; through &amp;lt;code&amp;gt;&amp;lt;var&amp;gt;offset&amp;lt;/var&amp;gt; + &amp;lt;var&amp;gt;view.byteLength&amp;lt;/var&amp;gt; - 1&amp;lt;/code&amp;gt; are provided by the bytes in &amp;lt;var&amp;gt;view.buffer&amp;lt;/var&amp;gt; starting at offset &amp;lt;var&amp;gt;view.byteOffset&amp;lt;/var&amp;gt;.&lt;br /&gt;
#** When accessing the byte in &amp;lt;var&amp;gt;byte stream&amp;lt;/var&amp;gt; at position &amp;lt;code&amp;gt;&amp;lt;var&amp;gt;offset&amp;lt;/var&amp;gt; + &amp;lt;var&amp;gt;view.byteLength&amp;lt;/var&amp;gt;&amp;lt;/code&amp;gt;:&lt;br /&gt;
#*** If the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag is not set, then yield the &#039;&#039;&#039;EOF byte&#039;&#039;&#039;.&lt;br /&gt;
#*** Otherwise, set &amp;lt;var&amp;gt;offset&amp;lt;/var&amp;gt; to &amp;lt;code&amp;gt;&amp;lt;var&amp;gt;offset&amp;lt;/var&amp;gt; + &amp;lt;var&amp;gt;view.byteLength&amp;lt;/var&amp;gt;&amp;lt;/code&amp;gt; and suspend the steps of the decoder algorithm until a subsequent call to &amp;lt;code&amp;gt;decode()&amp;lt;/code&amp;gt;.&lt;br /&gt;
#* If encoding is one of &amp;quot;&amp;lt;code&amp;gt;utf-8&amp;lt;/code&amp;gt;&amp;quot;, &amp;quot;&amp;lt;code&amp;gt;utf-16&amp;lt;/code&amp;gt;&amp;quot; or &amp;quot;&amp;lt;code&amp;gt;utf-16be&amp;lt;/code&amp;gt;&amp;quot; and &amp;lt;var&amp;gt;offset&amp;lt;/var&amp;gt; is &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt;, then prior to running the steps of the decoder algorithm:&lt;br /&gt;
#** If the internal &amp;lt;var&amp;gt;useBOM&amp;lt;/var&amp;gt; flag is set, then:&lt;br /&gt;
#*** If less than two bytes are present in the stream and the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag is true, suspend this algorithm and return the empty string.&lt;br /&gt;
#*** If the next two bytes of the stream are &#039;&#039;&#039;0xFF 0xFE&#039;&#039;&#039; then set &amp;lt;var&amp;gt;offset&amp;lt;/var&amp;gt; to &amp;lt;code&amp;gt;2&amp;lt;/code&amp;gt;, and clear the &amp;lt;var&amp;gt;utf-16be&amp;lt;/var&amp;gt; flag of the decoder state.&lt;br /&gt;
#*** If the next two bytes of the stream are &#039;&#039;&#039;0xFE 0xFF&#039;&#039;&#039; then set &amp;lt;var&amp;gt;offset&amp;lt;/var&amp;gt; to &amp;lt;code&amp;gt;2&amp;lt;/code&amp;gt;, and set the &amp;lt;var&amp;gt;utf-16be&amp;lt;/var&amp;gt; flag of the decoder state.&lt;br /&gt;
#** If encoding is &amp;quot;&amp;lt;code&amp;gt;utf-8&amp;lt;/code&amp;gt;&amp;quot;, then:&lt;br /&gt;
#*** If less than three bytes are present in the stream and the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag is true, suspend this algorithm and return the empty string.&lt;br /&gt;
#*** If the next three bytes of the stream are &#039;&#039;&#039;0xEF 0xBB 0xBF&#039;&#039;&#039; then set &amp;lt;var&amp;gt;offset&amp;lt;/var&amp;gt; to &amp;lt;code&amp;gt;3&amp;lt;/code&amp;gt;.&lt;br /&gt;
#** If encoding is &amp;quot;&amp;lt;code&amp;gt;utf-16&amp;lt;/code&amp;gt;&amp;quot;, then:&lt;br /&gt;
#*** If less than two bytes are present in the stream and the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag is true, suspend this algorithm and return the empty string.&lt;br /&gt;
#*** If the next two bytes of the stream are &#039;&#039;&#039;0xFF 0xFE&#039;&#039;&#039; then set &amp;lt;var&amp;gt;offset&amp;lt;/var&amp;gt; to &amp;lt;code&amp;gt;2&amp;lt;/code&amp;gt;.&lt;br /&gt;
#** If encoding is &amp;quot;&amp;lt;code&amp;gt;utf-16be&amp;lt;/code&amp;gt;&amp;quot;, then:&lt;br /&gt;
#*** If less than two bytes are present in the stream and the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag is true, suspend this algorithm and return the empty string.&lt;br /&gt;
#*** If the next two bytes of the stream are &#039;&#039;&#039;0xFE 0xFF&#039;&#039;&#039; then set &amp;lt;var&amp;gt;offset&amp;lt;/var&amp;gt; to &amp;lt;code&amp;gt;2&amp;lt;/code&amp;gt;.&lt;br /&gt;
#* If the internal &amp;lt;var&amp;gt;fatal&amp;lt;/var&amp;gt; flag of the &#039;&#039;decoder object&#039;&#039; is set, then a &#039;&#039;&#039;decoder error&#039;&#039;&#039; causes a &amp;lt;code&amp;gt;DOMException&amp;lt;/code&amp;gt; of type &amp;lt;code&amp;gt;EncodingError&amp;lt;/code&amp;gt; to be thrown rather than emitting a fallback code point.&lt;br /&gt;
#* The output of the algorithm is a &amp;lt;var&amp;gt;sequence of emitted code points&amp;lt;/var&amp;gt;.&lt;br /&gt;
# Return an IDL &amp;lt;code&amp;gt;DOMString&amp;lt;/code&amp;gt; value that represents the sequence of code units resulting from encoding the &amp;lt;var&amp;gt;sequence of emitted code points&amp;lt;/var&amp;gt; as UTF-16.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/dl&amp;gt;&lt;br /&gt;
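&lt;br /&gt;
The streaming, byte order mark and error handling steps above can be illustrated with a short sketch. It assumes a global &amp;lt;code&amp;gt;TextDecoder&amp;lt;/code&amp;gt; in its later standardized form. The helper &amp;lt;code&amp;gt;decodeChunks&amp;lt;/code&amp;gt; is a hypothetical name introduced for illustration and is not part of the API.&lt;br /&gt;

```javascript
// Sketch only: assumes a global TextDecoder as found in current browsers and Node.js.
// decodeChunks is a hypothetical helper, not part of the proposed API.
function decodeChunks(chunks, label) {
  const decoder = new TextDecoder(label);
  let text = '';
  for (const chunk of chunks) {
    // stream: true keeps the decoder state, so a character split
    // across two chunks is completed by the next call.
    text += decoder.decode(chunk, { stream: true });
  }
  // A final call without the stream option signals end of input and flushes the state.
  text += decoder.decode();
  return text;
}

// EURO SIGN (U+20AC) is E2 82 AC in UTF-8; split it after the first byte.
const euro = new Uint8Array([0xE2, 0x82, 0xAC]);
const streamed = decodeChunks([euro.subarray(0, 1), euro.subarray(1)], 'utf-8');

// A leading UTF-8 byte order mark (EF BB BF) is consumed and not emitted.
const withBom = new TextDecoder('utf-8').decode(new Uint8Array([0xEF, 0xBB, 0xBF, 0x41]));

// Without the fatal flag, an invalid byte yields the fallback code point U+FFFD.
const fallback = new TextDecoder('utf-8').decode(new Uint8Array([0xFF]));
```

Reusing one decoder across chunks matters: decoding each chunk with a fresh decoder would emit fallback code points for the split character.&lt;br /&gt;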
&lt;br /&gt;
== Examples ==&lt;br /&gt;
&lt;br /&gt;
=== Example #1 - encoding strings ===&lt;br /&gt;
&lt;br /&gt;
The following example uses the API to encode an array of strings into an ArrayBuffer. The result is an ArrayBuffer containing the number of strings (as a Uint32), followed by the length of the first string (as a Uint32), the encoded string data, the length of the second string (as a Uint32), the string data, and so on.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
function encodeArrayOfStrings(strings, encoding) {&lt;br /&gt;
  var encoder, encoded, len, i, bytes, view, offset;&lt;br /&gt;
&lt;br /&gt;
  // Create one encoder and reuse it for every string.&lt;br /&gt;
  encoder = new TextEncoder(encoding);&lt;br /&gt;
  encoded = [];&lt;br /&gt;
&lt;br /&gt;
  len = Uint32Array.BYTES_PER_ELEMENT;&lt;br /&gt;
  for (i = 0; i &amp;lt; strings.length; i += 1) {&lt;br /&gt;
    len += Uint32Array.BYTES_PER_ELEMENT;&lt;br /&gt;
    encoded[i] = encoder.encode(strings[i]);&lt;br /&gt;
    len += encoded[i].byteLength;&lt;br /&gt;
  }&lt;br /&gt;
&lt;br /&gt;
  bytes = new Uint8Array(len);&lt;br /&gt;
  view = new DataView(bytes.buffer);&lt;br /&gt;
  offset = 0;&lt;br /&gt;
&lt;br /&gt;
  view.setUint32(offset, strings.length);&lt;br /&gt;
  offset += Uint32Array.BYTES_PER_ELEMENT;&lt;br /&gt;
  for (i = 0; i &amp;lt; encoded.length; i += 1) {&lt;br /&gt;
    len = encoded[i].byteLength;&lt;br /&gt;
    view.setUint32(offset, len);&lt;br /&gt;
    offset += Uint32Array.BYTES_PER_ELEMENT;&lt;br /&gt;
    bytes.set(encoded[i], offset);&lt;br /&gt;
    offset += len;&lt;br /&gt;
  }&lt;br /&gt;
  return bytes.buffer;&lt;br /&gt;
}&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Example #2 - decoding strings ===&lt;br /&gt;
&lt;br /&gt;
The following example decodes an ArrayBuffer containing data encoded in the format produced by the previous example back into an array of strings.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
function decodeArrayOfStrings(buffer, encoding) {&lt;br /&gt;
  var decoder, view, offset, num_strings, strings, i, len;&lt;br /&gt;
&lt;br /&gt;
  decoder = new TextDecoder(encoding);&lt;br /&gt;
  view = new DataView(buffer);&lt;br /&gt;
  offset = 0;&lt;br /&gt;
  strings = [];&lt;br /&gt;
&lt;br /&gt;
  num_strings = view.getUint32(offset);&lt;br /&gt;
  offset += Uint32Array.BYTES_PER_ELEMENT;&lt;br /&gt;
  for (i = 0; i &amp;lt; num_strings; i += 1) {&lt;br /&gt;
    len = view.getUint32(offset);&lt;br /&gt;
    offset += Uint32Array.BYTES_PER_ELEMENT;&lt;br /&gt;
    strings[i] = decoder.decode(&lt;br /&gt;
      new DataView(view.buffer, offset, len));&lt;br /&gt;
    offset += len;&lt;br /&gt;
  }&lt;br /&gt;
  return strings;&lt;br /&gt;
}&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Encodings ==&lt;br /&gt;
&lt;br /&gt;
Encodings are defined and implemented per [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding]. This implicitly includes the steps to &amp;lt;em&amp;gt;get an encoding&amp;lt;/em&amp;gt; from a string, and the logic for label matching and case-insensitivity.&lt;br /&gt;
&lt;br /&gt;
User agents MUST NOT support any other encodings or labels than those defined in [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding], and MUST support all encodings and labels defined in that specification, with the additions defined below.&lt;br /&gt;
&lt;br /&gt;
:&#039;&#039;NOTE: In [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding], &amp;quot;ascii&amp;quot; is a label for &#039;&#039;&#039;windows-1252&#039;&#039;&#039;; there is no 7-bit-or-raise-exception encoding. Applications that are required to restrict the content of decoded strings should implement validation after decoding.&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
:&#039;&#039;NOTE: Unicode normalization forms are outside the scope of this specification. No normalization is done prior to encoding or after decoding.&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
:&#039;&#039;NOTE: Handling of encoding-specific issues, e.g. over-long UTF-8 encodings, byte order marks, unmatched surrogate pairs, and so on is defined by [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding].&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
== References ==&lt;br /&gt;
&lt;br /&gt;
* WebIDL http://dev.w3.org/2006/webapi/WebIDL&lt;br /&gt;
* Encoding http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== Acknowledgements ==&lt;br /&gt;
&lt;br /&gt;
* Alan Chaney&lt;br /&gt;
* Ben Noordhuis&lt;br /&gt;
* Glenn Maynard&lt;br /&gt;
* John Tamplin&lt;br /&gt;
* Kenneth Russell (Google, Inc)&lt;br /&gt;
* Robert Mustacchi&lt;br /&gt;
* Ryan Dahl&lt;br /&gt;
* Anne van Kesteren&lt;br /&gt;
* Cameron McCormack&lt;br /&gt;
&lt;br /&gt;
== Appendix ==&lt;br /&gt;
&lt;br /&gt;
A &amp;quot;shim&amp;quot; implementation in ECMAScript (that may not fully match the current version of the spec) plus some initial unit tests can be found at:&lt;br /&gt;
&lt;br /&gt;
:http://code.google.com/p/stringencoding/&lt;br /&gt;
&lt;br /&gt;
[[Category:Proposals]]&lt;/div&gt;</summary>
		<author><name>Jsbell</name></author>
	</entry>
	<entry>
		<id>https://wiki.whatwg.org/index.php?title=StringEncoding&amp;diff=8490</id>
		<title>StringEncoding</title>
		<link rel="alternate" type="text/html" href="https://wiki.whatwg.org/index.php?title=StringEncoding&amp;diff=8490"/>
		<updated>2012-09-17T21:05:24Z</updated>

		<summary type="html">&lt;p&gt;Jsbell: Make BOM handling algorithmic&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;Proposed Text Encoding Web API for Typed Arrays&lt;br /&gt;
&lt;br /&gt;
== Editors ==&lt;br /&gt;
&lt;br /&gt;
* Joshua Bell (Google, Inc)&lt;br /&gt;
&lt;br /&gt;
== Abstract ==&lt;br /&gt;
&lt;br /&gt;
This specification defines an API for encoding strings to binary data, and decoding strings from binary data.&lt;br /&gt;
&lt;br /&gt;
NOTE: This specification intentionally does not address the opposite scenario of encoding binary data as strings and decoding binary data from strings, for example using Base64 encoding.&lt;br /&gt;
&lt;br /&gt;
Discussion on this topic has so far taken place on the [http://www.khronos.org/webgl/public-mailing-list/ public_webgl@khronos.org] mailing list. See http://www.khronos.org/webgl/public-mailing-list/archives/1111/msg00017.html for the initial discussion thread.&lt;br /&gt;
&lt;br /&gt;
Discussion has since moved to the [http://www.whatwg.org/mailing-list#specs WHATWG spec discussion] mailing list. See  http://lists.whatwg.org/htdig.cgi/whatwg-whatwg.org/2012-March/035038.html for the latest discussion thread.&lt;br /&gt;
&lt;br /&gt;
== Open Issues ==&lt;br /&gt;
&lt;br /&gt;
* Should the &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; attribute return the &amp;lt;var&amp;gt;Name&amp;lt;/var&amp;gt; of the encoding or the name that was passed in?&lt;br /&gt;
&lt;br /&gt;
== Notes to Implementers ==&lt;br /&gt;
&lt;br /&gt;
* Streaming decode/encode requires retaining partial buffers between calls.&lt;br /&gt;
** Some encode/decode algorithms require adjusting the &amp;lt;var&amp;gt;code point pointer&amp;lt;/var&amp;gt; or &amp;lt;var&amp;gt;byte pointer&amp;lt;/var&amp;gt; by a negative amount. This could occur across &amp;quot;chunk&amp;quot; boundaries. This implies that when the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag is set on an encoder/decoder that the last N elements of the stream are saved for the next call and used as a prefix for the stream. N is defined by the specific encoding algorithm.&lt;br /&gt;
** This is not yet implemented in the [http://code.google.com/p/stringencoding/ ECMAScript shim].&lt;br /&gt;
&lt;br /&gt;
== Resolved Issues ==&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dl&amp;gt;&lt;br /&gt;
&amp;lt;dt&amp;gt;Encode as many characters as possible into a fixed-size buffer for transmission, and repeat starting with next unencoded character&lt;br /&gt;
&amp;lt;dd&amp;gt;Resolution: not for &amp;quot;v1&amp;quot; - can be implemented using this API, and breaking the string down by &amp;quot;character&amp;quot; is unlikely to be as obvious as it sounds - surrogate pairs, combining sequences, etc&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dt&amp;gt;Support two versions of encode; one which takes target buffer, and one which creates/returns a right-sized buffer&lt;br /&gt;
&amp;lt;dd&amp;gt;Resolution: See above, and wait for developer feedback.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dt&amp;gt;Allow arbitrary end byte sequences (e.g. &amp;lt;code&amp;gt;0xFF&amp;lt;/code&amp;gt; for UTF-8 strings)&lt;br /&gt;
&amp;lt;dd&amp;gt;Resolution: Add &amp;lt;code&amp;gt;indexOf&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;ArrayBufferView&amp;lt;/code&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dt&amp;gt;Remove &#039;&#039;binary&#039;&#039; encoding? &#039;&#039;(a proposed 8-bit-clean encoding for interop with legacy binary data stored in ECMAScript strings)&#039;&#039;&lt;br /&gt;
&amp;lt;dd&amp;gt;The only real use is with &amp;lt;code&amp;gt;atob()&amp;lt;/code&amp;gt;/&amp;lt;code&amp;gt;btoa()&amp;lt;/code&amp;gt;; a better API would be Base64 directly in/out of Typed Arrays. Consensus on WHATWG is to add better APIs e.g. &lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
partial interface ArrayBufferView {&lt;br /&gt;
    DOMString toBase64();&lt;br /&gt;
};&lt;br /&gt;
&lt;br /&gt;
partial interface ArrayBuffer {&lt;br /&gt;
    static ArrayBuffer fromBase64(DOMString string);&lt;br /&gt;
};&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dt&amp;gt;Should this be a standalone API (as written), or live on e.g. DataView, or on e.g. String?&lt;br /&gt;
&amp;lt;dd&amp;gt;There seems to be pretty strong consensus that streaming, stateful coding is high priority and that an object-oriented API is cleanest, and shoe-horning those onto existing objects would be messy.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dt&amp;gt;Should legacy encodings be supported?&lt;br /&gt;
&amp;lt;dd&amp;gt;Resolution: Consensus on the [http://lists.whatwg.org/htdig.cgi/whatwg-whatwg.org/2012-August/036825.html WHATWG mailing list] - support legacy encodings for decode, and only &#039;utf-8&#039;, &#039;utf-16&#039; and &#039;utf-16be&#039; for encode.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dt&amp;gt;What to do on encoding errors? If non-UTF encodings are supported then we may want to allow substitution (e.g. ASCII &#039;?&#039;) or a script callback (for arbitrary escaping).&lt;br /&gt;
&amp;lt;dd&amp;gt;Resolution: not for &amp;quot;v1&amp;quot; (see above)&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dt&amp;gt;How are byte order marks handled? &lt;br /&gt;
&amp;lt;dd&amp;gt;BOM is respected if and only if the requested coding is a case-insensitive match for &amp;quot;&amp;lt;code&amp;gt;utf-16&amp;lt;/code&amp;gt;&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/dl&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== API ==&lt;br /&gt;
&lt;br /&gt;
=== TextEncoder ===&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;WebIDL&#039;&#039;&#039;&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
dictionary TextEncodeOptions {&lt;br /&gt;
  boolean stream = false;&lt;br /&gt;
};&lt;br /&gt;
&lt;br /&gt;
[Constructor(optional DOMString encoding)]&lt;br /&gt;
interface TextEncoder {&lt;br /&gt;
  readonly attribute DOMString encoding;&lt;br /&gt;
  Uint8Array encode(DOMString? string, optional TextEncodeOptions options);&lt;br /&gt;
};&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The constructor runs the following steps:&lt;br /&gt;
# If the constructor is called with no arguments, let &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt; be &amp;quot;&amp;lt;code&amp;gt;utf-8&amp;lt;/code&amp;gt;&amp;quot;. Otherwise, let &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt; be the value of the &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; argument.&lt;br /&gt;
# Run the &#039;&#039;&#039;steps to get an encoding&#039;&#039;&#039; from [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding], with &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt; as &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt;.&lt;br /&gt;
#* If the steps result in failure, throw an &amp;quot;&amp;lt;code&amp;gt;EncodingError&amp;lt;/code&amp;gt;&amp;quot; exception and terminate these steps.&lt;br /&gt;
#* Otherwise, if the &amp;lt;var&amp;gt;Name&amp;lt;/var&amp;gt; of the returned encoding is not one of &amp;quot;&amp;lt;code&amp;gt;utf-8&amp;lt;/code&amp;gt;&amp;quot;, &amp;quot;&amp;lt;code&amp;gt;utf-16&amp;lt;/code&amp;gt;&amp;quot;, or &amp;quot;&amp;lt;code&amp;gt;utf-16be&amp;lt;/code&amp;gt;&amp;quot;, throw an &amp;quot;&amp;lt;code&amp;gt;EncodingError&amp;lt;/code&amp;gt;&amp;quot; exception and terminate these steps.&lt;br /&gt;
#* Otherwise, set the encoder object&#039;s internal &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; property to the returned encoding.&lt;br /&gt;
# Initialize the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag of the encoder object to false. &lt;br /&gt;
# Initialize the internal &amp;lt;var&amp;gt;encoding algorithm state&amp;lt;/var&amp;gt; to the default values for the encoding &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dl&amp;gt;&lt;br /&gt;
&amp;lt;dt&amp;gt;&amp;lt;code&amp;gt;encoding&amp;lt;/code&amp;gt; of type DOMString, readonly&lt;br /&gt;
&amp;lt;dd&amp;gt;Returns the &amp;lt;var&amp;gt;Name&amp;lt;/var&amp;gt; of the encoder object&#039;s &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;, per [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding]. &lt;br /&gt;
&lt;br /&gt;
:Note that this may differ from the name of the encoding specified during the call to the constructor. For example, if the constructor is called with &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; of &amp;lt;code&amp;gt;&amp;quot;utf8&amp;quot;&amp;lt;/code&amp;gt; the &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; attribute of the &#039;&#039;encoder object&#039;&#039; would have the value &amp;lt;code&amp;gt;&amp;quot;utf-8&amp;quot;&amp;lt;/code&amp;gt; as &amp;lt;code&amp;gt;&amp;quot;utf8&amp;quot;&amp;lt;/code&amp;gt; is a label for that encoding.&lt;br /&gt;
&amp;lt;/dl&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dl&amp;gt;&lt;br /&gt;
&amp;lt;dt&amp;gt;&amp;lt;code&amp;gt;encode&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;dd&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The &amp;lt;code&amp;gt;encode&amp;lt;/code&amp;gt; method runs these steps:&lt;br /&gt;
&lt;br /&gt;
# If the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag is not set, then reset the &amp;lt;var&amp;gt;encoding algorithm state&amp;lt;/var&amp;gt; to the default values for &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;. Otherwise, the &amp;lt;var&amp;gt;encoding algorithm state&amp;lt;/var&amp;gt; is re-used from the previous call to &amp;lt;code&amp;gt;encode&amp;lt;/code&amp;gt; on this object.&lt;br /&gt;
# If the &amp;lt;var&amp;gt;options&amp;lt;/var&amp;gt; parameter is specified and the &amp;lt;code&amp;gt;stream&amp;lt;/code&amp;gt; option is &#039;&#039;&#039;true&#039;&#039;&#039;, then the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag is set; otherwise the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag is cleared.&lt;br /&gt;
# Run the steps of the &amp;lt;var&amp;gt;encoding algorithm&amp;lt;/var&amp;gt;:&lt;br /&gt;
#* The input to the algorithm is a &amp;lt;var&amp;gt;stream of code points&amp;lt;/var&amp;gt;. The stream is composed of the Unicode code points for the Unicode characters produced by following the [http://dev.w3.org/2006/webapi/WebIDL/#dfn-obtain-unicode steps to convert a DOMString to a sequence of Unicode characters] in [http://dev.w3.org/2006/webapi/WebIDL/ WebIDL] with &amp;lt;var&amp;gt;string&amp;lt;/var&amp;gt; as the input. If &amp;lt;var&amp;gt;string&amp;lt;/var&amp;gt; is null, the &amp;lt;var&amp;gt;stream of code points&amp;lt;/var&amp;gt; is empty.&lt;br /&gt;
#* If the &amp;lt;var&amp;gt;options&amp;lt;/var&amp;gt; parameter is not specified or the &amp;lt;code&amp;gt;stream&amp;lt;/code&amp;gt; option is false, then after the final code point is yielded by the stream, the &#039;&#039;&#039;EOF code point&#039;&#039;&#039; is yielded.&lt;br /&gt;
#* The output of the algorithm is a &amp;lt;var&amp;gt;sequence of emitted bytes&amp;lt;/var&amp;gt;.&lt;br /&gt;
# Return a &amp;lt;code&amp;gt;Uint8Array&amp;lt;/code&amp;gt; object wrapping an &amp;lt;code&amp;gt;ArrayBuffer&amp;lt;/code&amp;gt; containing the &amp;lt;var&amp;gt;sequence of emitted bytes&amp;lt;/var&amp;gt; produced by the encoding algorithm.&lt;br /&gt;
&lt;br /&gt;
:&#039;&#039;NOTE: Because only UTF encodings are supported, and because of the use of the [http://dev.w3.org/2006/webapi/WebIDL/#dfn-obtain-unicode steps to convert a DOMString to a sequence of Unicode characters] and thus Unicode code points, no input can cause the encoding process to emit an encoder error.&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/dl&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== TextDecoder ===&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;WebIDL&#039;&#039;&#039;&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
dictionary TextDecoderOptions {&lt;br /&gt;
  boolean fatal = false;&lt;br /&gt;
};&lt;br /&gt;
&lt;br /&gt;
dictionary TextDecodeOptions {&lt;br /&gt;
  boolean stream = false;&lt;br /&gt;
};&lt;br /&gt;
&lt;br /&gt;
[Constructor(optional DOMString encoding, optional TextDecoderOptions options)]&lt;br /&gt;
interface TextDecoder {&lt;br /&gt;
  readonly attribute DOMString encoding;&lt;br /&gt;
  DOMString decode(optional ArrayBufferView view, optional TextDecodeOptions options);&lt;br /&gt;
};&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The constructor runs the following steps:&lt;br /&gt;
# The constructor creates a &#039;&#039;decoder object&#039;&#039;. It has the internal properties &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;, a &amp;lt;var&amp;gt;fatal&amp;lt;/var&amp;gt; flag which is initially unset,  a &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag which is initially unset, an &amp;lt;var&amp;gt;offset&amp;lt;/var&amp;gt; pointer which is initially &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt;, and a &amp;lt;var&amp;gt;useBOM&amp;lt;/var&amp;gt; flag which is initially unset.&lt;br /&gt;
# If called without an &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; argument, let &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt; be &amp;quot;&amp;lt;code&amp;gt;utf-8&amp;lt;/code&amp;gt;&amp;quot;. Otherwise, let &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt; be the value of the &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; argument.&lt;br /&gt;
# If &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt; is a case-insensitive match for &amp;quot;&amp;lt;code&amp;gt;utf-16&amp;lt;/code&amp;gt;&amp;quot; then set the internal &amp;lt;var&amp;gt;useBOM&amp;lt;/var&amp;gt; flag.&lt;br /&gt;
# Run the &#039;&#039;&#039;steps to get an encoding&#039;&#039;&#039; from [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding], with &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt; as &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt;. &lt;br /&gt;
## If the steps result in failure, throw an &amp;quot;&amp;lt;code&amp;gt;EncodingError&amp;lt;/code&amp;gt;&amp;quot; exception and terminate these steps.&lt;br /&gt;
## Otherwise, set the &#039;&#039;decoder object&#039;s&#039;&#039; internal &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; property to the returned encoding.&lt;br /&gt;
# If the constructor is called with an &amp;lt;var&amp;gt;options&amp;lt;/var&amp;gt; argument, and the &amp;lt;var&amp;gt;fatal&amp;lt;/var&amp;gt; property of the dictionary is set, set the internal &amp;lt;var&amp;gt;fatal&amp;lt;/var&amp;gt; flag of the &#039;&#039;decoder object&#039;&#039;.&lt;br /&gt;
# Initialize the internal &amp;lt;var&amp;gt;encoding algorithm state&amp;lt;/var&amp;gt; to the default values for the encoding &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;.&lt;br /&gt;
# Return the &#039;&#039;decoder object&#039;&#039;.&lt;br /&gt;
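&lt;br /&gt;
The label handling in the constructor steps can be sketched as follows. This assumes a runtime with the standardized &amp;lt;code&amp;gt;TextDecoder&amp;lt;/code&amp;gt;; note that such implementations throw a &amp;lt;code&amp;gt;RangeError&amp;lt;/code&amp;gt; for an unknown label, whereas this draft names the exception &amp;lt;code&amp;gt;EncodingError&amp;lt;/code&amp;gt;. The label &amp;lt;code&amp;gt;no-such-encoding&amp;lt;/code&amp;gt; is made up for the example.&lt;br /&gt;
&lt;br /&gt;
```javascript
// Assumption: a global TextDecoder is available (modern browsers, recent Node.js).
var byDefault = new TextDecoder();        // no label: utf-8
var byLabel = new TextDecoder("UTF8");    // labels are matched case-insensitively
var names = [byDefault.encoding, byLabel.encoding];

// An unknown label makes the constructor throw. The exception type is not
// inspected here because it differs between this draft and shipped runtimes.
var threw = false;
try {
  new TextDecoder("no-such-encoding");
} catch (e) {
  threw = true;
}
```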
&lt;br /&gt;
&amp;lt;dl&amp;gt;&lt;br /&gt;
&amp;lt;dt&amp;gt;&amp;lt;code&amp;gt;encoding&amp;lt;/code&amp;gt; of type DOMString, readonly&lt;br /&gt;
&amp;lt;dd&amp;gt;&lt;br /&gt;
Returns the &amp;lt;var&amp;gt;Name&amp;lt;/var&amp;gt; of the decoder object&#039;s &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;, per [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding].&lt;br /&gt;
&lt;br /&gt;
:Note that this may differ from the name of the encoding specified during the call to the constructor. For example, if the constructor is called with &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; of &amp;lt;code&amp;gt;&amp;quot;ascii&amp;quot;&amp;lt;/code&amp;gt; the &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; attribute of the &#039;&#039;decoder object&#039;&#039; would have the value &amp;lt;code&amp;gt;&amp;quot;windows-1252&amp;quot;&amp;lt;/code&amp;gt; as &amp;lt;code&amp;gt;&amp;quot;ascii&amp;quot;&amp;lt;/code&amp;gt; is a label for that encoding.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dt&amp;gt;&amp;lt;code&amp;gt;decode&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;dd&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The &amp;lt;code&amp;gt;decode&amp;lt;/code&amp;gt; method runs the following steps:&lt;br /&gt;
&lt;br /&gt;
# If &amp;lt;var&amp;gt;view&amp;lt;/var&amp;gt; is not specified, let &amp;lt;var&amp;gt;view&amp;lt;/var&amp;gt; be a &amp;lt;code&amp;gt;Uint8Array&amp;lt;/code&amp;gt; of length &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt;.&lt;br /&gt;
# If the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag of the &#039;&#039;decoder object&#039;&#039; is not set, then reset the &amp;lt;var&amp;gt;encoding algorithm state&amp;lt;/var&amp;gt; to the default values for encoding &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; and set &amp;lt;var&amp;gt;offset&amp;lt;/var&amp;gt; to &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt;. Otherwise, the &amp;lt;var&amp;gt;encoding algorithm state&amp;lt;/var&amp;gt; is re-used from the previous call to &amp;lt;code&amp;gt;decode&amp;lt;/code&amp;gt; on this object.&lt;br /&gt;
# If the &amp;lt;var&amp;gt;options&amp;lt;/var&amp;gt; parameter is specified and the &amp;lt;var&amp;gt;stream&amp;lt;/var&amp;gt; option is &#039;&#039;&#039;true&#039;&#039;&#039;, then the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag is set. Otherwise the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag is cleared.&lt;br /&gt;
# Run or resume the decoder algorithm of the &#039;&#039;decoder object&#039;s&#039;&#039; encoding, with the following additions:&lt;br /&gt;
#* The input to the algorithm is a &amp;lt;var&amp;gt;byte stream&amp;lt;/var&amp;gt;.&lt;br /&gt;
#** If &amp;lt;var&amp;gt;offset&amp;lt;/var&amp;gt; is greater than &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt;, the bytes in &amp;lt;var&amp;gt;byte stream&amp;lt;/var&amp;gt; for positions less than &amp;lt;var&amp;gt;offset&amp;lt;/var&amp;gt; are provided by buffer(s) passed in to previous calls to &amp;lt;code&amp;gt;decode()&amp;lt;/code&amp;gt;.&lt;br /&gt;
#** The bytes in &amp;lt;var&amp;gt;byte stream&amp;lt;/var&amp;gt; for positions &amp;lt;var&amp;gt;offset&amp;lt;/var&amp;gt; through &amp;lt;code&amp;gt;&amp;lt;var&amp;gt;offset&amp;lt;/var&amp;gt; + &amp;lt;var&amp;gt;view.byteLength&amp;lt;/var&amp;gt; - 1&amp;lt;/code&amp;gt; are provided by the bytes in &amp;lt;var&amp;gt;view.buffer&amp;lt;/var&amp;gt; starting at offset &amp;lt;var&amp;gt;view.byteOffset&amp;lt;/var&amp;gt;.&lt;br /&gt;
#** When accessing the byte in &amp;lt;var&amp;gt;byte stream&amp;lt;/var&amp;gt; at position &amp;lt;code&amp;gt;&amp;lt;var&amp;gt;offset&amp;lt;/var&amp;gt; + &amp;lt;var&amp;gt;view.byteLength&amp;lt;/var&amp;gt;&amp;lt;/code&amp;gt;:&lt;br /&gt;
#*** If the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag is not set, then yield the &#039;&#039;&#039;EOF byte&#039;&#039;&#039;.&lt;br /&gt;
#*** Otherwise, set &amp;lt;var&amp;gt;offset&amp;lt;/var&amp;gt; to &amp;lt;code&amp;gt;&amp;lt;var&amp;gt;offset&amp;lt;/var&amp;gt; + &amp;lt;var&amp;gt;view.byteLength&amp;lt;/var&amp;gt;&amp;lt;/code&amp;gt; and suspend the steps of the decoder algorithm until a subsequent call to &amp;lt;code&amp;gt;decode()&amp;lt;/code&amp;gt;.&lt;br /&gt;
#* If encoding is one of &amp;quot;&amp;lt;code&amp;gt;utf-8&amp;lt;/code&amp;gt;&amp;quot;, &amp;quot;&amp;lt;code&amp;gt;utf-16&amp;lt;/code&amp;gt;&amp;quot; or &amp;quot;&amp;lt;code&amp;gt;utf-16be&amp;lt;/code&amp;gt;&amp;quot; and &amp;lt;var&amp;gt;offset&amp;lt;/var&amp;gt; is &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt;, then prior to running the steps of the decoder algorithm:&lt;br /&gt;
#** If the internal &amp;lt;var&amp;gt;useBOM&amp;lt;/var&amp;gt; flag is set, then:&lt;br /&gt;
#*** If fewer than two bytes are present in the stream and the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag is set, suspend this algorithm and return the empty string.&lt;br /&gt;
#*** If the next two bytes of the stream are &#039;&#039;&#039;0xFF 0xFE&#039;&#039;&#039; then set &amp;lt;var&amp;gt;offset&amp;lt;/var&amp;gt; to &amp;lt;code&amp;gt;2&amp;lt;/code&amp;gt;, and clear the &amp;lt;var&amp;gt;utf-16be&amp;lt;/var&amp;gt; flag of the decoder state.&lt;br /&gt;
#*** If the next two bytes of the stream are &#039;&#039;&#039;0xFE 0xFF&#039;&#039;&#039; then set &amp;lt;var&amp;gt;offset&amp;lt;/var&amp;gt; to &amp;lt;code&amp;gt;2&amp;lt;/code&amp;gt;, and set the &amp;lt;var&amp;gt;utf-16be&amp;lt;/var&amp;gt; flag of the decoder state.&lt;br /&gt;
#** If encoding is &amp;quot;&amp;lt;code&amp;gt;utf-8&amp;lt;/code&amp;gt;&amp;quot;, then:&lt;br /&gt;
#*** If fewer than three bytes are present in the stream and the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag is set, suspend this algorithm and return the empty string.&lt;br /&gt;
#*** If the next three bytes of the stream are &#039;&#039;&#039;0xEF 0xBB 0xBF&#039;&#039;&#039; then set &amp;lt;var&amp;gt;offset&amp;lt;/var&amp;gt; to &amp;lt;code&amp;gt;3&amp;lt;/code&amp;gt;.&lt;br /&gt;
#** If encoding is &amp;quot;&amp;lt;code&amp;gt;utf-16&amp;lt;/code&amp;gt;&amp;quot;, then:&lt;br /&gt;
#*** If fewer than two bytes are present in the stream and the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag is set, suspend this algorithm and return the empty string.&lt;br /&gt;
#*** If the next two bytes of the stream are &#039;&#039;&#039;0xFF 0xFE&#039;&#039;&#039; then set &amp;lt;var&amp;gt;offset&amp;lt;/var&amp;gt; to &amp;lt;code&amp;gt;2&amp;lt;/code&amp;gt;.&lt;br /&gt;
#** If encoding is &amp;quot;&amp;lt;code&amp;gt;utf-16be&amp;lt;/code&amp;gt;&amp;quot;, then:&lt;br /&gt;
#*** If fewer than two bytes are present in the stream and the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag is set, suspend this algorithm and return the empty string.&lt;br /&gt;
#*** If the next two bytes of the stream are &#039;&#039;&#039;0xFE 0xFF&#039;&#039;&#039; then set &amp;lt;var&amp;gt;offset&amp;lt;/var&amp;gt; to &amp;lt;code&amp;gt;2&amp;lt;/code&amp;gt;.&lt;br /&gt;
#* If the internal &amp;lt;var&amp;gt;fatal&amp;lt;/var&amp;gt; flag of the &#039;&#039;decoder object&#039;&#039; is set, then a &#039;&#039;&#039;decoder error&#039;&#039;&#039; causes a &amp;lt;code&amp;gt;DOMException&amp;lt;/code&amp;gt; of type &amp;lt;code&amp;gt;EncodingError&amp;lt;/code&amp;gt; to be thrown rather than emitting a fallback code point.&lt;br /&gt;
#* The output of the algorithm is a &amp;lt;var&amp;gt;sequence of emitted code points&amp;lt;/var&amp;gt;.&lt;br /&gt;
# Return an IDL &amp;lt;code&amp;gt;DOMString&amp;lt;/code&amp;gt; value that represents the sequence of code units resulting from encoding the &amp;lt;var&amp;gt;sequence of emitted code points&amp;lt;/var&amp;gt; as UTF-16.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/dl&amp;gt;&lt;br /&gt;
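&lt;br /&gt;
The streaming, byte order mark and &amp;lt;var&amp;gt;fatal&amp;lt;/var&amp;gt; behaviour described in the &amp;lt;code&amp;gt;decode&amp;lt;/code&amp;gt; steps can be observed with a short sketch. It assumes a runtime with the standardized &amp;lt;code&amp;gt;TextDecoder&amp;lt;/code&amp;gt;, whose &amp;lt;code&amp;gt;stream&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;fatal&amp;lt;/code&amp;gt; options correspond to the ones in this draft; the exception type thrown in fatal mode may differ from the &amp;lt;code&amp;gt;EncodingError&amp;lt;/code&amp;gt; named here, so the sketch only records that something was thrown.&lt;br /&gt;
&lt;br /&gt;
```javascript
// Assumption: a global TextDecoder with the stream and fatal options is available.
var decoder = new TextDecoder("utf-8");

// U+20AC (euro sign) is E2 82 AC in UTF-8; split it across two chunks.
var first = decoder.decode(new Uint8Array([0xE2, 0x82]), { stream: true });
var second = decoder.decode(new Uint8Array([0xAC]));  // no stream option: end of input

// A leading UTF-8 byte order mark (EF BB BF) is consumed rather than decoded.
var withBom = new TextDecoder("utf-8").decode(
  new Uint8Array([0xEF, 0xBB, 0xBF, 0x41]));

// With the fatal flag set, malformed input throws instead of yielding U+FFFD.
var fatalThrew = false;
try {
  new TextDecoder("utf-8", { fatal: true }).decode(new Uint8Array([0xFF]));
} catch (e) {
  fatalThrew = true;
}
var replaced = new TextDecoder("utf-8").decode(new Uint8Array([0xFF]));
```
&lt;br /&gt;
The first call returns the empty string because the decoder holds the incomplete sequence until the remaining byte arrives in the second call.&lt;br /&gt;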
&lt;br /&gt;
== Examples ==&lt;br /&gt;
&lt;br /&gt;
=== Example #1 - encoding strings ===&lt;br /&gt;
&lt;br /&gt;
The following example uses the API to encode an array of strings into an ArrayBuffer. The result is an ArrayBuffer containing the number of strings (as a Uint32), followed by the byte length of the first string (as a Uint32), the encoded string data (UTF-8 by default), the byte length of the second string (as a Uint32), the string data, and so on.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
function encodeArrayOfStrings(strings, encoding) {&lt;br /&gt;
  var encoder, encoded, len, i, bytes, view, offset;&lt;br /&gt;
&lt;br /&gt;
  encoder = new TextEncoder(encoding);&lt;br /&gt;
  encoded = [];&lt;br /&gt;
&lt;br /&gt;
  // First pass: encode each string and total the bytes needed,&lt;br /&gt;
  // starting with the Uint32 that holds the string count.&lt;br /&gt;
  len = Uint32Array.BYTES_PER_ELEMENT;&lt;br /&gt;
  for (i = 0; i &amp;lt; strings.length; i += 1) {&lt;br /&gt;
    len += Uint32Array.BYTES_PER_ELEMENT;&lt;br /&gt;
    encoded[i] = encoder.encode(strings[i]);&lt;br /&gt;
    len += encoded[i].byteLength;&lt;br /&gt;
  }&lt;br /&gt;
&lt;br /&gt;
  bytes = new Uint8Array(len);&lt;br /&gt;
  view = new DataView(bytes.buffer);&lt;br /&gt;
  offset = 0;&lt;br /&gt;
&lt;br /&gt;
  view.setUint32(offset, strings.length);&lt;br /&gt;
  offset += Uint32Array.BYTES_PER_ELEMENT;&lt;br /&gt;
  for (i = 0; i &amp;lt; encoded.length; i += 1) {&lt;br /&gt;
    len = encoded[i].byteLength;&lt;br /&gt;
    view.setUint32(offset, len);&lt;br /&gt;
    offset += Uint32Array.BYTES_PER_ELEMENT;&lt;br /&gt;
    bytes.set(encoded[i], offset);&lt;br /&gt;
    offset += len;&lt;br /&gt;
  }&lt;br /&gt;
  return bytes.buffer;&lt;br /&gt;
}&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
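&lt;br /&gt;
To make the resulting layout concrete, the sketch below builds by hand the bytes that the format above would contain for the single string &amp;lt;code&amp;gt;&amp;quot;hi&amp;quot;&amp;lt;/code&amp;gt;. It assumes a runtime with the standardized &amp;lt;code&amp;gt;TextEncoder&amp;lt;/code&amp;gt; and relies on &amp;lt;code&amp;gt;DataView.setUint32&amp;lt;/code&amp;gt; writing big-endian values when no endianness argument is given, which is why the count and length prefixes appear most significant byte first.&lt;br /&gt;
&lt;br /&gt;
```javascript
// Assumption: a global TextEncoder is available (modern browsers, recent Node.js).
var bytes = new Uint8Array(4 + 4 + 2);   // count + one length prefix + two data bytes
var view = new DataView(bytes.buffer);

view.setUint32(0, 1);                    // number of strings, big-endian by default
view.setUint32(4, 2);                    // byte length of "hi"
bytes.set(new TextEncoder().encode("hi"), 8);

var layout = Array.from(bytes);          // [0, 0, 0, 1, 0, 0, 0, 2, 104, 105]
```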
&lt;br /&gt;
=== Example #2 - decoding strings ===&lt;br /&gt;
&lt;br /&gt;
The following example decodes an ArrayBuffer containing data encoded in the format produced by the previous example back into an array of strings.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
function decodeArrayOfStrings(buffer, encoding) {&lt;br /&gt;
  var decoder, view, offset, num_strings, strings, i, len;&lt;br /&gt;
&lt;br /&gt;
  decoder = new TextDecoder(encoding);&lt;br /&gt;
  view = new DataView(buffer);&lt;br /&gt;
  offset = 0;&lt;br /&gt;
  strings = [];&lt;br /&gt;
&lt;br /&gt;
  num_strings = view.getUint32(offset);&lt;br /&gt;
  offset += Uint32Array.BYTES_PER_ELEMENT;&lt;br /&gt;
  for (i = 0; i &amp;lt; num_strings; i += 1) {&lt;br /&gt;
    len = view.getUint32(offset);&lt;br /&gt;
    offset += Uint32Array.BYTES_PER_ELEMENT;&lt;br /&gt;
    strings[i] = decoder.decode(&lt;br /&gt;
      new DataView(view.buffer, offset, len));&lt;br /&gt;
    offset += len;&lt;br /&gt;
  }&lt;br /&gt;
  return strings;&lt;br /&gt;
}&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Encodings ==&lt;br /&gt;
&lt;br /&gt;
Encodings are defined and implemented per [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding]. This implicitly includes the steps to &amp;lt;em&amp;gt;get an encoding&amp;lt;/em&amp;gt; from a string, and the logic for label matching and case-insensitivity.&lt;br /&gt;
&lt;br /&gt;
User agents MUST NOT support any other encodings or labels than those defined in [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding], and MUST support all encodings and labels defined in that specification, with the additions defined below.&lt;br /&gt;
&lt;br /&gt;
:&#039;&#039;NOTE: In [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding], &amp;quot;ascii&amp;quot; is a label for &#039;&#039;&#039;windows-1252&#039;&#039;&#039;; there is no 7-bit-or-raise-exception encoding. Applications that are required to restrict the content of decoded strings should implement validation after decoding.&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
:&#039;&#039;NOTE: Unicode normalization forms are outside the scope of this specification. No normalization is done prior to encoding or after decoding.&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
:&#039;&#039;NOTE: Handling of encoding-specific issues, e.g. over-long UTF-8 encodings, byte order marks, unmatched surrogate pairs, and so on is defined by [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding].&#039;&#039;&lt;br /&gt;
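&lt;br /&gt;
The first note suggests validating decoded strings when content must be restricted. One possible form of that check is sketched below, with a hypothetical helper &amp;lt;code&amp;gt;isSevenBit&amp;lt;/code&amp;gt; that is not part of this API; it assumes a runtime with the standardized &amp;lt;code&amp;gt;TextDecoder&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
```javascript
// Hypothetical helper: true when every code unit is in the 7-bit ASCII range.
function isSevenBit(text) {
  return /^[\x00-\x7F]*$/.test(text);
}

// Assumption: a global TextDecoder is available.
// Bytes 0x41 0xC3 0xA9 decode to "A" followed by U+00E9.
var decoded = new TextDecoder("utf-8").decode(new Uint8Array([0x41, 0xC3, 0xA9]));
var plainOk = isSevenBit("plain text");
var decodedOk = isSevenBit(decoded);
```
&lt;br /&gt;
Here the check runs on the decoded string rather than on the bytes, so it is independent of which encoding produced the string.&lt;br /&gt;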
&lt;br /&gt;
== References ==&lt;br /&gt;
&lt;br /&gt;
* WebIDL http://dev.w3.org/2006/webapi/WebIDL&lt;br /&gt;
* Encoding http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== Acknowledgements ==&lt;br /&gt;
&lt;br /&gt;
* Alan Chaney&lt;br /&gt;
* Ben Noordhuis&lt;br /&gt;
* Glenn Maynard&lt;br /&gt;
* John Tamplin&lt;br /&gt;
* Kenneth Russell (Google, Inc)&lt;br /&gt;
* Robert Mustacchi&lt;br /&gt;
* Ryan Dahl&lt;br /&gt;
* Anne van Kesteren&lt;br /&gt;
* Cameron McCormack&lt;br /&gt;
&lt;br /&gt;
== Appendix ==&lt;br /&gt;
&lt;br /&gt;
A &amp;quot;shim&amp;quot; implementation in ECMAScript (that may not fully match the current version of the spec) plus some initial unit tests can be found at:&lt;br /&gt;
&lt;br /&gt;
:http://code.google.com/p/stringencoding/&lt;br /&gt;
&lt;br /&gt;
[[Category:Proposals]]&lt;/div&gt;</summary>
		<author><name>Jsbell</name></author>
	</entry>
	<entry>
		<id>https://wiki.whatwg.org/index.php?title=StringEncoding&amp;diff=8487</id>
		<title>StringEncoding</title>
		<link rel="alternate" type="text/html" href="https://wiki.whatwg.org/index.php?title=StringEncoding&amp;diff=8487"/>
		<updated>2012-09-13T23:09:13Z</updated>

		<summary type="html">&lt;p&gt;Jsbell: /* TextDecoder */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;Proposed Text Encoding Web API for Typed Arrays&lt;br /&gt;
&lt;br /&gt;
== Editors ==&lt;br /&gt;
&lt;br /&gt;
* Joshua Bell (Google, Inc)&lt;br /&gt;
&lt;br /&gt;
== Abstract ==&lt;br /&gt;
&lt;br /&gt;
This specification defines an API for encoding strings to binary data, and decoding strings from binary data.&lt;br /&gt;
&lt;br /&gt;
NOTE: This specification intentionally does not address the opposite scenario of encoding binary data as strings and decoding binary data from strings, for example using Base64 encoding.&lt;br /&gt;
&lt;br /&gt;
Discussion on this topic has so far taken place on the [http://www.khronos.org/webgl/public-mailing-list/ public_webgl@khronos.org] mailing list. See http://www.khronos.org/webgl/public-mailing-list/archives/1111/msg00017.html for the initial discussion thread.&lt;br /&gt;
&lt;br /&gt;
Discussion has since moved to the [http://www.whatwg.org/mailing-list#specs WHATWG spec discussion] mailing list. See  http://lists.whatwg.org/htdig.cgi/whatwg-whatwg.org/2012-March/035038.html for the latest discussion thread.&lt;br /&gt;
&lt;br /&gt;
== Open Issues ==&lt;br /&gt;
&lt;br /&gt;
* Should the &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; attribute return the &amp;lt;var&amp;gt;Name&amp;lt;/var&amp;gt; of the encoding or the name that was passed in?&lt;br /&gt;
&lt;br /&gt;
== Notes to Implementers ==&lt;br /&gt;
&lt;br /&gt;
* Streaming decode/encode requires retaining partial buffers between calls.&lt;br /&gt;
** Some encode/decode algorithms require adjusting the &amp;lt;var&amp;gt;code point pointer&amp;lt;/var&amp;gt; or &amp;lt;var&amp;gt;byte pointer&amp;lt;/var&amp;gt; by a negative amount. This could occur across &amp;quot;chunk&amp;quot; boundaries. This implies that when the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag is set on an encoder/decoder, the last N elements of the stream are saved for the next call and used as a prefix for the stream. N is defined by the specific encoding algorithm.&lt;br /&gt;
** This is not yet implemented in the [http://code.google.com/p/stringencoding/ ECMAScript shim].&lt;br /&gt;
&lt;br /&gt;
== Resolved Issues ==&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dl&amp;gt;&lt;br /&gt;
&amp;lt;dt&amp;gt;Encode as many characters as possible into a fixed-size buffer for transmission, and repeat starting with next unencoded character&lt;br /&gt;
&amp;lt;dd&amp;gt;Resolution: not for &amp;quot;v1&amp;quot; - can be implemented using this API, and breaking the string down by &amp;quot;character&amp;quot; is unlikely to be as obvious as it sounds - surrogate pairs, combining sequences, etc&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dt&amp;gt;Support two versions of encode; one which takes target buffer, and one which creates/returns a right-sized buffer&lt;br /&gt;
&amp;lt;dd&amp;gt;Resolution: See above, and wait for developer feedback.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dt&amp;gt;Allow arbitrary end byte sequences (e.g. &amp;lt;code&amp;gt;0xFF&amp;lt;/code&amp;gt; for UTF-8 strings)&lt;br /&gt;
&amp;lt;dd&amp;gt;Resolution: Add &amp;lt;code&amp;gt;indexOf&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;ArrayBufferView&amp;lt;/code&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dt&amp;gt;Remove &#039;&#039;binary&#039;&#039; encoding? &#039;&#039;(a proposed 8-bit-clean encoding for interop with legacy binary data stored in ECMAScript strings)&#039;&#039;&lt;br /&gt;
&amp;lt;dd&amp;gt;The only real use is with &amp;lt;code&amp;gt;atob()&amp;lt;/code&amp;gt;/&amp;lt;code&amp;gt;btoa()&amp;lt;/code&amp;gt;; a better API would be Base64 directly in/out of Typed Arrays. Consensus on WHATWG is to add better APIs e.g. &lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
partial interface ArrayBufferView {&lt;br /&gt;
    DOMString toBase64();&lt;br /&gt;
};&lt;br /&gt;
&lt;br /&gt;
partial interface ArrayBuffer {&lt;br /&gt;
    static ArrayBuffer fromBase64(DOMString string);&lt;br /&gt;
};&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dt&amp;gt;Should this be a standalone API (as written), or live on e.g. DataView, or on e.g. String?&lt;br /&gt;
&amp;lt;dd&amp;gt;There seems to be pretty strong consensus that streaming, stateful coding is high priority and that an object-oriented API is cleanest, and shoe-horning those onto existing objects would be messy.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dt&amp;gt;Should legacy encodings be supported?&lt;br /&gt;
&amp;lt;dd&amp;gt;Resolution: Consensus on the [http://lists.whatwg.org/htdig.cgi/whatwg-whatwg.org/2012-August/036825.html WHATWG mailing list] - support legacy encodings for decode, and only &#039;utf-8&#039;, &#039;utf-16&#039; and &#039;utf-16be&#039; for encode.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dt&amp;gt;What to do on encoding errors? If non-UTF encodings are supported then we may want to allow substitution (e.g. ASCII &#039;?&#039;) or a script callback (for arbitrary escaping).&lt;br /&gt;
&amp;lt;dd&amp;gt;Resolution: not for &amp;quot;v1&amp;quot; (see above)&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dt&amp;gt;How are byte order marks handled? &lt;br /&gt;
&amp;lt;dd&amp;gt;BOM is respected if and only if the requested encoding is a case-insensitive match for &amp;quot;&amp;lt;code&amp;gt;utf-16&amp;lt;/code&amp;gt;&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/dl&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== API ==&lt;br /&gt;
&lt;br /&gt;
=== TextEncoder ===&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;WebIDL&#039;&#039;&#039;&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
dictionary TextEncodeOptions {&lt;br /&gt;
  boolean stream = false;&lt;br /&gt;
};&lt;br /&gt;
&lt;br /&gt;
[Constructor(optional DOMString encoding)]&lt;br /&gt;
interface TextEncoder {&lt;br /&gt;
  readonly attribute DOMString encoding;&lt;br /&gt;
  Uint8Array encode(DOMString? string, optional TextEncodeOptions options);&lt;br /&gt;
};&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The constructor runs the following steps:&lt;br /&gt;
# If the constructor is called with no arguments, let &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt; be &amp;quot;&amp;lt;code&amp;gt;utf-8&amp;lt;/code&amp;gt;&amp;quot;. Otherwise, let &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt; be the value of the &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; argument.&lt;br /&gt;
# Run the &#039;&#039;&#039;steps to get an encoding&#039;&#039;&#039; from [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding], with &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt; as &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt;.&lt;br /&gt;
#* If the steps result in failure, throw an &amp;quot;&amp;lt;code&amp;gt;EncodingError&amp;lt;/code&amp;gt;&amp;quot; exception and terminate these steps.&lt;br /&gt;
#* Otherwise, if the &amp;lt;var&amp;gt;Name&amp;lt;/var&amp;gt; of the returned encoding is not one of &amp;quot;&amp;lt;code&amp;gt;utf-8&amp;lt;/code&amp;gt;&amp;quot;, &amp;quot;&amp;lt;code&amp;gt;utf-16&amp;lt;/code&amp;gt;&amp;quot;, or &amp;quot;&amp;lt;code&amp;gt;utf-16be&amp;lt;/code&amp;gt;&amp;quot; throw an &amp;quot;&amp;lt;code&amp;gt;EncodingError&amp;lt;/code&amp;gt;&amp;quot; exception and terminate these steps.&lt;br /&gt;
#* Otherwise, set the encoder object&#039;s internal &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; property to the returned encoding.&lt;br /&gt;
# Initialize the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag of the encoder object to false. &lt;br /&gt;
# Initialize the internal &amp;lt;var&amp;gt;encoding algorithm state&amp;lt;/var&amp;gt; to the default values for the encoding &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;.&lt;br /&gt;
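&lt;br /&gt;
A minimal sketch of the default case in the constructor steps, assuming a runtime with the standardized &amp;lt;code&amp;gt;TextEncoder&amp;lt;/code&amp;gt;. That implementation accepts only UTF-8, whereas this draft also allows &amp;lt;code&amp;gt;utf-16&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;utf-16be&amp;lt;/code&amp;gt;, so only the no-argument form is shown.&lt;br /&gt;
&lt;br /&gt;
```javascript
// Assumption: a global TextEncoder is available (modern browsers, recent Node.js).
var defaultEncoder = new TextEncoder();          // called with no arguments
var encodingName = defaultEncoder.encoding;      // Name of the selected encoding
```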
&lt;br /&gt;
&amp;lt;dl&amp;gt;&lt;br /&gt;
&amp;lt;dt&amp;gt;&amp;lt;code&amp;gt;encoding&amp;lt;/code&amp;gt; of type DOMString, readonly&lt;br /&gt;
&amp;lt;dd&amp;gt;Returns the &amp;lt;var&amp;gt;Name&amp;lt;/var&amp;gt; of the encoder object&#039;s &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;, per [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding]. &lt;br /&gt;
&lt;br /&gt;
:Note that this may differ from the name of the encoding specified during the call to the constructor. For example, if the constructor is called with &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; of &amp;lt;code&amp;gt;&amp;quot;utf8&amp;quot;&amp;lt;/code&amp;gt; the &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; attribute of the &#039;&#039;encoder object&#039;&#039; would have the value &amp;lt;code&amp;gt;&amp;quot;utf-8&amp;quot;&amp;lt;/code&amp;gt; as &amp;lt;code&amp;gt;&amp;quot;utf8&amp;quot;&amp;lt;/code&amp;gt; is a label for that encoding.&lt;br /&gt;
&amp;lt;/dl&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dl&amp;gt;&lt;br /&gt;
&amp;lt;dt&amp;gt;&amp;lt;code&amp;gt;encode&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;dd&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The &amp;lt;code&amp;gt;encode&amp;lt;/code&amp;gt; method runs these steps:&lt;br /&gt;
&lt;br /&gt;
# If the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag is not set, then reset the &amp;lt;var&amp;gt;encoding algorithm state&amp;lt;/var&amp;gt; to the default values for &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;. Otherwise, the &amp;lt;var&amp;gt;encoding algorithm state&amp;lt;/var&amp;gt; is re-used from the previous call to &amp;lt;code&amp;gt;encode&amp;lt;/code&amp;gt; on this object.&lt;br /&gt;
# If the &amp;lt;var&amp;gt;options&amp;lt;/var&amp;gt; parameter is specified and the &amp;lt;code&amp;gt;stream&amp;lt;/code&amp;gt; option is &#039;&#039;&#039;true&#039;&#039;&#039;, then the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag is set; otherwise the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag is cleared.&lt;br /&gt;
# Run the steps of the &amp;lt;var&amp;gt;encoding algorithm&amp;lt;/var&amp;gt;:&lt;br /&gt;
#* The input to the algorithm is a &amp;lt;var&amp;gt;stream of code points&amp;lt;/var&amp;gt;. The stream is composed of the Unicode code points for the Unicode characters produced by following the [http://dev.w3.org/2006/webapi/WebIDL/#dfn-obtain-unicode steps to convert a DOMString to a sequence of Unicode characters] in [http://dev.w3.org/2006/webapi/WebIDL/ WebIDL] with &amp;lt;var&amp;gt;string&amp;lt;/var&amp;gt; as the input. If &amp;lt;var&amp;gt;string&amp;lt;/var&amp;gt; is null, the &amp;lt;var&amp;gt;stream of code points&amp;lt;/var&amp;gt; is empty.&lt;br /&gt;
#* If the &amp;lt;var&amp;gt;options&amp;lt;/var&amp;gt; parameter is not specified or the &amp;lt;code&amp;gt;stream&amp;lt;/code&amp;gt; option is false, then the &#039;&#039;&#039;EOF code point&#039;&#039;&#039; is yielded after the final code point is yielded by the stream.&lt;br /&gt;
#* The output of the algorithm is a &amp;lt;var&amp;gt;sequence of emitted bytes&amp;lt;/var&amp;gt;.&lt;br /&gt;
# Return a &amp;lt;code&amp;gt;Uint8Array&amp;lt;/code&amp;gt; object wrapping an &amp;lt;code&amp;gt;ArrayBuffer&amp;lt;/code&amp;gt; containing the &amp;lt;var&amp;gt;sequence of emitted bytes&amp;lt;/var&amp;gt; produced by the encoder algorithm.&lt;br /&gt;
&lt;br /&gt;
:&#039;&#039;NOTE: Because only UTF encodings are supported, and because of the use of the [http://dev.w3.org/2006/webapi/WebIDL/#dfn-obtain-unicode steps to convert a DOMString to a sequence of Unicode characters] and thus Unicode code points, no input can cause the encoding process to emit an encoder error.&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/dl&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== TextDecoder ===&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;WebIDL&#039;&#039;&#039;&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
dictionary TextDecoderOptions {&lt;br /&gt;
  boolean fatal = false;&lt;br /&gt;
};&lt;br /&gt;
&lt;br /&gt;
dictionary TextDecodeOptions {&lt;br /&gt;
  boolean stream = false;&lt;br /&gt;
};&lt;br /&gt;
&lt;br /&gt;
[Constructor(optional DOMString encoding, optional TextDecoderOptions options)]&lt;br /&gt;
interface TextDecoder {&lt;br /&gt;
  readonly attribute DOMString encoding;&lt;br /&gt;
  DOMString decode(optional ArrayBufferView view, optional TextDecodeOptions options);&lt;br /&gt;
};&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The constructor runs the following steps:&lt;br /&gt;
# The constructor creates a &#039;&#039;decoder object&#039;&#039;. It has the internal properties &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;, a &amp;lt;var&amp;gt;fatal&amp;lt;/var&amp;gt; flag which is initially unset,  a &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag which is initially unset, an &amp;lt;var&amp;gt;offset&amp;lt;/var&amp;gt; pointer which is initially &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt;, and a &amp;lt;var&amp;gt;useBOM&amp;lt;/var&amp;gt; flag which is initially unset.&lt;br /&gt;
# If called without an &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; argument, let &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt; be the &amp;quot;&amp;lt;code&amp;gt;utf-8&amp;lt;/code&amp;gt;&amp;quot;. Otherwise, let &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt; be the value of the &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; argument.&lt;br /&gt;
# If &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt; is a case-insensitive match for &amp;quot;&amp;lt;code&amp;gt;utf-16&amp;lt;/code&amp;gt;&amp;quot; then set the internal &amp;lt;var&amp;gt;useBOM&amp;lt;/var&amp;gt; flag.&lt;br /&gt;
# Run the &#039;&#039;&#039;steps to get an encoding&#039;&#039;&#039; from [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding], with &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt; as &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt;. &lt;br /&gt;
## If the steps result in failure, throw an &amp;quot;&amp;lt;code&amp;gt;EncodingError&amp;lt;/code&amp;gt;&amp;quot; exception and terminate these steps.&lt;br /&gt;
## Otherwise, set the &#039;&#039;decoder object&#039;s&#039;&#039; internal &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; property to the returned encoding.&lt;br /&gt;
# If the constructor is called with an &amp;lt;var&amp;gt;options&amp;lt;/var&amp;gt; argument, and the &amp;lt;var&amp;gt;fatal&amp;lt;/var&amp;gt; property of the dictionary is set, set the internal &amp;lt;var&amp;gt;fatal&amp;lt;/var&amp;gt; flag of the &#039;&#039;decoder object&#039;&#039;.&lt;br /&gt;
# Initialize the internal &amp;lt;var&amp;gt;encoding algorithm state&amp;lt;/var&amp;gt; to the default values for the encoding &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;.&lt;br /&gt;
# Return the &#039;&#039;decoder object&#039;&#039;.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dl&amp;gt;&lt;br /&gt;
&amp;lt;dt&amp;gt;&amp;lt;code&amp;gt;encoding&amp;lt;/code&amp;gt; of type DOMString, readonly&lt;br /&gt;
&amp;lt;dd&amp;gt;&lt;br /&gt;
Returns the &amp;lt;var&amp;gt;Name&amp;lt;/var&amp;gt; of the decoder object&#039;s &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;, per [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding].&lt;br /&gt;
&lt;br /&gt;
:Note that this may differ from the name of the encoding specified during the call to the constructor. For example, if the constructor is called with &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; of &amp;lt;code&amp;gt;&amp;quot;ascii&amp;quot;&amp;lt;/code&amp;gt; the &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; attribute of the &#039;&#039;decoder object&#039;&#039; would have the value &amp;lt;code&amp;gt;&amp;quot;windows-1252&amp;quot;&amp;lt;/code&amp;gt; as &amp;lt;code&amp;gt;&amp;quot;ascii&amp;quot;&amp;lt;/code&amp;gt; is a label for that encoding.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dt&amp;gt;&amp;lt;code&amp;gt;decode&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;dd&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The &amp;lt;code&amp;gt;decode&amp;lt;/code&amp;gt; method runs the following steps:&lt;br /&gt;
&lt;br /&gt;
# If &amp;lt;var&amp;gt;view&amp;lt;/var&amp;gt; is not specified, let &amp;lt;var&amp;gt;view&amp;lt;/var&amp;gt; be a &amp;lt;code&amp;gt;Uint8Array&amp;lt;/code&amp;gt; of length &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt;.&lt;br /&gt;
# If the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag of the &#039;&#039;decoder object&#039;&#039; is not set, then reset the &amp;lt;var&amp;gt;encoding algorithm state&amp;lt;/var&amp;gt; to the default values for encoding &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; and set &amp;lt;var&amp;gt;offset&amp;lt;/var&amp;gt; to &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt;. Otherwise, the &amp;lt;var&amp;gt;encoding algorithm state&amp;lt;/var&amp;gt; is re-used from the previous call to &amp;lt;code&amp;gt;decode&amp;lt;/code&amp;gt; on this object.&lt;br /&gt;
# If the &amp;lt;var&amp;gt;options&amp;lt;/var&amp;gt; parameter is specified and the &amp;lt;var&amp;gt;stream&amp;lt;/var&amp;gt; option is &#039;&#039;&#039;true&#039;&#039;&#039;, then the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag is set. Otherwise the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag is cleared.&lt;br /&gt;
# Run or resume the decoder algorithm of the &#039;&#039;decoder object&#039;s&#039;&#039; encoding:&lt;br /&gt;
#* The input to the algorithm is a &amp;lt;var&amp;gt;byte stream&amp;lt;/var&amp;gt;.&lt;br /&gt;
#** If &amp;lt;var&amp;gt;offset&amp;lt;/var&amp;gt; is greater than &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt;, the bytes in &amp;lt;var&amp;gt;byte stream&amp;lt;/var&amp;gt; for positions less than &amp;lt;var&amp;gt;offset&amp;lt;/var&amp;gt; are provided by buffer(s) passed in to previous calls to &amp;lt;code&amp;gt;decode()&amp;lt;/code&amp;gt;.&lt;br /&gt;
#** The bytes in &amp;lt;var&amp;gt;byte stream&amp;lt;/var&amp;gt; for positions &amp;lt;var&amp;gt;offset&amp;lt;/var&amp;gt; through &amp;lt;code&amp;gt;&amp;lt;var&amp;gt;offset&amp;lt;/var&amp;gt; + &amp;lt;var&amp;gt;view.byteLength&amp;lt;/var&amp;gt; - 1&amp;lt;/code&amp;gt; are provided by the bytes in &amp;lt;var&amp;gt;view.buffer&amp;lt;/var&amp;gt; starting at offset &amp;lt;var&amp;gt;view.byteOffset&amp;lt;/var&amp;gt;.&lt;br /&gt;
#** When accessing the byte in &amp;lt;var&amp;gt;byte stream&amp;lt;/var&amp;gt; at position &amp;lt;code&amp;gt;&amp;lt;var&amp;gt;offset&amp;lt;/var&amp;gt; + &amp;lt;var&amp;gt;view.byteLength&amp;lt;/var&amp;gt;&amp;lt;/code&amp;gt;:&lt;br /&gt;
#*** If the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag is not set, then yield the &#039;&#039;&#039;EOF byte&#039;&#039;&#039;.&lt;br /&gt;
#*** Otherwise, set &amp;lt;var&amp;gt;offset&amp;lt;/var&amp;gt; to &amp;lt;code&amp;gt;&amp;lt;var&amp;gt;offset&amp;lt;/var&amp;gt; + &amp;lt;var&amp;gt;view.byteLength&amp;lt;/var&amp;gt;&amp;lt;/code&amp;gt; and suspend the steps of the decoder algorithm until a subsequent call to &amp;lt;code&amp;gt;decode()&amp;lt;/code&amp;gt;&lt;br /&gt;
#* If the internal &amp;lt;var&amp;gt;useBOM&amp;lt;/var&amp;gt; flag is set and &amp;lt;var&amp;gt;offset&amp;lt;/var&amp;gt; is &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; then:&lt;br /&gt;
#** If fewer than two bytes are present in the stream and the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag is set, suspend this algorithm and return the empty string.&lt;br /&gt;
#** If the next two bytes of the stream are &#039;&#039;&#039;0xFF 0xFE&#039;&#039;&#039; then set &amp;lt;var&amp;gt;offset&amp;lt;/var&amp;gt; to &amp;lt;code&amp;gt;2&amp;lt;/code&amp;gt;, and clear the &amp;lt;var&amp;gt;utf-16be&amp;lt;/var&amp;gt; flag of the decoder state.&lt;br /&gt;
#** If the next two bytes of the stream are &#039;&#039;&#039;0xFE 0xFF&#039;&#039;&#039; then set &amp;lt;var&amp;gt;offset&amp;lt;/var&amp;gt; to &amp;lt;code&amp;gt;2&amp;lt;/code&amp;gt;, and set the &amp;lt;var&amp;gt;utf-16be&amp;lt;/var&amp;gt; flag of the decoder state.&lt;br /&gt;
#* After the &amp;lt;var&amp;gt;useBOM&amp;lt;/var&amp;gt; test above and before executing the decoder algorithm, if encoding is one of &amp;quot;&amp;lt;code&amp;gt;utf-8&amp;lt;/code&amp;gt;&amp;quot;, &amp;quot;&amp;lt;code&amp;gt;utf-16&amp;lt;/code&amp;gt;&amp;quot; or &amp;quot;&amp;lt;code&amp;gt;utf-16be&amp;lt;/code&amp;gt;&amp;quot; and &amp;lt;var&amp;gt;offset&amp;lt;/var&amp;gt; is &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt;, advance &amp;lt;var&amp;gt;offset&amp;lt;/var&amp;gt; past a matching BOM if one is present. If too few bytes are present in the stream to test for a match and the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag is set, suspend this algorithm and return the empty string.&lt;br /&gt;
#* If the internal &amp;lt;var&amp;gt;fatal&amp;lt;/var&amp;gt; flag of the &#039;&#039;decoder object&#039;&#039; is set, then a &#039;&#039;&#039;decoder error&#039;&#039;&#039; causes a &amp;lt;code&amp;gt;DOMException&amp;lt;/code&amp;gt; of type &amp;lt;code&amp;gt;EncodingError&amp;lt;/code&amp;gt; to be thrown rather than emitting a fallback code point.&lt;br /&gt;
#* The output of the algorithm is a &amp;lt;var&amp;gt;sequence of emitted code points&amp;lt;/var&amp;gt;.&lt;br /&gt;
# Return an IDL &amp;lt;code&amp;gt;DOMString&amp;lt;/code&amp;gt; value that represents the sequence of code units resulting from encoding the &amp;lt;var&amp;gt;sequence of emitted code points&amp;lt;/var&amp;gt; as UTF-16.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/dl&amp;gt;&lt;br /&gt;
&lt;br /&gt;
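As a rough illustration of the streaming behaviour in the steps above, the following sketch decodes a UTF-8 byte sequence that has been split in the middle of a multi-byte character. It assumes an environment that provides a &amp;lt;code&amp;gt;TextDecoder&amp;lt;/code&amp;gt; compatible with this proposal; the helper name &amp;lt;code&amp;gt;decodeChunks&amp;lt;/code&amp;gt; is hypothetical and not part of the API.&lt;br /&gt;
&lt;br /&gt;
```javascript
// Hypothetical helper (not part of the proposal): decode a list of
// Uint8Array chunks as one continuous byte stream.
function decodeChunks(chunks, encoding) {
  var decoder = new TextDecoder(encoding);
  var result = "";
  chunks.forEach(function (chunk) {
    // stream: true keeps the decoder state, including any partial
    // byte sequence, alive until the next call to decode().
    result += decoder.decode(chunk, { stream: true });
  });
  // A final call without the stream option ends the stream, which
  // corresponds to yielding the EOF byte.
  result += decoder.decode();
  return result;
}

// U+20AC EURO SIGN is 0xE2 0x82 0xAC in UTF-8; split it across two chunks.
var text = decodeChunks(
  [new Uint8Array([0x41, 0xE2]), new Uint8Array([0x82, 0xAC, 0x42])],
  "utf-8");
```
&lt;br /&gt;
Here the first call sees only the lead byte of the three-byte sequence, so the character is emitted only once the second chunk supplies the remaining bytes.&lt;br /&gt;
&lt;br /&gt;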
== Examples ==&lt;br /&gt;
&lt;br /&gt;
=== Example #1 - encoding strings ===&lt;br /&gt;
&lt;br /&gt;
The following example uses the API to encode an array of strings into an ArrayBuffer. The buffer contains the number of strings (as a Uint32), followed by the length in bytes of the first encoded string (as a Uint32), the encoded string data, the length of the second string (as a Uint32), the string data, and so on.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
function encodeArrayOfStrings(strings, encoding) {&lt;br /&gt;
  var encoder, encoded, len, i, bytes, view, offset;&lt;br /&gt;
&lt;br /&gt;
  // Create one encoder up front and reuse it for every string.&lt;br /&gt;
  encoder = new TextEncoder(encoding);&lt;br /&gt;
  encoded = [];&lt;br /&gt;
&lt;br /&gt;
  len = Uint32Array.BYTES_PER_ELEMENT;&lt;br /&gt;
  for (i = 0; i &amp;lt; strings.length; i += 1) {&lt;br /&gt;
    len += Uint32Array.BYTES_PER_ELEMENT;&lt;br /&gt;
    encoded[i] = encoder.encode(strings[i]);&lt;br /&gt;
    len += encoded[i].byteLength;&lt;br /&gt;
  }&lt;br /&gt;
&lt;br /&gt;
  bytes = new Uint8Array(len);&lt;br /&gt;
  view = new DataView(bytes.buffer);&lt;br /&gt;
  offset = 0;&lt;br /&gt;
&lt;br /&gt;
  view.setUint32(offset, strings.length);&lt;br /&gt;
  offset += Uint32Array.BYTES_PER_ELEMENT;&lt;br /&gt;
  for (i = 0; i &amp;lt; encoded.length; i += 1) {&lt;br /&gt;
    len = encoded[i].byteLength;&lt;br /&gt;
    view.setUint32(offset, len);&lt;br /&gt;
    offset += Uint32Array.BYTES_PER_ELEMENT;&lt;br /&gt;
    bytes.set(encoded[i], offset);&lt;br /&gt;
    offset += len;&lt;br /&gt;
  }&lt;br /&gt;
  return bytes.buffer;&lt;br /&gt;
}&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Example #2 - decoding strings ===&lt;br /&gt;
&lt;br /&gt;
The following example decodes an ArrayBuffer containing data encoded in the format produced by the previous example back into an array of strings.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
function decodeArrayOfStrings(buffer, encoding) {&lt;br /&gt;
  var decoder, view, offset, num_strings, strings, i, len;&lt;br /&gt;
&lt;br /&gt;
  decoder = new TextDecoder(encoding);&lt;br /&gt;
  view = new DataView(buffer);&lt;br /&gt;
  offset = 0;&lt;br /&gt;
  strings = [];&lt;br /&gt;
&lt;br /&gt;
  num_strings = view.getUint32(offset);&lt;br /&gt;
  offset += Uint32Array.BYTES_PER_ELEMENT;&lt;br /&gt;
  for (i = 0; i &amp;lt; num_strings; i += 1) {&lt;br /&gt;
    len = view.getUint32(offset);&lt;br /&gt;
    offset += Uint32Array.BYTES_PER_ELEMENT;&lt;br /&gt;
    strings[i] = decoder.decode(&lt;br /&gt;
      new DataView(view.buffer, offset, len));&lt;br /&gt;
    offset += len;&lt;br /&gt;
  }&lt;br /&gt;
  return strings;&lt;br /&gt;
}&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Encodings ==&lt;br /&gt;
&lt;br /&gt;
Encodings are defined and implemented per [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding]. This implicitly includes the steps to &amp;lt;em&amp;gt;get an encoding&amp;lt;/em&amp;gt; from a string, and the logic for label matching and case-insensitivity.&lt;br /&gt;
&lt;br /&gt;
User agents MUST NOT support any other encodings or labels than those defined in [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding], and MUST support all encodings and labels defined in that specification, with the additions defined below.&lt;br /&gt;
&lt;br /&gt;
:&#039;&#039;NOTE: In [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding], &amp;quot;ascii&amp;quot; is a label for &#039;&#039;&#039;windows-1252&#039;&#039;&#039;; there is no 7-bit-or-raise-exception encoding. Applications that are required to restrict the content of decoded strings should implement validation after decoding.&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
:&#039;&#039;NOTE: Unicode normalization forms are outside the scope of this specification. No normalization is done prior to encoding or after decoding.&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
:&#039;&#039;NOTE: Handling of encoding-specific issues, e.g. over-long UTF-8 encodings, byte order marks, unmatched surrogate pairs, and so on is defined by [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding].&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
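The first note above suggests validating after decoding. A minimal sketch of such a check follows, assuming an environment that provides &amp;lt;code&amp;gt;TextDecoder&amp;lt;/code&amp;gt;; the helper name &amp;lt;code&amp;gt;decodeStrictAscii&amp;lt;/code&amp;gt; is hypothetical, and decoding as UTF-8 before checking is just one possible choice.&lt;br /&gt;
&lt;br /&gt;
```javascript
// Hypothetical helper (not part of the proposal): decode bytes, then
// reject any result that contains characters outside 7-bit ASCII.
function decodeStrictAscii(bytes) {
  // Decode as UTF-8; the "ascii" label would select windows-1252 instead.
  var text = new TextDecoder("utf-8").decode(bytes);
  for (var i = 0; i !== text.length; i += 1) {
    if (text.charCodeAt(i) > 0x7F) {
      throw new Error("Non-ASCII character at index " + i);
    }
  }
  return text;
}

var ok = decodeStrictAscii(new Uint8Array([0x68, 0x69]));
var rejected = false;
try {
  // 0xC3 0xA9 is U+00E9 in UTF-8, which is outside the ASCII range.
  decodeStrictAscii(new Uint8Array([0xC3, 0xA9]));
} catch (e) {
  rejected = true;
}
```
&lt;br /&gt;
Malformed input is rejected as well, because the decoder replaces it with U+FFFD, which is above the 7-bit range.&lt;br /&gt;
&lt;br /&gt;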
== References ==&lt;br /&gt;
&lt;br /&gt;
* WebIDL http://dev.w3.org/2006/webapi/WebIDL&lt;br /&gt;
* Encoding http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== Acknowledgements ==&lt;br /&gt;
&lt;br /&gt;
* Alan Chaney&lt;br /&gt;
* Ben Noordhuis&lt;br /&gt;
* Glenn Maynard&lt;br /&gt;
* John Tamplin&lt;br /&gt;
* Kenneth Russell (Google, Inc)&lt;br /&gt;
* Robert Mustacchi&lt;br /&gt;
* Ryan Dahl&lt;br /&gt;
* Anne van Kesteren&lt;br /&gt;
* Cameron McCormack&lt;br /&gt;
&lt;br /&gt;
== Appendix ==&lt;br /&gt;
&lt;br /&gt;
A &amp;quot;shim&amp;quot; implementation in ECMAScript (that may not fully match the current version of the spec) plus some initial unit tests can be found at:&lt;br /&gt;
&lt;br /&gt;
:http://code.google.com/p/stringencoding/&lt;br /&gt;
&lt;br /&gt;
[[Category:Proposals]]&lt;/div&gt;</summary>
		<author><name>Jsbell</name></author>
	</entry>
	<entry>
		<id>https://wiki.whatwg.org/index.php?title=StringEncoding&amp;diff=8486</id>
		<title>StringEncoding</title>
		<link rel="alternate" type="text/html" href="https://wiki.whatwg.org/index.php?title=StringEncoding&amp;diff=8486"/>
		<updated>2012-09-13T23:01:22Z</updated>

		<summary type="html">&lt;p&gt;Jsbell: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;Proposed Text Encoding Web API for Typed Arrays&lt;br /&gt;
&lt;br /&gt;
== Editors ==&lt;br /&gt;
&lt;br /&gt;
* Joshua Bell (Google, Inc)&lt;br /&gt;
&lt;br /&gt;
== Abstract ==&lt;br /&gt;
&lt;br /&gt;
This specification defines an API for encoding strings to binary data, and decoding strings from binary data.&lt;br /&gt;
&lt;br /&gt;
NOTE: This specification intentionally does not address the opposite scenario of encoding binary data as strings and decoding binary data from strings, for example using Base64 encoding.&lt;br /&gt;
&lt;br /&gt;
Discussion on this topic has so far taken place on the [http://www.khronos.org/webgl/public-mailing-list/ public_webgl@khronos.org] mailing list. See http://www.khronos.org/webgl/public-mailing-list/archives/1111/msg00017.html for the initial discussion thread.&lt;br /&gt;
&lt;br /&gt;
Discussion has since moved to the [http://www.whatwg.org/mailing-list#specs WHATWG spec discussion] mailing list. See http://lists.whatwg.org/htdig.cgi/whatwg-whatwg.org/2012-March/035038.html for the latest discussion thread.&lt;br /&gt;
&lt;br /&gt;
== Open Issues ==&lt;br /&gt;
&lt;br /&gt;
* Should the &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; attribute return the &amp;lt;var&amp;gt;Name&amp;lt;/var&amp;gt; of the encoding or the name that was passed in?&lt;br /&gt;
&lt;br /&gt;
== Notes to Implementers ==&lt;br /&gt;
&lt;br /&gt;
* Streaming decode/encode requires retaining partial buffers between calls.&lt;br /&gt;
** Some encode/decode algorithms require adjusting the &amp;lt;var&amp;gt;code point pointer&amp;lt;/var&amp;gt; or &amp;lt;var&amp;gt;byte pointer&amp;lt;/var&amp;gt; by a negative amount, which could occur across &amp;quot;chunk&amp;quot; boundaries. This implies that when the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag is set on an encoder/decoder, the last N elements of the stream are saved for the next call and used as a prefix for the stream. N is defined by the specific encoding algorithm.&lt;br /&gt;
** This is not yet implemented in the [http://code.google.com/p/stringencoding/ ECMAScript shim]&lt;br /&gt;
&lt;br /&gt;
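To make the idea of retaining a partial buffer concrete, the following sketch computes how many trailing bytes of a chunk would need to be carried over to the next call. It covers UTF-8 only, where N is at most 3; the helper name &amp;lt;code&amp;gt;incompleteTailLength&amp;lt;/code&amp;gt; is hypothetical, and this is not how the shim or any particular implementation is known to work.&lt;br /&gt;
&lt;br /&gt;
```javascript
// Hypothetical helper (UTF-8 only): number of bytes at the end of a chunk
// that start a multi-byte sequence which is not yet complete.
function incompleteTailLength(bytes) {
  // Look back at most 3 bytes, since the longest sequence is 4 bytes.
  for (var back = 1; back !== 4; back += 1) {
    if (back > bytes.length) {
      break;
    }
    var b = bytes[bytes.length - back];
    if ((b >> 6) === 2) {
      continue; // continuation byte (10xxxxxx); keep looking for the lead byte
    }
    var needed = 1;                           // single-byte (ASCII) by default
    if ((b >> 5) === 6) { needed = 2; }       // lead byte 110xxxxx
    else if ((b >> 4) === 14) { needed = 3; } // lead byte 1110xxxx
    else if ((b >> 3) === 30) { needed = 4; } // lead byte 11110xxx
    // Carry the bytes over only if the sequence is still incomplete.
    return needed > back ? back : 0;
  }
  return 0;
}

// "A" followed by the first two bytes of U+20AC (0xE2 0x82 0xAC).
var chunk = new Uint8Array([0x41, 0xE2, 0x82]);
var keep = incompleteTailLength(chunk);
var ready = chunk.subarray(0, chunk.length - keep); // safe to decode now
var saved = chunk.subarray(chunk.length - keep);    // prefix for the next call
```
&lt;br /&gt;
Malformed tails, such as three continuation bytes with no lead byte, are deliberately not held back in this sketch, so that the decoder algorithm itself can report them.&lt;br /&gt;
&lt;br /&gt;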
== Resolved Issues ==&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dl&amp;gt;&lt;br /&gt;
&amp;lt;dt&amp;gt;Encode as many characters as possible into a fixed-size buffer for transmission, and repeat starting with next unencoded character&lt;br /&gt;
&amp;lt;dd&amp;gt;Resolution: not for &amp;quot;v1&amp;quot; - can be implemented using this API, and breaking the string down by &amp;quot;character&amp;quot; is unlikely to be as obvious as it sounds - surrogate pairs, combining sequences, etc&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dt&amp;gt;Support two versions of encode; one which takes target buffer, and one which creates/returns a right-sized buffer&lt;br /&gt;
&amp;lt;dd&amp;gt;Resolution: See above, and wait for developer feedback.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dt&amp;gt;Allow arbitrary end byte sequences (e.g. &amp;lt;code&amp;gt;0xFF&amp;lt;/code&amp;gt; for UTF-8 strings)&lt;br /&gt;
&amp;lt;dd&amp;gt;Resolution: Add &amp;lt;code&amp;gt;indexOf&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;ArrayBufferView&amp;lt;/code&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dt&amp;gt;Remove &#039;&#039;binary&#039;&#039; encoding? &#039;&#039;(a proposed 8-bit-clean encoding for interop with legacy binary data stored in ECMAScript strings)&#039;&#039;&lt;br /&gt;
&amp;lt;dd&amp;gt;The only real use is with &amp;lt;code&amp;gt;atob()&amp;lt;/code&amp;gt;/&amp;lt;code&amp;gt;btoa()&amp;lt;/code&amp;gt;; a better API would be Base64 directly in/out of Typed Arrays. Consensus on WHATWG is to add better APIs e.g. &lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
partial interface ArrayBufferView {&lt;br /&gt;
    DOMString toBase64();&lt;br /&gt;
};&lt;br /&gt;
&lt;br /&gt;
partial interface ArrayBuffer {&lt;br /&gt;
    static ArrayBuffer fromBase64(DOMString string);&lt;br /&gt;
};&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dt&amp;gt;Should this be a standalone API (as written), or live on e.g. DataView, or on e.g. String?&lt;br /&gt;
&amp;lt;dd&amp;gt;There seems to be pretty strong consensus that streaming, stateful coding is high priority and that an object-oriented API is cleanest, and shoe-horning those onto existing objects would be messy.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dt&amp;gt;Should legacy encodings be supported?&lt;br /&gt;
&amp;lt;dd&amp;gt;Resolution: Consensus on the [http://lists.whatwg.org/htdig.cgi/whatwg-whatwg.org/2012-August/036825.html WHATWG mailing list] - support legacy encodings for decode, and only &#039;utf-8&#039;, &#039;utf-16&#039; and &#039;utf-16be&#039; for encode.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dt&amp;gt;What to do on encoding errors? If non-UTF encodings are supported then we may want to allow substitution (e.g. ASCII &#039;?&#039;) or a script callback (for arbitrary escaping).&lt;br /&gt;
&amp;lt;dd&amp;gt;Resolution: not for &amp;quot;v1&amp;quot; (see above)&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dt&amp;gt;How are byte order marks handled? &lt;br /&gt;
&amp;lt;dd&amp;gt;BOM is respected if and only if the requested encoding is a case-insensitive match for &amp;quot;&amp;lt;code&amp;gt;utf-16&amp;lt;/code&amp;gt;&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/dl&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== API ==&lt;br /&gt;
&lt;br /&gt;
=== TextEncoder ===&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;WebIDL&#039;&#039;&#039;&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
dictionary TextEncodeOptions {&lt;br /&gt;
  boolean stream = false;&lt;br /&gt;
};&lt;br /&gt;
&lt;br /&gt;
[Constructor(optional DOMString encoding)]&lt;br /&gt;
interface TextEncoder {&lt;br /&gt;
  readonly attribute DOMString encoding;&lt;br /&gt;
  Uint8Array encode(DOMString? string, optional TextEncodeOptions options);&lt;br /&gt;
};&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The constructor runs the following steps:&lt;br /&gt;
# If the constructor is called with no arguments, let &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt; be &amp;quot;&amp;lt;code&amp;gt;utf-8&amp;lt;/code&amp;gt;&amp;quot;. Otherwise, let &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt; be the value of the &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; argument.&lt;br /&gt;
# Run the &#039;&#039;&#039;steps to get an encoding&#039;&#039;&#039; from [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding], with &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt; as &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt;.&lt;br /&gt;
#* If the steps result in failure, throw an &amp;quot;&amp;lt;code&amp;gt;EncodingError&amp;lt;/code&amp;gt;&amp;quot; exception and terminate these steps.&lt;br /&gt;
#* Otherwise, if the &amp;lt;var&amp;gt;Name&amp;lt;/var&amp;gt; of the returned encoding is not one of &amp;quot;&amp;lt;code&amp;gt;utf-8&amp;lt;/code&amp;gt;&amp;quot;, &amp;quot;&amp;lt;code&amp;gt;utf-16&amp;lt;/code&amp;gt;&amp;quot;, or &amp;quot;&amp;lt;code&amp;gt;utf-16be&amp;lt;/code&amp;gt;&amp;quot; throw an &amp;quot;&amp;lt;code&amp;gt;EncodingError&amp;lt;/code&amp;gt;&amp;quot; exception and terminate these steps.&lt;br /&gt;
#* Otherwise, set the encoder object&#039;s internal &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; property to the returned encoding.&lt;br /&gt;
# Initialize the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag of the encoder object to false. &lt;br /&gt;
# Initialize the internal &amp;lt;var&amp;gt;encoding algorithm state&amp;lt;/var&amp;gt; to the default values for the encoding &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dl&amp;gt;&lt;br /&gt;
&amp;lt;dt&amp;gt;&amp;lt;code&amp;gt;encoding&amp;lt;/code&amp;gt; of type DOMString, readonly&lt;br /&gt;
&amp;lt;dd&amp;gt;Returns the &amp;lt;var&amp;gt;Name&amp;lt;/var&amp;gt; of the encoder object&#039;s &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;, per [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding]. &lt;br /&gt;
&lt;br /&gt;
:Note that this may differ from the name of the encoding specified during the call to the constructor. For example, if the constructor is called with &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; of &amp;lt;code&amp;gt;&amp;quot;utf8&amp;quot;&amp;lt;/code&amp;gt; the &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; attribute of the &#039;&#039;encoder object&#039;&#039; would have the value &amp;lt;code&amp;gt;&amp;quot;utf-8&amp;quot;&amp;lt;/code&amp;gt; as &amp;lt;code&amp;gt;&amp;quot;utf8&amp;quot;&amp;lt;/code&amp;gt; is a label for that encoding. (A label such as &amp;lt;code&amp;gt;&amp;quot;ascii&amp;quot;&amp;lt;/code&amp;gt;, which maps to &amp;lt;code&amp;gt;&amp;quot;windows-1252&amp;quot;&amp;lt;/code&amp;gt;, causes the encoder constructor to throw per the steps above.)&lt;br /&gt;
&amp;lt;/dl&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dl&amp;gt;&lt;br /&gt;
&amp;lt;dt&amp;gt;&amp;lt;code&amp;gt;encode&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;dd&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The &amp;lt;code&amp;gt;encode&amp;lt;/code&amp;gt; method runs these steps:&lt;br /&gt;
&lt;br /&gt;
# If the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag is not set, then reset the &amp;lt;var&amp;gt;encoding algorithm state&amp;lt;/var&amp;gt; to the default values for &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;. Otherwise, the &amp;lt;var&amp;gt;encoding algorithm state&amp;lt;/var&amp;gt; is re-used from the previous call to &amp;lt;code&amp;gt;encode&amp;lt;/code&amp;gt; on this object.&lt;br /&gt;
# If the &amp;lt;var&amp;gt;options&amp;lt;/var&amp;gt; parameter is specified and the &amp;lt;code&amp;gt;stream&amp;lt;/code&amp;gt; option is &#039;&#039;&#039;true&#039;&#039;&#039;, then the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag is set; otherwise the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag is cleared.&lt;br /&gt;
# Run the steps of the &amp;lt;var&amp;gt;encoding algorithm&amp;lt;/var&amp;gt;:&lt;br /&gt;
#* The input to the algorithm is a &amp;lt;var&amp;gt;stream of code points&amp;lt;/var&amp;gt;. The stream is composed of the Unicode code points for the Unicode characters produced by following the [http://dev.w3.org/2006/webapi/WebIDL/#dfn-obtain-unicode steps to convert a DOMString to a sequence of Unicode characters] in [http://dev.w3.org/2006/webapi/WebIDL/ WebIDL] with &amp;lt;var&amp;gt;string&amp;lt;/var&amp;gt; as the input. If &amp;lt;var&amp;gt;string&amp;lt;/var&amp;gt; is null, the &amp;lt;var&amp;gt;stream of code points&amp;lt;/var&amp;gt; is empty.&lt;br /&gt;
#* If the &amp;lt;var&amp;gt;options&amp;lt;/var&amp;gt; parameter is not specified or the &amp;lt;code&amp;gt;stream&amp;lt;/code&amp;gt; option is false, then after the final code point is yielded by the stream, the &#039;&#039;&#039;EOF code point&#039;&#039;&#039; is yielded.&lt;br /&gt;
#* The output of the algorithm is a &amp;lt;var&amp;gt;sequence of emitted bytes&amp;lt;/var&amp;gt;.&lt;br /&gt;
# Return a &amp;lt;code&amp;gt;Uint8Array&amp;lt;/code&amp;gt; object wrapping an &amp;lt;code&amp;gt;ArrayBuffer&amp;lt;/code&amp;gt; containing the &amp;lt;var&amp;gt;sequence of emitted bytes&amp;lt;/var&amp;gt; produced by the encoding algorithm.&lt;br /&gt;
&lt;br /&gt;
:&#039;&#039;NOTE: Because only UTF encodings are supported, and because of the use of the [http://dev.w3.org/2006/webapi/WebIDL/#dfn-obtain-unicode steps to convert a DOMString to a sequence of Unicode characters] and thus Unicode code points, no input can cause the encoding process to emit an encoder error.&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/dl&amp;gt;&lt;br /&gt;
&lt;br /&gt;
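The following short sketch illustrates the encoder steps above for the default encoding. It assumes an environment that provides a &amp;lt;code&amp;gt;TextEncoder&amp;lt;/code&amp;gt; compatible with this proposal; the variable names are illustrative only.&lt;br /&gt;
&lt;br /&gt;
```javascript
// No argument: the label defaults to "utf-8".
var encoder = new TextEncoder();
var name = encoder.encoding; // "utf-8"

// "h" followed by U+20AC EURO SIGN, which takes three bytes in UTF-8.
var bytes = encoder.encode("h\u20AC");
var hex = Array.prototype.map.call(bytes, function (b) {
  return b.toString(16);
}).join(" ");

// An unpaired surrogate is converted to U+FFFD when the string is turned
// into Unicode characters, so encoding it does not raise an error.
var lone = encoder.encode("\uD800");
```
&lt;br /&gt;
The last call shows why the note above holds: the replacement happens before the encoding algorithm runs, so the algorithm only ever sees valid code points.&lt;br /&gt;
&lt;br /&gt;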
=== TextDecoder ===&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;WebIDL&#039;&#039;&#039;&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
dictionary TextDecoderOptions {&lt;br /&gt;
  boolean fatal = false;&lt;br /&gt;
};&lt;br /&gt;
&lt;br /&gt;
dictionary TextDecodeOptions {&lt;br /&gt;
  boolean stream = false;&lt;br /&gt;
};&lt;br /&gt;
&lt;br /&gt;
[Constructor(optional DOMString encoding, optional TextDecoderOptions options)]&lt;br /&gt;
interface TextDecoder {&lt;br /&gt;
  readonly attribute DOMString encoding;&lt;br /&gt;
  DOMString decode(optional ArrayBufferView view, optional TextDecodeOptions options);&lt;br /&gt;
};&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The constructor runs the following steps:&lt;br /&gt;
# The constructor creates a &#039;&#039;decoder object&#039;&#039;. It has the internal properties &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;, a &amp;lt;var&amp;gt;fatal&amp;lt;/var&amp;gt; flag which is initially unset,  a &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag which is initially unset, an &amp;lt;var&amp;gt;offset&amp;lt;/var&amp;gt; pointer which is initially &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt;, and a &amp;lt;var&amp;gt;useBOM&amp;lt;/var&amp;gt; flag which is initially unset.&lt;br /&gt;
# If called without an &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; argument, let &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt; be &amp;quot;&amp;lt;code&amp;gt;utf-8&amp;lt;/code&amp;gt;&amp;quot;. Otherwise, let &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt; be the value of the &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; argument.&lt;br /&gt;
# If &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt; is a case-insensitive match for &amp;quot;&amp;lt;code&amp;gt;utf-16&amp;lt;/code&amp;gt;&amp;quot; then set the internal &amp;lt;var&amp;gt;useBOM&amp;lt;/var&amp;gt; flag.&lt;br /&gt;
# Run the &#039;&#039;&#039;steps to get an encoding&#039;&#039;&#039; from [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding], with &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt; as &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt;. &lt;br /&gt;
## If the steps result in failure, throw an &amp;quot;&amp;lt;code&amp;gt;EncodingError&amp;lt;/code&amp;gt;&amp;quot; exception and terminate these steps.&lt;br /&gt;
## Otherwise, set the &#039;&#039;decoder object&#039;s&#039;&#039; internal &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; property to the returned encoding.&lt;br /&gt;
# If the constructor is called with an &amp;lt;var&amp;gt;options&amp;lt;/var&amp;gt; argument, and the &amp;lt;var&amp;gt;fatal&amp;lt;/var&amp;gt; property of the dictionary is set, set the internal &amp;lt;var&amp;gt;fatal&amp;lt;/var&amp;gt; flag of the &#039;&#039;decoder object&#039;&#039;.&lt;br /&gt;
# Initialize the internal &amp;lt;var&amp;gt;encoding algorithm state&amp;lt;/var&amp;gt; to the default values for the encoding &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;.&lt;br /&gt;
# Return the &#039;&#039;decoder object&#039;&#039;.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dl&amp;gt;&lt;br /&gt;
&amp;lt;dt&amp;gt;&amp;lt;code&amp;gt;encoding&amp;lt;/code&amp;gt; of type DOMString, readonly&lt;br /&gt;
&amp;lt;dd&amp;gt;&lt;br /&gt;
Returns the &amp;lt;var&amp;gt;Name&amp;lt;/var&amp;gt; of the decoder object&#039;s &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;, per [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding].&lt;br /&gt;
&lt;br /&gt;
:Note that this may differ from the name of the encoding specified during the call to the constructor. For example, if the constructor is called with &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; of &amp;lt;code&amp;gt;&amp;quot;ascii&amp;quot;&amp;lt;/code&amp;gt; the &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; attribute of the &#039;&#039;decoder object&#039;&#039; would have the value &amp;lt;code&amp;gt;&amp;quot;windows-1252&amp;quot;&amp;lt;/code&amp;gt; as &amp;lt;code&amp;gt;&amp;quot;ascii&amp;quot;&amp;lt;/code&amp;gt; is a label for that encoding.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dt&amp;gt;&amp;lt;code&amp;gt;decode&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;dd&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The &amp;lt;code&amp;gt;decode&amp;lt;/code&amp;gt; method runs the following steps:&lt;br /&gt;
&lt;br /&gt;
# If &amp;lt;var&amp;gt;view&amp;lt;/var&amp;gt; is not specified, let &amp;lt;var&amp;gt;view&amp;lt;/var&amp;gt; be a &amp;lt;code&amp;gt;Uint8Array&amp;lt;/code&amp;gt; of length &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt;.&lt;br /&gt;
# If the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag of the &#039;&#039;decoder object&#039;&#039; is not set, then reset the &amp;lt;var&amp;gt;encoding algorithm state&amp;lt;/var&amp;gt; to the default values for encoding &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; and set &amp;lt;var&amp;gt;offset&amp;lt;/var&amp;gt; to &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt;. Otherwise, the &amp;lt;var&amp;gt;encoding algorithm state&amp;lt;/var&amp;gt; is re-used from the previous call to &amp;lt;code&amp;gt;decode&amp;lt;/code&amp;gt; on this object.&lt;br /&gt;
# If the &amp;lt;var&amp;gt;options&amp;lt;/var&amp;gt; parameter is specified and the &amp;lt;var&amp;gt;stream&amp;lt;/var&amp;gt; option is &#039;&#039;&#039;true&#039;&#039;&#039;, then the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag is set. Otherwise the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag is cleared.&lt;br /&gt;
# Run or resume the decoder algorithm of the &#039;&#039;decoder object&#039;s&#039;&#039; encoding:&lt;br /&gt;
#* The input to the algorithm is a &amp;lt;var&amp;gt;byte stream&amp;lt;/var&amp;gt;.&lt;br /&gt;
#** If &amp;lt;var&amp;gt;offset&amp;lt;/var&amp;gt; is greater than &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt;, the bytes in &amp;lt;var&amp;gt;byte stream&amp;lt;/var&amp;gt; for positions less than &amp;lt;var&amp;gt;offset&amp;lt;/var&amp;gt; are provided by buffer(s) passed in to previous calls to &amp;lt;code&amp;gt;decode()&amp;lt;/code&amp;gt;.&lt;br /&gt;
#** The bytes in &amp;lt;var&amp;gt;byte stream&amp;lt;/var&amp;gt; for positions &amp;lt;var&amp;gt;offset&amp;lt;/var&amp;gt; through &amp;lt;code&amp;gt;&amp;lt;var&amp;gt;offset&amp;lt;/var&amp;gt; + &amp;lt;var&amp;gt;view.byteLength&amp;lt;/var&amp;gt; - 1&amp;lt;/code&amp;gt; are provided by the bytes in &amp;lt;var&amp;gt;view.buffer&amp;lt;/var&amp;gt; starting at offset &amp;lt;var&amp;gt;view.byteOffset&amp;lt;/var&amp;gt;.&lt;br /&gt;
#** When accessing the byte in &amp;lt;var&amp;gt;byte stream&amp;lt;/var&amp;gt; at position &amp;lt;code&amp;gt;&amp;lt;var&amp;gt;offset&amp;lt;/var&amp;gt; + &amp;lt;var&amp;gt;view.byteLength&amp;lt;/var&amp;gt;&amp;lt;/code&amp;gt;:&lt;br /&gt;
#*** If the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag is not set, then yield the &#039;&#039;&#039;EOF byte&#039;&#039;&#039;.&lt;br /&gt;
#*** Otherwise, set &amp;lt;var&amp;gt;offset&amp;lt;/var&amp;gt; to &amp;lt;code&amp;gt;&amp;lt;var&amp;gt;offset&amp;lt;/var&amp;gt; + &amp;lt;var&amp;gt;view.byteLength&amp;lt;/var&amp;gt;&amp;lt;/code&amp;gt; and suspend the steps of the decoder algorithm until a subsequent call to &amp;lt;code&amp;gt;decode()&amp;lt;/code&amp;gt;&lt;br /&gt;
#* If the internal &amp;lt;var&amp;gt;useBOM&amp;lt;/var&amp;gt; flag is set and &amp;lt;var&amp;gt;offset&amp;lt;/var&amp;gt; is &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; then:&lt;br /&gt;
#** If fewer than two bytes are present in the stream and the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag is set, suspend the decoding algorithm and return the empty string.&lt;br /&gt;
#** If the next two bytes of the stream are &#039;&#039;&#039;0xFF 0xFE&#039;&#039;&#039; then set &amp;lt;var&amp;gt;offset&amp;lt;/var&amp;gt; to &amp;lt;code&amp;gt;2&amp;lt;/code&amp;gt;, and clear the &amp;lt;var&amp;gt;utf-16be&amp;lt;/var&amp;gt; flag of the decoder state.&lt;br /&gt;
#** If the next two bytes of the stream are &#039;&#039;&#039;0xFE 0xFF&#039;&#039;&#039; then set &amp;lt;var&amp;gt;offset&amp;lt;/var&amp;gt; to &amp;lt;code&amp;gt;2&amp;lt;/code&amp;gt;, and set the &amp;lt;var&amp;gt;utf-16be&amp;lt;/var&amp;gt; flag of the decoder state.&lt;br /&gt;
#* If the internal &amp;lt;var&amp;gt;fatal&amp;lt;/var&amp;gt; flag of the &#039;&#039;decoder object&#039;&#039; is set, then a &#039;&#039;&#039;decoder error&#039;&#039;&#039; causes a &amp;lt;code&amp;gt;DOMException&amp;lt;/code&amp;gt; of type &amp;lt;code&amp;gt;EncodingError&amp;lt;/code&amp;gt; to be thrown rather than emitting a fallback code point.&lt;br /&gt;
#* The output of the algorithm is a &amp;lt;var&amp;gt;sequence of emitted code points&amp;lt;/var&amp;gt;.&lt;br /&gt;
# Return an IDL &amp;lt;code&amp;gt;DOMString&amp;lt;/code&amp;gt; value that represents the sequence of code units resulting from encoding the &amp;lt;var&amp;gt;sequence of emitted code points&amp;lt;/var&amp;gt; as UTF-16.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/dl&amp;gt;&lt;br /&gt;
&lt;br /&gt;
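The effect of the &amp;lt;var&amp;gt;fatal&amp;lt;/var&amp;gt; flag in the steps above can be sketched as follows. This assumes an environment that provides a &amp;lt;code&amp;gt;TextDecoder&amp;lt;/code&amp;gt; compatible with this proposal; the type of the thrown exception may differ from the &amp;lt;code&amp;gt;EncodingError&amp;lt;/code&amp;gt; described in this draft, so the sketch only checks that something is thrown.&lt;br /&gt;
&lt;br /&gt;
```javascript
// The byte 0xFF can never appear in well-formed UTF-8.
var invalid = new Uint8Array([0x41, 0xFF, 0x42]);

// fatal flag unset (the default): the decoder error is replaced by the
// fallback code point U+FFFD and decoding continues.
var lenient = new TextDecoder("utf-8").decode(invalid);

// fatal flag set: the decoder error causes an exception instead.
var threw = false;
try {
  new TextDecoder("utf-8", { fatal: true }).decode(invalid);
} catch (e) {
  threw = true; // exception type is implementation-dependent in practice
}
```
&lt;br /&gt;
A caller that needs to distinguish damaged input from valid input would set the flag; one that prefers best-effort output would leave it unset.&lt;br /&gt;
&lt;br /&gt;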
== Examples ==&lt;br /&gt;
&lt;br /&gt;
=== Example #1 - encoding strings ===&lt;br /&gt;
&lt;br /&gt;
The following example uses the API to encode an array of strings into an ArrayBuffer. The buffer contains the number of strings (as a Uint32), followed by the length in bytes of the first encoded string (as a Uint32), the encoded string data, the length of the second string (as a Uint32), the string data, and so on.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
function encodeArrayOfStrings(strings, encoding) {&lt;br /&gt;
  var encoder, encoded, len, i, bytes, view, offset;&lt;br /&gt;
&lt;br /&gt;
  // Create one encoder up front and reuse it for every string.&lt;br /&gt;
  encoder = new TextEncoder(encoding);&lt;br /&gt;
  encoded = [];&lt;br /&gt;
&lt;br /&gt;
  len = Uint32Array.BYTES_PER_ELEMENT;&lt;br /&gt;
  for (i = 0; i &amp;lt; strings.length; i += 1) {&lt;br /&gt;
    len += Uint32Array.BYTES_PER_ELEMENT;&lt;br /&gt;
    encoded[i] = encoder.encode(strings[i]);&lt;br /&gt;
    len += encoded[i].byteLength;&lt;br /&gt;
  }&lt;br /&gt;
&lt;br /&gt;
  bytes = new Uint8Array(len);&lt;br /&gt;
  view = new DataView(bytes.buffer);&lt;br /&gt;
  offset = 0;&lt;br /&gt;
&lt;br /&gt;
  view.setUint32(offset, strings.length);&lt;br /&gt;
  offset += Uint32Array.BYTES_PER_ELEMENT;&lt;br /&gt;
  for (i = 0; i &amp;lt; encoded.length; i += 1) {&lt;br /&gt;
    len = encoded[i].byteLength;&lt;br /&gt;
    view.setUint32(offset, len);&lt;br /&gt;
    offset += Uint32Array.BYTES_PER_ELEMENT;&lt;br /&gt;
    bytes.set(encoded[i], offset);&lt;br /&gt;
    offset += len;&lt;br /&gt;
  }&lt;br /&gt;
  return bytes.buffer;&lt;br /&gt;
}&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Example #2 - decoding strings ===&lt;br /&gt;
&lt;br /&gt;
The following example decodes an ArrayBuffer containing data encoded in the format produced by the previous example back into an array of strings.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
function decodeArrayOfStrings(buffer, encoding) {&lt;br /&gt;
  var decoder, view, offset, num_strings, strings, i, len;&lt;br /&gt;
&lt;br /&gt;
  decoder = new TextDecoder(encoding);&lt;br /&gt;
  view = new DataView(buffer);&lt;br /&gt;
  offset = 0;&lt;br /&gt;
  strings = [];&lt;br /&gt;
&lt;br /&gt;
  num_strings = view.getUint32(offset);&lt;br /&gt;
  offset += Uint32Array.BYTES_PER_ELEMENT;&lt;br /&gt;
  for (i = 0; i &amp;lt; num_strings; i += 1) {&lt;br /&gt;
    len = view.getUint32(offset);&lt;br /&gt;
    offset += Uint32Array.BYTES_PER_ELEMENT;&lt;br /&gt;
    strings[i] = decoder.decode(&lt;br /&gt;
      new DataView(view.buffer, offset, len));&lt;br /&gt;
    offset += len;&lt;br /&gt;
  }&lt;br /&gt;
  return strings;&lt;br /&gt;
}&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
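&lt;br /&gt;
To make the byte layout used by the two examples concrete, here is a compact, self-contained sketch that packs two short strings and prints the result as hex. It assumes the UTF-8-only &amp;lt;code&amp;gt;TextEncoder&amp;lt;/code&amp;gt; of current browsers and Node.js, which may differ from this draft; &amp;lt;code&amp;gt;packStrings&amp;lt;/code&amp;gt; is a hypothetical name used only for illustration.&lt;br /&gt;
&lt;br /&gt;
```javascript
// Assumption: global TextEncoder (UTF-8 only) as in current browsers and Node.js.
function packStrings(strings) {
  const encoder = new TextEncoder();
  const chunks = strings.map((s) => encoder.encode(s));
  // 4 bytes for the count, plus 4 bytes of length and the data for each string.
  const total = chunks.reduce((n, c) => n + 4 + c.byteLength, 4);
  const bytes = new Uint8Array(total);
  const view = new DataView(bytes.buffer);
  let offset = 0;
  view.setUint32(offset, chunks.length); // DataView writes big-endian by default
  offset += 4;
  for (const chunk of chunks) {
    view.setUint32(offset, chunk.byteLength);
    offset += 4;
    bytes.set(chunk, offset);
    offset += chunk.byteLength;
  }
  return bytes;
}

const hex = Array.from(packStrings(["a", "bc"]), (b) =>
  b.toString(16).padStart(2, "0")
).join(" ");
console.log(hex);
// 00 00 00 02 00 00 00 01 61 00 00 00 02 62 63
```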
&lt;br /&gt;
== Encodings ==&lt;br /&gt;
&lt;br /&gt;
Encodings are defined and implemented per [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding]. This implicitly includes the steps to &amp;lt;em&amp;gt;get an encoding&amp;lt;/em&amp;gt; from a string, and the logic for label matching and case-insensitivity.&lt;br /&gt;
&lt;br /&gt;
User agents MUST NOT support any other encodings or labels than those defined in [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding], and MUST support all encodings and labels defined in that specification, with the additions defined below.&lt;br /&gt;
&lt;br /&gt;
:&#039;&#039;NOTE: In [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding], &amp;quot;ascii&amp;quot; is a label for &#039;&#039;&#039;windows-1252&#039;&#039;&#039;; there is no 7-bit-or-raise-exception encoding. Applications that are required to restrict the content of decoded strings should implement validation after decoding.&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
:&#039;&#039;NOTE: Unicode normalization forms are outside the scope of this specification. No normalization is done prior to encoding or after decoding.&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
:&#039;&#039;NOTE: Handling of encoding-specific issues, e.g. over-long UTF-8 encodings, byte order marks, unmatched surrogate pairs, and so on is defined by [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding].&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
== References ==&lt;br /&gt;
&lt;br /&gt;
* WebIDL http://dev.w3.org/2006/webapi/WebIDL&lt;br /&gt;
* Encoding http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== Acknowledgements ==&lt;br /&gt;
&lt;br /&gt;
* Alan Chaney&lt;br /&gt;
* Ben Noordhuis&lt;br /&gt;
* Glenn Maynard&lt;br /&gt;
* John Tamplin&lt;br /&gt;
* Kenneth Russell (Google, Inc)&lt;br /&gt;
* Robert Mustacchi&lt;br /&gt;
* Ryan Dahl&lt;br /&gt;
* Anne van Kesteren&lt;br /&gt;
* Cameron McCormack&lt;br /&gt;
&lt;br /&gt;
== Appendix ==&lt;br /&gt;
&lt;br /&gt;
A &amp;quot;shim&amp;quot; implementation in ECMAScript (that may not fully match the current version of the spec) plus some initial unit tests can be found at:&lt;br /&gt;
&lt;br /&gt;
:http://code.google.com/p/stringencoding/&lt;br /&gt;
&lt;br /&gt;
[[Category:Proposals]]&lt;/div&gt;</summary>
		<author><name>Jsbell</name></author>
	</entry>
	<entry>
		<id>https://wiki.whatwg.org/index.php?title=StringEncoding&amp;diff=8485</id>
		<title>StringEncoding</title>
		<link rel="alternate" type="text/html" href="https://wiki.whatwg.org/index.php?title=StringEncoding&amp;diff=8485"/>
		<updated>2012-09-13T22:58:49Z</updated>

		<summary type="html">&lt;p&gt;Jsbell: Incorporate BOM decoding&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;Proposed Text Encoding Web API for Typed Arrays&lt;br /&gt;
&lt;br /&gt;
== Editors ==&lt;br /&gt;
&lt;br /&gt;
* Joshua Bell (Google, Inc)&lt;br /&gt;
&lt;br /&gt;
== Abstract ==&lt;br /&gt;
&lt;br /&gt;
This specification defines an API for encoding strings to binary data, and decoding strings from binary data.&lt;br /&gt;
&lt;br /&gt;
NOTE: This specification intentionally does not address the opposite scenario of encoding binary data as strings and decoding binary data from strings, for example using Base64 encoding.&lt;br /&gt;
&lt;br /&gt;
Discussion on this topic has so far taken place on the [http://www.khronos.org/webgl/public-mailing-list/ public_webgl@khronos.org] mailing list. See http://www.khronos.org/webgl/public-mailing-list/archives/1111/msg00017.html for the initial discussion thread.&lt;br /&gt;
&lt;br /&gt;
Discussion has since moved to the [http://www.whatwg.org/mailing-list#specs WHATWG spec discussion] mailing list. See  http://lists.whatwg.org/htdig.cgi/whatwg-whatwg.org/2012-March/035038.html for the latest discussion thread.&lt;br /&gt;
&lt;br /&gt;
== Open Issues ==&lt;br /&gt;
&lt;br /&gt;
* The [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding specification] defines the byte order mark as more authoritative than an explicit encoding label when decoding content. &lt;br /&gt;
** Is that desirable here? &lt;br /&gt;
** What does it mean to e.g. write &amp;lt;code&amp;gt;TextDecoder(&#039;iso-2022-kr&#039;).decode(str)&amp;lt;/code&amp;gt; if the input turns out to be UTF-8 data?&lt;br /&gt;
** What about streaming and other re-uses of the same decoder object?&lt;br /&gt;
** At the very least, a matching BOM should be ignored (i.e. not returned as part of the output) and a mismatching BOM should signal an error.&lt;br /&gt;
&lt;br /&gt;
* Should the &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; attribute return the &amp;lt;var&amp;gt;Name&amp;lt;/var&amp;gt; of the encoding or the name that was passed in?&lt;br /&gt;
&lt;br /&gt;
== Notes to Implementers ==&lt;br /&gt;
&lt;br /&gt;
* Streaming decode/encode requires retaining partial buffers between calls.&lt;br /&gt;
** Some encode/decode algorithms require adjusting the &amp;lt;var&amp;gt;code point pointer&amp;lt;/var&amp;gt; or &amp;lt;var&amp;gt;byte pointer&amp;lt;/var&amp;gt; by a negative amount. This could occur across &amp;quot;chunk&amp;quot; boundaries. This implies that when the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag is set on an encoder/decoder, the last N elements of the stream are saved for the next call and used as a prefix for the stream. N is defined by the specific encoding algorithm.&lt;br /&gt;
** This is not yet implemented in the [http://code.google.com/p/stringencoding/ ECMAScript shim]&lt;br /&gt;
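&lt;br /&gt;
The chunk-boundary case in the notes above can be observed with a small sketch. It assumes the &amp;lt;code&amp;gt;TextDecoder&amp;lt;/code&amp;gt; of current browsers and Node.js, which may differ from this draft: a three-byte UTF-8 sequence is split across two calls, and the decoder has to hold the first two bytes until the third arrives.&lt;br /&gt;
&lt;br /&gt;
```javascript
// Assumption: global TextDecoder as in current browsers and Node.js.
const decoder = new TextDecoder("utf-8");

// U+20AC (euro sign) is E2 82 AC in UTF-8; split it across two chunks.
const first = decoder.decode(new Uint8Array([0xe2, 0x82]), { stream: true });
const second = decoder.decode(new Uint8Array([0xac]), { stream: true });
const tail = decoder.decode(); // no stream option: flushes and ends the stream

// first is empty because the sequence is still incomplete; second completes it.
console.log(JSON.stringify([first, second, tail]));
```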
&lt;br /&gt;
== Resolved Issues ==&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dl&amp;gt;&lt;br /&gt;
&amp;lt;dt&amp;gt;Encode as many characters as possible into a fixed-size buffer for transmission, and repeat starting with next unencoded character&lt;br /&gt;
&amp;lt;dd&amp;gt;Resolution: not for &amp;quot;v1&amp;quot; - can be implemented using this API, and breaking the string down by &amp;quot;character&amp;quot; is unlikely to be as obvious as it sounds - surrogate pairs, combining sequences, etc&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dt&amp;gt;Support two versions of encode; one which takes target buffer, and one which creates/returns a right-sized buffer&lt;br /&gt;
&amp;lt;dd&amp;gt;Resolution: See above, and wait for developer feedback.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dt&amp;gt;Allow arbitrary end byte sequences (e.g. &amp;lt;code&amp;gt;0xFF&amp;lt;/code&amp;gt; for UTF-8 strings)&lt;br /&gt;
&amp;lt;dd&amp;gt;Resolution: Add &amp;lt;code&amp;gt;indexOf&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;ArrayBufferView&amp;lt;/code&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dt&amp;gt;Remove &#039;&#039;binary&#039;&#039; encoding? &#039;&#039;(a proposed 8-bit-clean encoding for interop with legacy binary data stored in ECMAScript strings)&#039;&#039;&lt;br /&gt;
&amp;lt;dd&amp;gt;The only real use is with &amp;lt;code&amp;gt;atob()&amp;lt;/code&amp;gt;/&amp;lt;code&amp;gt;btoa()&amp;lt;/code&amp;gt;; a better API would be Base64 directly in/out of Typed Arrays. Consensus on WHATWG is to add better APIs e.g. &lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
partial interface ArrayBufferView {&lt;br /&gt;
    DOMString toBase64();&lt;br /&gt;
};&lt;br /&gt;
&lt;br /&gt;
partial interface ArrayBuffer {&lt;br /&gt;
    static ArrayBuffer fromBase64(DOMString string);&lt;br /&gt;
};&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dt&amp;gt;Should this be a standalone API (as written), or live on e.g. DataView, or on e.g. String?&lt;br /&gt;
&amp;lt;dd&amp;gt;There seems to be pretty strong consensus that streaming, stateful coding is high priority and that an object-oriented API is cleanest, and shoe-horning those onto existing objects would be messy.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dt&amp;gt;Should legacy encodings be supported?&lt;br /&gt;
&amp;lt;dd&amp;gt;Resolution: Consensus on the [http://lists.whatwg.org/htdig.cgi/whatwg-whatwg.org/2012-August/036825.html WHATWG mailing list] - support legacy encodings for decode, and only &#039;utf-8&#039;, &#039;utf-16&#039; and &#039;utf-16be&#039; for encode.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dt&amp;gt;What to do on encoding errors? If non-UTF encodings are supported then we may want to allow substitution (e.g. ASCII &#039;?&#039;) or a script callback (for arbitrary escaping).&lt;br /&gt;
&amp;lt;dd&amp;gt;Resolution: not for &amp;quot;v1&amp;quot; (see above)&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/dl&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== API ==&lt;br /&gt;
&lt;br /&gt;
=== TextEncoder ===&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;WebIDL&#039;&#039;&#039;&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
dictionary TextEncodeOptions {&lt;br /&gt;
  boolean stream = false;&lt;br /&gt;
};&lt;br /&gt;
&lt;br /&gt;
[Constructor(optional DOMString encoding)]&lt;br /&gt;
interface TextEncoder {&lt;br /&gt;
  readonly attribute DOMString encoding;&lt;br /&gt;
  Uint8Array encode(DOMString? string, optional TextEncodeOptions options);&lt;br /&gt;
};&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The constructor runs the following steps:&lt;br /&gt;
# If the constructor is called with no arguments, let &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt; be &amp;quot;&amp;lt;code&amp;gt;utf-8&amp;lt;/code&amp;gt;&amp;quot;. Otherwise, let &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt; be the value of the &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; argument.&lt;br /&gt;
# Run the &#039;&#039;&#039;steps to get an encoding&#039;&#039;&#039; from [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding], with &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt; as &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt;.&lt;br /&gt;
#* If the steps result in failure, throw an &amp;quot;&amp;lt;code&amp;gt;EncodingError&amp;lt;/code&amp;gt;&amp;quot; exception and terminate these steps.&lt;br /&gt;
#* Otherwise, if the &amp;lt;var&amp;gt;Name&amp;lt;/var&amp;gt; of the returned encoding is not one of &amp;quot;&amp;lt;code&amp;gt;utf-8&amp;lt;/code&amp;gt;&amp;quot;, &amp;quot;&amp;lt;code&amp;gt;utf-16&amp;lt;/code&amp;gt;&amp;quot;, or &amp;quot;&amp;lt;code&amp;gt;utf-16be&amp;lt;/code&amp;gt;&amp;quot; throw an &amp;quot;&amp;lt;code&amp;gt;EncodingError&amp;lt;/code&amp;gt;&amp;quot; exception and terminate these steps.&lt;br /&gt;
#* Otherwise, set the encoder object&#039;s internal &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; property to the returned encoding.&lt;br /&gt;
# Initialize the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag of the encoder object to false. &lt;br /&gt;
# Initialize the internal &amp;lt;var&amp;gt;encoding algorithm state&amp;lt;/var&amp;gt; to the default values for the encoding &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;.&lt;br /&gt;
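&lt;br /&gt;
The label handling in the constructor steps above can be sketched as follows. This is only an illustration: the label table is a hypothetical, heavily abridged stand-in for the one in the Encoding specification, &amp;lt;code&amp;gt;resolveEncoderLabel&amp;lt;/code&amp;gt; is an invented name, and a plain &amp;lt;code&amp;gt;Error&amp;lt;/code&amp;gt; stands in for the &amp;lt;code&amp;gt;EncodingError&amp;lt;/code&amp;gt; exception.&lt;br /&gt;
&lt;br /&gt;
```javascript
// Hypothetical, abridged label table; the real one lives in the Encoding specification.
const LABELS = new Map([
  ["utf-8", "utf-8"],
  ["utf8", "utf-8"],
  ["utf-16", "utf-16"],
  ["utf-16be", "utf-16be"],
  ["ascii", "windows-1252"],
  ["windows-1252", "windows-1252"],
]);
const ENCODE_ALLOWED = new Set(["utf-8", "utf-16", "utf-16be"]);

// Returns the encoding Name for an encoder, or throws.
// trim() and toLowerCase() only approximate the label matching rules of the specification.
function resolveEncoderLabel(label = "utf-8") {
  const name = LABELS.get(String(label).trim().toLowerCase());
  if (name === undefined) {
    throw new Error("EncodingError: unknown label " + label);
  }
  if (!ENCODE_ALLOWED.has(name)) {
    throw new Error("EncodingError: " + name + " is not supported for encoding");
  }
  return name;
}

console.log(resolveEncoderLabel(" UTF8 ")); // utf-8
```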
&lt;br /&gt;
&amp;lt;dl&amp;gt;&lt;br /&gt;
&amp;lt;dt&amp;gt;&amp;lt;code&amp;gt;encoding&amp;lt;/code&amp;gt; of type DOMString, readonly&lt;br /&gt;
&amp;lt;dd&amp;gt;Returns the &amp;lt;var&amp;gt;Name&amp;lt;/var&amp;gt; of the encoder object&#039;s &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;, per [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding]. &lt;br /&gt;
&lt;br /&gt;
:Note that this may differ from the name of the encoding specified during the call to the constructor. For example, if the constructor is called with &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; of &amp;lt;code&amp;gt;&amp;quot;ascii&amp;quot;&amp;lt;/code&amp;gt; the &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; attribute of the &#039;&#039;encoder object&#039;&#039; would have the value &amp;lt;code&amp;gt;&amp;quot;windows-1252&amp;quot;&amp;lt;/code&amp;gt; as &amp;lt;code&amp;gt;&amp;quot;ascii&amp;quot;&amp;lt;/code&amp;gt; is a label for that encoding.&lt;br /&gt;
&amp;lt;/dl&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dl&amp;gt;&lt;br /&gt;
&amp;lt;dt&amp;gt;&amp;lt;code&amp;gt;encode&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;dd&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The &amp;lt;code&amp;gt;encode&amp;lt;/code&amp;gt; method runs these steps:&lt;br /&gt;
&lt;br /&gt;
# If the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag is not set, then reset the &amp;lt;var&amp;gt;encoding algorithm state&amp;lt;/var&amp;gt; to the default values for &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;. Otherwise, the &amp;lt;var&amp;gt;encoding algorithm state&amp;lt;/var&amp;gt; is re-used from the previous call to &amp;lt;code&amp;gt;encode&amp;lt;/code&amp;gt; on this object.&lt;br /&gt;
# If the &amp;lt;var&amp;gt;options&amp;lt;/var&amp;gt; parameter is specified and the &amp;lt;code&amp;gt;stream&amp;lt;/code&amp;gt; option is &#039;&#039;&#039;true&#039;&#039;&#039;, then the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag is set; otherwise the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag is cleared.&lt;br /&gt;
# Run the steps of the &amp;lt;var&amp;gt;encoding algorithm&amp;lt;/var&amp;gt;:&lt;br /&gt;
#* The input to the algorithm is a &amp;lt;var&amp;gt;stream of code points&amp;lt;/var&amp;gt;. The stream is composed of the Unicode code points for the Unicode characters produced by following the [http://dev.w3.org/2006/webapi/WebIDL/#dfn-obtain-unicode steps to convert a DOMString to a sequence of Unicode characters] in [http://dev.w3.org/2006/webapi/WebIDL/ WebIDL] with &amp;lt;var&amp;gt;string&amp;lt;/var&amp;gt; as the input. If &amp;lt;var&amp;gt;string&amp;lt;/var&amp;gt; is null, the &amp;lt;var&amp;gt;stream of code points&amp;lt;/var&amp;gt; is empty.&lt;br /&gt;
#* If the &amp;lt;var&amp;gt;options&amp;lt;/var&amp;gt; parameter is not specified or the &amp;lt;code&amp;gt;stream&amp;lt;/code&amp;gt; option is false, then after the final code point is yielded by the stream, the &#039;&#039;&#039;EOF code point&#039;&#039;&#039; is yielded.&lt;br /&gt;
#* The output of the algorithm is a &amp;lt;var&amp;gt;sequence of emitted bytes&amp;lt;/var&amp;gt;.&lt;br /&gt;
# Return a &amp;lt;code&amp;gt;Uint8Array&amp;lt;/code&amp;gt; object wrapping an &amp;lt;code&amp;gt;ArrayBuffer&amp;lt;/code&amp;gt; containing the &amp;lt;var&amp;gt;sequence of emitted bytes&amp;lt;/var&amp;gt; produced by the encoding algorithm.&lt;br /&gt;
&lt;br /&gt;
:&#039;&#039;NOTE: Because only UTF encodings are supported, and because of the use of the [http://dev.w3.org/2006/webapi/WebIDL/#dfn-obtain-unicode steps to convert a DOMString to a sequence of Unicode characters] and thus Unicode code points, no input can cause the encoding process to emit an encoder error.&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/dl&amp;gt;&lt;br /&gt;
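&lt;br /&gt;
The note above, that no input can produce an encoder error, can be checked with a short sketch. It assumes the UTF-8-only &amp;lt;code&amp;gt;TextEncoder&amp;lt;/code&amp;gt; of current browsers and Node.js, which may differ from this draft: an unpaired surrogate is replaced by U+FFFD during the conversion to Unicode characters, so it encodes to the same bytes as U+FFFD itself.&lt;br /&gt;
&lt;br /&gt;
```javascript
// Assumption: global TextEncoder (UTF-8 only) as in current browsers and Node.js.
const encoder = new TextEncoder();

// "\uD800" is an unpaired surrogate; it is replaced with U+FFFD before encoding.
const lone = Array.from(encoder.encode("a\uD800"));
const replacement = Array.from(encoder.encode("a\uFFFD"));

console.log(lone, replacement); // both are [ 97, 239, 191, 189 ]
```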
&lt;br /&gt;
=== TextDecoder ===&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;WebIDL&#039;&#039;&#039;&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
dictionary TextDecoderOptions {&lt;br /&gt;
  boolean fatal = false;&lt;br /&gt;
};&lt;br /&gt;
&lt;br /&gt;
dictionary TextDecodeOptions {&lt;br /&gt;
  boolean stream = false;&lt;br /&gt;
};&lt;br /&gt;
&lt;br /&gt;
[Constructor(optional DOMString encoding, optional TextDecoderOptions options)]&lt;br /&gt;
interface TextDecoder {&lt;br /&gt;
  readonly attribute DOMString encoding;&lt;br /&gt;
  DOMString decode(optional ArrayBufferView view, optional TextDecodeOptions options);&lt;br /&gt;
};&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The constructor runs the following steps:&lt;br /&gt;
# The constructor creates a &#039;&#039;decoder object&#039;&#039;. It has the internal properties &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;, a &amp;lt;var&amp;gt;fatal&amp;lt;/var&amp;gt; flag which is initially unset,  a &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag which is initially unset, an &amp;lt;var&amp;gt;offset&amp;lt;/var&amp;gt; pointer which is initially &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt;, and a &amp;lt;var&amp;gt;useBOM&amp;lt;/var&amp;gt; flag which is initially unset.&lt;br /&gt;
# If called without an &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; argument, let &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt; be &amp;quot;&amp;lt;code&amp;gt;utf-8&amp;lt;/code&amp;gt;&amp;quot;. Otherwise, let &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt; be the value of the &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; argument.&lt;br /&gt;
# If &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt; is a case-insensitive match for &amp;quot;&amp;lt;code&amp;gt;utf-16&amp;lt;/code&amp;gt;&amp;quot; then set the internal &amp;lt;var&amp;gt;useBOM&amp;lt;/var&amp;gt; flag.&lt;br /&gt;
# Run the &#039;&#039;&#039;steps to get an encoding&#039;&#039;&#039; from [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding], with &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt; as &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt;. &lt;br /&gt;
## If the steps result in failure, throw an &amp;quot;&amp;lt;code&amp;gt;EncodingError&amp;lt;/code&amp;gt;&amp;quot; exception and terminate these steps.&lt;br /&gt;
## Otherwise, set the &#039;&#039;decoder object&#039;s&#039;&#039; internal &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; property to the returned encoding.&lt;br /&gt;
# If the constructor is called with an &amp;lt;var&amp;gt;options&amp;lt;/var&amp;gt; argument, and the &amp;lt;var&amp;gt;fatal&amp;lt;/var&amp;gt; property of the dictionary is set, set the internal &amp;lt;var&amp;gt;fatal&amp;lt;/var&amp;gt; flag of the &#039;&#039;decoder object&#039;&#039;.&lt;br /&gt;
# Initialize the internal &amp;lt;var&amp;gt;encoding algorithm state&amp;lt;/var&amp;gt; to the default values for the encoding &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;.&lt;br /&gt;
# Return the &#039;&#039;decoder object&#039;&#039;.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dl&amp;gt;&lt;br /&gt;
&amp;lt;dt&amp;gt;&amp;lt;code&amp;gt;encoding&amp;lt;/code&amp;gt; of type DOMString, readonly&lt;br /&gt;
&amp;lt;dd&amp;gt;&lt;br /&gt;
Returns the &amp;lt;var&amp;gt;Name&amp;lt;/var&amp;gt; of the decoder object&#039;s &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;, per [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding].&lt;br /&gt;
&lt;br /&gt;
:Note that this may differ from the name of the encoding specified during the call to the constructor. For example, if the constructor is called with &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; of &amp;lt;code&amp;gt;&amp;quot;ascii&amp;quot;&amp;lt;/code&amp;gt; the &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; attribute of the &#039;&#039;decoder object&#039;&#039; would have the value &amp;lt;code&amp;gt;&amp;quot;windows-1252&amp;quot;&amp;lt;/code&amp;gt; as &amp;lt;code&amp;gt;&amp;quot;ascii&amp;quot;&amp;lt;/code&amp;gt; is a label for that encoding.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dt&amp;gt;&amp;lt;code&amp;gt;decode&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;dd&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The &amp;lt;code&amp;gt;decode&amp;lt;/code&amp;gt; method runs the following steps:&lt;br /&gt;
&lt;br /&gt;
# If &amp;lt;var&amp;gt;view&amp;lt;/var&amp;gt; is not specified, let &amp;lt;var&amp;gt;view&amp;lt;/var&amp;gt; be a &amp;lt;code&amp;gt;Uint8Array&amp;lt;/code&amp;gt; of length &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt;.&lt;br /&gt;
# If the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag of the &#039;&#039;decoder object&#039;&#039; is not set, then reset the &amp;lt;var&amp;gt;encoding algorithm state&amp;lt;/var&amp;gt; to the default values for encoding &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; and set &amp;lt;var&amp;gt;offset&amp;lt;/var&amp;gt; to &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt;. Otherwise, the &amp;lt;var&amp;gt;encoding algorithm state&amp;lt;/var&amp;gt; is re-used from the previous call to &amp;lt;code&amp;gt;decode&amp;lt;/code&amp;gt; on this object.&lt;br /&gt;
# If the &amp;lt;var&amp;gt;options&amp;lt;/var&amp;gt; parameter is specified and the &amp;lt;var&amp;gt;stream&amp;lt;/var&amp;gt; option is &#039;&#039;&#039;true&#039;&#039;&#039;, then the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag is set. Otherwise the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag is cleared.&lt;br /&gt;
# Run or resume the decoder algorithm of the &#039;&#039;decoder object&#039;s&#039;&#039; encoding:&lt;br /&gt;
#* The input to the algorithm is a &amp;lt;var&amp;gt;byte stream&amp;lt;/var&amp;gt;.&lt;br /&gt;
#** If &amp;lt;var&amp;gt;offset&amp;lt;/var&amp;gt; is greater than &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt;, the bytes in &amp;lt;var&amp;gt;byte stream&amp;lt;/var&amp;gt; for positions less than &amp;lt;var&amp;gt;offset&amp;lt;/var&amp;gt; are provided by buffer(s) passed in to previous calls to &amp;lt;code&amp;gt;decode()&amp;lt;/code&amp;gt;.&lt;br /&gt;
#** The bytes in &amp;lt;var&amp;gt;byte stream&amp;lt;/var&amp;gt; for positions &amp;lt;var&amp;gt;offset&amp;lt;/var&amp;gt; through &amp;lt;code&amp;gt;&amp;lt;var&amp;gt;offset&amp;lt;/var&amp;gt; + &amp;lt;var&amp;gt;view.byteLength&amp;lt;/var&amp;gt; - 1&amp;lt;/code&amp;gt; are provided by the bytes in &amp;lt;var&amp;gt;view.buffer&amp;lt;/var&amp;gt; starting at offset &amp;lt;var&amp;gt;view.byteOffset&amp;lt;/var&amp;gt;.&lt;br /&gt;
#** When accessing the byte in &amp;lt;var&amp;gt;byte stream&amp;lt;/var&amp;gt; at position &amp;lt;code&amp;gt;&amp;lt;var&amp;gt;offset&amp;lt;/var&amp;gt; + &amp;lt;var&amp;gt;view.byteLength&amp;lt;/var&amp;gt;&amp;lt;/code&amp;gt;:&lt;br /&gt;
#*** If the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag is not set, then yield the &#039;&#039;&#039;EOF byte&#039;&#039;&#039;.&lt;br /&gt;
#*** Otherwise, set &amp;lt;var&amp;gt;offset&amp;lt;/var&amp;gt; to &amp;lt;code&amp;gt;&amp;lt;var&amp;gt;offset&amp;lt;/var&amp;gt; + &amp;lt;var&amp;gt;view.byteLength&amp;lt;/var&amp;gt;&amp;lt;/code&amp;gt; and suspend the steps of the decoder algorithm until a subsequent call to &amp;lt;code&amp;gt;decode()&amp;lt;/code&amp;gt;.&lt;br /&gt;
#* If the internal &amp;lt;var&amp;gt;useBOM&amp;lt;/var&amp;gt; flag is set and &amp;lt;var&amp;gt;offset&amp;lt;/var&amp;gt; is &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; then:&lt;br /&gt;
#** If fewer than two bytes are present in the stream and the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag is set, suspend the decoding algorithm and return the empty string.&lt;br /&gt;
#** If the next two bytes of the stream are &#039;&#039;&#039;0xFF 0xFE&#039;&#039;&#039; then set &amp;lt;var&amp;gt;offset&amp;lt;/var&amp;gt; to &amp;lt;code&amp;gt;2&amp;lt;/code&amp;gt;, and clear the &amp;lt;var&amp;gt;utf-16be&amp;lt;/var&amp;gt; flag of the decoder state.&lt;br /&gt;
#** If the next two bytes of the stream are &#039;&#039;&#039;0xFE 0xFF&#039;&#039;&#039; then set &amp;lt;var&amp;gt;offset&amp;lt;/var&amp;gt; to &amp;lt;code&amp;gt;2&amp;lt;/code&amp;gt;, and set the &amp;lt;var&amp;gt;utf-16be&amp;lt;/var&amp;gt; flag of the decoder state.&lt;br /&gt;
#* If the internal &amp;lt;var&amp;gt;fatal&amp;lt;/var&amp;gt; flag of the &#039;&#039;decoder object&#039;&#039; is set, then a &#039;&#039;&#039;decoder error&#039;&#039;&#039; causes a &amp;lt;code&amp;gt;DOMException&amp;lt;/code&amp;gt; of type &amp;lt;code&amp;gt;EncodingError&amp;lt;/code&amp;gt; to be thrown rather than emitting a fallback code point.&lt;br /&gt;
#* The output of the algorithm is a &amp;lt;var&amp;gt;sequence of emitted code points&amp;lt;/var&amp;gt;.&lt;br /&gt;
# Return an IDL &amp;lt;code&amp;gt;DOMString&amp;lt;/code&amp;gt; value that represents the sequence of code units resulting from encoding the &amp;lt;var&amp;gt;sequence of emitted code points&amp;lt;/var&amp;gt; as UTF-16.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/dl&amp;gt;&lt;br /&gt;
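&lt;br /&gt;
The byte order mark check in the &amp;lt;code&amp;gt;decode&amp;lt;/code&amp;gt; steps above can be sketched as a small pure function. &amp;lt;code&amp;gt;sniffUtf16Bom&amp;lt;/code&amp;gt; is a hypothetical helper, not part of the proposed API, and it leaves out the streaming case in which fewer than two bytes have arrived so far.&lt;br /&gt;
&lt;br /&gt;
```javascript
// Hypothetical helper mirroring the useBOM step; not part of the proposed API.
// Returns how many leading bytes to skip and which byte order the BOM selects
// (bigEndian is null when no BOM is present).
function sniffUtf16Bom(bytes) {
  const marker = bytes.length >= 2 ? bytes[0] * 256 + bytes[1] : -1;
  if (marker === 0xfffe) {
    return { skip: 2, bigEndian: false }; // bytes FF FE: clear the utf-16be flag
  }
  if (marker === 0xfeff) {
    return { skip: 2, bigEndian: true }; // bytes FE FF: set the utf-16be flag
  }
  return { skip: 0, bigEndian: null };
}

console.log(sniffUtf16Bom(new Uint8Array([0xfe, 0xff, 0x00, 0x41])));
// { skip: 2, bigEndian: true }
```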
&lt;br /&gt;
== Examples ==&lt;br /&gt;
&lt;br /&gt;
=== Example #1 - encoding strings ===&lt;br /&gt;
&lt;br /&gt;
The following example uses the API to encode an array of strings into an ArrayBuffer. The resulting buffer contains the number of strings (as a Uint32), followed by the length in bytes of the first string (as a Uint32), the encoded string data, the length of the second string (as a Uint32), its encoded data, and so on.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
function encodeArrayOfStrings(strings, encoding) {&lt;br /&gt;
  var encoder, encoded, len, i, bytes, view, offset;&lt;br /&gt;
&lt;br /&gt;
  encoder = new TextEncoder(encoding);&lt;br /&gt;
  encoded = [];&lt;br /&gt;
&lt;br /&gt;
  len = Uint32Array.BYTES_PER_ELEMENT;&lt;br /&gt;
  for (i = 0; i &amp;lt; strings.length; i += 1) {&lt;br /&gt;
    len += Uint32Array.BYTES_PER_ELEMENT;&lt;br /&gt;
    encoded[i] = encoder.encode(strings[i]);&lt;br /&gt;
    len += encoded[i].byteLength;&lt;br /&gt;
  }&lt;br /&gt;
&lt;br /&gt;
  bytes = new Uint8Array(len);&lt;br /&gt;
  view = new DataView(bytes.buffer);&lt;br /&gt;
  offset = 0;&lt;br /&gt;
&lt;br /&gt;
  view.setUint32(offset, strings.length);&lt;br /&gt;
  offset += Uint32Array.BYTES_PER_ELEMENT;&lt;br /&gt;
  for (i = 0; i &amp;lt; encoded.length; i += 1) {&lt;br /&gt;
    len = encoded[i].byteLength;&lt;br /&gt;
    view.setUint32(offset, len);&lt;br /&gt;
    offset += Uint32Array.BYTES_PER_ELEMENT;&lt;br /&gt;
    bytes.set(encoded[i], offset);&lt;br /&gt;
    offset += len;&lt;br /&gt;
  }&lt;br /&gt;
  return bytes.buffer;&lt;br /&gt;
}&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Example #2 - decoding strings ===&lt;br /&gt;
&lt;br /&gt;
The following example decodes an ArrayBuffer containing data encoded in the format produced by the previous example back into an array of strings.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
function decodeArrayOfStrings(buffer, encoding) {&lt;br /&gt;
  var decoder, view, offset, num_strings, strings, i, len;&lt;br /&gt;
&lt;br /&gt;
  decoder = new TextDecoder(encoding);&lt;br /&gt;
  view = new DataView(buffer);&lt;br /&gt;
  offset = 0;&lt;br /&gt;
  strings = [];&lt;br /&gt;
&lt;br /&gt;
  num_strings = view.getUint32(offset);&lt;br /&gt;
  offset += Uint32Array.BYTES_PER_ELEMENT;&lt;br /&gt;
  for (i = 0; i &amp;lt; num_strings; i += 1) {&lt;br /&gt;
    len = view.getUint32(offset);&lt;br /&gt;
    offset += Uint32Array.BYTES_PER_ELEMENT;&lt;br /&gt;
    strings[i] = decoder.decode(&lt;br /&gt;
      new DataView(view.buffer, offset, len));&lt;br /&gt;
    offset += len;&lt;br /&gt;
  }&lt;br /&gt;
  return strings;&lt;br /&gt;
}&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Encodings ==&lt;br /&gt;
&lt;br /&gt;
Encodings are defined and implemented per [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding]. This implicitly includes the steps to &amp;lt;em&amp;gt;get an encoding&amp;lt;/em&amp;gt; from a string, and the logic for label matching and case-insensitivity.&lt;br /&gt;
&lt;br /&gt;
User agents MUST NOT support any other encodings or labels than those defined in [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding], and MUST support all encodings and labels defined in that specification, with the additions defined below.&lt;br /&gt;
&lt;br /&gt;
:&#039;&#039;NOTE: In [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding], &amp;quot;ascii&amp;quot; is a label for &#039;&#039;&#039;windows-1252&#039;&#039;&#039;; there is no 7-bit-or-raise-exception encoding. Applications that are required to restrict the content of decoded strings should implement validation after decoding.&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
:&#039;&#039;NOTE: Unicode normalization forms are outside the scope of this specification. No normalization is done prior to encoding or after decoding.&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
:&#039;&#039;NOTE: Handling of encoding-specific issues, e.g. over-long UTF-8 encodings, byte order marks, unmatched surrogate pairs, and so on is defined by [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding].&#039;&#039;&lt;br /&gt;
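&lt;br /&gt;
The post-decode validation suggested in the note about &amp;quot;ascii&amp;quot; above could look like the following minimal sketch. &amp;lt;code&amp;gt;isSevenBit&amp;lt;/code&amp;gt; is a hypothetical helper, and the sketch assumes the &amp;lt;code&amp;gt;TextDecoder&amp;lt;/code&amp;gt; of current browsers and Node.js, which may differ from this draft.&lt;br /&gt;
&lt;br /&gt;
```javascript
// Hypothetical helper: true when every code unit is in the 7-bit ASCII range.
function isSevenBit(text) {
  return /^[\x00-\x7F]*$/.test(text);
}

// Assumption: global TextDecoder; "utf-8" is used here so the sketch does not
// depend on legacy-encoding support in the runtime.
const decoded = new TextDecoder("utf-8").decode(
  new Uint8Array([0x68, 0x69, 0xc3, 0xa9]) // "hi" followed by U+00E9
);

console.log(isSevenBit("hi"), isSevenBit(decoded)); // true false
```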
&lt;br /&gt;
== References ==&lt;br /&gt;
&lt;br /&gt;
* WebIDL http://dev.w3.org/2006/webapi/WebIDL&lt;br /&gt;
* Encoding http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== Acknowledgements ==&lt;br /&gt;
&lt;br /&gt;
* Alan Chaney&lt;br /&gt;
* Ben Noordhuis&lt;br /&gt;
* Glenn Maynard&lt;br /&gt;
* John Tamplin&lt;br /&gt;
* Kenneth Russell (Google, Inc)&lt;br /&gt;
* Robert Mustacchi&lt;br /&gt;
* Ryan Dahl&lt;br /&gt;
* Anne van Kesteren&lt;br /&gt;
* Cameron McCormack&lt;br /&gt;
&lt;br /&gt;
== Appendix ==&lt;br /&gt;
&lt;br /&gt;
A &amp;quot;shim&amp;quot; implementation in ECMAScript (that may not fully match the current version of the spec) plus some initial unit tests can be found at:&lt;br /&gt;
&lt;br /&gt;
:http://code.google.com/p/stringencoding/&lt;br /&gt;
&lt;br /&gt;
[[Category:Proposals]]&lt;/div&gt;</summary>
		<author><name>Jsbell</name></author>
	</entry>
	<entry>
		<id>https://wiki.whatwg.org/index.php?title=StringEncoding&amp;diff=8478</id>
		<title>StringEncoding</title>
		<link rel="alternate" type="text/html" href="https://wiki.whatwg.org/index.php?title=StringEncoding&amp;diff=8478"/>
		<updated>2012-08-22T00:34:33Z</updated>

		<summary type="html">&lt;p&gt;Jsbell: Decoding WIP&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;Proposed Text Encoding Web API for Typed Arrays&lt;br /&gt;
&lt;br /&gt;
== Editors ==&lt;br /&gt;
&lt;br /&gt;
* Joshua Bell (Google, Inc)&lt;br /&gt;
&lt;br /&gt;
== Abstract ==&lt;br /&gt;
&lt;br /&gt;
This specification defines an API for encoding strings to binary data, and decoding strings from binary data.&lt;br /&gt;
&lt;br /&gt;
NOTE: This specification intentionally does not address the opposite scenario of encoding binary data as strings and decoding binary data from strings, for example using Base64 encoding.&lt;br /&gt;
&lt;br /&gt;
Discussion on this topic has so far taken place on the [http://www.khronos.org/webgl/public-mailing-list/ public_webgl@khronos.org] mailing list. See http://www.khronos.org/webgl/public-mailing-list/archives/1111/msg00017.html for the initial discussion thread.&lt;br /&gt;
&lt;br /&gt;
Discussion has since moved to the [http://www.whatwg.org/mailing-list#specs WHATWG spec discussion] mailing list. See  http://lists.whatwg.org/htdig.cgi/whatwg-whatwg.org/2012-March/035038.html for the latest discussion thread.&lt;br /&gt;
&lt;br /&gt;
== Open Issues ==&lt;br /&gt;
&lt;br /&gt;
* The [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding specification] defines the byte order mark as more authoritative than an explicit encoding label when decoding content. &lt;br /&gt;
** Is that desirable here? &lt;br /&gt;
** What does it mean to e.g. write &amp;lt;code&amp;gt;TextDecoder(&#039;iso-2022-kr&#039;).decode(str)&amp;lt;/code&amp;gt; if the input turns out to be UTF-8 data?&lt;br /&gt;
** What about streaming and other re-uses of the same decoder object?&lt;br /&gt;
** At the very least, a matching BOM should be ignored (i.e. not returned as part of the output) and a mismatching BOM should signal an error.&lt;br /&gt;
&lt;br /&gt;
* Should the &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; attribute return the &amp;lt;var&amp;gt;Name&amp;lt;/var&amp;gt; of the encoding or the name that was passed in?&lt;br /&gt;
&lt;br /&gt;
== Notes to Implementers ==&lt;br /&gt;
&lt;br /&gt;
* Streaming decode/encode requires retaining partial buffers between calls.&lt;br /&gt;
** Some encode/decode algorithms require adjusting the &amp;lt;var&amp;gt;code point pointer&amp;lt;/var&amp;gt; or &amp;lt;var&amp;gt;byte pointer&amp;lt;/var&amp;gt; by a negative amount. This could occur across &amp;quot;chunk&amp;quot; boundaries. This implies that when the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag is set on an encoder/decoder, the last N elements of the stream must be saved for the next call and used as a prefix for the stream. N is defined by the specific encoding algorithm.&lt;br /&gt;
** This is not yet implemented in the [http://code.google.com/p/stringencoding/ ECMAScript shim].&lt;br /&gt;
&lt;br /&gt;
== Resolved Issues ==&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dl&amp;gt;&lt;br /&gt;
&amp;lt;dt&amp;gt;Encode as many characters as possible into a fixed-size buffer for transmission, and repeat starting with next unencoded character&lt;br /&gt;
&amp;lt;dd&amp;gt;Resolution: not for &amp;quot;v1&amp;quot; - can be implemented using this API, and breaking the string down by &amp;quot;character&amp;quot; is unlikely to be as obvious as it sounds - surrogate pairs, combining sequences, etc&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dt&amp;gt;Support two versions of encode; one which takes target buffer, and one which creates/returns a right-sized buffer&lt;br /&gt;
&amp;lt;dd&amp;gt;Resolution: See above, and wait for developer feedback.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dt&amp;gt;Allow arbitrary end byte sequences (e.g. &amp;lt;code&amp;gt;0xFF&amp;lt;/code&amp;gt; for UTF-8 strings)&lt;br /&gt;
&amp;lt;dd&amp;gt;Resolution: Add &amp;lt;code&amp;gt;indexOf&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;ArrayBufferView&amp;lt;/code&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dt&amp;gt;Remove &#039;&#039;binary&#039;&#039; encoding? &#039;&#039;(a proposed 8-bit-clean encoding for interop with legacy binary data stored in ECMAScript strings)&#039;&#039;&lt;br /&gt;
&amp;lt;dd&amp;gt;The only real use is with &amp;lt;code&amp;gt;atob()&amp;lt;/code&amp;gt;/&amp;lt;code&amp;gt;btoa()&amp;lt;/code&amp;gt;; a better API would be Base64 directly in/out of Typed Arrays. Consensus on WHATWG is to add better APIs e.g. &lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
partial interface ArrayBufferView {&lt;br /&gt;
    DOMString toBase64();&lt;br /&gt;
};&lt;br /&gt;
&lt;br /&gt;
partial interface ArrayBuffer {&lt;br /&gt;
    static ArrayBuffer fromBase64(DOMString string);&lt;br /&gt;
};&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dt&amp;gt;Should this be a standalone API (as written), or live on e.g. DataView, or on e.g. String?&lt;br /&gt;
&amp;lt;dd&amp;gt;There seems to be pretty strong consensus that streaming, stateful coding is high priority and that an object-oriented API is cleanest, and shoe-horning those onto existing objects would be messy.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dt&amp;gt;Should legacy encodings be supported?&lt;br /&gt;
&amp;lt;dd&amp;gt;Resolution: Consensus on the [http://lists.whatwg.org/htdig.cgi/whatwg-whatwg.org/2012-August/036825.html WHATWG mailing list] - support legacy encodings for decode, and only &#039;utf-8&#039;, &#039;utf-16&#039; and &#039;utf-16be&#039; for encode.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dt&amp;gt;What to do on encoding errors? If non-UTF encodings are supported then we may want to allow substitution (e.g. ASCII &#039;?&#039;) or a script callback (for arbitrary escaping).&lt;br /&gt;
&amp;lt;dd&amp;gt;Resolution: not for &amp;quot;v1&amp;quot; (see above)&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/dl&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== API ==&lt;br /&gt;
&lt;br /&gt;
=== TextEncoder ===&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;WebIDL&#039;&#039;&#039;&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
dictionary TextEncodeOptions {&lt;br /&gt;
  boolean stream = false;&lt;br /&gt;
};&lt;br /&gt;
&lt;br /&gt;
[Constructor(optional DOMString encoding)]&lt;br /&gt;
interface TextEncoder {&lt;br /&gt;
  readonly attribute DOMString encoding;&lt;br /&gt;
  Uint8Array encode(DOMString? string, optional TextEncodeOptions options);&lt;br /&gt;
};&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The constructor runs the following steps:&lt;br /&gt;
# If the constructor is called with no arguments, let &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt; be &amp;quot;&amp;lt;code&amp;gt;utf-8&amp;lt;/code&amp;gt;&amp;quot;. Otherwise, let &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt; be the value of the &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; argument.&lt;br /&gt;
# Run the &#039;&#039;&#039;steps to get an encoding&#039;&#039;&#039; from [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding], with &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt; as &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt;.&lt;br /&gt;
#* If the steps result in failure, throw an &amp;quot;&amp;lt;code&amp;gt;EncodingError&amp;lt;/code&amp;gt;&amp;quot; exception and terminate these steps.&lt;br /&gt;
#* Otherwise, if the &amp;lt;var&amp;gt;Name&amp;lt;/var&amp;gt; of the returned encoding is not one of &amp;quot;&amp;lt;code&amp;gt;utf-8&amp;lt;/code&amp;gt;&amp;quot;, &amp;quot;&amp;lt;code&amp;gt;utf-16&amp;lt;/code&amp;gt;&amp;quot;, or &amp;quot;&amp;lt;code&amp;gt;utf-16be&amp;lt;/code&amp;gt;&amp;quot; throw an &amp;quot;&amp;lt;code&amp;gt;EncodingError&amp;lt;/code&amp;gt;&amp;quot; exception and terminate these steps.&lt;br /&gt;
#* Otherwise, set the encoder object&#039;s internal &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; property to the returned encoding.&lt;br /&gt;
# Initialize the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag of the encoder object to false. &lt;br /&gt;
# Initialize the internal &amp;lt;var&amp;gt;encoding algorithm state&amp;lt;/var&amp;gt; to the default values for the encoding &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dl&amp;gt;&lt;br /&gt;
&amp;lt;dt&amp;gt;&amp;lt;code&amp;gt;encoding&amp;lt;/code&amp;gt; of type DOMString, readonly&lt;br /&gt;
&amp;lt;dd&amp;gt;Returns the &amp;lt;var&amp;gt;Name&amp;lt;/var&amp;gt; of the encoder object&#039;s &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;, per [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding]. &lt;br /&gt;
&lt;br /&gt;
:Note that this may differ from the name of the encoding specified during the call to the constructor. For example, if the constructor is called with &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; of &amp;lt;code&amp;gt;&amp;quot;ascii&amp;quot;&amp;lt;/code&amp;gt; the &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; attribute of the &#039;&#039;encoder object&#039;&#039; would have the value &amp;lt;code&amp;gt;&amp;quot;windows-1252&amp;quot;&amp;lt;/code&amp;gt; as &amp;lt;code&amp;gt;&amp;quot;ascii&amp;quot;&amp;lt;/code&amp;gt; is a label for that encoding.&lt;br /&gt;
&amp;lt;/dl&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dl&amp;gt;&lt;br /&gt;
&amp;lt;dt&amp;gt;&amp;lt;code&amp;gt;encode&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;dd&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The &amp;lt;code&amp;gt;encode&amp;lt;/code&amp;gt; method runs these steps:&lt;br /&gt;
&lt;br /&gt;
# If the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag is not set, then reset the &amp;lt;var&amp;gt;encoding algorithm state&amp;lt;/var&amp;gt; to the default values for &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;. Otherwise, the &amp;lt;var&amp;gt;encoding algorithm state&amp;lt;/var&amp;gt; is re-used from the previous call to &amp;lt;code&amp;gt;encode&amp;lt;/code&amp;gt; on this object.&lt;br /&gt;
# If the &amp;lt;var&amp;gt;options&amp;lt;/var&amp;gt; parameter is specified and the &amp;lt;code&amp;gt;stream&amp;lt;/code&amp;gt; option is &#039;&#039;&#039;true&#039;&#039;&#039;, then the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag is set; otherwise the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag is cleared.&lt;br /&gt;
# Run the steps of the &amp;lt;var&amp;gt;encoding algorithm&amp;lt;/var&amp;gt;:&lt;br /&gt;
#* The input to the algorithm is a &amp;lt;var&amp;gt;stream of code points&amp;lt;/var&amp;gt;. The stream is composed of the Unicode code points for the Unicode characters produced by following the [http://dev.w3.org/2006/webapi/WebIDL/#dfn-obtain-unicode steps to convert a DOMString to a sequence of Unicode characters] in [http://dev.w3.org/2006/webapi/WebIDL/ WebIDL] with &amp;lt;var&amp;gt;string&amp;lt;/var&amp;gt; as the input. If &amp;lt;var&amp;gt;string&amp;lt;/var&amp;gt; is null, the &amp;lt;var&amp;gt;stream of code points&amp;lt;/var&amp;gt; is empty.&lt;br /&gt;
#* If the &amp;lt;var&amp;gt;options&amp;lt;/var&amp;gt; parameter is not specified or the &amp;lt;code&amp;gt;stream&amp;lt;/code&amp;gt; option is false, then after the final code point is yielded by the stream, the &#039;&#039;&#039;EOF code point&#039;&#039;&#039; is yielded.&lt;br /&gt;
#* The output of the algorithm is a &amp;lt;var&amp;gt;sequence of emitted bytes&amp;lt;/var&amp;gt;.&lt;br /&gt;
# Return a &amp;lt;code&amp;gt;Uint8Array&amp;lt;/code&amp;gt; object wrapping an &amp;lt;code&amp;gt;ArrayBuffer&amp;lt;/code&amp;gt; containing the &amp;lt;var&amp;gt;sequence of emitted bytes&amp;lt;/var&amp;gt; produced by the encoding algorithm.&lt;br /&gt;
&lt;br /&gt;
:&#039;&#039;NOTE: Because only UTF encodings are supported, and because of the use of the [http://dev.w3.org/2006/webapi/WebIDL/#dfn-obtain-unicode steps to convert a DOMString to a sequence of Unicode characters] and thus Unicode code points, no input can cause the encoding process to emit an encoder error.&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/dl&amp;gt;&lt;br /&gt;
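&lt;br /&gt;
As a rough illustration of the non-streaming &amp;lt;code&amp;gt;encode&amp;lt;/code&amp;gt; steps above, the following sketch uses the &amp;lt;code&amp;gt;TextEncoder&amp;lt;/code&amp;gt; that later shipped in browsers and Node.js. That runtime is an assumption of the example rather than part of this proposal: the shipped interface encodes only UTF-8 and has no &amp;lt;code&amp;gt;stream&amp;lt;/code&amp;gt; option, so only the default path is shown.&lt;br /&gt;

```javascript
// Sketch only: relies on the UTF-8-only TextEncoder found in current engines,
// not on the multi-encoding constructor proposed in this document.
const encoder = new TextEncoder();

// One call with no options: the whole string is encoded and the stream ends.
const bytes = encoder.encode("h\u00e9llo");

// "h" is 1 byte, U+00E9 is 2 bytes, "llo" is 3 bytes.
const hex = Array.from(bytes, function (b) {
  return b.toString(16).padStart(2, "0");
}).join(" ");

console.log(encoder.encoding); // the Name of the encoding, not the label
console.log(bytes.length, hex);
```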
&lt;br /&gt;
=== TextDecoder ===&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;WebIDL&#039;&#039;&#039;&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
dictionary TextDecoderOptions {&lt;br /&gt;
  boolean fatal = false;&lt;br /&gt;
};&lt;br /&gt;
&lt;br /&gt;
dictionary TextDecodeOptions {&lt;br /&gt;
  boolean stream = false;&lt;br /&gt;
};&lt;br /&gt;
&lt;br /&gt;
[Constructor(optional DOMString encoding, optional TextDecoderOptions options)]&lt;br /&gt;
interface TextDecoder {&lt;br /&gt;
  readonly attribute DOMString encoding;&lt;br /&gt;
  DOMString decode(optional ArrayBufferView view, optional TextDecodeOptions options);&lt;br /&gt;
};&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The constructor runs the following steps:&lt;br /&gt;
# The constructor creates a &#039;&#039;decoder object&#039;&#039;. It has the internal properties &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;, a &amp;lt;var&amp;gt;fatal&amp;lt;/var&amp;gt; flag which is initially unset,  a &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag which is initially unset, an &amp;lt;var&amp;gt;offset&amp;lt;/var&amp;gt; pointer which is initially &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt;, and a &amp;lt;var&amp;gt;useBOM&amp;lt;/var&amp;gt; flag which is initially unset.&lt;br /&gt;
# If called without an &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; argument, let &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt; be &amp;quot;&amp;lt;code&amp;gt;utf-8&amp;lt;/code&amp;gt;&amp;quot;. Otherwise, let &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt; be the value of the &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; argument.&lt;br /&gt;
# If &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt; is a case-insensitive match for &amp;quot;&amp;lt;code&amp;gt;utf-16&amp;lt;/code&amp;gt;&amp;quot; then set the internal &amp;lt;var&amp;gt;useBOM&amp;lt;/var&amp;gt; flag.&lt;br /&gt;
# Run the &#039;&#039;&#039;steps to get an encoding&#039;&#039;&#039; from [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding], with &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt; as &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt;. &lt;br /&gt;
## If the steps result in failure, throw an &amp;quot;&amp;lt;code&amp;gt;EncodingError&amp;lt;/code&amp;gt;&amp;quot; exception and terminate these steps.&lt;br /&gt;
## Otherwise, set the &#039;&#039;decoder object&#039;s&#039;&#039; internal &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; property to the returned encoding.&lt;br /&gt;
# If the constructor is called with an &amp;lt;var&amp;gt;options&amp;lt;/var&amp;gt; argument, and the &amp;lt;var&amp;gt;fatal&amp;lt;/var&amp;gt; property of the dictionary is set, set the internal &amp;lt;var&amp;gt;fatal&amp;lt;/var&amp;gt; flag of the &#039;&#039;decoder object&#039;&#039;.&lt;br /&gt;
# Initialize the internal &amp;lt;var&amp;gt;encoding algorithm state&amp;lt;/var&amp;gt; to the default values for the encoding &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;.&lt;br /&gt;
# Return the &#039;&#039;decoder object&#039;&#039;.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dl&amp;gt;&lt;br /&gt;
&amp;lt;dt&amp;gt;&amp;lt;code&amp;gt;encoding&amp;lt;/code&amp;gt; of type DOMString, readonly&lt;br /&gt;
&amp;lt;dd&amp;gt;&lt;br /&gt;
Returns the &amp;lt;var&amp;gt;Name&amp;lt;/var&amp;gt; of the decoder object&#039;s &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;, per [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding].&lt;br /&gt;
&lt;br /&gt;
:Note that this may differ from the name of the encoding specified during the call to the constructor. For example, if the constructor is called with &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; of &amp;lt;code&amp;gt;&amp;quot;ascii&amp;quot;&amp;lt;/code&amp;gt; the &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; attribute of the &#039;&#039;decoder object&#039;&#039; would have the value &amp;lt;code&amp;gt;&amp;quot;windows-1252&amp;quot;&amp;lt;/code&amp;gt; as &amp;lt;code&amp;gt;&amp;quot;ascii&amp;quot;&amp;lt;/code&amp;gt; is a label for that encoding.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dt&amp;gt;&amp;lt;code&amp;gt;decode&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;dd&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The &amp;lt;code&amp;gt;decode&amp;lt;/code&amp;gt; method runs the following steps:&lt;br /&gt;
&lt;br /&gt;
# If &amp;lt;var&amp;gt;view&amp;lt;/var&amp;gt; is not specified, let &amp;lt;var&amp;gt;view&amp;lt;/var&amp;gt; be a &amp;lt;code&amp;gt;Uint8Array&amp;lt;/code&amp;gt; of length &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt;.&lt;br /&gt;
# If the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag of the &#039;&#039;decoder object&#039;&#039; is not set, then reset the &amp;lt;var&amp;gt;encoding algorithm state&amp;lt;/var&amp;gt; to the default values for encoding &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; and set &amp;lt;var&amp;gt;offset&amp;lt;/var&amp;gt; to &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt;. Otherwise, the &amp;lt;var&amp;gt;encoding algorithm state&amp;lt;/var&amp;gt; is re-used from the previous call to &amp;lt;code&amp;gt;decode&amp;lt;/code&amp;gt; on this object.&lt;br /&gt;
# If the &amp;lt;var&amp;gt;options&amp;lt;/var&amp;gt; parameter is specified and the &amp;lt;var&amp;gt;stream&amp;lt;/var&amp;gt; option is &#039;&#039;&#039;true&#039;&#039;&#039;, then the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag is set. Otherwise the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag is cleared.&lt;br /&gt;
# Run or resume the decoder algorithm of the &#039;&#039;decoder object&#039;s&#039;&#039; encoding:&lt;br /&gt;
#* The input to the algorithm is a &amp;lt;var&amp;gt;byte stream&amp;lt;/var&amp;gt;.&lt;br /&gt;
#** If &amp;lt;var&amp;gt;offset&amp;lt;/var&amp;gt; is greater than &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt;, the bytes in &amp;lt;var&amp;gt;byte stream&amp;lt;/var&amp;gt; for positions less than &amp;lt;var&amp;gt;offset&amp;lt;/var&amp;gt; are provided by buffer(s) passed in to previous calls to &amp;lt;code&amp;gt;decode()&amp;lt;/code&amp;gt;.&lt;br /&gt;
#** The bytes in &amp;lt;var&amp;gt;byte stream&amp;lt;/var&amp;gt; for positions &amp;lt;var&amp;gt;offset&amp;lt;/var&amp;gt; through &amp;lt;code&amp;gt;&amp;lt;var&amp;gt;offset&amp;lt;/var&amp;gt; + &amp;lt;var&amp;gt;view.byteLength&amp;lt;/var&amp;gt; - 1&amp;lt;/code&amp;gt; are provided by the bytes in &amp;lt;var&amp;gt;view.buffer&amp;lt;/var&amp;gt; starting at offset &amp;lt;var&amp;gt;view.byteOffset&amp;lt;/var&amp;gt;.&lt;br /&gt;
#** When accessing the byte in &amp;lt;var&amp;gt;byte stream&amp;lt;/var&amp;gt; at position &amp;lt;code&amp;gt;&amp;lt;var&amp;gt;offset&amp;lt;/var&amp;gt; + &amp;lt;var&amp;gt;view.byteLength&amp;lt;/var&amp;gt;&amp;lt;/code&amp;gt;:&lt;br /&gt;
#*** If the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag is not set, then yield the &#039;&#039;&#039;EOF byte&#039;&#039;&#039;.&lt;br /&gt;
#*** Otherwise, set &amp;lt;var&amp;gt;offset&amp;lt;/var&amp;gt; to &amp;lt;code&amp;gt;&amp;lt;var&amp;gt;offset&amp;lt;/var&amp;gt; + &amp;lt;var&amp;gt;view.byteLength&amp;lt;/var&amp;gt;&amp;lt;/code&amp;gt; and suspend the steps of the decoder algorithm until a subsequent call to &amp;lt;code&amp;gt;decode()&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;!--&lt;br /&gt;
&lt;br /&gt;
#* If &amp;lt;var&amp;gt;offset&amp;lt;/var&amp;gt; is 0:&lt;br /&gt;
#** If &amp;lt;var&amp;gt;useBOM&amp;lt;/var&amp;gt; is set, then:&lt;br /&gt;
#*** If less than two bytes are available in the stream, return without emitting anything.&lt;br /&gt;
#*** If the first two bytes of the stream are &#039;&#039;&#039;0xFF 0xFE&#039;&#039;&#039; then advance the stream by two bytes and set &amp;lt;var&amp;gt;offset&amp;lt;/var&amp;gt; to 2.&lt;br /&gt;
#*** If the first two bytes of the stream are &#039;&#039;&#039;0xFE 0xFF&#039;&#039;&#039; then advance the stream by two bytes and set &amp;lt;var&amp;gt;offset&amp;lt;/var&amp;gt; to 2 and set the &#039;&#039;&#039;utf-16be flag&#039;&#039;&#039; in the &#039;&#039;encoding algorithm state&#039;&#039;&lt;br /&gt;
#** Otherwise, if the &amp;lt;var&amp;gt;fatal&amp;lt;/var&amp;gt; flag is set:&lt;br /&gt;
#*** If the first bytes of the stream are &#039;&#039;&#039;0xFF 0xFE&#039;&#039;&#039; and encoding is &amp;quot;&amp;lt;code&amp;gt;utf-16be&amp;lt;/code&amp;gt;&amp;quot; or &amp;quot;&amp;lt;code&amp;gt;utf-8&amp;lt;/code&amp;gt;&amp;quot; then throw an &amp;quot;EncodingError&amp;quot; DOMException.&lt;br /&gt;
#*** If the first bytes of the stream are &#039;&#039;&#039;0xFE 0xFF&#039;&#039;&#039; and encoding is &amp;quot;&amp;lt;code&amp;gt;utf-16&amp;lt;/code&amp;gt;&amp;quot; or &amp;quot;&amp;lt;code&amp;gt;utf-8&amp;lt;/code&amp;gt;&amp;quot; then throw an &amp;quot;EncodingError&amp;quot; DOMException.&lt;br /&gt;
#*** If the first three bytes of the stream are &#039;&#039;&#039;0xEF 0xBB 0xBF&#039;&#039;&#039; and encoding is &amp;quot;&amp;lt;code&amp;gt;utf-16&amp;lt;/code&amp;gt;&amp;quot; or &amp;quot;&amp;lt;code&amp;gt;utf-16be&amp;lt;/code&amp;gt;&amp;quot; then throw an &amp;quot;EncodingError&amp;quot; DOMException.&lt;br /&gt;
--&amp;gt;&lt;br /&gt;
#* If the internal &amp;lt;var&amp;gt;fatal&amp;lt;/var&amp;gt; flag of the &#039;&#039;decoder object&#039;&#039; is set, then a &#039;&#039;&#039;decoder error&#039;&#039;&#039; causes a &amp;lt;code&amp;gt;DOMException&amp;lt;/code&amp;gt; of type &amp;lt;code&amp;gt;EncodingError&amp;lt;/code&amp;gt; to be thrown rather than emitting a fallback code point.&lt;br /&gt;
#* The output of the algorithm is a &amp;lt;var&amp;gt;sequence of emitted code points&amp;lt;/var&amp;gt;.&lt;br /&gt;
# Return an IDL &amp;lt;code&amp;gt;DOMString&amp;lt;/code&amp;gt; value that represents the sequence of code units resulting from encoding the &amp;lt;var&amp;gt;sequence of emitted code points&amp;lt;/var&amp;gt; as UTF-16.&lt;br /&gt;
&lt;br /&gt;
:&#039;&#039;ISSUE: Need to handle BOMs. [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding] specifies this by using the caller-specified encoding as a suggestion, and consuming the BOM as a part of selecting the real encoder where BOM takes precedence. At the very least a matching BOM should be ignored. A mismatching BOM could throw, be a decoding error, or could actually switch the decoder for this stream (call or sequence of calls), possibly if-and-only-if the constructor was called without an encoding.&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/dl&amp;gt;&lt;br /&gt;
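&lt;br /&gt;
The effect of the &amp;lt;code&amp;gt;stream&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;fatal&amp;lt;/code&amp;gt; options described above can be sketched with the &amp;lt;code&amp;gt;TextDecoder&amp;lt;/code&amp;gt; that later shipped in browsers and Node.js. That runtime is an assumption of the example and differs from this draft in places: it throws a &amp;lt;code&amp;gt;TypeError&amp;lt;/code&amp;gt; rather than an &amp;lt;code&amp;gt;EncodingError&amp;lt;/code&amp;gt; on a fatal decoder error, and by default it consumes a matching UTF-8 byte order mark. The helper name &amp;lt;code&amp;gt;decodeOrNull&amp;lt;/code&amp;gt; is hypothetical.&lt;br /&gt;

```javascript
// Sketch only: uses the TextDecoder available in current engines.
const euro = new Uint8Array([0xe2, 0x82, 0xac]); // U+20AC in UTF-8

// stream: true keeps the decoder state, so a code point split across
// two chunks is completed by the second call.
const decoder = new TextDecoder("utf-8");
const first = decoder.decode(euro.subarray(0, 2), { stream: true });
const second = decoder.decode(euro.subarray(2), { stream: true });
const rest = decoder.decode(); // no view and no stream option: flush

// fatal: true turns a decoder error into an exception instead of U+FFFD.
function decodeOrNull(view) {
  try {
    return new TextDecoder("utf-8", { fatal: true }).decode(view);
  } catch (e) {
    return null;
  }
}
const lenient = new TextDecoder("utf-8").decode(new Uint8Array([0xff]));
const strict = decodeOrNull(new Uint8Array([0xff]));

// A matching UTF-8 byte order mark is consumed rather than returned.
const withBom = new TextDecoder("utf-8").decode(
  new Uint8Array([0xef, 0xbb, 0xbf, 0x41])
);

console.log([first, second, rest], lenient, strict, withBom);
```

The first call returns an empty string because the two bytes it received do not yet form a complete code point; the decoder holds them until the next chunk arrives.&lt;br /&gt;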
&lt;br /&gt;
== Examples ==&lt;br /&gt;
&lt;br /&gt;
=== Example #1 - encoding strings ===&lt;br /&gt;
&lt;br /&gt;
The following example uses the API to encode an array of strings into an ArrayBuffer. The returned buffer contains the number of strings (as a Uint32), followed by the length of the first string (as a Uint32), the UTF-8 encoded string data, the length of the second string (as a Uint32), the string data, and so on.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
function encodeArrayOfStrings(strings, encoding) {&lt;br /&gt;
  var encoder, encoded, len, i, bytes, view, offset;&lt;br /&gt;
&lt;br /&gt;
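  // Create one encoder and reuse it for every string.&lt;br /&gt;
  encoder = new TextEncoder(encoding);&lt;br /&gt;
  encoded = [];&lt;br /&gt;
&lt;br /&gt;
  // Total size: a Uint32 string count, then a Uint32 length prefix plus the encoded bytes for each string.&lt;br /&gt;
  len = Uint32Array.BYTES_PER_ELEMENT;&lt;br /&gt;
  for (i = 0; i &amp;lt; strings.length; i += 1) {&lt;br /&gt;
    len += Uint32Array.BYTES_PER_ELEMENT;&lt;br /&gt;
    encoded[i] = encoder.encode(strings[i]);&lt;br /&gt;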
    len += encoded[i].byteLength;&lt;br /&gt;
  }&lt;br /&gt;
&lt;br /&gt;
  bytes = new Uint8Array(len);&lt;br /&gt;
  view = new DataView(bytes.buffer);&lt;br /&gt;
  offset = 0;&lt;br /&gt;
&lt;br /&gt;
  view.setUint32(offset, strings.length);&lt;br /&gt;
  offset += Uint32Array.BYTES_PER_ELEMENT;&lt;br /&gt;
  for (i = 0; i &amp;lt; encoded.length; i += 1) {&lt;br /&gt;
    len = encoded[i].byteLength;&lt;br /&gt;
    view.setUint32(offset, len);&lt;br /&gt;
    offset += Uint32Array.BYTES_PER_ELEMENT;&lt;br /&gt;
    bytes.set(encoded[i], offset);&lt;br /&gt;
    offset += len;&lt;br /&gt;
  }&lt;br /&gt;
  return bytes.buffer;&lt;br /&gt;
}&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Example #2 - decoding strings ===&lt;br /&gt;
&lt;br /&gt;
The following example decodes an ArrayBuffer containing data encoded in the format produced by the previous example back into an array of strings.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
function decodeArrayOfStrings(buffer, encoding) {&lt;br /&gt;
  var decoder, view, offset, num_strings, strings, i, len;&lt;br /&gt;
&lt;br /&gt;
  decoder = new TextDecoder(encoding);&lt;br /&gt;
  view = new DataView(buffer);&lt;br /&gt;
  offset = 0;&lt;br /&gt;
  strings = [];&lt;br /&gt;
&lt;br /&gt;
  num_strings = view.getUint32(offset);&lt;br /&gt;
  offset += Uint32Array.BYTES_PER_ELEMENT;&lt;br /&gt;
  for (i = 0; i &amp;lt; num_strings; i += 1) {&lt;br /&gt;
    len = view.getUint32(offset);&lt;br /&gt;
    offset += Uint32Array.BYTES_PER_ELEMENT;&lt;br /&gt;
    strings[i] = decoder.decode(&lt;br /&gt;
      new DataView(view.buffer, offset, len));&lt;br /&gt;
    offset += len;&lt;br /&gt;
  }&lt;br /&gt;
  return strings;&lt;br /&gt;
}&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Encodings ==&lt;br /&gt;
&lt;br /&gt;
Encodings are defined and implemented per [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding]. This implicitly includes the steps to &amp;lt;em&amp;gt;get an encoding&amp;lt;/em&amp;gt; from a string, and the logic for label matching and case-insensitivity.&lt;br /&gt;
&lt;br /&gt;
User agents MUST NOT support any other encodings or labels than those defined in [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding], and MUST support all encodings and labels defined in that specification, with the additions defined below.&lt;br /&gt;
&lt;br /&gt;
:&#039;&#039;NOTE: In [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding], &amp;quot;ascii&amp;quot; is a label for &#039;&#039;&#039;windows-1252&#039;&#039;&#039;; there is no 7-bit-or-raise-exception encoding. Applications that are required to restrict the content of decoded strings should implement validation after decoding.&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
:&#039;&#039;NOTE: Unicode normalization forms are outside the scope of this specification. No normalization is done prior to encoding or after decoding.&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
:&#039;&#039;NOTE: Handling of encoding-specific issues, e.g. over-long UTF-8 encodings, byte order marks, unmatched surrogate pairs, and so on is defined by [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding].&#039;&#039;&lt;br /&gt;
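&lt;br /&gt;
The first note above suggests validating after decoding when an application can only accept 7-bit content. A minimal sketch of that approach follows. It assumes a runtime whose &amp;lt;code&amp;gt;TextDecoder&amp;lt;/code&amp;gt; supports legacy encodings (for example a browser, or Node.js built with full ICU data), and &amp;lt;code&amp;gt;decodeStrictAscii&amp;lt;/code&amp;gt; is a hypothetical helper name, not part of the proposed API.&lt;br /&gt;

```javascript
// Hypothetical helper, not part of the proposed API.
function decodeStrictAscii(view) {
  // "ascii" is a label for windows-1252, so bytes 0x80-0xFF decode
  // successfully; the 7-bit restriction has to be checked afterwards.
  const text = new TextDecoder("ascii").decode(view);
  if (!/^[\x00-\x7f]*$/.test(text)) {
    throw new RangeError("input contains non-ASCII characters");
  }
  return text;
}

console.log(new TextDecoder("ascii").encoding);
console.log(decodeStrictAscii(new Uint8Array([0x68, 0x69])));
```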
&lt;br /&gt;
== References ==&lt;br /&gt;
&lt;br /&gt;
* WebIDL http://dev.w3.org/2006/webapi/WebIDL&lt;br /&gt;
* Encoding http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== Acknowledgements ==&lt;br /&gt;
&lt;br /&gt;
* Alan Chaney&lt;br /&gt;
* Ben Noordhuis&lt;br /&gt;
* Glenn Maynard&lt;br /&gt;
* John Tamplin&lt;br /&gt;
* Kenneth Russell (Google, Inc)&lt;br /&gt;
* Robert Mustacchi&lt;br /&gt;
* Ryan Dahl&lt;br /&gt;
* Anne van Kesteren&lt;br /&gt;
* Cameron McCormack&lt;br /&gt;
&lt;br /&gt;
== Appendix ==&lt;br /&gt;
&lt;br /&gt;
A &amp;quot;shim&amp;quot; implementation in ECMAScript (that may not fully match the current version of the spec) plus some initial unit tests can be found at:&lt;br /&gt;
&lt;br /&gt;
:http://code.google.com/p/stringencoding/&lt;br /&gt;
&lt;br /&gt;
[[Category:Proposals]]&lt;/div&gt;</summary>
		<author><name>Jsbell</name></author>
	</entry>
	<entry>
		<id>https://wiki.whatwg.org/index.php?title=StringEncoding&amp;diff=8470</id>
		<title>StringEncoding</title>
		<link rel="alternate" type="text/html" href="https://wiki.whatwg.org/index.php?title=StringEncoding&amp;diff=8470"/>
		<updated>2012-08-13T16:07:46Z</updated>

		<summary type="html">&lt;p&gt;Jsbell: Clarify informative note.&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;Proposed Text Encoding Web API for Typed Arrays&lt;br /&gt;
&lt;br /&gt;
== Editors ==&lt;br /&gt;
&lt;br /&gt;
* Joshua Bell (Google, Inc)&lt;br /&gt;
&lt;br /&gt;
== Abstract ==&lt;br /&gt;
&lt;br /&gt;
This specification defines an API for encoding strings to binary data, and decoding strings from binary data.&lt;br /&gt;
&lt;br /&gt;
NOTE: This specification intentionally does not address the opposite scenario of encoding binary data as strings and decoding binary data from strings, for example using Base64 encoding.&lt;br /&gt;
&lt;br /&gt;
Discussion on this topic has so far taken place on the [http://www.khronos.org/webgl/public-mailing-list/ public_webgl@khronos.org] mailing list. See http://www.khronos.org/webgl/public-mailing-list/archives/1111/msg00017.html for the initial discussion thread.&lt;br /&gt;
&lt;br /&gt;
Discussion has since moved to the [http://www.whatwg.org/mailing-list#specs WHATWG spec discussion] mailing list. See  http://lists.whatwg.org/htdig.cgi/whatwg-whatwg.org/2012-March/035038.html for the latest discussion thread.&lt;br /&gt;
&lt;br /&gt;
== Open Issues ==&lt;br /&gt;
&lt;br /&gt;
* The [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding specification] defines the byte order mark as more authoritative than an explicit encoding label when decoding content. &lt;br /&gt;
** Is that desirable here? &lt;br /&gt;
** What does it mean to e.g. write &amp;lt;code&amp;gt;TextDecoder(&#039;iso-2022-kr&#039;).decode(str)&amp;lt;/code&amp;gt; if string turns out to be UTF-8?&lt;br /&gt;
** What about streaming and other re-uses of the same decoder object?&lt;br /&gt;
** At the very least, a matching BOM should be ignored (i.e. not returned as part of the output) and a mismatching BOM should signal an error.&lt;br /&gt;
&lt;br /&gt;
* Should the &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; attribute return the &amp;lt;var&amp;gt;Name&amp;lt;/var&amp;gt; of the encoding or the name that was passed in?&lt;br /&gt;
&lt;br /&gt;
== Notes to Implementers ==&lt;br /&gt;
&lt;br /&gt;
* Streaming decode/encode requires retaining partial buffers between calls.&lt;br /&gt;
** Some encode/decode algorithms require adjusting the &amp;lt;var&amp;gt;code point pointer&amp;lt;/var&amp;gt; or &amp;lt;var&amp;gt;byte pointer&amp;lt;/var&amp;gt; by a negative amount. This could occur across &amp;quot;chunk&amp;quot; boundaries. This implies that when the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag is set on an encoder/decoder, the last N elements of the stream must be saved for the next call and used as a prefix for the stream. N is defined by the specific encoding algorithm.&lt;br /&gt;
** This is not yet implemented in the [http://code.google.com/p/stringencoding/ ECMAScript shim].&lt;br /&gt;
&lt;br /&gt;
== Resolved Issues ==&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dl&amp;gt;&lt;br /&gt;
&amp;lt;dt&amp;gt;Encode as many characters as possible into a fixed-size buffer for transmission, and repeat starting with next unencoded character&lt;br /&gt;
&amp;lt;dd&amp;gt;Resolution: not for &amp;quot;v1&amp;quot; - can be implemented using this API, and breaking the string down by &amp;quot;character&amp;quot; is unlikely to be as obvious as it sounds - surrogate pairs, combining sequences, etc&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dt&amp;gt;Support two versions of encode; one which takes target buffer, and one which creates/returns a right-sized buffer&lt;br /&gt;
&amp;lt;dd&amp;gt;Resolution: See above, and wait for developer feedback.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dt&amp;gt;Allow arbitrary end byte sequences (e.g. &amp;lt;code&amp;gt;0xFF&amp;lt;/code&amp;gt; for UTF-8 strings)&lt;br /&gt;
&amp;lt;dd&amp;gt;Resolution: Add &amp;lt;code&amp;gt;indexOf&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;ArrayBufferView&amp;lt;/code&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dt&amp;gt;Remove &#039;&#039;binary&#039;&#039; encoding? &#039;&#039;(a proposed 8-bit-clean encoding for interop with legacy binary data stored in ECMAScript strings)&#039;&#039;&lt;br /&gt;
&amp;lt;dd&amp;gt;The only real use is with &amp;lt;code&amp;gt;atob()&amp;lt;/code&amp;gt;/&amp;lt;code&amp;gt;btoa()&amp;lt;/code&amp;gt;; a better API would be Base64 directly in/out of Typed Arrays. Consensus on WHATWG is to add better APIs e.g. &lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
partial interface ArrayBufferView {&lt;br /&gt;
    DOMString toBase64();&lt;br /&gt;
};&lt;br /&gt;
&lt;br /&gt;
partial interface ArrayBuffer {&lt;br /&gt;
    static ArrayBuffer fromBase64(DOMString string);&lt;br /&gt;
};&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dt&amp;gt;Should this be a standalone API (as written), or live on e.g. DataView, or on e.g. String?&lt;br /&gt;
&amp;lt;dd&amp;gt;There seems to be pretty strong consensus that streaming, stateful coding is high priority and that an object-oriented API is cleanest, and shoe-horning those onto existing objects would be messy.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dt&amp;gt;Should legacy encodings be supported?&lt;br /&gt;
&amp;lt;dd&amp;gt;Resolution: Consensus on the [http://lists.whatwg.org/htdig.cgi/whatwg-whatwg.org/2012-August/036825.html WHATWG mailing list] - support legacy encodings for decode, and only &#039;utf-8&#039;, &#039;utf-16&#039; and &#039;utf-16be&#039; for encode.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dt&amp;gt;What to do on encoding errors? If non-UTF encodings are supported then we may want to allow substitution (e.g. ASCII &#039;?&#039;) or a script callback (for arbitrary escaping).&lt;br /&gt;
&amp;lt;dd&amp;gt;Resolution: not for &amp;quot;v1&amp;quot; (see above)&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/dl&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== API ==&lt;br /&gt;
&lt;br /&gt;
=== TextEncoder ===&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;WebIDL&#039;&#039;&#039;&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
dictionary TextEncodeOptions {&lt;br /&gt;
  boolean stream = false;&lt;br /&gt;
};&lt;br /&gt;
&lt;br /&gt;
[Constructor(optional DOMString encoding)]&lt;br /&gt;
interface TextEncoder {&lt;br /&gt;
  readonly attribute DOMString encoding;&lt;br /&gt;
  Uint8Array encode(DOMString? string, optional TextEncodeOptions options);&lt;br /&gt;
};&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The constructor runs the following steps:&lt;br /&gt;
# If the constructor is called with no arguments, let &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt; be &amp;quot;&amp;lt;code&amp;gt;utf-8&amp;lt;/code&amp;gt;&amp;quot;. Otherwise, let &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt; be the value of the &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; argument.&lt;br /&gt;
# Run the &#039;&#039;&#039;steps to get an encoding&#039;&#039;&#039; from [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding], with &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt; as &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt;.&lt;br /&gt;
#* If the steps result in failure, throw an &amp;quot;&amp;lt;code&amp;gt;EncodingError&amp;lt;/code&amp;gt;&amp;quot; exception and terminate these steps.&lt;br /&gt;
#* Otherwise, if the &amp;lt;var&amp;gt;Name&amp;lt;/var&amp;gt; of the returned encoding is not one of &amp;quot;&amp;lt;code&amp;gt;utf-8&amp;lt;/code&amp;gt;&amp;quot;, &amp;quot;&amp;lt;code&amp;gt;utf-16&amp;lt;/code&amp;gt;&amp;quot;, or &amp;quot;&amp;lt;code&amp;gt;utf-16be&amp;lt;/code&amp;gt;&amp;quot; throw an &amp;quot;&amp;lt;code&amp;gt;EncodingError&amp;lt;/code&amp;gt;&amp;quot; exception and terminate these steps.&lt;br /&gt;
#* Otherwise, set the encoder object&#039;s internal &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; property to the returned encoding.&lt;br /&gt;
# Initialize the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag of the encoder object to false. &lt;br /&gt;
# Initialize the internal &amp;lt;var&amp;gt;encoding algorithm state&amp;lt;/var&amp;gt; to the default values for the encoding &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dl&amp;gt;&lt;br /&gt;
&amp;lt;dt&amp;gt;&amp;lt;code&amp;gt;encoding&amp;lt;/code&amp;gt; of type DOMString, readonly&lt;br /&gt;
&amp;lt;dd&amp;gt;Returns the &amp;lt;var&amp;gt;Name&amp;lt;/var&amp;gt; of the encoder object&#039;s &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;, per [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding]. &lt;br /&gt;
&lt;br /&gt;
:Note that this may differ from the name of the encoding specified during the call to the constructor. For example, if the constructor is called with &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; of &amp;lt;code&amp;gt;&amp;quot;utf8&amp;quot;&amp;lt;/code&amp;gt; the &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; attribute of the &#039;&#039;encoder object&#039;&#039; would have the value &amp;lt;code&amp;gt;&amp;quot;utf-8&amp;quot;&amp;lt;/code&amp;gt; as &amp;lt;code&amp;gt;&amp;quot;utf8&amp;quot;&amp;lt;/code&amp;gt; is a label for that encoding.&lt;br /&gt;
&amp;lt;/dl&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dl&amp;gt;&lt;br /&gt;
&amp;lt;dt&amp;gt;&amp;lt;code&amp;gt;encode&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;dd&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The &amp;lt;code&amp;gt;encode&amp;lt;/code&amp;gt; method runs these steps:&lt;br /&gt;
&lt;br /&gt;
# If the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag is not set, then reset the &amp;lt;var&amp;gt;encoding algorithm state&amp;lt;/var&amp;gt; to the default values for &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;. Otherwise, the &amp;lt;var&amp;gt;encoding algorithm state&amp;lt;/var&amp;gt; is re-used from the previous call to &amp;lt;code&amp;gt;encode&amp;lt;/code&amp;gt; on this object.&lt;br /&gt;
# If the &amp;lt;var&amp;gt;options&amp;lt;/var&amp;gt; parameter is specified and the &amp;lt;code&amp;gt;stream&amp;lt;/code&amp;gt; option is &#039;&#039;&#039;true&#039;&#039;&#039;, then the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag is set; otherwise the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag is cleared.&lt;br /&gt;
# Run the steps of the &amp;lt;var&amp;gt;encoding algorithm&amp;lt;/var&amp;gt;:&lt;br /&gt;
#* The input to the algorithm is a &amp;lt;var&amp;gt;stream of code points&amp;lt;/var&amp;gt;. The stream is composed of the Unicode code points for the Unicode characters produced by following the [http://dev.w3.org/2006/webapi/WebIDL/#dfn-obtain-unicode steps to convert a DOMString to a sequence of Unicode characters] in [http://dev.w3.org/2006/webapi/WebIDL/ WebIDL] with &amp;lt;var&amp;gt;string&amp;lt;/var&amp;gt; as the input. If &amp;lt;var&amp;gt;string&amp;lt;/var&amp;gt; is null, the &amp;lt;var&amp;gt;stream of code points&amp;lt;/var&amp;gt; is empty.&lt;br /&gt;
#* If the &amp;lt;var&amp;gt;options&amp;lt;/var&amp;gt; parameter is not specified or the &amp;lt;code&amp;gt;stream&amp;lt;/code&amp;gt; option is false, then the &#039;&#039;&#039;EOF code point&#039;&#039;&#039; is yielded after the final code point is yielded by the stream.&lt;br /&gt;
#* The output of the algorithm is a &amp;lt;var&amp;gt;sequence of emitted bytes&amp;lt;/var&amp;gt;.&lt;br /&gt;
# Return a &amp;lt;code&amp;gt;Uint8Array&amp;lt;/code&amp;gt; object wrapping an &amp;lt;code&amp;gt;ArrayBuffer&amp;lt;/code&amp;gt; containing the &amp;lt;var&amp;gt;sequence of emitted bytes&amp;lt;/var&amp;gt; produced by the encoding algorithm.&lt;br /&gt;
&lt;br /&gt;
:&#039;&#039;NOTE: Because only UTF encodings are supported, and because of the use of the [http://dev.w3.org/2006/webapi/WebIDL/#dfn-obtain-unicode steps to convert a DOMString to a sequence of Unicode characters] and thus Unicode code points, no input can cause the encoding process to emit an encoder error.&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/dl&amp;gt;&lt;br /&gt;
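&lt;br /&gt;
As an illustration of the &amp;lt;code&amp;gt;encode&amp;lt;/code&amp;gt; steps above, the following sketch assumes the &amp;lt;code&amp;gt;TextEncoder&amp;lt;/code&amp;gt; global that later shipped in browsers and Node.js. That shipped interface is UTF-8 only and may differ from this draft, so no encoding label or &amp;lt;code&amp;gt;stream&amp;lt;/code&amp;gt; option is passed; the variable names are illustrative:&lt;br /&gt;
&lt;br /&gt;
```javascript
// Sketch only: uses the TextEncoder global as shipped, which always encodes to UTF-8.
const encoder = new TextEncoder();

// U+20AC EURO SIGN occupies three bytes in UTF-8.
const euroBytes = encoder.encode("\u20AC");

// The result is a Uint8Array holding the emitted bytes; show them as hex.
const euroHex = Array.from(euroBytes, (b) => b.toString(16)).join(" ");
console.log(encoder.encoding, euroHex); // prints "utf-8 e2 82 ac"
```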
&lt;br /&gt;
=== TextDecoder ===&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;WebIDL&#039;&#039;&#039;&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
dictionary TextDecoderOptions {&lt;br /&gt;
  boolean fatal = false;&lt;br /&gt;
};&lt;br /&gt;
&lt;br /&gt;
dictionary TextDecodeOptions {&lt;br /&gt;
  boolean stream = false;&lt;br /&gt;
};&lt;br /&gt;
&lt;br /&gt;
[Constructor(optional DOMString encoding, optional TextDecoderOptions options)]&lt;br /&gt;
interface TextDecoder {&lt;br /&gt;
  readonly attribute DOMString encoding;&lt;br /&gt;
  DOMString decode(optional ArrayBufferView view, optional TextDecodeOptions options);&lt;br /&gt;
};&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The constructor runs the following steps:&lt;br /&gt;
# The constructor creates a &#039;&#039;decoder object&#039;&#039;. It has the internal properties &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;, a &amp;lt;var&amp;gt;fatal&amp;lt;/var&amp;gt; flag which is initially unset, and a &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag which is initially unset.&lt;br /&gt;
# If called without an &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; argument, let &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt; be &amp;quot;&amp;lt;code&amp;gt;utf-8&amp;lt;/code&amp;gt;&amp;quot;. Otherwise, let &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt; be the value of the &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; argument.&lt;br /&gt;
# Run the &#039;&#039;&#039;steps to get an encoding&#039;&#039;&#039; from [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding], with &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt; as &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt;. &lt;br /&gt;
## If the steps result in failure, throw an &amp;quot;&amp;lt;code&amp;gt;EncodingError&amp;lt;/code&amp;gt;&amp;quot; exception and terminate these steps.&lt;br /&gt;
## Otherwise, set the &#039;&#039;decoder object&#039;s&#039;&#039; internal &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; property to the returned encoding.&lt;br /&gt;
# If the constructor is called with an &amp;lt;var&amp;gt;options&amp;lt;/var&amp;gt; argument, and the &amp;lt;var&amp;gt;fatal&amp;lt;/var&amp;gt; property of the dictionary is set, set the internal &amp;lt;var&amp;gt;fatal&amp;lt;/var&amp;gt; flag of the &#039;&#039;decoder object&#039;&#039;.&lt;br /&gt;
# Initialize the internal &amp;lt;var&amp;gt;encoding algorithm state&amp;lt;/var&amp;gt; to the default values for the encoding &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;.&lt;br /&gt;
# Return the &#039;&#039;decoder object&#039;&#039;.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dl&amp;gt;&lt;br /&gt;
&amp;lt;dt&amp;gt;&amp;lt;code&amp;gt;encoding&amp;lt;/code&amp;gt; of type DOMString, readonly&lt;br /&gt;
&amp;lt;dd&amp;gt;&lt;br /&gt;
Returns the &amp;lt;var&amp;gt;Name&amp;lt;/var&amp;gt; of the decoder object&#039;s &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;, per [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding].&lt;br /&gt;
&lt;br /&gt;
:Note that this may differ from the name of the encoding specified during the call to the constructor. For example, if the constructor is called with &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; of &amp;lt;code&amp;gt;&amp;quot;ascii&amp;quot;&amp;lt;/code&amp;gt; the &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; attribute of the &#039;&#039;decoder object&#039;&#039; would have the value &amp;lt;code&amp;gt;&amp;quot;windows-1252&amp;quot;&amp;lt;/code&amp;gt; as &amp;lt;code&amp;gt;&amp;quot;ascii&amp;quot;&amp;lt;/code&amp;gt; is a label for that encoding.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dt&amp;gt;&amp;lt;code&amp;gt;decode&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;dd&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The &amp;lt;code&amp;gt;decode&amp;lt;/code&amp;gt; method runs the following steps:&lt;br /&gt;
&lt;br /&gt;
# If the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag of the &#039;&#039;decoder object&#039;&#039; is not set, then reset the &amp;lt;var&amp;gt;encoding algorithm state&amp;lt;/var&amp;gt; to the default values for encoding &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;. Otherwise, the &amp;lt;var&amp;gt;encoding algorithm state&amp;lt;/var&amp;gt; is re-used from the previous call to &amp;lt;code&amp;gt;decode&amp;lt;/code&amp;gt; on this object. &lt;br /&gt;
# If the &amp;lt;var&amp;gt;options&amp;lt;/var&amp;gt; parameter is specified and the &amp;lt;var&amp;gt;stream&amp;lt;/var&amp;gt; option is &#039;&#039;&#039;true&#039;&#039;&#039;, then the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag is set; otherwise the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag is cleared.&lt;br /&gt;
# Run the decoder algorithm of the &#039;&#039;decoder object&#039;s&#039;&#039; encoding:&lt;br /&gt;
#* The input to the algorithm is a &amp;lt;var&amp;gt;byte stream&amp;lt;/var&amp;gt;. The &amp;lt;var&amp;gt;byte stream&amp;lt;/var&amp;gt; is provided by the bytes in &amp;lt;var&amp;gt;view.buffer&amp;lt;/var&amp;gt; starting at offset &amp;lt;var&amp;gt;view.byteOffset&amp;lt;/var&amp;gt;. A maximum of &amp;lt;var&amp;gt;view.byteLength&amp;lt;/var&amp;gt; bytes are yielded by the stream from &amp;lt;var&amp;gt;view.buffer&amp;lt;/var&amp;gt;. If &amp;lt;var&amp;gt;view&amp;lt;/var&amp;gt; is not specified, the stream is empty.&lt;br /&gt;
#* If the &amp;lt;var&amp;gt;options&amp;lt;/var&amp;gt; parameter is not specified or the &#039;&#039;&#039;stream&#039;&#039;&#039; option is &#039;&#039;&#039;false&#039;&#039;&#039;, then the &#039;&#039;&#039;EOF byte&#039;&#039;&#039; is yielded after &amp;lt;var&amp;gt;view.byteLength&amp;lt;/var&amp;gt; bytes are yielded by the stream.&lt;br /&gt;
#* If the internal &amp;lt;var&amp;gt;fatal&amp;lt;/var&amp;gt; flag of the &#039;&#039;decoder object&#039;&#039; is set, then a &#039;&#039;&#039;decoder error&#039;&#039;&#039; causes a &amp;lt;code&amp;gt;DOMException&amp;lt;/code&amp;gt; of type &amp;lt;code&amp;gt;EncodingError&amp;lt;/code&amp;gt; to be thrown rather than emitting a fallback code point.&lt;br /&gt;
#* The output of the algorithm is a &amp;lt;var&amp;gt;sequence of emitted code points&amp;lt;/var&amp;gt;.&lt;br /&gt;
# Return an IDL &amp;lt;code&amp;gt;DOMString&amp;lt;/code&amp;gt; value that represents the sequence of code units resulting from encoding the &amp;lt;var&amp;gt;sequence of emitted code points&amp;lt;/var&amp;gt; as UTF-16.&lt;br /&gt;
&lt;br /&gt;
:&#039;&#039;ISSUE: Need to handle BOMs. [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding] specifies this by using the caller-specified encoding as a suggestion, and consuming the BOM as a part of selecting the real encoder where BOM takes precedence. At the very least a matching BOM should be ignored. A mismatching BOM could throw, be a decoding error, or could actually switch the decoder for this stream (call or sequence of calls), possibly if-and-only-if the constructor was called without an encoding.&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/dl&amp;gt;&lt;br /&gt;
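&lt;br /&gt;
The streaming and fatal behaviour described above can be sketched as follows. This assumes the &amp;lt;code&amp;gt;TextDecoder&amp;lt;/code&amp;gt; global that later shipped in browsers and Node.js, which may differ from this draft; in particular the shipped interface reports a fatal decoder error as a &amp;lt;code&amp;gt;TypeError&amp;lt;/code&amp;gt; rather than the &amp;lt;code&amp;gt;EncodingError&amp;lt;/code&amp;gt; described here. The variable names are illustrative:&lt;br /&gt;
&lt;br /&gt;
```javascript
// Sketch only: relies on the TextDecoder global as shipped.
const decoder = new TextDecoder("utf-8");

// U+20AC EURO SIGN is E2 82 AC in UTF-8; split it across two chunks.
const firstChunk = new Uint8Array([0xe2, 0x82]);
const secondChunk = new Uint8Array([0xac]);

// With stream set, the incomplete sequence is held back instead of being
// treated as a decoder error.
const partial = decoder.decode(firstChunk, { stream: true });
// The last call omits the stream option, so the end of input is signalled
// and the pending bytes are completed.
const rest = decoder.decode(secondChunk);
const streamed = partial + rest;

// A decoder constructed with the fatal flag throws on malformed input
// instead of emitting the U+FFFD fallback code point.
const strict = new TextDecoder("utf-8", { fatal: true });
let strictThrew = false;
try {
  strict.decode(new Uint8Array([0xff]));
} catch (e) {
  strictThrew = true;
}
console.log(JSON.stringify(partial), streamed === "\u20AC", strictThrew);
```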
&lt;br /&gt;
== Examples ==&lt;br /&gt;
&lt;br /&gt;
=== Example #1 - encoding strings ===&lt;br /&gt;
&lt;br /&gt;
The following example uses the API to encode an array of strings into an ArrayBuffer. The resulting buffer contains the number of strings (as a Uint32), followed by the length of the first string (as a Uint32), the UTF-8 encoded string data, the length of the second string (as a Uint32), the string data, and so on.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
function encodeArrayOfStrings(strings, encoding) {&lt;br /&gt;
  var encoder, encoded, len, i, bytes, view, offset;&lt;br /&gt;
&lt;br /&gt;
  encoder = new TextEncoder(encoding);&lt;br /&gt;
  encoded = [];&lt;br /&gt;
&lt;br /&gt;
  len = Uint32Array.BYTES_PER_ELEMENT;&lt;br /&gt;
  for (i = 0; i &amp;lt; strings.length; i += 1) {&lt;br /&gt;
    len += Uint32Array.BYTES_PER_ELEMENT;&lt;br /&gt;
    encoded[i] = encoder.encode(strings[i]);&lt;br /&gt;
    len += encoded[i].byteLength;&lt;br /&gt;
  }&lt;br /&gt;
&lt;br /&gt;
  bytes = new Uint8Array(len);&lt;br /&gt;
  view = new DataView(bytes.buffer);&lt;br /&gt;
  offset = 0;&lt;br /&gt;
&lt;br /&gt;
  view.setUint32(offset, strings.length);&lt;br /&gt;
  offset += Uint32Array.BYTES_PER_ELEMENT;&lt;br /&gt;
  for (i = 0; i &amp;lt; encoded.length; i += 1) {&lt;br /&gt;
    len = encoded[i].byteLength;&lt;br /&gt;
    view.setUint32(offset, len);&lt;br /&gt;
    offset += Uint32Array.BYTES_PER_ELEMENT;&lt;br /&gt;
    bytes.set(encoded[i], offset);&lt;br /&gt;
    offset += len;&lt;br /&gt;
  }&lt;br /&gt;
  return bytes.buffer;&lt;br /&gt;
}&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Example #2 - decoding strings ===&lt;br /&gt;
&lt;br /&gt;
The following example decodes an ArrayBuffer containing data encoded in the format produced by the previous example back into an array of strings.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
function decodeArrayOfStrings(buffer, encoding) {&lt;br /&gt;
  var decoder, view, offset, num_strings, strings, i, len;&lt;br /&gt;
&lt;br /&gt;
  decoder = new TextDecoder(encoding);&lt;br /&gt;
  view = new DataView(buffer);&lt;br /&gt;
  offset = 0;&lt;br /&gt;
  strings = [];&lt;br /&gt;
&lt;br /&gt;
  num_strings = view.getUint32(offset);&lt;br /&gt;
  offset += Uint32Array.BYTES_PER_ELEMENT;&lt;br /&gt;
  for (i = 0; i &amp;lt; num_strings; i += 1) {&lt;br /&gt;
    len = view.getUint32(offset);&lt;br /&gt;
    offset += Uint32Array.BYTES_PER_ELEMENT;&lt;br /&gt;
    strings[i] = decoder.decode(&lt;br /&gt;
      new DataView(view.buffer, offset, len));&lt;br /&gt;
    offset += len;&lt;br /&gt;
  }&lt;br /&gt;
  return strings;&lt;br /&gt;
}&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Encodings ==&lt;br /&gt;
&lt;br /&gt;
Encodings are defined and implemented per [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding]. This implicitly includes the steps to &amp;lt;em&amp;gt;get an encoding&amp;lt;/em&amp;gt; from a string, and the logic for label matching and case-insensitivity.&lt;br /&gt;
&lt;br /&gt;
User agents MUST NOT support any other encodings or labels than those defined in [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding], and MUST support all encodings and labels defined in that specification, with the additions defined below.&lt;br /&gt;
&lt;br /&gt;
:&#039;&#039;NOTE: In [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding], &amp;quot;ascii&amp;quot; is a label for &#039;&#039;&#039;windows-1252&#039;&#039;&#039;; there is no 7-bit-or-raise-exception encoding. Applications that are required to restrict the content of decoded strings should implement validation after decoding.&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
:&#039;&#039;NOTE: Unicode normalization forms are outside the scope of this specification. No normalization is done prior to encoding or after decoding.&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
:&#039;&#039;NOTE: Handling of encoding-specific issues, e.g. over-long UTF-8 encodings, byte order marks, unmatched surrogate pairs, and so on is defined by [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding].&#039;&#039;&lt;br /&gt;
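&lt;br /&gt;
Because there is no 7-bit-or-raise-exception encoding, an application that must restrict decoded content has to validate after decoding. A minimal sketch of that approach follows; the helper name &amp;lt;code&amp;gt;decodeAsciiStrict&amp;lt;/code&amp;gt; is hypothetical, and decoding as &amp;lt;code&amp;gt;utf-8&amp;lt;/code&amp;gt; with the later shipped &amp;lt;code&amp;gt;TextDecoder&amp;lt;/code&amp;gt; global is an assumption made to keep the example portable:&lt;br /&gt;
&lt;br /&gt;
```javascript
// Hypothetical helper, not part of the proposal: decode first, then reject
// any result that contains characters outside the 7-bit ASCII range.
function decodeAsciiStrict(bytes) {
  const text = new TextDecoder("utf-8").decode(bytes);
  // Bytes above 0x7F decode either to non-ASCII characters or to U+FFFD,
  // so checking the decoded string is sufficient.
  if (!/^[\x00-\x7f]*$/.test(text)) {
    throw new RangeError("decoded text contains non-ASCII characters");
  }
  return text;
}

const asciiText = decodeAsciiStrict(new Uint8Array([104, 105]));
console.log(asciiText); // prints "hi"
```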
&lt;br /&gt;
== References ==&lt;br /&gt;
&lt;br /&gt;
* WebIDL http://dev.w3.org/2006/webapi/WebIDL&lt;br /&gt;
* Encoding http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== Acknowledgements ==&lt;br /&gt;
&lt;br /&gt;
* Alan Chaney&lt;br /&gt;
* Ben Noordhuis&lt;br /&gt;
* Glenn Maynard&lt;br /&gt;
* John Tamplin&lt;br /&gt;
* Kenneth Russell (Google, Inc)&lt;br /&gt;
* Robert Mustacchi&lt;br /&gt;
* Ryan Dahl&lt;br /&gt;
* Anne van Kesteren&lt;br /&gt;
* Cameron McCormack&lt;br /&gt;
&lt;br /&gt;
== Appendix ==&lt;br /&gt;
&lt;br /&gt;
A &amp;quot;shim&amp;quot; implementation in ECMAScript (that may not fully match the current version of the spec) plus some initial unit tests can be found at:&lt;br /&gt;
&lt;br /&gt;
:http://code.google.com/p/stringencoding/&lt;br /&gt;
&lt;br /&gt;
[[Category:Proposals]]&lt;/div&gt;</summary>
		<author><name>Jsbell</name></author>
	</entry>
	<entry>
		<id>https://wiki.whatwg.org/index.php?title=StringEncoding&amp;diff=8469</id>
		<title>StringEncoding</title>
		<link rel="alternate" type="text/html" href="https://wiki.whatwg.org/index.php?title=StringEncoding&amp;diff=8469"/>
		<updated>2012-08-09T17:33:52Z</updated>

		<summary type="html">&lt;p&gt;Jsbell: Restrict encodings to UTF&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;Proposed Text Encoding Web API for Typed Arrays&lt;br /&gt;
&lt;br /&gt;
== Editors ==&lt;br /&gt;
&lt;br /&gt;
* Joshua Bell (Google, Inc)&lt;br /&gt;
&lt;br /&gt;
== Abstract ==&lt;br /&gt;
&lt;br /&gt;
This specification defines an API for encoding strings to binary data, and decoding strings from binary data.&lt;br /&gt;
&lt;br /&gt;
NOTE: This specification intentionally does not address the opposite scenario of encoding binary data as strings and decoding binary data from strings, for example using Base64 encoding.&lt;br /&gt;
&lt;br /&gt;
Discussion on this topic has so far taken place on the [http://www.khronos.org/webgl/public-mailing-list/ public_webgl@khronos.org] mailing list. See http://www.khronos.org/webgl/public-mailing-list/archives/1111/msg00017.html for the initial discussion thread.&lt;br /&gt;
&lt;br /&gt;
Discussion has since moved to the [http://www.whatwg.org/mailing-list#specs WHATWG spec discussion] mailing list. See  http://lists.whatwg.org/htdig.cgi/whatwg-whatwg.org/2012-March/035038.html for the latest discussion thread.&lt;br /&gt;
&lt;br /&gt;
== Open Issues ==&lt;br /&gt;
&lt;br /&gt;
* The [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding specification] defines the byte order mark as more authoritative than an explicit encoding label when decoding content. &lt;br /&gt;
** Is that desirable here? &lt;br /&gt;
** What does it mean to e.g. write &amp;lt;code&amp;gt;TextDecoder(&#039;iso-2022-kr&#039;).decode(str)&amp;lt;/code&amp;gt; if the input turns out to be UTF-8?&lt;br /&gt;
** What about streaming and other re-uses of the same decoder object?&lt;br /&gt;
** At the very least, a matching BOM should be ignored (i.e. not returned as part of the output) and a mismatching BOM should signal an error.&lt;br /&gt;
&lt;br /&gt;
* Should the &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; attribute return the &amp;lt;var&amp;gt;Name&amp;lt;/var&amp;gt; of the encoding or the name that was passed in?&lt;br /&gt;
&lt;br /&gt;
== Notes to Implementers ==&lt;br /&gt;
&lt;br /&gt;
* Streaming decode/encode requires retaining partial buffers between calls.&lt;br /&gt;
** Some encode/decode algorithms require adjusting the &amp;lt;var&amp;gt;code point pointer&amp;lt;/var&amp;gt; or &amp;lt;var&amp;gt;byte pointer&amp;lt;/var&amp;gt; by a negative amount. This could occur across &amp;quot;chunk&amp;quot; boundaries. This implies that when the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag is set on an encoder/decoder that the last N elements of the stream are saved for the next call and used as a prefix for the stream. N is defined by the specific encoding algorithm.&lt;br /&gt;
** This is not yet implemented in the [http://code.google.com/p/stringencoding/ ECMAScript shim]&lt;br /&gt;
&lt;br /&gt;
== Resolved Issues ==&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dl&amp;gt;&lt;br /&gt;
&amp;lt;dt&amp;gt;Encode as many characters as possible into a fixed-size buffer for transmission, and repeat starting with next unencoded character&lt;br /&gt;
&amp;lt;dd&amp;gt;Resolution: not for &amp;quot;v1&amp;quot; - can be implemented using this API, and breaking the string down by &amp;quot;character&amp;quot; is unlikely to be as obvious as it sounds - surrogate pairs, combining sequences, etc&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dt&amp;gt;Support two versions of encode; one which takes target buffer, and one which creates/returns a right-sized buffer&lt;br /&gt;
&amp;lt;dd&amp;gt;Resolution: See above, and wait for developer feedback.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dt&amp;gt;Allow arbitrary end byte sequences (e.g. &amp;lt;code&amp;gt;0xFF&amp;lt;/code&amp;gt; for UTF-8 strings)&lt;br /&gt;
&amp;lt;dd&amp;gt;Resolution: Add &amp;lt;code&amp;gt;indexOf&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;ArrayBufferView&amp;lt;/code&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dt&amp;gt;Remove &#039;&#039;binary&#039;&#039; encoding? &#039;&#039;(a proposed 8-bit-clean encoding for interop with legacy binary data stored in ECMAScript strings)&#039;&#039;&lt;br /&gt;
&amp;lt;dd&amp;gt;The only real use is with &amp;lt;code&amp;gt;atob()&amp;lt;/code&amp;gt;/&amp;lt;code&amp;gt;btoa()&amp;lt;/code&amp;gt;; a better API would be Base64 directly in/out of Typed Arrays. Consensus on WHATWG is to add better APIs e.g. &lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
partial interface ArrayBufferView {&lt;br /&gt;
    DOMString toBase64();&lt;br /&gt;
};&lt;br /&gt;
&lt;br /&gt;
partial interface ArrayBuffer {&lt;br /&gt;
    static ArrayBuffer fromBase64(DOMString string);&lt;br /&gt;
};&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dt&amp;gt;Should this be a standalone API (as written), or live on e.g. DataView, or on e.g. String?&lt;br /&gt;
&amp;lt;dd&amp;gt;There seems to be pretty strong consensus that streaming, stateful coding is high priority and that an object-oriented API is cleanest, and shoe-horning those onto existing objects would be messy.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dt&amp;gt;Should legacy encodings be supported?&lt;br /&gt;
&amp;lt;dd&amp;gt;Resolution: Consensus on the [http://lists.whatwg.org/htdig.cgi/whatwg-whatwg.org/2012-August/036825.html WHATWG mailing list] - support legacy encodings for decode, and only &#039;utf-8&#039;, &#039;utf-16&#039; and &#039;utf-16be&#039; for encode.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dt&amp;gt;What to do on encoding errors? If non-UTF encodings are supported then we may want to allow substitution (e.g. ASCII &#039;?&#039;) or a script callback (for arbitrary escaping).&lt;br /&gt;
&amp;lt;dd&amp;gt;Resolution: not for &amp;quot;v1&amp;quot; (see above)&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/dl&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== API ==&lt;br /&gt;
&lt;br /&gt;
=== TextEncoder ===&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;WebIDL&#039;&#039;&#039;&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
dictionary TextEncodeOptions {&lt;br /&gt;
  boolean stream = false;&lt;br /&gt;
};&lt;br /&gt;
&lt;br /&gt;
[Constructor(optional DOMString encoding)]&lt;br /&gt;
interface TextEncoder {&lt;br /&gt;
  readonly attribute DOMString encoding;&lt;br /&gt;
  Uint8Array encode(DOMString? string, optional TextEncodeOptions options);&lt;br /&gt;
};&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The constructor runs the following steps:&lt;br /&gt;
# If the constructor is called with no arguments, let &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt; be &amp;quot;&amp;lt;code&amp;gt;utf-8&amp;lt;/code&amp;gt;&amp;quot;. Otherwise, let &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt; be the value of the &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; argument.&lt;br /&gt;
# Run the &#039;&#039;&#039;steps to get an encoding&#039;&#039;&#039; from [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding], with &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt; as &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt;.&lt;br /&gt;
#* If the steps result in failure, throw an &amp;quot;&amp;lt;code&amp;gt;EncodingError&amp;lt;/code&amp;gt;&amp;quot; exception and terminate these steps.&lt;br /&gt;
#* Otherwise, if the &amp;lt;var&amp;gt;Name&amp;lt;/var&amp;gt; of the returned encoding is not one of &amp;quot;&amp;lt;code&amp;gt;utf-8&amp;lt;/code&amp;gt;&amp;quot;, &amp;quot;&amp;lt;code&amp;gt;utf-16&amp;lt;/code&amp;gt;&amp;quot;, or &amp;quot;&amp;lt;code&amp;gt;utf-16be&amp;lt;/code&amp;gt;&amp;quot; throw an &amp;quot;&amp;lt;code&amp;gt;EncodingError&amp;lt;/code&amp;gt;&amp;quot; exception and terminate these steps.&lt;br /&gt;
#* Otherwise, set the encoder object&#039;s internal &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; property to the returned encoding.&lt;br /&gt;
# Initialize the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag of the encoder object to false. &lt;br /&gt;
# Initialize the internal &amp;lt;var&amp;gt;encoding algorithm state&amp;lt;/var&amp;gt; to the default values for the encoding &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dl&amp;gt;&lt;br /&gt;
&amp;lt;dt&amp;gt;&amp;lt;code&amp;gt;encoding&amp;lt;/code&amp;gt; of type DOMString, readonly&lt;br /&gt;
&amp;lt;dd&amp;gt;Returns the &amp;lt;var&amp;gt;Name&amp;lt;/var&amp;gt; of the encoder object&#039;s &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;, per [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding]. &lt;br /&gt;
&lt;br /&gt;
:Note that this may differ from the name of the encoding specified during the call to the constructor. For example, if the constructor is called with &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; of &amp;lt;code&amp;gt;&amp;quot;utf8&amp;quot;&amp;lt;/code&amp;gt; the &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; attribute of the &#039;&#039;encoder object&#039;&#039; would have the value &amp;lt;code&amp;gt;&amp;quot;utf-8&amp;quot;&amp;lt;/code&amp;gt; as &amp;lt;code&amp;gt;&amp;quot;utf8&amp;quot;&amp;lt;/code&amp;gt; is a label for that encoding.&lt;br /&gt;
&amp;lt;/dl&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dl&amp;gt;&lt;br /&gt;
&amp;lt;dt&amp;gt;&amp;lt;code&amp;gt;encode&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;dd&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The &amp;lt;code&amp;gt;encode&amp;lt;/code&amp;gt; method runs these steps:&lt;br /&gt;
&lt;br /&gt;
# If the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag is not set, then reset the &amp;lt;var&amp;gt;encoding algorithm state&amp;lt;/var&amp;gt; to the default values for &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;. Otherwise, the &amp;lt;var&amp;gt;encoding algorithm state&amp;lt;/var&amp;gt; is re-used from the previous call to &amp;lt;code&amp;gt;encode&amp;lt;/code&amp;gt; on this object.&lt;br /&gt;
# If the &amp;lt;var&amp;gt;options&amp;lt;/var&amp;gt; parameter is specified and the &amp;lt;code&amp;gt;stream&amp;lt;/code&amp;gt; option is &#039;&#039;&#039;true&#039;&#039;&#039;, then the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag is set; otherwise the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag is cleared.&lt;br /&gt;
# Run the steps of the &amp;lt;var&amp;gt;encoding algorithm&amp;lt;/var&amp;gt;:&lt;br /&gt;
#* The input to the algorithm is a &amp;lt;var&amp;gt;stream of code points&amp;lt;/var&amp;gt;. The stream is composed of the Unicode code points for the Unicode characters produced by following the [http://dev.w3.org/2006/webapi/WebIDL/#dfn-obtain-unicode steps to convert a DOMString to a sequence of Unicode characters] in [http://dev.w3.org/2006/webapi/WebIDL/ WebIDL] with &amp;lt;var&amp;gt;string&amp;lt;/var&amp;gt; as the input. If &amp;lt;var&amp;gt;string&amp;lt;/var&amp;gt; is null, the &amp;lt;var&amp;gt;stream of code points&amp;lt;/var&amp;gt; is empty.&lt;br /&gt;
#* If the &amp;lt;var&amp;gt;options&amp;lt;/var&amp;gt; parameter is not specified or the &amp;lt;code&amp;gt;stream&amp;lt;/code&amp;gt; option is false, then the &#039;&#039;&#039;EOF code point&#039;&#039;&#039; is yielded after the final code point is yielded by the stream.&lt;br /&gt;
#* The output of the algorithm is a &amp;lt;var&amp;gt;sequence of emitted bytes&amp;lt;/var&amp;gt;.&lt;br /&gt;
# Return a &amp;lt;code&amp;gt;Uint8Array&amp;lt;/code&amp;gt; object wrapping an &amp;lt;code&amp;gt;ArrayBuffer&amp;lt;/code&amp;gt; containing the &amp;lt;var&amp;gt;sequence of emitted bytes&amp;lt;/var&amp;gt; produced by the encoding algorithm.&lt;br /&gt;
&lt;br /&gt;
:&#039;&#039;NOTE: Because only UTF encodings are supported, and because of the algorithm used to convert a DOMString to a sequence of Unicode characters, no input can cause the encoding process to emit an encoder error.&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/dl&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== TextDecoder ===&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;WebIDL&#039;&#039;&#039;&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
dictionary TextDecoderOptions {&lt;br /&gt;
  boolean fatal = false;&lt;br /&gt;
};&lt;br /&gt;
&lt;br /&gt;
dictionary TextDecodeOptions {&lt;br /&gt;
  boolean stream = false;&lt;br /&gt;
};&lt;br /&gt;
&lt;br /&gt;
[Constructor(optional DOMString encoding, optional TextDecoderOptions options)]&lt;br /&gt;
interface TextDecoder {&lt;br /&gt;
  readonly attribute DOMString encoding;&lt;br /&gt;
  DOMString decode(optional ArrayBufferView view, optional TextDecodeOptions options);&lt;br /&gt;
};&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The constructor runs the following steps:&lt;br /&gt;
# The constructor creates a &#039;&#039;decoder object&#039;&#039;. It has the internal properties &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;, a &amp;lt;var&amp;gt;fatal&amp;lt;/var&amp;gt; flag which is initially unset, and a &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag which is initially unset.&lt;br /&gt;
# If called without an &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; argument, let &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt; be &amp;quot;&amp;lt;code&amp;gt;utf-8&amp;lt;/code&amp;gt;&amp;quot;. Otherwise, let &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt; be the value of the &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; argument.&lt;br /&gt;
# Run the &#039;&#039;&#039;steps to get an encoding&#039;&#039;&#039; from [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding], with &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt; as &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt;. &lt;br /&gt;
## If the steps result in failure, throw an &amp;quot;&amp;lt;code&amp;gt;EncodingError&amp;lt;/code&amp;gt;&amp;quot; exception and terminate these steps.&lt;br /&gt;
## Otherwise, set the &#039;&#039;decoder object&#039;s&#039;&#039; internal &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; property to the returned encoding.&lt;br /&gt;
# If the constructor is called with an &amp;lt;var&amp;gt;options&amp;lt;/var&amp;gt; argument, and the &amp;lt;var&amp;gt;fatal&amp;lt;/var&amp;gt; property of the dictionary is set, set the internal &amp;lt;var&amp;gt;fatal&amp;lt;/var&amp;gt; flag of the &#039;&#039;decoder object&#039;&#039;.&lt;br /&gt;
# Initialize the internal &amp;lt;var&amp;gt;encoding algorithm state&amp;lt;/var&amp;gt; to the default values for the encoding &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;.&lt;br /&gt;
# Return the &#039;&#039;decoder object&#039;&#039;.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dl&amp;gt;&lt;br /&gt;
&amp;lt;dt&amp;gt;&amp;lt;code&amp;gt;encoding&amp;lt;/code&amp;gt; of type DOMString, readonly&lt;br /&gt;
&amp;lt;dd&amp;gt;&lt;br /&gt;
Returns the &amp;lt;var&amp;gt;Name&amp;lt;/var&amp;gt; of the decoder object&#039;s &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;, per [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding].&lt;br /&gt;
&lt;br /&gt;
:Note that this may differ from the name of the encoding specified during the call to the constructor. For example, if the constructor is called with &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; of &amp;lt;code&amp;gt;&amp;quot;ascii&amp;quot;&amp;lt;/code&amp;gt; the &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; attribute of the &#039;&#039;decoder object&#039;&#039; would have the value &amp;lt;code&amp;gt;&amp;quot;windows-1252&amp;quot;&amp;lt;/code&amp;gt; as &amp;lt;code&amp;gt;&amp;quot;ascii&amp;quot;&amp;lt;/code&amp;gt; is a label for that encoding.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dt&amp;gt;&amp;lt;code&amp;gt;decode&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;dd&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The &amp;lt;code&amp;gt;decode&amp;lt;/code&amp;gt; method runs the following steps:&lt;br /&gt;
&lt;br /&gt;
# If the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag of the &#039;&#039;decoder object&#039;&#039; is not set, then reset the &amp;lt;var&amp;gt;encoding algorithm state&amp;lt;/var&amp;gt; to the default values for encoding &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;. Otherwise, the &amp;lt;var&amp;gt;encoding algorithm state&amp;lt;/var&amp;gt; is re-used from the previous call to &amp;lt;code&amp;gt;decode&amp;lt;/code&amp;gt; on this object. &lt;br /&gt;
# If the &amp;lt;var&amp;gt;options&amp;lt;/var&amp;gt; parameter is specified and the &amp;lt;var&amp;gt;stream&amp;lt;/var&amp;gt; option is &#039;&#039;&#039;true&#039;&#039;&#039;, then the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag is set; otherwise the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag is cleared.&lt;br /&gt;
# Run the decoder algorithm of the &#039;&#039;decoder object&#039;s&#039;&#039; &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;:&lt;br /&gt;
#* The input to the algorithm is a &amp;lt;var&amp;gt;byte stream&amp;lt;/var&amp;gt;. The &amp;lt;var&amp;gt;byte stream&amp;lt;/var&amp;gt; is provided by the bytes in &amp;lt;var&amp;gt;view.buffer&amp;lt;/var&amp;gt; starting at offset &amp;lt;var&amp;gt;view.byteOffset&amp;lt;/var&amp;gt;. A maximum of &amp;lt;var&amp;gt;view.byteLength&amp;lt;/var&amp;gt; bytes are yielded by the stream from &amp;lt;var&amp;gt;view.buffer&amp;lt;/var&amp;gt;. If &amp;lt;var&amp;gt;view&amp;lt;/var&amp;gt; is not specified, the stream is empty.&lt;br /&gt;
#* If the &amp;lt;var&amp;gt;options&amp;lt;/var&amp;gt; parameter is not specified or the &#039;&#039;&#039;stream&#039;&#039;&#039; option is &#039;&#039;&#039;false&#039;&#039;&#039;, then after &amp;lt;var&amp;gt;view.byteLength&amp;lt;/var&amp;gt; bytes are yielded by the stream, the &#039;&#039;&#039;EOF byte&#039;&#039;&#039; is yielded.&lt;br /&gt;
#* If the internal &amp;lt;var&amp;gt;fatal&amp;lt;/var&amp;gt; flag of the &#039;&#039;decoder object&#039;&#039; is set, then a &#039;&#039;&#039;decoder error&#039;&#039;&#039; causes a &amp;lt;code&amp;gt;DOMException&amp;lt;/code&amp;gt; of type &amp;lt;code&amp;gt;EncodingError&amp;lt;/code&amp;gt; to be thrown rather than emitting a fallback code point.&lt;br /&gt;
#* The output of the algorithm is a &amp;lt;var&amp;gt;sequence of emitted code points&amp;lt;/var&amp;gt;.&lt;br /&gt;
# Return an IDL &amp;lt;code&amp;gt;DOMString&amp;lt;/code&amp;gt; value that represents the sequence of code units resulting from encoding the &amp;lt;var&amp;gt;sequence of emitted code points&amp;lt;/var&amp;gt; as UTF-16.&lt;br /&gt;
&lt;br /&gt;
:&#039;&#039;ISSUE: Need to handle BOMs. [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding] specifies this by using the caller-specified encoding as a suggestion, and consuming the BOM as a part of selecting the real encoder where BOM takes precedence. At the very least a matching BOM should be ignored. A mismatching BOM could throw, be a decoding error, or could actually switch the decoder for this stream (call or sequence of calls), possibly if-and-only-if the constructor was called without an encoding.&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/dl&amp;gt;&lt;br /&gt;
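&lt;br /&gt;
The interaction between the &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag and the EOF byte described in the steps above can be sketched as follows. This is a minimal illustration that assumes the &amp;lt;code&amp;gt;TextDecoder&amp;lt;/code&amp;gt; API as it later shipped in browsers and Node.js, not the exact draft on this page; the chunk boundaries are invented for the example.&lt;br /&gt;

```javascript
// Sketch: decode UTF-8 that arrives in chunks, using the shipped TextDecoder API.
// The chunk boundaries below are made up for illustration.
const decoder = new TextDecoder('utf-8', { fatal: true });

// U+20AC EURO SIGN is E2 82 AC in UTF-8; split it across two chunks.
const chunks = [
  new Uint8Array([0x41, 0xE2, 0x82]), // 'A' plus the first two bytes of the euro sign
  new Uint8Array([0xAC, 0x42])        // last byte of the euro sign plus 'B'
];

let text = '';
for (const chunk of chunks) {
  // stream: true keeps the incomplete sequence buffered for the next call.
  text += decoder.decode(chunk, { stream: true });
}
// A final call without stream: true flushes the decoder and signals end of input.
text += decoder.decode();
```

Because the first call is made with the stream option set, the two leading bytes of the euro sign are held back rather than reported as a decoder error.&lt;br /&gt;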
&lt;br /&gt;
== Examples ==&lt;br /&gt;
&lt;br /&gt;
=== Example #1 - encoding strings ===&lt;br /&gt;
&lt;br /&gt;
The following example uses the API to encode an array of strings into an ArrayBuffer. The result is an ArrayBuffer containing the number of strings (as a Uint32), followed by the length of the first string (as a Uint32), the UTF-8 encoded string data, the length of the second string (as a Uint32), the string data, and so on.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
function encodeArrayOfStrings(strings, encoding) {&lt;br /&gt;
  var encoder, encoded, len, i, bytes, view, offset;&lt;br /&gt;
&lt;br /&gt;
  encoder = new TextEncoder(encoding);&lt;br /&gt;
  encoded = [];&lt;br /&gt;
&lt;br /&gt;
  len = Uint32Array.BYTES_PER_ELEMENT;&lt;br /&gt;
  for (i = 0; i &amp;lt; strings.length; i += 1) {&lt;br /&gt;
    len += Uint32Array.BYTES_PER_ELEMENT;&lt;br /&gt;
    encoded[i] = encoder.encode(strings[i]);&lt;br /&gt;
    len += encoded[i].byteLength;&lt;br /&gt;
  }&lt;br /&gt;
&lt;br /&gt;
  bytes = new Uint8Array(len);&lt;br /&gt;
  view = new DataView(bytes.buffer);&lt;br /&gt;
  offset = 0;&lt;br /&gt;
&lt;br /&gt;
  view.setUint32(offset, strings.length);&lt;br /&gt;
  offset += Uint32Array.BYTES_PER_ELEMENT;&lt;br /&gt;
  for (i = 0; i &amp;lt; encoded.length; i += 1) {&lt;br /&gt;
    len = encoded[i].byteLength;&lt;br /&gt;
    view.setUint32(offset, len);&lt;br /&gt;
    offset += Uint32Array.BYTES_PER_ELEMENT;&lt;br /&gt;
    bytes.set(encoded[i], offset);&lt;br /&gt;
    offset += len;&lt;br /&gt;
  }&lt;br /&gt;
  return bytes.buffer;&lt;br /&gt;
}&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Example #2 - decoding strings ===&lt;br /&gt;
&lt;br /&gt;
The following example decodes an ArrayBuffer containing data encoded in the format produced by the previous example back into an array of strings.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
function decodeArrayOfStrings(buffer, encoding) {&lt;br /&gt;
  var decoder, view, offset, num_strings, strings, i, len;&lt;br /&gt;
&lt;br /&gt;
  decoder = new TextDecoder(encoding);&lt;br /&gt;
  view = new DataView(buffer);&lt;br /&gt;
  offset = 0;&lt;br /&gt;
  strings = [];&lt;br /&gt;
&lt;br /&gt;
  num_strings = view.getUint32(offset);&lt;br /&gt;
  offset += Uint32Array.BYTES_PER_ELEMENT;&lt;br /&gt;
  for (i = 0; i &amp;lt; num_strings; i += 1) {&lt;br /&gt;
    len = view.getUint32(offset);&lt;br /&gt;
    offset += Uint32Array.BYTES_PER_ELEMENT;&lt;br /&gt;
    strings[i] = decoder.decode(&lt;br /&gt;
      new DataView(view.buffer, offset, len));&lt;br /&gt;
    offset += len;&lt;br /&gt;
  }&lt;br /&gt;
  return strings;&lt;br /&gt;
}&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
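&lt;br /&gt;
A round trip through the length-prefixed format used by the two examples above might look like the sketch below. It assumes the API as it later shipped, where &amp;lt;code&amp;gt;TextEncoder&amp;lt;/code&amp;gt; only produces UTF-8 and therefore takes no encoding argument; &amp;lt;code&amp;gt;packStrings&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;unpackStrings&amp;lt;/code&amp;gt; are illustrative names, not part of any API.&lt;br /&gt;

```javascript
// Sketch of the length-prefixed format from the two examples above, UTF-8 only.
// packStrings and unpackStrings are illustrative names, not part of any API.
function packStrings(strings) {
  const encoder = new TextEncoder();
  const encoded = strings.map((s) => encoder.encode(s));

  // 4 bytes for the string count, then 4 bytes of length before each string.
  let total = 4;
  for (const part of encoded) {
    total += 4 + part.byteLength;
  }

  const bytes = new Uint8Array(total);
  const view = new DataView(bytes.buffer);
  let offset = 0;
  view.setUint32(offset, encoded.length); // DataView writes big-endian by default
  offset += 4;
  for (const part of encoded) {
    view.setUint32(offset, part.byteLength);
    offset += 4;
    bytes.set(part, offset);
    offset += part.byteLength;
  }
  return bytes.buffer;
}

function unpackStrings(buffer) {
  const decoder = new TextDecoder('utf-8');
  const view = new DataView(buffer);
  const strings = [];
  let offset = 0;
  const count = view.getUint32(offset);
  offset += 4;
  for (let i = 0; i !== count; i += 1) {
    const len = view.getUint32(offset);
    offset += 4;
    // Each call decodes one complete string, so no stream option is needed.
    strings.push(decoder.decode(new Uint8Array(buffer, offset, len)));
    offset += len;
  }
  return strings;
}

const original = ['hello', 'na\u00EFve', '\u65E5\u672C\u8A9E', ''];
const packed = packStrings(original);
const roundTripped = unpackStrings(packed);
```

The lengths stored in the buffer are byte counts, not string lengths, so multi-byte characters and the empty string both survive the round trip.&lt;br /&gt;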
&lt;br /&gt;
== Encodings ==&lt;br /&gt;
&lt;br /&gt;
Encodings are defined and implemented per [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding]. This implicitly includes the steps to &amp;lt;em&amp;gt;get an encoding&amp;lt;/em&amp;gt; from a string, and the logic for label matching and case-insensitivity.&lt;br /&gt;
&lt;br /&gt;
User agents MUST NOT support any other encodings or labels than those defined in [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding], and MUST support all encodings and labels defined in that specification, with the additions defined below.&lt;br /&gt;
&lt;br /&gt;
:&#039;&#039;NOTE: In [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding], &amp;quot;ascii&amp;quot; is a label for &#039;&#039;&#039;windows-1252&#039;&#039;&#039;; there is no 7-bit-or-raise-exception encoding. Applications that are required to restrict the content of decoded strings should implement validation after decoding.&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
:&#039;&#039;NOTE: Unicode normalization forms are outside the scope of this specification. No normalization is done prior to encoding or after decoding.&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
:&#039;&#039;NOTE: Handling of encoding-specific issues, e.g. over-long UTF-8 encodings, byte order marks, unmatched surrogate pairs, and so on is defined by [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding].&#039;&#039;&lt;br /&gt;
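&lt;br /&gt;
The validation after decoding that the &amp;quot;ascii&amp;quot; note above recommends could be sketched as follows. &amp;lt;code&amp;gt;isSevenBit&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;decodeSevenBit&amp;lt;/code&amp;gt; are hypothetical helper names, and the sketch assumes a runtime that provides the shipped &amp;lt;code&amp;gt;TextDecoder&amp;lt;/code&amp;gt; with legacy encoding support.&lt;br /&gt;

```javascript
// Sketch: reject decoded text that contains anything outside 7-bit ASCII.
// isSevenBit and decodeSevenBit are hypothetical helper names.
function isSevenBit(text) {
  for (let i = 0; i !== text.length; i += 1) {
    if (text.charCodeAt(i) > 0x7F) {
      return false;
    }
  }
  return true;
}

function decodeSevenBit(bytes, label = 'ascii') {
  // 'ascii' is only a label for windows-1252, so bytes above 0x7F decode
  // without error; the check has to happen on the decoded string.
  const text = new TextDecoder(label).decode(bytes);
  if (!isSevenBit(text)) {
    throw new RangeError('decoded text contains non-ASCII characters');
  }
  return text;
}
```

Checking the decoded string rather than the input bytes keeps the helper independent of which label the caller passes.&lt;br /&gt;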
&lt;br /&gt;
== References ==&lt;br /&gt;
&lt;br /&gt;
* WebIDL http://dev.w3.org/2006/webapi/WebIDL&lt;br /&gt;
* Encoding http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== Acknowledgements ==&lt;br /&gt;
&lt;br /&gt;
* Alan Chaney&lt;br /&gt;
* Ben Noordhuis&lt;br /&gt;
* Glenn Maynard&lt;br /&gt;
* John Tamplin&lt;br /&gt;
* Kenneth Russell (Google, Inc)&lt;br /&gt;
* Robert Mustacchi&lt;br /&gt;
* Ryan Dahl&lt;br /&gt;
* Anne van Kesteren&lt;br /&gt;
* Cameron McCormack&lt;br /&gt;
&lt;br /&gt;
== Appendix ==&lt;br /&gt;
&lt;br /&gt;
A &amp;quot;shim&amp;quot; implementation in ECMAScript (that may not fully match the current version of the spec) plus some initial unit tests can be found at:&lt;br /&gt;
&lt;br /&gt;
:http://code.google.com/p/stringencoding/&lt;br /&gt;
&lt;br /&gt;
[[Category:Proposals]]&lt;/div&gt;</summary>
		<author><name>Jsbell</name></author>
	</entry>
	<entry>
		<id>https://wiki.whatwg.org/index.php?title=StringEncoding&amp;diff=8457</id>
		<title>StringEncoding</title>
		<link rel="alternate" type="text/html" href="https://wiki.whatwg.org/index.php?title=StringEncoding&amp;diff=8457"/>
		<updated>2012-08-06T17:49:48Z</updated>

		<summary type="html">&lt;p&gt;Jsbell: Fixed return type of encode (ArrayBufferView -&amp;gt; Uint8Array)&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;Proposed Text Encoding Web API for Typed Arrays&lt;br /&gt;
&lt;br /&gt;
== Editors ==&lt;br /&gt;
&lt;br /&gt;
* Joshua Bell (Google, Inc)&lt;br /&gt;
&lt;br /&gt;
== Abstract ==&lt;br /&gt;
&lt;br /&gt;
This specification defines an API for encoding strings to binary data, and decoding strings from binary data.&lt;br /&gt;
&lt;br /&gt;
NOTE: This specification intentionally does not address the opposite scenario of encoding binary data as strings and decoding binary data from strings, for example using Base64 encoding.&lt;br /&gt;
&lt;br /&gt;
Discussion on this topic has so far taken place on the [http://www.khronos.org/webgl/public-mailing-list/ public_webgl@khronos.org] mailing list. See http://www.khronos.org/webgl/public-mailing-list/archives/1111/msg00017.html for the initial discussion thread.&lt;br /&gt;
&lt;br /&gt;
Discussion has since moved to the [http://www.whatwg.org/mailing-list#specs WHATWG spec discussion] mailing list. See  http://lists.whatwg.org/htdig.cgi/whatwg-whatwg.org/2012-March/035038.html for the latest discussion thread.&lt;br /&gt;
&lt;br /&gt;
== Open Issues ==&lt;br /&gt;
&lt;br /&gt;
* Encoding errors&lt;br /&gt;
** Right now, the encoding process is defined to throw on encoding errors (i.e. when trying to encode a code point that isn&#039;t supported in the specified encoding). There are four reasonable options:&lt;br /&gt;
*** Throw (as currently spec&#039;d)&lt;br /&gt;
*** Encoding-specific replacement character (e.g. &#039;?&#039;)&lt;br /&gt;
*** Custom replacement character&lt;br /&gt;
*** Custom behavior e.g. a user-defined function to handle the code point, such as: &amp;lt;code&amp;gt;function (cp) { return &amp;quot;&amp;amp;#x&amp;quot; + Number(cp).toString(16) + &amp;quot;;&amp;quot;; }&amp;lt;/code&amp;gt;&lt;br /&gt;
** Should the spec define an options dict for &amp;lt;code&amp;gt;TextEncoder()&amp;lt;/code&amp;gt; that allows selecting between throw vs. replacement character / callback?&lt;br /&gt;
* The [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding specification] defines the byte order mark as more authoritative than an explicit encoding label when decoding content. &lt;br /&gt;
** Is that desirable here? &lt;br /&gt;
** What does it mean to e.g. write &amp;lt;code&amp;gt;TextDecoder(&#039;iso-2022-kr&#039;).decode(str)&amp;lt;/code&amp;gt; if string turns out to be UTF-8?&lt;br /&gt;
** What about streaming and other re-uses of the same decoder object?&lt;br /&gt;
** At the very least, a matching BOM should be ignored (i.e. not returned as part of the output) and a mismatching BOM should signal an error.&lt;br /&gt;
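&lt;br /&gt;
For comparison, the matching-BOM case can be observed in the &amp;lt;code&amp;gt;TextDecoder&amp;lt;/code&amp;gt; API that later shipped. The sketch below assumes that shipped API, including its &amp;lt;code&amp;gt;ignoreBOM&amp;lt;/code&amp;gt; option, which is not part of the draft on this page.&lt;br /&gt;

```javascript
// Sketch: how the shipped TextDecoder treats a matching UTF-8 BOM (EF BB BF).
const withBom = new Uint8Array([0xEF, 0xBB, 0xBF, 0x68, 0x69]); // BOM + 'hi'

// Default: the matching BOM is consumed and not returned.
const stripped = new TextDecoder('utf-8').decode(withBom);

// ignoreBOM: true keeps the BOM in the output as U+FEFF.
const kept = new TextDecoder('utf-8', { ignoreBOM: true }).decode(withBom);
```

The sketch covers only a BOM that matches the requested encoding; it does not illustrate the mismatching case raised in the list above.&lt;br /&gt;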
&lt;br /&gt;
== Notes to Implementers ==&lt;br /&gt;
&lt;br /&gt;
* Streaming decode/encode requires retaining partial buffers between calls.&lt;br /&gt;
** Some encode/decode algorithms require adjusting the &amp;lt;var&amp;gt;code point pointer&amp;lt;/var&amp;gt; or &amp;lt;var&amp;gt;byte pointer&amp;lt;/var&amp;gt; by a negative amount. This could occur across &amp;quot;chunk&amp;quot; boundaries. This implies that when the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag is set on an encoder/decoder, the last N elements of the stream must be saved for the next call and used as a prefix for the stream, where N is defined by the specific encoding algorithm.&lt;br /&gt;
** This is not yet implemented in the [http://code.google.com/p/stringencoding/ ECMAScript shim]&lt;br /&gt;
&lt;br /&gt;
== Resolved Issues ==&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dl&amp;gt;&lt;br /&gt;
&amp;lt;dt&amp;gt;Encode as many characters as possible into a fixed-size buffer for transmission, and repeat starting with next unencoded character&lt;br /&gt;
&amp;lt;dd&amp;gt;Resolution: not for &amp;quot;v1&amp;quot; - can be implemented using this API, and breaking the string down by &amp;quot;character&amp;quot; is unlikely to be as obvious as it sounds - surrogate pairs, combining sequences, etc&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dt&amp;gt;Support two versions of encode; one which takes target buffer, and one which creates/returns a right-sized buffer&lt;br /&gt;
&amp;lt;dd&amp;gt;Resolution: See above, and wait for developer feedback.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dt&amp;gt;Allow arbitrary end byte sequences (e.g. &amp;lt;code&amp;gt;0xFF&amp;lt;/code&amp;gt; for UTF-8 strings)&lt;br /&gt;
&amp;lt;dd&amp;gt;Resolution: Add &amp;lt;code&amp;gt;indexOf&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;ArrayBufferView&amp;lt;/code&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dt&amp;gt;Remove &#039;&#039;binary&#039;&#039; encoding? &#039;&#039;(a proposed 8-bit-clean encoding for interop with legacy binary data stored in ECMAScript strings)&#039;&#039;&lt;br /&gt;
&amp;lt;dd&amp;gt;The only real use is with &amp;lt;code&amp;gt;atob()&amp;lt;/code&amp;gt;/&amp;lt;code&amp;gt;btoa()&amp;lt;/code&amp;gt;; a better API would be Base64 directly in/out of Typed Arrays. Consensus on WHATWG is to add better APIs e.g. &lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
partial interface ArrayBufferView {&lt;br /&gt;
    DOMString toBase64();&lt;br /&gt;
};&lt;br /&gt;
&lt;br /&gt;
partial interface ArrayBuffer {&lt;br /&gt;
    static ArrayBuffer fromBase64(DOMString string);&lt;br /&gt;
};&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dt&amp;gt;Should this be a standalone API (as written), or live on e.g. DataView, or on e.g. String?&lt;br /&gt;
&amp;lt;dd&amp;gt;There seems to be pretty strong consensus that streaming, stateful coding is high priority and that an object-oriented API is cleanest, and shoe-horning those onto existing objects would be messy.&lt;br /&gt;
&amp;lt;/dl&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== API ==&lt;br /&gt;
&lt;br /&gt;
=== TextEncoder ===&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;WebIDL&#039;&#039;&#039;&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
dictionary TextEncodeOptions {&lt;br /&gt;
  boolean stream = false;&lt;br /&gt;
};&lt;br /&gt;
&lt;br /&gt;
[Constructor(optional DOMString encoding)]&lt;br /&gt;
interface TextEncoder {&lt;br /&gt;
  readonly attribute DOMString encoding;&lt;br /&gt;
  Uint8Array encode(DOMString? string, optional TextEncodeOptions options);&lt;br /&gt;
};&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The constructor runs the following steps:&lt;br /&gt;
# If the constructor is called with no arguments, let &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt; be &amp;quot;&amp;lt;code&amp;gt;utf-8&amp;lt;/code&amp;gt;&amp;quot;. Otherwise, let &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt; be the value of the &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; argument.&lt;br /&gt;
# Run the &#039;&#039;&#039;steps to get an encoding&#039;&#039;&#039; from [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding], with &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt; as &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt;.&lt;br /&gt;
#* If the steps result in failure, throw an &amp;quot;&amp;lt;code&amp;gt;EncodingError&amp;lt;/code&amp;gt;&amp;quot; exception and terminate these steps.&lt;br /&gt;
#* Otherwise, set the encoder object&#039;s internal &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; property to the returned encoding.&lt;br /&gt;
# Initialize the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag of the encoder object to false. &lt;br /&gt;
# Initialize the internal &amp;lt;var&amp;gt;encoding algorithm state&amp;lt;/var&amp;gt; to the default values for the encoding &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dl&amp;gt;&lt;br /&gt;
&amp;lt;dt&amp;gt;&amp;lt;code&amp;gt;encoding&amp;lt;/code&amp;gt; of type DOMString, readonly&lt;br /&gt;
&amp;lt;dd&amp;gt;Returns the &amp;lt;var&amp;gt;Name&amp;lt;/var&amp;gt; of the encoder object&#039;s &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;, per [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding]. &lt;br /&gt;
&lt;br /&gt;
:Note that this may differ from the name of the encoding specified during the call to the constructor. For example, if the constructor is called with &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; of &amp;lt;code&amp;gt;&amp;quot;ascii&amp;quot;&amp;lt;/code&amp;gt; the &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; attribute of the &#039;&#039;encoder object&#039;&#039; would have the value &amp;lt;code&amp;gt;&amp;quot;windows-1252&amp;quot;&amp;lt;/code&amp;gt; as &amp;lt;code&amp;gt;&amp;quot;ascii&amp;quot;&amp;lt;/code&amp;gt; is a label for that encoding.&lt;br /&gt;
&amp;lt;/dl&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dl&amp;gt;&lt;br /&gt;
&amp;lt;dt&amp;gt;&amp;lt;code&amp;gt;encode&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;dd&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The &amp;lt;code&amp;gt;encode&amp;lt;/code&amp;gt; method runs these steps:&lt;br /&gt;
&lt;br /&gt;
# If the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag is not set, then reset the &amp;lt;var&amp;gt;encoding algorithm state&amp;lt;/var&amp;gt; to the default values for &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;. Otherwise, the &amp;lt;var&amp;gt;encoding algorithm state&amp;lt;/var&amp;gt; is re-used from the previous call to &amp;lt;code&amp;gt;encode&amp;lt;/code&amp;gt; on this object.&lt;br /&gt;
# If the &amp;lt;var&amp;gt;options&amp;lt;/var&amp;gt; parameter is specified and the &amp;lt;code&amp;gt;stream&amp;lt;/code&amp;gt; option is &#039;&#039;&#039;true&#039;&#039;&#039;, then the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag is set; otherwise the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag is cleared.&lt;br /&gt;
# Run the steps of the &amp;lt;var&amp;gt;encoding algorithm&amp;lt;/var&amp;gt;:&lt;br /&gt;
#* The input to the algorithm is a &amp;lt;var&amp;gt;stream of code points&amp;lt;/var&amp;gt;. The stream is composed of the Unicode code points for the Unicode characters produced by following the [http://dev.w3.org/2006/webapi/WebIDL/#dfn-obtain-unicode steps to convert a DOMString to a sequence of Unicode characters] in [http://dev.w3.org/2006/webapi/WebIDL/ WebIDL] with &amp;lt;var&amp;gt;string&amp;lt;/var&amp;gt; as the input. If &amp;lt;var&amp;gt;string&amp;lt;/var&amp;gt; is null, the &amp;lt;var&amp;gt;stream of code points&amp;lt;/var&amp;gt; is empty.&lt;br /&gt;
#* If the &amp;lt;var&amp;gt;options&amp;lt;/var&amp;gt; parameter is not specified or the &amp;lt;code&amp;gt;stream&amp;lt;/code&amp;gt; option is false, then after the final code point is yielded by the stream, the &#039;&#039;&#039;EOF code point&#039;&#039;&#039; is yielded.&lt;br /&gt;
#* The output of the algorithm is a &amp;lt;var&amp;gt;sequence of emitted bytes&amp;lt;/var&amp;gt;.&lt;br /&gt;
# Return a &amp;lt;code&amp;gt;Uint8Array&amp;lt;/code&amp;gt; object wrapping an &amp;lt;code&amp;gt;ArrayBuffer&amp;lt;/code&amp;gt; containing the &amp;lt;var&amp;gt;sequence of emitted bytes&amp;lt;/var&amp;gt; produced by the encoding algorithm.&lt;br /&gt;
&amp;lt;/dl&amp;gt;&lt;br /&gt;
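&lt;br /&gt;
The effect of the &amp;lt;code&amp;gt;encode&amp;lt;/code&amp;gt; steps above can be observed with the &amp;lt;code&amp;gt;TextEncoder&amp;lt;/code&amp;gt; that later shipped. This sketch assumes that shipped API, which always encodes to UTF-8 and accepts no encoding argument, so it illustrates only the UTF-8 case of this draft.&lt;br /&gt;

```javascript
// Sketch using the shipped TextEncoder, which always encodes to UTF-8.
const encoder = new TextEncoder();

// U+20AC EURO SIGN becomes the three bytes E2 82 AC.
const euro = encoder.encode('\u20AC');

// An unpaired surrogate is replaced with U+FFFD (EF BF BD) while converting
// the string to code points, so encoding never fails on it.
const lone = encoder.encode('\uD800');
```

The replacement of the unpaired surrogate happens during the conversion of the string to a sequence of code points, before any bytes are emitted.&lt;br /&gt;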
&lt;br /&gt;
=== TextDecoder ===&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;WebIDL&#039;&#039;&#039;&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
dictionary TextDecoderOptions {&lt;br /&gt;
  boolean fatal = false;&lt;br /&gt;
};&lt;br /&gt;
&lt;br /&gt;
dictionary TextDecodeOptions {&lt;br /&gt;
  boolean stream = false;&lt;br /&gt;
};&lt;br /&gt;
&lt;br /&gt;
[Constructor(optional DOMString encoding, optional TextDecoderOptions options)]&lt;br /&gt;
interface TextDecoder {&lt;br /&gt;
  readonly attribute DOMString encoding;&lt;br /&gt;
  DOMString decode(optional ArrayBufferView view, optional TextDecodeOptions options);&lt;br /&gt;
};&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The constructor runs the following steps:&lt;br /&gt;
# The constructor creates a &#039;&#039;decoder object&#039;&#039;. It has the internal properties &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;, a &amp;lt;var&amp;gt;fatal&amp;lt;/var&amp;gt; flag which is initially unset, and a &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag which is initially unset.&lt;br /&gt;
# If called without an &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; argument, let &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt; be the &amp;quot;&amp;lt;code&amp;gt;utf-8&amp;lt;/code&amp;gt;&amp;quot;. Otherwise, let &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt; be the value of the &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; argument.&lt;br /&gt;
# Run the &#039;&#039;&#039;steps to get an encoding&#039;&#039;&#039; from [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding], with &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt; as &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt;. &lt;br /&gt;
## If the steps result in failure, throw an &amp;quot;&amp;lt;code&amp;gt;EncodingError&amp;lt;/code&amp;gt;&amp;quot; exception and terminate these steps.&lt;br /&gt;
## Otherwise, set the &#039;&#039;decoder object&#039;s&#039;&#039; internal &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; property to the returned encoding.&lt;br /&gt;
# If the constructor is called with an &amp;lt;var&amp;gt;options&amp;lt;/var&amp;gt; argument, and the &amp;lt;var&amp;gt;fatal&amp;lt;/var&amp;gt; property of the dictionary is set, set the internal &amp;lt;var&amp;gt;fatal&amp;lt;/var&amp;gt; flag of the &#039;&#039;decoder object&#039;&#039;.&lt;br /&gt;
# Initialize the internal &amp;lt;var&amp;gt;encoding algorithm state&amp;lt;/var&amp;gt; to the default values for the encoding &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;.&lt;br /&gt;
# Return the &#039;&#039;decoder object&#039;&#039;.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dl&amp;gt;&lt;br /&gt;
&amp;lt;dt&amp;gt;&amp;lt;code&amp;gt;encoding&amp;lt;/code&amp;gt; of type DOMString, readonly&lt;br /&gt;
&amp;lt;dd&amp;gt;&lt;br /&gt;
Returns the &amp;lt;var&amp;gt;Name&amp;lt;/var&amp;gt; of the decoder object&#039;s &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;, per [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding].&lt;br /&gt;
&lt;br /&gt;
:Note that this may differ from the name of the encoding specified during the call to the constructor. For example, if the constructor is called with &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; of &amp;lt;code&amp;gt;&amp;quot;ascii&amp;quot;&amp;lt;/code&amp;gt; the &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; attribute of the &#039;&#039;decoder object&#039;&#039; would have the value &amp;lt;code&amp;gt;&amp;quot;windows-1252&amp;quot;&amp;lt;/code&amp;gt; as &amp;lt;code&amp;gt;&amp;quot;ascii&amp;quot;&amp;lt;/code&amp;gt; is a label for that encoding.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dt&amp;gt;&amp;lt;code&amp;gt;decode&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;dd&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The &amp;lt;code&amp;gt;decode&amp;lt;/code&amp;gt; method runs the following steps:&lt;br /&gt;
&lt;br /&gt;
# If the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag of the &#039;&#039;decoder object&#039;&#039; is not set, then reset the &amp;lt;var&amp;gt;encoding algorithm state&amp;lt;/var&amp;gt; to the default values for encoding &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;. Otherwise, the &amp;lt;var&amp;gt;encoding algorithm state&amp;lt;/var&amp;gt; is re-used from the previous call to &amp;lt;code&amp;gt;decode&amp;lt;/code&amp;gt; on this object. &lt;br /&gt;
# If the &amp;lt;var&amp;gt;options&amp;lt;/var&amp;gt; parameter is specified and the &amp;lt;var&amp;gt;stream&amp;lt;/var&amp;gt; option is &#039;&#039;&#039;true&#039;&#039;&#039;, then the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag is set; otherwise the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag is cleared.&lt;br /&gt;
# Run the decoder algorithm of the &#039;&#039;decoder object&#039;s&#039;&#039; &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;:&lt;br /&gt;
#* The input to the algorithm is a &amp;lt;var&amp;gt;byte stream&amp;lt;/var&amp;gt;. The &amp;lt;var&amp;gt;byte stream&amp;lt;/var&amp;gt; is provided by the bytes in &amp;lt;var&amp;gt;view.buffer&amp;lt;/var&amp;gt; starting at offset &amp;lt;var&amp;gt;view.byteOffset&amp;lt;/var&amp;gt;. A maximum of &amp;lt;var&amp;gt;view.byteLength&amp;lt;/var&amp;gt; bytes are yielded by the stream from &amp;lt;var&amp;gt;view.buffer&amp;lt;/var&amp;gt;. If &amp;lt;var&amp;gt;view&amp;lt;/var&amp;gt; is not specified, the stream is empty.&lt;br /&gt;
#* If the &amp;lt;var&amp;gt;options&amp;lt;/var&amp;gt; parameter is not specified or the &#039;&#039;&#039;stream&#039;&#039;&#039; option is &#039;&#039;&#039;false&#039;&#039;&#039;, then after &amp;lt;var&amp;gt;view.byteLength&amp;lt;/var&amp;gt; bytes are yielded by the stream, the &#039;&#039;&#039;EOF byte&#039;&#039;&#039; is yielded.&lt;br /&gt;
#* If the internal &amp;lt;var&amp;gt;fatal&amp;lt;/var&amp;gt; flag of the &#039;&#039;decoder object&#039;&#039; is set, then a &#039;&#039;&#039;decoder error&#039;&#039;&#039; causes a &amp;lt;code&amp;gt;DOMException&amp;lt;/code&amp;gt; of type &amp;lt;code&amp;gt;EncodingError&amp;lt;/code&amp;gt; to be thrown rather than emitting a fallback code point.&lt;br /&gt;
#* The output of the algorithm is a &amp;lt;var&amp;gt;sequence of emitted code points&amp;lt;/var&amp;gt;.&lt;br /&gt;
# Return an IDL &amp;lt;code&amp;gt;DOMString&amp;lt;/code&amp;gt; value that represents the sequence of code units resulting from encoding the &amp;lt;var&amp;gt;sequence of emitted code points&amp;lt;/var&amp;gt; as UTF-16.&lt;br /&gt;
&lt;br /&gt;
:&#039;&#039;ISSUE: Need to handle BOMs. [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding] specifies this by using the caller-specified encoding as a suggestion, and consuming the BOM as a part of selecting the real encoder where BOM takes precedence. At the very least a matching BOM should be ignored. A mismatching BOM could throw, be a decoding error, or could actually switch the decoder for this stream (call or sequence of calls), possibly if-and-only-if the constructor was called without an encoding.&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/dl&amp;gt;&lt;br /&gt;
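&lt;br /&gt;
The difference the &amp;lt;var&amp;gt;fatal&amp;lt;/var&amp;gt; flag makes can be sketched with the &amp;lt;code&amp;gt;TextDecoder&amp;lt;/code&amp;gt; that later shipped. This assumes the shipped API, which throws a &amp;lt;code&amp;gt;TypeError&amp;lt;/code&amp;gt; rather than the &amp;lt;code&amp;gt;EncodingError&amp;lt;/code&amp;gt; described in this draft, so the sketch catches any exception instead of testing for a specific type.&lt;br /&gt;

```javascript
// Sketch: the effect of the fatal flag on malformed UTF-8 in the shipped API.
const malformed = new Uint8Array([0x61, 0xFF, 0x62]); // 'a', an invalid byte, 'b'

// Without fatal, the invalid byte is replaced with U+FFFD.
const lenient = new TextDecoder('utf-8').decode(malformed);

// With fatal set, decode throws instead of emitting a replacement.
let threw = false;
try {
  new TextDecoder('utf-8', { fatal: true }).decode(malformed);
} catch (e) {
  threw = true;
}
```

The fatal flag is fixed when the decoder is constructed, so a caller that needs both behaviors creates two decoder objects.&lt;br /&gt;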
&lt;br /&gt;
== Examples ==&lt;br /&gt;
&lt;br /&gt;
=== Example #1 - encoding strings ===&lt;br /&gt;
&lt;br /&gt;
The following example uses the API to encode an array of strings into an ArrayBuffer. The result is an ArrayBuffer containing the number of strings (as a Uint32), followed by the length of the first string (as a Uint32), the UTF-8 encoded string data, the length of the second string (as a Uint32), the string data, and so on.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
function encodeArrayOfStrings(strings, encoding) {&lt;br /&gt;
  var encoder, encoded, len, i, bytes, view, offset;&lt;br /&gt;
&lt;br /&gt;
  encoder = new TextEncoder(encoding);&lt;br /&gt;
  encoded = [];&lt;br /&gt;
&lt;br /&gt;
  len = Uint32Array.BYTES_PER_ELEMENT;&lt;br /&gt;
  for (i = 0; i &amp;lt; strings.length; i += 1) {&lt;br /&gt;
    len += Uint32Array.BYTES_PER_ELEMENT;&lt;br /&gt;
    encoded[i] = encoder.encode(strings[i]);&lt;br /&gt;
    len += encoded[i].byteLength;&lt;br /&gt;
  }&lt;br /&gt;
&lt;br /&gt;
  bytes = new Uint8Array(len);&lt;br /&gt;
  view = new DataView(bytes.buffer);&lt;br /&gt;
  offset = 0;&lt;br /&gt;
&lt;br /&gt;
  view.setUint32(offset, strings.length);&lt;br /&gt;
  offset += Uint32Array.BYTES_PER_ELEMENT;&lt;br /&gt;
  for (i = 0; i &amp;lt; encoded.length; i += 1) {&lt;br /&gt;
    len = encoded[i].byteLength;&lt;br /&gt;
    view.setUint32(offset, len);&lt;br /&gt;
    offset += Uint32Array.BYTES_PER_ELEMENT;&lt;br /&gt;
    bytes.set(encoded[i], offset);&lt;br /&gt;
    offset += len;&lt;br /&gt;
  }&lt;br /&gt;
  return bytes.buffer;&lt;br /&gt;
}&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Example #2 - decoding strings ===&lt;br /&gt;
&lt;br /&gt;
The following example decodes an ArrayBuffer containing data encoded in the format produced by the previous example back into an array of strings.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
function decodeArrayOfStrings(buffer, encoding) {&lt;br /&gt;
  var decoder, view, offset, num_strings, strings, i, len;&lt;br /&gt;
&lt;br /&gt;
  decoder = new TextDecoder(encoding);&lt;br /&gt;
  view = new DataView(buffer);&lt;br /&gt;
  offset = 0;&lt;br /&gt;
  strings = [];&lt;br /&gt;
&lt;br /&gt;
  num_strings = view.getUint32(offset);&lt;br /&gt;
  offset += Uint32Array.BYTES_PER_ELEMENT;&lt;br /&gt;
  for (i = 0; i &amp;lt; num_strings; i += 1) {&lt;br /&gt;
    len = view.getUint32(offset);&lt;br /&gt;
    offset += Uint32Array.BYTES_PER_ELEMENT;&lt;br /&gt;
    strings[i] = decoder.decode(&lt;br /&gt;
      new DataView(view.buffer, offset, len));&lt;br /&gt;
    offset += len;&lt;br /&gt;
  }&lt;br /&gt;
  return strings;&lt;br /&gt;
}&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Encodings ==&lt;br /&gt;
&lt;br /&gt;
Encodings are defined and implemented per [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding]. This implicitly includes the steps to &amp;lt;em&amp;gt;get an encoding&amp;lt;/em&amp;gt; from a string, and the logic for label matching and case-insensitivity.&lt;br /&gt;
&lt;br /&gt;
User agents MUST NOT support any other encodings or labels than those defined in [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding], and MUST support all encodings and labels defined in that specification, with the additions defined below.&lt;br /&gt;
&lt;br /&gt;
:&#039;&#039;NOTE: In [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding], &amp;quot;ascii&amp;quot; is a label for &#039;&#039;&#039;windows-1252&#039;&#039;&#039;; there is no 7-bit-or-raise-exception encoding. Applications that are required to restrict the content of decoded strings should implement validation after decoding.&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
:&#039;&#039;NOTE: Unicode normalization forms are outside the scope of this specification. No normalization is done prior to encoding or after decoding.&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
:&#039;&#039;NOTE: Handling of encoding-specific issues, e.g. over-long UTF-8 encodings, byte order marks, unmatched surrogate pairs, and so on is defined by [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding].&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
== References ==&lt;br /&gt;
&lt;br /&gt;
* WebIDL http://dev.w3.org/2006/webapi/WebIDL&lt;br /&gt;
* Encoding http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== Acknowledgements ==&lt;br /&gt;
&lt;br /&gt;
* Alan Chaney&lt;br /&gt;
* Ben Noordhuis&lt;br /&gt;
* Glenn Maynard&lt;br /&gt;
* John Tamplin&lt;br /&gt;
* Kenneth Russell (Google, Inc)&lt;br /&gt;
* Robert Mustacchi&lt;br /&gt;
* Ryan Dahl&lt;br /&gt;
* Anne van Kesteren&lt;br /&gt;
* Cameron McCormack&lt;br /&gt;
&lt;br /&gt;
== Appendix ==&lt;br /&gt;
&lt;br /&gt;
A &amp;quot;shim&amp;quot; implementation in ECMAScript (that may not fully match the current version of the spec) plus some initial unit tests can be found at:&lt;br /&gt;
&lt;br /&gt;
:http://code.google.com/p/stringencoding/&lt;br /&gt;
&lt;br /&gt;
[[Category:Proposals]]&lt;/div&gt;</summary>
		<author><name>Jsbell</name></author>
	</entry>
	<entry>
		<id>https://wiki.whatwg.org/index.php?title=StringEncoding&amp;diff=8456</id>
		<title>StringEncoding</title>
		<link rel="alternate" type="text/html" href="https://wiki.whatwg.org/index.php?title=StringEncoding&amp;diff=8456"/>
		<updated>2012-08-06T17:49:02Z</updated>

		<summary type="html">&lt;p&gt;Jsbell: Fleshed out description of encoding error options&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;Proposed Text Encoding Web API for Typed Arrays&lt;br /&gt;
&lt;br /&gt;
== Editors ==&lt;br /&gt;
&lt;br /&gt;
* Joshua Bell (Google, Inc)&lt;br /&gt;
&lt;br /&gt;
== Abstract ==&lt;br /&gt;
&lt;br /&gt;
This specification defines an API for encoding strings to binary data, and decoding strings from binary data.&lt;br /&gt;
&lt;br /&gt;
NOTE: This specification intentionally does not address the opposite scenario of encoding binary data as strings and decoding binary data from strings, for example using Base64 encoding.&lt;br /&gt;
&lt;br /&gt;
Discussion on this topic has so far taken place on the [http://www.khronos.org/webgl/public-mailing-list/ public_webgl@khronos.org] mailing list. See http://www.khronos.org/webgl/public-mailing-list/archives/1111/msg00017.html for the initial discussion thread.&lt;br /&gt;
&lt;br /&gt;
Discussion has since moved to the [http://www.whatwg.org/mailing-list#specs WHATWG spec discussion] mailing list. See  http://lists.whatwg.org/htdig.cgi/whatwg-whatwg.org/2012-March/035038.html for the latest discussion thread.&lt;br /&gt;
&lt;br /&gt;
== Open Issues ==&lt;br /&gt;
&lt;br /&gt;
* Encoding errors&lt;br /&gt;
** Right now, the encoding process is defined to throw on encoding errors (i.e. when trying to encode a code point that isn&#039;t supported in the specified encoding). There are four reasonable options:&lt;br /&gt;
*** Throw (as currently spec&#039;d)&lt;br /&gt;
*** Encoding-specific replacement character (e.g. &#039;?&#039;)&lt;br /&gt;
*** Custom replacement character&lt;br /&gt;
*** Custom behavior e.g. a user-defined function to handle the code point, such as: &amp;lt;code&amp;gt;function (cp) { return &amp;quot;&amp;amp;#x&amp;quot; + Number(cp).toString(16) + &amp;quot;;&amp;quot;; }&amp;lt;/code&amp;gt;&lt;br /&gt;
** Should the spec define an options dict for &amp;lt;code&amp;gt;TextEncoder()&amp;lt;/code&amp;gt; that allows selecting between throw vs. replacement character / callback?&lt;br /&gt;
* The [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding specification] defines the byte order mark as more authoritative than an explicit encoding label when decoding content. &lt;br /&gt;
** Is that desirable here? &lt;br /&gt;
** What does it mean to write e.g. &amp;lt;code&amp;gt;new TextDecoder(&#039;iso-2022-kr&#039;).decode(view)&amp;lt;/code&amp;gt; if the data turns out to be UTF-8?&lt;br /&gt;
** What about streaming and other re-uses of the same decoder object?&lt;br /&gt;
** At the very least, a matching BOM should be ignored (i.e. not returned as part of the output) and a mismatching BOM should signal an error.&lt;br /&gt;
&lt;br /&gt;
== Notes to Implementers ==&lt;br /&gt;
&lt;br /&gt;
* Streaming decode/encode requires retaining partial buffers between calls.&lt;br /&gt;
** Some encode/decode algorithms require adjusting the &amp;lt;var&amp;gt;code point pointer&amp;lt;/var&amp;gt; or &amp;lt;var&amp;gt;byte pointer&amp;lt;/var&amp;gt; by a negative amount. This could occur across &amp;quot;chunk&amp;quot; boundaries. This implies that when the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag is set on an encoder/decoder, the last N elements of the stream must be saved and used as a prefix for the stream in the next call. N is defined by the specific encoding algorithm.&lt;br /&gt;
** This is not yet implemented in the [http://code.google.com/p/stringencoding/ ECMAScript shim]&lt;br /&gt;
&lt;br /&gt;
== Resolved Issues ==&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dl&amp;gt;&lt;br /&gt;
&amp;lt;dt&amp;gt;Encode as many characters as possible into a fixed-size buffer for transmission, and repeat starting with next unencoded character&lt;br /&gt;
&amp;lt;dd&amp;gt;Resolution: not for &amp;quot;v1&amp;quot; - can be implemented using this API, and breaking the string down by &amp;quot;character&amp;quot; is unlikely to be as obvious as it sounds - surrogate pairs, combining sequences, etc&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dt&amp;gt;Support two versions of encode; one which takes target buffer, and one which creates/returns a right-sized buffer&lt;br /&gt;
&amp;lt;dd&amp;gt;Resolution: See above, and wait for developer feedback.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dt&amp;gt;Allow arbitrary end byte sequences (e.g. &amp;lt;code&amp;gt;0xFF&amp;lt;/code&amp;gt; for UTF-8 strings)&lt;br /&gt;
&amp;lt;dd&amp;gt;Resolution: Add &amp;lt;code&amp;gt;indexOf&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;ArrayBufferView&amp;lt;/code&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dt&amp;gt;Remove &#039;&#039;binary&#039;&#039; encoding? &#039;&#039;(a proposed 8-bit-clean encoding for interop with legacy binary data stored in ECMAScript strings)&#039;&#039;&lt;br /&gt;
&amp;lt;dd&amp;gt;The only real use is with &amp;lt;code&amp;gt;atob()&amp;lt;/code&amp;gt;/&amp;lt;code&amp;gt;btoa()&amp;lt;/code&amp;gt;; a better API would be Base64 directly in/out of Typed Arrays. Consensus on WHATWG is to add better APIs e.g. &lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
partial interface ArrayBufferView {&lt;br /&gt;
    DOMString toBase64();&lt;br /&gt;
};&lt;br /&gt;
&lt;br /&gt;
partial interface ArrayBuffer {&lt;br /&gt;
    static ArrayBuffer fromBase64(DOMString string);&lt;br /&gt;
};&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dt&amp;gt;Should this be a standalone API (as written), or live on e.g. DataView, or on e.g. String?&lt;br /&gt;
&amp;lt;dd&amp;gt;There seems to be pretty strong consensus that streaming, stateful coding is high priority and that an object-oriented API is cleanest, and shoe-horning those onto existing objects would be messy.&lt;br /&gt;
&amp;lt;/dl&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== API ==&lt;br /&gt;
&lt;br /&gt;
=== TextEncoder ===&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;WebIDL&#039;&#039;&#039;&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
dictionary TextEncodeOptions {&lt;br /&gt;
  boolean stream = false;&lt;br /&gt;
};&lt;br /&gt;
&lt;br /&gt;
[Constructor(optional DOMString encoding)]&lt;br /&gt;
interface TextEncoder {&lt;br /&gt;
  readonly attribute DOMString encoding;&lt;br /&gt;
  ArrayBufferView encode(DOMString? string, optional TextEncodeOptions options);&lt;br /&gt;
};&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The constructor runs the following steps:&lt;br /&gt;
# If the constructor is called with no arguments, let &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt; be &amp;quot;&amp;lt;code&amp;gt;utf-8&amp;lt;/code&amp;gt;&amp;quot;. Otherwise, let &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt; be the value of the &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; argument.&lt;br /&gt;
# Run the &#039;&#039;&#039;steps to get an encoding&#039;&#039;&#039; from [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding], with &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt; as &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt;.&lt;br /&gt;
#* If the steps result in failure, throw an &amp;quot;&amp;lt;code&amp;gt;EncodingError&amp;lt;/code&amp;gt;&amp;quot; exception and terminate these steps.&lt;br /&gt;
#* Otherwise, set the encoder object&#039;s internal &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; property to the returned encoding.&lt;br /&gt;
# Initialize the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag of the encoder object to false. &lt;br /&gt;
# Initialize the internal &amp;lt;var&amp;gt;encoding algorithm state&amp;lt;/var&amp;gt; to the default values for the encoding &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dl&amp;gt;&lt;br /&gt;
&amp;lt;dt&amp;gt;&amp;lt;code&amp;gt;encoding&amp;lt;/code&amp;gt; of type DOMString, readonly&lt;br /&gt;
&amp;lt;dd&amp;gt;Returns the &amp;lt;var&amp;gt;Name&amp;lt;/var&amp;gt; of the encoder object&#039;s &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;, per [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding]. &lt;br /&gt;
&lt;br /&gt;
:Note that this may differ from the name of the encoding specified during the call to the constructor. For example, if the constructor is called with &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; of &amp;lt;code&amp;gt;&amp;quot;ascii&amp;quot;&amp;lt;/code&amp;gt; the &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; attribute of the &#039;&#039;encoder object&#039;&#039; would have the value &amp;lt;code&amp;gt;&amp;quot;windows-1252&amp;quot;&amp;lt;/code&amp;gt; as &amp;lt;code&amp;gt;&amp;quot;ascii&amp;quot;&amp;lt;/code&amp;gt; is a label for that encoding.&lt;br /&gt;
&amp;lt;/dl&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dl&amp;gt;&lt;br /&gt;
&amp;lt;dt&amp;gt;&amp;lt;code&amp;gt;encode&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;dd&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The &amp;lt;code&amp;gt;encode&amp;lt;/code&amp;gt; method runs these steps:&lt;br /&gt;
&lt;br /&gt;
# If the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag is not set, then reset the &amp;lt;var&amp;gt;encoding algorithm state&amp;lt;/var&amp;gt; to the default values for &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;. Otherwise, the &amp;lt;var&amp;gt;encoding algorithm state&amp;lt;/var&amp;gt; is re-used from the previous call to &amp;lt;code&amp;gt;encode&amp;lt;/code&amp;gt; on this object.&lt;br /&gt;
# If the &amp;lt;var&amp;gt;options&amp;lt;/var&amp;gt; parameter is specified and the &amp;lt;code&amp;gt;stream&amp;lt;/code&amp;gt; option is &#039;&#039;&#039;true&#039;&#039;&#039;, then the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag is set; otherwise the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag is cleared.&lt;br /&gt;
# Run the steps of the &amp;lt;var&amp;gt;encoding algorithm&amp;lt;/var&amp;gt;:&lt;br /&gt;
#* The input to the algorithm is a &amp;lt;var&amp;gt;stream of code points&amp;lt;/var&amp;gt;. The stream is composed of the Unicode code points for the Unicode characters produced by following the [http://dev.w3.org/2006/webapi/WebIDL/#dfn-obtain-unicode steps to convert a DOMString to a sequence of Unicode characters] in [http://dev.w3.org/2006/webapi/WebIDL/ WebIDL] with &amp;lt;var&amp;gt;string&amp;lt;/var&amp;gt; as the input. If &amp;lt;var&amp;gt;string&amp;lt;/var&amp;gt; is null, the &amp;lt;var&amp;gt;stream of code points&amp;lt;/var&amp;gt; is empty.&lt;br /&gt;
#* If the &amp;lt;var&amp;gt;options&amp;lt;/var&amp;gt; parameter is not specified or the &amp;lt;code&amp;gt;stream&amp;lt;/code&amp;gt; option is false, then the &#039;&#039;&#039;EOF code point&#039;&#039;&#039; is yielded after the final code point is yielded by the stream.&lt;br /&gt;
#* The output of the algorithm is a &amp;lt;var&amp;gt;sequence of emitted bytes&amp;lt;/var&amp;gt;.&lt;br /&gt;
# Return a &amp;lt;code&amp;gt;Uint8Array&amp;lt;/code&amp;gt; object wrapping an &amp;lt;code&amp;gt;ArrayBuffer&amp;lt;/code&amp;gt; containing the &amp;lt;var&amp;gt;sequence of emitted bytes&amp;lt;/var&amp;gt; produced by the encoding algorithm.&lt;br /&gt;
&amp;lt;/dl&amp;gt;&lt;br /&gt;
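&lt;br /&gt;
The steps above can be sketched with the TextEncoder that later shipped in browsers and Node.js. This is an illustration under that assumption, not part of this proposal: the shipped interface always encodes to UTF-8 and has no &#039;&#039;stream&#039;&#039; option, so only the default-encoding path is shown.&lt;br /&gt;

```javascript
// Sketch only: assumes a runtime with the shipped TextEncoder (UTF-8 only).
const encoder = new TextEncoder();

// "h", U+00E9 (e with acute accent), "l", "l", "o".
const bytes = encoder.encode("h\u00e9llo");

console.log(encoder.encoding);            // name of the encoding in use
console.log(bytes instanceof Uint8Array); // encode() returns a Uint8Array
console.log(Array.from(bytes));           // U+00E9 becomes the two bytes 195, 169
```

The returned view wraps a freshly allocated buffer, so callers that need an ArrayBuffer can read it from the view.&lt;br /&gt;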
&lt;br /&gt;
=== TextDecoder ===&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;WebIDL&#039;&#039;&#039;&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
dictionary TextDecoderOptions {&lt;br /&gt;
  boolean fatal = false;&lt;br /&gt;
};&lt;br /&gt;
&lt;br /&gt;
dictionary TextDecodeOptions {&lt;br /&gt;
  boolean stream = false;&lt;br /&gt;
};&lt;br /&gt;
&lt;br /&gt;
[Constructor(optional DOMString encoding, optional TextDecoderOptions options)]&lt;br /&gt;
interface TextDecoder {&lt;br /&gt;
  readonly attribute DOMString encoding;&lt;br /&gt;
  DOMString decode(optional ArrayBufferView view, optional TextDecodeOptions options);&lt;br /&gt;
};&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The constructor runs the following steps:&lt;br /&gt;
# The constructor creates a &#039;&#039;decoder object&#039;&#039;. It has the internal properties &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;, a &amp;lt;var&amp;gt;fatal&amp;lt;/var&amp;gt; flag which is initially unset, and a &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag which is initially unset.&lt;br /&gt;
# If called without an &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; argument, let &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt; be &amp;quot;&amp;lt;code&amp;gt;utf-8&amp;lt;/code&amp;gt;&amp;quot;. Otherwise, let &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt; be the value of the &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; argument.&lt;br /&gt;
# Run the &#039;&#039;&#039;steps to get an encoding&#039;&#039;&#039; from [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding], with &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt; as &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt;. &lt;br /&gt;
## If the steps result in failure, throw an &amp;quot;&amp;lt;code&amp;gt;EncodingError&amp;lt;/code&amp;gt;&amp;quot; exception and terminate these steps.&lt;br /&gt;
## Otherwise, set the &#039;&#039;decoder object&#039;s&#039;&#039; internal &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; property to the returned encoding.&lt;br /&gt;
# If the constructor is called with an &amp;lt;var&amp;gt;options&amp;lt;/var&amp;gt; argument, and the &amp;lt;var&amp;gt;fatal&amp;lt;/var&amp;gt; property of the dictionary is set, set the internal &amp;lt;var&amp;gt;fatal&amp;lt;/var&amp;gt; flag of the &#039;&#039;decoder object&#039;&#039;.&lt;br /&gt;
# Initialize the internal &amp;lt;var&amp;gt;encoding algorithm state&amp;lt;/var&amp;gt; to the default values for the encoding &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;.&lt;br /&gt;
# Return the &#039;&#039;decoder object&#039;&#039;.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dl&amp;gt;&lt;br /&gt;
&amp;lt;dt&amp;gt;&amp;lt;code&amp;gt;encoding&amp;lt;/code&amp;gt; of type DOMString, readonly&lt;br /&gt;
&amp;lt;dd&amp;gt;&lt;br /&gt;
Returns the &amp;lt;var&amp;gt;Name&amp;lt;/var&amp;gt; of the decoder object&#039;s &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;, per [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding].&lt;br /&gt;
&lt;br /&gt;
:Note that this may differ from the name of the encoding specified during the call to the constructor. For example, if the constructor is called with &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; of &amp;lt;code&amp;gt;&amp;quot;ascii&amp;quot;&amp;lt;/code&amp;gt; the &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; attribute of the &#039;&#039;decoder object&#039;&#039; would have the value &amp;lt;code&amp;gt;&amp;quot;windows-1252&amp;quot;&amp;lt;/code&amp;gt; as &amp;lt;code&amp;gt;&amp;quot;ascii&amp;quot;&amp;lt;/code&amp;gt; is a label for that encoding.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dt&amp;gt;&amp;lt;code&amp;gt;decode&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;dd&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The &amp;lt;code&amp;gt;decode&amp;lt;/code&amp;gt; method runs the following steps:&lt;br /&gt;
&lt;br /&gt;
# If the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag of the &#039;&#039;decoder object&#039;&#039; is not set, then reset the &amp;lt;var&amp;gt;encoding algorithm state&amp;lt;/var&amp;gt; to the default values for encoding &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;. Otherwise, the &amp;lt;var&amp;gt;encoding algorithm state&amp;lt;/var&amp;gt; is re-used from the previous call to &amp;lt;code&amp;gt;decode&amp;lt;/code&amp;gt; on this object. &lt;br /&gt;
# If the &amp;lt;var&amp;gt;options&amp;lt;/var&amp;gt; parameter is specified and the &amp;lt;var&amp;gt;stream&amp;lt;/var&amp;gt; option is &#039;&#039;&#039;true&#039;&#039;&#039;, then the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag is set; otherwise the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag is cleared.&lt;br /&gt;
# Run the decoder algorithm of the &#039;&#039;decoder object&#039;s&#039;&#039; encoding:&lt;br /&gt;
#* The input to the algorithm is a &amp;lt;var&amp;gt;byte stream&amp;lt;/var&amp;gt;. The &amp;lt;var&amp;gt;byte stream&amp;lt;/var&amp;gt; is provided by the bytes in &amp;lt;var&amp;gt;view.buffer&amp;lt;/var&amp;gt; starting at offset &amp;lt;var&amp;gt;view.byteOffset&amp;lt;/var&amp;gt;. A maximum of &amp;lt;var&amp;gt;view.byteLength&amp;lt;/var&amp;gt; bytes are yielded by the stream from &amp;lt;var&amp;gt;view.buffer&amp;lt;/var&amp;gt;. If &amp;lt;var&amp;gt;view&amp;lt;/var&amp;gt; is not specified, the stream is empty.&lt;br /&gt;
#* If the &amp;lt;var&amp;gt;options&amp;lt;/var&amp;gt; parameter is not specified or the &#039;&#039;&#039;stream&#039;&#039;&#039; option is &#039;&#039;&#039;false&#039;&#039;&#039;, then after &amp;lt;var&amp;gt;view.byteLength&amp;lt;/var&amp;gt; bytes are yielded by the stream the &#039;&#039;&#039;EOF byte&#039;&#039;&#039; is yielded.&lt;br /&gt;
#* If the internal &amp;lt;var&amp;gt;fatal&amp;lt;/var&amp;gt; flag of the &#039;&#039;decoder object&#039;&#039; is set, then a &#039;&#039;&#039;decoder error&#039;&#039;&#039; causes a &amp;lt;code&amp;gt;DOMException&amp;lt;/code&amp;gt; of type &amp;lt;code&amp;gt;EncodingError&amp;lt;/code&amp;gt; to be thrown rather than emitting a fallback code point.&lt;br /&gt;
#* The output of the algorithm is a &amp;lt;var&amp;gt;sequence of emitted code points&amp;lt;/var&amp;gt;.&lt;br /&gt;
# Return an IDL &amp;lt;code&amp;gt;DOMString&amp;lt;/code&amp;gt; value that represents the sequence of code units resulting from encoding the &amp;lt;var&amp;gt;sequence of emitted code points&amp;lt;/var&amp;gt; as UTF-16.&lt;br /&gt;
&lt;br /&gt;
:&#039;&#039;ISSUE: Need to handle BOMs. [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding] specifies this by using the caller-specified encoding as a suggestion, and consuming the BOM as a part of selecting the real encoder where BOM takes precedence. At the very least a matching BOM should be ignored. A mismatching BOM could throw, be a decoding error, or could actually switch the decoder for this stream (call or sequence of calls), possibly if-and-only-if the constructor was called without an encoding.&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/dl&amp;gt;&lt;br /&gt;
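&lt;br /&gt;
The effect of the &#039;&#039;stream&#039;&#039; and &#039;&#039;fatal&#039;&#039; options can be sketched with the TextDecoder that later shipped in browsers and Node.js. That is an assumption of this illustration and it differs from this draft in one visible way: a fatal decoder error is reported as a TypeError, not as an EncodingError.&lt;br /&gt;

```javascript
// Sketch only: assumes a runtime with the shipped TextDecoder.
const decoder = new TextDecoder("utf-8");

// U+20AC (euro sign) is E2 82 AC in UTF-8; split it across two chunks.
// With stream set, the incomplete sequence is held back for the next call.
const first = decoder.decode(new Uint8Array([0xe2, 0x82]), { stream: true });
// The final call omits the option, so the pending bytes are completed and flushed.
const second = decoder.decode(new Uint8Array([0xac]));

// Without the fatal flag, an invalid byte becomes the fallback code point U+FFFD.
const lenient = new TextDecoder("utf-8").decode(new Uint8Array([0xff]));

// With the fatal flag, the same input raises an exception instead.
const strict = new TextDecoder("utf-8", { fatal: true });
let threw = false;
try {
  strict.decode(new Uint8Array([0xff]));
} catch (e) {
  threw = true;
}

console.log(JSON.stringify(first), JSON.stringify(second), JSON.stringify(lenient), threw);
```

Reusing one decoder object across chunks is what preserves the partial sequence; creating a new decoder for each chunk would turn the split character into fallback code points.&lt;br /&gt;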
&lt;br /&gt;
== Examples ==&lt;br /&gt;
&lt;br /&gt;
=== Example #1 - encoding strings ===&lt;br /&gt;
&lt;br /&gt;
The following example uses the API to encode an array of strings into an ArrayBuffer. The result contains the number of strings (as a Uint32), followed by the byte length of the first encoded string (as a Uint32), the encoded string data, the byte length of the second encoded string (as a Uint32), the string data, and so on.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
function encodeArrayOfStrings(strings, encoding) {&lt;br /&gt;
  var encoder, encoded, len, i, bytes, view, offset;&lt;br /&gt;
&lt;br /&gt;
  encoder = new TextEncoder(encoding);&lt;br /&gt;
  encoded = [];&lt;br /&gt;
&lt;br /&gt;
  len = Uint32Array.BYTES_PER_ELEMENT;&lt;br /&gt;
  for (i = 0; i &amp;lt; strings.length; i += 1) {&lt;br /&gt;
    len += Uint32Array.BYTES_PER_ELEMENT;&lt;br /&gt;
    encoded[i] = encoder.encode(strings[i]);&lt;br /&gt;
    len += encoded[i].byteLength;&lt;br /&gt;
  }&lt;br /&gt;
&lt;br /&gt;
  bytes = new Uint8Array(len);&lt;br /&gt;
  view = new DataView(bytes.buffer);&lt;br /&gt;
  offset = 0;&lt;br /&gt;
&lt;br /&gt;
  view.setUint32(offset, strings.length);&lt;br /&gt;
  offset += Uint32Array.BYTES_PER_ELEMENT;&lt;br /&gt;
  for (i = 0; i &amp;lt; encoded.length; i += 1) {&lt;br /&gt;
    len = encoded[i].byteLength;&lt;br /&gt;
    view.setUint32(offset, len);&lt;br /&gt;
    offset += Uint32Array.BYTES_PER_ELEMENT;&lt;br /&gt;
    bytes.set(encoded[i], offset);&lt;br /&gt;
    offset += len;&lt;br /&gt;
  }&lt;br /&gt;
  return bytes.buffer;&lt;br /&gt;
}&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Example #2 - decoding strings ===&lt;br /&gt;
&lt;br /&gt;
The following example decodes an ArrayBuffer containing data encoded in the format produced by the previous example back into an array of strings.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
function decodeArrayOfStrings(buffer, encoding) {&lt;br /&gt;
  var decoder, view, offset, num_strings, strings, i, len;&lt;br /&gt;
&lt;br /&gt;
  decoder = new TextDecoder(encoding);&lt;br /&gt;
  view = new DataView(buffer);&lt;br /&gt;
  offset = 0;&lt;br /&gt;
  strings = [];&lt;br /&gt;
&lt;br /&gt;
  num_strings = view.getUint32(offset);&lt;br /&gt;
  offset += Uint32Array.BYTES_PER_ELEMENT;&lt;br /&gt;
  for (i = 0; i &amp;lt; num_strings; i += 1) {&lt;br /&gt;
    len = view.getUint32(offset);&lt;br /&gt;
    offset += Uint32Array.BYTES_PER_ELEMENT;&lt;br /&gt;
    strings[i] = decoder.decode(&lt;br /&gt;
      new DataView(view.buffer, offset, len));&lt;br /&gt;
    offset += len;&lt;br /&gt;
  }&lt;br /&gt;
  return strings;&lt;br /&gt;
}&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Encodings ==&lt;br /&gt;
&lt;br /&gt;
Encodings are defined and implemented per [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding]. This implicitly includes the steps to &amp;lt;em&amp;gt;get an encoding&amp;lt;/em&amp;gt; from a string, and the logic for label matching and case-insensitivity.&lt;br /&gt;
&lt;br /&gt;
User agents MUST NOT support any other encodings or labels than those defined in [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding], and MUST support all encodings and labels defined in that specification, with the additions defined below.&lt;br /&gt;
&lt;br /&gt;
:&#039;&#039;NOTE: In [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding], &amp;quot;ascii&amp;quot; is a label for &#039;&#039;&#039;windows-1252&#039;&#039;&#039;; there is no 7-bit-or-raise-exception encoding. Applications that are required to restrict the content of decoded strings should implement validation after decoding.&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
:&#039;&#039;NOTE: Unicode normalization forms are outside the scope of this specification. No normalization is done prior to encoding or after decoding.&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
:&#039;&#039;NOTE: Handling of encoding-specific issues, e.g. over-long UTF-8 encodings, byte order marks, unmatched surrogate pairs, and so on is defined by [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding].&#039;&#039;&lt;br /&gt;
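&lt;br /&gt;
Label matching and the post-decode validation suggested in the notes above can be sketched as follows. The sketch assumes the TextDecoder that later shipped in browsers and Node.js, where an unknown label is rejected with a RangeError; the helper name isSevenBit is hypothetical and not part of any specification.&lt;br /&gt;

```javascript
// Sketch only: assumes a runtime with the shipped TextDecoder.

// Different labels, including differently cased ones, resolve to one encoding name.
const names = ["utf8", "UTF-8"].map((label) => new TextDecoder(label).encoding);

// A label that matches no encoding is rejected by the constructor.
let rejected = false;
try {
  new TextDecoder("no-such-encoding");
} catch (e) {
  rejected = true;
}

// Hypothetical helper: there is no 7-bit-or-raise-exception encoding, so an
// application that needs pure ASCII validates the string after decoding.
function isSevenBit(text) {
  return /^[\x00-\x7f]*$/.test(text);
}

console.log(names, rejected, isSevenBit("abc"), isSevenBit("caf\u00e9"));
```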
&lt;br /&gt;
== References ==&lt;br /&gt;
&lt;br /&gt;
* WebIDL http://dev.w3.org/2006/webapi/WebIDL&lt;br /&gt;
* Encoding http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== Acknowledgements ==&lt;br /&gt;
&lt;br /&gt;
* Alan Chaney&lt;br /&gt;
* Ben Noordhuis&lt;br /&gt;
* Glenn Maynard&lt;br /&gt;
* John Tamplin&lt;br /&gt;
* Kenneth Russell (Google, Inc)&lt;br /&gt;
* Robert Mustacchi&lt;br /&gt;
* Ryan Dahl&lt;br /&gt;
* Anne van Kesteren&lt;br /&gt;
* Cameron McCormack&lt;br /&gt;
&lt;br /&gt;
== Appendix ==&lt;br /&gt;
&lt;br /&gt;
A &amp;quot;shim&amp;quot; implementation in ECMAScript (that may not fully match the current version of the spec) plus some initial unit tests can be found at:&lt;br /&gt;
&lt;br /&gt;
:http://code.google.com/p/stringencoding/&lt;br /&gt;
&lt;br /&gt;
[[Category:Proposals]]&lt;/div&gt;</summary>
		<author><name>Jsbell</name></author>
	</entry>
	<entry>
		<id>https://wiki.whatwg.org/index.php?title=StringEncoding&amp;diff=8365</id>
		<title>StringEncoding</title>
		<link rel="alternate" type="text/html" href="https://wiki.whatwg.org/index.php?title=StringEncoding&amp;diff=8365"/>
		<updated>2012-06-28T15:59:15Z</updated>

		<summary type="html">&lt;p&gt;Jsbell: /* Acknowledgements */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;Proposed Text Encoding Web API for Typed Arrays&lt;br /&gt;
&lt;br /&gt;
== Editors ==&lt;br /&gt;
&lt;br /&gt;
* Joshua Bell (Google, Inc)&lt;br /&gt;
&lt;br /&gt;
== Abstract ==&lt;br /&gt;
&lt;br /&gt;
This specification defines an API for encoding strings to binary data, and decoding strings from binary data.&lt;br /&gt;
&lt;br /&gt;
NOTE: This specification intentionally does not address the opposite scenario of encoding binary data as strings and decoding binary data from strings, for example using Base64 encoding.&lt;br /&gt;
&lt;br /&gt;
Discussion on this topic has so far taken place on the [http://www.khronos.org/webgl/public-mailing-list/ public_webgl@khronos.org] mailing list. See http://www.khronos.org/webgl/public-mailing-list/archives/1111/msg00017.html for the initial discussion thread.&lt;br /&gt;
&lt;br /&gt;
Discussion has since moved to the [http://www.whatwg.org/mailing-list#specs WHATWG spec discussion] mailing list. See  http://lists.whatwg.org/htdig.cgi/whatwg-whatwg.org/2012-March/035038.html for the latest discussion thread.&lt;br /&gt;
&lt;br /&gt;
== Open Issues ==&lt;br /&gt;
&lt;br /&gt;
* Encoding errors&lt;br /&gt;
** Define options dict for &amp;lt;code&amp;gt;TextEncoder()&amp;lt;/code&amp;gt; that allows selecting between throw vs. replacement character / replacement callback?&lt;br /&gt;
* The [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding specification] defines the byte order mark as more authoritative than an explicit encoding label when decoding content. &lt;br /&gt;
** Is that desirable here? &lt;br /&gt;
** What does it mean to write e.g. &amp;lt;code&amp;gt;new TextDecoder(&#039;iso-2022-kr&#039;).decode(view)&amp;lt;/code&amp;gt; if the data turns out to be UTF-8?&lt;br /&gt;
** What about streaming and other re-uses of the same decoder object?&lt;br /&gt;
** At the very least, a matching BOM should be ignored (i.e. not returned as part of the output) and a mismatching BOM should signal an error.&lt;br /&gt;
&lt;br /&gt;
== Notes to Implementers ==&lt;br /&gt;
&lt;br /&gt;
* Streaming decode/encode requires retaining partial buffers between calls.&lt;br /&gt;
** Some encode/decode algorithms require adjusting the &amp;lt;var&amp;gt;code point pointer&amp;lt;/var&amp;gt; or &amp;lt;var&amp;gt;byte pointer&amp;lt;/var&amp;gt; by a negative amount. This could occur across &amp;quot;chunk&amp;quot; boundaries. This implies that when the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag is set on an encoder/decoder, the last N elements of the stream must be saved and used as a prefix for the stream in the next call. N is defined by the specific encoding algorithm.&lt;br /&gt;
** This is not yet implemented in the [http://code.google.com/p/stringencoding/ ECMAScript shim]&lt;br /&gt;
&lt;br /&gt;
== Resolved Issues ==&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dl&amp;gt;&lt;br /&gt;
&amp;lt;dt&amp;gt;Encode as many characters as possible into a fixed-size buffer for transmission, and repeat starting with next unencoded character&lt;br /&gt;
&amp;lt;dd&amp;gt;Resolution: not for &amp;quot;v1&amp;quot; - can be implemented using this API, and breaking the string down by &amp;quot;character&amp;quot; is unlikely to be as obvious as it sounds - surrogate pairs, combining sequences, etc&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dt&amp;gt;Support two versions of encode; one which takes target buffer, and one which creates/returns a right-sized buffer&lt;br /&gt;
&amp;lt;dd&amp;gt;Resolution: See above, and wait for developer feedback.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dt&amp;gt;Allow arbitrary end byte sequences (e.g. &amp;lt;code&amp;gt;0xFF&amp;lt;/code&amp;gt; for UTF-8 strings)&lt;br /&gt;
&amp;lt;dd&amp;gt;Resolution: Add &amp;lt;code&amp;gt;indexOf&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;ArrayBufferView&amp;lt;/code&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dt&amp;gt;Remove &#039;&#039;binary&#039;&#039; encoding? &#039;&#039;(a proposed 8-bit-clean encoding for interop with legacy binary data stored in ECMAScript strings)&#039;&#039;&lt;br /&gt;
&amp;lt;dd&amp;gt;The only real use is with &amp;lt;code&amp;gt;atob()&amp;lt;/code&amp;gt;/&amp;lt;code&amp;gt;btoa()&amp;lt;/code&amp;gt;; a better API would be Base64 directly in/out of Typed Arrays. Consensus on WHATWG is to add better APIs e.g. &lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
partial interface ArrayBufferView {&lt;br /&gt;
    DOMString toBase64();&lt;br /&gt;
};&lt;br /&gt;
&lt;br /&gt;
partial interface ArrayBuffer {&lt;br /&gt;
    static ArrayBuffer fromBase64(DOMString string);&lt;br /&gt;
};&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dt&amp;gt;Should this be a standalone API (as written), or live on e.g. DataView, or on e.g. String?&lt;br /&gt;
&amp;lt;dd&amp;gt;There seems to be pretty strong consensus that streaming, stateful coding is high priority and that an object-oriented API is cleanest, and shoe-horning those onto existing objects would be messy.&lt;br /&gt;
&amp;lt;/dl&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== API ==&lt;br /&gt;
&lt;br /&gt;
=== TextEncoder ===&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;WebIDL&#039;&#039;&#039;&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
dictionary TextEncodeOptions {&lt;br /&gt;
  boolean stream = false;&lt;br /&gt;
};&lt;br /&gt;
&lt;br /&gt;
[Constructor,&lt;br /&gt;
 Constructor(DOMString encoding)]&lt;br /&gt;
interface TextEncoder {&lt;br /&gt;
  readonly attribute DOMString encoding;&lt;br /&gt;
  ArrayBufferView encode(DOMString? string, optional TextEncodeOptions options);&lt;br /&gt;
};&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The constructor runs the following steps:&lt;br /&gt;
# If the constructor is called with no arguments, let &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt; be &amp;quot;&amp;lt;code&amp;gt;utf-8&amp;lt;/code&amp;gt;&amp;quot;. Otherwise, let &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt; be the value of the &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; argument.&lt;br /&gt;
# Run the &#039;&#039;&#039;steps to get an encoding&#039;&#039;&#039; from [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding], with &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt; as &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt;.&lt;br /&gt;
#* If the steps result in failure, throw an &amp;quot;&amp;lt;code&amp;gt;EncodingError&amp;lt;/code&amp;gt;&amp;quot; exception and terminate these steps.&lt;br /&gt;
#* Otherwise, set the encoder object&#039;s internal &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; property to the returned encoding.&lt;br /&gt;
# Initialize the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag of the encoder object to false. &lt;br /&gt;
# Initialize the internal &amp;lt;var&amp;gt;encoding algorithm state&amp;lt;/var&amp;gt; to the default values for the encoding &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dl&amp;gt;&lt;br /&gt;
&amp;lt;dt&amp;gt;&amp;lt;code&amp;gt;encoding&amp;lt;/code&amp;gt; of type DOMString, readonly&lt;br /&gt;
&amp;lt;dd&amp;gt;Returns the &amp;lt;var&amp;gt;Name&amp;lt;/var&amp;gt; of the encoder object&#039;s &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;, per [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding]. &lt;br /&gt;
&lt;br /&gt;
:Note that this may differ from the name of the encoding specified during the call to the constructor. For example, if the constructor is called with &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; of &amp;lt;code&amp;gt;&amp;quot;ascii&amp;quot;&amp;lt;/code&amp;gt; the &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; attribute of the &#039;&#039;encoder object&#039;&#039; would have the value &amp;lt;code&amp;gt;&amp;quot;windows-1252&amp;quot;&amp;lt;/code&amp;gt; as &amp;lt;code&amp;gt;&amp;quot;ascii&amp;quot;&amp;lt;/code&amp;gt; is a label for that encoding.&lt;br /&gt;
&amp;lt;/dl&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dl&amp;gt;&lt;br /&gt;
&amp;lt;dt&amp;gt;&amp;lt;code&amp;gt;encode&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;dd&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The &amp;lt;code&amp;gt;encode&amp;lt;/code&amp;gt; method runs these steps:&lt;br /&gt;
&lt;br /&gt;
# If the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag is not set, then reset the &amp;lt;var&amp;gt;encoding algorithm state&amp;lt;/var&amp;gt; to the default values for &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;. Otherwise, the &amp;lt;var&amp;gt;encoding algorithm state&amp;lt;/var&amp;gt; is re-used from the previous call to &amp;lt;code&amp;gt;encode&amp;lt;/code&amp;gt; on this object.&lt;br /&gt;
# If the &amp;lt;var&amp;gt;options&amp;lt;/var&amp;gt; parameter is specified and the &amp;lt;code&amp;gt;stream&amp;lt;/code&amp;gt; option is &#039;&#039;&#039;true&#039;&#039;&#039;, then the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag is set; otherwise the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag is cleared.&lt;br /&gt;
# Run the steps of the &amp;lt;var&amp;gt;encoding algorithm&amp;lt;/var&amp;gt;:&lt;br /&gt;
#* The input to the algorithm is a &amp;lt;var&amp;gt;stream of code points&amp;lt;/var&amp;gt;. The stream is composed of the Unicode code points for the Unicode characters produced by following the [http://dev.w3.org/2006/webapi/WebIDL/#dfn-obtain-unicode steps to convert a DOMString to a sequence of Unicode characters] in [http://dev.w3.org/2006/webapi/WebIDL/ WebIDL] with &amp;lt;var&amp;gt;string&amp;lt;/var&amp;gt; as the input. If &amp;lt;var&amp;gt;string&amp;lt;/var&amp;gt; is null, the &amp;lt;var&amp;gt;stream of code points&amp;lt;/var&amp;gt; is empty.&lt;br /&gt;
#* If the &amp;lt;var&amp;gt;options&amp;lt;/var&amp;gt; parameter is not specified or the &amp;lt;code&amp;gt;stream&amp;lt;/code&amp;gt; option is false, then after the final code point is yielded by the stream, the &#039;&#039;&#039;EOF code point&#039;&#039;&#039; is yielded.&lt;br /&gt;
#* The output of the algorithm is a &amp;lt;var&amp;gt;sequence of emitted bytes&amp;lt;/var&amp;gt;.&lt;br /&gt;
# Return a &amp;lt;code&amp;gt;Uint8Array&amp;lt;/code&amp;gt; object wrapping an &amp;lt;code&amp;gt;ArrayBuffer&amp;lt;/code&amp;gt; containing the &amp;lt;var&amp;gt;sequence of emitted bytes&amp;lt;/var&amp;gt; produced by the encoding algorithm.&lt;br /&gt;
&amp;lt;/dl&amp;gt;&lt;br /&gt;
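&lt;br /&gt;
As a rough illustration of the steps above, the sketch below calls an encoder from script. It assumes the &amp;lt;code&amp;gt;TextEncoder&amp;lt;/code&amp;gt; global that later shipped in browsers and Node.js, which always encodes as UTF-8 and may differ in other details from the interface drafted here.&lt;br /&gt;
&lt;br /&gt;
```javascript
// Sketch only: uses the TextEncoder global found in current browsers and Node.js.
const encoder = new TextEncoder(); // no argument: the label defaults to "utf-8"

// "h" is one byte in UTF-8; "\u00e9" (e with acute accent) is two bytes.
const bytes = encoder.encode("h\u00e9");

console.log(encoder.encoding);            // "utf-8"
console.log(bytes instanceof Uint8Array); // true
console.log(Array.from(bytes));           // [ 104, 195, 169 ]
```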
&lt;br /&gt;
=== TextDecoder ===&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;WebIDL&#039;&#039;&#039;&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
dictionary TextDecoderOptions {&lt;br /&gt;
  boolean fatal = false;&lt;br /&gt;
};&lt;br /&gt;
&lt;br /&gt;
dictionary TextDecodeOptions {&lt;br /&gt;
  boolean stream = false;&lt;br /&gt;
};&lt;br /&gt;
&lt;br /&gt;
[Constructor,&lt;br /&gt;
 Constructor(optional DOMString encoding, optional TextDecoderOptions options)]&lt;br /&gt;
interface TextDecoder {&lt;br /&gt;
  readonly attribute DOMString encoding;&lt;br /&gt;
  DOMString decode(optional ArrayBufferView view, optional TextDecodeOptions options);&lt;br /&gt;
};&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The constructor runs the following steps:&lt;br /&gt;
# The constructor creates a &#039;&#039;decoder object&#039;&#039;. It has the internal properties &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;, a &amp;lt;var&amp;gt;fatal&amp;lt;/var&amp;gt; flag which is initially unset, and a &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag which is initially unset.&lt;br /&gt;
# If called without an &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; argument, let &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt; be &amp;quot;&amp;lt;code&amp;gt;utf-8&amp;lt;/code&amp;gt;&amp;quot;. Otherwise, let &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt; be the value of the &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; argument.&lt;br /&gt;
# Run the &#039;&#039;&#039;steps to get an encoding&#039;&#039;&#039; from [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding], with &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt; as &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt;. &lt;br /&gt;
## If the steps result in failure, throw an &amp;quot;&amp;lt;code&amp;gt;EncodingError&amp;lt;/code&amp;gt;&amp;quot; exception and terminate these steps.&lt;br /&gt;
## Otherwise, set the &#039;&#039;decoder object&#039;s&#039;&#039; internal &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; property to the returned encoding.&lt;br /&gt;
# If the constructor is called with an &amp;lt;var&amp;gt;options&amp;lt;/var&amp;gt; argument, and the &amp;lt;var&amp;gt;fatal&amp;lt;/var&amp;gt; property of the dictionary is set, set the internal &amp;lt;var&amp;gt;fatal&amp;lt;/var&amp;gt; flag of the &#039;&#039;decoder object&#039;&#039;.&lt;br /&gt;
# Initialize the internal &amp;lt;var&amp;gt;encoding algorithm state&amp;lt;/var&amp;gt; to the default values for the encoding &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;.&lt;br /&gt;
# Return the &#039;&#039;decoder object&#039;&#039;.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dl&amp;gt;&lt;br /&gt;
&amp;lt;dt&amp;gt;&amp;lt;code&amp;gt;encoding&amp;lt;/code&amp;gt; of type DOMString, readonly&lt;br /&gt;
&amp;lt;dd&amp;gt;&lt;br /&gt;
Returns the &amp;lt;var&amp;gt;Name&amp;lt;/var&amp;gt; of the decoder object&#039;s &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;, per [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding].&lt;br /&gt;
&lt;br /&gt;
:Note that this may differ from the name of the encoding specified during the call to the constructor. For example, if the constructor is called with &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; of &amp;lt;code&amp;gt;&amp;quot;ascii&amp;quot;&amp;lt;/code&amp;gt; the &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; attribute of the &#039;&#039;decoder object&#039;&#039; would have the value &amp;lt;code&amp;gt;&amp;quot;windows-1252&amp;quot;&amp;lt;/code&amp;gt; as &amp;lt;code&amp;gt;&amp;quot;ascii&amp;quot;&amp;lt;/code&amp;gt; is a label for that encoding.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dt&amp;gt;&amp;lt;code&amp;gt;decode&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;dd&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The &amp;lt;code&amp;gt;decode&amp;lt;/code&amp;gt; method runs the following steps:&lt;br /&gt;
&lt;br /&gt;
# If the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag of the &#039;&#039;decoder object&#039;&#039; is not set, then reset the &amp;lt;var&amp;gt;encoding algorithm state&amp;lt;/var&amp;gt; to the default values for encoding &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;. Otherwise, the &amp;lt;var&amp;gt;encoding algorithm state&amp;lt;/var&amp;gt; is re-used from the previous call to &amp;lt;code&amp;gt;decode&amp;lt;/code&amp;gt; on this object. &lt;br /&gt;
# If the &amp;lt;var&amp;gt;options&amp;lt;/var&amp;gt; parameter is specified and the &amp;lt;var&amp;gt;stream&amp;lt;/var&amp;gt; option is &#039;&#039;&#039;true&#039;&#039;&#039;, then the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag is set; otherwise the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag is cleared.&lt;br /&gt;
# Run the decoder algorithm of the &#039;&#039;decoder object&#039;s&#039;&#039; &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;:&lt;br /&gt;
#* The input to the algorithm is a &amp;lt;var&amp;gt;byte stream&amp;lt;/var&amp;gt;. The &amp;lt;var&amp;gt;byte stream&amp;lt;/var&amp;gt; is provided by the bytes in &amp;lt;var&amp;gt;view.buffer&amp;lt;/var&amp;gt; starting at offset &amp;lt;var&amp;gt;view.byteOffset&amp;lt;/var&amp;gt;. A maximum of &amp;lt;var&amp;gt;view.byteLength&amp;lt;/var&amp;gt; bytes are yielded by the stream from &amp;lt;var&amp;gt;view.buffer&amp;lt;/var&amp;gt;. If &amp;lt;var&amp;gt;view&amp;lt;/var&amp;gt; is not specified, the stream is empty.&lt;br /&gt;
#* If the &amp;lt;var&amp;gt;options&amp;lt;/var&amp;gt; parameter is not specified or the &#039;&#039;&#039;stream&#039;&#039;&#039; option is &#039;&#039;&#039;false&#039;&#039;&#039;, then after &amp;lt;var&amp;gt;view.byteLength&amp;lt;/var&amp;gt; bytes are yielded by the stream, the &#039;&#039;&#039;EOF byte&#039;&#039;&#039; is yielded.&lt;br /&gt;
#* If the internal &amp;lt;var&amp;gt;fatal&amp;lt;/var&amp;gt; flag of the &#039;&#039;decoder object&#039;&#039; is set, then a &#039;&#039;&#039;decoder error&#039;&#039;&#039; causes a &amp;lt;code&amp;gt;DOMException&amp;lt;/code&amp;gt; of type &amp;lt;code&amp;gt;EncodingError&amp;lt;/code&amp;gt; to be thrown rather than emitting a fallback code point.&lt;br /&gt;
#* The output of the algorithm is a &amp;lt;var&amp;gt;sequence of emitted code points&amp;lt;/var&amp;gt;.&lt;br /&gt;
# Return an IDL &amp;lt;code&amp;gt;DOMString&amp;lt;/code&amp;gt; value that represents the sequence of code units resulting from encoding the &amp;lt;var&amp;gt;sequence of emitted code points&amp;lt;/var&amp;gt; as UTF-16.&lt;br /&gt;
&lt;br /&gt;
:&#039;&#039;ISSUE: Need to handle BOMs. [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding] specifies this by using the caller-specified encoding as a suggestion, and consuming the BOM as a part of selecting the real encoder where BOM takes precedence. At the very least a matching BOM should be ignored. A mismatching BOM could throw, be a decoding error, or could actually switch the decoder for this stream (call or sequence of calls), possibly if-and-only-if the constructor was called without an encoding.&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/dl&amp;gt;&lt;br /&gt;
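&lt;br /&gt;
The &amp;lt;code&amp;gt;stream&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;fatal&amp;lt;/code&amp;gt; options described above can be sketched as follows. This assumes the &amp;lt;code&amp;gt;TextDecoder&amp;lt;/code&amp;gt; global that later shipped in browsers and Node.js; in those implementations a fatal decoding error surfaces as a &amp;lt;code&amp;gt;TypeError&amp;lt;/code&amp;gt; rather than the &amp;lt;code&amp;gt;EncodingError&amp;lt;/code&amp;gt; drafted here. The helper &amp;lt;code&amp;gt;decodeStrict&amp;lt;/code&amp;gt; is hypothetical and exists only for this illustration.&lt;br /&gt;
&lt;br /&gt;
```javascript
// Sketch only: uses the TextDecoder global found in current browsers and Node.js.
const euro = new Uint8Array([0xe2, 0x82, 0xac]); // U+20AC EURO SIGN in UTF-8

// Streaming: the three-byte sequence is split across two calls.
const streaming = new TextDecoder("utf-8");
const first = streaming.decode(euro.subarray(0, 2), { stream: true }); // incomplete, buffered
const second = streaming.decode(euro.subarray(2)); // no stream option: end of input

// Fatal: malformed input throws instead of emitting U+FFFD.
// decodeStrict is a hypothetical helper for this example.
function decodeStrict(view) {
  try {
    return new TextDecoder("utf-8", { fatal: true }).decode(view);
  } catch (e) {
    return null; // shipped implementations throw a TypeError here
  }
}

const lenient = new TextDecoder("utf-8").decode(new Uint8Array([0xff]));
const strict = decodeStrict(new Uint8Array([0xff]));

console.log(JSON.stringify(first)); // ""
console.log(second === "\u20ac");   // true
console.log(lenient === "\ufffd");  // true
console.log(strict);                // null
```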
&lt;br /&gt;
== Examples ==&lt;br /&gt;
&lt;br /&gt;
=== Example #1 - encoding strings ===&lt;br /&gt;
&lt;br /&gt;
The following example uses the API to encode an array of strings into an ArrayBuffer. The result is an ArrayBuffer containing the number of strings (as a Uint32), followed by the byte length of the first encoded string (as a Uint32), the encoded string data, the byte length of the second encoded string (as a Uint32), the string data, and so on. The strings are encoded with the encoding passed as the second argument, which defaults to UTF-8 when omitted.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
function encodeArrayOfStrings(strings, encoding) {&lt;br /&gt;
  var encoder, encoded, len, i, bytes, view, offset;&lt;br /&gt;
&lt;br /&gt;
  encoder = new TextEncoder(encoding);&lt;br /&gt;
  encoded = [];&lt;br /&gt;
&lt;br /&gt;
  // First pass: encode each string and total the bytes needed&lt;br /&gt;
  // (one Uint32 for the count, plus one Uint32 length prefix per string).&lt;br /&gt;
  len = Uint32Array.BYTES_PER_ELEMENT;&lt;br /&gt;
  for (i = 0; i &amp;lt; strings.length; i += 1) {&lt;br /&gt;
    len += Uint32Array.BYTES_PER_ELEMENT;&lt;br /&gt;
    encoded[i] = encoder.encode(strings[i]);&lt;br /&gt;
    len += encoded[i].byteLength;&lt;br /&gt;
  }&lt;br /&gt;
&lt;br /&gt;
  bytes = new Uint8Array(len);&lt;br /&gt;
  view = new DataView(bytes.buffer);&lt;br /&gt;
  offset = 0;&lt;br /&gt;
&lt;br /&gt;
  view.setUint32(offset, strings.length);&lt;br /&gt;
  offset += Uint32Array.BYTES_PER_ELEMENT;&lt;br /&gt;
  for (i = 0; i &amp;lt; encoded.length; i += 1) {&lt;br /&gt;
    len = encoded[i].byteLength;&lt;br /&gt;
    view.setUint32(offset, len);&lt;br /&gt;
    offset += Uint32Array.BYTES_PER_ELEMENT;&lt;br /&gt;
    bytes.set(encoded[i], offset);&lt;br /&gt;
    offset += len;&lt;br /&gt;
  }&lt;br /&gt;
  return bytes.buffer;&lt;br /&gt;
}&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Example #2 - decoding strings ===&lt;br /&gt;
&lt;br /&gt;
The following example decodes an ArrayBuffer containing data encoded in the format produced by the previous example back into an array of strings.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
function decodeArrayOfStrings(buffer, encoding) {&lt;br /&gt;
  var decoder, view, offset, num_strings, strings, i, len;&lt;br /&gt;
&lt;br /&gt;
  decoder = new TextDecoder(encoding);&lt;br /&gt;
  view = new DataView(buffer);&lt;br /&gt;
  offset = 0;&lt;br /&gt;
  strings = [];&lt;br /&gt;
&lt;br /&gt;
  num_strings = view.getUint32(offset);&lt;br /&gt;
  offset += Uint32Array.BYTES_PER_ELEMENT;&lt;br /&gt;
  for (i = 0; i &amp;lt; num_strings; i += 1) {&lt;br /&gt;
    len = view.getUint32(offset);&lt;br /&gt;
    offset += Uint32Array.BYTES_PER_ELEMENT;&lt;br /&gt;
    strings[i] = decoder.decode(&lt;br /&gt;
      new DataView(view.buffer, offset, len));&lt;br /&gt;
    offset += len;&lt;br /&gt;
  }&lt;br /&gt;
  return strings;&lt;br /&gt;
}&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
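&lt;br /&gt;
To check that the two examples are inverses of each other, a round trip can be sketched as below. This is a compact restatement under stated assumptions, not the code above: it targets the &amp;lt;code&amp;gt;TextEncoder&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;TextDecoder&amp;lt;/code&amp;gt; globals that later shipped (where the encoder is UTF-8 only), and the names &amp;lt;code&amp;gt;encodeStrings&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;decodeStrings&amp;lt;/code&amp;gt; are hypothetical.&lt;br /&gt;
&lt;br /&gt;
```javascript
// Sketch only: compact, UTF-8-only variants of the two examples above.
function encodeStrings(strings) {
  const encoder = new TextEncoder();
  const chunks = strings.map((s) => encoder.encode(s));
  // 4 bytes for the count, plus a 4-byte length prefix and the data per string.
  const total = chunks.reduce((n, c) => n + 4 + c.byteLength, 4);

  const bytes = new Uint8Array(total);
  const view = new DataView(bytes.buffer);
  let offset = 0;
  view.setUint32(offset, chunks.length); // DataView writes big-endian by default
  offset += 4;
  for (const chunk of chunks) {
    view.setUint32(offset, chunk.byteLength);
    offset += 4;
    bytes.set(chunk, offset);
    offset += chunk.byteLength;
  }
  return bytes.buffer;
}

function decodeStrings(buffer) {
  const decoder = new TextDecoder("utf-8");
  const view = new DataView(buffer);
  const strings = [];
  let offset = 4;
  for (let remaining = view.getUint32(0); remaining > 0; remaining -= 1) {
    const len = view.getUint32(offset);
    offset += 4;
    strings.push(decoder.decode(new Uint8Array(buffer, offset, len)));
    offset += len;
  }
  return strings;
}

const input = ["abc", "\u00e9t\u00e9", ""];
const buffer = encodeStrings(input);
const output = decodeStrings(buffer);

console.log(buffer.byteLength); // 24
console.log(output.length);     // 3
```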
&lt;br /&gt;
== Encodings ==&lt;br /&gt;
&lt;br /&gt;
Encodings are defined and implemented per [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding]. This implicitly includes the steps to &amp;lt;em&amp;gt;get an encoding&amp;lt;/em&amp;gt; from a string, and the logic for label matching and case-insensitivity.&lt;br /&gt;
&lt;br /&gt;
User agents MUST NOT support any other encodings or labels than those defined in [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding], and MUST support all encodings and labels defined in that specification, with the additions defined below.&lt;br /&gt;
&lt;br /&gt;
:&#039;&#039;NOTE: In [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding], &amp;quot;ascii&amp;quot; is a label for &#039;&#039;&#039;windows-1252&#039;&#039;&#039;; there is no 7-bit-or-raise-exception encoding. Applications that are required to restrict the content of decoded strings should implement validation after decoding.&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
:&#039;&#039;NOTE: Unicode normalization forms are outside the scope of this specification. No normalization is done prior to encoding or after decoding.&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
:&#039;&#039;NOTE: Handling of encoding-specific issues, e.g. over-long UTF-8 encodings, byte order marks, unmatched surrogate pairs, and so on is defined by [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding].&#039;&#039;&lt;br /&gt;
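&lt;br /&gt;
The first note above recommends validating after decoding when an application must restrict decoded content. A minimal sketch of such a check follows, together with case-insensitive label matching. It assumes the &amp;lt;code&amp;gt;TextDecoder&amp;lt;/code&amp;gt; global that later shipped in browsers and Node.js; &amp;lt;code&amp;gt;isSevenBit&amp;lt;/code&amp;gt; is a hypothetical helper, not part of any API.&lt;br /&gt;
&lt;br /&gt;
```javascript
// Sketch only: isSevenBit is a hypothetical helper, not part of any API.
function isSevenBit(text) {
  // True when every code unit is in the range U+0000..U+007F.
  return /^[\x00-\x7f]*$/.test(text);
}

// Labels are matched case-insensitively, so "UTF8" selects utf-8.
const decoder = new TextDecoder("UTF8");
const plain = decoder.decode(new Uint8Array([0x6f, 0x6b]));    // "ok"
const accented = decoder.decode(new Uint8Array([0xc3, 0xa9])); // "\u00e9"

console.log(decoder.encoding);     // "utf-8"
console.log(isSevenBit(plain));    // true
console.log(isSevenBit(accented)); // false
```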
&lt;br /&gt;
== References ==&lt;br /&gt;
&lt;br /&gt;
* WebIDL http://dev.w3.org/2006/webapi/WebIDL&lt;br /&gt;
* Encoding http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== Acknowledgements ==&lt;br /&gt;
&lt;br /&gt;
* Alan Chaney&lt;br /&gt;
* Ben Noordhuis&lt;br /&gt;
* Glenn Maynard&lt;br /&gt;
* John Tamplin&lt;br /&gt;
* Kenneth Russell (Google, Inc)&lt;br /&gt;
* Robert Mustacchi&lt;br /&gt;
* Ryan Dahl&lt;br /&gt;
* Anne van Kesteren&lt;br /&gt;
* Cameron McCormack&lt;br /&gt;
&lt;br /&gt;
== Appendix ==&lt;br /&gt;
&lt;br /&gt;
A &amp;quot;shim&amp;quot; implementation in ECMAScript (that may not fully match the current version of the spec) plus some initial unit tests can be found at:&lt;br /&gt;
&lt;br /&gt;
:http://code.google.com/p/stringencoding/&lt;br /&gt;
&lt;br /&gt;
[[Category:Proposals]]&lt;/div&gt;</summary>
		<author><name>Jsbell</name></author>
	</entry>
	<entry>
		<id>https://wiki.whatwg.org/index.php?title=StringEncoding&amp;diff=8364</id>
		<title>StringEncoding</title>
		<link rel="alternate" type="text/html" href="https://wiki.whatwg.org/index.php?title=StringEncoding&amp;diff=8364"/>
		<updated>2012-06-28T15:58:06Z</updated>

		<summary type="html">&lt;p&gt;Jsbell: /* Notes and TODOs */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;Proposed Text Encoding Web API for Typed Arrays&lt;br /&gt;
&lt;br /&gt;
== Editors ==&lt;br /&gt;
&lt;br /&gt;
* Joshua Bell (Google, Inc)&lt;br /&gt;
&lt;br /&gt;
== Abstract ==&lt;br /&gt;
&lt;br /&gt;
This specification defines an API for encoding strings to binary data, and decoding strings from binary data.&lt;br /&gt;
&lt;br /&gt;
NOTE: This specification intentionally does not address the opposite scenario of encoding binary data as strings and decoding binary data from strings, for example using Base64 encoding.&lt;br /&gt;
&lt;br /&gt;
Discussion on this topic has so far taken place on the [http://www.khronos.org/webgl/public-mailing-list/ public_webgl@khronos.org] mailing list. See http://www.khronos.org/webgl/public-mailing-list/archives/1111/msg00017.html for the initial discussion thread.&lt;br /&gt;
&lt;br /&gt;
Discussion has since moved to the [http://www.whatwg.org/mailing-list#specs WHATWG spec discussion] mailing list. See  http://lists.whatwg.org/htdig.cgi/whatwg-whatwg.org/2012-March/035038.html for the latest discussion thread.&lt;br /&gt;
&lt;br /&gt;
== Open Issues ==&lt;br /&gt;
&lt;br /&gt;
* Encoding errors&lt;br /&gt;
** Define options dict for &amp;lt;code&amp;gt;TextEncoder()&amp;lt;/code&amp;gt; that allows selecting between throw vs. replacement character / replacement callback?&lt;br /&gt;
* The [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding specification] defines the byte order mark as more authoritative than an explicit encoding label when decoding content. &lt;br /&gt;
** Is that desirable here? &lt;br /&gt;
** What does it mean to e.g. write &amp;lt;code&amp;gt;TextDecoder(&#039;iso-2022-kr&#039;).decode(str)&amp;lt;/code&amp;gt; if string turns out to be UTF-8?&lt;br /&gt;
** What about streaming and other re-uses of the same decoder object?&lt;br /&gt;
** At the very least, a matching BOM should be ignored (i.e. not returned as part of the output) and a mismatching BOM should signal an error.&lt;br /&gt;
&lt;br /&gt;
== Notes to Implementers ==&lt;br /&gt;
&lt;br /&gt;
* Streaming decode/encode requires retaining partial buffers between calls.&lt;br /&gt;
** Some encode/decode algorithms require adjusting the &amp;lt;var&amp;gt;code point pointer&amp;lt;/var&amp;gt; or &amp;lt;var&amp;gt;byte pointer&amp;lt;/var&amp;gt; by a negative amount. This could occur across &amp;quot;chunk&amp;quot; boundaries. This implies that when the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag is set on an encoder/decoder that the last N elements of the stream are saved for the next call and used as a prefix for the stream. N is defined by the specific encoding algorithm.&lt;br /&gt;
** This is not yet implemented in the [http://code.google.com/p/stringencoding/ ECMAScript shim]&lt;br /&gt;
&lt;br /&gt;
== Resolved Issues ==&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dl&amp;gt;&lt;br /&gt;
&amp;lt;dt&amp;gt;Encode as many characters as possible into a fixed-size buffer for transmission, and repeat starting with next unencoded character&lt;br /&gt;
&amp;lt;dd&amp;gt;Resolution: not for &amp;quot;v1&amp;quot; - can be implemented using this API, and breaking the string down by &amp;quot;character&amp;quot; is unlikely to be as obvious as it sounds - surrogate pairs, combining sequences, etc&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dt&amp;gt;Support two versions of encode; one which takes target buffer, and one which creates/returns a right-sized buffer&lt;br /&gt;
&amp;lt;dd&amp;gt;Resolution: See above, and wait for developer feedback.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dt&amp;gt;Allow arbitrary end byte sequences (e.g. &amp;lt;code&amp;gt;0xFF&amp;lt;/code&amp;gt; for UTF-8 strings)&lt;br /&gt;
&amp;lt;dd&amp;gt;Resolution: Add &amp;lt;code&amp;gt;indexOf&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;ArrayBufferView&amp;lt;/code&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dt&amp;gt;Remove &#039;&#039;binary&#039;&#039; encoding? &#039;&#039;(a proposed 8-bit-clean encoding for interop with legacy binary data stored in ECMAScript strings)&#039;&#039;&lt;br /&gt;
&amp;lt;dd&amp;gt;The only real use is with &amp;lt;code&amp;gt;atob()&amp;lt;/code&amp;gt;/&amp;lt;code&amp;gt;btoa()&amp;lt;/code&amp;gt;; a better API would be Base64 directly in/out of Typed Arrays. Consensus on WHATWG is to add better APIs e.g. &lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
partial interface ArrayBufferView {&lt;br /&gt;
    DOMString toBase64();&lt;br /&gt;
};&lt;br /&gt;
&lt;br /&gt;
partial interface ArrayBuffer {&lt;br /&gt;
    static ArrayBuffer fromBase64(DOMString string);&lt;br /&gt;
};&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dt&amp;gt;Should this be a standalone API (as written), or live on e.g. DataView, or on e.g. String?&lt;br /&gt;
&amp;lt;dd&amp;gt;There seems to be pretty strong consensus that streaming, stateful coding is high priority and that an object-oriented API is cleanest, and shoe-horning those onto existing objects would be messy.&lt;br /&gt;
&amp;lt;/dl&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== API ==&lt;br /&gt;
&lt;br /&gt;
=== TextEncoder ===&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;WebIDL&#039;&#039;&#039;&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
dictionary TextEncodeOptions {&lt;br /&gt;
  boolean stream = false;&lt;br /&gt;
};&lt;br /&gt;
&lt;br /&gt;
[Constructor,&lt;br /&gt;
 Constructor(DOMString encoding)]&lt;br /&gt;
interface TextEncoder {&lt;br /&gt;
  readonly attribute DOMString encoding;&lt;br /&gt;
  ArrayBufferView encode(DOMString? string, optional TextEncodeOptions options);&lt;br /&gt;
};&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The constructor runs the following steps:&lt;br /&gt;
# If the constructor is called with no arguments, let &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt; be &amp;quot;&amp;lt;code&amp;gt;utf-8&amp;lt;/code&amp;gt;&amp;quot;. Otherwise, let &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt; be the value of the &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; argument.&lt;br /&gt;
# Run the &#039;&#039;&#039;steps to get an encoding&#039;&#039;&#039; from [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding], with &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt; as &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt;.&lt;br /&gt;
#* If the steps result in failure, throw an &amp;quot;&amp;lt;code&amp;gt;EncodingError&amp;lt;/code&amp;gt;&amp;quot; exception and terminate these steps.&lt;br /&gt;
#* Otherwise, set the encoder object&#039;s internal &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; property to the returned encoding.&lt;br /&gt;
# Initialize the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag of the encoder object to false. &lt;br /&gt;
# Initialize the internal &amp;lt;var&amp;gt;encoding algorithm state&amp;lt;/var&amp;gt; to the default values for the encoding &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dl&amp;gt;&lt;br /&gt;
&amp;lt;dt&amp;gt;&amp;lt;code&amp;gt;encoding&amp;lt;/code&amp;gt; of type DOMString, readonly&lt;br /&gt;
&amp;lt;dd&amp;gt;Returns the &amp;lt;var&amp;gt;Name&amp;lt;/var&amp;gt; of the encoder object&#039;s &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;, per [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding]. &lt;br /&gt;
&lt;br /&gt;
:Note that this may differ from the name of the encoding specified during the call to the constructor. For example, if the constructor is called with &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; of &amp;lt;code&amp;gt;&amp;quot;ascii&amp;quot;&amp;lt;/code&amp;gt; the &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; attribute of the &#039;&#039;encoder object&#039;&#039; would have the value &amp;lt;code&amp;gt;&amp;quot;windows-1252&amp;quot;&amp;lt;/code&amp;gt; as &amp;lt;code&amp;gt;&amp;quot;ascii&amp;quot;&amp;lt;/code&amp;gt; is a label for that encoding.&lt;br /&gt;
&amp;lt;/dl&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dl&amp;gt;&lt;br /&gt;
&amp;lt;dt&amp;gt;&amp;lt;code&amp;gt;encode&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;dd&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The &amp;lt;code&amp;gt;encode&amp;lt;/code&amp;gt; method runs these steps:&lt;br /&gt;
&lt;br /&gt;
# If the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag is not set, then reset the &amp;lt;var&amp;gt;encoding algorithm state&amp;lt;/var&amp;gt; to the default values for &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;. Otherwise, the &amp;lt;var&amp;gt;encoding algorithm state&amp;lt;/var&amp;gt; is re-used from the previous call to &amp;lt;code&amp;gt;encode&amp;lt;/code&amp;gt; on this object.&lt;br /&gt;
# If the &amp;lt;var&amp;gt;options&amp;lt;/var&amp;gt; parameter is specified and the &amp;lt;code&amp;gt;stream&amp;lt;/code&amp;gt; option is &#039;&#039;&#039;true&#039;&#039;&#039;, then the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag is set; otherwise the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag is cleared.&lt;br /&gt;
# Run the steps of the &amp;lt;var&amp;gt;encoding algorithm&amp;lt;/var&amp;gt;:&lt;br /&gt;
#* The input to the algorithm is a &amp;lt;var&amp;gt;stream of code points&amp;lt;/var&amp;gt;. The stream is composed of the Unicode code points for the Unicode characters produced by following the [http://dev.w3.org/2006/webapi/WebIDL/#dfn-obtain-unicode steps to convert a DOMString to a sequence of Unicode characters] in [http://dev.w3.org/2006/webapi/WebIDL/ WebIDL] with &amp;lt;var&amp;gt;string&amp;lt;/var&amp;gt; as the input. If &amp;lt;var&amp;gt;string&amp;lt;/var&amp;gt; is null, the &amp;lt;var&amp;gt;stream of code points&amp;lt;/var&amp;gt; is empty.&lt;br /&gt;
#* If the &amp;lt;var&amp;gt;options&amp;lt;/var&amp;gt; parameter is not specified or the &amp;lt;code&amp;gt;stream&amp;lt;/code&amp;gt; option is false, then after the final code point is yielded by the stream, the &#039;&#039;&#039;EOF code point&#039;&#039;&#039; is yielded.&lt;br /&gt;
#* The output of the algorithm is a &amp;lt;var&amp;gt;sequence of emitted bytes&amp;lt;/var&amp;gt;.&lt;br /&gt;
# Return a &amp;lt;code&amp;gt;Uint8Array&amp;lt;/code&amp;gt; object wrapping an &amp;lt;code&amp;gt;ArrayBuffer&amp;lt;/code&amp;gt; containing the &amp;lt;var&amp;gt;sequence of emitted bytes&amp;lt;/var&amp;gt; produced by the encoding algorithm.&lt;br /&gt;
&amp;lt;/dl&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== TextDecoder ===&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;WebIDL&#039;&#039;&#039;&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
dictionary TextDecoderOptions {&lt;br /&gt;
  boolean fatal = false;&lt;br /&gt;
};&lt;br /&gt;
&lt;br /&gt;
dictionary TextDecodeOptions {&lt;br /&gt;
  boolean stream = false;&lt;br /&gt;
};&lt;br /&gt;
&lt;br /&gt;
[Constructor,&lt;br /&gt;
 Constructor(optional DOMString encoding, optional TextDecoderOptions options)]&lt;br /&gt;
interface TextDecoder {&lt;br /&gt;
  readonly attribute DOMString encoding;&lt;br /&gt;
  DOMString decode(optional ArrayBufferView view, optional TextDecodeOptions options);&lt;br /&gt;
};&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The constructor runs the following steps:&lt;br /&gt;
# The constructor creates a &#039;&#039;decoder object&#039;&#039;. It has the internal properties &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;, a &amp;lt;var&amp;gt;fatal&amp;lt;/var&amp;gt; flag which is initially unset, and a &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag which is initially unset.&lt;br /&gt;
# If called without an &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; argument, let &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt; be &amp;quot;&amp;lt;code&amp;gt;utf-8&amp;lt;/code&amp;gt;&amp;quot;. Otherwise, let &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt; be the value of the &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; argument.&lt;br /&gt;
# Run the &#039;&#039;&#039;steps to get an encoding&#039;&#039;&#039; from [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding], with &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt; as &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt;. &lt;br /&gt;
## If the steps result in failure, throw an &amp;quot;&amp;lt;code&amp;gt;EncodingError&amp;lt;/code&amp;gt;&amp;quot; exception and terminate these steps.&lt;br /&gt;
## Otherwise, set the &#039;&#039;decoder object&#039;s&#039;&#039; internal &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; property to the returned encoding.&lt;br /&gt;
# If the constructor is called with an &amp;lt;var&amp;gt;options&amp;lt;/var&amp;gt; argument, and the &amp;lt;var&amp;gt;fatal&amp;lt;/var&amp;gt; property of the dictionary is set, set the internal &amp;lt;var&amp;gt;fatal&amp;lt;/var&amp;gt; flag of the &#039;&#039;decoder object&#039;&#039;.&lt;br /&gt;
# Initialize the internal &amp;lt;var&amp;gt;encoding algorithm state&amp;lt;/var&amp;gt; to the default values for the encoding &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;.&lt;br /&gt;
# Return the &#039;&#039;decoder object&#039;&#039;.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dl&amp;gt;&lt;br /&gt;
&amp;lt;dt&amp;gt;&amp;lt;code&amp;gt;encoding&amp;lt;/code&amp;gt; of type DOMString, readonly&lt;br /&gt;
&amp;lt;dd&amp;gt;&lt;br /&gt;
Returns the &amp;lt;var&amp;gt;Name&amp;lt;/var&amp;gt; of the decoder object&#039;s &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;, per [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding].&lt;br /&gt;
&lt;br /&gt;
:Note that this may differ from the name of the encoding specified during the call to the constructor. For example, if the constructor is called with &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; of &amp;lt;code&amp;gt;&amp;quot;ascii&amp;quot;&amp;lt;/code&amp;gt; the &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; attribute of the &#039;&#039;decoder object&#039;&#039; would have the value &amp;lt;code&amp;gt;&amp;quot;windows-1252&amp;quot;&amp;lt;/code&amp;gt; as &amp;lt;code&amp;gt;&amp;quot;ascii&amp;quot;&amp;lt;/code&amp;gt; is a label for that encoding.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dt&amp;gt;&amp;lt;code&amp;gt;decode&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;dd&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The &amp;lt;code&amp;gt;decode&amp;lt;/code&amp;gt; method runs the following steps:&lt;br /&gt;
&lt;br /&gt;
# If the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag of the &#039;&#039;decoder object&#039;&#039; is not set, then reset the &amp;lt;var&amp;gt;encoding algorithm state&amp;lt;/var&amp;gt; to the default values for encoding &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;. Otherwise, the &amp;lt;var&amp;gt;encoding algorithm state&amp;lt;/var&amp;gt; is re-used from the previous call to &amp;lt;code&amp;gt;decode&amp;lt;/code&amp;gt; on this object. &lt;br /&gt;
# If the &amp;lt;var&amp;gt;options&amp;lt;/var&amp;gt; parameter is specified and the &amp;lt;var&amp;gt;stream&amp;lt;/var&amp;gt; option is &#039;&#039;&#039;true&#039;&#039;&#039;, then the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag is set; otherwise the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag is cleared.&lt;br /&gt;
# Run the decoder algorithm of the &#039;&#039;decoder object&#039;s&#039;&#039; &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;:&lt;br /&gt;
#* The input to the algorithm is a &amp;lt;var&amp;gt;byte stream&amp;lt;/var&amp;gt;. The &amp;lt;var&amp;gt;byte stream&amp;lt;/var&amp;gt; is provided by the bytes in &amp;lt;var&amp;gt;view.buffer&amp;lt;/var&amp;gt; starting at offset &amp;lt;var&amp;gt;view.byteOffset&amp;lt;/var&amp;gt;. A maximum of &amp;lt;var&amp;gt;view.byteLength&amp;lt;/var&amp;gt; bytes are yielded by the stream from &amp;lt;var&amp;gt;view.buffer&amp;lt;/var&amp;gt;. If &amp;lt;var&amp;gt;view&amp;lt;/var&amp;gt; is not specified, the stream is empty.&lt;br /&gt;
#* If the &amp;lt;var&amp;gt;options&amp;lt;/var&amp;gt; parameter not specified or the &#039;&#039;&#039;stream&#039;&#039;&#039; option is &#039;&#039;&#039;false&#039;&#039;&#039;, then after &amp;lt;var&amp;gt;view.byteLength&amp;lt;/var&amp;gt; bytes are yielded by the stream the &#039;&#039;&#039;EOF byte&#039;&#039;&#039; is yielded.&lt;br /&gt;
#* If the internal &amp;lt;var&amp;gt;fatal&amp;lt;/var&amp;gt; flag of the &#039;&#039;decoder object&#039;&#039; is set, then a &#039;&#039;&#039;decoder error&#039;&#039;&#039; causes an &amp;lt;code&amp;gt;DOMException&amp;lt;/code&amp;gt; of type &amp;lt;code&amp;gt;EncodingError&amp;lt;/code&amp;gt; to be thrown rather than emitting a fallback code point.&lt;br /&gt;
#* The output of the algorithm is a &amp;lt;var&amp;gt;sequence of emitted code points&amp;lt;/var&amp;gt;.&lt;br /&gt;
# Return an IDL &amp;lt;code&amp;gt;DOMString&amp;lt;/code&amp;gt; value that represents the sequence of code units resulting from encoding the &amp;lt;var&amp;gt;sequence of emitted code points&amp;lt;/var&amp;gt; as UTF-16.&lt;br /&gt;
&lt;br /&gt;
:&#039;&#039;ISSUE: Need to handle BOMs. [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding] specifies this by using the caller-specified encoding as a suggestion, and consuming the BOM as a part of selecting the real encoder where BOM takes precedence. At the very least a matching BOM should be ignored. A mismatching BOM could throw, be a decoding error, or could actually switch the decoder for this stream (call or sequence of calls), possibly if-and-only-if the constructor was called without an encoding.&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/dl&amp;gt;&lt;br /&gt;
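&lt;br /&gt;
The interaction between the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag and the &amp;lt;code&amp;gt;stream&amp;lt;/code&amp;gt; option can be illustrated with a minimal sketch. It assumes the &amp;lt;code&amp;gt;TextDecoder&amp;lt;/code&amp;gt; implementation found in current browsers and Node.js, which may not match this draft in every detail; the helper name &amp;lt;code&amp;gt;decodeChunks&amp;lt;/code&amp;gt; is hypothetical.&lt;br /&gt;
&lt;br /&gt;
```javascript
// Sketch only: relies on the TextDecoder found in current runtimes,
// which may differ in detail from the draft described here.
// decodeChunks is a hypothetical helper, not part of the proposal.
function decodeChunks(chunks, label) {
  const decoder = new TextDecoder(label);
  let text = "";
  for (const chunk of chunks) {
    // stream: true keeps the decoder state (including a partial
    // multi-byte sequence) for the next call.
    text += decoder.decode(chunk, { stream: true });
  }
  // A final call without the stream option ends the stream and flushes.
  text += decoder.decode();
  return text;
}

// U+20AC (euro sign) is E2 82 AC in UTF-8; split it across two chunks.
const euroChunks = [new Uint8Array([0xe2, 0x82]), new Uint8Array([0xac])];
console.log(decodeChunks(euroChunks, "utf-8") === "\u20ac"); // true
```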
&lt;br /&gt;
== Examples ==&lt;br /&gt;
&lt;br /&gt;
=== Example #1 - encoding strings ===&lt;br /&gt;
&lt;br /&gt;
The following example uses the API to encode an array of strings into an ArrayBuffer. The resulting buffer contains the number of strings (as a Uint32), followed by the byte length of the first encoded string (as a Uint32), the encoded string data, the byte length of the second encoded string (as a Uint32), the string data, and so on.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
function encodeArrayOfStrings(strings, encoding) {&lt;br /&gt;
  var encoder, encoded, len, i, bytes, view, offset;&lt;br /&gt;
&lt;br /&gt;
  encoder = new TextEncoder(encoding);&lt;br /&gt;
  encoded = [];&lt;br /&gt;
&lt;br /&gt;
  len = Uint32Array.BYTES_PER_ELEMENT;&lt;br /&gt;
  for (i = 0; i &amp;lt; strings.length; i += 1) {&lt;br /&gt;
    len += Uint32Array.BYTES_PER_ELEMENT;&lt;br /&gt;
    encoded[i] = encoder.encode(strings[i]);&lt;br /&gt;
    len += encoded[i].byteLength;&lt;br /&gt;
  }&lt;br /&gt;
&lt;br /&gt;
  bytes = new Uint8Array(len);&lt;br /&gt;
  view = new DataView(bytes.buffer);&lt;br /&gt;
  offset = 0;&lt;br /&gt;
&lt;br /&gt;
  view.setUint32(offset, strings.length);&lt;br /&gt;
  offset += Uint32Array.BYTES_PER_ELEMENT;&lt;br /&gt;
  for (i = 0; i &amp;lt; encoded.length; i += 1) {&lt;br /&gt;
    len = encoded[i].byteLength;&lt;br /&gt;
    view.setUint32(offset, len);&lt;br /&gt;
    offset += Uint32Array.BYTES_PER_ELEMENT;&lt;br /&gt;
    bytes.set(encoded[i], offset);&lt;br /&gt;
    offset += len;&lt;br /&gt;
  }&lt;br /&gt;
  return bytes.buffer;&lt;br /&gt;
}&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Example #2 - decoding strings ===&lt;br /&gt;
&lt;br /&gt;
The following example decodes an ArrayBuffer containing data encoded in the format produced by the previous example back into an array of strings.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
function decodeArrayOfStrings(buffer, encoding) {&lt;br /&gt;
  var decoder, view, offset, num_strings, strings, i, len;&lt;br /&gt;
&lt;br /&gt;
  decoder = new TextDecoder(encoding);&lt;br /&gt;
  view = new DataView(buffer);&lt;br /&gt;
  offset = 0;&lt;br /&gt;
  strings = [];&lt;br /&gt;
&lt;br /&gt;
  num_strings = view.getUint32(offset);&lt;br /&gt;
  offset += Uint32Array.BYTES_PER_ELEMENT;&lt;br /&gt;
  for (i = 0; i &amp;lt; num_strings; i += 1) {&lt;br /&gt;
    len = view.getUint32(offset);&lt;br /&gt;
    offset += Uint32Array.BYTES_PER_ELEMENT;&lt;br /&gt;
    strings[i] = decoder.decode(&lt;br /&gt;
      new DataView(view.buffer, offset, len));&lt;br /&gt;
    offset += len;&lt;br /&gt;
  }&lt;br /&gt;
  return strings;&lt;br /&gt;
}&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Encodings ==&lt;br /&gt;
&lt;br /&gt;
Encodings are defined and implemented per [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding]. This implicitly includes the steps to &amp;lt;em&amp;gt;get an encoding&amp;lt;/em&amp;gt; from a string, and the logic for label matching and case-insensitivity.&lt;br /&gt;
&lt;br /&gt;
User agents MUST NOT support any encodings or labels other than those defined in [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding], and MUST support all encodings and labels defined in that specification, with the additions defined below.&lt;br /&gt;
&lt;br /&gt;
:&#039;&#039;NOTE: In [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding], &amp;quot;ascii&amp;quot; is a label for &#039;&#039;&#039;windows-1252&#039;&#039;&#039;; there is no 7-bit-or-raise-exception encoding. Applications that are required to restrict the content of decoded strings should implement validation after decoding.&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
:&#039;&#039;NOTE: Unicode normalization forms are outside the scope of this specification. No normalization is done prior to encoding or after decoding.&#039;&#039;&lt;br /&gt;
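&lt;br /&gt;
As an illustration of why normalization matters to callers, the following minimal sketch encodes two canonically equivalent strings and compares their byte lengths. It assumes the UTF-8 &amp;lt;code&amp;gt;TextEncoder&amp;lt;/code&amp;gt; available in current runtimes and the ECMAScript &amp;lt;code&amp;gt;String.prototype.normalize&amp;lt;/code&amp;gt; method; the helper name &amp;lt;code&amp;gt;utf8Length&amp;lt;/code&amp;gt; is hypothetical.&lt;br /&gt;
&lt;br /&gt;
```javascript
// Sketch only: assumes the UTF-8 TextEncoder available in current runtimes.
// utf8Length is a hypothetical helper, not part of the proposal.
function utf8Length(string) {
  return new TextEncoder().encode(string).byteLength;
}

const composed = "\u00e9";    // precomposed e-acute (one code point)
const decomposed = "e\u0301"; // "e" followed by a combining acute accent

console.log(utf8Length(composed));   // 2
console.log(utf8Length(decomposed)); // 3
// Normalizing first (here to NFC) is left to the caller.
console.log(utf8Length(decomposed.normalize("NFC"))); // 2
```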
&lt;br /&gt;
:&#039;&#039;NOTE: Handling of encoding-specific issues, e.g. over-long UTF-8 encodings, byte order marks, unmatched surrogate pairs, and so on is defined by [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding].&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
== References ==&lt;br /&gt;
&lt;br /&gt;
* WebIDL http://dev.w3.org/2006/webapi/WebIDL&lt;br /&gt;
* Encoding http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== Acknowledgements ==&lt;br /&gt;
&lt;br /&gt;
* Alan Chaney&lt;br /&gt;
* Ben Noordhuis&lt;br /&gt;
* Glenn Maynard&lt;br /&gt;
* John Tamplin&lt;br /&gt;
* Kenneth Russell (Google, Inc)&lt;br /&gt;
* Robert Mustacchi&lt;br /&gt;
* Ryan Dahl&lt;br /&gt;
* Anne van Kesteren&lt;br /&gt;
&lt;br /&gt;
== Appendix ==&lt;br /&gt;
&lt;br /&gt;
A &amp;quot;shim&amp;quot; implementation in ECMAScript (that may not fully match the current version of the spec) plus some initial unit tests can be found at:&lt;br /&gt;
&lt;br /&gt;
:http://code.google.com/p/stringencoding/&lt;br /&gt;
&lt;br /&gt;
[[Category:Proposals]]&lt;/div&gt;</summary>
		<author><name>Jsbell</name></author>
	</entry>
	<entry>
		<id>https://wiki.whatwg.org/index.php?title=StringEncoding&amp;diff=8363</id>
		<title>StringEncoding</title>
		<link rel="alternate" type="text/html" href="https://wiki.whatwg.org/index.php?title=StringEncoding&amp;diff=8363"/>
		<updated>2012-06-28T15:57:43Z</updated>

		<summary type="html">&lt;p&gt;Jsbell: Based on feedback from Cameron McCormack, don&amp;#039;t need detailed steps for Unicode characters -&amp;gt; DOMString conversion&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;Proposed Text Encoding Web API for Typed Arrays&lt;br /&gt;
&lt;br /&gt;
== Editors ==&lt;br /&gt;
&lt;br /&gt;
* Joshua Bell (Google, Inc)&lt;br /&gt;
&lt;br /&gt;
== Abstract ==&lt;br /&gt;
&lt;br /&gt;
This specification defines an API for encoding strings to binary data, and decoding strings from binary data.&lt;br /&gt;
&lt;br /&gt;
NOTE: This specification intentionally does not address the opposite scenario of encoding binary data as strings and decoding binary data from strings, for example using Base64 encoding.&lt;br /&gt;
&lt;br /&gt;
Discussion on this topic has so far taken place on the [http://www.khronos.org/webgl/public-mailing-list/ public_webgl@khronos.org] mailing list. See http://www.khronos.org/webgl/public-mailing-list/archives/1111/msg00017.html for the initial discussion thread.&lt;br /&gt;
&lt;br /&gt;
Discussion has since moved to the [http://www.whatwg.org/mailing-list#specs WHATWG spec discussion] mailing list. See  http://lists.whatwg.org/htdig.cgi/whatwg-whatwg.org/2012-March/035038.html for the latest discussion thread.&lt;br /&gt;
&lt;br /&gt;
== Open Issues ==&lt;br /&gt;
&lt;br /&gt;
* Encoding errors&lt;br /&gt;
** Define options dict for &amp;lt;code&amp;gt;TextEncoder()&amp;lt;/code&amp;gt; that allows selecting between throw vs. replacement character / replacement callback?&lt;br /&gt;
* The [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding specification] defines the byte order mark as more authoritative than an explicit encoding label when decoding content. &lt;br /&gt;
** Is that desirable here? &lt;br /&gt;
** What does it mean to e.g. write &amp;lt;code&amp;gt;TextDecoder(&#039;iso-2022-kr&#039;).decode(str)&amp;lt;/code&amp;gt; if string turns out to be UTF-8?&lt;br /&gt;
** What about streaming and other re-uses of the same decoder object?&lt;br /&gt;
** At the very least, a matching BOM should be ignored (i.e. not returned as part of the output) and a mismatching BOM should signal an error.&lt;br /&gt;
&lt;br /&gt;
== Notes and TODOs ==&lt;br /&gt;
&lt;br /&gt;
* Streaming decode/encode requires retaining partial buffers between calls.&lt;br /&gt;
** Some encode/decode algorithms require adjusting the &amp;lt;var&amp;gt;code point pointer&amp;lt;/var&amp;gt; or &amp;lt;var&amp;gt;byte pointer&amp;lt;/var&amp;gt; by a negative amount. This could occur across &amp;quot;chunk&amp;quot; boundaries. This implies that when the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag is set on an encoder/decoder, the last N elements of the stream are saved for the next call and used as a prefix for the stream. N is defined by the specific encoding algorithm.&lt;br /&gt;
** This is not yet implemented in the [http://code.google.com/p/stringencoding/ ECMAScript shim]&lt;br /&gt;
&lt;br /&gt;
== Resolved Issues ==&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dl&amp;gt;&lt;br /&gt;
&amp;lt;dt&amp;gt;Encode as many characters as possible into a fixed-size buffer for transmission, and repeat starting with next unencoded character&lt;br /&gt;
&amp;lt;dd&amp;gt;Resolution: not for &amp;quot;v1&amp;quot; - can be implemented using this API, and breaking the string down by &amp;quot;character&amp;quot; is unlikely to be as obvious as it sounds - surrogate pairs, combining sequences, etc&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dt&amp;gt;Support two versions of encode; one which takes target buffer, and one which creates/returns a right-sized buffer&lt;br /&gt;
&amp;lt;dd&amp;gt;Resolution: See above, and wait for developer feedback.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dt&amp;gt;Allow arbitrary end byte sequences (e.g. &amp;lt;code&amp;gt;0xFF&amp;lt;/code&amp;gt; for UTF-8 strings)&lt;br /&gt;
&amp;lt;dd&amp;gt;Resolution: Add &amp;lt;code&amp;gt;indexOf&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;ArrayBufferView&amp;lt;/code&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dt&amp;gt;Remove &#039;&#039;binary&#039;&#039; encoding? &#039;&#039;(a proposed 8-bit-clean encoding for interop with legacy binary data stored in ECMAScript strings)&#039;&#039;&lt;br /&gt;
&amp;lt;dd&amp;gt;The only real use is with &amp;lt;code&amp;gt;atob()&amp;lt;/code&amp;gt;/&amp;lt;code&amp;gt;btoa()&amp;lt;/code&amp;gt;; a better API would be Base64 directly in/out of Typed Arrays. Consensus on WHATWG is to add better APIs e.g. &lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
partial interface ArrayBufferView {&lt;br /&gt;
    DOMString toBase64();&lt;br /&gt;
};&lt;br /&gt;
&lt;br /&gt;
partial interface ArrayBuffer {&lt;br /&gt;
    static ArrayBuffer fromBase64(DOMString string);&lt;br /&gt;
};&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dt&amp;gt;Should this be a standalone API (as written), or live on e.g. DataView, or on e.g. String?&lt;br /&gt;
&amp;lt;dd&amp;gt;There seems to be pretty strong consensus that streaming, stateful coding is high priority and that an object-oriented API is cleanest, and shoe-horning those onto existing objects would be messy.&lt;br /&gt;
&amp;lt;/dl&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== API ==&lt;br /&gt;
&lt;br /&gt;
=== TextEncoder ===&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;WebIDL&#039;&#039;&#039;&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
dictionary TextEncodeOptions {&lt;br /&gt;
  boolean stream = false;&lt;br /&gt;
};&lt;br /&gt;
&lt;br /&gt;
[Constructor,&lt;br /&gt;
 Constructor(DOMString encoding)]&lt;br /&gt;
interface TextEncoder {&lt;br /&gt;
  readonly attribute DOMString encoding;&lt;br /&gt;
  ArrayBufferView encode(DOMString? string, optional TextEncodeOptions options);&lt;br /&gt;
};&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The constructor runs the following steps:&lt;br /&gt;
# If the constructor is called with no arguments, let &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt; be &amp;quot;&amp;lt;code&amp;gt;utf-8&amp;lt;/code&amp;gt;&amp;quot;. Otherwise, let &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt; be the value of the &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; argument.&lt;br /&gt;
# Run the &#039;&#039;&#039;steps to get an encoding&#039;&#039;&#039; from [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding], with &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt; as &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt;.&lt;br /&gt;
#* If the steps result in failure, throw an &amp;quot;&amp;lt;code&amp;gt;EncodingError&amp;lt;/code&amp;gt;&amp;quot; exception and terminate these steps.&lt;br /&gt;
#* Otherwise, set the encoder object&#039;s internal &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; property to the returned encoding.&lt;br /&gt;
# Initialize the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag of the encoder object to false. &lt;br /&gt;
# Initialize the internal &amp;lt;var&amp;gt;encoding algorithm state&amp;lt;/var&amp;gt; to the default values for the encoding &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dl&amp;gt;&lt;br /&gt;
&amp;lt;dt&amp;gt;&amp;lt;code&amp;gt;encoding&amp;lt;/code&amp;gt; of type DOMString, readonly&lt;br /&gt;
&amp;lt;dd&amp;gt;Returns the &amp;lt;var&amp;gt;Name&amp;lt;/var&amp;gt; of the encoder object&#039;s &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;, per [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding]. &lt;br /&gt;
&lt;br /&gt;
:Note that this may differ from the name of the encoding specified during the call to the constructor. For example, if the constructor is called with &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; of &amp;lt;code&amp;gt;&amp;quot;ascii&amp;quot;&amp;lt;/code&amp;gt; the &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; attribute of the &#039;&#039;encoder object&#039;&#039; would have the value &amp;lt;code&amp;gt;&amp;quot;windows-1252&amp;quot;&amp;lt;/code&amp;gt; as &amp;lt;code&amp;gt;&amp;quot;ascii&amp;quot;&amp;lt;/code&amp;gt; is a label for that encoding.&lt;br /&gt;
&amp;lt;/dl&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dl&amp;gt;&lt;br /&gt;
&amp;lt;dt&amp;gt;&amp;lt;code&amp;gt;encode&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;dd&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The &amp;lt;code&amp;gt;encode&amp;lt;/code&amp;gt; method runs these steps:&lt;br /&gt;
&lt;br /&gt;
# If the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag is not set, then reset the &amp;lt;var&amp;gt;encoding algorithm state&amp;lt;/var&amp;gt; to the default values for &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;. Otherwise, the &amp;lt;var&amp;gt;encoding algorithm state&amp;lt;/var&amp;gt; is re-used from the previous call to &amp;lt;code&amp;gt;encode&amp;lt;/code&amp;gt; on this object.&lt;br /&gt;
# If the &amp;lt;var&amp;gt;options&amp;lt;/var&amp;gt; parameter is specified and the &amp;lt;code&amp;gt;stream&amp;lt;/code&amp;gt; option is &#039;&#039;&#039;true&#039;&#039;&#039;, then the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag is set; otherwise the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag is cleared.&lt;br /&gt;
# Run the steps of the &amp;lt;var&amp;gt;encoding algorithm&amp;lt;/var&amp;gt;:&lt;br /&gt;
#* The input to the algorithm is a &amp;lt;var&amp;gt;stream of code points&amp;lt;/var&amp;gt;. The stream is composed of the Unicode code points for the Unicode characters produced by following the [http://dev.w3.org/2006/webapi/WebIDL/#dfn-obtain-unicode steps to convert a DOMString to a sequence of Unicode characters] in [http://dev.w3.org/2006/webapi/WebIDL/ WebIDL] with &amp;lt;var&amp;gt;string&amp;lt;/var&amp;gt; as the input. If &amp;lt;var&amp;gt;string&amp;lt;/var&amp;gt; is null, the &amp;lt;var&amp;gt;stream of code points&amp;lt;/var&amp;gt; is empty.&lt;br /&gt;
#* If the &amp;lt;var&amp;gt;options&amp;lt;/var&amp;gt; parameter is not specified or the &amp;lt;code&amp;gt;stream&amp;lt;/code&amp;gt; option is false, then after the final code point is yielded by the stream the &#039;&#039;&#039;EOF code point&#039;&#039;&#039; is yielded.&lt;br /&gt;
#* The output of the algorithm is a &amp;lt;var&amp;gt;sequence of emitted bytes&amp;lt;/var&amp;gt;.&lt;br /&gt;
# Return a &amp;lt;code&amp;gt;Uint8Array&amp;lt;/code&amp;gt; object wrapping an &amp;lt;code&amp;gt;ArrayBuffer&amp;lt;/code&amp;gt; containing the &amp;lt;var&amp;gt;sequence of emitted bytes&amp;lt;/var&amp;gt; produced by the encoding algorithm.&lt;br /&gt;
&amp;lt;/dl&amp;gt;&lt;br /&gt;
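&lt;br /&gt;
The following minimal sketch shows what the &amp;lt;code&amp;gt;encode&amp;lt;/code&amp;gt; steps produce for a few inputs. It assumes the &amp;lt;code&amp;gt;TextEncoder&amp;lt;/code&amp;gt; found in current runtimes, which is limited to UTF-8 and so covers only the default case of this draft; the helper name &amp;lt;code&amp;gt;encodeUtf8&amp;lt;/code&amp;gt; is hypothetical.&lt;br /&gt;
&lt;br /&gt;
```javascript
// Sketch only: assumes the UTF-8-only TextEncoder in current runtimes.
// encodeUtf8 is a hypothetical helper, not part of the proposal.
function encodeUtf8(string) {
  const bytes = new TextEncoder().encode(string); // a Uint8Array
  return Array.from(bytes);
}

console.log(encodeUtf8("hi").join(","));        // 104,105
// A surrogate pair is one code point (U+1F600) and encodes to four bytes.
console.log(encodeUtf8("\ud83d\ude00").length); // 4
// An unpaired surrogate is converted to U+FFFD before encoding.
console.log(encodeUtf8("\ud800").join(","));    // 239,191,189
```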
&lt;br /&gt;
=== TextDecoder ===&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;WebIDL&#039;&#039;&#039;&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
dictionary TextDecoderOptions {&lt;br /&gt;
  boolean fatal = false;&lt;br /&gt;
};&lt;br /&gt;
&lt;br /&gt;
dictionary TextDecodeOptions {&lt;br /&gt;
  boolean stream = false;&lt;br /&gt;
};&lt;br /&gt;
&lt;br /&gt;
[Constructor,&lt;br /&gt;
 Constructor(optional DOMString encoding, optional TextDecoderOptions options)]&lt;br /&gt;
interface TextDecoder {&lt;br /&gt;
  readonly attribute DOMString encoding;&lt;br /&gt;
  DOMString decode(optional ArrayBufferView view, optional TextDecodeOptions options);&lt;br /&gt;
};&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The constructor runs the following steps:&lt;br /&gt;
# The constructor creates a &#039;&#039;decoder object&#039;&#039;. It has the internal properties &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;, a &amp;lt;var&amp;gt;fatal&amp;lt;/var&amp;gt; flag which is initially unset, and a &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag which is initially unset.&lt;br /&gt;
# If called without an &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; argument, let &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt; be &amp;quot;&amp;lt;code&amp;gt;utf-8&amp;lt;/code&amp;gt;&amp;quot;. Otherwise, let &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt; be the value of the &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; argument.&lt;br /&gt;
# Run the &#039;&#039;&#039;steps to get an encoding&#039;&#039;&#039; from [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding], with &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt; as &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt;. &lt;br /&gt;
## If the steps result in failure, throw an &amp;quot;&amp;lt;code&amp;gt;EncodingError&amp;lt;/code&amp;gt;&amp;quot; exception and terminate these steps.&lt;br /&gt;
## Otherwise, set the &#039;&#039;decoder object&#039;s&#039;&#039; internal &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; property to the returned encoding.&lt;br /&gt;
# If the constructor is called with an &amp;lt;var&amp;gt;options&amp;lt;/var&amp;gt; argument, and the &amp;lt;var&amp;gt;fatal&amp;lt;/var&amp;gt; property of the dictionary is set, set the internal &amp;lt;var&amp;gt;fatal&amp;lt;/var&amp;gt; flag of the &#039;&#039;decoder object&#039;&#039;.&lt;br /&gt;
# Initialize the internal &amp;lt;var&amp;gt;encoding algorithm state&amp;lt;/var&amp;gt; to the default values for the encoding &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;.&lt;br /&gt;
# Return the &#039;&#039;decoder object&#039;&#039;.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dl&amp;gt;&lt;br /&gt;
&amp;lt;dt&amp;gt;&amp;lt;code&amp;gt;encoding&amp;lt;/code&amp;gt; of type DOMString, readonly&lt;br /&gt;
&amp;lt;dd&amp;gt;&lt;br /&gt;
Returns the &amp;lt;var&amp;gt;Name&amp;lt;/var&amp;gt; of the decoder object&#039;s &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;, per [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding].&lt;br /&gt;
&lt;br /&gt;
:Note that this may differ from the name of the encoding specified during the call to the constructor. For example, if the constructor is called with &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; of &amp;lt;code&amp;gt;&amp;quot;ascii&amp;quot;&amp;lt;/code&amp;gt; the &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; attribute of the &#039;&#039;decoder object&#039;&#039; would have the value &amp;lt;code&amp;gt;&amp;quot;windows-1252&amp;quot;&amp;lt;/code&amp;gt; as &amp;lt;code&amp;gt;&amp;quot;ascii&amp;quot;&amp;lt;/code&amp;gt; is a label for that encoding.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dt&amp;gt;&amp;lt;code&amp;gt;decode&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;dd&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The &amp;lt;code&amp;gt;decode&amp;lt;/code&amp;gt; method runs the following steps:&lt;br /&gt;
&lt;br /&gt;
# If the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag of the &#039;&#039;decoder object&#039;&#039; is not set, then reset the &amp;lt;var&amp;gt;encoding algorithm state&amp;lt;/var&amp;gt; to the default values for encoding &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;. Otherwise, the &amp;lt;var&amp;gt;encoding algorithm state&amp;lt;/var&amp;gt; is re-used from the previous call to &amp;lt;code&amp;gt;decode&amp;lt;/code&amp;gt; on this object. &lt;br /&gt;
# If the &amp;lt;var&amp;gt;options&amp;lt;/var&amp;gt; parameter is specified and the &amp;lt;var&amp;gt;stream&amp;lt;/var&amp;gt; option is &#039;&#039;&#039;true&#039;&#039;&#039;, then the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag is set; otherwise the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag is cleared.&lt;br /&gt;
# Run the decoder algorithm of the &#039;&#039;decoder object&#039;s&#039;&#039; encoding:&lt;br /&gt;
#* The input to the algorithm is a &amp;lt;var&amp;gt;byte stream&amp;lt;/var&amp;gt;. The &amp;lt;var&amp;gt;byte stream&amp;lt;/var&amp;gt; is provided by the bytes in &amp;lt;var&amp;gt;view.buffer&amp;lt;/var&amp;gt; starting at offset &amp;lt;var&amp;gt;view.byteOffset&amp;lt;/var&amp;gt;. A maximum of &amp;lt;var&amp;gt;view.byteLength&amp;lt;/var&amp;gt; bytes are yielded by the stream from &amp;lt;var&amp;gt;view.buffer&amp;lt;/var&amp;gt;. If &amp;lt;var&amp;gt;view&amp;lt;/var&amp;gt; is not specified, the stream is empty.&lt;br /&gt;
#* If the &amp;lt;var&amp;gt;options&amp;lt;/var&amp;gt; parameter is not specified or the &#039;&#039;&#039;stream&#039;&#039;&#039; option is &#039;&#039;&#039;false&#039;&#039;&#039;, then after &amp;lt;var&amp;gt;view.byteLength&amp;lt;/var&amp;gt; bytes are yielded by the stream the &#039;&#039;&#039;EOF byte&#039;&#039;&#039; is yielded.&lt;br /&gt;
#* If the internal &amp;lt;var&amp;gt;fatal&amp;lt;/var&amp;gt; flag of the &#039;&#039;decoder object&#039;&#039; is set, then a &#039;&#039;&#039;decoder error&#039;&#039;&#039; causes a &amp;lt;code&amp;gt;DOMException&amp;lt;/code&amp;gt; of type &amp;lt;code&amp;gt;EncodingError&amp;lt;/code&amp;gt; to be thrown rather than emitting a fallback code point.&lt;br /&gt;
#* The output of the algorithm is a &amp;lt;var&amp;gt;sequence of emitted code points&amp;lt;/var&amp;gt;.&lt;br /&gt;
# Return an IDL &amp;lt;code&amp;gt;DOMString&amp;lt;/code&amp;gt; value that represents the sequence of code units resulting from encoding the &amp;lt;var&amp;gt;sequence of emitted code points&amp;lt;/var&amp;gt; as UTF-16.&lt;br /&gt;
&lt;br /&gt;
:&#039;&#039;ISSUE: Need to handle BOMs. [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding] specifies this by using the caller-specified encoding as a suggestion, and consuming the BOM as a part of selecting the real encoder where BOM takes precedence. At the very least a matching BOM should be ignored. A mismatching BOM could throw, be a decoding error, or could actually switch the decoder for this stream (call or sequence of calls), possibly if-and-only-if the constructor was called without an encoding.&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/dl&amp;gt;&lt;br /&gt;
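&lt;br /&gt;
The effect of the &amp;lt;var&amp;gt;fatal&amp;lt;/var&amp;gt; flag can be illustrated with a minimal sketch. It assumes the &amp;lt;code&amp;gt;TextDecoder&amp;lt;/code&amp;gt; found in current runtimes; the type of exception thrown there may differ from the &amp;lt;code&amp;gt;EncodingError&amp;lt;/code&amp;gt; named in this draft, so the sketch does not inspect it. The helper name &amp;lt;code&amp;gt;tryDecode&amp;lt;/code&amp;gt; is hypothetical.&lt;br /&gt;
&lt;br /&gt;
```javascript
// Sketch only: relies on the TextDecoder found in current runtimes.
// tryDecode is a hypothetical helper, not part of the proposal.
function tryDecode(bytes, fatal) {
  const decoder = new TextDecoder("utf-8", { fatal: fatal });
  try {
    return { ok: true, text: decoder.decode(bytes) };
  } catch (error) {
    // The exception type is deliberately not checked here.
    return { ok: false, text: null };
  }
}

// "a", then 0xFF (never valid in UTF-8), then "b".
const invalid = new Uint8Array([0x61, 0xff, 0x62]);

// Without the fatal flag, the invalid byte becomes U+FFFD.
console.log(tryDecode(invalid, false).text === "a\ufffdb"); // true
// With the fatal flag, decoding throws instead.
console.log(tryDecode(invalid, true).ok); // false
```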
&lt;br /&gt;
== Examples ==&lt;br /&gt;
&lt;br /&gt;
=== Example #1 - encoding strings ===&lt;br /&gt;
&lt;br /&gt;
The following example uses the API to encode an array of strings into an ArrayBuffer. The resulting buffer contains the number of strings (as a Uint32), followed by the byte length of the first encoded string (as a Uint32), the encoded string data, the byte length of the second encoded string (as a Uint32), the string data, and so on.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
function encodeArrayOfStrings(strings, encoding) {&lt;br /&gt;
  var encoder, encoded, len, i, bytes, view, offset;&lt;br /&gt;
&lt;br /&gt;
  encoder = new TextEncoder(encoding);&lt;br /&gt;
  encoded = [];&lt;br /&gt;
&lt;br /&gt;
  len = Uint32Array.BYTES_PER_ELEMENT;&lt;br /&gt;
  for (i = 0; i &amp;lt; strings.length; i += 1) {&lt;br /&gt;
    len += Uint32Array.BYTES_PER_ELEMENT;&lt;br /&gt;
    encoded[i] = encoder.encode(strings[i]);&lt;br /&gt;
    len += encoded[i].byteLength;&lt;br /&gt;
  }&lt;br /&gt;
&lt;br /&gt;
  bytes = new Uint8Array(len);&lt;br /&gt;
  view = new DataView(bytes.buffer);&lt;br /&gt;
  offset = 0;&lt;br /&gt;
&lt;br /&gt;
  view.setUint32(offset, strings.length);&lt;br /&gt;
  offset += Uint32Array.BYTES_PER_ELEMENT;&lt;br /&gt;
  for (i = 0; i &amp;lt; encoded.length; i += 1) {&lt;br /&gt;
    len = encoded[i].byteLength;&lt;br /&gt;
    view.setUint32(offset, len);&lt;br /&gt;
    offset += Uint32Array.BYTES_PER_ELEMENT;&lt;br /&gt;
    bytes.set(encoded[i], offset);&lt;br /&gt;
    offset += len;&lt;br /&gt;
  }&lt;br /&gt;
  return bytes.buffer;&lt;br /&gt;
}&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Example #2 - decoding strings ===&lt;br /&gt;
&lt;br /&gt;
The following example decodes an ArrayBuffer containing data encoded in the format produced by the previous example back into an array of strings.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
function decodeArrayOfStrings(buffer, encoding) {&lt;br /&gt;
  var decoder, view, offset, num_strings, strings, i, len;&lt;br /&gt;
&lt;br /&gt;
  decoder = new TextDecoder(encoding);&lt;br /&gt;
  view = new DataView(buffer);&lt;br /&gt;
  offset = 0;&lt;br /&gt;
  strings = [];&lt;br /&gt;
&lt;br /&gt;
  num_strings = view.getUint32(offset);&lt;br /&gt;
  offset += Uint32Array.BYTES_PER_ELEMENT;&lt;br /&gt;
  for (i = 0; i &amp;lt; num_strings; i += 1) {&lt;br /&gt;
    len = view.getUint32(offset);&lt;br /&gt;
    offset += Uint32Array.BYTES_PER_ELEMENT;&lt;br /&gt;
    strings[i] = decoder.decode(&lt;br /&gt;
      new DataView(view.buffer, offset, len));&lt;br /&gt;
    offset += len;&lt;br /&gt;
  }&lt;br /&gt;
  return strings;&lt;br /&gt;
}&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Encodings ==&lt;br /&gt;
&lt;br /&gt;
Encodings are defined and implemented per [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding]. This implicitly includes the steps to &amp;lt;em&amp;gt;get an encoding&amp;lt;/em&amp;gt; from a string, and the logic for label matching and case-insensitivity.&lt;br /&gt;
&lt;br /&gt;
User agents MUST NOT support any encodings or labels other than those defined in [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding], and MUST support all encodings and labels defined in that specification, with the additions defined below.&lt;br /&gt;
&lt;br /&gt;
:&#039;&#039;NOTE: In [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding], &amp;quot;ascii&amp;quot; is a label for &#039;&#039;&#039;windows-1252&#039;&#039;&#039;; there is no 7-bit-or-raise-exception encoding. Applications that are required to restrict the content of decoded strings should implement validation after decoding.&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
:&#039;&#039;NOTE: Unicode normalization forms are outside the scope of this specification. No normalization is done prior to encoding or after decoding.&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
:&#039;&#039;NOTE: Handling of encoding-specific issues, e.g. over-long UTF-8 encodings, byte order marks, unmatched surrogate pairs, and so on is defined by [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding].&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
== References ==&lt;br /&gt;
&lt;br /&gt;
* WebIDL http://dev.w3.org/2006/webapi/WebIDL&lt;br /&gt;
* Encoding http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== Acknowledgements ==&lt;br /&gt;
&lt;br /&gt;
* Alan Chaney&lt;br /&gt;
* Ben Noordhuis&lt;br /&gt;
* Glenn Maynard&lt;br /&gt;
* John Tamplin&lt;br /&gt;
* Kenneth Russell (Google, Inc)&lt;br /&gt;
* Robert Mustacchi&lt;br /&gt;
* Ryan Dahl&lt;br /&gt;
* Anne van Kesteren&lt;br /&gt;
&lt;br /&gt;
== Appendix ==&lt;br /&gt;
&lt;br /&gt;
A &amp;quot;shim&amp;quot; implementation in ECMAScript (that may not fully match the current version of the spec) plus some initial unit tests can be found at:&lt;br /&gt;
&lt;br /&gt;
:http://code.google.com/p/stringencoding/&lt;br /&gt;
&lt;br /&gt;
[[Category:Proposals]]&lt;/div&gt;</summary>
		<author><name>Jsbell</name></author>
	</entry>
	<entry>
		<id>https://wiki.whatwg.org/index.php?title=StringEncoding&amp;diff=8362</id>
		<title>StringEncoding</title>
		<link rel="alternate" type="text/html" href="https://wiki.whatwg.org/index.php?title=StringEncoding&amp;diff=8362"/>
		<updated>2012-06-27T20:49:59Z</updated>

		<summary type="html">&lt;p&gt;Jsbell: Add missing definite article&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;Proposed Text Encoding Web API for Typed Arrays&lt;br /&gt;
&lt;br /&gt;
== Editors ==&lt;br /&gt;
&lt;br /&gt;
* Joshua Bell (Google, Inc)&lt;br /&gt;
&lt;br /&gt;
== Abstract ==&lt;br /&gt;
&lt;br /&gt;
This specification defines an API for encoding strings to binary data, and decoding strings from binary data.&lt;br /&gt;
&lt;br /&gt;
NOTE: This specification intentionally does not address the opposite scenario of encoding binary data as strings and decoding binary data from strings, for example using Base64 encoding.&lt;br /&gt;
&lt;br /&gt;
Discussion on this topic has so far taken place on the [http://www.khronos.org/webgl/public-mailing-list/ public_webgl@khronos.org] mailing list. See http://www.khronos.org/webgl/public-mailing-list/archives/1111/msg00017.html for the initial discussion thread.&lt;br /&gt;
&lt;br /&gt;
Discussion has since moved to the [http://www.whatwg.org/mailing-list#specs WHATWG spec discussion] mailing list. See  http://lists.whatwg.org/htdig.cgi/whatwg-whatwg.org/2012-March/035038.html for the latest discussion thread.&lt;br /&gt;
&lt;br /&gt;
== Open Issues ==&lt;br /&gt;
&lt;br /&gt;
* Encoding errors&lt;br /&gt;
** Define options dict for &amp;lt;code&amp;gt;TextEncoder()&amp;lt;/code&amp;gt; that allows selecting between throw vs. replacement character / replacement callback?&lt;br /&gt;
* The [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding specification] defines the byte order mark as more authoritative than an explicit encoding label when decoding content. &lt;br /&gt;
** Is that desirable here? &lt;br /&gt;
** What does it mean to e.g. write &amp;lt;code&amp;gt;TextDecoder(&#039;iso-2022-kr&#039;).decode(str)&amp;lt;/code&amp;gt; if string turns out to be UTF-8?&lt;br /&gt;
** What about streaming and other re-uses of the same decoder object?&lt;br /&gt;
** At the very least, a matching BOM should be ignored (i.e. not returned as part of the output) and a mismatching BOM should signal an error.&lt;br /&gt;
&lt;br /&gt;
== Notes and TODOs ==&lt;br /&gt;
&lt;br /&gt;
* Streaming decode/encode requires retaining partial buffers between calls.&lt;br /&gt;
** Some encode/decode algorithms require adjusting the &amp;lt;var&amp;gt;code point pointer&amp;lt;/var&amp;gt; or &amp;lt;var&amp;gt;byte pointer&amp;lt;/var&amp;gt; by a negative amount. This could occur across &amp;quot;chunk&amp;quot; boundaries. This implies that when the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag is set on an encoder/decoder, the last N elements of the stream are saved for the next call and used as a prefix for the stream. N is defined by the specific encoding algorithm.&lt;br /&gt;
** This is not yet implemented in the [http://code.google.com/p/stringencoding/ ECMAScript shim]&lt;br /&gt;
* Move the algorithm to &amp;lt;b&amp;gt;&amp;lt;i&amp;gt;convert a sequence of Unicode characters to a DOMString&amp;lt;/i&amp;gt;&amp;lt;/b&amp;gt; into [http://dev.w3.org/2006/webapi/WebIDL/#idl-DOMString WebIDL]  - [https://www.w3.org/Bugs/Public/show_bug.cgi?id=17620 tracking bug].&lt;br /&gt;
&lt;br /&gt;
== Resolved Issues ==&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dl&amp;gt;&lt;br /&gt;
&amp;lt;dt&amp;gt;Encode as many characters as possible into a fixed-size buffer for transmission, and repeat starting with next unencoded character&lt;br /&gt;
&amp;lt;dd&amp;gt;Resolution: not for &amp;quot;v1&amp;quot; - can be implemented using this API, and breaking the string down by &amp;quot;character&amp;quot; is unlikely to be as obvious as it sounds - surrogate pairs, combining sequences, etc&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dt&amp;gt;Support two versions of encode; one which takes target buffer, and one which creates/returns a right-sized buffer&lt;br /&gt;
&amp;lt;dd&amp;gt;Resolution: See above, and wait for developer feedback.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dt&amp;gt;Allow arbitrary end byte sequences (e.g. &amp;lt;code&amp;gt;0xFF&amp;lt;/code&amp;gt; for UTF-8 strings)&lt;br /&gt;
&amp;lt;dd&amp;gt;Resolution: Add &amp;lt;code&amp;gt;indexOf&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;ArrayBufferView&amp;lt;/code&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dt&amp;gt;Remove &#039;&#039;binary&#039;&#039; encoding? &#039;&#039;(a proposed 8-bit-clean encoding for interop with legacy binary data stored in ECMAScript strings)&#039;&#039;&lt;br /&gt;
&amp;lt;dd&amp;gt;The only real use is with &amp;lt;code&amp;gt;atob()&amp;lt;/code&amp;gt;/&amp;lt;code&amp;gt;btoa()&amp;lt;/code&amp;gt;; a better API would be Base64 directly in/out of Typed Arrays. Consensus on WHATWG is to add better APIs e.g. &lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
partial interface ArrayBufferView {&lt;br /&gt;
    DOMString toBase64();&lt;br /&gt;
};&lt;br /&gt;
&lt;br /&gt;
partial interface ArrayBuffer {&lt;br /&gt;
    static ArrayBuffer fromBase64(DOMString string);&lt;br /&gt;
};&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dt&amp;gt;Should this be a standalone API (as written), or live on e.g. DataView, or on e.g. String?&lt;br /&gt;
&amp;lt;dd&amp;gt;There seems to be pretty strong consensus that streaming, stateful coding is high priority and that an object-oriented API is cleanest, and shoe-horning those onto existing objects would be messy.&lt;br /&gt;
&amp;lt;/dl&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== API ==&lt;br /&gt;
&lt;br /&gt;
=== TextEncoder ===&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;WebIDL&#039;&#039;&#039;&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
dictionary TextEncodeOptions {&lt;br /&gt;
  boolean stream = false;&lt;br /&gt;
};&lt;br /&gt;
&lt;br /&gt;
[Constructor,&lt;br /&gt;
 Constructor(DOMString encoding)]&lt;br /&gt;
interface TextEncoder {&lt;br /&gt;
  readonly attribute DOMString encoding;&lt;br /&gt;
  ArrayBufferView encode(DOMString? string, optional TextEncodeOptions options);&lt;br /&gt;
};&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The constructor runs the following steps:&lt;br /&gt;
# If the constructor is called with no arguments, let &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt; be &amp;quot;&amp;lt;code&amp;gt;utf-8&amp;lt;/code&amp;gt;&amp;quot;. Otherwise, let &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt; be the value of the &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; argument.&lt;br /&gt;
# Run the &#039;&#039;&#039;steps to get an encoding&#039;&#039;&#039; from [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding], with &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt; as &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt;.&lt;br /&gt;
#* If the steps result in failure, throw an &amp;quot;&amp;lt;code&amp;gt;EncodingError&amp;lt;/code&amp;gt;&amp;quot; exception and terminate these steps.&lt;br /&gt;
#* Otherwise, set the encoder object&#039;s internal &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; property to the returned encoding.&lt;br /&gt;
# Initialize the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag of the encoder object to false. &lt;br /&gt;
# Initialize the internal &amp;lt;var&amp;gt;encoding algorithm state&amp;lt;/var&amp;gt; to the default values for the encoding &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dl&amp;gt;&lt;br /&gt;
&amp;lt;dt&amp;gt;&amp;lt;code&amp;gt;encoding&amp;lt;/code&amp;gt; of type DOMString, readonly&lt;br /&gt;
&amp;lt;dd&amp;gt;Returns the &amp;lt;var&amp;gt;Name&amp;lt;/var&amp;gt; of the encoder object&#039;s &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;, per [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding]. &lt;br /&gt;
&lt;br /&gt;
:Note that this may differ from the name of the encoding specified during the call to the constructor. For example, if the constructor is called with &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; of &amp;lt;code&amp;gt;&amp;quot;ascii&amp;quot;&amp;lt;/code&amp;gt; the &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; attribute of the &#039;&#039;encoder object&#039;&#039; would have the value &amp;lt;code&amp;gt;&amp;quot;windows-1252&amp;quot;&amp;lt;/code&amp;gt; as &amp;lt;code&amp;gt;&amp;quot;ascii&amp;quot;&amp;lt;/code&amp;gt; is a label for that encoding.&lt;br /&gt;
&amp;lt;/dl&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dl&amp;gt;&lt;br /&gt;
&amp;lt;dt&amp;gt;&amp;lt;code&amp;gt;encode&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;dd&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The &amp;lt;code&amp;gt;encode&amp;lt;/code&amp;gt; method runs these steps:&lt;br /&gt;
&lt;br /&gt;
# If the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag is not set, then reset the &amp;lt;var&amp;gt;encoding algorithm state&amp;lt;/var&amp;gt; to the default values for &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;. Otherwise, the &amp;lt;var&amp;gt;encoding algorithm state&amp;lt;/var&amp;gt; is re-used from the previous call to &amp;lt;code&amp;gt;encode&amp;lt;/code&amp;gt; on this object.&lt;br /&gt;
# If the &amp;lt;var&amp;gt;options&amp;lt;/var&amp;gt; parameter is specified and the &amp;lt;code&amp;gt;stream&amp;lt;/code&amp;gt; option is &#039;&#039;&#039;true&#039;&#039;&#039;, then the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag is set; otherwise the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag is cleared.&lt;br /&gt;
# Run the steps of the &amp;lt;var&amp;gt;encoding algorithm&amp;lt;/var&amp;gt;:&lt;br /&gt;
#* The input to the algorithm is a &amp;lt;var&amp;gt;stream of code points&amp;lt;/var&amp;gt;. The stream is composed of the Unicode code points for the Unicode characters produced by following the [http://dev.w3.org/2006/webapi/WebIDL/#dfn-obtain-unicode steps to convert a DOMString to a sequence of Unicode characters] in [http://dev.w3.org/2006/webapi/WebIDL/ WebIDL] with &amp;lt;var&amp;gt;string&amp;lt;/var&amp;gt; as the input. If &amp;lt;var&amp;gt;string&amp;lt;/var&amp;gt; is null, the &amp;lt;var&amp;gt;stream of code points&amp;lt;/var&amp;gt; is empty.&lt;br /&gt;
#* If the &amp;lt;var&amp;gt;options&amp;lt;/var&amp;gt; parameter is not specified or the &amp;lt;code&amp;gt;stream&amp;lt;/code&amp;gt; option is false, then after the final code point is yielded by the stream, the &#039;&#039;&#039;EOF code point&#039;&#039;&#039; is yielded.&lt;br /&gt;
#* The output of the algorithm is a &amp;lt;var&amp;gt;sequence of emitted bytes&amp;lt;/var&amp;gt;.&lt;br /&gt;
# Return a &amp;lt;code&amp;gt;Uint8Array&amp;lt;/code&amp;gt; object wrapping an &amp;lt;code&amp;gt;ArrayBuffer&amp;lt;/code&amp;gt; containing the &amp;lt;var&amp;gt;sequence of emitted bytes&amp;lt;/var&amp;gt; produced by the encoding algorithm.&lt;br /&gt;
&amp;lt;/dl&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== TextDecoder ===&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;WebIDL&#039;&#039;&#039;&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
dictionary TextDecoderOptions {&lt;br /&gt;
  boolean fatal = false;&lt;br /&gt;
};&lt;br /&gt;
&lt;br /&gt;
dictionary TextDecodeOptions {&lt;br /&gt;
  boolean stream = false;&lt;br /&gt;
};&lt;br /&gt;
&lt;br /&gt;
[Constructor,&lt;br /&gt;
 Constructor(optional DOMString encoding, optional TextDecoderOptions options)]&lt;br /&gt;
interface TextDecoder {&lt;br /&gt;
  readonly attribute DOMString encoding;&lt;br /&gt;
  DOMString decode(optional ArrayBufferView view, optional TextDecodeOptions options);&lt;br /&gt;
};&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The constructor runs the following steps:&lt;br /&gt;
# The constructor creates a &#039;&#039;decoder object&#039;&#039;. It has the internal properties &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;, a &amp;lt;var&amp;gt;fatal&amp;lt;/var&amp;gt; flag which is initially unset, and a &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag which is initially unset.&lt;br /&gt;
# If called without an &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; argument, let &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt; be &amp;quot;&amp;lt;code&amp;gt;utf-8&amp;lt;/code&amp;gt;&amp;quot;. Otherwise, let &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt; be the value of the &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; argument.&lt;br /&gt;
# Run the &#039;&#039;&#039;steps to get an encoding&#039;&#039;&#039; from [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding], with &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt; as &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt;. &lt;br /&gt;
## If the steps result in failure, throw an &amp;quot;&amp;lt;code&amp;gt;EncodingError&amp;lt;/code&amp;gt;&amp;quot; exception and terminate these steps.&lt;br /&gt;
## Otherwise, set the &#039;&#039;decoder object&#039;s&#039;&#039; internal &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; property to the returned encoding.&lt;br /&gt;
# If the constructor is called with an &amp;lt;var&amp;gt;options&amp;lt;/var&amp;gt; argument, and the &amp;lt;var&amp;gt;fatal&amp;lt;/var&amp;gt; property of the dictionary is set, set the internal &amp;lt;var&amp;gt;fatal&amp;lt;/var&amp;gt; flag of the &#039;&#039;decoder object&#039;&#039;.&lt;br /&gt;
# Initialize the internal &amp;lt;var&amp;gt;encoding algorithm state&amp;lt;/var&amp;gt; to the default values for the encoding &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;.&lt;br /&gt;
# Return the &#039;&#039;decoder object&#039;&#039;.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dl&amp;gt;&lt;br /&gt;
&amp;lt;dt&amp;gt;&amp;lt;code&amp;gt;encoding&amp;lt;/code&amp;gt; of type DOMString, readonly&lt;br /&gt;
&amp;lt;dd&amp;gt;&lt;br /&gt;
Returns the &amp;lt;var&amp;gt;Name&amp;lt;/var&amp;gt; of the decoder object&#039;s &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;, per [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding].&lt;br /&gt;
&lt;br /&gt;
:Note that this may differ from the name of the encoding specified during the call to the constructor. For example, if the constructor is called with &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; of &amp;lt;code&amp;gt;&amp;quot;ascii&amp;quot;&amp;lt;/code&amp;gt; the &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; attribute of the &#039;&#039;decoder object&#039;&#039; would have the value &amp;lt;code&amp;gt;&amp;quot;windows-1252&amp;quot;&amp;lt;/code&amp;gt; as &amp;lt;code&amp;gt;&amp;quot;ascii&amp;quot;&amp;lt;/code&amp;gt; is a label for that encoding.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dt&amp;gt;&amp;lt;code&amp;gt;decode&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;dd&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The &amp;lt;code&amp;gt;decode&amp;lt;/code&amp;gt; method runs the following steps:&lt;br /&gt;
&lt;br /&gt;
# If the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag of the &#039;&#039;decoder object&#039;&#039; is not set, then reset the &amp;lt;var&amp;gt;encoding algorithm state&amp;lt;/var&amp;gt; to the default values for encoding &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;. Otherwise, the &amp;lt;var&amp;gt;encoding algorithm state&amp;lt;/var&amp;gt; is re-used from the previous call to &amp;lt;code&amp;gt;decode&amp;lt;/code&amp;gt; on this object. &lt;br /&gt;
# If the &amp;lt;var&amp;gt;options&amp;lt;/var&amp;gt; parameter is specified and the &amp;lt;var&amp;gt;stream&amp;lt;/var&amp;gt; option is &#039;&#039;&#039;true&#039;&#039;&#039;, then the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag is set; otherwise the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag is cleared.&lt;br /&gt;
# Run the decoder algorithm of the &#039;&#039;decoder object&#039;s&#039;&#039; encoding:&lt;br /&gt;
#* The input to the algorithm is a &amp;lt;var&amp;gt;byte stream&amp;lt;/var&amp;gt;. The &amp;lt;var&amp;gt;byte stream&amp;lt;/var&amp;gt; is provided by the bytes in &amp;lt;var&amp;gt;view.buffer&amp;lt;/var&amp;gt; starting at offset &amp;lt;var&amp;gt;view.byteOffset&amp;lt;/var&amp;gt;. A maximum of &amp;lt;var&amp;gt;view.byteLength&amp;lt;/var&amp;gt; bytes are yielded by the stream from &amp;lt;var&amp;gt;view.buffer&amp;lt;/var&amp;gt;. If &amp;lt;var&amp;gt;view&amp;lt;/var&amp;gt; is not specified, the stream is empty.&lt;br /&gt;
#* If the &amp;lt;var&amp;gt;options&amp;lt;/var&amp;gt; parameter is not specified or the &#039;&#039;&#039;stream&#039;&#039;&#039; option is &#039;&#039;&#039;false&#039;&#039;&#039;, then after &amp;lt;var&amp;gt;view.byteLength&amp;lt;/var&amp;gt; bytes are yielded by the stream, the &#039;&#039;&#039;EOF byte&#039;&#039;&#039; is yielded.&lt;br /&gt;
#* If the internal &amp;lt;var&amp;gt;fatal&amp;lt;/var&amp;gt; flag of the &#039;&#039;decoder object&#039;&#039; is set, then a &#039;&#039;&#039;decoder error&#039;&#039;&#039; causes a &amp;lt;code&amp;gt;DOMException&amp;lt;/code&amp;gt; of type &amp;lt;code&amp;gt;EncodingError&amp;lt;/code&amp;gt; to be thrown rather than emitting a fallback code point.&lt;br /&gt;
#* The output of the algorithm is a sequence of &amp;lt;var&amp;gt;emitted code points&amp;lt;/var&amp;gt;.&lt;br /&gt;
# Return a &amp;lt;code&amp;gt;DOMString&amp;lt;/code&amp;gt; by encoding the &amp;lt;var&amp;gt;sequence of emitted code points&amp;lt;/var&amp;gt; following the [[#Steps to convert a sequence of Unicode characters to a DOMString|steps to convert a sequence of Unicode characters to a DOMString]].&lt;br /&gt;
&lt;br /&gt;
:&#039;&#039;ISSUE: Need to handle BOMs. [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding] specifies this by using the caller-specified encoding as a suggestion, and consuming the BOM as a part of selecting the real encoder where BOM takes precedence. At the very least a matching BOM should be ignored. A mismatching BOM could throw, be a decoding error, or could actually switch the decoder for this stream (call or sequence of calls), possibly if-and-only-if the constructor was called without an encoding.&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/dl&amp;gt;&lt;br /&gt;
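&lt;br /&gt;
The streaming and &amp;lt;var&amp;gt;fatal&amp;lt;/var&amp;gt; behaviour described in the steps above can be sketched as follows. This is a minimal illustration, not part of the proposal: it assumes an environment that provides a global &amp;lt;code&amp;gt;TextDecoder&amp;lt;/code&amp;gt; as later standardized, which may differ from this draft (for example in the type of exception thrown). The helper names &amp;lt;code&amp;gt;decodeChunks&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;decodeStrict&amp;lt;/code&amp;gt; are hypothetical.&lt;br /&gt;

```javascript
// Assumption: a global TextDecoder as later standardized (Node.js, current browsers).
// decodeChunks and decodeStrict are hypothetical helper names.
function decodeChunks(chunks, encoding) {
  const decoder = new TextDecoder(encoding);
  let result = "";
  for (const chunk of chunks) {
    // stream: true retains an incomplete trailing byte sequence for the next call.
    result += decoder.decode(chunk, { stream: true });
  }
  // A final call without the stream option ends the stream and flushes the decoder.
  result += decoder.decode();
  return result;
}

function decodeStrict(bytes, encoding) {
  try {
    return new TextDecoder(encoding, { fatal: true }).decode(bytes);
  } catch (e) {
    // With the fatal flag set, a decoder error throws instead of emitting U+FFFD.
    return null;
  }
}

// U+20AC EURO SIGN is E2 82 AC in UTF-8; here it is split across two chunks.
const streamed = decodeChunks(
  [new Uint8Array([0x41, 0xE2]), new Uint8Array([0x82, 0xAC])], "utf-8");
const lenient = new TextDecoder("utf-8").decode(new Uint8Array([0xFF]));
const strict = decodeStrict(new Uint8Array([0xFF]), "utf-8");
```

Because the decoder object keeps its state between calls, the split byte sequence is reassembled rather than decoded as two errors. The non-fatal decoder substitutes a replacement character for the invalid byte, while the fatal one reports a failure.&lt;br /&gt;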
&lt;br /&gt;
== Algorithms ==&lt;br /&gt;
&lt;br /&gt;
=== Steps to convert a sequence of Unicode characters to a DOMString ===&lt;br /&gt;
&lt;br /&gt;
:TODO: Move the following to [http://dev.w3.org/2006/webapi/WebIDL/#idl-DOMString WebIDL] - [https://www.w3.org/Bugs/Public/show_bug.cgi?id=17620 tracking bug]&lt;br /&gt;
&lt;br /&gt;
The following algorithm defines a way to &amp;lt;b&amp;gt;&amp;lt;i&amp;gt;convert a sequence of Unicode characters to a DOMString&amp;lt;/i&amp;gt;&amp;lt;/b&amp;gt;:&lt;br /&gt;
# Let &amp;lt;var&amp;gt;U&amp;lt;/var&amp;gt;&amp;lt;sub&amp;gt;0...&amp;lt;var&amp;gt;n&amp;lt;/var&amp;gt;-1&amp;lt;/sub&amp;gt; be the sequence of Unicode characters &lt;br /&gt;
# Initialize &amp;lt;var&amp;gt;i&amp;lt;/var&amp;gt; to 0&lt;br /&gt;
# Initialize &amp;lt;var&amp;gt;S&amp;lt;/var&amp;gt; to be an empty sequence of [http://dev.w3.org/2006/webapi/WebIDL/#dfn-code-unit code units]&lt;br /&gt;
# While &amp;lt;var&amp;gt;i&amp;lt;/var&amp;gt; &amp;lt; &amp;lt;var&amp;gt;n&amp;lt;/var&amp;gt;&lt;br /&gt;
## Let &amp;lt;var&amp;gt;c&amp;lt;/var&amp;gt; be the &amp;lt;b&amp;gt;code point&amp;lt;/b&amp;gt; of the Unicode character in &amp;lt;var&amp;gt;U&amp;lt;/var&amp;gt; at index &amp;lt;var&amp;gt;i&amp;lt;/var&amp;gt;&lt;br /&gt;
## If &amp;lt;var&amp;gt;c&amp;lt;/var&amp;gt; &amp;amp;ge; 2&amp;lt;sup&amp;gt;16&amp;lt;/sup&amp;gt;, then:&lt;br /&gt;
### Append to &amp;lt;var&amp;gt;S&amp;lt;/var&amp;gt; a [http://dev.w3.org/2006/webapi/WebIDL/#dfn-code-unit code unit] equal to (&amp;lt;var&amp;gt;c&amp;lt;/var&amp;gt; - 2&amp;lt;sup&amp;gt;16&amp;lt;/sup&amp;gt;) / 2&amp;lt;sup&amp;gt;10&amp;lt;/sup&amp;gt; + 0xD800, where &amp;quot;/&amp;quot; represents integer division.&lt;br /&gt;
### Append to &amp;lt;var&amp;gt;S&amp;lt;/var&amp;gt; a [http://dev.w3.org/2006/webapi/WebIDL/#dfn-code-unit code unit] equal to (&amp;lt;var&amp;gt;c&amp;lt;/var&amp;gt; - 2&amp;lt;sup&amp;gt;16&amp;lt;/sup&amp;gt;) % 2&amp;lt;sup&amp;gt;10&amp;lt;/sup&amp;gt; + 0xDC00, where &amp;quot;%&amp;quot; represents the remainder of an integer division.&lt;br /&gt;
## Otherwise, append to &amp;lt;var&amp;gt;S&amp;lt;/var&amp;gt; a [http://dev.w3.org/2006/webapi/WebIDL/#dfn-code-unit code unit] equal to &amp;lt;var&amp;gt;c&amp;lt;/var&amp;gt;.&lt;br /&gt;
## Set &amp;lt;var&amp;gt;i&amp;lt;/var&amp;gt; to &amp;lt;var&amp;gt;i&amp;lt;/var&amp;gt;+1&lt;br /&gt;
# Return the IDL [http://dev.w3.org/2006/webapi/WebIDL/#idl-DOMString DOMString] value that represents the sequence of code units &amp;lt;var&amp;gt;S&amp;lt;/var&amp;gt;.&lt;br /&gt;
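&lt;br /&gt;
The steps above can be sketched in ECMAScript as follows. This is an illustrative sketch only: the function name &amp;lt;code&amp;gt;codePointsToString&amp;lt;/code&amp;gt; is hypothetical, and the input is assumed to be a short array of valid code points.&lt;br /&gt;

```javascript
// Sketch of the conversion steps; codePointsToString is a hypothetical helper name.
function codePointsToString(codePoints) {
  const units = [];
  for (const c of codePoints) {
    if (c >= 0x10000) {
      // Supplementary code point: append a lead surrogate, then a trail surrogate.
      units.push(Math.floor((c - 0x10000) / 0x400) + 0xD800);
      units.push(((c - 0x10000) % 0x400) + 0xDC00);
    } else {
      // Code points below 2^16 map to a single code unit.
      units.push(c);
    }
  }
  // Spreading is fine for short inputs; very long arrays would need chunking.
  return String.fromCharCode(...units);
}

// U+1F600 lies outside the Basic Multilingual Plane, so it needs two code units.
const sample = codePointsToString([0x41, 0x1F600]);
```

For this input the result has three code units: one for U+0041 and a surrogate pair (0xD83D, 0xDE00) for U+1F600.&lt;br /&gt;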
&lt;br /&gt;
== Examples ==&lt;br /&gt;
&lt;br /&gt;
=== Example #1 - encoding strings ===&lt;br /&gt;
&lt;br /&gt;
The following example uses the API to encode an array of strings into an ArrayBuffer. The result is an ArrayBuffer containing the number of strings (as a Uint32), followed by the length of the first string (as a Uint32), the encoded string data, the length of the second string (as a Uint32), the string data, and so on.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
function encodeArrayOfStrings(strings, encoding) {&lt;br /&gt;
  var encoder, encoded, len, i, bytes, view, offset;&lt;br /&gt;
&lt;br /&gt;
  encoder = new TextEncoder(encoding);&lt;br /&gt;
  encoded = [];&lt;br /&gt;
&lt;br /&gt;
  len = Uint32Array.BYTES_PER_ELEMENT;&lt;br /&gt;
  for (i = 0; i &amp;lt; strings.length; i += 1) {&lt;br /&gt;
    len += Uint32Array.BYTES_PER_ELEMENT;&lt;br /&gt;
    encoded[i] = encoder.encode(strings[i]);&lt;br /&gt;
    len += encoded[i].byteLength;&lt;br /&gt;
  }&lt;br /&gt;
&lt;br /&gt;
  bytes = new Uint8Array(len);&lt;br /&gt;
  view = new DataView(bytes.buffer);&lt;br /&gt;
  offset = 0;&lt;br /&gt;
&lt;br /&gt;
  view.setUint32(offset, strings.length);&lt;br /&gt;
  offset += Uint32Array.BYTES_PER_ELEMENT;&lt;br /&gt;
  for (i = 0; i &amp;lt; encoded.length; i += 1) {&lt;br /&gt;
    len = encoded[i].byteLength;&lt;br /&gt;
    view.setUint32(offset, len);&lt;br /&gt;
    offset += Uint32Array.BYTES_PER_ELEMENT;&lt;br /&gt;
    bytes.set(encoded[i], offset);&lt;br /&gt;
    offset += len;&lt;br /&gt;
  }&lt;br /&gt;
  return bytes.buffer;&lt;br /&gt;
}&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Example #2 - decoding strings ===&lt;br /&gt;
&lt;br /&gt;
The following example decodes an ArrayBuffer containing data encoded in the format produced by the previous example back into an array of strings.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
function decodeArrayOfStrings(buffer, encoding) {&lt;br /&gt;
  var decoder, view, offset, num_strings, strings, i, len;&lt;br /&gt;
&lt;br /&gt;
  decoder = new TextDecoder(encoding);&lt;br /&gt;
  view = new DataView(buffer);&lt;br /&gt;
  offset = 0;&lt;br /&gt;
  strings = [];&lt;br /&gt;
&lt;br /&gt;
  num_strings = view.getUint32(offset);&lt;br /&gt;
  offset += Uint32Array.BYTES_PER_ELEMENT;&lt;br /&gt;
  for (i = 0; i &amp;lt; num_strings; i += 1) {&lt;br /&gt;
    len = view.getUint32(offset);&lt;br /&gt;
    offset += Uint32Array.BYTES_PER_ELEMENT;&lt;br /&gt;
    strings[i] = decoder.decode(&lt;br /&gt;
      new DataView(view.buffer, offset, len));&lt;br /&gt;
    offset += len;&lt;br /&gt;
  }&lt;br /&gt;
  return strings;&lt;br /&gt;
}&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Encodings ==&lt;br /&gt;
&lt;br /&gt;
Encodings are defined and implemented per [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding]. This implicitly includes the steps to &amp;lt;em&amp;gt;get an encoding&amp;lt;/em&amp;gt; from a string, and the logic for label matching and case-insensitivity.&lt;br /&gt;
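&lt;br /&gt;
Label matching can be illustrated with the short sketch below. It is not part of the proposal: it assumes a global &amp;lt;code&amp;gt;TextDecoder&amp;lt;/code&amp;gt; as later standardized, and &amp;lt;code&amp;gt;isSupportedLabel&amp;lt;/code&amp;gt; is a hypothetical helper that treats any exception from the constructor as an unknown label.&lt;br /&gt;

```javascript
// Assumption: a global TextDecoder as later standardized (Node.js, current browsers).
// Different labels, in any letter case, resolve to the same encoding name.
const names = ["utf8", "UTF-8"].map((label) => new TextDecoder(label).encoding);

// isSupportedLabel is a hypothetical helper: the constructor throws for unknown labels.
function isSupportedLabel(label) {
  try {
    new TextDecoder(label);
    return true;
  } catch (e) {
    return false;
  }
}

const known = isSupportedLabel("utf-8");
const unknown = isSupportedLabel("not-a-real-encoding");
```

Both labels yield the same &amp;lt;code&amp;gt;encoding&amp;lt;/code&amp;gt; attribute value, and a label with no match is rejected at construction time.&lt;br /&gt;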
&lt;br /&gt;
User agents MUST NOT support any other encodings or labels than those defined in [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding], and MUST support all encodings and labels defined in that specification, with the additions defined below.&lt;br /&gt;
&lt;br /&gt;
:&#039;&#039;NOTE: In [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding], &amp;quot;ascii&amp;quot; is a label for &#039;&#039;&#039;windows-1252&#039;&#039;&#039;; there is no 7-bit-or-raise-exception encoding. Applications that are required to restrict the content of decoded strings should implement validation after decoding.&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
:&#039;&#039;NOTE: Unicode normalization forms are outside the scope of this specification. No normalization is done prior to encoding or after decoding.&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
:&#039;&#039;NOTE: Handling of encoding-specific issues, e.g. over-long UTF-8 encodings, byte order marks, unmatched surrogate pairs, and so on is defined by [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding].&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
== References ==&lt;br /&gt;
&lt;br /&gt;
* WebIDL http://dev.w3.org/2006/webapi/WebIDL&lt;br /&gt;
* Encoding http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== Acknowledgements ==&lt;br /&gt;
&lt;br /&gt;
* Alan Chaney&lt;br /&gt;
* Ben Noordhuis&lt;br /&gt;
* Glenn Maynard&lt;br /&gt;
* John Tamplin&lt;br /&gt;
* Kenneth Russell (Google, Inc)&lt;br /&gt;
* Robert Mustacchi&lt;br /&gt;
* Ryan Dahl&lt;br /&gt;
* Anne van Kesteren&lt;br /&gt;
&lt;br /&gt;
== Appendix ==&lt;br /&gt;
&lt;br /&gt;
A &amp;quot;shim&amp;quot; implementation in ECMAScript (that may not fully match the current version of the spec) plus some initial unit tests can be found at:&lt;br /&gt;
&lt;br /&gt;
:http://code.google.com/p/stringencoding/&lt;br /&gt;
&lt;br /&gt;
[[Category:Proposals]]&lt;/div&gt;</summary>
		<author><name>Jsbell</name></author>
	</entry>
	<entry>
		<id>https://wiki.whatwg.org/index.php?title=StringEncoding&amp;diff=8361</id>
		<title>StringEncoding</title>
		<link rel="alternate" type="text/html" href="https://wiki.whatwg.org/index.php?title=StringEncoding&amp;diff=8361"/>
		<updated>2012-06-27T20:33:40Z</updated>

		<summary type="html">&lt;p&gt;Jsbell: /* Notes and TODOs */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;Proposed Text Encoding Web API for Typed Arrays&lt;br /&gt;
&lt;br /&gt;
== Editors ==&lt;br /&gt;
&lt;br /&gt;
* Joshua Bell (Google, Inc)&lt;br /&gt;
&lt;br /&gt;
== Abstract ==&lt;br /&gt;
&lt;br /&gt;
This specification defines an API for encoding strings to binary data, and decoding strings from binary data.&lt;br /&gt;
&lt;br /&gt;
NOTE: This specification intentionally does not address the opposite scenario of encoding binary data as strings and decoding binary data from strings, for example using Base64 encoding.&lt;br /&gt;
&lt;br /&gt;
Discussion on this topic has so far taken place on the [http://www.khronos.org/webgl/public-mailing-list/ public_webgl@khronos.org] mailing list. See http://www.khronos.org/webgl/public-mailing-list/archives/1111/msg00017.html for the initial discussion thread.&lt;br /&gt;
&lt;br /&gt;
Discussion has since moved to the [http://www.whatwg.org/mailing-list#specs WHATWG spec discussion] mailing list. See http://lists.whatwg.org/htdig.cgi/whatwg-whatwg.org/2012-March/035038.html for the latest discussion thread.&lt;br /&gt;
&lt;br /&gt;
== Open Issues ==&lt;br /&gt;
&lt;br /&gt;
* Encoding errors&lt;br /&gt;
** Define options dict for &amp;lt;code&amp;gt;TextEncoder()&amp;lt;/code&amp;gt; that allows selecting between throw vs. replacement character / replacement callback?&lt;br /&gt;
* The [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding specification] defines the byte order mark as more authoritative than an explicit encoding label when decoding content. &lt;br /&gt;
** Is that desirable here? &lt;br /&gt;
** What does it mean to e.g. write &amp;lt;code&amp;gt;TextDecoder(&#039;iso-2022-kr&#039;).decode(view)&amp;lt;/code&amp;gt; if the data turns out to be UTF-8?&lt;br /&gt;
** What about streaming and other re-uses of the same decoder object?&lt;br /&gt;
** At the very least, a matching BOM should be ignored (i.e. not returned as part of the output) and a mismatching BOM should signal an error.&lt;br /&gt;
&lt;br /&gt;
== Notes and TODOs ==&lt;br /&gt;
&lt;br /&gt;
* Streaming decode/encode requires retaining partial buffers between calls.&lt;br /&gt;
** Some encode/decode algorithms require adjusting the &amp;lt;var&amp;gt;code point pointer&amp;lt;/var&amp;gt; or &amp;lt;var&amp;gt;byte pointer&amp;lt;/var&amp;gt; by a negative amount. This could occur across &amp;quot;chunk&amp;quot; boundaries. This implies that when the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag is set on an encoder/decoder, the last N elements of the stream are saved for the next call and used as a prefix for the stream. N is defined by the specific encoding algorithm.&lt;br /&gt;
** This is not yet implemented in the [http://code.google.com/p/stringencoding/ ECMAScript shim]&lt;br /&gt;
* Move the algorithm to &amp;lt;b&amp;gt;&amp;lt;i&amp;gt;convert a sequence of Unicode characters to a DOMString&amp;lt;/i&amp;gt;&amp;lt;/b&amp;gt; into [http://dev.w3.org/2006/webapi/WebIDL/#idl-DOMString WebIDL]  - [https://www.w3.org/Bugs/Public/show_bug.cgi?id=17620 tracking bug].&lt;br /&gt;
&lt;br /&gt;
== Resolved Issues ==&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dl&amp;gt;&lt;br /&gt;
&amp;lt;dt&amp;gt;Encode as many characters as possible into a fixed-size buffer for transmission, and repeat starting with next unencoded character&lt;br /&gt;
&amp;lt;dd&amp;gt;Resolution: not for &amp;quot;v1&amp;quot; - can be implemented using this API, and breaking the string down by &amp;quot;character&amp;quot; is unlikely to be as obvious as it sounds - surrogate pairs, combining sequences, etc&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dt&amp;gt;Support two versions of encode; one which takes target buffer, and one which creates/returns a right-sized buffer&lt;br /&gt;
&amp;lt;dd&amp;gt;Resolution: See above, and wait for developer feedback.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dt&amp;gt;Allow arbitrary end byte sequences (e.g. &amp;lt;code&amp;gt;0xFF&amp;lt;/code&amp;gt; for UTF-8 strings)&lt;br /&gt;
&amp;lt;dd&amp;gt;Resolution: Add &amp;lt;code&amp;gt;indexOf&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;ArrayBufferView&amp;lt;/code&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dt&amp;gt;Remove &#039;&#039;binary&#039;&#039; encoding? &#039;&#039;(a proposed 8-bit-clean encoding for interop with legacy binary data stored in ECMAScript strings)&#039;&#039;&lt;br /&gt;
&amp;lt;dd&amp;gt;The only real use is with &amp;lt;code&amp;gt;atob()&amp;lt;/code&amp;gt;/&amp;lt;code&amp;gt;btoa()&amp;lt;/code&amp;gt;; a better API would be Base64 directly in/out of Typed Arrays. Consensus on WHATWG is to add better APIs e.g. &lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
partial interface ArrayBufferView {&lt;br /&gt;
    DOMString toBase64();&lt;br /&gt;
};&lt;br /&gt;
&lt;br /&gt;
partial interface ArrayBuffer {&lt;br /&gt;
    static ArrayBuffer fromBase64(DOMString string);&lt;br /&gt;
};&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dt&amp;gt;Should this be a standalone API (as written), or live on e.g. DataView, or on e.g. String?&lt;br /&gt;
&amp;lt;dd&amp;gt;There seems to be pretty strong consensus that streaming, stateful coding is high priority and that an object-oriented API is cleanest, and shoe-horning those onto existing objects would be messy.&lt;br /&gt;
&amp;lt;/dl&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== API ==&lt;br /&gt;
&lt;br /&gt;
=== TextEncoder ===&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;WebIDL&#039;&#039;&#039;&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
dictionary TextEncodeOptions {&lt;br /&gt;
  boolean stream = false;&lt;br /&gt;
};&lt;br /&gt;
&lt;br /&gt;
[Constructor,&lt;br /&gt;
 Constructor(DOMString encoding)]&lt;br /&gt;
interface TextEncoder {&lt;br /&gt;
  readonly attribute DOMString encoding;&lt;br /&gt;
  ArrayBufferView encode(DOMString? string, optional TextEncodeOptions options);&lt;br /&gt;
};&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The constructor runs the following steps:&lt;br /&gt;
# If the constructor is called with no arguments, let &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt; be &amp;quot;&amp;lt;code&amp;gt;utf-8&amp;lt;/code&amp;gt;&amp;quot;. Otherwise, let &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt; be the value of the &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; argument.&lt;br /&gt;
# Run the &#039;&#039;&#039;steps to get an encoding&#039;&#039;&#039; from [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding], with &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt; as &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt;.&lt;br /&gt;
#* If the steps result in failure, throw an &amp;quot;&amp;lt;code&amp;gt;EncodingError&amp;lt;/code&amp;gt;&amp;quot; exception and terminate these steps.&lt;br /&gt;
#* Otherwise, set the encoder object&#039;s internal &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; property to the returned encoding.&lt;br /&gt;
# Initialize the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag of the encoder object to false. &lt;br /&gt;
# Initialize the internal &amp;lt;var&amp;gt;encoding algorithm state&amp;lt;/var&amp;gt; to the default values for the encoding &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dl&amp;gt;&lt;br /&gt;
&amp;lt;dt&amp;gt;&amp;lt;code&amp;gt;encoding&amp;lt;/code&amp;gt; of type DOMString, readonly&lt;br /&gt;
&amp;lt;dd&amp;gt;Returns the &amp;lt;var&amp;gt;Name&amp;lt;/var&amp;gt; of the encoder object&#039;s &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;, per [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding]. &lt;br /&gt;
&lt;br /&gt;
:Note that this may differ from the name of the encoding specified during the call to the constructor. For example, if the constructor is called with &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; of &amp;lt;code&amp;gt;&amp;quot;ascii&amp;quot;&amp;lt;/code&amp;gt; the &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; attribute of the &#039;&#039;encoder object&#039;&#039; would have the value &amp;lt;code&amp;gt;&amp;quot;windows-1252&amp;quot;&amp;lt;/code&amp;gt; as &amp;lt;code&amp;gt;&amp;quot;ascii&amp;quot;&amp;lt;/code&amp;gt; is a label for that encoding.&lt;br /&gt;
&amp;lt;/dl&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dl&amp;gt;&lt;br /&gt;
&amp;lt;dt&amp;gt;&amp;lt;code&amp;gt;encode&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;dd&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The &amp;lt;code&amp;gt;encode&amp;lt;/code&amp;gt; method runs these steps:&lt;br /&gt;
&lt;br /&gt;
# If the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag is not set, then reset the &amp;lt;var&amp;gt;encoding algorithm state&amp;lt;/var&amp;gt; to the default values for &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;. Otherwise, the &amp;lt;var&amp;gt;encoding algorithm state&amp;lt;/var&amp;gt; is re-used from the previous call to &amp;lt;code&amp;gt;encode&amp;lt;/code&amp;gt; on this object.&lt;br /&gt;
# If the &amp;lt;var&amp;gt;options&amp;lt;/var&amp;gt; parameter is specified and the &amp;lt;code&amp;gt;stream&amp;lt;/code&amp;gt; option is &#039;&#039;&#039;true&#039;&#039;&#039;, then the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag is set; otherwise the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag is cleared.&lt;br /&gt;
# Run the steps of the &amp;lt;var&amp;gt;encoding algorithm&amp;lt;/var&amp;gt;:&lt;br /&gt;
#* The input to the algorithm is a &amp;lt;var&amp;gt;stream of code points&amp;lt;/var&amp;gt;. The stream is composed of the Unicode code points for the Unicode characters produced by following the [http://dev.w3.org/2006/webapi/WebIDL/#dfn-obtain-unicode steps to convert a DOMString to a sequence of Unicode characters] in [http://dev.w3.org/2006/webapi/WebIDL/ WebIDL] with &amp;lt;var&amp;gt;string&amp;lt;/var&amp;gt; as the input. If &amp;lt;var&amp;gt;string&amp;lt;/var&amp;gt; is null, the &amp;lt;var&amp;gt;stream of code points&amp;lt;/var&amp;gt; is empty.&lt;br /&gt;
#* If the &amp;lt;var&amp;gt;options&amp;lt;/var&amp;gt; parameter is not specified or the &amp;lt;code&amp;gt;stream&amp;lt;/code&amp;gt; option is false, then after the final code point is yielded by the stream, the &#039;&#039;&#039;EOF code point&#039;&#039;&#039; is yielded.&lt;br /&gt;
#* The output of the algorithm is a &amp;lt;var&amp;gt;sequence of emitted bytes&amp;lt;/var&amp;gt;.&lt;br /&gt;
# Return a &amp;lt;code&amp;gt;Uint8Array&amp;lt;/code&amp;gt; object wrapping an &amp;lt;code&amp;gt;ArrayBuffer&amp;lt;/code&amp;gt; containing the &amp;lt;var&amp;gt;sequence of emitted bytes&amp;lt;/var&amp;gt; produced by the encoding algorithm.&lt;br /&gt;
&amp;lt;/dl&amp;gt;&lt;br /&gt;
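&lt;br /&gt;
The default-label case of the steps above can be illustrated with a short sketch. It assumes a runtime that exposes a global &amp;lt;code&amp;gt;TextEncoder&amp;lt;/code&amp;gt; (for example a current browser or Node.js); such implementations may not accept other labels or the &amp;lt;code&amp;gt;stream&amp;lt;/code&amp;gt; option described in this draft, so only the no-argument form is shown. The variable names are illustrative only.&lt;br /&gt;

```javascript
// Sketch only: assumes a global TextEncoder is available in the runtime.
// With no constructor argument, the label defaults to "utf-8".
const encoder = new TextEncoder();

// The encoding attribute reports the name of the selected encoding.
const encodingName = encoder.encoding;

// "h\u00E9llo" contains U+00E9, which takes two bytes in UTF-8.
const bytes = encoder.encode("h\u00E9llo");

// Copy into a plain array for easy inspection.
const byteList = Array.from(bytes);
```

The returned object is a &amp;lt;code&amp;gt;Uint8Array&amp;lt;/code&amp;gt;, so its &amp;lt;code&amp;gt;byteLength&amp;lt;/code&amp;gt; (6 here) can differ from the length of the input string (5 here).&lt;br /&gt;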
&lt;br /&gt;
=== TextDecoder ===&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;WebIDL&#039;&#039;&#039;&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
dictionary TextDecoderOptions {&lt;br /&gt;
  boolean fatal = false;&lt;br /&gt;
};&lt;br /&gt;
&lt;br /&gt;
dictionary TextDecodeOptions {&lt;br /&gt;
  boolean stream = false;&lt;br /&gt;
};&lt;br /&gt;
&lt;br /&gt;
[Constructor,&lt;br /&gt;
 Constructor(optional DOMString encoding, optional TextDecoderOptions options)]&lt;br /&gt;
interface TextDecoder {&lt;br /&gt;
  readonly attribute DOMString encoding;&lt;br /&gt;
  DOMString decode(optional ArrayBufferView view, optional TextDecodeOptions options);&lt;br /&gt;
};&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The constructor runs the following steps:&lt;br /&gt;
# The constructor creates a &#039;&#039;decoder object&#039;&#039;. It has the internal properties &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;, a &amp;lt;var&amp;gt;fatal&amp;lt;/var&amp;gt; flag which is initially unset, and a &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag which is initially unset.&lt;br /&gt;
# If called without an &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; argument, let &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt; be &amp;quot;&amp;lt;code&amp;gt;utf-8&amp;lt;/code&amp;gt;&amp;quot;. Otherwise, let &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt; be the value of the &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; argument.&lt;br /&gt;
# Run the &#039;&#039;&#039;steps to get an encoding&#039;&#039;&#039; from [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding], with &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt; as &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt;. &lt;br /&gt;
## If the steps result in failure, throw an &amp;quot;&amp;lt;code&amp;gt;EncodingError&amp;lt;/code&amp;gt;&amp;quot; exception and terminate these steps.&lt;br /&gt;
## Otherwise, set the &#039;&#039;decoder object&#039;s&#039;&#039; internal &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; property to the returned encoding.&lt;br /&gt;
# If the constructor is called with an &amp;lt;var&amp;gt;options&amp;lt;/var&amp;gt; argument, and the &amp;lt;var&amp;gt;fatal&amp;lt;/var&amp;gt; property of the dictionary is set, set the internal &amp;lt;var&amp;gt;fatal&amp;lt;/var&amp;gt; flag of the &#039;&#039;decoder object&#039;&#039;.&lt;br /&gt;
# Initialize the internal &amp;lt;var&amp;gt;encoding algorithm state&amp;lt;/var&amp;gt; to the default values for the encoding &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;.&lt;br /&gt;
# Return the &#039;&#039;decoder object&#039;&#039;.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dl&amp;gt;&lt;br /&gt;
&amp;lt;dt&amp;gt;&amp;lt;code&amp;gt;encoding&amp;lt;/code&amp;gt; of type DOMString, readonly&lt;br /&gt;
&amp;lt;dd&amp;gt;&lt;br /&gt;
Returns the &amp;lt;var&amp;gt;Name&amp;lt;/var&amp;gt; of the decoder object&#039;s &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;, per [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding].&lt;br /&gt;
&lt;br /&gt;
:Note that this may differ from the name of the encoding specified during the call to the constructor. For example, if the constructor is called with &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; of &amp;lt;code&amp;gt;&amp;quot;ascii&amp;quot;&amp;lt;/code&amp;gt; the &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; attribute of the &#039;&#039;decoder object&#039;&#039; would have the value &amp;lt;code&amp;gt;&amp;quot;windows-1252&amp;quot;&amp;lt;/code&amp;gt; as &amp;lt;code&amp;gt;&amp;quot;ascii&amp;quot;&amp;lt;/code&amp;gt; is a label for that encoding.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dt&amp;gt;&amp;lt;code&amp;gt;decode&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;dd&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The &amp;lt;code&amp;gt;decode&amp;lt;/code&amp;gt; method runs the following steps:&lt;br /&gt;
&lt;br /&gt;
# If the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag of the &#039;&#039;decoder object&#039;&#039; is not set, then reset the &amp;lt;var&amp;gt;encoding algorithm state&amp;lt;/var&amp;gt; to the default values for encoding &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;. Otherwise, the &amp;lt;var&amp;gt;encoding algorithm state&amp;lt;/var&amp;gt; is re-used from the previous call to &amp;lt;code&amp;gt;decode&amp;lt;/code&amp;gt; on this object. &lt;br /&gt;
# If the &amp;lt;var&amp;gt;options&amp;lt;/var&amp;gt; parameter is specified and the &amp;lt;var&amp;gt;stream&amp;lt;/var&amp;gt; option is &#039;&#039;&#039;true&#039;&#039;&#039;, then the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag is set; otherwise the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag is cleared.&lt;br /&gt;
# Run the decoder algorithm of the &#039;&#039;decoder object&#039;s&#039;&#039; encoding:&lt;br /&gt;
#* The input to the algorithm is a &amp;lt;var&amp;gt;byte stream&amp;lt;/var&amp;gt;. The &amp;lt;var&amp;gt;byte stream&amp;lt;/var&amp;gt; is provided by the bytes in &amp;lt;var&amp;gt;view.buffer&amp;lt;/var&amp;gt; starting at offset &amp;lt;var&amp;gt;view.byteOffset&amp;lt;/var&amp;gt;. A maximum of &amp;lt;var&amp;gt;view.byteLength&amp;lt;/var&amp;gt; bytes are yielded by the stream from &amp;lt;var&amp;gt;view.buffer&amp;lt;/var&amp;gt;. If &amp;lt;var&amp;gt;view&amp;lt;/var&amp;gt; is not specified, the stream is empty.&lt;br /&gt;
#* If the &amp;lt;var&amp;gt;options&amp;lt;/var&amp;gt; parameter is not specified or the &#039;&#039;&#039;stream&#039;&#039;&#039; option is &#039;&#039;&#039;false&#039;&#039;&#039;, then after &amp;lt;var&amp;gt;view.byteLength&amp;lt;/var&amp;gt; bytes are yielded by the stream, the &#039;&#039;&#039;EOF byte&#039;&#039;&#039; is yielded.&lt;br /&gt;
#* If the internal &amp;lt;var&amp;gt;fatal&amp;lt;/var&amp;gt; flag of the &#039;&#039;decoder object&#039;&#039; is set, then a &#039;&#039;&#039;decoder error&#039;&#039;&#039; causes a &amp;lt;code&amp;gt;DOMException&amp;lt;/code&amp;gt; of type &amp;lt;code&amp;gt;EncodingError&amp;lt;/code&amp;gt; to be thrown rather than emitting a fallback code point.&lt;br /&gt;
#* The output of the algorithm is a sequence of &amp;lt;var&amp;gt;emitted code points&amp;lt;/var&amp;gt;.&lt;br /&gt;
# Return a &amp;lt;code&amp;gt;DOMString&amp;lt;/code&amp;gt; by encoding the &amp;lt;var&amp;gt;sequence of emitted code points&amp;lt;/var&amp;gt; following the [[#Steps to convert a sequence of Unicode characters to a DOMString|steps to convert a sequence of Unicode characters to a DOMString]].&lt;br /&gt;
&lt;br /&gt;
:&#039;&#039;ISSUE: Need to handle BOMs. [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding] specifies this by using the caller-specified encoding as a suggestion, and consuming the BOM as a part of selecting the real encoder where BOM takes precedence. At the very least a matching BOM should be ignored. A mismatching BOM could throw, be a decoding error, or could actually switch the decoder for this stream (call or sequence of calls), possibly if-and-only-if the constructor was called without an encoding.&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/dl&amp;gt;&lt;br /&gt;
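&lt;br /&gt;
The effect of the &amp;lt;code&amp;gt;stream&amp;lt;/code&amp;gt; option and the &amp;lt;var&amp;gt;fatal&amp;lt;/var&amp;gt; flag can be sketched as follows. This assumes a runtime with a global &amp;lt;code&amp;gt;TextDecoder&amp;lt;/code&amp;gt; that accepts the two option dictionaries shown in the WebIDL above. The exception type thrown by a given implementation may differ from the &amp;lt;code&amp;gt;EncodingError&amp;lt;/code&amp;gt; named in this draft, so the sketch only checks that something is thrown.&lt;br /&gt;

```javascript
// Sketch only: assumes a global TextDecoder is available in the runtime.

// U+20AC (euro sign) is the three bytes E2 82 AC in UTF-8.
// Here the sequence is split across two chunks.
const decoder = new TextDecoder("utf-8");

// stream: true keeps the partial sequence as state for the next call.
const first = decoder.decode(new Uint8Array([0xE2, 0x82]), { stream: true });

// No stream option on the last chunk, so the end of input is signalled.
const second = decoder.decode(new Uint8Array([0xAC]));
const text = first + second;

// With the fatal flag set, a malformed byte sequence throws
// instead of producing a fallback code point.
const strict = new TextDecoder("utf-8", { fatal: true });
let threw = false;
try {
  strict.decode(new Uint8Array([0xFF])); // 0xFF never appears in valid UTF-8
} catch (e) {
  threw = true;
}
```

Without &amp;lt;code&amp;gt;stream&amp;lt;/code&amp;gt; set on the first call, the incomplete sequence would be treated as a decoder error at the end of that call rather than carried over.&lt;br /&gt;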
&lt;br /&gt;
== Algorithms ==&lt;br /&gt;
&lt;br /&gt;
=== Steps to convert a sequence of Unicode characters to a DOMString ===&lt;br /&gt;
&lt;br /&gt;
:TODO: Move the following to [http://dev.w3.org/2006/webapi/WebIDL/#idl-DOMString WebIDL] - [https://www.w3.org/Bugs/Public/show_bug.cgi?id=17620 tracking bug]&lt;br /&gt;
&lt;br /&gt;
The following algorithm defines a way to &amp;lt;b&amp;gt;&amp;lt;i&amp;gt;convert a sequence of Unicode characters to a DOMString&amp;lt;/i&amp;gt;&amp;lt;/b&amp;gt;:&lt;br /&gt;
# Let &amp;lt;var&amp;gt;U&amp;lt;/var&amp;gt;&amp;lt;sub&amp;gt;0...&amp;lt;var&amp;gt;n&amp;lt;/var&amp;gt;-1&amp;lt;/sub&amp;gt; be the sequence of Unicode characters &lt;br /&gt;
# Initialize &amp;lt;var&amp;gt;i&amp;lt;/var&amp;gt; to 0&lt;br /&gt;
# Initialize &amp;lt;var&amp;gt;S&amp;lt;/var&amp;gt; to be an empty sequence of [http://dev.w3.org/2006/webapi/WebIDL/#dfn-code-unit code units]&lt;br /&gt;
# While &amp;lt;var&amp;gt;i&amp;lt;/var&amp;gt; &amp;lt; &amp;lt;var&amp;gt;n&amp;lt;/var&amp;gt;&lt;br /&gt;
## Let &amp;lt;var&amp;gt;c&amp;lt;/var&amp;gt; be the &amp;lt;b&amp;gt;code point&amp;lt;/b&amp;gt; of the Unicode character in &amp;lt;var&amp;gt;U&amp;lt;/var&amp;gt; at index &amp;lt;var&amp;gt;i&amp;lt;/var&amp;gt;&lt;br /&gt;
## If &amp;lt;var&amp;gt;c&amp;lt;/var&amp;gt; &amp;amp;ge; 2&amp;lt;sup&amp;gt;16&amp;lt;/sup&amp;gt;, then:&lt;br /&gt;
### Append to &amp;lt;var&amp;gt;S&amp;lt;/var&amp;gt; a [http://dev.w3.org/2006/webapi/WebIDL/#dfn-code-unit code unit] equal to (&amp;lt;var&amp;gt;c&amp;lt;/var&amp;gt; - 2&amp;lt;sup&amp;gt;16&amp;lt;/sup&amp;gt;) / 2&amp;lt;sup&amp;gt;10&amp;lt;/sup&amp;gt; + 0xD800, where &amp;quot;/&amp;quot; represents integer division.&lt;br /&gt;
### Append to &amp;lt;var&amp;gt;S&amp;lt;/var&amp;gt; a [http://dev.w3.org/2006/webapi/WebIDL/#dfn-code-unit code unit] equal to (&amp;lt;var&amp;gt;c&amp;lt;/var&amp;gt; - 2&amp;lt;sup&amp;gt;16&amp;lt;/sup&amp;gt;) % 2&amp;lt;sup&amp;gt;10&amp;lt;/sup&amp;gt; + 0xDC00, where &amp;quot;%&amp;quot; represents the remainder of an integer division.&lt;br /&gt;
## Otherwise, append to &amp;lt;var&amp;gt;S&amp;lt;/var&amp;gt; a [http://dev.w3.org/2006/webapi/WebIDL/#dfn-code-unit code unit] equal to &amp;lt;var&amp;gt;c&amp;lt;/var&amp;gt;.&lt;br /&gt;
## Set &amp;lt;var&amp;gt;i&amp;lt;/var&amp;gt; to &amp;lt;var&amp;gt;i&amp;lt;/var&amp;gt;+1&lt;br /&gt;
# Return the IDL [http://dev.w3.org/2006/webapi/WebIDL/#idl-DOMString DOMString] value that represents the sequence of code units &amp;lt;var&amp;gt;S&amp;lt;/var&amp;gt;.&lt;br /&gt;
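&lt;br /&gt;
The algorithm above can be sketched in ECMAScript as follows. This is an illustration rather than part of the proposal: the function names &amp;lt;code&amp;gt;codePointsToCodeUnits&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;codePointsToString&amp;lt;/code&amp;gt; are hypothetical, and the input is assumed to be an array of valid Unicode code points.&lt;br /&gt;

```javascript
// Convert an array of code points (U) into an array of UTF-16 code units (S).
function codePointsToCodeUnits(codePoints) {
  const units = [];
  for (const c of codePoints) {
    if (c >= 0x10000) {
      // c is 2^16 or greater: emit a surrogate pair.
      const offset = c - 0x10000;
      units.push(Math.floor(offset / 0x400) + 0xD800); // integer division by 2^10
      units.push((offset % 0x400) + 0xDC00);           // remainder of that division
    } else {
      // Otherwise the code unit equals the code point.
      units.push(c);
    }
  }
  return units;
}

// Build a string from the code units.
// Spreading is fine for short inputs; very long arrays would need chunking.
function codePointsToString(codePoints) {
  return String.fromCharCode(...codePointsToCodeUnits(codePoints));
}
```

For example, U+1F600 lies above the 16-bit range and becomes the pair 0xD83D, 0xDE00, while U+0041 is emitted unchanged as 0x0041.&lt;br /&gt;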
&lt;br /&gt;
== Examples ==&lt;br /&gt;
&lt;br /&gt;
=== Example #1 - encoding strings ===&lt;br /&gt;
&lt;br /&gt;
The following example uses the API to encode an array of strings into an ArrayBuffer. The result is an ArrayBuffer containing the number of strings (as a Uint32), followed by the length of the first string (as a Uint32), the encoded string data, the length of the second string (as a Uint32), the string data, and so on.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
function encodeArrayOfStrings(strings, encoding) {&lt;br /&gt;
  var encoder, encoded, len, i, bytes, view, offset;&lt;br /&gt;
&lt;br /&gt;
  // Create one encoder and reuse it for every string.&lt;br /&gt;
  encoder = new TextEncoder(encoding);&lt;br /&gt;
  encoded = [];&lt;br /&gt;
&lt;br /&gt;
  // First pass: encode each string and total the bytes needed.&lt;br /&gt;
  len = Uint32Array.BYTES_PER_ELEMENT;&lt;br /&gt;
  for (i = 0; i &amp;lt; strings.length; i += 1) {&lt;br /&gt;
    len += Uint32Array.BYTES_PER_ELEMENT;&lt;br /&gt;
    encoded[i] = encoder.encode(strings[i]);&lt;br /&gt;
    len += encoded[i].byteLength;&lt;br /&gt;
  }&lt;br /&gt;
&lt;br /&gt;
  bytes = new Uint8Array(len);&lt;br /&gt;
  view = new DataView(bytes.buffer);&lt;br /&gt;
  offset = 0;&lt;br /&gt;
&lt;br /&gt;
  view.setUint32(offset, strings.length);&lt;br /&gt;
  offset += Uint32Array.BYTES_PER_ELEMENT;&lt;br /&gt;
  for (i = 0; i &amp;lt; encoded.length; i += 1) {&lt;br /&gt;
    len = encoded[i].byteLength;&lt;br /&gt;
    view.setUint32(offset, len);&lt;br /&gt;
    offset += Uint32Array.BYTES_PER_ELEMENT;&lt;br /&gt;
    bytes.set(encoded[i], offset);&lt;br /&gt;
    offset += len;&lt;br /&gt;
  }&lt;br /&gt;
  return bytes.buffer;&lt;br /&gt;
}&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Example #2 - decoding strings ===&lt;br /&gt;
&lt;br /&gt;
The following example decodes an ArrayBuffer containing data encoded in the format produced by the previous example back into an array of strings.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
function decodeArrayOfStrings(buffer, encoding) {&lt;br /&gt;
  var decoder, view, offset, num_strings, strings, i, len;&lt;br /&gt;
&lt;br /&gt;
  decoder = new TextDecoder(encoding);&lt;br /&gt;
  view = new DataView(buffer);&lt;br /&gt;
  offset = 0;&lt;br /&gt;
  strings = [];&lt;br /&gt;
&lt;br /&gt;
  num_strings = view.getUint32(offset);&lt;br /&gt;
  offset += Uint32Array.BYTES_PER_ELEMENT;&lt;br /&gt;
  for (i = 0; i &amp;lt; num_strings; i += 1) {&lt;br /&gt;
    len = view.getUint32(offset);&lt;br /&gt;
    offset += Uint32Array.BYTES_PER_ELEMENT;&lt;br /&gt;
    strings[i] = decoder.decode(&lt;br /&gt;
      new DataView(view.buffer, offset, len));&lt;br /&gt;
    offset += len;&lt;br /&gt;
  }&lt;br /&gt;
  return strings;&lt;br /&gt;
}&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Encodings ==&lt;br /&gt;
&lt;br /&gt;
Encodings are defined and implemented per [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding]. This implicitly includes the steps to &amp;lt;em&amp;gt;get an encoding&amp;lt;/em&amp;gt; from a string, and the logic for label matching and case-insensitivity.&lt;br /&gt;
&lt;br /&gt;
User agents MUST NOT support any other encodings or labels than those defined in [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding], and MUST support all encodings and labels defined in that specification, with the additions defined below.&lt;br /&gt;
&lt;br /&gt;
:&#039;&#039;NOTE: In [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding], &amp;quot;ascii&amp;quot; is a label for &#039;&#039;&#039;windows-1252&#039;&#039;&#039;; there is no 7-bit-or-raise-exception encoding. Applications that are required to restrict the content of decoded strings should implement validation after decoding.&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
:&#039;&#039;NOTE: Unicode normalization forms are outside the scope of this specification. No normalization is done prior to encoding or after decoding.&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
:&#039;&#039;NOTE: Handling of encoding-specific issues, e.g. over-long UTF-8 encodings, byte order marks, unmatched surrogate pairs, and so on is defined by [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding].&#039;&#039;&lt;br /&gt;
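&lt;br /&gt;
The note above about validating after decoding can be illustrated with a minimal sketch. The helper names &amp;lt;code&amp;gt;isSevenBitClean&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;decodeStrictAscii&amp;lt;/code&amp;gt; are hypothetical. The sketch assumes a global &amp;lt;code&amp;gt;TextDecoder&amp;lt;/code&amp;gt; and uses the &amp;lt;code&amp;gt;utf-8&amp;lt;/code&amp;gt; label so that it also runs where legacy encodings are not bundled; bytes in the 7-bit range decode identically either way.&lt;br /&gt;

```javascript
// Hypothetical helper: true when every code unit is in the 7-bit ASCII range.
function isSevenBitClean(text) {
  for (let i = 0; i !== text.length; i += 1) {
    if (text.charCodeAt(i) > 0x7F) {
      return false;
    }
  }
  return true;
}

// Hypothetical helper: decode first, then reject anything outside ASCII.
function decodeStrictAscii(view) {
  const text = new TextDecoder("utf-8").decode(view);
  if (!isSevenBitClean(text)) {
    throw new RangeError("decoded text contains non-ASCII characters");
  }
  return text;
}
```

Validation happens on the decoded string, so the decoder itself stays a standard one and the restriction lives in application code.&lt;br /&gt;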
&lt;br /&gt;
== References ==&lt;br /&gt;
&lt;br /&gt;
* WebIDL http://dev.w3.org/2006/webapi/WebIDL&lt;br /&gt;
* Encoding http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== Acknowledgements ==&lt;br /&gt;
&lt;br /&gt;
* Alan Chaney&lt;br /&gt;
* Ben Noordhuis&lt;br /&gt;
* Glenn Maynard&lt;br /&gt;
* John Tamplin&lt;br /&gt;
* Kenneth Russell (Google, Inc)&lt;br /&gt;
* Robert Mustacchi&lt;br /&gt;
* Ryan Dahl&lt;br /&gt;
* Anne van Kesteren&lt;br /&gt;
&lt;br /&gt;
== Appendix ==&lt;br /&gt;
&lt;br /&gt;
A &amp;quot;shim&amp;quot; implementation in ECMAScript (that may not fully match the current version of the spec) plus some initial unit tests can be found at:&lt;br /&gt;
&lt;br /&gt;
:http://code.google.com/p/stringencoding/&lt;br /&gt;
&lt;br /&gt;
[[Category:Proposals]]&lt;/div&gt;</summary>
		<author><name>Jsbell</name></author>
	</entry>
	<entry>
		<id>https://wiki.whatwg.org/index.php?title=StringEncoding&amp;diff=8360</id>
		<title>StringEncoding</title>
		<link rel="alternate" type="text/html" href="https://wiki.whatwg.org/index.php?title=StringEncoding&amp;diff=8360"/>
		<updated>2012-06-27T20:32:50Z</updated>

		<summary type="html">&lt;p&gt;Jsbell: /* Steps to convert a sequence of Unicode characters to a DOMString */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;Proposed Text Encoding Web API for Typed Arrays&lt;br /&gt;
&lt;br /&gt;
== Editors ==&lt;br /&gt;
&lt;br /&gt;
* Joshua Bell (Google, Inc)&lt;br /&gt;
&lt;br /&gt;
== Abstract ==&lt;br /&gt;
&lt;br /&gt;
This specification defines an API for encoding strings to binary data, and decoding strings from binary data.&lt;br /&gt;
&lt;br /&gt;
NOTE: This specification intentionally does not address the opposite scenario of encoding binary data as strings and decoding binary data from strings, for example using Base64 encoding.&lt;br /&gt;
&lt;br /&gt;
Discussion on this topic has so far taken place on the [http://www.khronos.org/webgl/public-mailing-list/ public_webgl@khronos.org] mailing list. See http://www.khronos.org/webgl/public-mailing-list/archives/1111/msg00017.html for the initial discussion thread.&lt;br /&gt;
&lt;br /&gt;
Discussion has since moved to the [http://www.whatwg.org/mailing-list#specs WHATWG spec discussion] mailing list. See http://lists.whatwg.org/htdig.cgi/whatwg-whatwg.org/2012-March/035038.html for the latest discussion thread.&lt;br /&gt;
&lt;br /&gt;
== Open Issues ==&lt;br /&gt;
&lt;br /&gt;
* Encoding errors&lt;br /&gt;
** Define options dict for &amp;lt;code&amp;gt;TextEncoder()&amp;lt;/code&amp;gt; that allows selecting between throw vs. replacement character / replacement callback?&lt;br /&gt;
* The [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding specification] defines the byte order mark as more authoritative than an explicit encoding label when decoding content. &lt;br /&gt;
** Is that desirable here? &lt;br /&gt;
** What does it mean to e.g. write &amp;lt;code&amp;gt;TextDecoder(&#039;iso-2022-kr&#039;).decode(str)&amp;lt;/code&amp;gt; if string turns out to be UTF-8?&lt;br /&gt;
** What about streaming and other re-uses of the same decoder object?&lt;br /&gt;
** At the very least, a matching BOM should be ignored (i.e. not returned as part of the output) and a mismatching BOM should signal an error.&lt;br /&gt;
&lt;br /&gt;
== Notes and TODOs ==&lt;br /&gt;
&lt;br /&gt;
* Streaming decode/encode requires retaining partial buffers between calls.&lt;br /&gt;
** Some encode/decode algorithms require adjusting the &amp;lt;var&amp;gt;code point pointer&amp;lt;/var&amp;gt; or &amp;lt;var&amp;gt;byte pointer&amp;lt;/var&amp;gt; by a negative amount. This could occur across &amp;quot;chunk&amp;quot; boundaries. This implies that when the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag is set on an encoder/decoder, the last N elements of the stream are saved for the next call and used as a prefix for the stream. N is defined by the specific encoding algorithm.&lt;br /&gt;
** This is not yet implemented in the [http://code.google.com/p/stringencoding/ ECMAScript shim]&lt;br /&gt;
* Move the algorithm to &amp;lt;b&amp;gt;&amp;lt;i&amp;gt;convert a sequence of Unicode characters to a DOMString&amp;lt;/i&amp;gt;&amp;lt;/b&amp;gt; into [http://dev.w3.org/2006/webapi/WebIDL/#idl-DOMString WebIDL].&lt;br /&gt;
&lt;br /&gt;
== Resolved Issues ==&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dl&amp;gt;&lt;br /&gt;
&amp;lt;dt&amp;gt;Encode as many characters as possible into a fixed-size buffer for transmission, and repeat starting with next unencoded character&lt;br /&gt;
&amp;lt;dd&amp;gt;Resolution: not for &amp;quot;v1&amp;quot; - can be implemented using this API, and breaking the string down by &amp;quot;character&amp;quot; is unlikely to be as obvious as it sounds - surrogate pairs, combining sequences, etc&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dt&amp;gt;Support two versions of encode; one which takes target buffer, and one which creates/returns a right-sized buffer&lt;br /&gt;
&amp;lt;dd&amp;gt;Resolution: See above, and wait for developer feedback.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dt&amp;gt;Allow arbitrary end byte sequences (e.g. &amp;lt;code&amp;gt;0xFF&amp;lt;/code&amp;gt; for UTF-8 strings)&lt;br /&gt;
&amp;lt;dd&amp;gt;Resolution: Add &amp;lt;code&amp;gt;indexOf&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;ArrayBufferView&amp;lt;/code&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dt&amp;gt;Remove &#039;&#039;binary&#039;&#039; encoding? &#039;&#039;(a proposed 8-bit-clean encoding for interop with legacy binary data stored in ECMAScript strings)&#039;&#039;&lt;br /&gt;
&amp;lt;dd&amp;gt;The only real use is with &amp;lt;code&amp;gt;atob()&amp;lt;/code&amp;gt;/&amp;lt;code&amp;gt;btoa()&amp;lt;/code&amp;gt;; a better API would be Base64 directly in/out of Typed Arrays. Consensus on WHATWG is to add better APIs e.g. &lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
partial interface ArrayBufferView {&lt;br /&gt;
    DOMString toBase64();&lt;br /&gt;
};&lt;br /&gt;
&lt;br /&gt;
partial interface ArrayBuffer {&lt;br /&gt;
    static ArrayBuffer fromBase64(DOMString string);&lt;br /&gt;
};&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dt&amp;gt;Should this be a standalone API (as written), or live on e.g. DataView, or on e.g. String?&lt;br /&gt;
&amp;lt;dd&amp;gt;There seems to be pretty strong consensus that streaming, stateful coding is high priority and that an object-oriented API is cleanest, and shoe-horning those onto existing objects would be messy.&lt;br /&gt;
&amp;lt;/dl&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== API ==&lt;br /&gt;
&lt;br /&gt;
=== TextEncoder ===&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;WebIDL&#039;&#039;&#039;&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
dictionary TextEncodeOptions {&lt;br /&gt;
  boolean stream = false;&lt;br /&gt;
};&lt;br /&gt;
&lt;br /&gt;
[Constructor,&lt;br /&gt;
 Constructor(DOMString encoding)]&lt;br /&gt;
interface TextEncoder {&lt;br /&gt;
  readonly attribute DOMString encoding;&lt;br /&gt;
  ArrayBufferView encode(DOMString? string, optional TextEncodeOptions options);&lt;br /&gt;
};&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The constructor runs the following steps:&lt;br /&gt;
# If the constructor is called with no arguments, let &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt; be &amp;quot;&amp;lt;code&amp;gt;utf-8&amp;lt;/code&amp;gt;&amp;quot;. Otherwise, let &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt; be the value of the &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; argument.&lt;br /&gt;
# Run the &#039;&#039;&#039;steps to get an encoding&#039;&#039;&#039; from [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding], with &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt; as &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt;.&lt;br /&gt;
#* If the steps result in failure, throw an &amp;quot;&amp;lt;code&amp;gt;EncodingError&amp;lt;/code&amp;gt;&amp;quot; exception and terminate these steps.&lt;br /&gt;
#* Otherwise, set the encoder object&#039;s internal &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; property to the returned encoding.&lt;br /&gt;
# Initialize the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag of the encoder object to false. &lt;br /&gt;
# Initialize the internal &amp;lt;var&amp;gt;encoding algorithm state&amp;lt;/var&amp;gt; to the default values for the encoding &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dl&amp;gt;&lt;br /&gt;
&amp;lt;dt&amp;gt;&amp;lt;code&amp;gt;encoding&amp;lt;/code&amp;gt; of type DOMString, readonly&lt;br /&gt;
&amp;lt;dd&amp;gt;Returns the &amp;lt;var&amp;gt;Name&amp;lt;/var&amp;gt; of the encoder object&#039;s &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;, per [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding]. &lt;br /&gt;
&lt;br /&gt;
:Note that this may differ from the name of the encoding specified during the call to the constructor. For example, if the constructor is called with &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; of &amp;lt;code&amp;gt;&amp;quot;ascii&amp;quot;&amp;lt;/code&amp;gt; the &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; attribute of the &#039;&#039;encoder object&#039;&#039; would have the value &amp;lt;code&amp;gt;&amp;quot;windows-1252&amp;quot;&amp;lt;/code&amp;gt; as &amp;lt;code&amp;gt;&amp;quot;ascii&amp;quot;&amp;lt;/code&amp;gt; is a label for that encoding.&lt;br /&gt;
&amp;lt;/dl&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dl&amp;gt;&lt;br /&gt;
&amp;lt;dt&amp;gt;&amp;lt;code&amp;gt;encode&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;dd&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The &amp;lt;code&amp;gt;encode&amp;lt;/code&amp;gt; method runs these steps:&lt;br /&gt;
&lt;br /&gt;
# If the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag is not set, then reset the &amp;lt;var&amp;gt;encoding algorithm state&amp;lt;/var&amp;gt; to the default values for &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;. Otherwise, the &amp;lt;var&amp;gt;encoding algorithm state&amp;lt;/var&amp;gt; is re-used from the previous call to &amp;lt;code&amp;gt;encode&amp;lt;/code&amp;gt; on this object.&lt;br /&gt;
# If the &amp;lt;var&amp;gt;options&amp;lt;/var&amp;gt; parameter is specified and the &amp;lt;code&amp;gt;stream&amp;lt;/code&amp;gt; option is &#039;&#039;&#039;true&#039;&#039;&#039;, then the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag is set; otherwise the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag is cleared.&lt;br /&gt;
# Run the steps of the &amp;lt;var&amp;gt;encoding algorithm&amp;lt;/var&amp;gt;:&lt;br /&gt;
#* The input to the algorithm is a &amp;lt;var&amp;gt;stream of code points&amp;lt;/var&amp;gt;. The stream is composed of the Unicode code points for the Unicode characters produced by following the [http://dev.w3.org/2006/webapi/WebIDL/#dfn-obtain-unicode steps to convert a DOMString to a sequence of Unicode characters] in [http://dev.w3.org/2006/webapi/WebIDL/ WebIDL] with &amp;lt;var&amp;gt;string&amp;lt;/var&amp;gt; as the input. If &amp;lt;var&amp;gt;string&amp;lt;/var&amp;gt; is null, the &amp;lt;var&amp;gt;stream of code points&amp;lt;/var&amp;gt; is empty.&lt;br /&gt;
#* If the &amp;lt;var&amp;gt;options&amp;lt;/var&amp;gt; parameter is not specified or the &amp;lt;code&amp;gt;stream&amp;lt;/code&amp;gt; option is false, then after the final code point is yielded by the stream, the &#039;&#039;&#039;EOF code point&#039;&#039;&#039; is yielded.&lt;br /&gt;
#* The output of the algorithm is a &amp;lt;var&amp;gt;sequence of emitted bytes&amp;lt;/var&amp;gt;.&lt;br /&gt;
# Return a &amp;lt;code&amp;gt;Uint8Array&amp;lt;/code&amp;gt; object wrapping an &amp;lt;code&amp;gt;ArrayBuffer&amp;lt;/code&amp;gt; containing the &amp;lt;var&amp;gt;sequence of emitted bytes&amp;lt;/var&amp;gt; produced by the encoding algorithm.&lt;br /&gt;
&amp;lt;/dl&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== TextDecoder ===&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;WebIDL&#039;&#039;&#039;&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
dictionary TextDecoderOptions {&lt;br /&gt;
  boolean fatal = false;&lt;br /&gt;
};&lt;br /&gt;
&lt;br /&gt;
dictionary TextDecodeOptions {&lt;br /&gt;
  boolean stream = false;&lt;br /&gt;
};&lt;br /&gt;
&lt;br /&gt;
[Constructor,&lt;br /&gt;
 Constructor(optional DOMString encoding, optional TextDecoderOptions options)]&lt;br /&gt;
interface TextDecoder {&lt;br /&gt;
  readonly attribute DOMString encoding;&lt;br /&gt;
  DOMString decode(optional ArrayBufferView view, optional TextDecodeOptions options);&lt;br /&gt;
};&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The constructor runs the following steps:&lt;br /&gt;
# The constructor creates a &#039;&#039;decoder object&#039;&#039;. It has the internal properties &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;, a &amp;lt;var&amp;gt;fatal&amp;lt;/var&amp;gt; flag which is initially unset, and a &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag which is initially unset.&lt;br /&gt;
# If called without an &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; argument, let &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt; be &amp;quot;&amp;lt;code&amp;gt;utf-8&amp;lt;/code&amp;gt;&amp;quot;. Otherwise, let &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt; be the value of the &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; argument.&lt;br /&gt;
# Run the &#039;&#039;&#039;steps to get an encoding&#039;&#039;&#039; from [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding], with &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt; as &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt;. &lt;br /&gt;
## If the steps result in failure, throw an &amp;quot;&amp;lt;code&amp;gt;EncodingError&amp;lt;/code&amp;gt;&amp;quot; exception and terminate these steps.&lt;br /&gt;
## Otherwise, set the &#039;&#039;decoder object&#039;s&#039;&#039; internal &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; property to the returned encoding.&lt;br /&gt;
# If the constructor is called with an &amp;lt;var&amp;gt;options&amp;lt;/var&amp;gt; argument, and the &amp;lt;var&amp;gt;fatal&amp;lt;/var&amp;gt; property of the dictionary is set, set the internal &amp;lt;var&amp;gt;fatal&amp;lt;/var&amp;gt; flag of the &#039;&#039;decoder object&#039;&#039;.&lt;br /&gt;
# Initialize the internal &amp;lt;var&amp;gt;encoding algorithm state&amp;lt;/var&amp;gt; to the default values for the encoding &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;.&lt;br /&gt;
# Return the &#039;&#039;decoder object&#039;&#039;.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dl&amp;gt;&lt;br /&gt;
&amp;lt;dt&amp;gt;&amp;lt;code&amp;gt;encoding&amp;lt;/code&amp;gt; of type DOMString, readonly&lt;br /&gt;
&amp;lt;dd&amp;gt;&lt;br /&gt;
Returns the &amp;lt;var&amp;gt;Name&amp;lt;/var&amp;gt; of the decoder object&#039;s &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;, per [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding].&lt;br /&gt;
&lt;br /&gt;
:Note that this may differ from the name of the encoding specified during the call to the constructor. For example, if the constructor is called with &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; of &amp;lt;code&amp;gt;&amp;quot;ascii&amp;quot;&amp;lt;/code&amp;gt; the &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; attribute of the &#039;&#039;decoder object&#039;&#039; would have the value &amp;lt;code&amp;gt;&amp;quot;windows-1252&amp;quot;&amp;lt;/code&amp;gt; as &amp;lt;code&amp;gt;&amp;quot;ascii&amp;quot;&amp;lt;/code&amp;gt; is a label for that encoding.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dt&amp;gt;&amp;lt;code&amp;gt;decode&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;dd&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The &amp;lt;code&amp;gt;decode&amp;lt;/code&amp;gt; method runs the following steps:&lt;br /&gt;
&lt;br /&gt;
# If the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag of the &#039;&#039;decoder object&#039;&#039; is not set, then reset the &amp;lt;var&amp;gt;encoding algorithm state&amp;lt;/var&amp;gt; to the default values for encoding &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;. Otherwise, the &amp;lt;var&amp;gt;encoding algorithm state&amp;lt;/var&amp;gt; is re-used from the previous call to &amp;lt;code&amp;gt;decode&amp;lt;/code&amp;gt; on this object. &lt;br /&gt;
# If the &amp;lt;var&amp;gt;options&amp;lt;/var&amp;gt; parameter is specified and the &amp;lt;var&amp;gt;stream&amp;lt;/var&amp;gt; option is &#039;&#039;&#039;true&#039;&#039;&#039;, then the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag is set; otherwise the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag is cleared.&lt;br /&gt;
# Run the decoder algorithm of the &#039;&#039;decoder object&#039;s&#039;&#039; encoding, as follows:&lt;br /&gt;
#* The input to the algorithm is a &amp;lt;var&amp;gt;byte stream&amp;lt;/var&amp;gt;. The &amp;lt;var&amp;gt;byte stream&amp;lt;/var&amp;gt; is provided by the bytes in &amp;lt;var&amp;gt;view.buffer&amp;lt;/var&amp;gt; starting at offset &amp;lt;var&amp;gt;view.byteOffset&amp;lt;/var&amp;gt;. A maximum of &amp;lt;var&amp;gt;view.byteLength&amp;lt;/var&amp;gt; bytes are yielded by the stream from &amp;lt;var&amp;gt;view.buffer&amp;lt;/var&amp;gt;. If &amp;lt;var&amp;gt;view&amp;lt;/var&amp;gt; is not specified, the stream is empty.&lt;br /&gt;
#* If the &amp;lt;var&amp;gt;options&amp;lt;/var&amp;gt; parameter is not specified or the &#039;&#039;&#039;stream&#039;&#039;&#039; option is &#039;&#039;&#039;false&#039;&#039;&#039;, then the &#039;&#039;&#039;EOF byte&#039;&#039;&#039; is yielded after &amp;lt;var&amp;gt;view.byteLength&amp;lt;/var&amp;gt; bytes are yielded by the stream.&lt;br /&gt;
#* If the internal &amp;lt;var&amp;gt;fatal&amp;lt;/var&amp;gt; flag of the &#039;&#039;decoder object&#039;&#039; is set, then a &#039;&#039;&#039;decoder error&#039;&#039;&#039; causes a &amp;lt;code&amp;gt;DOMException&amp;lt;/code&amp;gt; of type &amp;lt;code&amp;gt;EncodingError&amp;lt;/code&amp;gt; to be thrown rather than emitting a fallback code point.&lt;br /&gt;
#* The output of the algorithm is a sequence of &amp;lt;var&amp;gt;emitted code points&amp;lt;/var&amp;gt;.&lt;br /&gt;
# Return a &amp;lt;code&amp;gt;DOMString&amp;lt;/code&amp;gt; by encoding the &amp;lt;var&amp;gt;sequence of emitted code points&amp;lt;/var&amp;gt; following the [[#Steps to convert a sequence of Unicode characters to a DOMString|steps to convert a sequence of Unicode characters to a DOMString]].&lt;br /&gt;
&lt;br /&gt;
:&#039;&#039;ISSUE: Need to handle BOMs. [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding] specifies this by using the caller-specified encoding as a suggestion, and consuming the BOM as a part of selecting the real encoder where BOM takes precedence. At the very least a matching BOM should be ignored. A mismatching BOM could throw, be a decoding error, or could actually switch the decoder for this stream (call or sequence of calls), possibly if-and-only-if the constructor was called without an encoding.&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/dl&amp;gt;&lt;br /&gt;
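&lt;br /&gt;
As a non-normative illustration, the streaming behaviour described in the steps above might be exercised as follows. This is a sketch only: it assumes a runtime that exposes a &amp;lt;code&amp;gt;TextDecoder&amp;lt;/code&amp;gt; global accepting the &amp;lt;code&amp;gt;fatal&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;stream&amp;lt;/code&amp;gt; options, and the helper name &amp;lt;code&amp;gt;decodeChunks&amp;lt;/code&amp;gt; is hypothetical and not part of this proposal.&lt;br /&gt;
&lt;br /&gt;
```javascript
// Hypothetical helper: decode a list of byte chunks as one logical stream.
// Assumes a TextDecoder global is available in the runtime.
function decodeChunks(chunks, label) {
  var decoder = new TextDecoder(label, { fatal: true });
  var result = '';
  chunks.forEach(function (chunk) {
    // stream: true keeps the decoder state, so a byte sequence that is
    // incomplete at the end of this chunk is completed by the next call.
    result += decoder.decode(chunk, { stream: true });
  });
  // A final call without the stream option signals EOF and flushes the state.
  result += decoder.decode();
  return result;
}

// U+20AC EURO SIGN is E2 82 AC in UTF-8; here it is split across two chunks.
var text = decodeChunks(
  [new Uint8Array([0xE2, 0x82]), new Uint8Array([0xAC])],
  'utf-8'
);
```

Because the state is kept between calls, the split character is decoded as one code point rather than as two errors.&lt;br /&gt;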
&lt;br /&gt;
== Algorithms ==&lt;br /&gt;
&lt;br /&gt;
=== Steps to convert a sequence of Unicode characters to a DOMString ===&lt;br /&gt;
&lt;br /&gt;
:TODO: Move the following to [http://dev.w3.org/2006/webapi/WebIDL/#idl-DOMString WebIDL] - [https://www.w3.org/Bugs/Public/show_bug.cgi?id=17620 tracking bug]&lt;br /&gt;
&lt;br /&gt;
The following algorithm defines a way to &amp;lt;b&amp;gt;&amp;lt;i&amp;gt;convert a sequence of Unicode characters to a DOMString&amp;lt;/i&amp;gt;&amp;lt;/b&amp;gt;:&lt;br /&gt;
# Let &amp;lt;var&amp;gt;U&amp;lt;/var&amp;gt;&amp;lt;sub&amp;gt;0...&amp;lt;var&amp;gt;n&amp;lt;/var&amp;gt;-1&amp;lt;/sub&amp;gt; be the sequence of Unicode characters &lt;br /&gt;
# Initialize &amp;lt;var&amp;gt;i&amp;lt;/var&amp;gt; to 0&lt;br /&gt;
# Initialize &amp;lt;var&amp;gt;S&amp;lt;/var&amp;gt; to be an empty sequence of [http://dev.w3.org/2006/webapi/WebIDL/#dfn-code-unit code units]&lt;br /&gt;
# While &amp;lt;var&amp;gt;i&amp;lt;/var&amp;gt; &amp;lt; &amp;lt;var&amp;gt;n&amp;lt;/var&amp;gt;&lt;br /&gt;
## Let &amp;lt;var&amp;gt;c&amp;lt;/var&amp;gt; be the &amp;lt;b&amp;gt;code point&amp;lt;/b&amp;gt; of the Unicode character in &amp;lt;var&amp;gt;U&amp;lt;/var&amp;gt; at index &amp;lt;var&amp;gt;i&amp;lt;/var&amp;gt;&lt;br /&gt;
## If &amp;lt;var&amp;gt;c&amp;lt;/var&amp;gt; &amp;amp;ge; 2&amp;lt;sup&amp;gt;16&amp;lt;/sup&amp;gt;, then:&lt;br /&gt;
### Append to &amp;lt;var&amp;gt;S&amp;lt;/var&amp;gt; a [http://dev.w3.org/2006/webapi/WebIDL/#dfn-code-unit code unit] equal to (&amp;lt;var&amp;gt;c&amp;lt;/var&amp;gt; - 2&amp;lt;sup&amp;gt;16&amp;lt;/sup&amp;gt;) / 2&amp;lt;sup&amp;gt;10&amp;lt;/sup&amp;gt; + 0xD800, where &amp;quot;/&amp;quot; represents integer division.&lt;br /&gt;
### Append to &amp;lt;var&amp;gt;S&amp;lt;/var&amp;gt; a [http://dev.w3.org/2006/webapi/WebIDL/#dfn-code-unit code unit] equal to (&amp;lt;var&amp;gt;c&amp;lt;/var&amp;gt; - 2&amp;lt;sup&amp;gt;16&amp;lt;/sup&amp;gt;) % 2&amp;lt;sup&amp;gt;10&amp;lt;/sup&amp;gt; + 0xDC00, where &amp;quot;%&amp;quot; represents the remainder of an integer division.&lt;br /&gt;
## Otherwise, append to &amp;lt;var&amp;gt;S&amp;lt;/var&amp;gt; a [http://dev.w3.org/2006/webapi/WebIDL/#dfn-code-unit code unit] equal to &amp;lt;var&amp;gt;c&amp;lt;/var&amp;gt;.&lt;br /&gt;
## Set &amp;lt;var&amp;gt;i&amp;lt;/var&amp;gt; to &amp;lt;var&amp;gt;i&amp;lt;/var&amp;gt;+1&lt;br /&gt;
# Return the IDL [http://dev.w3.org/2006/webapi/WebIDL/#idl-DOMString DOMString] value that represents the sequence of code units &amp;lt;var&amp;gt;S&amp;lt;/var&amp;gt;.&lt;br /&gt;
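&lt;br /&gt;
As a non-normative illustration, the steps above might be sketched in ECMAScript as follows. The function name &amp;lt;code&amp;gt;codePointsToDOMString&amp;lt;/code&amp;gt; is hypothetical, and the input is assumed to be an array of valid Unicode code points.&lt;br /&gt;
&lt;br /&gt;
```javascript
// Hypothetical helper: convert an array of Unicode code points into a
// string of UTF-16 code units, following the numbered steps above.
function codePointsToDOMString(codePoints) {
  var s = '', i, c;
  for (i = 0; i !== codePoints.length; i += 1) {
    c = codePoints[i];
    if (c >= 0x10000) {
      // Supplementary code point: emit a lead surrogate, then a trail surrogate.
      s += String.fromCharCode(Math.floor((c - 0x10000) / 0x400) + 0xD800);
      s += String.fromCharCode((c - 0x10000) % 0x400 + 0xDC00);
    } else {
      // Code point in the Basic Multilingual Plane: a single code unit.
      s += String.fromCharCode(c);
    }
  }
  return s;
}

// U+1F600 is represented by the surrogate pair D83D DE00.
var sample = codePointsToDOMString([0x48, 0x1F600]);
```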
&lt;br /&gt;
== Examples ==&lt;br /&gt;
&lt;br /&gt;
=== Example #1 - encoding strings ===&lt;br /&gt;
&lt;br /&gt;
The following example uses the API to encode an array of strings into an ArrayBuffer. The resulting buffer contains the number of strings (as a Uint32), followed by the length in bytes of the first string (as a Uint32), the encoded string data, the length of the second string (as a Uint32), the string data, and so on. The Uint32 values are written big-endian, which is the DataView default.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
function encodeArrayOfStrings(strings, encoding) {&lt;br /&gt;
  var encoder, encoded, len, i, bytes, view, offset;&lt;br /&gt;
&lt;br /&gt;
  encoder = new TextEncoder(encoding);&lt;br /&gt;
  encoded = [];&lt;br /&gt;
&lt;br /&gt;
  len = Uint32Array.BYTES_PER_ELEMENT;&lt;br /&gt;
  for (i = 0; i &amp;lt; strings.length; i += 1) {&lt;br /&gt;
    len += Uint32Array.BYTES_PER_ELEMENT;&lt;br /&gt;
    encoded[i] = encoder.encode(strings[i]);&lt;br /&gt;
    len += encoded[i].byteLength;&lt;br /&gt;
  }&lt;br /&gt;
&lt;br /&gt;
  bytes = new Uint8Array(len);&lt;br /&gt;
  view = new DataView(bytes.buffer);&lt;br /&gt;
  offset = 0;&lt;br /&gt;
&lt;br /&gt;
  view.setUint32(offset, strings.length);&lt;br /&gt;
  offset += Uint32Array.BYTES_PER_ELEMENT;&lt;br /&gt;
  for (i = 0; i &amp;lt; encoded.length; i += 1) {&lt;br /&gt;
    len = encoded[i].byteLength;&lt;br /&gt;
    view.setUint32(offset, len);&lt;br /&gt;
    offset += Uint32Array.BYTES_PER_ELEMENT;&lt;br /&gt;
    bytes.set(encoded[i], offset);&lt;br /&gt;
    offset += len;&lt;br /&gt;
  }&lt;br /&gt;
  return bytes.buffer;&lt;br /&gt;
}&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Example #2 - decoding strings ===&lt;br /&gt;
&lt;br /&gt;
The following example decodes an ArrayBuffer containing data encoded in the format produced by the previous example back into an array of strings.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
function decodeArrayOfStrings(buffer, encoding) {&lt;br /&gt;
  var decoder, view, offset, num_strings, strings, i, len;&lt;br /&gt;
&lt;br /&gt;
  decoder = new TextDecoder(encoding);&lt;br /&gt;
  view = new DataView(buffer);&lt;br /&gt;
  offset = 0;&lt;br /&gt;
  strings = [];&lt;br /&gt;
&lt;br /&gt;
  num_strings = view.getUint32(offset);&lt;br /&gt;
  offset += Uint32Array.BYTES_PER_ELEMENT;&lt;br /&gt;
  for (i = 0; i &amp;lt; num_strings; i += 1) {&lt;br /&gt;
    len = view.getUint32(offset);&lt;br /&gt;
    offset += Uint32Array.BYTES_PER_ELEMENT;&lt;br /&gt;
    strings[i] = decoder.decode(&lt;br /&gt;
      new DataView(view.buffer, offset, len));&lt;br /&gt;
    offset += len;&lt;br /&gt;
  }&lt;br /&gt;
  return strings;&lt;br /&gt;
}&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Encodings ==&lt;br /&gt;
&lt;br /&gt;
Encodings are defined and implemented per [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding]. This implicitly includes the steps to &amp;lt;em&amp;gt;get an encoding&amp;lt;/em&amp;gt; from a string, and the logic for label matching and case-insensitivity.&lt;br /&gt;
&lt;br /&gt;
User agents MUST NOT support any encodings or labels other than those defined in [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding], and MUST support all encodings and labels defined in that specification, with the additions defined below.&lt;br /&gt;
&lt;br /&gt;
:&#039;&#039;NOTE: In [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding], &amp;quot;ascii&amp;quot; is a label for &#039;&#039;&#039;windows-1252&#039;&#039;&#039;; there is no 7-bit-or-raise-exception encoding. Applications that are required to restrict the content of decoded strings should implement validation after decoding.&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
:&#039;&#039;NOTE: Unicode normalization forms are outside the scope of this specification. No normalization is done prior to encoding or after decoding.&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
:&#039;&#039;NOTE: Handling of encoding-specific issues, e.g. over-long UTF-8 encodings, byte order marks, unmatched surrogate pairs, and so on is defined by [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding].&#039;&#039;&lt;br /&gt;
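&lt;br /&gt;
As a non-normative illustration of the first note above, an application that needs strictly 7-bit content might validate the string after decoding, roughly as follows. The helper name &amp;lt;code&amp;gt;isSevenBit&amp;lt;/code&amp;gt; is hypothetical and not part of this proposal.&lt;br /&gt;
&lt;br /&gt;
```javascript
// Hypothetical helper: returns true only when every code unit of the
// decoded string is in the 7-bit ASCII range (0x00 to 0x7F).
function isSevenBit(string) {
  var i;
  for (i = 0; i !== string.length; i += 1) {
    if (string.charCodeAt(i) > 0x7F) {
      return false;
    }
  }
  return true;
}

var plainOk = isSevenBit('plain text');
var accentedOk = isSevenBit('caf\u00E9');
```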
&lt;br /&gt;
== References ==&lt;br /&gt;
&lt;br /&gt;
* WebIDL http://dev.w3.org/2006/webapi/WebIDL&lt;br /&gt;
* Encoding http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== Acknowledgements ==&lt;br /&gt;
&lt;br /&gt;
* Alan Chaney&lt;br /&gt;
* Ben Noordhuis&lt;br /&gt;
* Glenn Maynard&lt;br /&gt;
* John Tamplin&lt;br /&gt;
* Kenneth Russell (Google, Inc)&lt;br /&gt;
* Robert Mustacchi&lt;br /&gt;
* Ryan Dahl&lt;br /&gt;
* Anne van Kesteren&lt;br /&gt;
&lt;br /&gt;
== Appendix ==&lt;br /&gt;
&lt;br /&gt;
A &amp;quot;shim&amp;quot; implementation in ECMAScript (that may not fully match the current version of the spec) plus some initial unit tests can be found at:&lt;br /&gt;
&lt;br /&gt;
:http://code.google.com/p/stringencoding/&lt;br /&gt;
&lt;br /&gt;
[[Category:Proposals]]&lt;/div&gt;</summary>
		<author><name>Jsbell</name></author>
	</entry>
	<entry>
		<id>https://wiki.whatwg.org/index.php?title=StringEncoding&amp;diff=8359</id>
		<title>StringEncoding</title>
		<link rel="alternate" type="text/html" href="https://wiki.whatwg.org/index.php?title=StringEncoding&amp;diff=8359"/>
		<updated>2012-06-27T17:38:31Z</updated>

		<summary type="html">&lt;p&gt;Jsbell: /* Appendix */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;Proposed Text Encoding Web API for Typed Arrays&lt;br /&gt;
&lt;br /&gt;
== Editors ==&lt;br /&gt;
&lt;br /&gt;
* Joshua Bell (Google, Inc)&lt;br /&gt;
&lt;br /&gt;
== Abstract ==&lt;br /&gt;
&lt;br /&gt;
This specification defines an API for encoding strings to binary data, and decoding strings from binary data.&lt;br /&gt;
&lt;br /&gt;
NOTE: This specification intentionally does not address the opposite scenario of encoding binary data as strings and decoding binary data from strings, for example using Base64 encoding.&lt;br /&gt;
&lt;br /&gt;
Discussion on this topic has so far taken place on the [http://www.khronos.org/webgl/public-mailing-list/ public_webgl@khronos.org] mailing list. See http://www.khronos.org/webgl/public-mailing-list/archives/1111/msg00017.html for the initial discussion thread.&lt;br /&gt;
&lt;br /&gt;
Discussion has since moved to the [http://www.whatwg.org/mailing-list#specs WHATWG spec discussion] mailing list. See http://lists.whatwg.org/htdig.cgi/whatwg-whatwg.org/2012-March/035038.html for the latest discussion thread.&lt;br /&gt;
&lt;br /&gt;
== Open Issues ==&lt;br /&gt;
&lt;br /&gt;
* Encoding errors&lt;br /&gt;
** Define options dict for &amp;lt;code&amp;gt;TextEncoder()&amp;lt;/code&amp;gt; that allows selecting between throw vs. replacement character / replacement callback?&lt;br /&gt;
* The [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding specification] defines the byte order mark as more authoritative than an explicit encoding label when decoding content. &lt;br /&gt;
** Is that desirable here? &lt;br /&gt;
** What does it mean to e.g. write &amp;lt;code&amp;gt;TextDecoder(&#039;iso-2022-kr&#039;).decode(str)&amp;lt;/code&amp;gt; if string turns out to be UTF-8?&lt;br /&gt;
** What about streaming and other re-uses of the same decoder object?&lt;br /&gt;
** At the very least, a matching BOM should be ignored (i.e. not returned as part of the output) and a mismatching BOM should signal an error.&lt;br /&gt;
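&lt;br /&gt;
As a non-normative illustration of the last point above, the "ignore a matching BOM, signal an error on a mismatching BOM" behaviour could be prototyped roughly as follows. This is only a sketch of one possible resolution of the open issue; the helper names &amp;lt;code&amp;gt;sniffBOM&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;bomLengthOrThrow&amp;lt;/code&amp;gt; are hypothetical, and only the UTF-8 and UTF-16 byte order marks are considered.&lt;br /&gt;
&lt;br /&gt;
```javascript
// Hypothetical helper: true when the byte array begins with the given prefix.
function startsWith(bytes, prefix) {
  var i;
  if (prefix.length > bytes.length) {
    return false;
  }
  for (i = 0; i !== prefix.length; i += 1) {
    if (bytes[i] !== prefix[i]) {
      return false;
    }
  }
  return true;
}

// Hypothetical helper: name of the encoding implied by a leading BOM, or null.
function sniffBOM(bytes) {
  if (startsWith(bytes, [0xEF, 0xBB, 0xBF])) { return 'utf-8'; }
  if (startsWith(bytes, [0xFE, 0xFF])) { return 'utf-16be'; }
  if (startsWith(bytes, [0xFF, 0xFE])) { return 'utf-16le'; }
  return null;
}

// Returns how many leading bytes to skip when the BOM matches the expected
// encoding, 0 when there is no BOM, and throws when the BOM mismatches.
function bomLengthOrThrow(bytes, expected) {
  var found = sniffBOM(bytes);
  if (found === null) {
    return 0;
  }
  if (found !== expected) {
    throw new Error('BOM indicates ' + found + ', expected ' + expected);
  }
  return expected === 'utf-8' ? 3 : 2;
}
```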
&lt;br /&gt;
== Notes and TODOs ==&lt;br /&gt;
&lt;br /&gt;
* Streaming decode/encode requires retaining partial buffers between calls.&lt;br /&gt;
** Some encode/decode algorithms require adjusting the &amp;lt;var&amp;gt;code point pointer&amp;lt;/var&amp;gt; or &amp;lt;var&amp;gt;byte pointer&amp;lt;/var&amp;gt; by a negative amount. This could occur across &amp;quot;chunk&amp;quot; boundaries. This implies that when the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag is set on an encoder/decoder that the last N elements of the stream are saved for the next call and used as a prefix for the stream. N is defined by the specific encoding algorithm.&lt;br /&gt;
** This is not yet implemented in the [http://code.google.com/p/stringencoding/ ECMAScript shim]&lt;br /&gt;
* Move the algorithm to &amp;lt;b&amp;gt;&amp;lt;i&amp;gt;convert a sequence of Unicode characters to a DOMString&amp;lt;/i&amp;gt;&amp;lt;/b&amp;gt; into [http://dev.w3.org/2006/webapi/WebIDL/#idl-DOMString WebIDL].&lt;br /&gt;
&lt;br /&gt;
== Resolved Issues ==&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dl&amp;gt;&lt;br /&gt;
&amp;lt;dt&amp;gt;Encode as many characters as possible into a fixed-size buffer for transmission, and repeat starting with next unencoded character&lt;br /&gt;
&amp;lt;dd&amp;gt;Resolution: not for &amp;quot;v1&amp;quot; - can be implemented using this API, and breaking the string down by &amp;quot;character&amp;quot; is unlikely to be as obvious as it sounds - surrogate pairs, combining sequences, etc&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dt&amp;gt;Support two versions of encode; one which takes target buffer, and one which creates/returns a right-sized buffer&lt;br /&gt;
&amp;lt;dd&amp;gt;Resolution: See above, and wait for developer feedback.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dt&amp;gt;Allow arbitrary end byte sequences (e.g. &amp;lt;code&amp;gt;0xFF&amp;lt;/code&amp;gt; for UTF-8 strings)&lt;br /&gt;
&amp;lt;dd&amp;gt;Resolution: Add &amp;lt;code&amp;gt;indexOf&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;ArrayBufferView&amp;lt;/code&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dt&amp;gt;Remove &#039;&#039;binary&#039;&#039; encoding? &#039;&#039;(a proposed 8-bit-clean encoding for interop with legacy binary data stored in ECMAScript strings)&#039;&#039;&lt;br /&gt;
&amp;lt;dd&amp;gt;The only real use is with &amp;lt;code&amp;gt;atob()&amp;lt;/code&amp;gt;/&amp;lt;code&amp;gt;btoa()&amp;lt;/code&amp;gt;; a better API would be Base64 directly in/out of Typed Arrays. Consensus on WHATWG is to add better APIs e.g. &lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
partial interface ArrayBufferView {&lt;br /&gt;
    DOMString toBase64();&lt;br /&gt;
};&lt;br /&gt;
&lt;br /&gt;
partial interface ArrayBuffer {&lt;br /&gt;
    static ArrayBuffer fromBase64(DOMString string);&lt;br /&gt;
};&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dt&amp;gt;Should this be a standalone API (as written), or live on e.g. DataView, or on e.g. String?&lt;br /&gt;
&amp;lt;dd&amp;gt;There seems to be pretty strong consensus that streaming, stateful coding is high priority and that an object-oriented API is cleanest, and shoe-horning those onto existing objects would be messy.&lt;br /&gt;
&amp;lt;/dl&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== API ==&lt;br /&gt;
&lt;br /&gt;
=== TextEncoder ===&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;WebIDL&#039;&#039;&#039;&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
dictionary TextEncodeOptions {&lt;br /&gt;
  boolean stream = false;&lt;br /&gt;
};&lt;br /&gt;
&lt;br /&gt;
[Constructor,&lt;br /&gt;
 Constructor(DOMString encoding)]&lt;br /&gt;
interface TextEncoder {&lt;br /&gt;
  readonly attribute DOMString encoding;&lt;br /&gt;
  ArrayBufferView encode(DOMString? string, optional TextEncodeOptions options);&lt;br /&gt;
};&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The constructor runs the following steps:&lt;br /&gt;
# If the constructor is called with no arguments, let &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt; be &amp;quot;&amp;lt;code&amp;gt;utf-8&amp;lt;/code&amp;gt;&amp;quot;. Otherwise, let &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt; be the value of the &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; argument.&lt;br /&gt;
# Run the &#039;&#039;&#039;steps to get an encoding&#039;&#039;&#039; from [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding], with &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt; as &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt;.&lt;br /&gt;
#* If the steps result in failure, throw an &amp;quot;&amp;lt;code&amp;gt;EncodingError&amp;lt;/code&amp;gt;&amp;quot; exception and terminate these steps.&lt;br /&gt;
#* Otherwise, set the encoder object&#039;s internal &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; property to the returned encoding.&lt;br /&gt;
# Initialize the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag of the encoder object to false. &lt;br /&gt;
# Initialize the internal &amp;lt;var&amp;gt;encoding algorithm state&amp;lt;/var&amp;gt; to the default values for the encoding &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dl&amp;gt;&lt;br /&gt;
&amp;lt;dt&amp;gt;&amp;lt;code&amp;gt;encoding&amp;lt;/code&amp;gt; of type DOMString, readonly&lt;br /&gt;
&amp;lt;dd&amp;gt;Returns the &amp;lt;var&amp;gt;Name&amp;lt;/var&amp;gt; of the encoder object&#039;s &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;, per [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding]. &lt;br /&gt;
&lt;br /&gt;
:Note that this may differ from the name of the encoding specified during the call to the constructor. For example, if the constructor is called with &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; of &amp;lt;code&amp;gt;&amp;quot;ascii&amp;quot;&amp;lt;/code&amp;gt; the &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; attribute of the &#039;&#039;encoder object&#039;&#039; would have the value &amp;lt;code&amp;gt;&amp;quot;windows-1252&amp;quot;&amp;lt;/code&amp;gt; as &amp;lt;code&amp;gt;&amp;quot;ascii&amp;quot;&amp;lt;/code&amp;gt; is a label for that encoding.&lt;br /&gt;
&amp;lt;/dl&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dl&amp;gt;&lt;br /&gt;
&amp;lt;dt&amp;gt;&amp;lt;code&amp;gt;encode&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;dd&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The &amp;lt;code&amp;gt;encode&amp;lt;/code&amp;gt; method runs these steps:&lt;br /&gt;
&lt;br /&gt;
# If the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag is not set, then reset the &amp;lt;var&amp;gt;encoding algorithm state&amp;lt;/var&amp;gt; to the default values for &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;. Otherwise, the &amp;lt;var&amp;gt;encoding algorithm state&amp;lt;/var&amp;gt; is re-used from the previous call to &amp;lt;code&amp;gt;encode&amp;lt;/code&amp;gt; on this object.&lt;br /&gt;
# If the &amp;lt;var&amp;gt;options&amp;lt;/var&amp;gt; parameter is specified and the &amp;lt;code&amp;gt;stream&amp;lt;/code&amp;gt; option is &#039;&#039;&#039;true&#039;&#039;&#039;, then the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag is set; otherwise the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag is cleared.&lt;br /&gt;
# Run the steps of the &amp;lt;var&amp;gt;encoding algorithm&amp;lt;/var&amp;gt;:&lt;br /&gt;
#* The input to the algorithm is a &amp;lt;var&amp;gt;stream of code points&amp;lt;/var&amp;gt;. The stream is composed of the Unicode code points for the Unicode characters produced by following the [http://dev.w3.org/2006/webapi/WebIDL/#dfn-obtain-unicode steps to convert a DOMString to a sequence of Unicode characters] in [http://dev.w3.org/2006/webapi/WebIDL/ WebIDL] with &amp;lt;var&amp;gt;string&amp;lt;/var&amp;gt; as the input. If &amp;lt;var&amp;gt;string&amp;lt;/var&amp;gt; is null, the &amp;lt;var&amp;gt;stream of code points&amp;lt;/var&amp;gt; is empty.&lt;br /&gt;
#* If the &amp;lt;var&amp;gt;options&amp;lt;/var&amp;gt; parameter is not specified or the &amp;lt;code&amp;gt;stream&amp;lt;/code&amp;gt; option is false, then the &#039;&#039;&#039;EOF code point&#039;&#039;&#039; is yielded after the final code point is yielded by the stream.&lt;br /&gt;
#* The output of the algorithm is a &amp;lt;var&amp;gt;sequence of emitted bytes&amp;lt;/var&amp;gt;.&lt;br /&gt;
# Return a &amp;lt;code&amp;gt;Uint8Array&amp;lt;/code&amp;gt; object wrapping an &amp;lt;code&amp;gt;ArrayBuffer&amp;lt;/code&amp;gt; containing the &amp;lt;var&amp;gt;sequence of emitted bytes&amp;lt;/var&amp;gt; produced by the encoding algorithm.&lt;br /&gt;
&amp;lt;/dl&amp;gt;&lt;br /&gt;
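&lt;br /&gt;
As a non-normative illustration, the input side of the &amp;lt;code&amp;gt;encode&amp;lt;/code&amp;gt; steps above, turning a string into a stream of code points, might be sketched as follows. The helper names are hypothetical, and the sketch assumes that unpaired surrogates are replaced with U+FFFD, as in the WebIDL conversion referenced above.&lt;br /&gt;
&lt;br /&gt;
```javascript
// Hypothetical helper: true when the code unit is a trail (low) surrogate.
function isTrailSurrogate(unit) {
  return unit >= 0xDC00 ? 0xDFFF >= unit : false;
}

// Hypothetical helper: convert a string of UTF-16 code units into an array
// of Unicode code points. Unpaired surrogates become U+FFFD.
function domStringToCodePoints(string) {
  var points = [], n = string.length, i = 0, c, d;
  while (n > i) {
    c = string.charCodeAt(i);
    if (0xD800 > c || c > 0xDFFF) {
      // Not a surrogate: the code unit is the code point.
      points.push(c);
    } else if (c >= 0xDC00 || i === n - 1) {
      // Trail surrogate with no lead, or lead surrogate at the end of input.
      points.push(0xFFFD);
    } else {
      d = string.charCodeAt(i + 1);
      if (isTrailSurrogate(d)) {
        // Valid pair: combine into one supplementary code point.
        points.push(0x10000 + (c - 0xD800) * 0x400 + (d - 0xDC00));
        i += 1;
      } else {
        // Lead surrogate followed by something else.
        points.push(0xFFFD);
      }
    }
    i += 1;
  }
  return points;
}

var points = domStringToCodePoints('A\uD83D\uDE00');
```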
&lt;br /&gt;
=== TextDecoder ===&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;WebIDL&#039;&#039;&#039;&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
dictionary TextDecoderOptions {&lt;br /&gt;
  boolean fatal = false;&lt;br /&gt;
};&lt;br /&gt;
&lt;br /&gt;
dictionary TextDecodeOptions {&lt;br /&gt;
  boolean stream = false;&lt;br /&gt;
};&lt;br /&gt;
&lt;br /&gt;
[Constructor,&lt;br /&gt;
 Constructor(optional DOMString encoding, optional TextDecoderOptions options)]&lt;br /&gt;
interface TextDecoder {&lt;br /&gt;
  readonly attribute DOMString encoding;&lt;br /&gt;
  DOMString decode(optional ArrayBufferView view, optional TextDecodeOptions options);&lt;br /&gt;
};&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The constructor runs the following steps:&lt;br /&gt;
# The constructor creates a &#039;&#039;decoder object&#039;&#039;. It has the internal properties &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;, a &amp;lt;var&amp;gt;fatal&amp;lt;/var&amp;gt; flag which is initially unset, and a &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag which is initially unset.&lt;br /&gt;
# If called without an &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; argument, let &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt; be &amp;quot;&amp;lt;code&amp;gt;utf-8&amp;lt;/code&amp;gt;&amp;quot;. Otherwise, let &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt; be the value of the &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; argument.&lt;br /&gt;
# Run the &#039;&#039;&#039;steps to get an encoding&#039;&#039;&#039; from [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding], with &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt; as &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt;. &lt;br /&gt;
## If the steps result in failure, throw an &amp;quot;&amp;lt;code&amp;gt;EncodingError&amp;lt;/code&amp;gt;&amp;quot; exception and terminate these steps.&lt;br /&gt;
## Otherwise, set the &#039;&#039;decoder object&#039;s&#039;&#039; internal &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; property to the returned encoding.&lt;br /&gt;
# If the constructor is called with an &amp;lt;var&amp;gt;options&amp;lt;/var&amp;gt; argument, and the &amp;lt;var&amp;gt;fatal&amp;lt;/var&amp;gt; property of the dictionary is set, set the internal &amp;lt;var&amp;gt;fatal&amp;lt;/var&amp;gt; flag of the &#039;&#039;decoder object&#039;&#039;.&lt;br /&gt;
# Initialize the internal &amp;lt;var&amp;gt;encoding algorithm state&amp;lt;/var&amp;gt; to the default values for the encoding &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;.&lt;br /&gt;
# Return the &#039;&#039;decoder object&#039;&#039;.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dl&amp;gt;&lt;br /&gt;
&amp;lt;dt&amp;gt;&amp;lt;code&amp;gt;encoding&amp;lt;/code&amp;gt; of type DOMString, readonly&lt;br /&gt;
&amp;lt;dd&amp;gt;&lt;br /&gt;
Returns the &amp;lt;var&amp;gt;Name&amp;lt;/var&amp;gt; of the decoder object&#039;s &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;, per [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding].&lt;br /&gt;
&lt;br /&gt;
:Note that this may differ from the name of the encoding specified during the call to the constructor. For example, if the constructor is called with &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; of &amp;lt;code&amp;gt;&amp;quot;ascii&amp;quot;&amp;lt;/code&amp;gt; the &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; attribute of the &#039;&#039;decoder object&#039;&#039; would have the value &amp;lt;code&amp;gt;&amp;quot;windows-1252&amp;quot;&amp;lt;/code&amp;gt; as &amp;lt;code&amp;gt;&amp;quot;ascii&amp;quot;&amp;lt;/code&amp;gt; is a label for that encoding.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dt&amp;gt;&amp;lt;code&amp;gt;decode&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;dd&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The &amp;lt;code&amp;gt;decode&amp;lt;/code&amp;gt; method runs the following steps:&lt;br /&gt;
&lt;br /&gt;
# If the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag of the &#039;&#039;decoder object&#039;&#039; is not set, then reset the &amp;lt;var&amp;gt;encoding algorithm state&amp;lt;/var&amp;gt; to the default values for encoding &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;. Otherwise, the &amp;lt;var&amp;gt;encoding algorithm state&amp;lt;/var&amp;gt; is re-used from the previous call to &amp;lt;code&amp;gt;decode&amp;lt;/code&amp;gt; on this object. &lt;br /&gt;
# If the &amp;lt;var&amp;gt;options&amp;lt;/var&amp;gt; parameter is specified and the &amp;lt;var&amp;gt;stream&amp;lt;/var&amp;gt; option is &#039;&#039;&#039;true&#039;&#039;&#039;, then the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag is set; otherwise the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag is cleared.&lt;br /&gt;
# Run the decoder algorithm of the &#039;&#039;decoder object&#039;s&#039;&#039; encoding, as follows:&lt;br /&gt;
#* The input to the algorithm is a &amp;lt;var&amp;gt;byte stream&amp;lt;/var&amp;gt;. The &amp;lt;var&amp;gt;byte stream&amp;lt;/var&amp;gt; is provided by the bytes in &amp;lt;var&amp;gt;view.buffer&amp;lt;/var&amp;gt; starting at offset &amp;lt;var&amp;gt;view.byteOffset&amp;lt;/var&amp;gt;. A maximum of &amp;lt;var&amp;gt;view.byteLength&amp;lt;/var&amp;gt; bytes are yielded by the stream from &amp;lt;var&amp;gt;view.buffer&amp;lt;/var&amp;gt;. If &amp;lt;var&amp;gt;view&amp;lt;/var&amp;gt; is not specified, the stream is empty.&lt;br /&gt;
#* If the &amp;lt;var&amp;gt;options&amp;lt;/var&amp;gt; parameter is not specified or the &#039;&#039;&#039;stream&#039;&#039;&#039; option is &#039;&#039;&#039;false&#039;&#039;&#039;, then the &#039;&#039;&#039;EOF byte&#039;&#039;&#039; is yielded after &amp;lt;var&amp;gt;view.byteLength&amp;lt;/var&amp;gt; bytes are yielded by the stream.&lt;br /&gt;
#* If the internal &amp;lt;var&amp;gt;fatal&amp;lt;/var&amp;gt; flag of the &#039;&#039;decoder object&#039;&#039; is set, then a &#039;&#039;&#039;decoder error&#039;&#039;&#039; causes a &amp;lt;code&amp;gt;DOMException&amp;lt;/code&amp;gt; of type &amp;lt;code&amp;gt;EncodingError&amp;lt;/code&amp;gt; to be thrown rather than emitting a fallback code point.&lt;br /&gt;
#* The output of the algorithm is a sequence of &amp;lt;var&amp;gt;emitted code points&amp;lt;/var&amp;gt;.&lt;br /&gt;
# Return a &amp;lt;code&amp;gt;DOMString&amp;lt;/code&amp;gt; by encoding the &amp;lt;var&amp;gt;sequence of emitted code points&amp;lt;/var&amp;gt; following the [[#Steps to convert a sequence of Unicode characters to a DOMString|steps to convert a sequence of Unicode characters to a DOMString]].&lt;br /&gt;
&lt;br /&gt;
:&#039;&#039;ISSUE: Need to handle BOMs. [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding] specifies this by using the caller-specified encoding as a suggestion, and consuming the BOM as a part of selecting the real encoder where BOM takes precedence. At the very least a matching BOM should be ignored. A mismatching BOM could throw, be a decoding error, or could actually switch the decoder for this stream (call or sequence of calls), possibly if-and-only-if the constructor was called without an encoding.&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/dl&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Algorithms ==&lt;br /&gt;
&lt;br /&gt;
=== Steps to convert a sequence of Unicode characters to a DOMString ===&lt;br /&gt;
&lt;br /&gt;
:TODO: Move the following to [http://dev.w3.org/2006/webapi/WebIDL/#idl-DOMString WebIDL]&lt;br /&gt;
&lt;br /&gt;
The following algorithm defines a way to &amp;lt;b&amp;gt;&amp;lt;i&amp;gt;convert a sequence of Unicode characters to a DOMString&amp;lt;/i&amp;gt;&amp;lt;/b&amp;gt;:&lt;br /&gt;
# Let &amp;lt;var&amp;gt;U&amp;lt;/var&amp;gt;&amp;lt;sub&amp;gt;0...&amp;lt;var&amp;gt;n&amp;lt;/var&amp;gt;-1&amp;lt;/sub&amp;gt; be the sequence of Unicode characters &lt;br /&gt;
# Initialize &amp;lt;var&amp;gt;i&amp;lt;/var&amp;gt; to 0&lt;br /&gt;
# Initialize &amp;lt;var&amp;gt;S&amp;lt;/var&amp;gt; to be an empty sequence of [http://dev.w3.org/2006/webapi/WebIDL/#dfn-code-unit code units]&lt;br /&gt;
# While &amp;lt;var&amp;gt;i&amp;lt;/var&amp;gt; &amp;lt; &amp;lt;var&amp;gt;n&amp;lt;/var&amp;gt;&lt;br /&gt;
## Let &amp;lt;var&amp;gt;c&amp;lt;/var&amp;gt; be the &amp;lt;b&amp;gt;code point&amp;lt;/b&amp;gt; of the Unicode character in &amp;lt;var&amp;gt;U&amp;lt;/var&amp;gt; at index &amp;lt;var&amp;gt;i&amp;lt;/var&amp;gt;&lt;br /&gt;
## If &amp;lt;var&amp;gt;c&amp;lt;/var&amp;gt; &amp;amp;ge; 2&amp;lt;sup&amp;gt;16&amp;lt;/sup&amp;gt;, then:&lt;br /&gt;
### Append to &amp;lt;var&amp;gt;S&amp;lt;/var&amp;gt; a [http://dev.w3.org/2006/webapi/WebIDL/#dfn-code-unit code unit] equal to (&amp;lt;var&amp;gt;c&amp;lt;/var&amp;gt; - 2&amp;lt;sup&amp;gt;16&amp;lt;/sup&amp;gt;) / 2&amp;lt;sup&amp;gt;10&amp;lt;/sup&amp;gt; + 0xD800, where &amp;quot;/&amp;quot; represents integer division.&lt;br /&gt;
### Append to &amp;lt;var&amp;gt;S&amp;lt;/var&amp;gt; a [http://dev.w3.org/2006/webapi/WebIDL/#dfn-code-unit code unit] equal to (&amp;lt;var&amp;gt;c&amp;lt;/var&amp;gt; - 2&amp;lt;sup&amp;gt;16&amp;lt;/sup&amp;gt;) % 2&amp;lt;sup&amp;gt;10&amp;lt;/sup&amp;gt; + 0xDC00, where &amp;quot;%&amp;quot; represents the remainder of an integer division.&lt;br /&gt;
## Otherwise, append to &amp;lt;var&amp;gt;S&amp;lt;/var&amp;gt; a [http://dev.w3.org/2006/webapi/WebIDL/#dfn-code-unit code unit] equal to &amp;lt;var&amp;gt;c&amp;lt;/var&amp;gt;.&lt;br /&gt;
## Set &amp;lt;var&amp;gt;i&amp;lt;/var&amp;gt; to &amp;lt;var&amp;gt;i&amp;lt;/var&amp;gt;+1&lt;br /&gt;
# Return the IDL [http://dev.w3.org/2006/webapi/WebIDL/#idl-DOMString DOMString] value that represents the sequence of code units &amp;lt;var&amp;gt;S&amp;lt;/var&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== Examples ==&lt;br /&gt;
&lt;br /&gt;
=== Example #1 - encoding strings ===&lt;br /&gt;
&lt;br /&gt;
The following example uses the API to encode an array of strings into an ArrayBuffer. The resulting buffer contains the number of strings (as a Uint32), followed by the length in bytes of the first string (as a Uint32), the encoded string data, the length of the second string (as a Uint32), the string data, and so on. The Uint32 values are written big-endian, which is the DataView default.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
function encodeArrayOfStrings(strings, encoding) {&lt;br /&gt;
  var encoder, encoded, len, i, bytes, view, offset;&lt;br /&gt;
&lt;br /&gt;
  encoder = new TextEncoder(encoding);&lt;br /&gt;
  encoded = [];&lt;br /&gt;
&lt;br /&gt;
  len = Uint32Array.BYTES_PER_ELEMENT;&lt;br /&gt;
  for (i = 0; i &amp;lt; strings.length; i += 1) {&lt;br /&gt;
    len += Uint32Array.BYTES_PER_ELEMENT;&lt;br /&gt;
    encoded[i] = encoder.encode(strings[i]);&lt;br /&gt;
    len += encoded[i].byteLength;&lt;br /&gt;
  }&lt;br /&gt;
&lt;br /&gt;
  bytes = new Uint8Array(len);&lt;br /&gt;
  view = new DataView(bytes.buffer);&lt;br /&gt;
  offset = 0;&lt;br /&gt;
&lt;br /&gt;
  view.setUint32(offset, strings.length);&lt;br /&gt;
  offset += Uint32Array.BYTES_PER_ELEMENT;&lt;br /&gt;
  for (i = 0; i &amp;lt; encoded.length; i += 1) {&lt;br /&gt;
    len = encoded[i].byteLength;&lt;br /&gt;
    view.setUint32(offset, len);&lt;br /&gt;
    offset += Uint32Array.BYTES_PER_ELEMENT;&lt;br /&gt;
    bytes.set(encoded[i], offset);&lt;br /&gt;
    offset += len;&lt;br /&gt;
  }&lt;br /&gt;
  return bytes.buffer;&lt;br /&gt;
}&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Example #2 - decoding strings ===&lt;br /&gt;
&lt;br /&gt;
The following example decodes an ArrayBuffer containing data encoded in the format produced by the previous example back into an array of strings.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
function decodeArrayOfStrings(buffer, encoding) {&lt;br /&gt;
  var decoder, view, offset, num_strings, strings, i, len;&lt;br /&gt;
&lt;br /&gt;
  decoder = new TextDecoder(encoding);&lt;br /&gt;
  view = new DataView(buffer);&lt;br /&gt;
  offset = 0;&lt;br /&gt;
  strings = [];&lt;br /&gt;
&lt;br /&gt;
  num_strings = view.getUint32(offset);&lt;br /&gt;
  offset += Uint32Array.BYTES_PER_ELEMENT;&lt;br /&gt;
  for (i = 0; i &amp;lt; num_strings; i += 1) {&lt;br /&gt;
    len = view.getUint32(offset);&lt;br /&gt;
    offset += Uint32Array.BYTES_PER_ELEMENT;&lt;br /&gt;
    strings[i] = decoder.decode(&lt;br /&gt;
      new DataView(view.buffer, offset, len));&lt;br /&gt;
    offset += len;&lt;br /&gt;
  }&lt;br /&gt;
  return strings;&lt;br /&gt;
}&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Encodings ==&lt;br /&gt;
&lt;br /&gt;
Encodings are defined and implemented per [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding]. This implicitly includes the steps to &amp;lt;em&amp;gt;get an encoding&amp;lt;/em&amp;gt; from a string, and the logic for label matching and case-insensitivity.&lt;br /&gt;
&lt;br /&gt;
User agents MUST NOT support any encodings or labels other than those defined in [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding], and MUST support all encodings and labels defined in that specification, with the additions defined below.&lt;br /&gt;
&lt;br /&gt;
:&#039;&#039;NOTE: In [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding], &amp;quot;ascii&amp;quot; is a label for &#039;&#039;&#039;windows-1252&#039;&#039;&#039;; there is no 7-bit-or-raise-exception encoding. Applications that are required to restrict the content of decoded strings should implement validation after decoding.&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
:&#039;&#039;NOTE: Unicode normalization forms are outside the scope of this specification. No normalization is done prior to encoding or after decoding.&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
:&#039;&#039;NOTE: Handling of encoding-specific issues, e.g. over-long UTF-8 encodings, byte order marks, unmatched surrogate pairs, and so on is defined by [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding].&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
== References ==&lt;br /&gt;
&lt;br /&gt;
* WebIDL http://dev.w3.org/2006/webapi/WebIDL&lt;br /&gt;
* Encoding http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== Acknowledgements ==&lt;br /&gt;
&lt;br /&gt;
* Alan Chaney&lt;br /&gt;
* Ben Noordhuis&lt;br /&gt;
* Glenn Maynard&lt;br /&gt;
* John Tamplin&lt;br /&gt;
* Kenneth Russell (Google, Inc)&lt;br /&gt;
* Robert Mustacchi&lt;br /&gt;
* Ryan Dahl&lt;br /&gt;
* Anne van Kesteren&lt;br /&gt;
&lt;br /&gt;
== Appendix ==&lt;br /&gt;
&lt;br /&gt;
A &amp;quot;shim&amp;quot; implementation in ECMAScript (that may not fully match the current version of the spec) plus some initial unit tests can be found at:&lt;br /&gt;
&lt;br /&gt;
:http://code.google.com/p/stringencoding/&lt;br /&gt;
&lt;br /&gt;
[[Category:Proposals]]&lt;/div&gt;</summary>
		<author><name>Jsbell</name></author>
	</entry>
	<entry>
		<id>https://wiki.whatwg.org/index.php?title=StringEncoding&amp;diff=8358</id>
		<title>StringEncoding</title>
		<link rel="alternate" type="text/html" href="https://wiki.whatwg.org/index.php?title=StringEncoding&amp;diff=8358"/>
		<updated>2012-06-27T17:38:16Z</updated>

		<summary type="html">&lt;p&gt;Jsbell: /* Notes and TODOs */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;Proposed Text Encoding Web API for Typed Arrays&lt;br /&gt;
&lt;br /&gt;
== Editors ==&lt;br /&gt;
&lt;br /&gt;
* Joshua Bell (Google, Inc)&lt;br /&gt;
&lt;br /&gt;
== Abstract ==&lt;br /&gt;
&lt;br /&gt;
This specification defines an API for encoding strings to binary data, and decoding strings from binary data.&lt;br /&gt;
&lt;br /&gt;
NOTE: This specification intentionally does not address the opposite scenario of encoding binary data as strings and decoding binary data from strings, for example using Base64 encoding.&lt;br /&gt;
&lt;br /&gt;
Discussion on this topic has so far taken place on the [http://www.khronos.org/webgl/public-mailing-list/ public_webgl@khronos.org] mailing list. See http://www.khronos.org/webgl/public-mailing-list/archives/1111/msg00017.html for the initial discussion thread.&lt;br /&gt;
&lt;br /&gt;
Discussion has since moved to the [http://www.whatwg.org/mailing-list#specs WHATWG spec discussion] mailing list. See  http://lists.whatwg.org/htdig.cgi/whatwg-whatwg.org/2012-March/035038.html for the latest discussion thread.&lt;br /&gt;
&lt;br /&gt;
== Open Issues ==&lt;br /&gt;
&lt;br /&gt;
* Encoding errors&lt;br /&gt;
** Define options dict for &amp;lt;code&amp;gt;TextEncoder()&amp;lt;/code&amp;gt; that allows selecting between throw vs. replacement character / replacement callback?&lt;br /&gt;
* The [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding specification] defines the byte order mark as more authoritative than an explicit encoding label when decoding content. &lt;br /&gt;
** Is that desirable here? &lt;br /&gt;
** What does it mean to e.g. write &amp;lt;code&amp;gt;TextDecoder(&#039;iso-2022-kr&#039;).decode(str)&amp;lt;/code&amp;gt; if string turns out to be UTF-8?&lt;br /&gt;
** What about streaming and other re-uses of the same decoder object?&lt;br /&gt;
** At the very least, a matching BOM should be ignored (i.e. not returned as part of the output) and a mismatching BOM should signal an error.&lt;br /&gt;
&lt;br /&gt;
== Notes and TODOs ==&lt;br /&gt;
&lt;br /&gt;
* Streaming decode/encode requires retaining partial buffers between calls.&lt;br /&gt;
** Some encode/decode algorithms require adjusting the &amp;lt;var&amp;gt;code point pointer&amp;lt;/var&amp;gt; or &amp;lt;var&amp;gt;byte pointer&amp;lt;/var&amp;gt; by a negative amount, which could occur across &amp;quot;chunk&amp;quot; boundaries. This implies that when the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag is set on an encoder/decoder, the last N elements of the stream must be saved for the next call and used as a prefix for that call&#039;s input. N is defined by the specific encoding algorithm.&lt;br /&gt;
** This is not yet implemented in the [http://code.google.com/p/stringencoding/ ECMAScript shim]&lt;br /&gt;
* Move the algorithm to &amp;lt;b&amp;gt;&amp;lt;i&amp;gt;convert a sequence of Unicode characters to a DOMString&amp;lt;/i&amp;gt;&amp;lt;/b&amp;gt; into [http://dev.w3.org/2006/webapi/WebIDL/#idl-DOMString WebIDL].&lt;br /&gt;
&lt;br /&gt;
== Resolved Issues ==&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dl&amp;gt;&lt;br /&gt;
&amp;lt;dt&amp;gt;Encode as many characters as possible into a fixed-size buffer for transmission, and repeat starting with next unencoded character&lt;br /&gt;
&amp;lt;dd&amp;gt;Resolution: not for &amp;quot;v1&amp;quot; - can be implemented using this API, and breaking the string down by &amp;quot;character&amp;quot; is unlikely to be as obvious as it sounds - surrogate pairs, combining sequences, etc&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dt&amp;gt;Support two versions of encode; one which takes target buffer, and one which creates/returns a right-sized buffer&lt;br /&gt;
&amp;lt;dd&amp;gt;Resolution: See above, and wait for developer feedback.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dt&amp;gt;Allow arbitrary end byte sequences (e.g. &amp;lt;code&amp;gt;0xFF&amp;lt;/code&amp;gt; for UTF-8 strings)&lt;br /&gt;
&amp;lt;dd&amp;gt;Resolution: Add &amp;lt;code&amp;gt;indexOf&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;ArrayBufferView&amp;lt;/code&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dt&amp;gt;Remove &#039;&#039;binary&#039;&#039; encoding? &#039;&#039;(a proposed 8-bit-clean encoding for interop with legacy binary data stored in ECMAScript strings)&#039;&#039;&lt;br /&gt;
&amp;lt;dd&amp;gt;The only real use is with &amp;lt;code&amp;gt;atob()&amp;lt;/code&amp;gt;/&amp;lt;code&amp;gt;btoa()&amp;lt;/code&amp;gt;; a better API would be Base64 directly in/out of Typed Arrays. Consensus on WHATWG is to add better APIs e.g. &lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
partial interface ArrayBufferView {&lt;br /&gt;
    DOMString toBase64();&lt;br /&gt;
};&lt;br /&gt;
&lt;br /&gt;
partial interface ArrayBuffer {&lt;br /&gt;
    static ArrayBuffer fromBase64(DOMString string);&lt;br /&gt;
};&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dt&amp;gt;Should this be a standalone API (as written), or live on e.g. DataView, or on e.g. String?&lt;br /&gt;
&amp;lt;dd&amp;gt;There seems to be pretty strong consensus that streaming, stateful coding is high priority and that an object-oriented API is cleanest, and shoe-horning those onto existing objects would be messy.&lt;br /&gt;
&amp;lt;/dl&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== API ==&lt;br /&gt;
&lt;br /&gt;
=== TextEncoder ===&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;WebIDL&#039;&#039;&#039;&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
dictionary TextEncodeOptions {&lt;br /&gt;
  boolean stream = false;&lt;br /&gt;
};&lt;br /&gt;
&lt;br /&gt;
[Constructor,&lt;br /&gt;
 Constructor(DOMString encoding)]&lt;br /&gt;
interface TextEncoder {&lt;br /&gt;
  readonly attribute DOMString encoding;&lt;br /&gt;
  ArrayBufferView encode(DOMString? string, optional TextEncodeOptions options);&lt;br /&gt;
};&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The constructor runs the following steps:&lt;br /&gt;
# If the constructor is called with no arguments, let &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt; be &amp;quot;&amp;lt;code&amp;gt;utf-8&amp;lt;/code&amp;gt;&amp;quot;. Otherwise, let &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt; be the value of the &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; argument.&lt;br /&gt;
# Run the &#039;&#039;&#039;steps to get an encoding&#039;&#039;&#039; from [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding], with &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt; as &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt;.&lt;br /&gt;
#* If the steps result in failure, throw an &amp;quot;&amp;lt;code&amp;gt;EncodingError&amp;lt;/code&amp;gt;&amp;quot; exception and terminate these steps.&lt;br /&gt;
#* Otherwise, set the encoder object&#039;s internal &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; property to the returned encoding.&lt;br /&gt;
# Initialize the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag of the encoder object to false. &lt;br /&gt;
# Initialize the internal &amp;lt;var&amp;gt;encoding algorithm state&amp;lt;/var&amp;gt; to the default values for the encoding &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dl&amp;gt;&lt;br /&gt;
&amp;lt;dt&amp;gt;&amp;lt;code&amp;gt;encoding&amp;lt;/code&amp;gt; of type DOMString, readonly&lt;br /&gt;
&amp;lt;dd&amp;gt;Returns the &amp;lt;var&amp;gt;Name&amp;lt;/var&amp;gt; of the encoder object&#039;s &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;, per [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding]. &lt;br /&gt;
&lt;br /&gt;
:Note that this may differ from the name of the encoding specified during the call to the constructor. For example, if the constructor is called with &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; of &amp;lt;code&amp;gt;&amp;quot;ascii&amp;quot;&amp;lt;/code&amp;gt; the &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; attribute of the &#039;&#039;encoder object&#039;&#039; would have the value &amp;lt;code&amp;gt;&amp;quot;windows-1252&amp;quot;&amp;lt;/code&amp;gt; as &amp;lt;code&amp;gt;&amp;quot;ascii&amp;quot;&amp;lt;/code&amp;gt; is a label for that encoding.&lt;br /&gt;
&amp;lt;/dl&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dl&amp;gt;&lt;br /&gt;
&amp;lt;dt&amp;gt;&amp;lt;code&amp;gt;encode&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;dd&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The &amp;lt;code&amp;gt;encode&amp;lt;/code&amp;gt; method runs these steps:&lt;br /&gt;
&lt;br /&gt;
# If the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag is not set, then reset the &amp;lt;var&amp;gt;encoding algorithm state&amp;lt;/var&amp;gt; to the default values for &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;. Otherwise, the &amp;lt;var&amp;gt;encoding algorithm state&amp;lt;/var&amp;gt; is re-used from the previous call to &amp;lt;code&amp;gt;encode&amp;lt;/code&amp;gt; on this object.&lt;br /&gt;
# If the &amp;lt;var&amp;gt;options&amp;lt;/var&amp;gt; parameter is specified and the &amp;lt;code&amp;gt;stream&amp;lt;/code&amp;gt; option is &#039;&#039;&#039;true&#039;&#039;&#039;, then the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag is set; otherwise the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag is cleared.&lt;br /&gt;
# Run the steps of the &amp;lt;var&amp;gt;encoding algorithm&amp;lt;/var&amp;gt;:&lt;br /&gt;
#* The input to the algorithm is a &amp;lt;var&amp;gt;stream of code points&amp;lt;/var&amp;gt;. The stream is composed of the Unicode code points for the Unicode characters produced by following the [http://dev.w3.org/2006/webapi/WebIDL/#dfn-obtain-unicode steps to convert a DOMString to a sequence of Unicode characters] in [http://dev.w3.org/2006/webapi/WebIDL/ WebIDL] with &amp;lt;var&amp;gt;string&amp;lt;/var&amp;gt; as the input. If &amp;lt;var&amp;gt;string&amp;lt;/var&amp;gt; is null, the &amp;lt;var&amp;gt;stream of code points&amp;lt;/var&amp;gt; is empty.&lt;br /&gt;
#* If the &amp;lt;var&amp;gt;options&amp;lt;/var&amp;gt; parameter is not specified or the &amp;lt;code&amp;gt;stream&amp;lt;/code&amp;gt; option is false, then after the final code point is yielded by the stream, the &#039;&#039;&#039;EOF code point&#039;&#039;&#039; is yielded.&lt;br /&gt;
#* The output of the algorithm is a &amp;lt;var&amp;gt;sequence of emitted bytes&amp;lt;/var&amp;gt;.&lt;br /&gt;
# Return a &amp;lt;code&amp;gt;Uint8Array&amp;lt;/code&amp;gt; object wrapping an &amp;lt;code&amp;gt;ArrayBuffer&amp;lt;/code&amp;gt; containing the &amp;lt;var&amp;gt;sequence of emitted bytes&amp;lt;/var&amp;gt; produced by the encoding algorithm.&lt;br /&gt;
&amp;lt;/dl&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== TextDecoder ===&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;WebIDL&#039;&#039;&#039;&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
dictionary TextDecoderOptions {&lt;br /&gt;
  boolean fatal = false;&lt;br /&gt;
};&lt;br /&gt;
&lt;br /&gt;
dictionary TextDecodeOptions {&lt;br /&gt;
  boolean stream = false;&lt;br /&gt;
};&lt;br /&gt;
&lt;br /&gt;
[Constructor,&lt;br /&gt;
 Constructor(optional DOMString encoding, optional TextDecoderOptions options)]&lt;br /&gt;
interface TextDecoder {&lt;br /&gt;
  readonly attribute DOMString encoding;&lt;br /&gt;
  DOMString decode(optional ArrayBufferView view, optional TextDecodeOptions options);&lt;br /&gt;
};&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The constructor runs the following steps:&lt;br /&gt;
# The constructor creates a &#039;&#039;decoder object&#039;&#039;. It has the internal properties &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;, a &amp;lt;var&amp;gt;fatal&amp;lt;/var&amp;gt; flag which is initially unset, and a &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag which is initially unset.&lt;br /&gt;
# If called without an &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; argument, let &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt; be &amp;quot;&amp;lt;code&amp;gt;utf-8&amp;lt;/code&amp;gt;&amp;quot;. Otherwise, let &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt; be the value of the &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; argument.&lt;br /&gt;
# Run the &#039;&#039;&#039;steps to get an encoding&#039;&#039;&#039; from [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding], with &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt; as &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt;. &lt;br /&gt;
## If the steps result in failure, throw an &amp;quot;&amp;lt;code&amp;gt;EncodingError&amp;lt;/code&amp;gt;&amp;quot; exception and terminate these steps.&lt;br /&gt;
## Otherwise, set the &#039;&#039;decoder object&#039;s&#039;&#039; internal &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; property to the returned encoding.&lt;br /&gt;
# If the constructor is called with an &amp;lt;var&amp;gt;options&amp;lt;/var&amp;gt; argument, and the &amp;lt;var&amp;gt;fatal&amp;lt;/var&amp;gt; property of the dictionary is set, set the internal &amp;lt;var&amp;gt;fatal&amp;lt;/var&amp;gt; flag of the &#039;&#039;decoder object&#039;&#039;.&lt;br /&gt;
# Initialize the internal &amp;lt;var&amp;gt;encoding algorithm state&amp;lt;/var&amp;gt; to the default values for the encoding &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;.&lt;br /&gt;
# Return the &#039;&#039;decoder object&#039;&#039;.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dl&amp;gt;&lt;br /&gt;
&amp;lt;dt&amp;gt;&amp;lt;code&amp;gt;encoding&amp;lt;/code&amp;gt; of type DOMString, readonly&lt;br /&gt;
&amp;lt;dd&amp;gt;&lt;br /&gt;
Returns the &amp;lt;var&amp;gt;Name&amp;lt;/var&amp;gt; of the decoder object&#039;s &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;, per [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding].&lt;br /&gt;
&lt;br /&gt;
:Note that this may differ from the name of the encoding specified during the call to the constructor. For example, if the constructor is called with &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; of &amp;lt;code&amp;gt;&amp;quot;ascii&amp;quot;&amp;lt;/code&amp;gt; the &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; attribute of the &#039;&#039;decoder object&#039;&#039; would have the value &amp;lt;code&amp;gt;&amp;quot;windows-1252&amp;quot;&amp;lt;/code&amp;gt; as &amp;lt;code&amp;gt;&amp;quot;ascii&amp;quot;&amp;lt;/code&amp;gt; is a label for that encoding.&lt;br /&gt;
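&lt;br /&gt;
As a rough illustration of the note above, the following sketch reads the attribute back after constructing a decoder from a label. It assumes a runtime that exposes a global &amp;lt;code&amp;gt;TextDecoder&amp;lt;/code&amp;gt; and includes the legacy single-byte encodings; the variable names are illustrative only, and the exact capitalisation of the reported name may vary between implementations.&lt;br /&gt;
&lt;br /&gt;
```javascript
// "ascii" is only a label; the attribute reports the name of the
// encoding that the label resolves to (windows-1252 per the note above).
const decoder = new TextDecoder("ascii");
const reportedName = decoder.encoding;
console.log(reportedName);
```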
&lt;br /&gt;
&amp;lt;dt&amp;gt;&amp;lt;code&amp;gt;decode&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;dd&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The &amp;lt;code&amp;gt;decode&amp;lt;/code&amp;gt; method runs the following steps:&lt;br /&gt;
&lt;br /&gt;
# If the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag of the &#039;&#039;decoder object&#039;&#039; is not set, then reset the &amp;lt;var&amp;gt;encoding algorithm state&amp;lt;/var&amp;gt; to the default values for encoding &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;. Otherwise, the &amp;lt;var&amp;gt;encoding algorithm state&amp;lt;/var&amp;gt; is re-used from the previous call to &amp;lt;code&amp;gt;decode&amp;lt;/code&amp;gt; on this object. &lt;br /&gt;
# If the &amp;lt;var&amp;gt;options&amp;lt;/var&amp;gt; parameter is specified and the &amp;lt;var&amp;gt;stream&amp;lt;/var&amp;gt; option is &#039;&#039;&#039;true&#039;&#039;&#039;, then the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag is set; otherwise the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag is cleared.&lt;br /&gt;
# Run the decoder algorithm of the &#039;&#039;decoder object&#039;s&#039;&#039; encoding:&lt;br /&gt;
#* The input to the algorithm is a &amp;lt;var&amp;gt;byte stream&amp;lt;/var&amp;gt;. The &amp;lt;var&amp;gt;byte stream&amp;lt;/var&amp;gt; is provided by the bytes in &amp;lt;var&amp;gt;view.buffer&amp;lt;/var&amp;gt; starting at offset &amp;lt;var&amp;gt;view.byteOffset&amp;lt;/var&amp;gt;. A maximum of &amp;lt;var&amp;gt;view.byteLength&amp;lt;/var&amp;gt; bytes are yielded by the stream from &amp;lt;var&amp;gt;view.buffer&amp;lt;/var&amp;gt;. If &amp;lt;var&amp;gt;view&amp;lt;/var&amp;gt; is not specified, the stream is empty.&lt;br /&gt;
#* If the &amp;lt;var&amp;gt;options&amp;lt;/var&amp;gt; parameter is not specified or the &#039;&#039;&#039;stream&#039;&#039;&#039; option is &#039;&#039;&#039;false&#039;&#039;&#039;, then after &amp;lt;var&amp;gt;view.byteLength&amp;lt;/var&amp;gt; bytes are yielded by the stream, the &#039;&#039;&#039;EOF byte&#039;&#039;&#039; is yielded.&lt;br /&gt;
#* If the internal &amp;lt;var&amp;gt;fatal&amp;lt;/var&amp;gt; flag of the &#039;&#039;decoder object&#039;&#039; is set, then a &#039;&#039;&#039;decoder error&#039;&#039;&#039; causes a &amp;lt;code&amp;gt;DOMException&amp;lt;/code&amp;gt; of type &amp;lt;code&amp;gt;EncodingError&amp;lt;/code&amp;gt; to be thrown rather than emitting a fallback code point.&lt;br /&gt;
#* The output of the algorithm is a sequence of &amp;lt;var&amp;gt;emitted code points&amp;lt;/var&amp;gt;.&lt;br /&gt;
# Return a &amp;lt;code&amp;gt;DOMString&amp;lt;/code&amp;gt; by encoding the &amp;lt;var&amp;gt;sequence of emitted code points&amp;lt;/var&amp;gt; following the [[#Steps to convert a sequence of Unicode characters to a DOMString|steps to convert a sequence of Unicode characters to a DOMString]].&lt;br /&gt;
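&lt;br /&gt;
The interaction between the &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag and the EOF byte in the steps above can be sketched as follows. This is a minimal sketch, not part of the proposal: it assumes a runtime that exposes a global &amp;lt;code&amp;gt;TextDecoder&amp;lt;/code&amp;gt; accepting the &amp;lt;code&amp;gt;fatal&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;stream&amp;lt;/code&amp;gt; options described here, and the helper name &amp;lt;code&amp;gt;decodeChunks&amp;lt;/code&amp;gt; is hypothetical. The type of exception thrown on a decoder error may differ from the one named in this draft.&lt;br /&gt;
&lt;br /&gt;
```javascript
// Decode bytes that arrive in chunks; a character may straddle two chunks.
function decodeChunks(chunks, label) {
  const decoder = new TextDecoder(label, { fatal: true });
  let text = "";
  for (const chunk of chunks) {
    // stream: true keeps an incomplete trailing byte sequence in the
    // decoder state instead of treating the end of the chunk as EOF.
    text += decoder.decode(chunk, { stream: true });
  }
  // A final call without the stream option yields EOF and flushes the state.
  text += decoder.decode();
  return text;
}

// U+20AC EURO SIGN is E2 82 AC in UTF-8; it is split across two chunks here.
const parts = [
  new Uint8Array([0x41, 0xE2]),
  new Uint8Array([0x82, 0xAC, 0x42]),
];
const result = decodeChunks(parts, "utf-8");
console.log(result);
```
&lt;br /&gt;
Without the &amp;lt;code&amp;gt;stream&amp;lt;/code&amp;gt; option, the first chunk would end in a truncated sequence and, with the &amp;lt;var&amp;gt;fatal&amp;lt;/var&amp;gt; flag set, would be reported as a decoder error.&lt;br /&gt;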
&lt;br /&gt;
:&#039;&#039;ISSUE: Need to handle BOMs. [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding] specifies this by using the caller-specified encoding as a suggestion, and consuming the BOM as a part of selecting the real encoder where BOM takes precedence. At the very least a matching BOM should be ignored. A mismatching BOM could throw, be a decoding error, or could actually switch the decoder for this stream (call or sequence of calls), possibly if-and-only-if the constructor was called without an encoding.&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/dl&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Algorithms ==&lt;br /&gt;
&lt;br /&gt;
=== Steps to convert a sequence of Unicode characters to a DOMString ===&lt;br /&gt;
&lt;br /&gt;
:TODO: Move the following to [http://dev.w3.org/2006/webapi/WebIDL/#idl-DOMString WebIDL]&lt;br /&gt;
&lt;br /&gt;
The following algorithm defines a way to &amp;lt;b&amp;gt;&amp;lt;i&amp;gt;convert a sequence of Unicode characters to a DOMString&amp;lt;/i&amp;gt;&amp;lt;/b&amp;gt;:&lt;br /&gt;
# Let &amp;lt;var&amp;gt;U&amp;lt;/var&amp;gt;&amp;lt;sub&amp;gt;0...&amp;lt;var&amp;gt;n&amp;lt;/var&amp;gt;-1&amp;lt;/sub&amp;gt; be the sequence of Unicode characters &lt;br /&gt;
# Initialize &amp;lt;var&amp;gt;i&amp;lt;/var&amp;gt; to 0&lt;br /&gt;
# Initialize &amp;lt;var&amp;gt;S&amp;lt;/var&amp;gt; to be an empty sequence of [http://dev.w3.org/2006/webapi/WebIDL/#dfn-code-unit code units]&lt;br /&gt;
# While &amp;lt;var&amp;gt;i&amp;lt;/var&amp;gt; &amp;lt; &amp;lt;var&amp;gt;n&amp;lt;/var&amp;gt;&lt;br /&gt;
## Let &amp;lt;var&amp;gt;c&amp;lt;/var&amp;gt; be the &amp;lt;b&amp;gt;code point&amp;lt;/b&amp;gt; of the Unicode character in &amp;lt;var&amp;gt;U&amp;lt;/var&amp;gt; at index &amp;lt;var&amp;gt;i&amp;lt;/var&amp;gt;&lt;br /&gt;
## If &amp;lt;var&amp;gt;c&amp;lt;/var&amp;gt; &amp;amp;ge; 2&amp;lt;sup&amp;gt;16&amp;lt;/sup&amp;gt;, then:&lt;br /&gt;
### Append to &amp;lt;var&amp;gt;S&amp;lt;/var&amp;gt; a [http://dev.w3.org/2006/webapi/WebIDL/#dfn-code-unit code unit] equal to (&amp;lt;var&amp;gt;c&amp;lt;/var&amp;gt; - 2&amp;lt;sup&amp;gt;16&amp;lt;/sup&amp;gt;) / 2&amp;lt;sup&amp;gt;10&amp;lt;/sup&amp;gt; + 0xD800, where &amp;quot;/&amp;quot; represents integer division.&lt;br /&gt;
### Append to &amp;lt;var&amp;gt;S&amp;lt;/var&amp;gt; a [http://dev.w3.org/2006/webapi/WebIDL/#dfn-code-unit code unit] equal to (&amp;lt;var&amp;gt;c&amp;lt;/var&amp;gt; - 2&amp;lt;sup&amp;gt;16&amp;lt;/sup&amp;gt;) % 2&amp;lt;sup&amp;gt;10&amp;lt;/sup&amp;gt; + 0xDC00, where &amp;quot;%&amp;quot; represents the remainder of an integer division.&lt;br /&gt;
## Otherwise, append to &amp;lt;var&amp;gt;S&amp;lt;/var&amp;gt; a [http://dev.w3.org/2006/webapi/WebIDL/#dfn-code-unit code unit] equal to &amp;lt;var&amp;gt;c&amp;lt;/var&amp;gt;.&lt;br /&gt;
## Set &amp;lt;var&amp;gt;i&amp;lt;/var&amp;gt; to &amp;lt;var&amp;gt;i&amp;lt;/var&amp;gt;+1&lt;br /&gt;
# Return the IDL [http://dev.w3.org/2006/webapi/WebIDL/#idl-DOMString DOMString] value that represents the sequence of code units &amp;lt;var&amp;gt;S&amp;lt;/var&amp;gt;.&lt;br /&gt;
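&lt;br /&gt;
As a rough illustration only (not part of the proposal), the steps above can be sketched in JavaScript. The helper name &amp;lt;code&amp;gt;codePointsToString&amp;lt;/code&amp;gt; is hypothetical, and the sketch assumes every input value is a valid Unicode code point; 0x10000 and 0x400 are 2&amp;lt;sup&amp;gt;16&amp;lt;/sup&amp;gt; and 2&amp;lt;sup&amp;gt;10&amp;lt;/sup&amp;gt; respectively.&lt;br /&gt;
&lt;br /&gt;
```javascript
// Convert a sequence of Unicode code points to a string of UTF-16 code units.
function codePointsToString(codePoints) {
  const units = [];
  for (const c of codePoints) {
    if (c >= 0x10000) {
      // Supplementary code point: emit a lead and a trail surrogate.
      const offset = c - 0x10000;
      units.push(Math.floor(offset / 0x400) + 0xD800);
      units.push((offset % 0x400) + 0xDC00);
    } else {
      // Basic Multilingual Plane: the code unit equals the code point.
      units.push(c);
    }
  }
  return units.map((u) => String.fromCharCode(u)).join("");
}

// U+0048 stays a single code unit; U+1F600 becomes the pair D83D DE00.
const sample = codePointsToString([0x48, 0x1F600]);
console.log(sample.length);
```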
&lt;br /&gt;
== Examples ==&lt;br /&gt;
&lt;br /&gt;
=== Example #1 - encoding strings ===&lt;br /&gt;
&lt;br /&gt;
The following example uses the API to encode an array of strings into an ArrayBuffer. The buffer contains the number of strings (as a Uint32), followed by the byte length of the first string (as a Uint32), the encoded data of the first string, the byte length of the second string (as a Uint32), its encoded data, and so on.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
function encodeArrayOfStrings(strings, encoding) {&lt;br /&gt;
  var encoder, encoded, len, i, bytes, view, offset;&lt;br /&gt;
&lt;br /&gt;
  encoder = new TextEncoder(encoding);&lt;br /&gt;
  encoded = [];&lt;br /&gt;
&lt;br /&gt;
  len = Uint32Array.BYTES_PER_ELEMENT;&lt;br /&gt;
  for (i = 0; i &amp;lt; strings.length; i += 1) {&lt;br /&gt;
    len += Uint32Array.BYTES_PER_ELEMENT;&lt;br /&gt;
    encoded[i] = encoder.encode(strings[i]);&lt;br /&gt;
    len += encoded[i].byteLength;&lt;br /&gt;
  }&lt;br /&gt;
&lt;br /&gt;
  bytes = new Uint8Array(len);&lt;br /&gt;
  view = new DataView(bytes.buffer);&lt;br /&gt;
  offset = 0;&lt;br /&gt;
&lt;br /&gt;
  view.setUint32(offset, strings.length);&lt;br /&gt;
  offset += Uint32Array.BYTES_PER_ELEMENT;&lt;br /&gt;
  for (i = 0; i &amp;lt; encoded.length; i += 1) {&lt;br /&gt;
    len = encoded[i].byteLength;&lt;br /&gt;
    view.setUint32(offset, len);&lt;br /&gt;
    offset += Uint32Array.BYTES_PER_ELEMENT;&lt;br /&gt;
    bytes.set(encoded[i], offset);&lt;br /&gt;
    offset += len;&lt;br /&gt;
  }&lt;br /&gt;
  return bytes.buffer;&lt;br /&gt;
}&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Example #2 - decoding strings ===&lt;br /&gt;
&lt;br /&gt;
The following example decodes an ArrayBuffer containing data encoded in the format produced by the previous example back into an array of strings.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
function decodeArrayOfStrings(buffer, encoding) {&lt;br /&gt;
  var decoder, view, offset, num_strings, strings, i, len;&lt;br /&gt;
&lt;br /&gt;
  decoder = new TextDecoder(encoding);&lt;br /&gt;
  view = new DataView(buffer);&lt;br /&gt;
  offset = 0;&lt;br /&gt;
  strings = [];&lt;br /&gt;
&lt;br /&gt;
  num_strings = view.getUint32(offset);&lt;br /&gt;
  offset += Uint32Array.BYTES_PER_ELEMENT;&lt;br /&gt;
  for (i = 0; i &amp;lt; num_strings; i += 1) {&lt;br /&gt;
    len = view.getUint32(offset);&lt;br /&gt;
    offset += Uint32Array.BYTES_PER_ELEMENT;&lt;br /&gt;
    strings[i] = decoder.decode(&lt;br /&gt;
      new DataView(view.buffer, offset, len));&lt;br /&gt;
    offset += len;&lt;br /&gt;
  }&lt;br /&gt;
  return strings;&lt;br /&gt;
}&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Encodings ==&lt;br /&gt;
&lt;br /&gt;
Encodings are defined and implemented per [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding]. This implicitly includes the steps to &amp;lt;em&amp;gt;get an encoding&amp;lt;/em&amp;gt; from a string, and the logic for label matching and case-insensitivity.&lt;br /&gt;
&lt;br /&gt;
User agents MUST NOT support any encodings or labels other than those defined in [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding], and MUST support all encodings and labels defined in that specification, with the additions defined below.&lt;br /&gt;
&lt;br /&gt;
:&#039;&#039;NOTE: In [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding], &amp;quot;ascii&amp;quot; is a label for &#039;&#039;&#039;windows-1252&#039;&#039;&#039;; there is no 7-bit-or-raise-exception encoding. Applications that are required to restrict the content of decoded strings should implement validation after decoding.&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
:&#039;&#039;NOTE: Unicode normalization forms are outside the scope of this specification. No normalization is done prior to encoding or after decoding.&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
:&#039;&#039;NOTE: Handling of encoding-specific issues, e.g. over-long UTF-8 encodings, byte order marks, unmatched surrogate pairs, and so on is defined by [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding].&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
== References ==&lt;br /&gt;
&lt;br /&gt;
* WebIDL http://dev.w3.org/2006/webapi/WebIDL&lt;br /&gt;
* Encoding http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== Acknowledgements ==&lt;br /&gt;
&lt;br /&gt;
* Alan Chaney&lt;br /&gt;
* Ben Noordhuis&lt;br /&gt;
* Glenn Maynard&lt;br /&gt;
* John Tamplin&lt;br /&gt;
* Kenneth Russell (Google, Inc)&lt;br /&gt;
* Robert Mustacchi&lt;br /&gt;
* Ryan Dahl&lt;br /&gt;
* Anne van Kesteren&lt;br /&gt;
&lt;br /&gt;
== Appendix ==&lt;br /&gt;
&lt;br /&gt;
A &amp;quot;shim&amp;quot; implementation in JavaScript (that may not fully match the current version of the spec) plus some initial unit tests can be found at:&lt;br /&gt;
&lt;br /&gt;
:http://code.google.com/p/stringencoding/&lt;br /&gt;
&lt;br /&gt;
[[Category:Proposals]]&lt;/div&gt;</summary>
		<author><name>Jsbell</name></author>
	</entry>
	<entry>
		<id>https://wiki.whatwg.org/index.php?title=StringEncoding&amp;diff=8357</id>
		<title>StringEncoding</title>
		<link rel="alternate" type="text/html" href="https://wiki.whatwg.org/index.php?title=StringEncoding&amp;diff=8357"/>
		<updated>2012-06-27T17:37:14Z</updated>

		<summary type="html">&lt;p&gt;Jsbell: /* Notes and TODOs */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;Proposed Text Encoding Web API for Typed Arrays&lt;br /&gt;
&lt;br /&gt;
== Editors ==&lt;br /&gt;
&lt;br /&gt;
* Joshua Bell (Google, Inc)&lt;br /&gt;
&lt;br /&gt;
== Abstract ==&lt;br /&gt;
&lt;br /&gt;
This specification defines an API for encoding strings to binary data, and decoding strings from binary data.&lt;br /&gt;
&lt;br /&gt;
NOTE: This specification intentionally does not address the opposite scenario of encoding binary data as strings and decoding binary data from strings, for example using Base64 encoding.&lt;br /&gt;
&lt;br /&gt;
Discussion on this topic has so far taken place on the [http://www.khronos.org/webgl/public-mailing-list/ public_webgl@khronos.org] mailing list. See http://www.khronos.org/webgl/public-mailing-list/archives/1111/msg00017.html for the initial discussion thread.&lt;br /&gt;
&lt;br /&gt;
Discussion has since moved to the [http://www.whatwg.org/mailing-list#specs WHATWG spec discussion] mailing list. See  http://lists.whatwg.org/htdig.cgi/whatwg-whatwg.org/2012-March/035038.html for the latest discussion thread.&lt;br /&gt;
&lt;br /&gt;
== Open Issues ==&lt;br /&gt;
&lt;br /&gt;
* Encoding errors&lt;br /&gt;
** Define options dict for &amp;lt;code&amp;gt;TextEncoder()&amp;lt;/code&amp;gt; that allows selecting between throw vs. replacement character / replacement callback?&lt;br /&gt;
* The [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding specification] defines the byte order mark as more authoritative than an explicit encoding label when decoding content. &lt;br /&gt;
** Is that desirable here? &lt;br /&gt;
** What does it mean to e.g. write &amp;lt;code&amp;gt;TextDecoder(&#039;iso-2022-kr&#039;).decode(str)&amp;lt;/code&amp;gt; if string turns out to be UTF-8?&lt;br /&gt;
** What about streaming and other re-uses of the same decoder object?&lt;br /&gt;
** At the very least, a matching BOM should be ignored (i.e. not returned as part of the output) and a mismatching BOM should signal an error.&lt;br /&gt;
&lt;br /&gt;
== Notes and TODOs ==&lt;br /&gt;
&lt;br /&gt;
* Streaming decode/encode requires retaining partial buffers between calls.&lt;br /&gt;
** Some encode/decode algorithms require adjusting the &amp;lt;var&amp;gt;code point pointer&amp;lt;/var&amp;gt; or &amp;lt;var&amp;gt;byte pointer&amp;lt;/var&amp;gt; by a negative amount. This could occur across &amp;quot;chunk&amp;quot; boundaries. This implies that when the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag is set on an encoder/decoder, the last N elements of the stream are saved for the next call and used as a prefix for the stream. N is defined by the specific encoding algorithm.&lt;br /&gt;
* Move the algorithm to &amp;lt;b&amp;gt;&amp;lt;i&amp;gt;convert a sequence of Unicode characters to a DOMString&amp;lt;/i&amp;gt;&amp;lt;/b&amp;gt; into [http://dev.w3.org/2006/webapi/WebIDL/#idl-DOMString WebIDL].&lt;br /&gt;
&lt;br /&gt;
== Resolved Issues ==&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dl&amp;gt;&lt;br /&gt;
&amp;lt;dt&amp;gt;Encode as many characters as possible into a fixed-size buffer for transmission, and repeat starting with next unencoded character&lt;br /&gt;
&amp;lt;dd&amp;gt;Resolution: not for &amp;quot;v1&amp;quot; - can be implemented using this API, and breaking the string down by &amp;quot;character&amp;quot; is unlikely to be as obvious as it sounds - surrogate pairs, combining sequences, etc&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dt&amp;gt;Support two versions of encode; one which takes target buffer, and one which creates/returns a right-sized buffer&lt;br /&gt;
&amp;lt;dd&amp;gt;Resolution: See above, and wait for developer feedback.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dt&amp;gt;Allow arbitrary end byte sequences (e.g. &amp;lt;code&amp;gt;0xFF&amp;lt;/code&amp;gt; for UTF-8 strings)&lt;br /&gt;
&amp;lt;dd&amp;gt;Resolution: Add &amp;lt;code&amp;gt;indexOf&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;ArrayBufferView&amp;lt;/code&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dt&amp;gt;Remove &#039;&#039;binary&#039;&#039; encoding? &#039;&#039;(a proposed 8-bit-clean encoding for interop with legacy binary data stored in ECMAScript strings)&#039;&#039;&lt;br /&gt;
&amp;lt;dd&amp;gt;The only real use is with &amp;lt;code&amp;gt;atob()&amp;lt;/code&amp;gt;/&amp;lt;code&amp;gt;btoa()&amp;lt;/code&amp;gt;; a better API would be Base64 directly in/out of Typed Arrays. Consensus on WHATWG is to add better APIs e.g. &lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
partial interface ArrayBufferView {&lt;br /&gt;
    DOMString toBase64();&lt;br /&gt;
};&lt;br /&gt;
&lt;br /&gt;
partial interface ArrayBuffer {&lt;br /&gt;
    static ArrayBuffer fromBase64(DOMString string);&lt;br /&gt;
};&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dt&amp;gt;Should this be a standalone API (as written), or live on e.g. DataView, or on e.g. String?&lt;br /&gt;
&amp;lt;dd&amp;gt;There seems to be pretty strong consensus that streaming, stateful coding is high priority and that an object-oriented API is cleanest, and shoe-horning those onto existing objects would be messy.&lt;br /&gt;
&amp;lt;/dl&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== API ==&lt;br /&gt;
&lt;br /&gt;
=== TextEncoder ===&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;WebIDL&#039;&#039;&#039;&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
dictionary TextEncodeOptions {&lt;br /&gt;
  boolean stream = false;&lt;br /&gt;
};&lt;br /&gt;
&lt;br /&gt;
[Constructor,&lt;br /&gt;
 Constructor(DOMString encoding)]&lt;br /&gt;
interface TextEncoder {&lt;br /&gt;
  readonly attribute DOMString encoding;&lt;br /&gt;
  ArrayBufferView encode(DOMString? string, optional TextEncodeOptions options);&lt;br /&gt;
};&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The constructor runs the following steps:&lt;br /&gt;
# If the constructor is called with no arguments, let &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt; be &amp;quot;&amp;lt;code&amp;gt;utf-8&amp;lt;/code&amp;gt;&amp;quot;. Otherwise, let &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt; be the value of the &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; argument.&lt;br /&gt;
# Run the &#039;&#039;&#039;steps to get an encoding&#039;&#039;&#039; from [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding], with &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt; as &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt;.&lt;br /&gt;
#* If the steps result in failure, throw an &amp;quot;&amp;lt;code&amp;gt;EncodingError&amp;lt;/code&amp;gt;&amp;quot; exception and terminate these steps.&lt;br /&gt;
#* Otherwise, set the encoder object&#039;s internal &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; property to the returned encoding.&lt;br /&gt;
# Initialize the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag of the encoder object to false. &lt;br /&gt;
# Initialize the internal &amp;lt;var&amp;gt;encoding algorithm state&amp;lt;/var&amp;gt; to the default values for the encoding &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dl&amp;gt;&lt;br /&gt;
&amp;lt;dt&amp;gt;&amp;lt;code&amp;gt;encoding&amp;lt;/code&amp;gt; of type DOMString, readonly&lt;br /&gt;
&amp;lt;dd&amp;gt;Returns the &amp;lt;var&amp;gt;Name&amp;lt;/var&amp;gt; of the encoder object&#039;s &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;, per [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding]. &lt;br /&gt;
&lt;br /&gt;
:Note that this may differ from the name of the encoding specified during the call to the constructor. For example, if the constructor is called with &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; of &amp;lt;code&amp;gt;&amp;quot;ascii&amp;quot;&amp;lt;/code&amp;gt; the &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; attribute of the &#039;&#039;encoder object&#039;&#039; would have the value &amp;lt;code&amp;gt;&amp;quot;windows-1252&amp;quot;&amp;lt;/code&amp;gt; as &amp;lt;code&amp;gt;&amp;quot;ascii&amp;quot;&amp;lt;/code&amp;gt; is a label for that encoding.&lt;br /&gt;
&amp;lt;/dl&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dl&amp;gt;&lt;br /&gt;
&amp;lt;dt&amp;gt;&amp;lt;code&amp;gt;encode&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;dd&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The &amp;lt;code&amp;gt;encode&amp;lt;/code&amp;gt; method runs these steps:&lt;br /&gt;
&lt;br /&gt;
# If the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag is not set, then reset the &amp;lt;var&amp;gt;encoding algorithm state&amp;lt;/var&amp;gt; to the default values for &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;. Otherwise, the &amp;lt;var&amp;gt;encoding algorithm state&amp;lt;/var&amp;gt; is re-used from the previous call to &amp;lt;code&amp;gt;encode&amp;lt;/code&amp;gt; on this object.&lt;br /&gt;
# If the &amp;lt;var&amp;gt;options&amp;lt;/var&amp;gt; parameter is specified and the &amp;lt;code&amp;gt;stream&amp;lt;/code&amp;gt; option is &#039;&#039;&#039;true&#039;&#039;&#039;, then the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag is set; otherwise the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag is cleared.&lt;br /&gt;
# Run the steps of the &amp;lt;var&amp;gt;encoding algorithm&amp;lt;/var&amp;gt;:&lt;br /&gt;
#* The input to the algorithm is a &amp;lt;var&amp;gt;stream of code points&amp;lt;/var&amp;gt;. The stream is composed of the Unicode code points for the Unicode characters produced by following the [http://dev.w3.org/2006/webapi/WebIDL/#dfn-obtain-unicode steps to convert a DOMString to a sequence of Unicode characters] in [http://dev.w3.org/2006/webapi/WebIDL/ WebIDL] with &amp;lt;var&amp;gt;string&amp;lt;/var&amp;gt; as the input. If &amp;lt;var&amp;gt;string&amp;lt;/var&amp;gt; is null, the &amp;lt;var&amp;gt;stream of code points&amp;lt;/var&amp;gt; is empty.&lt;br /&gt;
#* If the &amp;lt;var&amp;gt;options&amp;lt;/var&amp;gt; parameter is not specified or the &amp;lt;code&amp;gt;stream&amp;lt;/code&amp;gt; option is false, then after the final code point is yielded by the stream, the &#039;&#039;&#039;EOF code point&#039;&#039;&#039; is yielded.&lt;br /&gt;
#* The output of the algorithm is a &amp;lt;var&amp;gt;sequence of emitted bytes&amp;lt;/var&amp;gt;.&lt;br /&gt;
# Return a &amp;lt;code&amp;gt;Uint8Array&amp;lt;/code&amp;gt; object wrapping an &amp;lt;code&amp;gt;ArrayBuffer&amp;lt;/code&amp;gt; containing the &amp;lt;var&amp;gt;sequence of emitted bytes&amp;lt;/var&amp;gt; produced by the encoding algorithm.&lt;br /&gt;
&amp;lt;/dl&amp;gt;&lt;br /&gt;
&lt;br /&gt;
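As a rough illustration of the no-argument constructor path described above, the following sketch encodes a short string with the default UTF-8 encoder. It assumes a runtime that exposes a global &amp;lt;code&amp;gt;TextEncoder&amp;lt;/code&amp;gt;; implementations of the later Encoding Standard support UTF-8 only and have no &amp;lt;code&amp;gt;stream&amp;lt;/code&amp;gt; option on &amp;lt;code&amp;gt;encode&amp;lt;/code&amp;gt;, so those drafted features are left out. The helper name &amp;lt;code&amp;gt;encodeUtf8&amp;lt;/code&amp;gt; is hypothetical.&lt;br /&gt;
&lt;br /&gt;
```javascript
// Hypothetical helper: encode a string with the default (UTF-8) encoder.
function encodeUtf8(text) {
  var encoder = new TextEncoder();
  return encoder.encode(text);
}

// U+20AC EURO SIGN occupies three bytes in UTF-8.
var euro = encodeUtf8("\u20AC");
// euro is a Uint8Array holding the bytes 0xE2 0x82 0xAC
```
&lt;br /&gt;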
=== TextDecoder ===&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;WebIDL&#039;&#039;&#039;&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
dictionary TextDecoderOptions {&lt;br /&gt;
  boolean fatal = false;&lt;br /&gt;
};&lt;br /&gt;
&lt;br /&gt;
dictionary TextDecodeOptions {&lt;br /&gt;
  boolean stream = false;&lt;br /&gt;
};&lt;br /&gt;
&lt;br /&gt;
[Constructor,&lt;br /&gt;
 Constructor(optional DOMString encoding, optional TextDecoderOptions options)]&lt;br /&gt;
interface TextDecoder {&lt;br /&gt;
  readonly attribute DOMString encoding;&lt;br /&gt;
  DOMString decode(optional ArrayBufferView view, optional TextDecodeOptions options);&lt;br /&gt;
};&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The constructor runs the following steps:&lt;br /&gt;
# The constructor creates a &#039;&#039;decoder object&#039;&#039;. It has the internal properties &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;, a &amp;lt;var&amp;gt;fatal&amp;lt;/var&amp;gt; flag which is initially unset, and a &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag which is initially unset.&lt;br /&gt;
# If called without an &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; argument, let &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt; be &amp;quot;&amp;lt;code&amp;gt;utf-8&amp;lt;/code&amp;gt;&amp;quot;. Otherwise, let &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt; be the value of the &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; argument.&lt;br /&gt;
# Run the &#039;&#039;&#039;steps to get an encoding&#039;&#039;&#039; from [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding], with &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt; as &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt;. &lt;br /&gt;
## If the steps result in failure, throw an &amp;quot;&amp;lt;code&amp;gt;EncodingError&amp;lt;/code&amp;gt;&amp;quot; exception and terminate these steps.&lt;br /&gt;
## Otherwise, set the &#039;&#039;decoder object&#039;s&#039;&#039; internal &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; property to the returned encoding.&lt;br /&gt;
# If the constructor is called with an &amp;lt;var&amp;gt;options&amp;lt;/var&amp;gt; argument, and the &amp;lt;var&amp;gt;fatal&amp;lt;/var&amp;gt; property of the dictionary is set, set the internal &amp;lt;var&amp;gt;fatal&amp;lt;/var&amp;gt; flag of the &#039;&#039;decoder object&#039;&#039;.&lt;br /&gt;
# Initialize the internal &amp;lt;var&amp;gt;encoding algorithm state&amp;lt;/var&amp;gt; to the default values for the encoding &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;.&lt;br /&gt;
# Return the &#039;&#039;decoder object&#039;&#039;.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dl&amp;gt;&lt;br /&gt;
&amp;lt;dt&amp;gt;&amp;lt;code&amp;gt;encoding&amp;lt;/code&amp;gt; of type DOMString, readonly&lt;br /&gt;
&amp;lt;dd&amp;gt;&lt;br /&gt;
Returns the &amp;lt;var&amp;gt;Name&amp;lt;/var&amp;gt; of the decoder object&#039;s &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;, per [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding].&lt;br /&gt;
&lt;br /&gt;
:Note that this may differ from the name of the encoding specified during the call to the constructor. For example, if the constructor is called with &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; of &amp;lt;code&amp;gt;&amp;quot;ascii&amp;quot;&amp;lt;/code&amp;gt; the &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; attribute of the &#039;&#039;decoder object&#039;&#039; would have the value &amp;lt;code&amp;gt;&amp;quot;windows-1252&amp;quot;&amp;lt;/code&amp;gt; as &amp;lt;code&amp;gt;&amp;quot;ascii&amp;quot;&amp;lt;/code&amp;gt; is a label for that encoding.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dt&amp;gt;&amp;lt;code&amp;gt;decode&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;dd&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The &amp;lt;code&amp;gt;decode&amp;lt;/code&amp;gt; method runs the following steps:&lt;br /&gt;
&lt;br /&gt;
# If the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag of the &#039;&#039;decoder object&#039;&#039; is not set, then reset the &amp;lt;var&amp;gt;encoding algorithm state&amp;lt;/var&amp;gt; to the default values for encoding &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;. Otherwise, the &amp;lt;var&amp;gt;encoding algorithm state&amp;lt;/var&amp;gt; is re-used from the previous call to &amp;lt;code&amp;gt;decode&amp;lt;/code&amp;gt; on this object. &lt;br /&gt;
# If the &amp;lt;var&amp;gt;options&amp;lt;/var&amp;gt; parameter is specified and the &amp;lt;var&amp;gt;stream&amp;lt;/var&amp;gt; option is &#039;&#039;&#039;true&#039;&#039;&#039;, then the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag is set; otherwise the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag is cleared.&lt;br /&gt;
# Run the decoder algorithm of the &#039;&#039;decoder object&#039;s&#039;&#039; &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;:&lt;br /&gt;
#* The input to the algorithm is a &amp;lt;var&amp;gt;byte stream&amp;lt;/var&amp;gt;. The &amp;lt;var&amp;gt;byte stream&amp;lt;/var&amp;gt; is provided by the bytes in &amp;lt;var&amp;gt;view.buffer&amp;lt;/var&amp;gt; starting at offset &amp;lt;var&amp;gt;view.byteOffset&amp;lt;/var&amp;gt;. A maximum of &amp;lt;var&amp;gt;view.byteLength&amp;lt;/var&amp;gt; bytes are yielded by the stream from &amp;lt;var&amp;gt;view.buffer&amp;lt;/var&amp;gt;. If &amp;lt;var&amp;gt;view&amp;lt;/var&amp;gt; is not specified, the stream is empty.&lt;br /&gt;
#* If the &amp;lt;var&amp;gt;options&amp;lt;/var&amp;gt; parameter is not specified or the &#039;&#039;&#039;stream&#039;&#039;&#039; option is &#039;&#039;&#039;false&#039;&#039;&#039;, then after &amp;lt;var&amp;gt;view.byteLength&amp;lt;/var&amp;gt; bytes are yielded by the stream, the &#039;&#039;&#039;EOF byte&#039;&#039;&#039; is yielded.&lt;br /&gt;
#* If the internal &amp;lt;var&amp;gt;fatal&amp;lt;/var&amp;gt; flag of the &#039;&#039;decoder object&#039;&#039; is set, then a &#039;&#039;&#039;decoder error&#039;&#039;&#039; causes a &amp;lt;code&amp;gt;DOMException&amp;lt;/code&amp;gt; of type &amp;lt;code&amp;gt;EncodingError&amp;lt;/code&amp;gt; to be thrown rather than emitting a fallback code point.&lt;br /&gt;
#* The output of the algorithm is a sequence of &amp;lt;var&amp;gt;emitted code points&amp;lt;/var&amp;gt;.&lt;br /&gt;
# Return a &amp;lt;code&amp;gt;DOMString&amp;lt;/code&amp;gt; by encoding the &amp;lt;var&amp;gt;sequence of emitted code points&amp;lt;/var&amp;gt; following the [[#Steps to convert a sequence of Unicode characters to a DOMString|steps to convert a sequence of Unicode characters to a DOMString]].&lt;br /&gt;
&lt;br /&gt;
:&#039;&#039;ISSUE: Need to handle BOMs. [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding] specifies this by using the caller-specified encoding as a suggestion, and consuming the BOM as a part of selecting the real encoder where BOM takes precedence. At the very least a matching BOM should be ignored. A mismatching BOM could throw, be a decoding error, or could actually switch the decoder for this stream (call or sequence of calls), possibly if-and-only-if the constructor was called without an encoding.&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/dl&amp;gt;&lt;br /&gt;
&lt;br /&gt;
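The streaming and fatal behaviour described above can be sketched as follows. This is a minimal illustration, assuming a runtime that exposes a global &amp;lt;code&amp;gt;TextDecoder&amp;lt;/code&amp;gt; accepting the &amp;lt;code&amp;gt;fatal&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;stream&amp;lt;/code&amp;gt; options; the variable names are illustrative, and the exception type actually thrown by a given implementation may differ from the &amp;lt;code&amp;gt;EncodingError&amp;lt;/code&amp;gt; drafted here.&lt;br /&gt;
&lt;br /&gt;
```javascript
// Decode a UTF-8 byte sequence that arrives in two chunks.
var decoder = new TextDecoder("utf-8", { fatal: true });

// U+20AC EURO SIGN is 0xE2 0x82 0xAC in UTF-8; split it across chunks.
// With stream set, the incomplete sequence is retained for the next call.
var first = decoder.decode(new Uint8Array([0xE2, 0x82]), { stream: true });

// No stream option on the final chunk, so end of input is signalled.
var second = decoder.decode(new Uint8Array([0xAC]));

// With the fatal flag set, malformed input throws instead of
// producing a replacement character.
var threw = false;
try {
  new TextDecoder("utf-8", { fatal: true }).decode(new Uint8Array([0xFF]));
} catch (e) {
  threw = true;
}
```
&lt;br /&gt;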
== Algorithms ==&lt;br /&gt;
&lt;br /&gt;
=== Steps to convert a sequence of Unicode characters to a DOMString ===&lt;br /&gt;
&lt;br /&gt;
:TODO: Move the following to [http://dev.w3.org/2006/webapi/WebIDL/#idl-DOMString WebIDL]&lt;br /&gt;
&lt;br /&gt;
The following algorithm defines a way to &amp;lt;b&amp;gt;&amp;lt;i&amp;gt;convert a sequence of Unicode characters to a DOMString&amp;lt;/i&amp;gt;&amp;lt;/b&amp;gt;:&lt;br /&gt;
# Let &amp;lt;var&amp;gt;U&amp;lt;/var&amp;gt;&amp;lt;sub&amp;gt;0...&amp;lt;var&amp;gt;n&amp;lt;/var&amp;gt;-1&amp;lt;/sub&amp;gt; be the sequence of Unicode characters &lt;br /&gt;
# Initialize &amp;lt;var&amp;gt;i&amp;lt;/var&amp;gt; to 0&lt;br /&gt;
# Initialize &amp;lt;var&amp;gt;S&amp;lt;/var&amp;gt; to be an empty sequence of [http://dev.w3.org/2006/webapi/WebIDL/#dfn-code-unit code units]&lt;br /&gt;
# While &amp;lt;var&amp;gt;i&amp;lt;/var&amp;gt; &amp;lt; &amp;lt;var&amp;gt;n&amp;lt;/var&amp;gt;&lt;br /&gt;
## Let &amp;lt;var&amp;gt;c&amp;lt;/var&amp;gt; be the &amp;lt;b&amp;gt;code point&amp;lt;/b&amp;gt; of the Unicode character in &amp;lt;var&amp;gt;U&amp;lt;/var&amp;gt; at index &amp;lt;var&amp;gt;i&amp;lt;/var&amp;gt;&lt;br /&gt;
## If &amp;lt;var&amp;gt;c&amp;lt;/var&amp;gt; &amp;amp;ge; 2&amp;lt;sup&amp;gt;16&amp;lt;/sup&amp;gt;, then:&lt;br /&gt;
### Append to &amp;lt;var&amp;gt;S&amp;lt;/var&amp;gt; a [http://dev.w3.org/2006/webapi/WebIDL/#dfn-code-unit code unit] equal to (&amp;lt;var&amp;gt;c&amp;lt;/var&amp;gt; - 2&amp;lt;sup&amp;gt;16&amp;lt;/sup&amp;gt;) / 2&amp;lt;sup&amp;gt;10&amp;lt;/sup&amp;gt; + 0xD800, where &amp;quot;/&amp;quot; represents integer division.&lt;br /&gt;
### Append to &amp;lt;var&amp;gt;S&amp;lt;/var&amp;gt; a [http://dev.w3.org/2006/webapi/WebIDL/#dfn-code-unit code unit] equal to (&amp;lt;var&amp;gt;c&amp;lt;/var&amp;gt; - 2&amp;lt;sup&amp;gt;16&amp;lt;/sup&amp;gt;) % 2&amp;lt;sup&amp;gt;10&amp;lt;/sup&amp;gt; + 0xDC00, where &amp;quot;%&amp;quot; represents the remainder of an integer division.&lt;br /&gt;
## Otherwise, append to &amp;lt;var&amp;gt;S&amp;lt;/var&amp;gt; a [http://dev.w3.org/2006/webapi/WebIDL/#dfn-code-unit code unit] equal to &amp;lt;var&amp;gt;c&amp;lt;/var&amp;gt;.&lt;br /&gt;
## Set &amp;lt;var&amp;gt;i&amp;lt;/var&amp;gt; to &amp;lt;var&amp;gt;i&amp;lt;/var&amp;gt;+1&lt;br /&gt;
# Return the IDL [http://dev.w3.org/2006/webapi/WebIDL/#idl-DOMString DOMString] value that represents the sequence of code units &amp;lt;var&amp;gt;S&amp;lt;/var&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
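The steps above can be sketched in JavaScript as follows. This is an illustration only, not normative text: the helper name &amp;lt;code&amp;gt;codePointsToString&amp;lt;/code&amp;gt; is hypothetical, the input is assumed to be an array of valid code points, and the final &amp;lt;code&amp;gt;apply&amp;lt;/code&amp;gt; call is only suitable for short sequences.&lt;br /&gt;
&lt;br /&gt;
```javascript
// Hypothetical helper mirroring the numbered steps above.
function codePointsToString(codePoints) {
  var units = [];
  codePoints.forEach(function (c) {
    // Plane 0 is the Basic Multilingual Plane; any other plane
    // means c is 0x10000 or greater and needs a surrogate pair.
    var plane = Math.floor(c / 0x10000);
    if (plane !== 0) {
      var offset = c - 0x10000;
      units.push(Math.floor(offset / 0x400) + 0xD800); // lead surrogate
      units.push((offset % 0x400) + 0xDC00);           // trail surrogate
    } else {
      units.push(c);
    }
  });
  // Build the string from the collected 16-bit code units.
  return String.fromCharCode.apply(null, units);
}

var sample = codePointsToString([0x48, 0x1F600]);
// sample holds "H" followed by the surrogate pair 0xD83D 0xDE00
```
&lt;br /&gt;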
== Examples ==&lt;br /&gt;
&lt;br /&gt;
=== Example #1 - encoding strings ===&lt;br /&gt;
&lt;br /&gt;
The following example uses the API to encode an array of strings into an ArrayBuffer. The buffer contains the number of strings (as a Uint32), followed by the length of the first string (as a Uint32), the encoded string data, the length of the second string (as a Uint32), the string data, and so on.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
function encodeArrayOfStrings(strings, encoding) {&lt;br /&gt;
  var encoder, encoded, len, i, bytes, view, offset;&lt;br /&gt;
&lt;br /&gt;
  // Reuse a single encoder for every string.&lt;br /&gt;
  encoder = new TextEncoder(encoding);&lt;br /&gt;
  encoded = [];&lt;br /&gt;
&lt;br /&gt;
  // Total size: one Uint32 for the string count, plus a Uint32 length prefix and the encoded bytes for each string.&lt;br /&gt;
  len = Uint32Array.BYTES_PER_ELEMENT;&lt;br /&gt;
  for (i = 0; i &amp;lt; strings.length; i += 1) {&lt;br /&gt;
    len += Uint32Array.BYTES_PER_ELEMENT;&lt;br /&gt;
    encoded[i] = encoder.encode(strings[i]);&lt;br /&gt;
    len += encoded[i].byteLength;&lt;br /&gt;
  }&lt;br /&gt;
&lt;br /&gt;
  bytes = new Uint8Array(len);&lt;br /&gt;
  view = new DataView(bytes.buffer);&lt;br /&gt;
  offset = 0;&lt;br /&gt;
&lt;br /&gt;
  view.setUint32(offset, strings.length);&lt;br /&gt;
  offset += Uint32Array.BYTES_PER_ELEMENT;&lt;br /&gt;
  for (i = 0; i &amp;lt; encoded.length; i += 1) {&lt;br /&gt;
    len = encoded[i].byteLength;&lt;br /&gt;
    view.setUint32(offset, len);&lt;br /&gt;
    offset += Uint32Array.BYTES_PER_ELEMENT;&lt;br /&gt;
    bytes.set(encoded[i], offset);&lt;br /&gt;
    offset += len;&lt;br /&gt;
  }&lt;br /&gt;
  return bytes.buffer;&lt;br /&gt;
}&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Example #2 - decoding strings ===&lt;br /&gt;
&lt;br /&gt;
The following example decodes an ArrayBuffer containing data encoded in the format produced by the previous example back into an array of strings.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
function decodeArrayOfStrings(buffer, encoding) {&lt;br /&gt;
  var decoder, view, offset, num_strings, strings, i, len;&lt;br /&gt;
&lt;br /&gt;
  decoder = new TextDecoder(encoding);&lt;br /&gt;
  view = new DataView(buffer);&lt;br /&gt;
  offset = 0;&lt;br /&gt;
  strings = [];&lt;br /&gt;
&lt;br /&gt;
  num_strings = view.getUint32(offset);&lt;br /&gt;
  offset += Uint32Array.BYTES_PER_ELEMENT;&lt;br /&gt;
  for (i = 0; i &amp;lt; num_strings; i += 1) {&lt;br /&gt;
    len = view.getUint32(offset);&lt;br /&gt;
    offset += Uint32Array.BYTES_PER_ELEMENT;&lt;br /&gt;
    strings[i] = decoder.decode(&lt;br /&gt;
      new DataView(view.buffer, offset, len));&lt;br /&gt;
    offset += len;&lt;br /&gt;
  }&lt;br /&gt;
  return strings;&lt;br /&gt;
}&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Encodings ==&lt;br /&gt;
&lt;br /&gt;
Encodings are defined and implemented per [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding]. This implicitly includes the steps to &amp;lt;em&amp;gt;get an encoding&amp;lt;/em&amp;gt; from a string, and the logic for label matching and case-insensitivity.&lt;br /&gt;
&lt;br /&gt;
User agents MUST NOT support any other encodings or labels than those defined in [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding], and MUST support all encodings and labels defined in that specification, with the additions defined below.&lt;br /&gt;
&lt;br /&gt;
:&#039;&#039;NOTE: In [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding], &amp;quot;ascii&amp;quot; is a label for &#039;&#039;&#039;windows-1252&#039;&#039;&#039;; there is no 7-bit-or-raise-exception encoding. Applications that are required to restrict the content of decoded strings should implement validation after decoding.&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
:&#039;&#039;NOTE: Unicode normalization forms are outside the scope of this specification. No normalization is done prior to encoding or after decoding.&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
:&#039;&#039;NOTE: Handling of encoding-specific issues, e.g. over-long UTF-8 encodings, byte order marks, unmatched surrogate pairs, and so on is defined by [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding].&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
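The first note above suggests validating after decoding when an application needs strictly 7-bit content. A minimal sketch of such a check, assuming a global &amp;lt;code&amp;gt;TextDecoder&amp;lt;/code&amp;gt; is available; the helper name &amp;lt;code&amp;gt;isSevenBit&amp;lt;/code&amp;gt; is hypothetical and not part of this API:&lt;br /&gt;
&lt;br /&gt;
```javascript
// Hypothetical helper: true when every code unit is in the 7-bit range.
function isSevenBit(text) {
  return /^[\x00-\x7F]*$/.test(text);
}

// Decode first, then validate the resulting string.
var decoded = new TextDecoder("utf-8").decode(new Uint8Array([0x48, 0x69]));
var plainOk = isSevenBit(decoded);         // decoded is "Hi"
var accentOk = isSevenBit("caf\u00E9");    // contains U+00E9
```
&lt;br /&gt;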
== References ==&lt;br /&gt;
&lt;br /&gt;
* WebIDL http://dev.w3.org/2006/webapi/WebIDL&lt;br /&gt;
* Encoding http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== Acknowledgements ==&lt;br /&gt;
&lt;br /&gt;
* Alan Chaney&lt;br /&gt;
* Ben Noordhuis&lt;br /&gt;
* Glenn Maynard&lt;br /&gt;
* John Tamplin&lt;br /&gt;
* Kenneth Russell (Google, Inc)&lt;br /&gt;
* Robert Mustacchi&lt;br /&gt;
* Ryan Dahl&lt;br /&gt;
* Anne van Kesteren&lt;br /&gt;
&lt;br /&gt;
== Appendix ==&lt;br /&gt;
&lt;br /&gt;
A &amp;quot;shim&amp;quot; implementation in JavaScript (that may not fully match the current version of the spec) plus some initial unit tests can be found at:&lt;br /&gt;
&lt;br /&gt;
:http://code.google.com/p/stringencoding/&lt;br /&gt;
&lt;br /&gt;
[[Category:Proposals]]&lt;/div&gt;</summary>
		<author><name>Jsbell</name></author>
	</entry>
	<entry>
		<id>https://wiki.whatwg.org/index.php?title=StringEncoding&amp;diff=8356</id>
		<title>StringEncoding</title>
		<link rel="alternate" type="text/html" href="https://wiki.whatwg.org/index.php?title=StringEncoding&amp;diff=8356"/>
		<updated>2012-06-27T17:36:44Z</updated>

		<summary type="html">&lt;p&gt;Jsbell: /* Open Issues */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;Proposed Text Encoding Web API for Typed Arrays&lt;br /&gt;
&lt;br /&gt;
== Editors ==&lt;br /&gt;
&lt;br /&gt;
* Joshua Bell (Google, Inc)&lt;br /&gt;
&lt;br /&gt;
== Abstract ==&lt;br /&gt;
&lt;br /&gt;
This specification defines an API for encoding strings to binary data, and decoding strings from binary data.&lt;br /&gt;
&lt;br /&gt;
NOTE: This specification intentionally does not address the opposite scenario of encoding binary data as strings and decoding binary data from strings, for example using Base64 encoding.&lt;br /&gt;
&lt;br /&gt;
Discussion on this topic has so far taken place on the [http://www.khronos.org/webgl/public-mailing-list/ public_webgl@khronos.org] mailing list. See http://www.khronos.org/webgl/public-mailing-list/archives/1111/msg00017.html for the initial discussion thread.&lt;br /&gt;
&lt;br /&gt;
Discussion has since moved to the [http://www.whatwg.org/mailing-list#specs WHATWG spec discussion] mailing list. See  http://lists.whatwg.org/htdig.cgi/whatwg-whatwg.org/2012-March/035038.html for the latest discussion thread.&lt;br /&gt;
&lt;br /&gt;
== Open Issues ==&lt;br /&gt;
&lt;br /&gt;
* Encoding errors&lt;br /&gt;
** Define options dict for &amp;lt;code&amp;gt;TextEncoder()&amp;lt;/code&amp;gt; that allows selecting between throw vs. replacement character / replacement callback?&lt;br /&gt;
* The [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding specification] defines the byte order mark as more authoritative than an explicit encoding label when decoding content. &lt;br /&gt;
** Is that desirable here? &lt;br /&gt;
** What does it mean to e.g. write &amp;lt;code&amp;gt;TextDecoder(&#039;iso-2022-kr&#039;).decode(str)&amp;lt;/code&amp;gt; if string turns out to be UTF-8?&lt;br /&gt;
** What about streaming and other re-uses of the same decoder object?&lt;br /&gt;
** At the very least, a matching BOM should be ignored (i.e. not returned as part of the output) and a mismatching BOM should signal an error.&lt;br /&gt;
&lt;br /&gt;
== Notes and TODOs ==&lt;br /&gt;
&lt;br /&gt;
* Streaming decode/encode requires keeping buffers between calls.&lt;br /&gt;
** Some encode/decode algorithms require adjusting the &amp;lt;var&amp;gt;code point pointer&amp;lt;/var&amp;gt; or &amp;lt;var&amp;gt;byte pointer&amp;lt;/var&amp;gt; by a negative amount. This could occur across &amp;quot;chunk&amp;quot; boundaries. This implies that when the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag is set on an encoder/decoder, the last N elements of the stream are saved for the next call and used as a prefix for the stream. N is defined by the specific encoding algorithm.&lt;br /&gt;
* Move the algorithm to &amp;lt;b&amp;gt;&amp;lt;i&amp;gt;convert a sequence of Unicode characters to a DOMString&amp;lt;/i&amp;gt;&amp;lt;/b&amp;gt; into [http://dev.w3.org/2006/webapi/WebIDL/#idl-DOMString WebIDL].&lt;br /&gt;
&lt;br /&gt;
== Resolved Issues ==&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dl&amp;gt;&lt;br /&gt;
&amp;lt;dt&amp;gt;Encode as many characters as possible into a fixed-size buffer for transmission, and repeat starting with next unencoded character&lt;br /&gt;
&amp;lt;dd&amp;gt;Resolution: not for &amp;quot;v1&amp;quot; - can be implemented using this API, and breaking the string down by &amp;quot;character&amp;quot; is unlikely to be as obvious as it sounds - surrogate pairs, combining sequences, etc&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dt&amp;gt;Support two versions of encode; one which takes target buffer, and one which creates/returns a right-sized buffer&lt;br /&gt;
&amp;lt;dd&amp;gt;Resolution: See above, and wait for developer feedback.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dt&amp;gt;Allow arbitrary end byte sequences (e.g. &amp;lt;code&amp;gt;0xFF&amp;lt;/code&amp;gt; for UTF-8 strings)&lt;br /&gt;
&amp;lt;dd&amp;gt;Resolution: Add &amp;lt;code&amp;gt;indexOf&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;ArrayBufferView&amp;lt;/code&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dt&amp;gt;Remove &#039;&#039;binary&#039;&#039; encoding? &#039;&#039;(a proposed 8-bit-clean encoding for interop with legacy binary data stored in ECMAScript strings)&#039;&#039;&lt;br /&gt;
&amp;lt;dd&amp;gt;The only real use is with &amp;lt;code&amp;gt;atob()&amp;lt;/code&amp;gt;/&amp;lt;code&amp;gt;btoa()&amp;lt;/code&amp;gt;; a better API would be Base64 directly in/out of Typed Arrays. Consensus on WHATWG is to add better APIs e.g. &lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
partial interface ArrayBufferView {&lt;br /&gt;
    DOMString toBase64();&lt;br /&gt;
};&lt;br /&gt;
&lt;br /&gt;
partial interface ArrayBuffer {&lt;br /&gt;
    static ArrayBuffer fromBase64(DOMString string);&lt;br /&gt;
};&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dt&amp;gt;Should this be a standalone API (as written), or live on e.g. DataView, or on e.g. String?&lt;br /&gt;
&amp;lt;dd&amp;gt;There seems to be pretty strong consensus that streaming, stateful coding is high priority and that an object-oriented API is cleanest, and shoe-horning those onto existing objects would be messy.&lt;br /&gt;
&amp;lt;/dl&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== API ==&lt;br /&gt;
&lt;br /&gt;
=== TextEncoder ===&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;WebIDL&#039;&#039;&#039;&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
dictionary TextEncodeOptions {&lt;br /&gt;
  boolean stream = false;&lt;br /&gt;
};&lt;br /&gt;
&lt;br /&gt;
[Constructor,&lt;br /&gt;
 Constructor(DOMString encoding)]&lt;br /&gt;
interface TextEncoder {&lt;br /&gt;
  readonly attribute DOMString encoding;&lt;br /&gt;
  ArrayBufferView encode(DOMString? string, optional TextEncodeOptions options);&lt;br /&gt;
};&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The constructor runs the following steps:&lt;br /&gt;
# If the constructor is called with no arguments, let &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt; be &amp;quot;&amp;lt;code&amp;gt;utf-8&amp;lt;/code&amp;gt;&amp;quot;. Otherwise, let &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt; be the value of the &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; argument.&lt;br /&gt;
# Run the &#039;&#039;&#039;steps to get an encoding&#039;&#039;&#039; from [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding], with &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt; as &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt;.&lt;br /&gt;
#* If the steps result in failure, throw an &amp;quot;&amp;lt;code&amp;gt;EncodingError&amp;lt;/code&amp;gt;&amp;quot; exception and terminate these steps.&lt;br /&gt;
#* Otherwise, set the encoder object&#039;s internal &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; property to the returned encoding.&lt;br /&gt;
# Initialize the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag of the encoder object to false. &lt;br /&gt;
# Initialize the internal &amp;lt;var&amp;gt;encoding algorithm state&amp;lt;/var&amp;gt; to the default values for the encoding &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dl&amp;gt;&lt;br /&gt;
&amp;lt;dt&amp;gt;&amp;lt;code&amp;gt;encoding&amp;lt;/code&amp;gt; of type DOMString, readonly&lt;br /&gt;
&amp;lt;dd&amp;gt;Returns the &amp;lt;var&amp;gt;Name&amp;lt;/var&amp;gt; of the encoder object&#039;s &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;, per [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding]. &lt;br /&gt;
&lt;br /&gt;
:Note that this may differ from the name of the encoding specified during the call to the constructor. For example, if the constructor is called with &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; of &amp;lt;code&amp;gt;&amp;quot;ascii&amp;quot;&amp;lt;/code&amp;gt; the &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; attribute of the &#039;&#039;encoder object&#039;&#039; would have the value &amp;lt;code&amp;gt;&amp;quot;windows-1252&amp;quot;&amp;lt;/code&amp;gt; as &amp;lt;code&amp;gt;&amp;quot;ascii&amp;quot;&amp;lt;/code&amp;gt; is a label for that encoding.&lt;br /&gt;
&amp;lt;/dl&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dl&amp;gt;&lt;br /&gt;
&amp;lt;dt&amp;gt;&amp;lt;code&amp;gt;encode&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;dd&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The &amp;lt;code&amp;gt;encode&amp;lt;/code&amp;gt; method runs these steps:&lt;br /&gt;
&lt;br /&gt;
# If the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag is not set, then reset the &amp;lt;var&amp;gt;encoding algorithm state&amp;lt;/var&amp;gt; to the default values for &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;. Otherwise, the &amp;lt;var&amp;gt;encoding algorithm state&amp;lt;/var&amp;gt; is re-used from the previous call to &amp;lt;code&amp;gt;encode&amp;lt;/code&amp;gt; on this object.&lt;br /&gt;
# If the &amp;lt;var&amp;gt;options&amp;lt;/var&amp;gt; parameter is specified and the &amp;lt;code&amp;gt;stream&amp;lt;/code&amp;gt; option is &#039;&#039;&#039;true&#039;&#039;&#039;, then the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag is set; otherwise the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag is cleared.&lt;br /&gt;
# Run the steps of the &amp;lt;var&amp;gt;encoding algorithm&amp;lt;/var&amp;gt;:&lt;br /&gt;
#* The input to the algorithm is a &amp;lt;var&amp;gt;stream of code points&amp;lt;/var&amp;gt;. The stream is composed of the Unicode code points for the Unicode characters produced by following the [http://dev.w3.org/2006/webapi/WebIDL/#dfn-obtain-unicode steps to convert a DOMString to a sequence of Unicode characters] in [http://dev.w3.org/2006/webapi/WebIDL/ WebIDL] with &amp;lt;var&amp;gt;string&amp;lt;/var&amp;gt; as the input. If &amp;lt;var&amp;gt;string&amp;lt;/var&amp;gt; is null, the &amp;lt;var&amp;gt;stream of code points&amp;lt;/var&amp;gt; is empty.&lt;br /&gt;
#* If the &amp;lt;var&amp;gt;options&amp;lt;/var&amp;gt; parameter is not specified or the &amp;lt;code&amp;gt;stream&amp;lt;/code&amp;gt; option is false, then after the final code point is yielded by the stream, the &#039;&#039;&#039;EOF code point&#039;&#039;&#039; is yielded.&lt;br /&gt;
#* The output of the algorithm is a &amp;lt;var&amp;gt;sequence of emitted bytes&amp;lt;/var&amp;gt;.&lt;br /&gt;
# Return a &amp;lt;code&amp;gt;Uint8Array&amp;lt;/code&amp;gt; object wrapping an &amp;lt;code&amp;gt;ArrayBuffer&amp;lt;/code&amp;gt; containing the &amp;lt;var&amp;gt;sequence of emitted bytes&amp;lt;/var&amp;gt; produced by the encoding algorithm.&lt;br /&gt;
&amp;lt;/dl&amp;gt;&lt;br /&gt;
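&lt;br /&gt;
As a non-normative illustration of the steps above, the following sketch encodes a string with the default encoding and prints the emitted bytes. It assumes a runtime that exposes a global &amp;lt;code&amp;gt;TextEncoder&amp;lt;/code&amp;gt; in its later-standardized form, which may support only UTF-8 and may ignore the &amp;lt;code&amp;gt;stream&amp;lt;/code&amp;gt; option described in this draft. &amp;lt;code&amp;gt;toHex&amp;lt;/code&amp;gt; is a hypothetical helper and not part of the API.&lt;br /&gt;
&lt;br /&gt;
```javascript
// Hypothetical helper: format a Uint8Array as space-separated hex bytes.
function toHex(bytes) {
  return Array.from(bytes, (b) => b.toString(16).padStart(2, "0")).join(" ");
}

// With no constructor argument the label defaults to "utf-8".
const encoder = new TextEncoder();
const emitted = encoder.encode("h\u00E9llo \u20AC");

console.log(encoder.encoding);   // utf-8
console.log(emitted.byteLength); // 10
console.log(toHex(emitted));     // 68 c3 a9 6c 6c 6f 20 e2 82 ac
```
&lt;br /&gt;
The two non-ASCII characters account for the extra bytes: U+00E9 is emitted as two bytes and U+20AC as three.&lt;br /&gt;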
&lt;br /&gt;
=== TextDecoder ===&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;WebIDL&#039;&#039;&#039;&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
dictionary TextDecoderOptions {&lt;br /&gt;
  boolean fatal = false;&lt;br /&gt;
};&lt;br /&gt;
&lt;br /&gt;
dictionary TextDecodeOptions {&lt;br /&gt;
  boolean stream = false;&lt;br /&gt;
};&lt;br /&gt;
&lt;br /&gt;
[Constructor,&lt;br /&gt;
 Constructor(optional DOMString encoding, optional TextDecoderOptions options)]&lt;br /&gt;
interface TextDecoder {&lt;br /&gt;
  readonly attribute DOMString encoding;&lt;br /&gt;
  DOMString decode(optional ArrayBufferView view, optional TextDecodeOptions options);&lt;br /&gt;
};&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The constructor runs the following steps:&lt;br /&gt;
# The constructor creates a &#039;&#039;decoder object&#039;&#039;. It has the internal properties &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;, a &amp;lt;var&amp;gt;fatal&amp;lt;/var&amp;gt; flag which is initially unset, and a &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag which is initially unset.&lt;br /&gt;
# If called without an &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; argument, let &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt; be &amp;quot;&amp;lt;code&amp;gt;utf-8&amp;lt;/code&amp;gt;&amp;quot;. Otherwise, let &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt; be the value of the &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; argument.&lt;br /&gt;
# Run the &#039;&#039;&#039;steps to get an encoding&#039;&#039;&#039; from [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding], with &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt; as &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt;. &lt;br /&gt;
## If the steps result in failure, throw an &amp;quot;&amp;lt;code&amp;gt;EncodingError&amp;lt;/code&amp;gt;&amp;quot; exception and terminate these steps.&lt;br /&gt;
## Otherwise, set the &#039;&#039;decoder object&#039;s&#039;&#039; internal &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; property to the returned encoding.&lt;br /&gt;
# If the constructor is called with an &amp;lt;var&amp;gt;options&amp;lt;/var&amp;gt; argument, and the &amp;lt;var&amp;gt;fatal&amp;lt;/var&amp;gt; property of the dictionary is set, set the internal &amp;lt;var&amp;gt;fatal&amp;lt;/var&amp;gt; flag of the &#039;&#039;decoder object&#039;&#039;.&lt;br /&gt;
# Initialize the internal &amp;lt;var&amp;gt;encoding algorithm state&amp;lt;/var&amp;gt; to the default values for the encoding &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;.&lt;br /&gt;
# Return the &#039;&#039;decoder object&#039;&#039;.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dl&amp;gt;&lt;br /&gt;
&amp;lt;dt&amp;gt;&amp;lt;code&amp;gt;encoding&amp;lt;/code&amp;gt; of type DOMString, readonly&lt;br /&gt;
&amp;lt;dd&amp;gt;&lt;br /&gt;
Returns the &amp;lt;var&amp;gt;Name&amp;lt;/var&amp;gt; of the decoder object&#039;s &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;, per [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding].&lt;br /&gt;
&lt;br /&gt;
:Note that this may differ from the name of the encoding specified during the call to the constructor. For example, if the constructor is called with &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; of &amp;lt;code&amp;gt;&amp;quot;ascii&amp;quot;&amp;lt;/code&amp;gt; the &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; attribute of the &#039;&#039;decoder object&#039;&#039; would have the value &amp;lt;code&amp;gt;&amp;quot;windows-1252&amp;quot;&amp;lt;/code&amp;gt; as &amp;lt;code&amp;gt;&amp;quot;ascii&amp;quot;&amp;lt;/code&amp;gt; is a label for that encoding.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dt&amp;gt;&amp;lt;code&amp;gt;decode&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;dd&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The &amp;lt;code&amp;gt;decode&amp;lt;/code&amp;gt; method runs the following steps:&lt;br /&gt;
&lt;br /&gt;
# If the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag of the &#039;&#039;decoder object&#039;&#039; is not set, then reset the &amp;lt;var&amp;gt;encoding algorithm state&amp;lt;/var&amp;gt; to the default values for encoding &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;. Otherwise, the &amp;lt;var&amp;gt;encoding algorithm state&amp;lt;/var&amp;gt; is re-used from the previous call to &amp;lt;code&amp;gt;decode&amp;lt;/code&amp;gt; on this object. &lt;br /&gt;
# If the &amp;lt;var&amp;gt;options&amp;lt;/var&amp;gt; parameter is specified and the &amp;lt;var&amp;gt;stream&amp;lt;/var&amp;gt; option is &#039;&#039;&#039;true&#039;&#039;&#039;, then the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag is set; otherwise the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag is cleared.&lt;br /&gt;
# Run the decoder algorithm of the &#039;&#039;decoder object&#039;s&#039;&#039; encoding:&lt;br /&gt;
#* The input to the algorithm is a &amp;lt;var&amp;gt;byte stream&amp;lt;/var&amp;gt;. The &amp;lt;var&amp;gt;byte stream&amp;lt;/var&amp;gt; is provided by the bytes in &amp;lt;var&amp;gt;view.buffer&amp;lt;/var&amp;gt; starting at offset &amp;lt;var&amp;gt;view.byteOffset&amp;lt;/var&amp;gt;. A maximum of &amp;lt;var&amp;gt;view.byteLength&amp;lt;/var&amp;gt; bytes are yielded by the stream from &amp;lt;var&amp;gt;view.buffer&amp;lt;/var&amp;gt;. If &amp;lt;var&amp;gt;view&amp;lt;/var&amp;gt; is not specified, the stream is empty.&lt;br /&gt;
#* If the &amp;lt;var&amp;gt;options&amp;lt;/var&amp;gt; parameter is not specified or the &#039;&#039;&#039;stream&#039;&#039;&#039; option is &#039;&#039;&#039;false&#039;&#039;&#039;, then after &amp;lt;var&amp;gt;view.byteLength&amp;lt;/var&amp;gt; bytes are yielded by the stream the &#039;&#039;&#039;EOF byte&#039;&#039;&#039; is yielded.&lt;br /&gt;
#* If the internal &amp;lt;var&amp;gt;fatal&amp;lt;/var&amp;gt; flag of the &#039;&#039;decoder object&#039;&#039; is set, then a &#039;&#039;&#039;decoder error&#039;&#039;&#039; causes a &amp;lt;code&amp;gt;DOMException&amp;lt;/code&amp;gt; of type &amp;lt;code&amp;gt;EncodingError&amp;lt;/code&amp;gt; to be thrown rather than emitting a fallback code point.&lt;br /&gt;
#* The output of the algorithm is a &amp;lt;var&amp;gt;sequence of emitted code points&amp;lt;/var&amp;gt;.&lt;br /&gt;
# Return a &amp;lt;code&amp;gt;DOMString&amp;lt;/code&amp;gt; by encoding the &amp;lt;var&amp;gt;sequence of emitted code points&amp;lt;/var&amp;gt; following the [[#Steps to convert a sequence of Unicode characters to a DOMString|steps to convert a sequence of Unicode characters to a DOMString]].&lt;br /&gt;
&lt;br /&gt;
:&#039;&#039;ISSUE: Need to handle BOMs. [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding] specifies this by using the caller-specified encoding as a suggestion, and consuming the BOM as a part of selecting the real encoder where BOM takes precedence. At the very least a matching BOM should be ignored. A mismatching BOM could throw, be a decoding error, or could actually switch the decoder for this stream (call or sequence of calls), possibly if-and-only-if the constructor was called without an encoding.&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/dl&amp;gt;&lt;br /&gt;
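&lt;br /&gt;
The effect of the &amp;lt;var&amp;gt;fatal&amp;lt;/var&amp;gt; flag can be sketched as follows. This is a non-normative illustration that assumes a global &amp;lt;code&amp;gt;TextDecoder&amp;lt;/code&amp;gt; in its later-standardized form. Shipped implementations may report a different exception type than the &amp;lt;code&amp;gt;EncodingError&amp;lt;/code&amp;gt; named in this draft, so the hypothetical helper &amp;lt;code&amp;gt;decodeStrict&amp;lt;/code&amp;gt; only records the exception name rather than assuming it.&lt;br /&gt;
&lt;br /&gt;
```javascript
// The byte 0xFF can never appear in well-formed UTF-8.
const invalid = new Uint8Array([0x61, 0xff, 0x62]);

// fatal unset (the default): the decoder error becomes U+FFFD.
const lenient = new TextDecoder("utf-8").decode(invalid);

// fatal set: the decoder error surfaces as an exception instead.
// Hypothetical helper that reports the outcome without rethrowing.
function decodeStrict(bytes) {
  try {
    const decoder = new TextDecoder("utf-8", { fatal: true });
    return { ok: true, text: decoder.decode(bytes) };
  } catch (err) {
    return { ok: false, error: err.name };
  }
}

console.log(lenient === "a\uFFFDb");   // true
console.log(decodeStrict(invalid).ok); // false
```
&lt;br /&gt;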
&lt;br /&gt;
== Algorithms ==&lt;br /&gt;
&lt;br /&gt;
=== Steps to convert a sequence of Unicode characters to a DOMString ===&lt;br /&gt;
&lt;br /&gt;
:TODO: Move the following to [http://dev.w3.org/2006/webapi/WebIDL/#idl-DOMString WebIDL]&lt;br /&gt;
&lt;br /&gt;
The following algorithm defines a way to &amp;lt;b&amp;gt;&amp;lt;i&amp;gt;convert a sequence of Unicode characters to a DOMString&amp;lt;/i&amp;gt;&amp;lt;/b&amp;gt;:&lt;br /&gt;
# Let &amp;lt;var&amp;gt;U&amp;lt;/var&amp;gt;&amp;lt;sub&amp;gt;0...&amp;lt;var&amp;gt;n&amp;lt;/var&amp;gt;-1&amp;lt;/sub&amp;gt; be the sequence of Unicode characters &lt;br /&gt;
# Initialize &amp;lt;var&amp;gt;i&amp;lt;/var&amp;gt; to 0&lt;br /&gt;
# Initialize &amp;lt;var&amp;gt;S&amp;lt;/var&amp;gt; to be an empty sequence of [http://dev.w3.org/2006/webapi/WebIDL/#dfn-code-unit code units]&lt;br /&gt;
# While &amp;lt;var&amp;gt;i&amp;lt;/var&amp;gt; &amp;lt; &amp;lt;var&amp;gt;n&amp;lt;/var&amp;gt;&lt;br /&gt;
## Let &amp;lt;var&amp;gt;c&amp;lt;/var&amp;gt; be the &amp;lt;b&amp;gt;code point&amp;lt;/b&amp;gt; of the Unicode character in &amp;lt;var&amp;gt;U&amp;lt;/var&amp;gt; at index &amp;lt;var&amp;gt;i&amp;lt;/var&amp;gt;&lt;br /&gt;
## If &amp;lt;var&amp;gt;c&amp;lt;/var&amp;gt; &amp;amp;ge; 2&amp;lt;sup&amp;gt;16&amp;lt;/sup&amp;gt;, then:&lt;br /&gt;
### Append to &amp;lt;var&amp;gt;S&amp;lt;/var&amp;gt; a [http://dev.w3.org/2006/webapi/WebIDL/#dfn-code-unit code unit] equal to (&amp;lt;var&amp;gt;c&amp;lt;/var&amp;gt; - 2&amp;lt;sup&amp;gt;16&amp;lt;/sup&amp;gt;) / 2&amp;lt;sup&amp;gt;10&amp;lt;/sup&amp;gt; + 0xD800, where &amp;quot;/&amp;quot; represents integer division.&lt;br /&gt;
### Append to &amp;lt;var&amp;gt;S&amp;lt;/var&amp;gt; a [http://dev.w3.org/2006/webapi/WebIDL/#dfn-code-unit code unit] equal to (&amp;lt;var&amp;gt;c&amp;lt;/var&amp;gt; - 2&amp;lt;sup&amp;gt;16&amp;lt;/sup&amp;gt;) % 2&amp;lt;sup&amp;gt;10&amp;lt;/sup&amp;gt; + 0xDC00, where &amp;quot;%&amp;quot; represents the remainder of an integer division.&lt;br /&gt;
## Otherwise, append to &amp;lt;var&amp;gt;S&amp;lt;/var&amp;gt; a [http://dev.w3.org/2006/webapi/WebIDL/#dfn-code-unit code unit] equal to &amp;lt;var&amp;gt;c&amp;lt;/var&amp;gt;.&lt;br /&gt;
## Set &amp;lt;var&amp;gt;i&amp;lt;/var&amp;gt; to &amp;lt;var&amp;gt;i&amp;lt;/var&amp;gt;+1&lt;br /&gt;
# Return the IDL [http://dev.w3.org/2006/webapi/WebIDL/#idl-DOMString DOMString] value that represents the sequence of code units &amp;lt;var&amp;gt;S&amp;lt;/var&amp;gt;.&lt;br /&gt;
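&lt;br /&gt;
The steps above can be sketched in script as follows. This is a non-normative illustration; the function name &amp;lt;code&amp;gt;codePointsToDOMString&amp;lt;/code&amp;gt; is hypothetical, and the input is assumed to be an array of valid Unicode code points.&lt;br /&gt;
&lt;br /&gt;
```javascript
// Hypothetical helper mirroring the algorithm: U is the input sequence,
// S is the sequence of 16-bit code units being built.
function codePointsToDOMString(U) {
  const S = [];
  for (const c of U) {
    if (c >= 0x10000) {
      // Supplementary code point: emit a lead and a trail surrogate.
      S.push(Math.floor((c - 0x10000) / 0x400) + 0xd800);
      S.push(((c - 0x10000) % 0x400) + 0xdc00);
    } else {
      // Code point fits in a single code unit.
      S.push(c);
    }
  }
  return S.map((unit) => String.fromCharCode(unit)).join("");
}

// U+1F600 becomes the surrogate pair 0xD83D 0xDE00.
console.log(codePointsToDOMString([0x48, 0x1f600]) === "H\uD83D\uDE00"); // true
```
&lt;br /&gt;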
&lt;br /&gt;
== Examples ==&lt;br /&gt;
&lt;br /&gt;
=== Example #1 - encoding strings ===&lt;br /&gt;
&lt;br /&gt;
The following example uses the API to encode an array of strings into an ArrayBuffer. The result is an ArrayBuffer containing the number of strings (as a Uint32), followed by the length in bytes of the first string (as a Uint32), the encoded string data (UTF-8 by default), the length of the second string (as a Uint32), its string data, and so on.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
function encodeArrayOfStrings(strings, encoding) {&lt;br /&gt;
  var encoder, encoded, len, i, bytes, view, offset;&lt;br /&gt;
&lt;br /&gt;
  encoder = new TextEncoder(encoding);&lt;br /&gt;
  encoded = [];&lt;br /&gt;
&lt;br /&gt;
  // First pass: encode each string and total the bytes needed,&lt;br /&gt;
  // starting with the Uint32 that holds the string count.&lt;br /&gt;
  len = Uint32Array.BYTES_PER_ELEMENT;&lt;br /&gt;
  for (i = 0; i &amp;lt; strings.length; i += 1) {&lt;br /&gt;
    len += Uint32Array.BYTES_PER_ELEMENT;&lt;br /&gt;
    encoded[i] = encoder.encode(strings[i]);&lt;br /&gt;
    len += encoded[i].byteLength;&lt;br /&gt;
  }&lt;br /&gt;
&lt;br /&gt;
  bytes = new Uint8Array(len);&lt;br /&gt;
  view = new DataView(bytes.buffer);&lt;br /&gt;
  offset = 0;&lt;br /&gt;
&lt;br /&gt;
  view.setUint32(offset, strings.length);&lt;br /&gt;
  offset += Uint32Array.BYTES_PER_ELEMENT;&lt;br /&gt;
  for (i = 0; i &amp;lt; encoded.length; i += 1) {&lt;br /&gt;
    len = encoded[i].byteLength;&lt;br /&gt;
    view.setUint32(offset, len);&lt;br /&gt;
    offset += Uint32Array.BYTES_PER_ELEMENT;&lt;br /&gt;
    bytes.set(encoded[i], offset);&lt;br /&gt;
    offset += len;&lt;br /&gt;
  }&lt;br /&gt;
  return bytes.buffer;&lt;br /&gt;
}&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Example #2 - decoding strings ===&lt;br /&gt;
&lt;br /&gt;
The following example decodes an ArrayBuffer containing data encoded in the format produced by the previous example back into an array of strings.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
function decodeArrayOfStrings(buffer, encoding) {&lt;br /&gt;
  var decoder, view, offset, num_strings, strings, i, len;&lt;br /&gt;
&lt;br /&gt;
  decoder = new TextDecoder(encoding);&lt;br /&gt;
  view = new DataView(buffer);&lt;br /&gt;
  offset = 0;&lt;br /&gt;
  strings = [];&lt;br /&gt;
&lt;br /&gt;
  num_strings = view.getUint32(offset);&lt;br /&gt;
  offset += Uint32Array.BYTES_PER_ELEMENT;&lt;br /&gt;
  for (i = 0; i &amp;lt; num_strings; i += 1) {&lt;br /&gt;
    len = view.getUint32(offset);&lt;br /&gt;
    offset += Uint32Array.BYTES_PER_ELEMENT;&lt;br /&gt;
    strings[i] = decoder.decode(&lt;br /&gt;
      new DataView(view.buffer, offset, len));&lt;br /&gt;
    offset += len;&lt;br /&gt;
  }&lt;br /&gt;
  return strings;&lt;br /&gt;
}&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Encodings ==&lt;br /&gt;
&lt;br /&gt;
Encodings are defined and implemented per [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding]. This implicitly includes the steps to &amp;lt;em&amp;gt;get an encoding&amp;lt;/em&amp;gt; from a string, and the logic for label matching and case-insensitivity.&lt;br /&gt;
&lt;br /&gt;
User agents MUST NOT support any encodings or labels other than those defined in [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding], and MUST support all encodings and labels defined in that specification, with the additions defined below.&lt;br /&gt;
&lt;br /&gt;
:&#039;&#039;NOTE: In [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding], &amp;quot;ascii&amp;quot; is a label for &#039;&#039;&#039;windows-1252&#039;&#039;&#039;; there is no 7-bit-or-raise-exception encoding. Applications that are required to restrict the content of decoded strings should implement validation after decoding.&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
:&#039;&#039;NOTE: Unicode normalization forms are outside the scope of this specification. No normalization is done prior to encoding or after decoding.&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
:&#039;&#039;NOTE: Handling of encoding-specific issues, e.g. over-long UTF-8 encodings, byte order marks, unmatched surrogate pairs, and so on is defined by [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding].&#039;&#039;&lt;br /&gt;
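&lt;br /&gt;
The first note above recommends validating after decoding when an application must restrict decoded strings to 7-bit content. A minimal, non-normative sketch of such a check follows. It assumes a global &amp;lt;code&amp;gt;TextDecoder&amp;lt;/code&amp;gt; in its later-standardized form; &amp;lt;code&amp;gt;decodeSevenBit&amp;lt;/code&amp;gt; and its choice of &amp;lt;code&amp;gt;RangeError&amp;lt;/code&amp;gt; are hypothetical and not part of this specification.&lt;br /&gt;
&lt;br /&gt;
```javascript
// Hypothetical helper: decode bytes, then reject anything outside 7-bit ASCII.
// The check runs on the decoded string, not on the raw bytes.
function decodeSevenBit(bytes, label) {
  const text = new TextDecoder(label).decode(bytes);
  if (!/^[\x00-\x7F]*$/.test(text)) {
    throw new RangeError("decoded text contains characters outside 7-bit ASCII");
  }
  return text;
}

console.log(decodeSevenBit(new Uint8Array([0x68, 0x69]), "utf-8")); // hi
```
&lt;br /&gt;
Passing &amp;lt;code&amp;gt;&amp;quot;ascii&amp;quot;&amp;lt;/code&amp;gt; as the label would select &#039;&#039;&#039;windows-1252&#039;&#039;&#039; per the note, so a byte such as &amp;lt;code&amp;gt;0xE9&amp;lt;/code&amp;gt; would decode without error and be rejected only by the check. That case requires a runtime that ships the legacy encodings.&lt;br /&gt;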
&lt;br /&gt;
== References ==&lt;br /&gt;
&lt;br /&gt;
* WebIDL http://dev.w3.org/2006/webapi/WebIDL&lt;br /&gt;
* Encoding http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== Acknowledgements ==&lt;br /&gt;
&lt;br /&gt;
* Alan Chaney&lt;br /&gt;
* Ben Noordhuis&lt;br /&gt;
* Glenn Maynard&lt;br /&gt;
* John Tamplin&lt;br /&gt;
* Kenneth Russell (Google, Inc)&lt;br /&gt;
* Robert Mustacchi&lt;br /&gt;
* Ryan Dahl&lt;br /&gt;
* Anne van Kesteren&lt;br /&gt;
&lt;br /&gt;
== Appendix ==&lt;br /&gt;
&lt;br /&gt;
A &amp;quot;shim&amp;quot; implementation in JavaScript (that may not fully match the current version of the spec) plus some initial unit tests can be found at:&lt;br /&gt;
&lt;br /&gt;
:http://code.google.com/p/stringencoding/&lt;br /&gt;
&lt;br /&gt;
[[Category:Proposals]]&lt;/div&gt;</summary>
		<author><name>Jsbell</name></author>
	</entry>
	<entry>
		<id>https://wiki.whatwg.org/index.php?title=StringEncoding&amp;diff=8355</id>
		<title>StringEncoding</title>
		<link rel="alternate" type="text/html" href="https://wiki.whatwg.org/index.php?title=StringEncoding&amp;diff=8355"/>
		<updated>2012-06-27T17:36:17Z</updated>

		<summary type="html">&lt;p&gt;Jsbell: /* Open Issues */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;Proposed Text Encoding Web API for Typed Arrays&lt;br /&gt;
&lt;br /&gt;
== Editors ==&lt;br /&gt;
&lt;br /&gt;
* Joshua Bell (Google, Inc)&lt;br /&gt;
&lt;br /&gt;
== Abstract ==&lt;br /&gt;
&lt;br /&gt;
This specification defines an API for encoding strings to binary data, and decoding strings from binary data.&lt;br /&gt;
&lt;br /&gt;
NOTE: This specification intentionally does not address the opposite scenario of encoding binary data as strings and decoding binary data from strings, for example using Base64 encoding.&lt;br /&gt;
&lt;br /&gt;
Discussion on this topic has so far taken place on the [http://www.khronos.org/webgl/public-mailing-list/ public_webgl@khronos.org] mailing list. See http://www.khronos.org/webgl/public-mailing-list/archives/1111/msg00017.html for the initial discussion thread.&lt;br /&gt;
&lt;br /&gt;
Discussion has since moved to the [http://www.whatwg.org/mailing-list#specs WHATWG spec discussion] mailing list. See  http://lists.whatwg.org/htdig.cgi/whatwg-whatwg.org/2012-March/035038.html for the latest discussion thread.&lt;br /&gt;
&lt;br /&gt;
== Open Issues ==&lt;br /&gt;
&lt;br /&gt;
* Encoding errors&lt;br /&gt;
** Define options dict for &amp;lt;code&amp;gt;TextEncoder()&amp;lt;/code&amp;gt; that allows selecting between throw vs. replacement character / replacement callback?&lt;br /&gt;
* The [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding specification] defines the byte order mark as more authoritative than an explicit encoding label when decoding content. &lt;br /&gt;
** Is that desirable here? &lt;br /&gt;
** What does it mean to e.g. write &amp;lt;code&amp;gt;TextDecoder(&#039;iso-2022-kr&#039;).decode(str)&amp;lt;/code&amp;gt; if the input turns out to be UTF-8?&lt;br /&gt;
** What about streaming and other re-uses of the same decoder object?&lt;br /&gt;
** At the very least, a matching BOM should be ignored (i.e. not returned as part of the output) and a mismatching BOM should signal an error.&lt;br /&gt;
* Streaming decode/encode requires keeping buffers between calls.&lt;br /&gt;
** Some encode/decode algorithms require adjusting the &amp;lt;var&amp;gt;code point pointer&amp;lt;/var&amp;gt; or &amp;lt;var&amp;gt;byte pointer&amp;lt;/var&amp;gt; by a negative amount. This could occur across &amp;quot;chunk&amp;quot; boundaries. This implies that when the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag is set on an encoder/decoder that the last N elements of the stream are saved for the next call and used as a prefix for the stream. N is defined by the specific encoding algorithm.&lt;br /&gt;
** &#039;&#039;This is more of a note to implementers than an issue, but it may merit further discussion.&#039;&#039;&lt;br /&gt;
* Move the algorithm to &amp;lt;b&amp;gt;&amp;lt;i&amp;gt;convert a sequence of Unicode characters to a DOMString&amp;lt;/i&amp;gt;&amp;lt;/b&amp;gt; into [http://dev.w3.org/2006/webapi/WebIDL/#idl-DOMString WebIDL].&lt;br /&gt;
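&lt;br /&gt;
The need to carry state across chunk boundaries, raised in the streaming issue above, can be observed with a multi-byte sequence split between two calls. The following non-normative sketch assumes a global &amp;lt;code&amp;gt;TextDecoder&amp;lt;/code&amp;gt; in its later-standardized form; &amp;lt;code&amp;gt;decodeChunks&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;decodeIndependently&amp;lt;/code&amp;gt; are hypothetical helpers.&lt;br /&gt;
&lt;br /&gt;
```javascript
// Hypothetical helper: feed every chunk to one decoder with stream set,
// then make a final call without it so any pending state is flushed.
function decodeChunks(chunks) {
  const decoder = new TextDecoder("utf-8");
  let text = "";
  for (const chunk of chunks) {
    text += decoder.decode(chunk, { stream: true });
  }
  return text + decoder.decode();
}

// Hypothetical contrast: a fresh decoder per chunk keeps no state.
function decodeIndependently(chunks) {
  return chunks.map((chunk) => new TextDecoder("utf-8").decode(chunk)).join("");
}

// U+20AC is the three bytes E2 82 AC in UTF-8, split here after the first byte.
const chunks = [new Uint8Array([0xe2]), new Uint8Array([0x82, 0xac])];

console.log(decodeChunks(chunks) === "\u20AC");        // true
console.log(decodeIndependently(chunks) === "\u20AC"); // false
```
&lt;br /&gt;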
&lt;br /&gt;
== Resolved Issues ==&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dl&amp;gt;&lt;br /&gt;
&amp;lt;dt&amp;gt;Encode as many characters as possible into a fixed-size buffer for transmission, and repeat starting with next unencoded character&lt;br /&gt;
&amp;lt;dd&amp;gt;Resolution: not for &amp;quot;v1&amp;quot; - can be implemented using this API, and breaking the string down by &amp;quot;character&amp;quot; is unlikely to be as obvious as it sounds - surrogate pairs, combining sequences, etc&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dt&amp;gt;Support two versions of encode; one which takes target buffer, and one which creates/returns a right-sized buffer&lt;br /&gt;
&amp;lt;dd&amp;gt;Resolution: See above, and wait for developer feedback.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dt&amp;gt;Allow arbitrary end byte sequences (e.g. &amp;lt;code&amp;gt;0xFF&amp;lt;/code&amp;gt; for UTF-8 strings)&lt;br /&gt;
&amp;lt;dd&amp;gt;Resolution: Add &amp;lt;code&amp;gt;indexOf&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;ArrayBufferView&amp;lt;/code&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dt&amp;gt;Remove &#039;&#039;binary&#039;&#039; encoding? &#039;&#039;(a proposed 8-bit-clean encoding for interop with legacy binary data stored in ECMAScript strings)&#039;&#039;&lt;br /&gt;
&amp;lt;dd&amp;gt;The only real use is with &amp;lt;code&amp;gt;atob()&amp;lt;/code&amp;gt;/&amp;lt;code&amp;gt;btoa()&amp;lt;/code&amp;gt;; a better API would be Base64 directly in/out of Typed Arrays. Consensus on WHATWG is to add better APIs e.g. &lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
partial interface ArrayBufferView {&lt;br /&gt;
    DOMString toBase64();&lt;br /&gt;
};&lt;br /&gt;
&lt;br /&gt;
partial interface ArrayBuffer {&lt;br /&gt;
    static ArrayBuffer fromBase64(DOMString string);&lt;br /&gt;
};&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dt&amp;gt;Should this be a standalone API (as written), or live on e.g. DataView, or on e.g. String?&lt;br /&gt;
&amp;lt;dd&amp;gt;There seems to be pretty strong consensus that streaming, stateful coding is high priority and that an object-oriented API is cleanest, and shoe-horning those onto existing objects would be messy.&lt;br /&gt;
&amp;lt;/dl&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== API ==&lt;br /&gt;
&lt;br /&gt;
=== TextEncoder ===&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;WebIDL&#039;&#039;&#039;&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
dictionary TextEncodeOptions {&lt;br /&gt;
  boolean stream = false;&lt;br /&gt;
};&lt;br /&gt;
&lt;br /&gt;
[Constructor,&lt;br /&gt;
 Constructor(DOMString encoding)]&lt;br /&gt;
interface TextEncoder {&lt;br /&gt;
  readonly attribute DOMString encoding;&lt;br /&gt;
  ArrayBufferView encode(DOMString? string, optional TextEncodeOptions options);&lt;br /&gt;
};&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The constructor runs the following steps:&lt;br /&gt;
# If the constructor is called with no arguments, let &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt; be &amp;quot;&amp;lt;code&amp;gt;utf-8&amp;lt;/code&amp;gt;&amp;quot;. Otherwise, let &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt; be the value of the &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; argument.&lt;br /&gt;
# Run the &#039;&#039;&#039;steps to get an encoding&#039;&#039;&#039; from [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding], with &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt; as &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt;.&lt;br /&gt;
#* If the steps result in failure, throw an &amp;quot;&amp;lt;code&amp;gt;EncodingError&amp;lt;/code&amp;gt;&amp;quot; exception and terminate these steps.&lt;br /&gt;
#* Otherwise, set the encoder object&#039;s internal &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; property to the returned encoding.&lt;br /&gt;
# Initialize the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag of the encoder object to false. &lt;br /&gt;
# Initialize the internal &amp;lt;var&amp;gt;encoding algorithm state&amp;lt;/var&amp;gt; to the default values for the encoding &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dl&amp;gt;&lt;br /&gt;
&amp;lt;dt&amp;gt;&amp;lt;code&amp;gt;encoding&amp;lt;/code&amp;gt; of type DOMString, readonly&lt;br /&gt;
&amp;lt;dd&amp;gt;Returns the &amp;lt;var&amp;gt;Name&amp;lt;/var&amp;gt; of the encoder object&#039;s &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;, per [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding]. &lt;br /&gt;
&lt;br /&gt;
:Note that this may differ from the name of the encoding specified during the call to the constructor. For example, if the constructor is called with &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; of &amp;lt;code&amp;gt;&amp;quot;ascii&amp;quot;&amp;lt;/code&amp;gt; the &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; attribute of the &#039;&#039;encoder object&#039;&#039; would have the value &amp;lt;code&amp;gt;&amp;quot;windows-1252&amp;quot;&amp;lt;/code&amp;gt; as &amp;lt;code&amp;gt;&amp;quot;ascii&amp;quot;&amp;lt;/code&amp;gt; is a label for that encoding.&lt;br /&gt;
&amp;lt;/dl&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dl&amp;gt;&lt;br /&gt;
&amp;lt;dt&amp;gt;&amp;lt;code&amp;gt;encode&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;dd&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The &amp;lt;code&amp;gt;encode&amp;lt;/code&amp;gt; method runs these steps:&lt;br /&gt;
&lt;br /&gt;
# If the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag is not set, then reset the &amp;lt;var&amp;gt;encoding algorithm state&amp;lt;/var&amp;gt; to the default values for &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;. Otherwise, the &amp;lt;var&amp;gt;encoding algorithm state&amp;lt;/var&amp;gt; is re-used from the previous call to &amp;lt;code&amp;gt;encode&amp;lt;/code&amp;gt; on this object.&lt;br /&gt;
# If the &amp;lt;var&amp;gt;options&amp;lt;/var&amp;gt; parameter is specified and the &amp;lt;code&amp;gt;stream&amp;lt;/code&amp;gt; option is &#039;&#039;&#039;true&#039;&#039;&#039;, then the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag is set; otherwise the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag is cleared.&lt;br /&gt;
# Run the steps of the &amp;lt;var&amp;gt;encoding algorithm&amp;lt;/var&amp;gt;:&lt;br /&gt;
#* The input to the algorithm is a &amp;lt;var&amp;gt;stream of code points&amp;lt;/var&amp;gt;. The stream is composed of the Unicode code points for the Unicode characters produced by following the [http://dev.w3.org/2006/webapi/WebIDL/#dfn-obtain-unicode steps to convert a DOMString to a sequence of Unicode characters] in [http://dev.w3.org/2006/webapi/WebIDL/ WebIDL] with &amp;lt;var&amp;gt;string&amp;lt;/var&amp;gt; as the input. If &amp;lt;var&amp;gt;string&amp;lt;/var&amp;gt; is null, the &amp;lt;var&amp;gt;stream of code points&amp;lt;/var&amp;gt; is empty.&lt;br /&gt;
#* If the &amp;lt;var&amp;gt;options&amp;lt;/var&amp;gt; parameter is not specified or the &amp;lt;code&amp;gt;stream&amp;lt;/code&amp;gt; option is false, then after the final code point is yielded by the stream, the &#039;&#039;&#039;EOF code point&#039;&#039;&#039; is yielded.&lt;br /&gt;
#* The output of the algorithm is a &amp;lt;var&amp;gt;sequence of emitted bytes&amp;lt;/var&amp;gt;.&lt;br /&gt;
# Return a &amp;lt;code&amp;gt;Uint8Array&amp;lt;/code&amp;gt; object wrapping an &amp;lt;code&amp;gt;ArrayBuffer&amp;lt;/code&amp;gt; containing the &amp;lt;var&amp;gt;sequence of emitted bytes&amp;lt;/var&amp;gt; produced by the encoding algorithm.&lt;br /&gt;
&amp;lt;/dl&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== TextDecoder ===&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;WebIDL&#039;&#039;&#039;&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
dictionary TextDecoderOptions {&lt;br /&gt;
  boolean fatal = false;&lt;br /&gt;
};&lt;br /&gt;
&lt;br /&gt;
dictionary TextDecodeOptions {&lt;br /&gt;
  boolean stream = false;&lt;br /&gt;
};&lt;br /&gt;
&lt;br /&gt;
[Constructor,&lt;br /&gt;
 Constructor(optional DOMString encoding, optional TextDecoderOptions options)]&lt;br /&gt;
interface TextDecoder {&lt;br /&gt;
  readonly attribute DOMString encoding;&lt;br /&gt;
  DOMString decode(optional ArrayBufferView view, optional TextDecodeOptions options);&lt;br /&gt;
};&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The constructor runs the following steps:&lt;br /&gt;
# The constructor creates a &#039;&#039;decoder object&#039;&#039;. It has the internal properties &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;, a &amp;lt;var&amp;gt;fatal&amp;lt;/var&amp;gt; flag which is initially unset, and a &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag which is initially unset.&lt;br /&gt;
# If called without an &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; argument, let &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt; be &amp;quot;&amp;lt;code&amp;gt;utf-8&amp;lt;/code&amp;gt;&amp;quot;. Otherwise, let &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt; be the value of the &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; argument.&lt;br /&gt;
# Run the &#039;&#039;&#039;steps to get an encoding&#039;&#039;&#039; from [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding], with &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt; as &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt;. &lt;br /&gt;
## If the steps result in failure, throw an &amp;quot;&amp;lt;code&amp;gt;EncodingError&amp;lt;/code&amp;gt;&amp;quot; exception and terminate these steps.&lt;br /&gt;
## Otherwise, set the &#039;&#039;decoder object&#039;s&#039;&#039; internal &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; property to the returned encoding.&lt;br /&gt;
# If the constructor is called with an &amp;lt;var&amp;gt;options&amp;lt;/var&amp;gt; argument, and the &amp;lt;var&amp;gt;fatal&amp;lt;/var&amp;gt; property of the dictionary is set, set the internal &amp;lt;var&amp;gt;fatal&amp;lt;/var&amp;gt; flag of the &#039;&#039;decoder object&#039;&#039;.&lt;br /&gt;
# Initialize the internal &amp;lt;var&amp;gt;encoding algorithm state&amp;lt;/var&amp;gt; to the default values for the encoding &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;.&lt;br /&gt;
# Return the &#039;&#039;decoder object&#039;&#039;.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dl&amp;gt;&lt;br /&gt;
&amp;lt;dt&amp;gt;&amp;lt;code&amp;gt;encoding&amp;lt;/code&amp;gt; of type DOMString, readonly&lt;br /&gt;
&amp;lt;dd&amp;gt;&lt;br /&gt;
Returns the &amp;lt;var&amp;gt;Name&amp;lt;/var&amp;gt; of the decoder object&#039;s &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;, per [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding].&lt;br /&gt;
&lt;br /&gt;
:Note that this may differ from the name of the encoding specified during the call to the constructor. For example, if the constructor is called with &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; of &amp;lt;code&amp;gt;&amp;quot;ascii&amp;quot;&amp;lt;/code&amp;gt; the &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; attribute of the &#039;&#039;decoder object&#039;&#039; would have the value &amp;lt;code&amp;gt;&amp;quot;windows-1252&amp;quot;&amp;lt;/code&amp;gt; as &amp;lt;code&amp;gt;&amp;quot;ascii&amp;quot;&amp;lt;/code&amp;gt; is a label for that encoding.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dt&amp;gt;&amp;lt;code&amp;gt;decode&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;dd&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The &amp;lt;code&amp;gt;decode&amp;lt;/code&amp;gt; method runs the following steps:&lt;br /&gt;
&lt;br /&gt;
# If the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag of the &#039;&#039;decoder object&#039;&#039; is not set, then reset the &amp;lt;var&amp;gt;encoding algorithm state&amp;lt;/var&amp;gt; to the default values for encoding &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;. Otherwise, the &amp;lt;var&amp;gt;encoding algorithm state&amp;lt;/var&amp;gt; is re-used from the previous call to &amp;lt;code&amp;gt;decode&amp;lt;/code&amp;gt; on this object. &lt;br /&gt;
# If the &amp;lt;var&amp;gt;options&amp;lt;/var&amp;gt; parameter is specified and the &amp;lt;var&amp;gt;stream&amp;lt;/var&amp;gt; option is &#039;&#039;&#039;true&#039;&#039;&#039;, then the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag is set; otherwise the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag is cleared.&lt;br /&gt;
# Run the decoder algorithm of the &#039;&#039;decoder object&#039;s&#039;&#039; encoding:&lt;br /&gt;
#* The input to the algorithm is a &amp;lt;var&amp;gt;byte stream&amp;lt;/var&amp;gt;. The &amp;lt;var&amp;gt;byte stream&amp;lt;/var&amp;gt; is provided by the bytes in &amp;lt;var&amp;gt;view.buffer&amp;lt;/var&amp;gt; starting at offset &amp;lt;var&amp;gt;view.byteOffset&amp;lt;/var&amp;gt;. A maximum of &amp;lt;var&amp;gt;view.byteLength&amp;lt;/var&amp;gt; bytes are yielded by the stream from &amp;lt;var&amp;gt;view.buffer&amp;lt;/var&amp;gt;. If &amp;lt;var&amp;gt;view&amp;lt;/var&amp;gt; is not specified, the stream is empty.&lt;br /&gt;
#* If the &amp;lt;var&amp;gt;options&amp;lt;/var&amp;gt; parameter is not specified or the &#039;&#039;&#039;stream&#039;&#039;&#039; option is &#039;&#039;&#039;false&#039;&#039;&#039;, then after &amp;lt;var&amp;gt;view.byteLength&amp;lt;/var&amp;gt; bytes are yielded by the stream the &#039;&#039;&#039;EOF byte&#039;&#039;&#039; is yielded.&lt;br /&gt;
#* If the internal &amp;lt;var&amp;gt;fatal&amp;lt;/var&amp;gt; flag of the &#039;&#039;decoder object&#039;&#039; is set, then a &#039;&#039;&#039;decoder error&#039;&#039;&#039; causes a &amp;lt;code&amp;gt;DOMException&amp;lt;/code&amp;gt; of type &amp;lt;code&amp;gt;EncodingError&amp;lt;/code&amp;gt; to be thrown rather than emitting a fallback code point.&lt;br /&gt;
#* The output of the algorithm is a sequence of &amp;lt;var&amp;gt;emitted code points&amp;lt;/var&amp;gt;.&lt;br /&gt;
# Return a &amp;lt;code&amp;gt;DOMString&amp;lt;/code&amp;gt; by encoding the &amp;lt;var&amp;gt;sequence of emitted code points&amp;lt;/var&amp;gt; following the [[#Steps to convert a sequence of Unicode characters to a DOMString|steps to convert a sequence of Unicode characters to a DOMString]].&lt;br /&gt;
&lt;br /&gt;
:&#039;&#039;ISSUE: Need to handle BOMs. [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding] specifies this by using the caller-specified encoding as a suggestion, and consuming the BOM as a part of selecting the real encoder where BOM takes precedence. At the very least a matching BOM should be ignored. A mismatching BOM could throw, be a decoding error, or could actually switch the decoder for this stream (call or sequence of calls), possibly if-and-only-if the constructor was called without an encoding.&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/dl&amp;gt;&lt;br /&gt;
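&lt;br /&gt;
The effect of the &amp;lt;code&amp;gt;stream&amp;lt;/code&amp;gt; option described above can be illustrated with a minimal sketch. It assumes a runtime that exposes a global &amp;lt;code&amp;gt;TextDecoder&amp;lt;/code&amp;gt; whose behavior matches the steps in this section; the variable names are illustrative only.&lt;br /&gt;

```javascript
// Sketch only: assumes a global TextDecoder that follows the steps above.
// U+20AC (euro sign) is encoded in UTF-8 as the three bytes E2 82 AC.
const euro = new Uint8Array([0xe2, 0x82, 0xac]);

const decoder = new TextDecoder("utf-8");

// The first chunk stops in the middle of the character. Because the stream
// option is true, the decoder keeps its state and emits nothing yet.
const first = decoder.decode(euro.subarray(0, 2), { stream: true });

// The final call omits the stream option, so the remaining byte completes
// the character and the end of the stream is signalled.
const second = decoder.decode(euro.subarray(2));

console.log(JSON.stringify(first), JSON.stringify(second));
```

Without the &amp;lt;code&amp;gt;stream&amp;lt;/code&amp;gt; option on the first call, the truncated sequence would be treated as a decoder error at the end of that call.&lt;br /&gt;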
&lt;br /&gt;
== Algorithms ==&lt;br /&gt;
&lt;br /&gt;
=== Steps to convert a sequence of Unicode characters to a DOMString ===&lt;br /&gt;
&lt;br /&gt;
:TODO: Move the following to [http://dev.w3.org/2006/webapi/WebIDL/#idl-DOMString WebIDL]&lt;br /&gt;
&lt;br /&gt;
The following algorithm defines a way to &amp;lt;b&amp;gt;&amp;lt;i&amp;gt;convert a sequence of Unicode characters to a DOMString&amp;lt;/i&amp;gt;&amp;lt;/b&amp;gt;:&lt;br /&gt;
# Let &amp;lt;var&amp;gt;U&amp;lt;/var&amp;gt;&amp;lt;sub&amp;gt;0...&amp;lt;var&amp;gt;n&amp;lt;/var&amp;gt;-1&amp;lt;/sub&amp;gt; be the sequence of Unicode characters &lt;br /&gt;
# Initialize &amp;lt;var&amp;gt;i&amp;lt;/var&amp;gt; to 0&lt;br /&gt;
# Initialize &amp;lt;var&amp;gt;S&amp;lt;/var&amp;gt; to be an empty sequence of [http://dev.w3.org/2006/webapi/WebIDL/#dfn-code-unit code units]&lt;br /&gt;
# While &amp;lt;var&amp;gt;i&amp;lt;/var&amp;gt; &amp;lt; &amp;lt;var&amp;gt;n&amp;lt;/var&amp;gt;&lt;br /&gt;
## Let &amp;lt;var&amp;gt;c&amp;lt;/var&amp;gt; be the &amp;lt;b&amp;gt;code point&amp;lt;/b&amp;gt; of the Unicode character in &amp;lt;var&amp;gt;U&amp;lt;/var&amp;gt; at index &amp;lt;var&amp;gt;i&amp;lt;/var&amp;gt;&lt;br /&gt;
## If &amp;lt;var&amp;gt;c&amp;lt;/var&amp;gt; &amp;amp;ge; 2&amp;lt;sup&amp;gt;16&amp;lt;/sup&amp;gt;, then:&lt;br /&gt;
### Append to &amp;lt;var&amp;gt;S&amp;lt;/var&amp;gt; a [http://dev.w3.org/2006/webapi/WebIDL/#dfn-code-unit code unit] equal to (&amp;lt;var&amp;gt;c&amp;lt;/var&amp;gt; - 2&amp;lt;sup&amp;gt;16&amp;lt;/sup&amp;gt;) / 2&amp;lt;sup&amp;gt;10&amp;lt;/sup&amp;gt; + 0xD800, where &amp;quot;/&amp;quot; represents integer division.&lt;br /&gt;
### Append to &amp;lt;var&amp;gt;S&amp;lt;/var&amp;gt; a [http://dev.w3.org/2006/webapi/WebIDL/#dfn-code-unit code unit] equal to (&amp;lt;var&amp;gt;c&amp;lt;/var&amp;gt; - 2&amp;lt;sup&amp;gt;16&amp;lt;/sup&amp;gt;) % 2&amp;lt;sup&amp;gt;10&amp;lt;/sup&amp;gt; + 0xDC00, where &amp;quot;%&amp;quot; represents the remainder of an integer division.&lt;br /&gt;
## Otherwise, append to &amp;lt;var&amp;gt;S&amp;lt;/var&amp;gt; a [http://dev.w3.org/2006/webapi/WebIDL/#dfn-code-unit code unit] equal to &amp;lt;var&amp;gt;c&amp;lt;/var&amp;gt;.&lt;br /&gt;
## Set &amp;lt;var&amp;gt;i&amp;lt;/var&amp;gt; to &amp;lt;var&amp;gt;i&amp;lt;/var&amp;gt;+1&lt;br /&gt;
# Return the IDL [http://dev.w3.org/2006/webapi/WebIDL/#idl-DOMString DOMString] value that represents the sequence of code units &amp;lt;var&amp;gt;S&amp;lt;/var&amp;gt;.&lt;br /&gt;
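&lt;br /&gt;
The steps above can be sketched in JavaScript as follows. This is an illustration rather than normative text: the function name &amp;lt;code&amp;gt;codePointsToString&amp;lt;/code&amp;gt; is hypothetical, and the input is assumed to be an array of valid code points.&lt;br /&gt;

```javascript
// Hypothetical helper: a direct transcription of the steps above.
// Assumes every element of codePoints is a valid Unicode code point.
function codePointsToString(codePoints) {
  const units = [];
  for (const c of codePoints) {
    if (c >= 0x10000) {
      // Code point of 2^16 or above: emit a surrogate pair.
      const offset = c - 0x10000;
      units.push(Math.floor(offset / 0x400) + 0xd800);
      units.push((offset % 0x400) + 0xdc00);
    } else {
      // Code point below 2^16: the code unit equals the code point.
      units.push(c);
    }
  }
  // Spreading is fine for short inputs; very long inputs would need chunking.
  return String.fromCharCode(...units);
}

// U+0048 needs one code unit, U+1F600 needs two.
const s = codePointsToString([0x48, 0x1f600]);
console.log(s.length); // 3
```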
&lt;br /&gt;
== Examples ==&lt;br /&gt;
&lt;br /&gt;
=== Example #1 - encoding strings ===&lt;br /&gt;
&lt;br /&gt;
The following example uses the API to encode an array of strings into an ArrayBuffer. The buffer contains the number of strings (as a Uint32), followed by the byte length of the first encoded string (as a Uint32), the encoded string data, the byte length of the second encoded string (as a Uint32), the string data, and so on.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
function encodeArrayOfStrings(strings, encoding) {&lt;br /&gt;
  var encoder, encoded, len, i, bytes, view, offset;&lt;br /&gt;
&lt;br /&gt;
  encoder = new TextEncoder(encoding);&lt;br /&gt;
  encoded = [];&lt;br /&gt;
&lt;br /&gt;
  len = Uint32Array.BYTES_PER_ELEMENT;&lt;br /&gt;
  for (i = 0; i &amp;lt; strings.length; i += 1) {&lt;br /&gt;
    len += Uint32Array.BYTES_PER_ELEMENT;&lt;br /&gt;
    encoded[i] = encoder.encode(strings[i]);&lt;br /&gt;
    len += encoded[i].byteLength;&lt;br /&gt;
  }&lt;br /&gt;
&lt;br /&gt;
  bytes = new Uint8Array(len);&lt;br /&gt;
  view = new DataView(bytes.buffer);&lt;br /&gt;
  offset = 0;&lt;br /&gt;
&lt;br /&gt;
  view.setUint32(offset, strings.length);&lt;br /&gt;
  offset += Uint32Array.BYTES_PER_ELEMENT;&lt;br /&gt;
  for (i = 0; i &amp;lt; encoded.length; i += 1) {&lt;br /&gt;
    len = encoded[i].byteLength;&lt;br /&gt;
    view.setUint32(offset, len);&lt;br /&gt;
    offset += Uint32Array.BYTES_PER_ELEMENT;&lt;br /&gt;
    bytes.set(encoded[i], offset);&lt;br /&gt;
    offset += len;&lt;br /&gt;
  }&lt;br /&gt;
  return bytes.buffer;&lt;br /&gt;
}&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Example #2 - decoding strings ===&lt;br /&gt;
&lt;br /&gt;
The following example decodes an ArrayBuffer containing data encoded in the format produced by the previous example back into an array of strings.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
function decodeArrayOfStrings(buffer, encoding) {&lt;br /&gt;
  var decoder, view, offset, num_strings, strings, i, len;&lt;br /&gt;
&lt;br /&gt;
  decoder = new TextDecoder(encoding);&lt;br /&gt;
  view = new DataView(buffer);&lt;br /&gt;
  offset = 0;&lt;br /&gt;
  strings = [];&lt;br /&gt;
&lt;br /&gt;
  num_strings = view.getUint32(offset);&lt;br /&gt;
  offset += Uint32Array.BYTES_PER_ELEMENT;&lt;br /&gt;
  for (i = 0; i &amp;lt; num_strings; i += 1) {&lt;br /&gt;
    len = view.getUint32(offset);&lt;br /&gt;
    offset += Uint32Array.BYTES_PER_ELEMENT;&lt;br /&gt;
    strings[i] = decoder.decode(&lt;br /&gt;
      new DataView(view.buffer, offset, len));&lt;br /&gt;
    offset += len;&lt;br /&gt;
  }&lt;br /&gt;
  return strings;&lt;br /&gt;
}&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Encodings ==&lt;br /&gt;
&lt;br /&gt;
Encodings are defined and implemented per [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding]. This implicitly includes the steps to &amp;lt;em&amp;gt;get an encoding&amp;lt;/em&amp;gt; from a string, and the logic for label matching and case-insensitivity.&lt;br /&gt;
&lt;br /&gt;
User agents MUST NOT support any other encodings or labels than those defined in [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding], and MUST support all encodings and labels defined in that specification, with the additions defined below.&lt;br /&gt;
&lt;br /&gt;
:&#039;&#039;NOTE: In [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding], &amp;quot;ascii&amp;quot; is a label for &#039;&#039;&#039;windows-1252&#039;&#039;&#039;; there is no 7-bit-or-raise-exception encoding. Applications that are required to restrict the content of decoded strings should implement validation after decoding.&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
:&#039;&#039;NOTE: Unicode normalization forms are outside the scope of this specification. No normalization is done prior to encoding or after decoding.&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
:&#039;&#039;NOTE: Handling of encoding-specific issues, e.g. over-long UTF-8 encodings, byte order marks, unmatched surrogate pairs, and so on is defined by [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding].&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
== References ==&lt;br /&gt;
&lt;br /&gt;
* WebIDL http://dev.w3.org/2006/webapi/WebIDL&lt;br /&gt;
* Encoding http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== Acknowledgements ==&lt;br /&gt;
&lt;br /&gt;
* Alan Chaney&lt;br /&gt;
* Ben Noordhuis&lt;br /&gt;
* Glenn Maynard&lt;br /&gt;
* John Tamplin&lt;br /&gt;
* Kenneth Russell (Google, Inc)&lt;br /&gt;
* Robert Mustacchi&lt;br /&gt;
* Ryan Dahl&lt;br /&gt;
* Anne van Kesteren&lt;br /&gt;
&lt;br /&gt;
== Appendix ==&lt;br /&gt;
&lt;br /&gt;
A &amp;quot;shim&amp;quot; implementation in JavaScript (that may not fully match the current version of the spec) plus some initial unit tests can be found at:&lt;br /&gt;
&lt;br /&gt;
:http://code.google.com/p/stringencoding/&lt;br /&gt;
&lt;br /&gt;
[[Category:Proposals]]&lt;/div&gt;</summary>
		<author><name>Jsbell</name></author>
	</entry>
	<entry>
		<id>https://wiki.whatwg.org/index.php?title=StringEncoding&amp;diff=8354</id>
		<title>StringEncoding</title>
		<link rel="alternate" type="text/html" href="https://wiki.whatwg.org/index.php?title=StringEncoding&amp;diff=8354"/>
		<updated>2012-06-27T17:34:39Z</updated>

		<summary type="html">&lt;p&gt;Jsbell: Correct links to algorithms&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;Proposed Text Encoding Web API for Typed Arrays&lt;br /&gt;
&lt;br /&gt;
== Editors ==&lt;br /&gt;
&lt;br /&gt;
* Joshua Bell (Google, Inc)&lt;br /&gt;
&lt;br /&gt;
== Abstract ==&lt;br /&gt;
&lt;br /&gt;
This specification defines an API for encoding strings to binary data, and decoding strings from binary data.&lt;br /&gt;
&lt;br /&gt;
NOTE: This specification intentionally does not address the opposite scenario of encoding binary data as strings and decoding binary data from strings, for example using Base64 encoding.&lt;br /&gt;
&lt;br /&gt;
Discussion on this topic has so far taken place on the [http://www.khronos.org/webgl/public-mailing-list/ public_webgl@khronos.org] mailing list. See http://www.khronos.org/webgl/public-mailing-list/archives/1111/msg00017.html for the initial discussion thread.&lt;br /&gt;
&lt;br /&gt;
Discussion has since moved to the [http://www.whatwg.org/mailing-list#specs WHATWG spec discussion] mailing list. See http://lists.whatwg.org/htdig.cgi/whatwg-whatwg.org/2012-March/035038.html for the latest discussion thread.&lt;br /&gt;
&lt;br /&gt;
== Open Issues ==&lt;br /&gt;
&lt;br /&gt;
* Encoding errors&lt;br /&gt;
** Define options dict for &amp;lt;code&amp;gt;TextEncoder()&amp;lt;/code&amp;gt; that allows selecting between throw vs. replacement character / replacement callback?&lt;br /&gt;
* The [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding specification] defines the byte order mark as more authoritative than an explicit encoding label when decoding content. &lt;br /&gt;
** Is that desirable here? &lt;br /&gt;
** What does it mean to e.g. write &amp;lt;code&amp;gt;TextDecoder(&#039;iso-2022-kr&#039;).decode(str)&amp;lt;/code&amp;gt; if the data turns out to be UTF-8?&lt;br /&gt;
** What about streaming and other re-uses of the same decoder object?&lt;br /&gt;
* Streaming decode/encode requires keeping buffers between calls.&lt;br /&gt;
** Some encode/decode algorithms require adjusting the &amp;lt;var&amp;gt;code point pointer&amp;lt;/var&amp;gt; or &amp;lt;var&amp;gt;byte pointer&amp;lt;/var&amp;gt; by a negative amount. This could occur across &amp;quot;chunk&amp;quot; boundaries. This implies that when the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag is set on an encoder/decoder that the last N elements of the stream are saved for the next call and used as a prefix for the stream. N is defined by the specific encoding algorithm.&lt;br /&gt;
** &#039;&#039;This is more of a note to implementers than an issue, but it may merit further discussion.&#039;&#039;&lt;br /&gt;
* Move the algorithm to &amp;lt;b&amp;gt;&amp;lt;i&amp;gt;convert a sequence of Unicode characters to a DOMString&amp;lt;/i&amp;gt;&amp;lt;/b&amp;gt; into [http://dev.w3.org/2006/webapi/WebIDL/#idl-DOMString WebIDL].&lt;br /&gt;
&lt;br /&gt;
== Resolved Issues ==&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dl&amp;gt;&lt;br /&gt;
&amp;lt;dt&amp;gt;Encode as many characters as possible into a fixed-size buffer for transmission, and repeat starting with next unencoded character&lt;br /&gt;
&amp;lt;dd&amp;gt;Resolution: not for &amp;quot;v1&amp;quot; - can be implemented using this API, and breaking the string down by &amp;quot;character&amp;quot; is unlikely to be as obvious as it sounds - surrogate pairs, combining sequences, etc&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dt&amp;gt;Support two versions of encode; one which takes target buffer, and one which creates/returns a right-sized buffer&lt;br /&gt;
&amp;lt;dd&amp;gt;Resolution: See above, and wait for developer feedback.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dt&amp;gt;Allow arbitrary end byte sequences (e.g. &amp;lt;code&amp;gt;0xFF&amp;lt;/code&amp;gt; for UTF-8 strings)&lt;br /&gt;
&amp;lt;dd&amp;gt;Resolution: Add &amp;lt;code&amp;gt;indexOf&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;ArrayBufferView&amp;lt;/code&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dt&amp;gt;Remove &#039;&#039;binary&#039;&#039; encoding? &#039;&#039;(a proposed 8-bit-clean encoding for interop with legacy binary data stored in ECMAScript strings)&#039;&#039;&lt;br /&gt;
&amp;lt;dd&amp;gt;The only real use is with &amp;lt;code&amp;gt;atob()&amp;lt;/code&amp;gt;/&amp;lt;code&amp;gt;btoa()&amp;lt;/code&amp;gt;; a better API would be Base64 directly in/out of Typed Arrays. Consensus on WHATWG is to add better APIs e.g. &lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
partial interface ArrayBufferView {&lt;br /&gt;
    DOMString toBase64();&lt;br /&gt;
};&lt;br /&gt;
&lt;br /&gt;
partial interface ArrayBuffer {&lt;br /&gt;
    static ArrayBuffer fromBase64(DOMString string);&lt;br /&gt;
};&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dt&amp;gt;Should this be a standalone API (as written), or live on e.g. DataView, or on e.g. String?&lt;br /&gt;
&amp;lt;dd&amp;gt;There seems to be pretty strong consensus that streaming, stateful coding is high priority and that an object-oriented API is cleanest, and shoe-horning those onto existing objects would be messy.&lt;br /&gt;
&amp;lt;/dl&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== API ==&lt;br /&gt;
&lt;br /&gt;
=== TextEncoder ===&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;WebIDL&#039;&#039;&#039;&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
dictionary TextEncodeOptions {&lt;br /&gt;
  boolean stream = false;&lt;br /&gt;
};&lt;br /&gt;
&lt;br /&gt;
[Constructor,&lt;br /&gt;
 Constructor(DOMString encoding)]&lt;br /&gt;
interface TextEncoder {&lt;br /&gt;
  readonly attribute DOMString encoding;&lt;br /&gt;
  ArrayBufferView encode(DOMString? string, optional TextEncodeOptions options);&lt;br /&gt;
};&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The constructor runs the following steps:&lt;br /&gt;
# If the constructor is called with no arguments, let &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt; be &amp;quot;&amp;lt;code&amp;gt;utf-8&amp;lt;/code&amp;gt;&amp;quot;. Otherwise, let &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt; be the value of the &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; argument.&lt;br /&gt;
# Run the &#039;&#039;&#039;steps to get an encoding&#039;&#039;&#039; from [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding], with &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt; as &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt;.&lt;br /&gt;
#* If the steps result in failure, throw an &amp;quot;&amp;lt;code&amp;gt;EncodingError&amp;lt;/code&amp;gt;&amp;quot; exception and terminate these steps.&lt;br /&gt;
#* Otherwise, set the encoder object&#039;s internal &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; property to the returned encoding.&lt;br /&gt;
# Initialize the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag of the encoder object to false. &lt;br /&gt;
# Initialize the internal &amp;lt;var&amp;gt;encoding algorithm state&amp;lt;/var&amp;gt; to the default values for the encoding &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dl&amp;gt;&lt;br /&gt;
&amp;lt;dt&amp;gt;&amp;lt;code&amp;gt;encoding&amp;lt;/code&amp;gt; of type DOMString, readonly&lt;br /&gt;
&amp;lt;dd&amp;gt;Returns the &amp;lt;var&amp;gt;Name&amp;lt;/var&amp;gt; of the encoder object&#039;s &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;, per [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding]. &lt;br /&gt;
&lt;br /&gt;
:Note that this may differ from the name of the encoding specified during the call to the constructor. For example, if the constructor is called with &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; of &amp;lt;code&amp;gt;&amp;quot;ascii&amp;quot;&amp;lt;/code&amp;gt; the &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; attribute of the &#039;&#039;encoder object&#039;&#039; would have the value &amp;lt;code&amp;gt;&amp;quot;windows-1252&amp;quot;&amp;lt;/code&amp;gt; as &amp;lt;code&amp;gt;&amp;quot;ascii&amp;quot;&amp;lt;/code&amp;gt; is a label for that encoding.&lt;br /&gt;
&amp;lt;/dl&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dl&amp;gt;&lt;br /&gt;
&amp;lt;dt&amp;gt;&amp;lt;code&amp;gt;encode&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;dd&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The &amp;lt;code&amp;gt;encode&amp;lt;/code&amp;gt; method runs these steps:&lt;br /&gt;
&lt;br /&gt;
# If the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag is not set, then reset the &amp;lt;var&amp;gt;encoding algorithm state&amp;lt;/var&amp;gt; to the default values for &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;. Otherwise, the &amp;lt;var&amp;gt;encoding algorithm state&amp;lt;/var&amp;gt; is re-used from the previous call to &amp;lt;code&amp;gt;encode&amp;lt;/code&amp;gt; on this object.&lt;br /&gt;
# If the &amp;lt;var&amp;gt;options&amp;lt;/var&amp;gt; parameter is specified and the &amp;lt;code&amp;gt;stream&amp;lt;/code&amp;gt; option is &#039;&#039;&#039;true&#039;&#039;&#039;, then the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag is set; otherwise the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag is cleared.&lt;br /&gt;
# Run the steps of the &amp;lt;var&amp;gt;encoding algorithm&amp;lt;/var&amp;gt;:&lt;br /&gt;
#* The input to the algorithm is a &amp;lt;var&amp;gt;stream of code points&amp;lt;/var&amp;gt;. The stream is composed of the Unicode code points for the Unicode characters produced by following the [http://dev.w3.org/2006/webapi/WebIDL/#dfn-obtain-unicode steps to convert a DOMString to a sequence of Unicode characters] in [http://dev.w3.org/2006/webapi/WebIDL/ WebIDL] with &amp;lt;var&amp;gt;string&amp;lt;/var&amp;gt; as the input. If &amp;lt;var&amp;gt;string&amp;lt;/var&amp;gt; is null, the &amp;lt;var&amp;gt;stream of code points&amp;lt;/var&amp;gt; is empty.&lt;br /&gt;
#* If the &amp;lt;var&amp;gt;options&amp;lt;/var&amp;gt; parameter is not specified or the &amp;lt;code&amp;gt;stream&amp;lt;/code&amp;gt; option is false, then after the final code point is yielded by the stream, the &#039;&#039;&#039;EOF code point&#039;&#039;&#039; is yielded.&lt;br /&gt;
#* The output of the algorithm is a &amp;lt;var&amp;gt;sequence of emitted bytes&amp;lt;/var&amp;gt;.&lt;br /&gt;
# Return a &amp;lt;code&amp;gt;Uint8Array&amp;lt;/code&amp;gt; object wrapping an &amp;lt;code&amp;gt;ArrayBuffer&amp;lt;/code&amp;gt; containing the &amp;lt;var&amp;gt;sequence of emitted bytes&amp;lt;/var&amp;gt; produced by the encoding algorithm.&lt;br /&gt;
&amp;lt;/dl&amp;gt;&lt;br /&gt;
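&lt;br /&gt;
As a rough illustration of the constructor and &amp;lt;code&amp;gt;encode&amp;lt;/code&amp;gt; steps above, the following sketch encodes a short string with the default encoding. It assumes a runtime that exposes a global &amp;lt;code&amp;gt;TextEncoder&amp;lt;/code&amp;gt;; only the no-argument constructor is used, since support for other encodings is an assumption that may not hold in a given implementation.&lt;br /&gt;

```javascript
// Sketch only: assumes a global TextEncoder is available.
// Called with no arguments, the encoder uses UTF-8.
const encoder = new TextEncoder();

// "h", U+00E9 (e with acute accent), "l", "l", "o"
const bytes = encoder.encode("h\u00e9llo");

console.log(encoder.encoding);            // "utf-8"
console.log(bytes instanceof Uint8Array); // true
console.log(bytes.length);                // 6, because U+00E9 takes two bytes
```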
&lt;br /&gt;
=== TextDecoder ===&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;WebIDL&#039;&#039;&#039;&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
dictionary TextDecoderOptions {&lt;br /&gt;
  boolean fatal = false;&lt;br /&gt;
};&lt;br /&gt;
&lt;br /&gt;
dictionary TextDecodeOptions {&lt;br /&gt;
  boolean stream = false;&lt;br /&gt;
};&lt;br /&gt;
&lt;br /&gt;
[Constructor,&lt;br /&gt;
 Constructor(optional DOMString encoding, optional TextDecoderOptions options)]&lt;br /&gt;
interface TextDecoder {&lt;br /&gt;
  readonly attribute DOMString encoding;&lt;br /&gt;
  DOMString decode(optional ArrayBufferView view, optional TextDecodeOptions options);&lt;br /&gt;
};&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The constructor runs the following steps:&lt;br /&gt;
# The constructor creates a &#039;&#039;decoder object&#039;&#039;. It has the internal properties &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;, a &amp;lt;var&amp;gt;fatal&amp;lt;/var&amp;gt; flag which is initially unset, and a &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag which is initially unset.&lt;br /&gt;
# If called without an &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; argument, let &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt; be &amp;quot;&amp;lt;code&amp;gt;utf-8&amp;lt;/code&amp;gt;&amp;quot;. Otherwise, let &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt; be the value of the &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; argument.&lt;br /&gt;
# Run the &#039;&#039;&#039;steps to get an encoding&#039;&#039;&#039; from [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding], with &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt; as &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt;. &lt;br /&gt;
## If the steps result in failure, throw an &amp;quot;&amp;lt;code&amp;gt;EncodingError&amp;lt;/code&amp;gt;&amp;quot; exception and terminate these steps.&lt;br /&gt;
## Otherwise, set the &#039;&#039;decoder object&#039;s&#039;&#039; internal &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; property to the returned encoding.&lt;br /&gt;
# If the constructor is called with an &amp;lt;var&amp;gt;options&amp;lt;/var&amp;gt; argument, and the &amp;lt;var&amp;gt;fatal&amp;lt;/var&amp;gt; property of the dictionary is set, set the internal &amp;lt;var&amp;gt;fatal&amp;lt;/var&amp;gt; flag of the &#039;&#039;decoder object&#039;&#039;.&lt;br /&gt;
# Initialize the internal &amp;lt;var&amp;gt;encoding algorithm state&amp;lt;/var&amp;gt; to the default values for the encoding &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;.&lt;br /&gt;
# Return the &#039;&#039;decoder object&#039;&#039;.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dl&amp;gt;&lt;br /&gt;
&amp;lt;dt&amp;gt;&amp;lt;code&amp;gt;encoding&amp;lt;/code&amp;gt; of type DOMString, readonly&lt;br /&gt;
&amp;lt;dd&amp;gt;&lt;br /&gt;
Returns the &amp;lt;var&amp;gt;Name&amp;lt;/var&amp;gt; of the decoder object&#039;s &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;, per [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding].&lt;br /&gt;
&lt;br /&gt;
:Note that this may differ from the name of the encoding specified during the call to the constructor. For example, if the constructor is called with &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; of &amp;lt;code&amp;gt;&amp;quot;ascii&amp;quot;&amp;lt;/code&amp;gt; the &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; attribute of the &#039;&#039;decoder object&#039;&#039; would have the value &amp;lt;code&amp;gt;&amp;quot;windows-1252&amp;quot;&amp;lt;/code&amp;gt; as &amp;lt;code&amp;gt;&amp;quot;ascii&amp;quot;&amp;lt;/code&amp;gt; is a label for that encoding.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dt&amp;gt;&amp;lt;code&amp;gt;decode&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;dd&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The &amp;lt;code&amp;gt;decode&amp;lt;/code&amp;gt; method runs the following steps:&lt;br /&gt;
&lt;br /&gt;
# If the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag of the &#039;&#039;decoder object&#039;&#039; is not set, then reset the &amp;lt;var&amp;gt;encoding algorithm state&amp;lt;/var&amp;gt; to the default values for encoding &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;. Otherwise, the &amp;lt;var&amp;gt;encoding algorithm state&amp;lt;/var&amp;gt; is re-used from the previous call to &amp;lt;code&amp;gt;decode&amp;lt;/code&amp;gt; on this object. &lt;br /&gt;
# If the &amp;lt;var&amp;gt;options&amp;lt;/var&amp;gt; parameter is specified and the &amp;lt;var&amp;gt;stream&amp;lt;/var&amp;gt; option is &#039;&#039;&#039;true&#039;&#039;&#039;, then the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag is set; otherwise the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag is cleared.&lt;br /&gt;
# Run the decoder algorithm of the &#039;&#039;decoder object&#039;s&#039;&#039; encoding:&lt;br /&gt;
#* The input to the algorithm is a &amp;lt;var&amp;gt;byte stream&amp;lt;/var&amp;gt;. The &amp;lt;var&amp;gt;byte stream&amp;lt;/var&amp;gt; is provided by the bytes in &amp;lt;var&amp;gt;view.buffer&amp;lt;/var&amp;gt; starting at offset &amp;lt;var&amp;gt;view.byteOffset&amp;lt;/var&amp;gt;. A maximum of &amp;lt;var&amp;gt;view.byteLength&amp;lt;/var&amp;gt; bytes are yielded by the stream from &amp;lt;var&amp;gt;view.buffer&amp;lt;/var&amp;gt;. If &amp;lt;var&amp;gt;view&amp;lt;/var&amp;gt; is not specified, the stream is empty.&lt;br /&gt;
#* If the &amp;lt;var&amp;gt;options&amp;lt;/var&amp;gt; parameter is not specified or the &#039;&#039;&#039;stream&#039;&#039;&#039; option is &#039;&#039;&#039;false&#039;&#039;&#039;, then after &amp;lt;var&amp;gt;view.byteLength&amp;lt;/var&amp;gt; bytes are yielded by the stream the &#039;&#039;&#039;EOF byte&#039;&#039;&#039; is yielded.&lt;br /&gt;
#* If the internal &amp;lt;var&amp;gt;fatal&amp;lt;/var&amp;gt; flag of the &#039;&#039;decoder object&#039;&#039; is set, then a &#039;&#039;&#039;decoder error&#039;&#039;&#039; causes a &amp;lt;code&amp;gt;DOMException&amp;lt;/code&amp;gt; of type &amp;lt;code&amp;gt;EncodingError&amp;lt;/code&amp;gt; to be thrown rather than emitting a fallback code point.&lt;br /&gt;
#* The output of the algorithm is a sequence of &amp;lt;var&amp;gt;emitted code points&amp;lt;/var&amp;gt;.&lt;br /&gt;
# Return a &amp;lt;code&amp;gt;DOMString&amp;lt;/code&amp;gt; by encoding the &amp;lt;var&amp;gt;sequence of emitted code points&amp;lt;/var&amp;gt; following the [[#Steps to convert a sequence of Unicode characters to a DOMString|steps to convert a sequence of Unicode characters to a DOMString]].&lt;br /&gt;
&lt;br /&gt;
:&#039;&#039;ISSUE: Need to handle BOMs. [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding] specifies this by using the caller-specified encoding as a suggestion, and consuming the BOM as a part of selecting the real encoder where BOM takes precedence. At the very least a matching BOM should be ignored. A mismatching BOM could throw, be a decoding error, or could actually switch the decoder for this stream (call or sequence of calls), possibly if-and-only-if the constructor was called without an encoding.&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/dl&amp;gt;&lt;br /&gt;
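&lt;br /&gt;
The difference the &amp;lt;var&amp;gt;fatal&amp;lt;/var&amp;gt; flag makes can be sketched as follows, assuming a runtime that exposes a global &amp;lt;code&amp;gt;TextDecoder&amp;lt;/code&amp;gt;. The sketch deliberately does not inspect the type of the thrown exception, since implementations may not report it as the &amp;lt;code&amp;gt;EncodingError&amp;lt;/code&amp;gt; described in this draft.&lt;br /&gt;

```javascript
// Sketch only: assumes a global TextDecoder is available.
// The byte 0xFF can never appear in well-formed UTF-8.
const invalid = new Uint8Array([0x61, 0xff]);

// Default behaviour: the malformed byte becomes the fallback
// code point U+FFFD (REPLACEMENT CHARACTER).
const lenient = new TextDecoder("utf-8").decode(invalid);

// With the fatal option set, the same input causes decode() to throw.
let threw = false;
try {
  new TextDecoder("utf-8", { fatal: true }).decode(invalid);
} catch (e) {
  // The exception type is intentionally not checked here.
  threw = true;
}

console.log(JSON.stringify(lenient), threw);
```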
&lt;br /&gt;
== Algorithms ==&lt;br /&gt;
&lt;br /&gt;
=== Steps to convert a sequence of Unicode characters to a DOMString ===&lt;br /&gt;
&lt;br /&gt;
:TODO: Move the following to [http://dev.w3.org/2006/webapi/WebIDL/#idl-DOMString WebIDL]&lt;br /&gt;
&lt;br /&gt;
The following algorithm defines a way to &amp;lt;b&amp;gt;&amp;lt;i&amp;gt;convert a sequence of Unicode characters to a DOMString&amp;lt;/i&amp;gt;&amp;lt;/b&amp;gt;:&lt;br /&gt;
# Let &amp;lt;var&amp;gt;U&amp;lt;/var&amp;gt;&amp;lt;sub&amp;gt;0...&amp;lt;var&amp;gt;n&amp;lt;/var&amp;gt;-1&amp;lt;/sub&amp;gt; be the sequence of Unicode characters &lt;br /&gt;
# Initialize &amp;lt;var&amp;gt;i&amp;lt;/var&amp;gt; to 0&lt;br /&gt;
# Initialize &amp;lt;var&amp;gt;S&amp;lt;/var&amp;gt; to be an empty sequence of [http://dev.w3.org/2006/webapi/WebIDL/#dfn-code-unit code units]&lt;br /&gt;
# While &amp;lt;var&amp;gt;i&amp;lt;/var&amp;gt; &amp;lt; &amp;lt;var&amp;gt;n&amp;lt;/var&amp;gt;&lt;br /&gt;
## Let &amp;lt;var&amp;gt;c&amp;lt;/var&amp;gt; be the &amp;lt;b&amp;gt;code point&amp;lt;/b&amp;gt; of the Unicode character in &amp;lt;var&amp;gt;U&amp;lt;/var&amp;gt; at index &amp;lt;var&amp;gt;i&amp;lt;/var&amp;gt;&lt;br /&gt;
## If &amp;lt;var&amp;gt;c&amp;lt;/var&amp;gt; &amp;amp;ge; 2&amp;lt;sup&amp;gt;16&amp;lt;/sup&amp;gt;, then:&lt;br /&gt;
### Append to &amp;lt;var&amp;gt;S&amp;lt;/var&amp;gt; a [http://dev.w3.org/2006/webapi/WebIDL/#dfn-code-unit code unit] equal to (&amp;lt;var&amp;gt;c&amp;lt;/var&amp;gt; - 2&amp;lt;sup&amp;gt;16&amp;lt;/sup&amp;gt;) / 2&amp;lt;sup&amp;gt;10&amp;lt;/sup&amp;gt; + 0xD800, where &amp;quot;/&amp;quot; represents integer division.&lt;br /&gt;
### Append to &amp;lt;var&amp;gt;S&amp;lt;/var&amp;gt; a [http://dev.w3.org/2006/webapi/WebIDL/#dfn-code-unit code unit] equal to (&amp;lt;var&amp;gt;c&amp;lt;/var&amp;gt; - 2&amp;lt;sup&amp;gt;16&amp;lt;/sup&amp;gt;) % 2&amp;lt;sup&amp;gt;10&amp;lt;/sup&amp;gt; + 0xDC00, where &amp;quot;%&amp;quot; represents the remainder of an integer division.&lt;br /&gt;
## Otherwise, append to &amp;lt;var&amp;gt;S&amp;lt;/var&amp;gt; a [http://dev.w3.org/2006/webapi/WebIDL/#dfn-code-unit code unit] equal to &amp;lt;var&amp;gt;c&amp;lt;/var&amp;gt;.&lt;br /&gt;
## Set &amp;lt;var&amp;gt;i&amp;lt;/var&amp;gt; to &amp;lt;var&amp;gt;i&amp;lt;/var&amp;gt;+1&lt;br /&gt;
# Return the IDL [http://dev.w3.org/2006/webapi/WebIDL/#idl-DOMString DOMString] value that represents the sequence of code units &amp;lt;var&amp;gt;S&amp;lt;/var&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== Examples ==&lt;br /&gt;
&lt;br /&gt;
=== Example #1 - encoding strings ===&lt;br /&gt;
&lt;br /&gt;
The following example uses the API to encode an array of strings into an ArrayBuffer. The buffer contains the number of strings (as a Uint32), followed by the byte length of the first encoded string (as a Uint32), the encoded string data, the byte length of the second encoded string (as a Uint32), the string data, and so on.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
function encodeArrayOfStrings(strings, encoding) {&lt;br /&gt;
  var encoder, encoded, len, i, bytes, view, offset;&lt;br /&gt;
&lt;br /&gt;
  encoder = new TextEncoder(encoding);&lt;br /&gt;
  encoded = [];&lt;br /&gt;
&lt;br /&gt;
  len = Uint32Array.BYTES_PER_ELEMENT;&lt;br /&gt;
  for (i = 0; i &amp;lt; strings.length; i += 1) {&lt;br /&gt;
    len += Uint32Array.BYTES_PER_ELEMENT;&lt;br /&gt;
    encoded[i] = TextEncoder(encoding).encode(strings[i]);&lt;br /&gt;
    len += encoded[i].byteLength;&lt;br /&gt;
  }&lt;br /&gt;
&lt;br /&gt;
  bytes = new Uint8Array(len);&lt;br /&gt;
  view = new DataView(bytes.buffer);&lt;br /&gt;
  offset = 0;&lt;br /&gt;
&lt;br /&gt;
  view.setUint32(offset, strings.length);&lt;br /&gt;
  offset += Uint32Array.BYTES_PER_ELEMENT;&lt;br /&gt;
  for (i = 0; i &amp;lt; encoded.length; i += 1) {&lt;br /&gt;
    len = encoded[i].byteLength;&lt;br /&gt;
    view.setUint32(offset, len);&lt;br /&gt;
    offset += Uint32Array.BYTES_PER_ELEMENT;&lt;br /&gt;
    bytes.set(encoded[i], offset);&lt;br /&gt;
    offset += len;&lt;br /&gt;
  }&lt;br /&gt;
  return bytes.buffer;&lt;br /&gt;
}&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
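&lt;br /&gt;
To make the byte layout concrete, the following sketch packs two short strings and prints the resulting bytes. It assumes an environment that provides the &amp;lt;code&amp;gt;TextEncoder&amp;lt;/code&amp;gt; shipped in current engines, which always encodes as UTF-8; &amp;lt;code&amp;gt;packStrings&amp;lt;/code&amp;gt; is a hypothetical name used only for this illustration. Note that &amp;lt;code&amp;gt;DataView&amp;lt;/code&amp;gt; writes the Uint32 values in big-endian order by default.&lt;br /&gt;
&lt;br /&gt;
```javascript
// Sketch only: packStrings is a hypothetical, UTF-8-only variant of the
// example above, written against the TextEncoder found in current engines.
function packStrings(strings) {
  var encoder = new TextEncoder();
  var encoded = strings.map(function (s) { return encoder.encode(s); });

  // One Uint32 for the count, plus one Uint32 length prefix per string.
  var total = 4 + encoded.reduce(function (sum, e) {
    return sum + 4 + e.byteLength;
  }, 0);

  var bytes = new Uint8Array(total);
  var view = new DataView(bytes.buffer);
  var offset = 0;

  view.setUint32(offset, encoded.length); // big-endian by default
  offset += 4;
  encoded.forEach(function (e) {
    view.setUint32(offset, e.byteLength);
    offset += 4;
    bytes.set(e, offset);
    offset += e.byteLength;
  });
  return bytes;
}

var packed = packStrings(["a", "bc"]);
console.log(Array.from(packed).join(" "));
// prints "0 0 0 2 0 0 0 1 97 0 0 0 2 98 99"
```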
&lt;br /&gt;
=== Example #2 - decoding strings ===&lt;br /&gt;
&lt;br /&gt;
The following example decodes an ArrayBuffer containing data encoded in the format produced by the previous example back into an array of strings.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
function decodeArrayOfStrings(buffer, encoding) {&lt;br /&gt;
  var decoder, view, offset, num_strings, strings, i, len;&lt;br /&gt;
&lt;br /&gt;
  decoder = new TextDecoder(encoding);&lt;br /&gt;
  view = new DataView(buffer);&lt;br /&gt;
  offset = 0;&lt;br /&gt;
  strings = [];&lt;br /&gt;
&lt;br /&gt;
  // Read the string count, then each length-prefixed string.&lt;br /&gt;
  num_strings = view.getUint32(offset);&lt;br /&gt;
  offset += Uint32Array.BYTES_PER_ELEMENT;&lt;br /&gt;
  for (i = 0; i &amp;lt; num_strings; i += 1) {&lt;br /&gt;
    len = view.getUint32(offset);&lt;br /&gt;
    offset += Uint32Array.BYTES_PER_ELEMENT;&lt;br /&gt;
    // Decode exactly len bytes starting at offset.&lt;br /&gt;
    strings[i] = decoder.decode(&lt;br /&gt;
      new DataView(view.buffer, offset, len));&lt;br /&gt;
    offset += len;&lt;br /&gt;
  }&lt;br /&gt;
  return strings;&lt;br /&gt;
}&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Encodings ==&lt;br /&gt;
&lt;br /&gt;
Encodings are defined and implemented per [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding]. This implicitly includes the steps to &amp;lt;em&amp;gt;get an encoding&amp;lt;/em&amp;gt; from a string, and the logic for label matching and case-insensitivity.&lt;br /&gt;
&lt;br /&gt;
User agents MUST NOT support any encodings or labels other than those defined in [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding], and MUST support all encodings and labels defined in that specification, with the additions defined below.&lt;br /&gt;
&lt;br /&gt;
:&#039;&#039;NOTE: In [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding], &amp;quot;ascii&amp;quot; is a label for &#039;&#039;&#039;windows-1252&#039;&#039;&#039;; there is no 7-bit-or-raise-exception encoding. Applications that are required to restrict the content of decoded strings should implement validation after decoding.&#039;&#039;&lt;br /&gt;
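&lt;br /&gt;
Validation after decoding could look like the following sketch. The helper name &amp;lt;code&amp;gt;isSevenBitClean&amp;lt;/code&amp;gt; is an assumption made for illustration, and the sketch assumes an engine whose &amp;lt;code&amp;gt;TextDecoder&amp;lt;/code&amp;gt; supports the legacy single-byte encodings.&lt;br /&gt;
&lt;br /&gt;
```javascript
// Hypothetical helper: true only if every code unit is in the 7-bit range.
function isSevenBitClean(text) {
  return /^[\x00-\x7F]*$/.test(text);
}

var decoder = new TextDecoder("ascii");
console.log(decoder.encoding); // the label resolves to windows-1252

// Decoding itself never rejects bytes above 0x7F ...
var plain = decoder.decode(new Uint8Array([0x48, 0x69]));          // "Hi"
var extended = decoder.decode(new Uint8Array([0x48, 0x69, 0xE9])); // has a byte above 0x7F

// ... so the application checks the decoded string afterwards.
console.log(isSevenBitClean(plain));    // true
console.log(isSevenBitClean(extended)); // false
```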
&lt;br /&gt;
:&#039;&#039;NOTE: Unicode normalization forms are outside the scope of this specification. No normalization is done prior to encoding or after decoding.&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
:&#039;&#039;NOTE: Handling of encoding-specific issues, e.g. over-long UTF-8 encodings, byte order marks, unmatched surrogate pairs, and so on is defined by [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding].&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
== References ==&lt;br /&gt;
&lt;br /&gt;
* WebIDL http://dev.w3.org/2006/webapi/WebIDL&lt;br /&gt;
* Encoding http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== Acknowledgements ==&lt;br /&gt;
&lt;br /&gt;
* Alan Chaney&lt;br /&gt;
* Ben Noordhuis&lt;br /&gt;
* Glenn Maynard&lt;br /&gt;
* John Tamplin&lt;br /&gt;
* Kenneth Russell (Google, Inc)&lt;br /&gt;
* Robert Mustacchi&lt;br /&gt;
* Ryan Dahl&lt;br /&gt;
* Anne van Kesteren&lt;br /&gt;
&lt;br /&gt;
== Appendix ==&lt;br /&gt;
&lt;br /&gt;
A &amp;quot;shim&amp;quot; implementation in JavaScript (that may not fully match the current version of the spec) plus some initial unit tests can be found at:&lt;br /&gt;
&lt;br /&gt;
:http://code.google.com/p/stringencoding/&lt;br /&gt;
&lt;br /&gt;
[[Category:Proposals]]&lt;/div&gt;</summary>
		<author><name>Jsbell</name></author>
	</entry>
	<entry>
		<id>https://wiki.whatwg.org/index.php?title=StringEncoding&amp;diff=8353</id>
		<title>StringEncoding</title>
		<link rel="alternate" type="text/html" href="https://wiki.whatwg.org/index.php?title=StringEncoding&amp;diff=8353"/>
		<updated>2012-06-27T17:27:44Z</updated>

		<summary type="html">&lt;p&gt;Jsbell: /* TextEncoder */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;Proposed Text Encoding Web API for Typed Arrays&lt;br /&gt;
&lt;br /&gt;
== Editors ==&lt;br /&gt;
&lt;br /&gt;
* Joshua Bell (Google, Inc)&lt;br /&gt;
&lt;br /&gt;
== Abstract ==&lt;br /&gt;
&lt;br /&gt;
This specification defines an API for encoding strings to binary data, and decoding strings from binary data.&lt;br /&gt;
&lt;br /&gt;
NOTE: This specification intentionally does not address the opposite scenario of encoding binary data as strings and decoding binary data from strings, for example using Base64 encoding.&lt;br /&gt;
&lt;br /&gt;
Discussion on this topic has so far taken place on the [http://www.khronos.org/webgl/public-mailing-list/ public_webgl@khronos.org] mailing list. See http://www.khronos.org/webgl/public-mailing-list/archives/1111/msg00017.html for the initial discussion thread.&lt;br /&gt;
&lt;br /&gt;
Discussion has since moved to the [http://www.whatwg.org/mailing-list#specs WHATWG spec discussion] mailing list. See http://lists.whatwg.org/htdig.cgi/whatwg-whatwg.org/2012-March/035038.html for the latest discussion thread.&lt;br /&gt;
&lt;br /&gt;
== Open Issues ==&lt;br /&gt;
&lt;br /&gt;
* Encoding errors&lt;br /&gt;
** Define options dict for &amp;lt;code&amp;gt;TextEncoder()&amp;lt;/code&amp;gt; that allows selecting between throw vs. replacement character / replacement callback?&lt;br /&gt;
* The [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding specification] defines the byte order mark as more authoritative than an explicit encoding label when decoding content. &lt;br /&gt;
** Is that desirable here? &lt;br /&gt;
** What does it mean to e.g. write &amp;lt;code&amp;gt;TextDecoder(&#039;iso-2022-kr&#039;).decode(str)&amp;lt;/code&amp;gt; if &amp;lt;code&amp;gt;str&amp;lt;/code&amp;gt; turns out to contain UTF-8 data?&lt;br /&gt;
** What about streaming and other re-uses of the same decoder object?&lt;br /&gt;
* Streaming decode/encode requires keeping buffers between calls.&lt;br /&gt;
** Some encode/decode algorithms require adjusting the &amp;lt;var&amp;gt;code point pointer&amp;lt;/var&amp;gt; or &amp;lt;var&amp;gt;byte pointer&amp;lt;/var&amp;gt; by a negative amount. This could occur across &amp;quot;chunk&amp;quot; boundaries. This implies that when the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag is set on an encoder/decoder, the last N elements of the stream are saved for the next call and used as a prefix for the stream. N is defined by the specific encoding algorithm.&lt;br /&gt;
** &#039;&#039;This is more of a note to implementers than an issue, but it may merit further discussion.&#039;&#039;&lt;br /&gt;
* Move the algorithm to &amp;lt;b&amp;gt;&amp;lt;i&amp;gt;convert a sequence of Unicode characters to a DOMString&amp;lt;/i&amp;gt;&amp;lt;/b&amp;gt; into [http://dev.w3.org/2006/webapi/WebIDL/#idl-DOMString WebIDL].&lt;br /&gt;
&lt;br /&gt;
== Resolved Issues ==&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dl&amp;gt;&lt;br /&gt;
&amp;lt;dt&amp;gt;Encode as many characters as possible into a fixed-size buffer for transmission, and repeat starting with next unencoded character&lt;br /&gt;
&amp;lt;dd&amp;gt;Resolution: not for &amp;quot;v1&amp;quot; - can be implemented using this API, and breaking the string down by &amp;quot;character&amp;quot; is unlikely to be as obvious as it sounds - surrogate pairs, combining sequences, etc&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dt&amp;gt;Support two versions of encode; one which takes target buffer, and one which creates/returns a right-sized buffer&lt;br /&gt;
&amp;lt;dd&amp;gt;Resolution: See above, and wait for developer feedback.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dt&amp;gt;Allow arbitrary end byte sequences (e.g. &amp;lt;code&amp;gt;0xFF&amp;lt;/code&amp;gt; for UTF-8 strings)&lt;br /&gt;
&amp;lt;dd&amp;gt;Resolution: Add &amp;lt;code&amp;gt;indexOf&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;ArrayBufferView&amp;lt;/code&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dt&amp;gt;Remove &#039;&#039;binary&#039;&#039; encoding? &#039;&#039;(a proposed 8-bit-clean encoding for interop with legacy binary data stored in ECMAScript strings)&#039;&#039;&lt;br /&gt;
&amp;lt;dd&amp;gt;The only real use is with &amp;lt;code&amp;gt;atob()&amp;lt;/code&amp;gt;/&amp;lt;code&amp;gt;btoa()&amp;lt;/code&amp;gt;; a better API would be Base64 directly in/out of Typed Arrays. Consensus on WHATWG is to add better APIs e.g. &lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
partial interface ArrayBufferView {&lt;br /&gt;
    DOMString toBase64();&lt;br /&gt;
};&lt;br /&gt;
&lt;br /&gt;
partial interface ArrayBuffer {&lt;br /&gt;
    static ArrayBuffer fromBase64(DOMString string);&lt;br /&gt;
};&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dt&amp;gt;Should this be a standalone API (as written), or live on e.g. DataView, or on e.g. String?&lt;br /&gt;
&amp;lt;dd&amp;gt;There seems to be pretty strong consensus that streaming, stateful coding is high priority and that an object-oriented API is cleanest, and shoe-horning those onto existing objects would be messy.&lt;br /&gt;
&amp;lt;/dl&amp;gt;&lt;br /&gt;
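&lt;br /&gt;
The first resolved issue above, filling fixed-size buffers, was left out of this proposal. As a rough sketch of the idea only, the following uses the &amp;lt;code&amp;gt;encodeInto&amp;lt;/code&amp;gt; method that current engines provide on &amp;lt;code&amp;gt;TextEncoder&amp;lt;/code&amp;gt;; that method is not part of this proposal, and the helper name &amp;lt;code&amp;gt;encodeInChunks&amp;lt;/code&amp;gt; is hypothetical. The sketch avoids splitting a code point across chunks but makes no attempt to keep combining sequences together.&lt;br /&gt;
&lt;br /&gt;
```javascript
// Hypothetical helper: UTF-8 encodes text into a list of chunks, each at
// most chunkSize bytes long, without splitting a code point across chunks.
function encodeInChunks(text, chunkSize) {
  if (!(chunkSize >= 4)) {
    // A single code point may need up to 4 bytes in UTF-8.
    throw new RangeError("chunkSize must be at least 4");
  }
  var encoder = new TextEncoder();
  var chunks = [];
  var rest = text;
  while (rest.length !== 0) {
    var buffer = new Uint8Array(chunkSize);
    // encodeInto reports how many UTF-16 code units it consumed (read)
    // and how many bytes it produced (written).
    var result = encoder.encodeInto(rest, buffer);
    chunks.push(buffer.subarray(0, result.written));
    rest = rest.slice(result.read);
  }
  return chunks;
}

// "a" needs 1 byte and U+20AC needs 3, so they fill the first chunk.
var chunks = encodeInChunks("a\u20ACb", 4);
console.log(chunks.length); // 2
```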
&lt;br /&gt;
== API ==&lt;br /&gt;
&lt;br /&gt;
=== TextEncoder ===&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;WebIDL&#039;&#039;&#039;&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
dictionary TextEncodeOptions {&lt;br /&gt;
  boolean stream = false;&lt;br /&gt;
};&lt;br /&gt;
&lt;br /&gt;
[Constructor,&lt;br /&gt;
 Constructor(DOMString encoding)]&lt;br /&gt;
interface TextEncoder {&lt;br /&gt;
  readonly attribute DOMString encoding;&lt;br /&gt;
  ArrayBufferView encode(DOMString? string, optional TextEncodeOptions options);&lt;br /&gt;
};&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The constructor runs the following steps:&lt;br /&gt;
# If the constructor is called with no arguments, let &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt; be &amp;quot;&amp;lt;code&amp;gt;utf-8&amp;lt;/code&amp;gt;&amp;quot;. Otherwise, let &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt; be the value of the &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; argument.&lt;br /&gt;
# Run the &#039;&#039;&#039;steps to get an encoding&#039;&#039;&#039; from [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding], with &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt; as &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt;.&lt;br /&gt;
#* If the steps result in failure, throw an &amp;quot;&amp;lt;code&amp;gt;EncodingError&amp;lt;/code&amp;gt;&amp;quot; exception and terminate these steps.&lt;br /&gt;
#* Otherwise, set the encoder object&#039;s internal &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; property to the returned encoding.&lt;br /&gt;
# Initialize the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag of the encoder object to false. &lt;br /&gt;
# Initialize the internal &amp;lt;var&amp;gt;encoding algorithm state&amp;lt;/var&amp;gt; to the default values for the encoding &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dl&amp;gt;&lt;br /&gt;
&amp;lt;dt&amp;gt;&amp;lt;code&amp;gt;encoding&amp;lt;/code&amp;gt; of type DOMString, readonly&lt;br /&gt;
&amp;lt;dd&amp;gt;Returns the &amp;lt;var&amp;gt;Name&amp;lt;/var&amp;gt; of the encoder object&#039;s &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;, per [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding]. &lt;br /&gt;
&lt;br /&gt;
:Note that this may differ from the name of the encoding specified during the call to the constructor. For example, if the constructor is called with an &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; of &amp;lt;code&amp;gt;&amp;quot;ascii&amp;quot;&amp;lt;/code&amp;gt;, the &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; attribute of the &#039;&#039;encoder object&#039;&#039; would have the value &amp;lt;code&amp;gt;&amp;quot;windows-1252&amp;quot;&amp;lt;/code&amp;gt;, as &amp;lt;code&amp;gt;&amp;quot;ascii&amp;quot;&amp;lt;/code&amp;gt; is a label for that encoding.&lt;br /&gt;
&amp;lt;/dl&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dl&amp;gt;&lt;br /&gt;
&amp;lt;dt&amp;gt;&amp;lt;code&amp;gt;encode&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;dd&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The &amp;lt;code&amp;gt;encode&amp;lt;/code&amp;gt; method runs these steps:&lt;br /&gt;
&lt;br /&gt;
# If the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag is not set, then reset the &amp;lt;var&amp;gt;encoding algorithm state&amp;lt;/var&amp;gt; to the default values for &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;. Otherwise, the &amp;lt;var&amp;gt;encoding algorithm state&amp;lt;/var&amp;gt; is re-used from the previous call to &amp;lt;code&amp;gt;encode&amp;lt;/code&amp;gt; on this object.&lt;br /&gt;
# If the &amp;lt;var&amp;gt;options&amp;lt;/var&amp;gt; parameter is specified and the &amp;lt;code&amp;gt;stream&amp;lt;/code&amp;gt; option is &#039;&#039;&#039;true&#039;&#039;&#039;, then the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag is set; otherwise the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag is cleared.&lt;br /&gt;
# Run the steps of the &amp;lt;var&amp;gt;encoding algorithm&amp;lt;/var&amp;gt;:&lt;br /&gt;
#* The input to the algorithm is a &amp;lt;var&amp;gt;stream of code points&amp;lt;/var&amp;gt;. The stream is composed of the Unicode code points for the Unicode characters produced by following the steps to &#039;&#039;[http://dev.w3.org/2006/webapi/WebIDL/#dfn-obtain-unicode convert a DOMString to a sequence of Unicode characters]&#039;&#039; in [http://dev.w3.org/2006/webapi/WebIDL/ WebIDL] with &amp;lt;var&amp;gt;string&amp;lt;/var&amp;gt; as the input. If &amp;lt;var&amp;gt;string&amp;lt;/var&amp;gt; is null, the &amp;lt;var&amp;gt;stream of code points&amp;lt;/var&amp;gt; is empty.&lt;br /&gt;
#* If the &amp;lt;var&amp;gt;options&amp;lt;/var&amp;gt; parameter is not specified or the &amp;lt;code&amp;gt;stream&amp;lt;/code&amp;gt; option is false, then after the final code point is yielded by the stream, the &#039;&#039;&#039;EOF code point&#039;&#039;&#039; is yielded.&lt;br /&gt;
#* The output of the algorithm is a &amp;lt;var&amp;gt;sequence of emitted bytes&amp;lt;/var&amp;gt;.&lt;br /&gt;
# Return a &amp;lt;code&amp;gt;Uint8Array&amp;lt;/code&amp;gt; object wrapping an &amp;lt;code&amp;gt;ArrayBuffer&amp;lt;/code&amp;gt; containing the &amp;lt;var&amp;gt;sequence of emitted bytes&amp;lt;/var&amp;gt; produced by the encoding algorithm.&lt;br /&gt;
&amp;lt;/dl&amp;gt;&lt;br /&gt;
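&lt;br /&gt;
As an illustration of the default label and the return type, the following sketch runs against the &amp;lt;code&amp;gt;TextEncoder&amp;lt;/code&amp;gt; shipped in current engines. That later interface takes no encoding argument and always uses UTF-8, so only the no-argument form of the constructor is shown; treat the sketch as an approximation of the interface described here, not as a test of it.&lt;br /&gt;
&lt;br /&gt;
```javascript
// With no arguments the encoder uses UTF-8.
var encoder = new TextEncoder();
console.log(encoder.encoding); // "utf-8"

// U+20AC (euro sign) takes three bytes in UTF-8.
var bytes = encoder.encode("\u20AC");
console.log(bytes instanceof Uint8Array);  // true
console.log(Array.from(bytes).join(" "));  // "226 130 172"
```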
&lt;br /&gt;
=== TextDecoder ===&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;WebIDL&#039;&#039;&#039;&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
dictionary TextDecoderOptions {&lt;br /&gt;
  boolean fatal = false;&lt;br /&gt;
};&lt;br /&gt;
&lt;br /&gt;
dictionary TextDecodeOptions {&lt;br /&gt;
  boolean stream = false;&lt;br /&gt;
};&lt;br /&gt;
&lt;br /&gt;
[Constructor,&lt;br /&gt;
 Constructor(optional DOMString encoding, optional TextDecoderOptions options)]&lt;br /&gt;
interface TextDecoder {&lt;br /&gt;
  readonly attribute DOMString encoding;&lt;br /&gt;
  DOMString decode(optional ArrayBufferView view, optional TextDecodeOptions options);&lt;br /&gt;
};&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The constructor runs the following steps:&lt;br /&gt;
# The constructor creates a &#039;&#039;decoder object&#039;&#039;. It has the internal properties &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;, a &amp;lt;var&amp;gt;fatal&amp;lt;/var&amp;gt; flag which is initially unset, and a &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag which is initially unset.&lt;br /&gt;
# If called without an &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; argument, let &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt; be &amp;quot;&amp;lt;code&amp;gt;utf-8&amp;lt;/code&amp;gt;&amp;quot;. Otherwise, let &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt; be the value of the &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; argument.&lt;br /&gt;
# Run the &#039;&#039;&#039;steps to get an encoding&#039;&#039;&#039; from [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding], with &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt; as &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt;. &lt;br /&gt;
## If the steps result in failure, throw an &amp;quot;&amp;lt;code&amp;gt;EncodingError&amp;lt;/code&amp;gt;&amp;quot; exception and terminate these steps.&lt;br /&gt;
## Otherwise, set the &#039;&#039;decoder object&#039;s&#039;&#039; internal &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; property to the returned encoding.&lt;br /&gt;
# If the constructor is called with an &amp;lt;var&amp;gt;options&amp;lt;/var&amp;gt; argument, and the &amp;lt;var&amp;gt;fatal&amp;lt;/var&amp;gt; property of the dictionary is set, set the internal &amp;lt;var&amp;gt;fatal&amp;lt;/var&amp;gt; flag of the &#039;&#039;decoder object&#039;&#039;.&lt;br /&gt;
# Initialize the internal &amp;lt;var&amp;gt;encoding algorithm state&amp;lt;/var&amp;gt; to the default values for the encoding &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;.&lt;br /&gt;
# Return the &#039;&#039;decoder object&#039;&#039;.&lt;br /&gt;
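&lt;br /&gt;
The effect of the &amp;lt;var&amp;gt;fatal&amp;lt;/var&amp;gt; flag can be sketched as follows, using the &amp;lt;code&amp;gt;TextDecoder&amp;lt;/code&amp;gt; shipped in current engines. Those engines report the failure as a &amp;lt;code&amp;gt;TypeError&amp;lt;/code&amp;gt; rather than the &amp;lt;code&amp;gt;EncodingError&amp;lt;/code&amp;gt; named in this proposal, so the sketch only checks that some exception is thrown.&lt;br /&gt;
&lt;br /&gt;
```javascript
// 0xFF can never appear in well-formed UTF-8.
var invalid = new Uint8Array([0xFF]);

// Default (fatal unset): the error becomes the replacement character.
var lenient = new TextDecoder("utf-8");
console.log(lenient.decode(invalid) === "\uFFFD"); // true

// fatal set: the same input raises an exception instead.
var strict = new TextDecoder("utf-8", { fatal: true });
var threw = false;
try {
  strict.decode(invalid);
} catch (e) {
  threw = true;
}
console.log(threw); // true
```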
&lt;br /&gt;
&amp;lt;dl&amp;gt;&lt;br /&gt;
&amp;lt;dt&amp;gt;&amp;lt;code&amp;gt;encoding&amp;lt;/code&amp;gt; of type DOMString, readonly&lt;br /&gt;
&amp;lt;dd&amp;gt;&lt;br /&gt;
Returns the &amp;lt;var&amp;gt;Name&amp;lt;/var&amp;gt; of the decoder object&#039;s &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;, per [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding].&lt;br /&gt;
&lt;br /&gt;
:Note that this may differ from the name of the encoding specified during the call to the constructor. For example, if the constructor is called with an &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; of &amp;lt;code&amp;gt;&amp;quot;ascii&amp;quot;&amp;lt;/code&amp;gt;, the &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; attribute of the &#039;&#039;decoder object&#039;&#039; would have the value &amp;lt;code&amp;gt;&amp;quot;windows-1252&amp;quot;&amp;lt;/code&amp;gt;, as &amp;lt;code&amp;gt;&amp;quot;ascii&amp;quot;&amp;lt;/code&amp;gt; is a label for that encoding.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dt&amp;gt;&amp;lt;code&amp;gt;decode&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;dd&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The &amp;lt;code&amp;gt;decode&amp;lt;/code&amp;gt; method runs the following steps:&lt;br /&gt;
&lt;br /&gt;
# If the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag of the &#039;&#039;decoder object&#039;&#039; is not set, then reset the &amp;lt;var&amp;gt;encoding algorithm state&amp;lt;/var&amp;gt; to the default values for encoding &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;. Otherwise, the &amp;lt;var&amp;gt;encoding algorithm state&amp;lt;/var&amp;gt; is re-used from the previous call to &amp;lt;code&amp;gt;decode&amp;lt;/code&amp;gt; on this object. &lt;br /&gt;
# If the &amp;lt;var&amp;gt;options&amp;lt;/var&amp;gt; parameter is specified and the &amp;lt;var&amp;gt;stream&amp;lt;/var&amp;gt; option is &#039;&#039;&#039;true&#039;&#039;&#039;, then the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag is set; otherwise the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag is cleared.&lt;br /&gt;
# Run the decoder algorithm of the &#039;&#039;decoder object&#039;s&#039;&#039; &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;:&lt;br /&gt;
#* The input to the algorithm is a &amp;lt;var&amp;gt;byte stream&amp;lt;/var&amp;gt;. The &amp;lt;var&amp;gt;byte stream&amp;lt;/var&amp;gt; is provided by the bytes in &amp;lt;var&amp;gt;view.buffer&amp;lt;/var&amp;gt; starting at offset &amp;lt;var&amp;gt;view.byteOffset&amp;lt;/var&amp;gt;. A maximum of &amp;lt;var&amp;gt;view.byteLength&amp;lt;/var&amp;gt; bytes are yielded by the stream from &amp;lt;var&amp;gt;view.buffer&amp;lt;/var&amp;gt;. If &amp;lt;var&amp;gt;view&amp;lt;/var&amp;gt; is not specified, the stream is empty.&lt;br /&gt;
#* If the &amp;lt;var&amp;gt;options&amp;lt;/var&amp;gt; parameter is not specified or the &#039;&#039;&#039;stream&#039;&#039;&#039; option is &#039;&#039;&#039;false&#039;&#039;&#039;, then after &amp;lt;var&amp;gt;view.byteLength&amp;lt;/var&amp;gt; bytes are yielded by the stream, the &#039;&#039;&#039;EOF byte&#039;&#039;&#039; is yielded.&lt;br /&gt;
#* If the internal &amp;lt;var&amp;gt;fatal&amp;lt;/var&amp;gt; flag of the &#039;&#039;decoder object&#039;&#039; is set, then a &#039;&#039;&#039;decoder error&#039;&#039;&#039; causes a &amp;lt;code&amp;gt;DOMException&amp;lt;/code&amp;gt; of type &amp;lt;code&amp;gt;EncodingError&amp;lt;/code&amp;gt; to be thrown rather than emitting a fallback code point.&lt;br /&gt;
#* The output of the algorithm is a &amp;lt;var&amp;gt;sequence of emitted code points&amp;lt;/var&amp;gt;.&lt;br /&gt;
# Return a &amp;lt;code&amp;gt;DOMString&amp;lt;/code&amp;gt; by converting the &amp;lt;var&amp;gt;sequence of emitted code points&amp;lt;/var&amp;gt; following the steps to &amp;lt;i&amp;gt;convert a sequence of Unicode characters to a DOMString&amp;lt;/i&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
:&#039;&#039;ISSUE: Need to handle BOMs. [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding] specifies this by using the caller-specified encoding as a suggestion, and consuming the BOM as a part of selecting the real encoder where BOM takes precedence. At the very least a matching BOM should be ignored. A mismatching BOM could throw, be a decoding error, or could actually switch the decoder for this stream (call or sequence of calls), possibly if-and-only-if the constructor was called without an encoding.&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/dl&amp;gt;&lt;br /&gt;
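&lt;br /&gt;
Streaming decoding with the &amp;lt;code&amp;gt;stream&amp;lt;/code&amp;gt; option can be sketched as follows, assuming the &amp;lt;code&amp;gt;TextDecoder&amp;lt;/code&amp;gt; shipped in current engines. The three UTF-8 bytes of U+20AC are deliberately split across two chunks; the decoder holds back the incomplete sequence until the rest arrives.&lt;br /&gt;
&lt;br /&gt;
```javascript
var decoder = new TextDecoder("utf-8");

// "A" followed by the first two bytes of U+20AC (E2 82 AC).
var first = new Uint8Array([0x41, 0xE2, 0x82]);
// The final byte of U+20AC followed by "B".
var second = new Uint8Array([0xAC, 0x42]);

var out = "";
out += decoder.decode(first, { stream: true }); // yields "A" only
out += decoder.decode(second);                  // stream unset: flushes at EOF

console.log(out === "A\u20ACB"); // true
```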
&lt;br /&gt;
== Algorithms ==&lt;br /&gt;
&lt;br /&gt;
:TODO: Move the following to [http://dev.w3.org/2006/webapi/WebIDL/#idl-DOMString WebIDL]&lt;br /&gt;
&lt;br /&gt;
The following algorithm defines a way to &amp;lt;b&amp;gt;&amp;lt;i&amp;gt;convert a sequence of Unicode characters to a DOMString&amp;lt;/i&amp;gt;&amp;lt;/b&amp;gt;:&lt;br /&gt;
# Let &amp;lt;var&amp;gt;U&amp;lt;/var&amp;gt;&amp;lt;sub&amp;gt;0...&amp;lt;var&amp;gt;n&amp;lt;/var&amp;gt;-1&amp;lt;/sub&amp;gt; be the sequence of Unicode characters &lt;br /&gt;
# Initialize &amp;lt;var&amp;gt;i&amp;lt;/var&amp;gt; to 0&lt;br /&gt;
# Initialize &amp;lt;var&amp;gt;S&amp;lt;/var&amp;gt; to be an empty sequence of [http://dev.w3.org/2006/webapi/WebIDL/#dfn-code-unit code units]&lt;br /&gt;
# While &amp;lt;var&amp;gt;i&amp;lt;/var&amp;gt; &amp;lt; &amp;lt;var&amp;gt;n&amp;lt;/var&amp;gt;&lt;br /&gt;
## Let &amp;lt;var&amp;gt;c&amp;lt;/var&amp;gt; be the &amp;lt;b&amp;gt;code point&amp;lt;/b&amp;gt; of the Unicode character in &amp;lt;var&amp;gt;U&amp;lt;/var&amp;gt; at index &amp;lt;var&amp;gt;i&amp;lt;/var&amp;gt;&lt;br /&gt;
## If &amp;lt;var&amp;gt;c&amp;lt;/var&amp;gt; &amp;amp;ge; 2&amp;lt;sup&amp;gt;16&amp;lt;/sup&amp;gt;, then:&lt;br /&gt;
### Append to &amp;lt;var&amp;gt;S&amp;lt;/var&amp;gt; a [http://dev.w3.org/2006/webapi/WebIDL/#dfn-code-unit code unit] equal to (&amp;lt;var&amp;gt;c&amp;lt;/var&amp;gt; - 2&amp;lt;sup&amp;gt;16&amp;lt;/sup&amp;gt;) / 2&amp;lt;sup&amp;gt;10&amp;lt;/sup&amp;gt; + 0xD800, where &amp;quot;/&amp;quot; represents integer division.&lt;br /&gt;
### Append to &amp;lt;var&amp;gt;S&amp;lt;/var&amp;gt; a [http://dev.w3.org/2006/webapi/WebIDL/#dfn-code-unit code unit] equal to (&amp;lt;var&amp;gt;c&amp;lt;/var&amp;gt; - 2&amp;lt;sup&amp;gt;16&amp;lt;/sup&amp;gt;) % 2&amp;lt;sup&amp;gt;10&amp;lt;/sup&amp;gt; + 0xDC00, where &amp;quot;%&amp;quot; represents the remainder of an integer division.&lt;br /&gt;
## Otherwise, append to &amp;lt;var&amp;gt;S&amp;lt;/var&amp;gt; a [http://dev.w3.org/2006/webapi/WebIDL/#dfn-code-unit code unit] equal to &amp;lt;var&amp;gt;c&amp;lt;/var&amp;gt;.&lt;br /&gt;
## Set &amp;lt;var&amp;gt;i&amp;lt;/var&amp;gt; to &amp;lt;var&amp;gt;i&amp;lt;/var&amp;gt;+1&lt;br /&gt;
# Return the IDL [http://dev.w3.org/2006/webapi/WebIDL/#idl-DOMString DOMString] value that represents the sequence of code units &amp;lt;var&amp;gt;S&amp;lt;/var&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== Examples ==&lt;br /&gt;
&lt;br /&gt;
=== Example #1 - encoding strings ===&lt;br /&gt;
&lt;br /&gt;
The following example uses the API to encode an array of strings into an ArrayBuffer. The result is an ArrayBuffer containing the number of strings (as a Uint32), followed by the byte length of the first encoded string (as a Uint32), the encoded string data, the byte length of the second encoded string (as a Uint32), its encoded data, and so on.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
function encodeArrayOfStrings(strings, encoding) {&lt;br /&gt;
  var encoder, encoded, len, i, bytes, view, offset;&lt;br /&gt;
&lt;br /&gt;
  // Create the encoder once and reuse it for every string.&lt;br /&gt;
  encoder = new TextEncoder(encoding);&lt;br /&gt;
  encoded = [];&lt;br /&gt;
&lt;br /&gt;
  // First pass: encode each string and compute the total size.&lt;br /&gt;
  len = Uint32Array.BYTES_PER_ELEMENT; // string count&lt;br /&gt;
  for (i = 0; i &amp;lt; strings.length; i += 1) {&lt;br /&gt;
    len += Uint32Array.BYTES_PER_ELEMENT; // length prefix&lt;br /&gt;
    encoded[i] = encoder.encode(strings[i]);&lt;br /&gt;
    len += encoded[i].byteLength;&lt;br /&gt;
  }&lt;br /&gt;
&lt;br /&gt;
  bytes = new Uint8Array(len);&lt;br /&gt;
  view = new DataView(bytes.buffer);&lt;br /&gt;
  offset = 0;&lt;br /&gt;
&lt;br /&gt;
  // Second pass: write the count, then each length-prefixed string.&lt;br /&gt;
  view.setUint32(offset, strings.length);&lt;br /&gt;
  offset += Uint32Array.BYTES_PER_ELEMENT;&lt;br /&gt;
  for (i = 0; i &amp;lt; encoded.length; i += 1) {&lt;br /&gt;
    len = encoded[i].byteLength;&lt;br /&gt;
    view.setUint32(offset, len);&lt;br /&gt;
    offset += Uint32Array.BYTES_PER_ELEMENT;&lt;br /&gt;
    bytes.set(encoded[i], offset);&lt;br /&gt;
    offset += len;&lt;br /&gt;
  }&lt;br /&gt;
  return bytes.buffer;&lt;br /&gt;
}&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Example #2 - decoding strings ===&lt;br /&gt;
&lt;br /&gt;
The following example decodes an ArrayBuffer containing data encoded in the format produced by the previous example back into an array of strings.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
function decodeArrayOfStrings(buffer, encoding) {&lt;br /&gt;
  var decoder, view, offset, num_strings, strings, i, len;&lt;br /&gt;
&lt;br /&gt;
  decoder = new TextDecoder(encoding);&lt;br /&gt;
  view = new DataView(buffer);&lt;br /&gt;
  offset = 0;&lt;br /&gt;
  strings = [];&lt;br /&gt;
&lt;br /&gt;
  // Read the string count, then each length-prefixed string.&lt;br /&gt;
  num_strings = view.getUint32(offset);&lt;br /&gt;
  offset += Uint32Array.BYTES_PER_ELEMENT;&lt;br /&gt;
  for (i = 0; i &amp;lt; num_strings; i += 1) {&lt;br /&gt;
    len = view.getUint32(offset);&lt;br /&gt;
    offset += Uint32Array.BYTES_PER_ELEMENT;&lt;br /&gt;
    // Decode exactly len bytes starting at offset.&lt;br /&gt;
    strings[i] = decoder.decode(&lt;br /&gt;
      new DataView(view.buffer, offset, len));&lt;br /&gt;
    offset += len;&lt;br /&gt;
  }&lt;br /&gt;
  return strings;&lt;br /&gt;
}&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
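&lt;br /&gt;
As a quick check of the format, the following sketch decodes a small hand-built buffer. It is a UTF-8-only variant written against the &amp;lt;code&amp;gt;TextDecoder&amp;lt;/code&amp;gt; shipped in current engines; &amp;lt;code&amp;gt;unpackStrings&amp;lt;/code&amp;gt; is a hypothetical name used only for this illustration.&lt;br /&gt;
&lt;br /&gt;
```javascript
// Sketch only: UTF-8-only variant of the example above.
function unpackStrings(buffer) {
  var decoder = new TextDecoder("utf-8");
  var view = new DataView(buffer);
  var strings = [];
  var offset = 0;

  var count = view.getUint32(offset); // big-endian by default
  offset += 4;
  while (strings.length !== count) {
    var len = view.getUint32(offset);
    offset += 4;
    // Decode exactly len bytes, starting at the current offset.
    strings.push(decoder.decode(new Uint8Array(buffer, offset, len)));
    offset += len;
  }
  return strings;
}

// Two strings: "a" (1 byte) and "bc" (2 bytes).
var data = new Uint8Array([0, 0, 0, 2, 0, 0, 0, 1, 97, 0, 0, 0, 2, 98, 99]);
console.log(unpackStrings(data.buffer).join(",")); // prints "a,bc"
```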
&lt;br /&gt;
== Encodings ==&lt;br /&gt;
&lt;br /&gt;
Encodings are defined and implemented per [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding]. This implicitly includes the steps to &amp;lt;em&amp;gt;get an encoding&amp;lt;/em&amp;gt; from a string, and the logic for label matching and case-insensitivity.&lt;br /&gt;
&lt;br /&gt;
User agents MUST NOT support any encodings or labels other than those defined in [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding], and MUST support all encodings and labels defined in that specification, with the additions defined below.&lt;br /&gt;
&lt;br /&gt;
:&#039;&#039;NOTE: In [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding], &amp;quot;ascii&amp;quot; is a label for &#039;&#039;&#039;windows-1252&#039;&#039;&#039;; there is no 7-bit-or-raise-exception encoding. Applications that are required to restrict the content of decoded strings should implement validation after decoding.&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
:&#039;&#039;NOTE: Unicode normalization forms are outside the scope of this specification. No normalization is done prior to encoding or after decoding.&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
:&#039;&#039;NOTE: Handling of encoding-specific issues, e.g. over-long UTF-8 encodings, byte order marks, unmatched surrogate pairs, and so on is defined by [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding].&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
== References ==&lt;br /&gt;
&lt;br /&gt;
* WebIDL http://dev.w3.org/2006/webapi/WebIDL&lt;br /&gt;
* Encoding http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== Acknowledgements ==&lt;br /&gt;
&lt;br /&gt;
* Alan Chaney&lt;br /&gt;
* Ben Noordhuis&lt;br /&gt;
* Glenn Maynard&lt;br /&gt;
* John Tamplin&lt;br /&gt;
* Kenneth Russell (Google, Inc)&lt;br /&gt;
* Robert Mustacchi&lt;br /&gt;
* Ryan Dahl&lt;br /&gt;
* Anne van Kesteren&lt;br /&gt;
&lt;br /&gt;
== Appendix ==&lt;br /&gt;
&lt;br /&gt;
A &amp;quot;shim&amp;quot; implementation in JavaScript (that may not fully match the current version of the spec) plus some initial unit tests can be found at:&lt;br /&gt;
&lt;br /&gt;
:http://code.google.com/p/stringencoding/&lt;br /&gt;
&lt;br /&gt;
[[Category:Proposals]]&lt;/div&gt;</summary>
		<author><name>Jsbell</name></author>
	</entry>
	<entry>
		<id>https://wiki.whatwg.org/index.php?title=StringEncoding&amp;diff=8352</id>
		<title>StringEncoding</title>
		<link rel="alternate" type="text/html" href="https://wiki.whatwg.org/index.php?title=StringEncoding&amp;diff=8352"/>
		<updated>2012-06-27T17:25:54Z</updated>

		<summary type="html">&lt;p&gt;Jsbell: /* Open Issues */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;Proposed Text Encoding Web API for Typed Arrays&lt;br /&gt;
&lt;br /&gt;
== Editors ==&lt;br /&gt;
&lt;br /&gt;
* Joshua Bell (Google, Inc)&lt;br /&gt;
&lt;br /&gt;
== Abstract ==&lt;br /&gt;
&lt;br /&gt;
This specification defines an API for encoding strings to binary data, and decoding strings from binary data.&lt;br /&gt;
&lt;br /&gt;
NOTE: This specification intentionally does not address the opposite scenario of encoding binary data as strings and decoding binary data from strings, for example using Base64 encoding.&lt;br /&gt;
&lt;br /&gt;
Discussion on this topic has so far taken place on the [http://www.khronos.org/webgl/public-mailing-list/ public_webgl@khronos.org] mailing list. See http://www.khronos.org/webgl/public-mailing-list/archives/1111/msg00017.html for the initial discussion thread.&lt;br /&gt;
&lt;br /&gt;
Discussion has since moved to the [http://www.whatwg.org/mailing-list#specs WHATWG spec discussion] mailing list. See http://lists.whatwg.org/htdig.cgi/whatwg-whatwg.org/2012-March/035038.html for the latest discussion thread.&lt;br /&gt;
&lt;br /&gt;
== Open Issues ==&lt;br /&gt;
&lt;br /&gt;
* Encoding errors&lt;br /&gt;
** Define options dict for &amp;lt;code&amp;gt;TextEncoder()&amp;lt;/code&amp;gt; that allows selecting between throw vs. replacement character / replacement callback?&lt;br /&gt;
* The [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding specification] defines the byte order mark as more authoritative than an explicit encoding label when decoding content. &lt;br /&gt;
** Is that desirable here? &lt;br /&gt;
** What does it mean to e.g. write &amp;lt;code&amp;gt;TextDecoder(&#039;iso-2022-kr&#039;).decode(str)&amp;lt;/code&amp;gt; if &amp;lt;code&amp;gt;str&amp;lt;/code&amp;gt; turns out to contain UTF-8 data?&lt;br /&gt;
** What about streaming and other re-uses of the same decoder object?&lt;br /&gt;
* Streaming decode/encode requires keeping buffers between calls.&lt;br /&gt;
** Some encode/decode algorithms require adjusting the &amp;lt;var&amp;gt;code point pointer&amp;lt;/var&amp;gt; or &amp;lt;var&amp;gt;byte pointer&amp;lt;/var&amp;gt; by a negative amount. This could occur across &amp;quot;chunk&amp;quot; boundaries. This implies that when the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag is set on an encoder/decoder, the last N elements of the stream are saved and used as a prefix for the stream in the next call. N is defined by the specific encoding algorithm.&lt;br /&gt;
** &#039;&#039;This is more of a note to implementers than an issue, but it may merit further discussion.&#039;&#039;&lt;br /&gt;
* Move the algorithm to &amp;lt;b&amp;gt;&amp;lt;i&amp;gt;convert a sequence of Unicode characters to a DOMString&amp;lt;/i&amp;gt;&amp;lt;/b&amp;gt; into [http://dev.w3.org/2006/webapi/WebIDL/#idl-DOMString WebIDL].&lt;br /&gt;
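&lt;br /&gt;
The carry-over described under the streaming issue above can be sketched for one simple case. This is only an illustration, assuming a UTF-8 &amp;lt;code&amp;gt;TextEncoder&amp;lt;/code&amp;gt; global whose &amp;lt;code&amp;gt;encode&amp;lt;/code&amp;gt; takes no &amp;lt;code&amp;gt;stream&amp;lt;/code&amp;gt; option; the helper &amp;lt;code&amp;gt;createChunkEncoder&amp;lt;/code&amp;gt; is hypothetical and not part of this proposal. Here N is at most one code unit: a trailing high surrogate.&lt;br /&gt;
&lt;br /&gt;
```javascript
// Hypothetical helper, not part of the proposal: encodes string chunks to
// UTF-8 and holds back a trailing high surrogate until the next chunk.
function createChunkEncoder() {
  const encoder = new TextEncoder(); // assumes a UTF-8 TextEncoder global
  let pending = '';                  // code units saved from the previous call

  return function encodeChunk(chunk, isLast) {
    let text = pending + chunk;
    pending = '';
    // A chunk ending in a high surrogate may be half of a surrogate pair,
    // so save that code unit as a prefix for the next call.
    if (!isLast) {
      if (/[\uD800-\uDBFF]$/.test(text)) {
        pending = text.slice(-1);
        text = text.slice(0, -1);
      }
    }
    return encoder.encode(text);
  };
}

const encodeChunk = createChunkEncoder();
const first = encodeChunk('\uD83D', false); // high surrogate only
const second = encodeChunk('\uDE00', true); // matching low surrogate
```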
&lt;br /&gt;
== Resolved Issues ==&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dl&amp;gt;&lt;br /&gt;
&amp;lt;dt&amp;gt;Encode as many characters as possible into a fixed-size buffer for transmission, and repeat starting with next unencoded character&lt;br /&gt;
&amp;lt;dd&amp;gt;Resolution: not for &amp;quot;v1&amp;quot; - can be implemented using this API, and breaking the string down by &amp;quot;character&amp;quot; is unlikely to be as obvious as it sounds - surrogate pairs, combining sequences, etc&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dt&amp;gt;Support two versions of encode; one which takes target buffer, and one which creates/returns a right-sized buffer&lt;br /&gt;
&amp;lt;dd&amp;gt;Resolution: See above, and wait for developer feedback.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dt&amp;gt;Allow arbitrary end byte sequences (e.g. &amp;lt;code&amp;gt;0xFF&amp;lt;/code&amp;gt; for UTF-8 strings)&lt;br /&gt;
&amp;lt;dd&amp;gt;Resolution: Add &amp;lt;code&amp;gt;indexOf&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;ArrayBufferView&amp;lt;/code&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dt&amp;gt;Remove &#039;&#039;binary&#039;&#039; encoding? &#039;&#039;(a proposed 8-bit-clean encoding for interop with legacy binary data stored in ECMAScript strings)&#039;&#039;&lt;br /&gt;
&amp;lt;dd&amp;gt;The only real use is with &amp;lt;code&amp;gt;atob()&amp;lt;/code&amp;gt;/&amp;lt;code&amp;gt;btoa()&amp;lt;/code&amp;gt;; a better API would be Base64 directly in/out of Typed Arrays. Consensus on WHATWG is to add better APIs e.g. &lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
partial interface ArrayBufferView {&lt;br /&gt;
    DOMString toBase64();&lt;br /&gt;
};&lt;br /&gt;
&lt;br /&gt;
partial interface ArrayBuffer {&lt;br /&gt;
    static ArrayBuffer fromBase64(DOMString string);&lt;br /&gt;
};&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dt&amp;gt;Should this be a standalone API (as written), or live on e.g. DataView, or on e.g. String?&lt;br /&gt;
&amp;lt;dd&amp;gt;There seems to be pretty strong consensus that streaming, stateful coding is high priority and that an object-oriented API is cleanest, and shoe-horning those onto existing objects would be messy.&lt;br /&gt;
&amp;lt;/dl&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== API ==&lt;br /&gt;
&lt;br /&gt;
=== TextEncoder ===&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;WebIDL&#039;&#039;&#039;&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
dictionary TextEncodeOptions {&lt;br /&gt;
  boolean stream = false;&lt;br /&gt;
};&lt;br /&gt;
&lt;br /&gt;
[Constructor,&lt;br /&gt;
 Constructor(DOMString encoding)]&lt;br /&gt;
interface TextEncoder {&lt;br /&gt;
  readonly attribute DOMString encoding;&lt;br /&gt;
  ArrayBufferView encode(DOMString? string, optional TextEncodeOptions options);&lt;br /&gt;
};&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The constructor runs the following steps:&lt;br /&gt;
# If the constructor is called with no arguments, let &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt; be &amp;quot;&amp;lt;code&amp;gt;utf-8&amp;lt;/code&amp;gt;&amp;quot;. Otherwise, let &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt; be the value of the &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; argument.&lt;br /&gt;
# The constructor follows the &#039;&#039;&#039;steps to get an encoding&#039;&#039;&#039; from [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding], with &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt; as &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt;.&lt;br /&gt;
#* If the steps result in failure, throw an &amp;quot;&amp;lt;code&amp;gt;EncodingError&amp;lt;/code&amp;gt;&amp;quot; exception and terminate these steps.&lt;br /&gt;
#* Otherwise, set the encoder object&#039;s internal &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; property to the returned encoding.&lt;br /&gt;
# Initialize the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag of the encoder object to false. &lt;br /&gt;
# Initialize the internal &amp;lt;var&amp;gt;encoding algorithm state&amp;lt;/var&amp;gt; to the default values for the encoding &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dl&amp;gt;&lt;br /&gt;
&amp;lt;dt&amp;gt;&amp;lt;code&amp;gt;encoding&amp;lt;/code&amp;gt; of type DOMString, readonly&lt;br /&gt;
&amp;lt;dd&amp;gt;Returns the &amp;lt;var&amp;gt;Name&amp;lt;/var&amp;gt; of the encoder object&#039;s &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;, per [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding]. &lt;br /&gt;
&lt;br /&gt;
:Note that this may differ from the name of the encoding specified during the call to the constructor. For example, if the constructor is called with &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; of &amp;lt;code&amp;gt;&amp;quot;ascii&amp;quot;&amp;lt;/code&amp;gt; the &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; attribute of the &#039;&#039;encoder object&#039;&#039; would have the value &amp;lt;code&amp;gt;&amp;quot;windows-1252&amp;quot;&amp;lt;/code&amp;gt; as &amp;lt;code&amp;gt;&amp;quot;ascii&amp;quot;&amp;lt;/code&amp;gt; is a label for that encoding.&lt;br /&gt;
&amp;lt;/dl&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dl&amp;gt;&lt;br /&gt;
&amp;lt;dt&amp;gt;&amp;lt;code&amp;gt;encode&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;dd&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The &amp;lt;code&amp;gt;encode&amp;lt;/code&amp;gt; method runs these steps:&lt;br /&gt;
&lt;br /&gt;
# If the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag is not set, then reset the &amp;lt;var&amp;gt;encoding algorithm state&amp;lt;/var&amp;gt; to the default values for &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;. Otherwise, the &amp;lt;var&amp;gt;encoding algorithm state&amp;lt;/var&amp;gt; is re-used from the previous call to &amp;lt;code&amp;gt;encode&amp;lt;/code&amp;gt; on this object.&lt;br /&gt;
# If the &amp;lt;var&amp;gt;options&amp;lt;/var&amp;gt; parameter is specified and the &amp;lt;code&amp;gt;stream&amp;lt;/code&amp;gt; option is &#039;&#039;&#039;true&#039;&#039;&#039;, then the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag is set; otherwise the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag is cleared.&lt;br /&gt;
# Run the steps of the &amp;lt;var&amp;gt;encoding algorithm&amp;lt;/var&amp;gt;:&lt;br /&gt;
#* The input to the algorithm is a &amp;lt;var&amp;gt;stream of code points&amp;lt;/var&amp;gt;. The stream is composed of the Unicode code points for the Unicode characters produced by following the steps to &#039;&#039;[http://dev.w3.org/2006/webapi/WebIDL/#dfn-obtain-unicode convert a DOMString to a sequence of Unicode characters]&#039;&#039; in [http://dev.w3.org/2006/webapi/WebIDL/ WebIDL] with &amp;lt;var&amp;gt;string&amp;lt;/var&amp;gt; as the input. If &amp;lt;var&amp;gt;string&amp;lt;/var&amp;gt; is null, the &amp;lt;var&amp;gt;stream of code points&amp;lt;/var&amp;gt; is empty.&lt;br /&gt;
#* If the &amp;lt;var&amp;gt;options&amp;lt;/var&amp;gt; parameter is not specified or the &amp;lt;code&amp;gt;stream&amp;lt;/code&amp;gt; option is false, then after the final code point is yielded by the stream, the &#039;&#039;&#039;EOF code point&#039;&#039;&#039; is yielded.&lt;br /&gt;
#* The output of the algorithm is a &amp;lt;var&amp;gt;sequence of emitted bytes&amp;lt;/var&amp;gt;.&lt;br /&gt;
# Return a &amp;lt;code&amp;gt;Uint8Array&amp;lt;/code&amp;gt; object wrapping an &amp;lt;code&amp;gt;ArrayBuffer&amp;lt;/code&amp;gt; containing the &amp;lt;var&amp;gt;sequence of emitted bytes&amp;lt;/var&amp;gt; produced by the encoding algorithm.&lt;br /&gt;
&amp;lt;/dl&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== TextDecoder ===&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;WebIDL&#039;&#039;&#039;&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
dictionary TextDecoderOptions {&lt;br /&gt;
  boolean fatal = false;&lt;br /&gt;
};&lt;br /&gt;
&lt;br /&gt;
dictionary TextDecodeOptions {&lt;br /&gt;
  boolean stream = false;&lt;br /&gt;
};&lt;br /&gt;
&lt;br /&gt;
[Constructor,&lt;br /&gt;
 Constructor(optional DOMString encoding, optional TextDecoderOptions options)]&lt;br /&gt;
interface TextDecoder {&lt;br /&gt;
  readonly attribute DOMString encoding;&lt;br /&gt;
  DOMString decode(optional ArrayBufferView view, optional TextDecodeOptions options);&lt;br /&gt;
};&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The constructor runs the following steps:&lt;br /&gt;
# The constructor creates a &#039;&#039;decoder object&#039;&#039;. It has the internal properties &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;, a &amp;lt;var&amp;gt;fatal&amp;lt;/var&amp;gt; flag which is initially unset, and a &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag which is initially unset.&lt;br /&gt;
# If called without an &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; argument, let &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt; be &amp;quot;&amp;lt;code&amp;gt;utf-8&amp;lt;/code&amp;gt;&amp;quot;. Otherwise, let &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt; be the value of the &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; argument.&lt;br /&gt;
# Run the &#039;&#039;&#039;steps to get an encoding&#039;&#039;&#039; from [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding], with &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt; as &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt;. &lt;br /&gt;
## If the steps result in failure, throw an &amp;quot;&amp;lt;code&amp;gt;EncodingError&amp;lt;/code&amp;gt;&amp;quot; exception and terminate these steps.&lt;br /&gt;
## Otherwise, set the &#039;&#039;decoder object&#039;s&#039;&#039; internal &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; property to the returned encoding.&lt;br /&gt;
# If the constructor is called with an &amp;lt;var&amp;gt;options&amp;lt;/var&amp;gt; argument, and the &amp;lt;var&amp;gt;fatal&amp;lt;/var&amp;gt; property of the dictionary is set, set the internal &amp;lt;var&amp;gt;fatal&amp;lt;/var&amp;gt; flag of the &#039;&#039;decoder object&#039;&#039;.&lt;br /&gt;
# Initialize the internal &amp;lt;var&amp;gt;encoding algorithm state&amp;lt;/var&amp;gt; to the default values for the encoding &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;.&lt;br /&gt;
# Return the &#039;&#039;decoder object&#039;&#039;.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dl&amp;gt;&lt;br /&gt;
&amp;lt;dt&amp;gt;&amp;lt;code&amp;gt;encoding&amp;lt;/code&amp;gt; of type DOMString, readonly&lt;br /&gt;
&amp;lt;dd&amp;gt;&lt;br /&gt;
Returns the &amp;lt;var&amp;gt;Name&amp;lt;/var&amp;gt; of the decoder object&#039;s &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;, per [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding].&lt;br /&gt;
&lt;br /&gt;
:Note that this may differ from the name of the encoding specified during the call to the constructor. For example, if the constructor is called with &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; of &amp;lt;code&amp;gt;&amp;quot;ascii&amp;quot;&amp;lt;/code&amp;gt; the &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; attribute of the &#039;&#039;decoder object&#039;&#039; would have the value &amp;lt;code&amp;gt;&amp;quot;windows-1252&amp;quot;&amp;lt;/code&amp;gt; as &amp;lt;code&amp;gt;&amp;quot;ascii&amp;quot;&amp;lt;/code&amp;gt; is a label for that encoding.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dt&amp;gt;&amp;lt;code&amp;gt;decode&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;dd&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The &amp;lt;code&amp;gt;decode&amp;lt;/code&amp;gt; method runs the following steps:&lt;br /&gt;
&lt;br /&gt;
# If the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag of the &#039;&#039;decoder object&#039;&#039; is not set, then reset the &amp;lt;var&amp;gt;encoding algorithm state&amp;lt;/var&amp;gt; to the default values for encoding &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;. Otherwise, the &amp;lt;var&amp;gt;encoding algorithm state&amp;lt;/var&amp;gt; is re-used from the previous call to &amp;lt;code&amp;gt;decode&amp;lt;/code&amp;gt; on this object. &lt;br /&gt;
# If the &amp;lt;var&amp;gt;options&amp;lt;/var&amp;gt; parameter is specified and the &amp;lt;var&amp;gt;stream&amp;lt;/var&amp;gt; option is &#039;&#039;&#039;true&#039;&#039;&#039;, then the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag is set; otherwise the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag is cleared.&lt;br /&gt;
# Run the decoder algorithm of the &#039;&#039;decoder object&#039;s&#039;&#039; encoding:&lt;br /&gt;
#* The input to the algorithm is a &amp;lt;var&amp;gt;byte stream&amp;lt;/var&amp;gt;. The &amp;lt;var&amp;gt;byte stream&amp;lt;/var&amp;gt; is provided by the bytes in &amp;lt;var&amp;gt;view.buffer&amp;lt;/var&amp;gt; starting at offset &amp;lt;var&amp;gt;view.byteOffset&amp;lt;/var&amp;gt;. A maximum of &amp;lt;var&amp;gt;view.byteLength&amp;lt;/var&amp;gt; bytes are yielded by the stream from &amp;lt;var&amp;gt;view.buffer&amp;lt;/var&amp;gt;. If &amp;lt;var&amp;gt;view&amp;lt;/var&amp;gt; is not specified, the stream is empty.&lt;br /&gt;
#* If the &amp;lt;var&amp;gt;options&amp;lt;/var&amp;gt; parameter is not specified or the &#039;&#039;&#039;stream&#039;&#039;&#039; option is &#039;&#039;&#039;false&#039;&#039;&#039;, then after &amp;lt;var&amp;gt;view.byteLength&amp;lt;/var&amp;gt; bytes are yielded by the stream, the &#039;&#039;&#039;EOF byte&#039;&#039;&#039; is yielded.&lt;br /&gt;
#* If the internal &amp;lt;var&amp;gt;fatal&amp;lt;/var&amp;gt; flag of the &#039;&#039;decoder object&#039;&#039; is set, then a &#039;&#039;&#039;decoder error&#039;&#039;&#039; causes a &amp;lt;code&amp;gt;DOMException&amp;lt;/code&amp;gt; of type &amp;lt;code&amp;gt;EncodingError&amp;lt;/code&amp;gt; to be thrown rather than emitting a fallback code point.&lt;br /&gt;
#* The output of the algorithm is a sequence of &amp;lt;var&amp;gt;emitted code points&amp;lt;/var&amp;gt;.&lt;br /&gt;
# Return a &amp;lt;code&amp;gt;DOMString&amp;lt;/code&amp;gt; by encoding the &amp;lt;var&amp;gt;sequence of emitted code points&amp;lt;/var&amp;gt; following the steps to &amp;lt;i&amp;gt;convert a sequence of Unicode characters to a DOMString&amp;lt;/i&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
:&#039;&#039;ISSUE: Need to handle BOMs. [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding] specifies this by using the caller-specified encoding as a suggestion, and consuming the BOM as a part of selecting the real encoder where BOM takes precedence. At the very least a matching BOM should be ignored. A mismatching BOM could throw, be a decoding error, or could actually switch the decoder for this stream (call or sequence of calls), possibly if-and-only-if the constructor was called without an encoding.&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/dl&amp;gt;&lt;br /&gt;
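&lt;br /&gt;
The streaming and fatal behaviour described above can be sketched as follows. This assumes an environment that provides a &amp;lt;code&amp;gt;TextDecoder&amp;lt;/code&amp;gt; global accepting these options. The byte values are illustrative, and the type of the thrown exception is deliberately not checked, since implementations may not use the &amp;lt;code&amp;gt;EncodingError&amp;lt;/code&amp;gt; named here.&lt;br /&gt;
&lt;br /&gt;
```javascript
// Assumes a TextDecoder global that accepts { stream: true } in decode().
const decoder = new TextDecoder('utf-8');

// U+20AC (euro sign) is E2 82 AC in UTF-8; split it across two chunks.
const chunk1 = new Uint8Array([0x41, 0xE2, 0x82]);
const chunk2 = new Uint8Array([0xAC, 0x42]);

// With stream set, the incomplete byte sequence is kept as decoder state.
const part1 = decoder.decode(chunk1, { stream: true });
// Without stream, the end of input is signalled and the state is flushed.
const part2 = decoder.decode(chunk2);

// A separate decoder with the fatal flag throws on invalid input
// instead of emitting a fallback code point.
const strict = new TextDecoder('utf-8', { fatal: true });
let threw = false;
try {
  strict.decode(new Uint8Array([0xFF])); // 0xFF never occurs in valid UTF-8
} catch (e) {
  threw = true;
}
```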
&lt;br /&gt;
== Algorithms ==&lt;br /&gt;
&lt;br /&gt;
:TODO: Move the following to [http://dev.w3.org/2006/webapi/WebIDL/#idl-DOMString WebIDL]&lt;br /&gt;
&lt;br /&gt;
The following algorithm defines a way to &amp;lt;b&amp;gt;&amp;lt;i&amp;gt;convert a sequence of Unicode characters to a DOMString&amp;lt;/i&amp;gt;&amp;lt;/b&amp;gt;:&lt;br /&gt;
# Let &amp;lt;var&amp;gt;U&amp;lt;/var&amp;gt;&amp;lt;sub&amp;gt;0...&amp;lt;var&amp;gt;n&amp;lt;/var&amp;gt;-1&amp;lt;/sub&amp;gt; be the sequence of Unicode characters &lt;br /&gt;
# Initialize &amp;lt;var&amp;gt;i&amp;lt;/var&amp;gt; to 0&lt;br /&gt;
# Initialize &amp;lt;var&amp;gt;S&amp;lt;/var&amp;gt; to be an empty sequence of [http://dev.w3.org/2006/webapi/WebIDL/#dfn-code-unit code units]&lt;br /&gt;
# While &amp;lt;var&amp;gt;i&amp;lt;/var&amp;gt; &amp;lt; &amp;lt;var&amp;gt;n&amp;lt;/var&amp;gt;&lt;br /&gt;
## Let &amp;lt;var&amp;gt;c&amp;lt;/var&amp;gt; be the &amp;lt;b&amp;gt;code point&amp;lt;/b&amp;gt; of the Unicode character in &amp;lt;var&amp;gt;U&amp;lt;/var&amp;gt; at index &amp;lt;var&amp;gt;i&amp;lt;/var&amp;gt;&lt;br /&gt;
## If &amp;lt;var&amp;gt;c&amp;lt;/var&amp;gt; &amp;amp;ge; 2&amp;lt;sup&amp;gt;16&amp;lt;/sup&amp;gt;, then:&lt;br /&gt;
### Append to &amp;lt;var&amp;gt;S&amp;lt;/var&amp;gt; a [http://dev.w3.org/2006/webapi/WebIDL/#dfn-code-unit code unit] equal to (&amp;lt;var&amp;gt;c&amp;lt;/var&amp;gt; - 2&amp;lt;sup&amp;gt;16&amp;lt;/sup&amp;gt;) / 2&amp;lt;sup&amp;gt;10&amp;lt;/sup&amp;gt; + 0xD800, where &amp;quot;/&amp;quot; represents integer division.&lt;br /&gt;
### Append to &amp;lt;var&amp;gt;S&amp;lt;/var&amp;gt; a [http://dev.w3.org/2006/webapi/WebIDL/#dfn-code-unit code unit] equal to (&amp;lt;var&amp;gt;c&amp;lt;/var&amp;gt; - 2&amp;lt;sup&amp;gt;16&amp;lt;/sup&amp;gt;) % 2&amp;lt;sup&amp;gt;10&amp;lt;/sup&amp;gt; + 0xDC00, where &amp;quot;%&amp;quot; represents the remainder of an integer division.&lt;br /&gt;
## Otherwise, append to &amp;lt;var&amp;gt;S&amp;lt;/var&amp;gt; a [http://dev.w3.org/2006/webapi/WebIDL/#dfn-code-unit code unit] equal to &amp;lt;var&amp;gt;c&amp;lt;/var&amp;gt;.&lt;br /&gt;
## Set &amp;lt;var&amp;gt;i&amp;lt;/var&amp;gt; to &amp;lt;var&amp;gt;i&amp;lt;/var&amp;gt;+1&lt;br /&gt;
# Return the IDL [http://dev.w3.org/2006/webapi/WebIDL/#idl-DOMString DOMString] value that represents the sequence of code units &amp;lt;var&amp;gt;S&amp;lt;/var&amp;gt;.&lt;br /&gt;
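&lt;br /&gt;
The steps above can be sketched in ECMAScript as follows. The function name &amp;lt;code&amp;gt;codePointsToDOMString&amp;lt;/code&amp;gt; is hypothetical, and the input is assumed to be an array of valid code points; this illustrates the arithmetic and is not a normative implementation.&lt;br /&gt;
&lt;br /&gt;
```javascript
// Hypothetical helper: converts an array of Unicode code points to a string
// of UTF-16 code units, following the numbered steps above.
function codePointsToDOMString(codePoints) {
  const units = [];                      // S: the sequence of code units
  for (const c of codePoints) {          // c: the current code point
    if (c >= 0x10000) {                  // 2^16 and above: surrogate pair
      units.push(Math.floor((c - 0x10000) / 0x400) + 0xD800);
      units.push((c - 0x10000) % 0x400 + 0xDC00);
    } else {
      units.push(c);
    }
  }
  // Build the string one code unit at a time.
  return units.map((u) => String.fromCharCode(u)).join('');
}

const result = codePointsToDOMString([0x48, 0x69, 0x1F600]);
```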
&lt;br /&gt;
== Examples ==&lt;br /&gt;
&lt;br /&gt;
=== Example #1 - encoding strings ===&lt;br /&gt;
&lt;br /&gt;
The following example uses the API to encode an array of strings into an ArrayBuffer. The result contains the number of strings (as a Uint32), followed by the length of the first string (as a Uint32), the encoded string data, the length of the second string (as a Uint32), the string data, and so on.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
function encodeArrayOfStrings(strings, encoding) {&lt;br /&gt;
  var encoder, encoded, len, i, bytes, view, offset;&lt;br /&gt;
&lt;br /&gt;
  encoder = new TextEncoder(encoding);&lt;br /&gt;
  encoded = [];&lt;br /&gt;
&lt;br /&gt;
  len = Uint32Array.BYTES_PER_ELEMENT;&lt;br /&gt;
  for (i = 0; i &amp;lt; strings.length; i += 1) {&lt;br /&gt;
    len += Uint32Array.BYTES_PER_ELEMENT;&lt;br /&gt;
    encoded[i] = encoder.encode(strings[i]);&lt;br /&gt;
    len += encoded[i].byteLength;&lt;br /&gt;
  }&lt;br /&gt;
&lt;br /&gt;
  bytes = new Uint8Array(len);&lt;br /&gt;
  view = new DataView(bytes.buffer);&lt;br /&gt;
  offset = 0;&lt;br /&gt;
&lt;br /&gt;
  view.setUint32(offset, strings.length);&lt;br /&gt;
  offset += Uint32Array.BYTES_PER_ELEMENT;&lt;br /&gt;
  for (i = 0; i &amp;lt; encoded.length; i += 1) {&lt;br /&gt;
    len = encoded[i].byteLength;&lt;br /&gt;
    view.setUint32(offset, len);&lt;br /&gt;
    offset += Uint32Array.BYTES_PER_ELEMENT;&lt;br /&gt;
    bytes.set(encoded[i], offset);&lt;br /&gt;
    offset += len;&lt;br /&gt;
  }&lt;br /&gt;
  return bytes.buffer;&lt;br /&gt;
}&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Example #2 - decoding strings ===&lt;br /&gt;
&lt;br /&gt;
The following example decodes an ArrayBuffer containing data encoded in the format produced by the previous example back into an array of strings.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
function decodeArrayOfStrings(buffer, encoding) {&lt;br /&gt;
  var decoder, view, offset, num_strings, strings, i, len;&lt;br /&gt;
&lt;br /&gt;
  decoder = new TextDecoder(encoding);&lt;br /&gt;
  view = new DataView(buffer);&lt;br /&gt;
  offset = 0;&lt;br /&gt;
  strings = [];&lt;br /&gt;
&lt;br /&gt;
  num_strings = view.getUint32(offset);&lt;br /&gt;
  offset += Uint32Array.BYTES_PER_ELEMENT;&lt;br /&gt;
  for (i = 0; i &amp;lt; num_strings; i += 1) {&lt;br /&gt;
    len = view.getUint32(offset);&lt;br /&gt;
    offset += Uint32Array.BYTES_PER_ELEMENT;&lt;br /&gt;
    strings[i] = decoder.decode(&lt;br /&gt;
      new DataView(view.buffer, offset, len));&lt;br /&gt;
    offset += len;&lt;br /&gt;
  }&lt;br /&gt;
  return strings;&lt;br /&gt;
}&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Encodings ==&lt;br /&gt;
&lt;br /&gt;
Encodings are defined and implemented per [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding]. This implicitly includes the steps to &amp;lt;em&amp;gt;get an encoding&amp;lt;/em&amp;gt; from a string, and the logic for label matching and case-insensitivity.&lt;br /&gt;
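&lt;br /&gt;
Label matching can be observed through the &amp;lt;code&amp;gt;encoding&amp;lt;/code&amp;gt; attribute. The sketch below assumes a &amp;lt;code&amp;gt;TextDecoder&amp;lt;/code&amp;gt; global and uses only UTF-8 labels, because support for other encodings varies between environments. The label &amp;lt;code&amp;gt;no-such-encoding&amp;lt;/code&amp;gt; is made up for illustration, and the type of the exception it triggers is not checked.&lt;br /&gt;
&lt;br /&gt;
```javascript
// Different labels, in any letter case, resolve to the same encoding name.
const labels = ['utf-8', 'UTF-8', 'utf8', 'Utf8'];
const names = labels.map((label) => new TextDecoder(label).encoding);

// An unknown label makes the constructor throw.
let rejected = false;
try {
  new TextDecoder('no-such-encoding'); // made-up label, for illustration
} catch (e) {
  rejected = true;
}
```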
&lt;br /&gt;
User agents MUST NOT support any other encodings or labels than those defined in [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding], and MUST support all encodings and labels defined in that specification, with the additions defined below.&lt;br /&gt;
&lt;br /&gt;
:&#039;&#039;NOTE: In [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding], &amp;quot;ascii&amp;quot; is a label for &#039;&#039;&#039;windows-1252&#039;&#039;&#039;; there is no 7-bit-or-raise-exception encoding. Applications that are required to restrict the content of decoded strings should implement validation after decoding.&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
:&#039;&#039;NOTE: Unicode normalization forms are outside the scope of this specification. No normalization is done prior to encoding or after decoding.&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
:&#039;&#039;NOTE: Handling of encoding-specific issues, e.g. over-long UTF-8 encodings, byte order marks, unmatched surrogate pairs, and so on is defined by [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding].&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
== References ==&lt;br /&gt;
&lt;br /&gt;
* WebIDL http://dev.w3.org/2006/webapi/WebIDL&lt;br /&gt;
* Encoding http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== Acknowledgements ==&lt;br /&gt;
&lt;br /&gt;
* Alan Chaney&lt;br /&gt;
* Ben Noordhuis&lt;br /&gt;
* Glenn Maynard&lt;br /&gt;
* John Tamplin&lt;br /&gt;
* Kenneth Russell (Google, Inc)&lt;br /&gt;
* Robert Mustacchi&lt;br /&gt;
* Ryan Dahl&lt;br /&gt;
* Anne van Kesteren&lt;br /&gt;
&lt;br /&gt;
== Appendix ==&lt;br /&gt;
&lt;br /&gt;
A &amp;quot;shim&amp;quot; implementation in JavaScript (that may not fully match the current version of the spec) plus some initial unit tests can be found at:&lt;br /&gt;
&lt;br /&gt;
:http://code.google.com/p/stringencoding/&lt;br /&gt;
&lt;br /&gt;
[[Category:Proposals]]&lt;/div&gt;</summary>
		<author><name>Jsbell</name></author>
	</entry>
	<entry>
		<id>https://wiki.whatwg.org/index.php?title=StringEncoding&amp;diff=8351</id>
		<title>StringEncoding</title>
		<link rel="alternate" type="text/html" href="https://wiki.whatwg.org/index.php?title=StringEncoding&amp;diff=8351"/>
		<updated>2012-06-27T17:25:09Z</updated>

		<summary type="html">&lt;p&gt;Jsbell: Define Unicode character sequence -&amp;gt; DOMString&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;Proposed Text Encoding Web API for Typed Arrays&lt;br /&gt;
&lt;br /&gt;
== Editors ==&lt;br /&gt;
&lt;br /&gt;
* Joshua Bell (Google, Inc)&lt;br /&gt;
&lt;br /&gt;
== Abstract ==&lt;br /&gt;
&lt;br /&gt;
This specification defines an API for encoding strings to binary data, and decoding strings from binary data.&lt;br /&gt;
&lt;br /&gt;
NOTE: This specification intentionally does not address the opposite scenario of encoding binary data as strings and decoding binary data from strings, for example using Base64 encoding.&lt;br /&gt;
&lt;br /&gt;
Discussion on this topic has so far taken place on the [http://www.khronos.org/webgl/public-mailing-list/ public_webgl@khronos.org] mailing list. See http://www.khronos.org/webgl/public-mailing-list/archives/1111/msg00017.html for the initial discussion thread.&lt;br /&gt;
&lt;br /&gt;
Discussion has since moved to the [http://www.whatwg.org/mailing-list#specs WHATWG spec discussion] mailing list. See  http://lists.whatwg.org/htdig.cgi/whatwg-whatwg.org/2012-March/035038.html for the latest discussion thread.&lt;br /&gt;
&lt;br /&gt;
== Open Issues ==&lt;br /&gt;
&lt;br /&gt;
* Encoding errors&lt;br /&gt;
** Define options dict for &amp;lt;code&amp;gt;TextEncoder()&amp;lt;/code&amp;gt; that allows selecting between throw vs. replacement character / replacement callback?&lt;br /&gt;
* The [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding specification] defines the byte order mark as more authoritative than an explicit encoding label when decoding content. &lt;br /&gt;
** Is that desirable here? &lt;br /&gt;
** What does it mean to e.g. write &amp;lt;code&amp;gt;TextDecoder(&#039;iso-2022-kr&#039;).decode(str)&amp;lt;/code&amp;gt; if string turns out to be UTF-8?&lt;br /&gt;
** What about streaming and other re-uses of the same decoder object?&lt;br /&gt;
* Streaming decode/encode requires keeping buffers between calls.&lt;br /&gt;
** Some encode/decode algorithms require adjusting the &amp;lt;var&amp;gt;code point pointer&amp;lt;/var&amp;gt; or &amp;lt;var&amp;gt;byte pointer&amp;lt;/var&amp;gt; by a negative amount. This could occur across &amp;quot;chunk&amp;quot; boundaries. This implies that when the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag is set on an encoder/decoder, the last N elements of the stream are saved and used as a prefix for the stream in the next call. N is defined by the specific encoding algorithm.&lt;br /&gt;
** &#039;&#039;This is more of a note to implementers than an issue, but it may merit further discussion.&#039;&#039;&lt;br /&gt;
* &amp;lt;b&amp;gt;&amp;lt;i&amp;gt;Converting a sequence of Unicode characters to a DOMString&amp;lt;/i&amp;gt;&amp;lt;/b&amp;gt; needs to be defined. &lt;br /&gt;
** [http://dev.w3.org/2006/webapi/WebIDL/#idl-DOMString WebIDL] currently only defines the reverse, &amp;lt;b&amp;gt;&amp;lt;i&amp;gt;convert a DOMString to a sequence of Unicode characters&amp;lt;/i&amp;gt;&amp;lt;/b&amp;gt;, but that&#039;s probably the right place to define it in the platform.&lt;br /&gt;
&lt;br /&gt;
== Resolved Issues ==&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dl&amp;gt;&lt;br /&gt;
&amp;lt;dt&amp;gt;Encode as many characters as possible into a fixed-size buffer for transmission, and repeat starting with next unencoded character&lt;br /&gt;
&amp;lt;dd&amp;gt;Resolution: not for &amp;quot;v1&amp;quot; - can be implemented using this API, and breaking the string down by &amp;quot;character&amp;quot; is unlikely to be as obvious as it sounds - surrogate pairs, combining sequences, etc&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dt&amp;gt;Support two versions of encode; one which takes target buffer, and one which creates/returns a right-sized buffer&lt;br /&gt;
&amp;lt;dd&amp;gt;Resolution: See above, and wait for developer feedback.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dt&amp;gt;Allow arbitrary end byte sequences (e.g. &amp;lt;code&amp;gt;0xFF&amp;lt;/code&amp;gt; for UTF-8 strings)&lt;br /&gt;
&amp;lt;dd&amp;gt;Resolution: Add &amp;lt;code&amp;gt;indexOf&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;ArrayBufferView&amp;lt;/code&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dt&amp;gt;Remove &#039;&#039;binary&#039;&#039; encoding? &#039;&#039;(a proposed 8-bit-clean encoding for interop with legacy binary data stored in ECMAScript strings)&#039;&#039;&lt;br /&gt;
&amp;lt;dd&amp;gt;The only real use is with &amp;lt;code&amp;gt;atob()&amp;lt;/code&amp;gt;/&amp;lt;code&amp;gt;btoa()&amp;lt;/code&amp;gt;; a better API would be Base64 directly in/out of Typed Arrays. Consensus on WHATWG is to add better APIs e.g. &lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
partial interface ArrayBufferView {&lt;br /&gt;
    DOMString toBase64();&lt;br /&gt;
};&lt;br /&gt;
&lt;br /&gt;
partial interface ArrayBuffer {&lt;br /&gt;
    static ArrayBuffer fromBase64(DOMString string);&lt;br /&gt;
};&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dt&amp;gt;Should this be a standalone API (as written), or live on e.g. DataView, or on e.g. String?&lt;br /&gt;
&amp;lt;dd&amp;gt;There seems to be pretty strong consensus that streaming, stateful coding is high priority and that an object-oriented API is cleanest, and shoe-horning those onto existing objects would be messy.&lt;br /&gt;
&amp;lt;/dl&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== API ==&lt;br /&gt;
&lt;br /&gt;
=== TextEncoder ===&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;WebIDL&#039;&#039;&#039;&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
dictionary TextEncodeOptions {&lt;br /&gt;
  boolean stream = false;&lt;br /&gt;
};&lt;br /&gt;
&lt;br /&gt;
[Constructor,&lt;br /&gt;
 Constructor(DOMString encoding)]&lt;br /&gt;
interface TextEncoder {&lt;br /&gt;
  readonly attribute DOMString encoding;&lt;br /&gt;
  ArrayBufferView encode(DOMString? string, optional TextEncodeOptions options);&lt;br /&gt;
};&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The constructor runs the following steps:&lt;br /&gt;
# If the constructor is called with no arguments, let &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt; be &amp;quot;&amp;lt;code&amp;gt;utf-8&amp;lt;/code&amp;gt;&amp;quot;. Otherwise, let &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt; be the value of the &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; argument.&lt;br /&gt;
# The constructor follows the &#039;&#039;&#039;steps to get an encoding&#039;&#039;&#039; from [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding], with &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt; as &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt;.&lt;br /&gt;
#* If the steps result in failure, throw an &amp;quot;&amp;lt;code&amp;gt;EncodingError&amp;lt;/code&amp;gt;&amp;quot; exception and terminate these steps.&lt;br /&gt;
#* Otherwise, set the encoder object&#039;s internal &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; property to the returned encoding.&lt;br /&gt;
# Initialize the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag of the encoder object to false. &lt;br /&gt;
# Initialize the internal &amp;lt;var&amp;gt;encoding algorithm state&amp;lt;/var&amp;gt; to the default values for the encoding &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dl&amp;gt;&lt;br /&gt;
&amp;lt;dt&amp;gt;&amp;lt;code&amp;gt;encoding&amp;lt;/code&amp;gt; of type DOMString, readonly&lt;br /&gt;
&amp;lt;dd&amp;gt;Returns the &amp;lt;var&amp;gt;Name&amp;lt;/var&amp;gt; of the encoder object&#039;s &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;, per [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding]. &lt;br /&gt;
&lt;br /&gt;
:Note that this may differ from the name of the encoding specified during the call to the constructor. For example, if the constructor is called with &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; of &amp;lt;code&amp;gt;&amp;quot;ascii&amp;quot;&amp;lt;/code&amp;gt; the &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; attribute of the &#039;&#039;encoder object&#039;&#039; would have the value &amp;lt;code&amp;gt;&amp;quot;windows-1252&amp;quot;&amp;lt;/code&amp;gt; as &amp;lt;code&amp;gt;&amp;quot;ascii&amp;quot;&amp;lt;/code&amp;gt; is a label for that encoding.&lt;br /&gt;
&amp;lt;/dl&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dl&amp;gt;&lt;br /&gt;
&amp;lt;dt&amp;gt;&amp;lt;code&amp;gt;encode&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;dd&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The &amp;lt;code&amp;gt;encode&amp;lt;/code&amp;gt; method runs these steps:&lt;br /&gt;
&lt;br /&gt;
# If the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag is not set, then reset the &amp;lt;var&amp;gt;encoding algorithm state&amp;lt;/var&amp;gt; to the default values for &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;. Otherwise, the &amp;lt;var&amp;gt;encoding algorithm state&amp;lt;/var&amp;gt; is re-used from the previous call to &amp;lt;code&amp;gt;encode&amp;lt;/code&amp;gt; on this object.&lt;br /&gt;
# If the &amp;lt;var&amp;gt;options&amp;lt;/var&amp;gt; parameter is specified and the &amp;lt;code&amp;gt;stream&amp;lt;/code&amp;gt; option is &#039;&#039;&#039;true&#039;&#039;&#039;, then the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag is set; otherwise the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag is cleared.&lt;br /&gt;
# Run the steps of the &amp;lt;var&amp;gt;encoding algorithm&amp;lt;/var&amp;gt;:&lt;br /&gt;
#* The input to the algorithm is a &amp;lt;var&amp;gt;stream of code points&amp;lt;/var&amp;gt;. The stream is composed of the Unicode code points for the Unicode characters produced by following the steps to &#039;&#039;[http://dev.w3.org/2006/webapi/WebIDL/#dfn-obtain-unicode convert a DOMString to a sequence of Unicode characters]&#039;&#039; in [http://dev.w3.org/2006/webapi/WebIDL/ WebIDL] with &amp;lt;var&amp;gt;string&amp;lt;/var&amp;gt; as the input. If &amp;lt;var&amp;gt;string&amp;lt;/var&amp;gt; is null, the &amp;lt;var&amp;gt;stream of code points&amp;lt;/var&amp;gt; is empty.&lt;br /&gt;
#* If the &amp;lt;var&amp;gt;options&amp;lt;/var&amp;gt; parameter is not specified or the &amp;lt;code&amp;gt;stream&amp;lt;/code&amp;gt; option is false, then after the final code point is yielded by the stream, the &#039;&#039;&#039;EOF code point&#039;&#039;&#039; is yielded.&lt;br /&gt;
#* The output of the algorithm is a &amp;lt;var&amp;gt;sequence of emitted bytes&amp;lt;/var&amp;gt;.&lt;br /&gt;
# Return a &amp;lt;code&amp;gt;Uint8Array&amp;lt;/code&amp;gt; object wrapping an &amp;lt;code&amp;gt;ArrayBuffer&amp;lt;/code&amp;gt; containing the &amp;lt;var&amp;gt;sequence of emitted bytes&amp;lt;/var&amp;gt; produced by the encoding algorithm.&lt;br /&gt;
&amp;lt;/dl&amp;gt;&lt;br /&gt;
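&lt;br /&gt;
The &amp;lt;code&amp;gt;encode&amp;lt;/code&amp;gt; steps above can be tried as a minimal sketch in runtimes that ship the later, standardized &amp;lt;code&amp;gt;TextEncoder&amp;lt;/code&amp;gt;. This assumes that interface rather than the draft: it encodes to UTF-8 only and has no &amp;lt;code&amp;gt;stream&amp;lt;/code&amp;gt; option, so only the non-streaming path is shown. The variable names are illustrative.&lt;br /&gt;
&lt;br /&gt;
```javascript
// Sketch only: relies on the TextEncoder global of current browsers and Node.js,
// which is UTF-8-only and has no "stream" option, unlike the draft above.
const encoder = new TextEncoder();

// The encoding attribute reports the name of the encoding in use.
const encodingName = encoder.encoding;

// U+20AC (euro sign) is three bytes in UTF-8: E2 82 AC.
const euroBytes = encoder.encode("\u20AC");

// The result is a Uint8Array view over the emitted bytes.
console.log(encodingName, euroBytes instanceof Uint8Array, Array.from(euroBytes));
```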
&lt;br /&gt;
=== TextDecoder ===&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;WebIDL&#039;&#039;&#039;&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
dictionary TextDecoderOptions {&lt;br /&gt;
  boolean fatal = false;&lt;br /&gt;
};&lt;br /&gt;
&lt;br /&gt;
dictionary TextDecodeOptions {&lt;br /&gt;
  boolean stream = false;&lt;br /&gt;
};&lt;br /&gt;
&lt;br /&gt;
[Constructor,&lt;br /&gt;
 Constructor(optional DOMString encoding, optional TextDecoderOptions options)]&lt;br /&gt;
interface TextDecoder {&lt;br /&gt;
  readonly attribute DOMString encoding;&lt;br /&gt;
  DOMString decode(optional ArrayBufferView view, optional TextDecodeOptions options);&lt;br /&gt;
};&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The constructor runs the following steps:&lt;br /&gt;
# The constructor creates a &#039;&#039;decoder object&#039;&#039;. It has the internal properties &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;, a &amp;lt;var&amp;gt;fatal&amp;lt;/var&amp;gt; flag which is initially unset, and a &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag which is initially unset.&lt;br /&gt;
# If called without an &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; argument, let &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt; be &amp;quot;&amp;lt;code&amp;gt;utf-8&amp;lt;/code&amp;gt;&amp;quot;. Otherwise, let &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt; be the value of the &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; argument.&lt;br /&gt;
# Run the &#039;&#039;&#039;steps to get an encoding&#039;&#039;&#039; from [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding], with &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt; as &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt;. &lt;br /&gt;
## If the steps result in failure, throw an &amp;quot;&amp;lt;code&amp;gt;EncodingError&amp;lt;/code&amp;gt;&amp;quot; exception and terminate these steps.&lt;br /&gt;
## Otherwise, set the &#039;&#039;decoder object&#039;s&#039;&#039; internal &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; property to the returned encoding.&lt;br /&gt;
# If the constructor is called with an &amp;lt;var&amp;gt;options&amp;lt;/var&amp;gt; argument, and the &amp;lt;var&amp;gt;fatal&amp;lt;/var&amp;gt; property of the dictionary is set, set the internal &amp;lt;var&amp;gt;fatal&amp;lt;/var&amp;gt; flag of the &#039;&#039;decoder object&#039;&#039;.&lt;br /&gt;
# Initialize the internal &amp;lt;var&amp;gt;encoding algorithm state&amp;lt;/var&amp;gt; to the default values for the encoding &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;.&lt;br /&gt;
# Return the &#039;&#039;decoder object&#039;&#039;.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dl&amp;gt;&lt;br /&gt;
&amp;lt;dt&amp;gt;&amp;lt;code&amp;gt;encoding&amp;lt;/code&amp;gt; of type DOMString, readonly&lt;br /&gt;
&amp;lt;dd&amp;gt;&lt;br /&gt;
Returns the &amp;lt;var&amp;gt;Name&amp;lt;/var&amp;gt; of the decoder object&#039;s &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;, per [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding].&lt;br /&gt;
&lt;br /&gt;
:Note that this may differ from the name of the encoding specified during the call to the constructor. For example, if the constructor is called with &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; of &amp;lt;code&amp;gt;&amp;quot;ascii&amp;quot;&amp;lt;/code&amp;gt; the &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; attribute of the &#039;&#039;decoder object&#039;&#039; would have the value &amp;lt;code&amp;gt;&amp;quot;windows-1252&amp;quot;&amp;lt;/code&amp;gt; as &amp;lt;code&amp;gt;&amp;quot;ascii&amp;quot;&amp;lt;/code&amp;gt; is a label for that encoding.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dt&amp;gt;&amp;lt;code&amp;gt;decode&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;dd&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The &amp;lt;code&amp;gt;decode&amp;lt;/code&amp;gt; method runs the following steps:&lt;br /&gt;
&lt;br /&gt;
# If the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag of the &#039;&#039;decoder object&#039;&#039; is not set, then reset the &amp;lt;var&amp;gt;encoding algorithm state&amp;lt;/var&amp;gt; to the default values for encoding &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;. Otherwise, the &amp;lt;var&amp;gt;encoding algorithm state&amp;lt;/var&amp;gt; is re-used from the previous call to &amp;lt;code&amp;gt;decode&amp;lt;/code&amp;gt; on this object. &lt;br /&gt;
# If the &amp;lt;var&amp;gt;options&amp;lt;/var&amp;gt; parameter is specified and the &amp;lt;var&amp;gt;stream&amp;lt;/var&amp;gt; option is &#039;&#039;&#039;true&#039;&#039;&#039;, then the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag is set; otherwise the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag is cleared.&lt;br /&gt;
# Run the decoder algorithm of the &#039;&#039;decoder object&#039;s&#039;&#039; encoding:&lt;br /&gt;
#* The input to the algorithm is a &amp;lt;var&amp;gt;byte stream&amp;lt;/var&amp;gt;. The &amp;lt;var&amp;gt;byte stream&amp;lt;/var&amp;gt; is provided by the bytes in &amp;lt;var&amp;gt;view.buffer&amp;lt;/var&amp;gt; starting at offset &amp;lt;var&amp;gt;view.byteOffset&amp;lt;/var&amp;gt;. A maximum of &amp;lt;var&amp;gt;view.byteLength&amp;lt;/var&amp;gt; bytes are yielded by the stream from &amp;lt;var&amp;gt;view.buffer&amp;lt;/var&amp;gt;. If &amp;lt;var&amp;gt;view&amp;lt;/var&amp;gt; is not specified, the stream is empty.&lt;br /&gt;
#* If the &amp;lt;var&amp;gt;options&amp;lt;/var&amp;gt; parameter is not specified or the &#039;&#039;&#039;stream&#039;&#039;&#039; option is &#039;&#039;&#039;false&#039;&#039;&#039;, then after &amp;lt;var&amp;gt;view.byteLength&amp;lt;/var&amp;gt; bytes are yielded by the stream, the &#039;&#039;&#039;EOF byte&#039;&#039;&#039; is yielded.&lt;br /&gt;
#* If the internal &amp;lt;var&amp;gt;fatal&amp;lt;/var&amp;gt; flag of the &#039;&#039;decoder object&#039;&#039; is set, then a &#039;&#039;&#039;decoder error&#039;&#039;&#039; causes a &amp;lt;code&amp;gt;DOMException&amp;lt;/code&amp;gt; of type &amp;lt;code&amp;gt;EncodingError&amp;lt;/code&amp;gt; to be thrown rather than emitting a fallback code point.&lt;br /&gt;
#* The output of the algorithm is a &amp;lt;var&amp;gt;sequence of emitted code points&amp;lt;/var&amp;gt;.&lt;br /&gt;
# Return the &amp;lt;code&amp;gt;DOMString&amp;lt;/code&amp;gt; produced from the &amp;lt;var&amp;gt;sequence of emitted code points&amp;lt;/var&amp;gt; by following the steps to &amp;lt;i&amp;gt;convert a sequence of Unicode characters to a DOMString&amp;lt;/i&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
:&#039;&#039;ISSUE: Need to handle BOMs. [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding] specifies this by using the caller-specified encoding as a suggestion, and consuming the BOM as a part of selecting the real encoder where BOM takes precedence. At the very least a matching BOM should be ignored. A mismatching BOM could throw, be a decoding error, or could actually switch the decoder for this stream (call or sequence of calls), possibly if-and-only-if the constructor was called without an encoding.&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/dl&amp;gt;&lt;br /&gt;
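&lt;br /&gt;
The &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; and &amp;lt;var&amp;gt;fatal&amp;lt;/var&amp;gt; flags described above can be illustrated with a minimal sketch, assuming the &amp;lt;code&amp;gt;TextDecoder&amp;lt;/code&amp;gt; global found in current browsers and Node.js. Shipped implementations may report a different exception type than the &amp;lt;code&amp;gt;EncodingError&amp;lt;/code&amp;gt; named in this draft, so the sketch only checks that something is thrown. The variable names are illustrative.&lt;br /&gt;
&lt;br /&gt;
```javascript
// U+20AC in UTF-8 is E2 82 AC; split the sequence across two chunks.
const firstChunk = new Uint8Array([0xE2, 0x82]);
const secondChunk = new Uint8Array([0xAC]);

const decoder = new TextDecoder("utf-8");

// stream: true keeps the incomplete byte sequence as state for the next call.
const partial = decoder.decode(firstChunk, { stream: true });

// The final call omits the option, so the decoder finishes and its state is reset.
const rest = decoder.decode(secondChunk);
const decoded = partial + rest;

// With fatal set, malformed input throws instead of emitting a fallback code point.
const strict = new TextDecoder("utf-8", { fatal: true });
let strictThrew = false;
try {
  strict.decode(new Uint8Array([0xFF]));
} catch (err) {
  strictThrew = true;
}

// Without fatal, the same byte decodes to U+FFFD, the replacement character.
const lenient = new TextDecoder("utf-8").decode(new Uint8Array([0xFF]));
```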
&lt;br /&gt;
== Algorithms ==&lt;br /&gt;
&lt;br /&gt;
:TODO: Move the following to [http://dev.w3.org/2006/webapi/WebIDL/#idl-DOMString WebIDL]&lt;br /&gt;
&lt;br /&gt;
The following algorithm defines a way to &amp;lt;b&amp;gt;&amp;lt;i&amp;gt;convert a sequence of Unicode characters to a DOMString&amp;lt;/i&amp;gt;&amp;lt;/b&amp;gt;:&lt;br /&gt;
# Let &amp;lt;var&amp;gt;U&amp;lt;/var&amp;gt;&amp;lt;sub&amp;gt;0...&amp;lt;var&amp;gt;n&amp;lt;/var&amp;gt;-1&amp;lt;/sub&amp;gt; be the sequence of Unicode characters &lt;br /&gt;
# Initialize &amp;lt;var&amp;gt;i&amp;lt;/var&amp;gt; to 0&lt;br /&gt;
# Initialize &amp;lt;var&amp;gt;S&amp;lt;/var&amp;gt; to be an empty sequence of [http://dev.w3.org/2006/webapi/WebIDL/#dfn-code-unit code units]&lt;br /&gt;
# While &amp;lt;var&amp;gt;i&amp;lt;/var&amp;gt; &amp;lt; &amp;lt;var&amp;gt;n&amp;lt;/var&amp;gt;&lt;br /&gt;
## Let &amp;lt;var&amp;gt;c&amp;lt;/var&amp;gt; be the &amp;lt;b&amp;gt;code point&amp;lt;/b&amp;gt; of the Unicode character in &amp;lt;var&amp;gt;U&amp;lt;/var&amp;gt; at index &amp;lt;var&amp;gt;i&amp;lt;/var&amp;gt;&lt;br /&gt;
## If &amp;lt;var&amp;gt;c&amp;lt;/var&amp;gt; &amp;amp;ge; 2&amp;lt;sup&amp;gt;16&amp;lt;/sup&amp;gt;, then:&lt;br /&gt;
### Append to &amp;lt;var&amp;gt;S&amp;lt;/var&amp;gt; a [http://dev.w3.org/2006/webapi/WebIDL/#dfn-code-unit code unit] equal to (&amp;lt;var&amp;gt;c&amp;lt;/var&amp;gt; - 2&amp;lt;sup&amp;gt;16&amp;lt;/sup&amp;gt;) / 2&amp;lt;sup&amp;gt;10&amp;lt;/sup&amp;gt; + 0xD800, where &amp;quot;/&amp;quot; represents integer division.&lt;br /&gt;
### Append to &amp;lt;var&amp;gt;S&amp;lt;/var&amp;gt; a [http://dev.w3.org/2006/webapi/WebIDL/#dfn-code-unit code unit] equal to (&amp;lt;var&amp;gt;c&amp;lt;/var&amp;gt; - 2&amp;lt;sup&amp;gt;16&amp;lt;/sup&amp;gt;) % 2&amp;lt;sup&amp;gt;10&amp;lt;/sup&amp;gt; + 0xDC00, where &amp;quot;%&amp;quot; represents the remainder of an integer division.&lt;br /&gt;
## Otherwise, append to &amp;lt;var&amp;gt;S&amp;lt;/var&amp;gt; a [http://dev.w3.org/2006/webapi/WebIDL/#dfn-code-unit code unit] equal to &amp;lt;var&amp;gt;c&amp;lt;/var&amp;gt;.&lt;br /&gt;
## Set &amp;lt;var&amp;gt;i&amp;lt;/var&amp;gt; to &amp;lt;var&amp;gt;i&amp;lt;/var&amp;gt;+1&lt;br /&gt;
# Return the IDL [http://dev.w3.org/2006/webapi/WebIDL/#idl-DOMString DOMString] value that represents the sequence of code units &amp;lt;var&amp;gt;S&amp;lt;/var&amp;gt;.&lt;br /&gt;
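&lt;br /&gt;
The numbered steps above can be sketched in JavaScript as follows. The function name &amp;lt;code&amp;gt;codePointsToString&amp;lt;/code&amp;gt; is hypothetical and not part of the proposed API, and the input is assumed to contain only valid code points.&lt;br /&gt;
&lt;br /&gt;
```javascript
// Converts an iterable of Unicode code points into a string of UTF-16 code units.
// codePointsToString is an illustrative name, not part of the proposed API.
function codePointsToString(codePoints) {
  const units = [];
  for (const c of codePoints) {
    if (c >= 0x10000) {
      // Supplementary code point: emit a lead surrogate, then a trail surrogate.
      units.push(Math.floor((c - 0x10000) / 0x400) + 0xD800);
      units.push(((c - 0x10000) % 0x400) + 0xDC00);
    } else {
      // Code points below 2^16 map to a single code unit.
      units.push(c);
    }
  }
  // String.fromCharCode builds a string directly from code units.
  return units.map((u) => String.fromCharCode(u)).join("");
}

// "Hi" followed by U+1F600, which needs a surrogate pair.
const sample = codePointsToString([0x48, 0x69, 0x1F600]);
```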
&lt;br /&gt;
== Examples ==&lt;br /&gt;
&lt;br /&gt;
=== Example #1 - encoding strings ===&lt;br /&gt;
&lt;br /&gt;
The following example uses the API to encode an array of strings into an ArrayBuffer. The resulting buffer contains the number of strings (as a Uint32), followed by the length of the first string (as a Uint32), the UTF-8 encoded string data, the length of the second string (as a Uint32), the string data, and so on.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
function encodeArrayOfStrings(strings, encoding) {&lt;br /&gt;
  var encoder, encoded, len, i, bytes, view, offset;&lt;br /&gt;
&lt;br /&gt;
  encoder = new TextEncoder(encoding);&lt;br /&gt;
  encoded = [];&lt;br /&gt;
&lt;br /&gt;
  // First pass: encode each string and total the bytes needed,&lt;br /&gt;
  // including one Uint32 for the count and one per string length.&lt;br /&gt;
  len = Uint32Array.BYTES_PER_ELEMENT;&lt;br /&gt;
  for (i = 0; i &amp;lt; strings.length; i += 1) {&lt;br /&gt;
    len += Uint32Array.BYTES_PER_ELEMENT;&lt;br /&gt;
    encoded[i] = encoder.encode(strings[i]);&lt;br /&gt;
    len += encoded[i].byteLength;&lt;br /&gt;
  }&lt;br /&gt;
&lt;br /&gt;
  bytes = new Uint8Array(len);&lt;br /&gt;
  view = new DataView(bytes.buffer);&lt;br /&gt;
  offset = 0;&lt;br /&gt;
&lt;br /&gt;
  view.setUint32(offset, strings.length);&lt;br /&gt;
  offset += Uint32Array.BYTES_PER_ELEMENT;&lt;br /&gt;
  for (i = 0; i &amp;lt; encoded.length; i += 1) {&lt;br /&gt;
    len = encoded[i].byteLength;&lt;br /&gt;
    view.setUint32(offset, len);&lt;br /&gt;
    offset += Uint32Array.BYTES_PER_ELEMENT;&lt;br /&gt;
    bytes.set(encoded[i], offset);&lt;br /&gt;
    offset += len;&lt;br /&gt;
  }&lt;br /&gt;
  return bytes.buffer;&lt;br /&gt;
}&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Example #2 - decoding strings ===&lt;br /&gt;
&lt;br /&gt;
The following example decodes an ArrayBuffer containing data encoded in the format produced by the previous example back into an array of strings.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
function decodeArrayOfStrings(buffer, encoding) {&lt;br /&gt;
  var decoder, view, offset, num_strings, strings, i, len;&lt;br /&gt;
&lt;br /&gt;
  decoder = new TextDecoder(encoding);&lt;br /&gt;
  view = new DataView(buffer);&lt;br /&gt;
  offset = 0;&lt;br /&gt;
  strings = [];&lt;br /&gt;
&lt;br /&gt;
  num_strings = view.getUint32(offset);&lt;br /&gt;
  offset += Uint32Array.BYTES_PER_ELEMENT;&lt;br /&gt;
  for (i = 0; i &amp;lt; num_strings; i += 1) {&lt;br /&gt;
    len = view.getUint32(offset);&lt;br /&gt;
    offset += Uint32Array.BYTES_PER_ELEMENT;&lt;br /&gt;
    strings[i] = decoder.decode(&lt;br /&gt;
      new DataView(view.buffer, offset, len));&lt;br /&gt;
    offset += len;&lt;br /&gt;
  }&lt;br /&gt;
  return strings;&lt;br /&gt;
}&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Encodings ==&lt;br /&gt;
&lt;br /&gt;
Encodings are defined and implemented per [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding]. This implicitly includes the steps to &amp;lt;em&amp;gt;get an encoding&amp;lt;/em&amp;gt; from a string, and the logic for label matching and case-insensitivity.&lt;br /&gt;
&lt;br /&gt;
User agents MUST NOT support any other encodings or labels than those defined in [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding], and MUST support all encodings and labels defined in that specification, with the additions defined below.&lt;br /&gt;
&lt;br /&gt;
:&#039;&#039;NOTE: In [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding], &amp;quot;ascii&amp;quot; is a label for &#039;&#039;&#039;windows-1252&#039;&#039;&#039;; there is no 7-bit-or-raise-exception encoding. Applications that are required to restrict the content of decoded strings should implement validation after decoding.&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
:&#039;&#039;NOTE: Unicode normalization forms are outside the scope of this specification. No normalization is done prior to encoding or after decoding.&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
:&#039;&#039;NOTE: Handling of encoding-specific issues, e.g. over-long UTF-8 encodings, byte order marks, unmatched surrogate pairs, and so on is defined by [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding].&#039;&#039;&lt;br /&gt;
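&lt;br /&gt;
As the first note above suggests, an application that must restrict decoded text to 7-bit characters can validate after decoding. A minimal sketch follows, assuming the &amp;lt;code&amp;gt;TextDecoder&amp;lt;/code&amp;gt; global of current browsers and Node.js; &amp;lt;code&amp;gt;isSevenBit&amp;lt;/code&amp;gt; is a hypothetical helper, and the sketch decodes as UTF-8 so that it does not depend on legacy encoding support.&lt;br /&gt;
&lt;br /&gt;
```javascript
// isSevenBit is a hypothetical helper name; it is not part of the proposed API.
// It reports whether every code unit in the string is in the range 0x00 to 0x7F.
function isSevenBit(text) {
  for (let i = 0; i !== text.length; i += 1) {
    if (text.charCodeAt(i) > 0x7F) {
      return false;
    }
  }
  return true;
}

// Bytes 48 69 decode to "Hi"; bytes C3 A9 decode to U+00E9, which is not 7-bit.
const plain = new TextDecoder("utf-8").decode(new Uint8Array([0x48, 0x69]));
const accented = new TextDecoder("utf-8").decode(new Uint8Array([0xC3, 0xA9]));

console.log(isSevenBit(plain), isSevenBit(accented));
```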
&lt;br /&gt;
== References ==&lt;br /&gt;
&lt;br /&gt;
* WebIDL http://dev.w3.org/2006/webapi/WebIDL&lt;br /&gt;
* Encoding http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== Acknowledgements ==&lt;br /&gt;
&lt;br /&gt;
* Alan Chaney&lt;br /&gt;
* Ben Noordhuis&lt;br /&gt;
* Glenn Maynard&lt;br /&gt;
* John Tamplin&lt;br /&gt;
* Kenneth Russell (Google, Inc)&lt;br /&gt;
* Robert Mustacchi&lt;br /&gt;
* Ryan Dahl&lt;br /&gt;
* Anne van Kesteren&lt;br /&gt;
&lt;br /&gt;
== Appendix ==&lt;br /&gt;
&lt;br /&gt;
A &amp;quot;shim&amp;quot; implementation in JavaScript (that may not fully match the current version of the spec) plus some initial unit tests can be found at:&lt;br /&gt;
&lt;br /&gt;
:http://code.google.com/p/stringencoding/&lt;br /&gt;
&lt;br /&gt;
[[Category:Proposals]]&lt;/div&gt;</summary>
		<author><name>Jsbell</name></author>
	</entry>
	<entry>
		<id>https://wiki.whatwg.org/index.php?title=StringEncoding&amp;diff=8350</id>
		<title>StringEncoding</title>
		<link rel="alternate" type="text/html" href="https://wiki.whatwg.org/index.php?title=StringEncoding&amp;diff=8350"/>
		<updated>2012-06-27T16:54:39Z</updated>

		<summary type="html">&lt;p&gt;Jsbell: Added Algorithms section&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;Proposed Text Encoding Web API for Typed Arrays&lt;br /&gt;
&lt;br /&gt;
== Editors ==&lt;br /&gt;
&lt;br /&gt;
* Joshua Bell (Google, Inc)&lt;br /&gt;
&lt;br /&gt;
== Abstract ==&lt;br /&gt;
&lt;br /&gt;
This specification defines an API for encoding strings to binary data, and decoding strings from binary data.&lt;br /&gt;
&lt;br /&gt;
NOTE: This specification intentionally does not address the opposite scenario of encoding binary data as strings and decoding binary data from strings, for example using Base64 encoding.&lt;br /&gt;
&lt;br /&gt;
Discussion on this topic has so far taken place on the [http://www.khronos.org/webgl/public-mailing-list/ public_webgl@khronos.org] mailing list. See http://www.khronos.org/webgl/public-mailing-list/archives/1111/msg00017.html for the initial discussion thread.&lt;br /&gt;
&lt;br /&gt;
Discussion has since moved to the [http://www.whatwg.org/mailing-list#specs WHATWG spec discussion] mailing list. See  http://lists.whatwg.org/htdig.cgi/whatwg-whatwg.org/2012-March/035038.html for the latest discussion thread.&lt;br /&gt;
&lt;br /&gt;
== Open Issues ==&lt;br /&gt;
&lt;br /&gt;
* Encoding errors&lt;br /&gt;
** Define options dict for &amp;lt;code&amp;gt;TextEncoder()&amp;lt;/code&amp;gt; that allows selecting between throw vs. replacement character / replacement callback?&lt;br /&gt;
* The [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding specification] defines the byte order mark as more authoritative than an explicit encoding label when decoding content. &lt;br /&gt;
** Is that desirable here? &lt;br /&gt;
** What does it mean to e.g. write &amp;lt;code&amp;gt;TextDecoder(&#039;iso-2022-kr&#039;).decode(str)&amp;lt;/code&amp;gt; if string turns out to be UTF-8?&lt;br /&gt;
** What about streaming and other re-uses of the same decoder object?&lt;br /&gt;
* Streaming decode/encode requires keeping buffers between calls.&lt;br /&gt;
** Some encode/decode algorithms require adjusting the &amp;lt;var&amp;gt;code point pointer&amp;lt;/var&amp;gt; or &amp;lt;var&amp;gt;byte pointer&amp;lt;/var&amp;gt; by a negative amount. This could occur across &amp;quot;chunk&amp;quot; boundaries. This implies that when the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag is set on an encoder/decoder that the last N elements of the stream are saved for the next call and used as a prefix for the stream. N is defined by the specific encoding algorithm.&lt;br /&gt;
** &#039;&#039;This is more of a note to implementers than an issue, but it may merit further discussion.&#039;&#039;&lt;br /&gt;
* &amp;lt;b&amp;gt;&amp;lt;i&amp;gt;Converting a sequence of Unicode characters to a DOMString&amp;lt;/i&amp;gt;&amp;lt;/b&amp;gt; needs to be defined. &lt;br /&gt;
** [http://dev.w3.org/2006/webapi/WebIDL/#idl-DOMString WebIDL] currently only defines the reverse, &amp;lt;b&amp;gt;&amp;lt;i&amp;gt;convert a DOMString to a sequence of Unicode characters&amp;lt;/i&amp;gt;&amp;lt;/b&amp;gt;, but that&#039;s probably the right place to define it in the platform.&lt;br /&gt;
&lt;br /&gt;
== Resolved Issues ==&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dl&amp;gt;&lt;br /&gt;
&amp;lt;dt&amp;gt;Encode as many characters as possible into a fixed-size buffer for transmission, and repeat starting with next unencoded character&lt;br /&gt;
&amp;lt;dd&amp;gt;Resolution: not for &amp;quot;v1&amp;quot; - can be implemented using this API, and breaking the string down by &amp;quot;character&amp;quot; is unlikely to be as obvious as it sounds - surrogate pairs, combining sequences, etc&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dt&amp;gt;Support two versions of encode; one which takes target buffer, and one which creates/returns a right-sized buffer&lt;br /&gt;
&amp;lt;dd&amp;gt;Resolution: See above, and wait for developer feedback.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dt&amp;gt;Allow arbitrary end byte sequences (e.g. &amp;lt;code&amp;gt;0xFF&amp;lt;/code&amp;gt; for UTF-8 strings)&lt;br /&gt;
&amp;lt;dd&amp;gt;Resolution: Add &amp;lt;code&amp;gt;indexOf&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;ArrayBufferView&amp;lt;/code&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dt&amp;gt;Remove &#039;&#039;binary&#039;&#039; encoding? &#039;&#039;(a proposed 8-bit-clean encoding for interop with legacy binary data stored in ECMAScript strings)&#039;&#039;&lt;br /&gt;
&amp;lt;dd&amp;gt;The only real use is with &amp;lt;code&amp;gt;atob()&amp;lt;/code&amp;gt;/&amp;lt;code&amp;gt;btoa()&amp;lt;/code&amp;gt;; a better API would be Base64 directly in/out of Typed Arrays. Consensus on WHATWG is to add better APIs e.g. &lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
partial interface ArrayBufferView {&lt;br /&gt;
    DOMString toBase64();&lt;br /&gt;
};&lt;br /&gt;
&lt;br /&gt;
partial interface ArrayBuffer {&lt;br /&gt;
    static ArrayBuffer fromBase64(DOMString string);&lt;br /&gt;
};&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dt&amp;gt;Should this be a standalone API (as written), or live on e.g. DataView, or on e.g. String?&lt;br /&gt;
&amp;lt;dd&amp;gt;There seems to be pretty strong consensus that streaming, stateful coding is high priority and that an object-oriented API is cleanest, and shoe-horning those onto existing objects would be messy.&lt;br /&gt;
&amp;lt;/dl&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== API ==&lt;br /&gt;
&lt;br /&gt;
=== TextEncoder ===&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;WebIDL&#039;&#039;&#039;&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
dictionary TextEncodeOptions {&lt;br /&gt;
  boolean stream = false;&lt;br /&gt;
};&lt;br /&gt;
&lt;br /&gt;
[Constructor,&lt;br /&gt;
 Constructor(DOMString encoding)]&lt;br /&gt;
interface TextEncoder {&lt;br /&gt;
  readonly attribute DOMString encoding;&lt;br /&gt;
  ArrayBufferView encode(DOMString? string, optional TextEncodeOptions options);&lt;br /&gt;
};&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The constructor runs the following steps:&lt;br /&gt;
# If the constructor is called with no arguments, let &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt; be &amp;quot;&amp;lt;code&amp;gt;utf-8&amp;lt;/code&amp;gt;&amp;quot;. Otherwise, let &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt; be the value of the &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; argument.&lt;br /&gt;
# The constructor follows the &#039;&#039;&#039;steps to get an encoding&#039;&#039;&#039; from [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding], with &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt; as &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt;.&lt;br /&gt;
#* If the steps result in failure, throw an &amp;quot;&amp;lt;code&amp;gt;EncodingError&amp;lt;/code&amp;gt;&amp;quot; exception and terminate these steps.&lt;br /&gt;
#* Otherwise, set the encoder object&#039;s internal &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; property to the returned encoding.&lt;br /&gt;
# Initialize the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag of the encoder object to false. &lt;br /&gt;
# Initialize the internal &amp;lt;var&amp;gt;encoding algorithm state&amp;lt;/var&amp;gt; to the default values for the encoding &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dl&amp;gt;&lt;br /&gt;
&amp;lt;dt&amp;gt;&amp;lt;code&amp;gt;encoding&amp;lt;/code&amp;gt; of type DOMString, readonly&lt;br /&gt;
&amp;lt;dd&amp;gt;Returns the &amp;lt;var&amp;gt;Name&amp;lt;/var&amp;gt; of the encoder object&#039;s &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;, per [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding]. &lt;br /&gt;
&lt;br /&gt;
:Note that this may differ from the name of the encoding specified during the call to the constructor. For example, if the constructor is called with &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; of &amp;lt;code&amp;gt;&amp;quot;ascii&amp;quot;&amp;lt;/code&amp;gt; the &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; attribute of the &#039;&#039;encoder object&#039;&#039; would have the value &amp;lt;code&amp;gt;&amp;quot;windows-1252&amp;quot;&amp;lt;/code&amp;gt; as &amp;lt;code&amp;gt;&amp;quot;ascii&amp;quot;&amp;lt;/code&amp;gt; is a label for that encoding.&lt;br /&gt;
&amp;lt;/dl&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dl&amp;gt;&lt;br /&gt;
&amp;lt;dt&amp;gt;&amp;lt;code&amp;gt;encode&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;dd&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The &amp;lt;code&amp;gt;encode&amp;lt;/code&amp;gt; method runs these steps:&lt;br /&gt;
&lt;br /&gt;
# If the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag is not set, then reset the &amp;lt;var&amp;gt;encoding algorithm state&amp;lt;/var&amp;gt; to the default values for &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;. Otherwise, the &amp;lt;var&amp;gt;encoding algorithm state&amp;lt;/var&amp;gt; is re-used from the previous call to &amp;lt;code&amp;gt;encode&amp;lt;/code&amp;gt; on this object.&lt;br /&gt;
# If the &amp;lt;var&amp;gt;options&amp;lt;/var&amp;gt; parameter is specified and the &amp;lt;code&amp;gt;stream&amp;lt;/code&amp;gt; option is &#039;&#039;&#039;true&#039;&#039;&#039;, then the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag is set; otherwise the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag is cleared.&lt;br /&gt;
# Run the steps of the &amp;lt;var&amp;gt;encoding algorithm&amp;lt;/var&amp;gt;:&lt;br /&gt;
#* The input to the algorithm is a &amp;lt;var&amp;gt;stream of code points&amp;lt;/var&amp;gt;. The stream is composed of the Unicode code points for the Unicode characters produced by following the steps to &#039;&#039;[http://dev.w3.org/2006/webapi/WebIDL/#dfn-obtain-unicode convert a DOMString to a sequence of Unicode characters]&#039;&#039; in [http://dev.w3.org/2006/webapi/WebIDL/ WebIDL] with &amp;lt;var&amp;gt;string&amp;lt;/var&amp;gt; as the input. If &amp;lt;var&amp;gt;string&amp;lt;/var&amp;gt; is null, the &amp;lt;var&amp;gt;stream of code points&amp;lt;/var&amp;gt; is empty.&lt;br /&gt;
#* If the &amp;lt;var&amp;gt;options&amp;lt;/var&amp;gt; parameter is not specified or the &amp;lt;code&amp;gt;stream&amp;lt;/code&amp;gt; option is false, then after the final code point is yielded by the stream, the &#039;&#039;&#039;EOF code point&#039;&#039;&#039; is yielded.&lt;br /&gt;
#* The output of the algorithm is a &amp;lt;var&amp;gt;sequence of emitted bytes&amp;lt;/var&amp;gt;.&lt;br /&gt;
# Return a &amp;lt;code&amp;gt;Uint8Array&amp;lt;/code&amp;gt; object wrapping an &amp;lt;code&amp;gt;ArrayBuffer&amp;lt;/code&amp;gt; containing the &amp;lt;var&amp;gt;sequence of emitted bytes&amp;lt;/var&amp;gt; produced by the encoding algorithm.&lt;br /&gt;
&amp;lt;/dl&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== TextDecoder ===&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;WebIDL&#039;&#039;&#039;&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
dictionary TextDecoderOptions {&lt;br /&gt;
  boolean fatal = false;&lt;br /&gt;
};&lt;br /&gt;
&lt;br /&gt;
dictionary TextDecodeOptions {&lt;br /&gt;
  boolean stream = false;&lt;br /&gt;
};&lt;br /&gt;
&lt;br /&gt;
[Constructor,&lt;br /&gt;
 Constructor(optional DOMString encoding, optional TextDecoderOptions options)]&lt;br /&gt;
interface TextDecoder {&lt;br /&gt;
  readonly attribute DOMString encoding;&lt;br /&gt;
  DOMString decode(optional ArrayBufferView view, optional TextDecodeOptions options);&lt;br /&gt;
};&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The constructor runs the following steps:&lt;br /&gt;
# The constructor creates a &#039;&#039;decoder object&#039;&#039;. It has the internal properties &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;, a &amp;lt;var&amp;gt;fatal&amp;lt;/var&amp;gt; flag which is initially unset, and a &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag which is initially unset.&lt;br /&gt;
# If called without an &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; argument, let &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt; be &amp;quot;&amp;lt;code&amp;gt;utf-8&amp;lt;/code&amp;gt;&amp;quot;. Otherwise, let &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt; be the value of the &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; argument.&lt;br /&gt;
# Run the &#039;&#039;&#039;steps to get an encoding&#039;&#039;&#039; from [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding], with &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt; as &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt;. &lt;br /&gt;
## If the steps result in failure, throw an &amp;quot;&amp;lt;code&amp;gt;EncodingError&amp;lt;/code&amp;gt;&amp;quot; exception and terminate these steps.&lt;br /&gt;
## Otherwise, set the &#039;&#039;decoder object&#039;s&#039;&#039; internal &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; property to the returned encoding.&lt;br /&gt;
# If the constructor is called with an &amp;lt;var&amp;gt;options&amp;lt;/var&amp;gt; argument, and the &amp;lt;var&amp;gt;fatal&amp;lt;/var&amp;gt; property of the dictionary is set, set the internal &amp;lt;var&amp;gt;fatal&amp;lt;/var&amp;gt; flag of the &#039;&#039;decoder object&#039;&#039;.&lt;br /&gt;
# Initialize the internal &amp;lt;var&amp;gt;encoding algorithm state&amp;lt;/var&amp;gt; to the default values for the encoding &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;.&lt;br /&gt;
# Return the &#039;&#039;decoder object&#039;&#039;.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dl&amp;gt;&lt;br /&gt;
&amp;lt;dt&amp;gt;&amp;lt;code&amp;gt;encoding&amp;lt;/code&amp;gt; of type DOMString, readonly&lt;br /&gt;
&amp;lt;dd&amp;gt;&lt;br /&gt;
Returns the &amp;lt;var&amp;gt;Name&amp;lt;/var&amp;gt; of the decoder object&#039;s &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;, per [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding].&lt;br /&gt;
&lt;br /&gt;
:Note that this may differ from the name of the encoding specified during the call to the constructor. For example, if the constructor is called with &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; of &amp;lt;code&amp;gt;&amp;quot;ascii&amp;quot;&amp;lt;/code&amp;gt; the &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; attribute of the &#039;&#039;decoder object&#039;&#039; would have the value &amp;lt;code&amp;gt;&amp;quot;windows-1252&amp;quot;&amp;lt;/code&amp;gt; as &amp;lt;code&amp;gt;&amp;quot;ascii&amp;quot;&amp;lt;/code&amp;gt; is a label for that encoding.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dt&amp;gt;&amp;lt;code&amp;gt;decode&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;dd&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The &amp;lt;code&amp;gt;decode&amp;lt;/code&amp;gt; method runs the following steps:&lt;br /&gt;
&lt;br /&gt;
# If the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag of the &#039;&#039;decoder object&#039;&#039; is not set, then reset the &amp;lt;var&amp;gt;encoding algorithm state&amp;lt;/var&amp;gt; to the default values for encoding &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;. Otherwise, the &amp;lt;var&amp;gt;encoding algorithm state&amp;lt;/var&amp;gt; is re-used from the previous call to &amp;lt;code&amp;gt;decode&amp;lt;/code&amp;gt; on this object. &lt;br /&gt;
# If the &amp;lt;var&amp;gt;options&amp;lt;/var&amp;gt; parameter is specified and the &amp;lt;var&amp;gt;stream&amp;lt;/var&amp;gt; option is &#039;&#039;&#039;true&#039;&#039;&#039;, then the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag is set; otherwise the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag is cleared.&lt;br /&gt;
# Run the decoder algorithm of the &#039;&#039;decoder object&#039;s&#039;&#039; &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;:&lt;br /&gt;
#* The input to the algorithm is a &amp;lt;var&amp;gt;byte stream&amp;lt;/var&amp;gt;. The &amp;lt;var&amp;gt;byte stream&amp;lt;/var&amp;gt; is provided by the bytes in &amp;lt;var&amp;gt;view.buffer&amp;lt;/var&amp;gt; starting at offset &amp;lt;var&amp;gt;view.byteOffset&amp;lt;/var&amp;gt;. A maximum of &amp;lt;var&amp;gt;view.byteLength&amp;lt;/var&amp;gt; bytes are yielded by the stream from &amp;lt;var&amp;gt;view.buffer&amp;lt;/var&amp;gt;. If &amp;lt;var&amp;gt;view&amp;lt;/var&amp;gt; is not specified, the stream is empty.&lt;br /&gt;
#* If the &amp;lt;var&amp;gt;options&amp;lt;/var&amp;gt; parameter is not specified or the &#039;&#039;&#039;stream&#039;&#039;&#039; option is &#039;&#039;&#039;false&#039;&#039;&#039;, then after &amp;lt;var&amp;gt;view.byteLength&amp;lt;/var&amp;gt; bytes have been yielded by the stream, the &#039;&#039;&#039;EOF byte&#039;&#039;&#039; is yielded.&lt;br /&gt;
#* If the internal &amp;lt;var&amp;gt;fatal&amp;lt;/var&amp;gt; flag of the &#039;&#039;decoder object&#039;&#039; is set, then a &#039;&#039;&#039;decoder error&#039;&#039;&#039; causes a &amp;lt;code&amp;gt;DOMException&amp;lt;/code&amp;gt; of type &amp;lt;code&amp;gt;EncodingError&amp;lt;/code&amp;gt; to be thrown rather than emitting a fallback code point.&lt;br /&gt;
#* The output of the algorithm is a sequence of &amp;lt;var&amp;gt;emitted code points&amp;lt;/var&amp;gt;.&lt;br /&gt;
# Return a &amp;lt;code&amp;gt;DOMString&amp;lt;/code&amp;gt; by encoding the &amp;lt;var&amp;gt;sequence of emitted code points&amp;lt;/var&amp;gt; following the steps to &amp;lt;i&amp;gt;convert a sequence of Unicode characters to a DOMString&amp;lt;/i&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
:&#039;&#039;ISSUE: Need to handle BOMs. [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding] specifies this by using the caller-specified encoding as a suggestion, and consuming the BOM as a part of selecting the real encoder where BOM takes precedence. At the very least a matching BOM should be ignored. A mismatching BOM could throw, be a decoding error, or could actually switch the decoder for this stream (call or sequence of calls), possibly if-and-only-if the constructor was called without an encoding.&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/dl&amp;gt;&lt;br /&gt;
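&lt;br /&gt;
For comparison, the API as later standardized settles part of the BOM question raised above. The following is a minimal sketch, assuming a runtime that ships the standardized &amp;lt;code&amp;gt;TextDecoder&amp;lt;/code&amp;gt; (current browsers, Node.js); the &amp;lt;code&amp;gt;ignoreBOM&amp;lt;/code&amp;gt; option comes from that later standard rather than from this draft, and &amp;lt;code&amp;gt;decodeUtf8&amp;lt;/code&amp;gt; is a hypothetical helper name.&lt;br /&gt;
&lt;br /&gt;
```javascript
// Sketch only: relies on the standardized TextDecoder, not on this draft.
// decodeUtf8 is a hypothetical helper name.
function decodeUtf8(bytes, keepBom) {
  // ignoreBOM: true means "do not strip the BOM", so it stays in the output.
  const decoder = new TextDecoder('utf-8', { ignoreBOM: keepBom });
  return decoder.decode(bytes);
}

// EF BB BF is the UTF-8 byte order mark, followed by "hi".
const withBom = new Uint8Array([0xEF, 0xBB, 0xBF, 0x68, 0x69]);

const stripped = decodeUtf8(withBom, false); // matching BOM is consumed
const kept = decodeUtf8(withBom, true);      // BOM survives as U+FEFF

console.log(stripped.length); // 2
console.log(kept.length);     // 3
```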
&lt;br /&gt;
== Algorithms ==&lt;br /&gt;
&lt;br /&gt;
== Examples ==&lt;br /&gt;
&lt;br /&gt;
=== Example #1 - encoding strings ===&lt;br /&gt;
&lt;br /&gt;
The following example uses the API to encode an array of strings into an ArrayBuffer. The resulting buffer contains the number of strings (as a Uint32), followed by the byte length of the first encoded string (as a Uint32), the string data encoded with the requested encoding, the byte length of the second encoded string (as a Uint32), the string data, and so on.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
function encodeArrayOfStrings(strings, encoding) {&lt;br /&gt;
  var encoder, encoded, len, i, bytes, view, offset;&lt;br /&gt;
&lt;br /&gt;
  encoder = new TextEncoder(encoding);&lt;br /&gt;
  encoded = [];&lt;br /&gt;
&lt;br /&gt;
  // Reserve space for the string count, then for each length prefix and payload.&lt;br /&gt;
  len = Uint32Array.BYTES_PER_ELEMENT;&lt;br /&gt;
  for (i = 0; i &amp;lt; strings.length; i += 1) {&lt;br /&gt;
    len += Uint32Array.BYTES_PER_ELEMENT;&lt;br /&gt;
    encoded[i] = encoder.encode(strings[i]);&lt;br /&gt;
    len += encoded[i].byteLength;&lt;br /&gt;
  }&lt;br /&gt;
&lt;br /&gt;
  bytes = new Uint8Array(len);&lt;br /&gt;
  view = new DataView(bytes.buffer);&lt;br /&gt;
  offset = 0;&lt;br /&gt;
&lt;br /&gt;
  view.setUint32(offset, strings.length);&lt;br /&gt;
  offset += Uint32Array.BYTES_PER_ELEMENT;&lt;br /&gt;
  for (i = 0; i &amp;lt; encoded.length; i += 1) {&lt;br /&gt;
    len = encoded[i].byteLength;&lt;br /&gt;
    view.setUint32(offset, len);&lt;br /&gt;
    offset += Uint32Array.BYTES_PER_ELEMENT;&lt;br /&gt;
    bytes.set(encoded[i], offset);&lt;br /&gt;
    offset += len;&lt;br /&gt;
  }&lt;br /&gt;
  return bytes.buffer;&lt;br /&gt;
}&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Example #2 - decoding strings ===&lt;br /&gt;
&lt;br /&gt;
The following example decodes an ArrayBuffer containing data encoded in the format produced by the previous example back into an array of strings.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
function decodeArrayOfStrings(buffer, encoding) {&lt;br /&gt;
  var decoder, view, offset, num_strings, strings, i, len;&lt;br /&gt;
&lt;br /&gt;
  decoder = new TextDecoder(encoding);&lt;br /&gt;
  view = new DataView(buffer);&lt;br /&gt;
  offset = 0;&lt;br /&gt;
  strings = [];&lt;br /&gt;
&lt;br /&gt;
  num_strings = view.getUint32(offset);&lt;br /&gt;
  offset += Uint32Array.BYTES_PER_ELEMENT;&lt;br /&gt;
  for (i = 0; i &amp;lt; num_strings; i += 1) {&lt;br /&gt;
    len = view.getUint32(offset);&lt;br /&gt;
    offset += Uint32Array.BYTES_PER_ELEMENT;&lt;br /&gt;
    strings[i] = decoder.decode(&lt;br /&gt;
      new DataView(view.buffer, offset, len));&lt;br /&gt;
    offset += len;&lt;br /&gt;
  }&lt;br /&gt;
  return strings;&lt;br /&gt;
}&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
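&lt;br /&gt;
A round trip through the format used by the two examples above might look like the sketch below. It assumes the API as later standardized, in which &amp;lt;code&amp;gt;TextEncoder&amp;lt;/code&amp;gt; takes no encoding argument and always produces UTF-8; &amp;lt;code&amp;gt;packStrings&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;unpackStrings&amp;lt;/code&amp;gt; are hypothetical names, not part of this proposal.&lt;br /&gt;
&lt;br /&gt;
```javascript
// Sketch only: hypothetical helpers built on the standardized, UTF-8-only TextEncoder.
const U32 = Uint32Array.BYTES_PER_ELEMENT;

function packStrings(strings) {
  const encoder = new TextEncoder();
  const encoded = strings.map((s) => encoder.encode(s));

  // One Uint32 for the count, plus one Uint32 length prefix per string.
  let total = U32;
  for (const chunk of encoded) {
    total += U32 + chunk.byteLength;
  }

  const bytes = new Uint8Array(total);
  const view = new DataView(bytes.buffer);
  let offset = 0;

  view.setUint32(offset, encoded.length); // DataView writes big-endian by default
  offset += U32;
  for (const chunk of encoded) {
    view.setUint32(offset, chunk.byteLength);
    offset += U32;
    bytes.set(chunk, offset);
    offset += chunk.byteLength;
  }
  return bytes.buffer;
}

function unpackStrings(buffer) {
  const decoder = new TextDecoder('utf-8');
  const view = new DataView(buffer);
  const strings = [];
  let offset = 0;

  let remaining = view.getUint32(offset);
  offset += U32;
  while (remaining !== 0) {
    const len = view.getUint32(offset);
    offset += U32;
    strings.push(decoder.decode(new Uint8Array(buffer, offset, len)));
    offset += len;
    remaining -= 1;
  }
  return strings;
}

const packed = packStrings(['abc', '\u20AC', '']);
console.log(packed.byteLength);     // 4 + (4 + 3) + (4 + 3) + (4 + 0) = 22
console.log(unpackStrings(packed)); // the original three strings
```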
&lt;br /&gt;
== Encodings ==&lt;br /&gt;
&lt;br /&gt;
Encodings are defined and implemented per [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding]. This implicitly includes the steps to &amp;lt;em&amp;gt;get an encoding&amp;lt;/em&amp;gt; from a string, and the logic for label matching and case-insensitivity.&lt;br /&gt;
&lt;br /&gt;
User agents MUST NOT support any other encodings or labels than those defined in [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding], and MUST support all encodings and labels defined in that specification, with the additions defined below.&lt;br /&gt;
&lt;br /&gt;
:&#039;&#039;NOTE: In [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding], &amp;quot;ascii&amp;quot; is a label for &#039;&#039;&#039;windows-1252&#039;&#039;&#039;; there is no 7-bit-or-raise-exception encoding. Applications that are required to restrict the content of decoded strings should implement validation after decoding.&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
:&#039;&#039;NOTE: Unicode normalization forms are outside the scope of this specification. No normalization is done prior to encoding or after decoding.&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
:&#039;&#039;NOTE: Handling of encoding-specific issues, e.g. over-long UTF-8 encodings, byte order marks, unmatched surrogate pairs, and so on is defined by [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding].&#039;&#039;&lt;br /&gt;
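&lt;br /&gt;
The post-decoding validation suggested in the first note could be sketched as follows. This assumes a runtime with the standardized &amp;lt;code&amp;gt;TextDecoder&amp;lt;/code&amp;gt; that supports the &amp;quot;ascii&amp;quot; label; &amp;lt;code&amp;gt;isAscii&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;decodeStrictAscii&amp;lt;/code&amp;gt; are hypothetical names, and the choice of &amp;lt;code&amp;gt;RangeError&amp;lt;/code&amp;gt; is arbitrary.&lt;br /&gt;
&lt;br /&gt;
```javascript
// Sketch only: isAscii and decodeStrictAscii are hypothetical names.
function isAscii(text) {
  // Every UTF-16 code unit must fall in the 7-bit range U+0000..U+007F.
  return /^[\x00-\x7F]*$/.test(text);
}

function decodeStrictAscii(bytes) {
  // "ascii" is only a label for windows-1252, so bytes 0x80..0xFF decode
  // without error; the check has to happen on the decoded string.
  const text = new TextDecoder('ascii').decode(bytes);
  if (!isAscii(text)) {
    throw new RangeError('input contains non-ASCII data');
  }
  return text;
}

console.log(decodeStrictAscii(new Uint8Array([0x68, 0x69]))); // "hi"
```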
&lt;br /&gt;
== References ==&lt;br /&gt;
&lt;br /&gt;
* WebIDL http://dev.w3.org/2006/webapi/WebIDL&lt;br /&gt;
* Encoding http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== Acknowledgements ==&lt;br /&gt;
&lt;br /&gt;
* Alan Chaney&lt;br /&gt;
* Ben Noordhuis&lt;br /&gt;
* Glenn Maynard&lt;br /&gt;
* John Tamplin&lt;br /&gt;
* Kenneth Russell (Google, Inc)&lt;br /&gt;
* Robert Mustacchi&lt;br /&gt;
* Ryan Dahl&lt;br /&gt;
* Anne van Kesteren&lt;br /&gt;
&lt;br /&gt;
== Appendix ==&lt;br /&gt;
&lt;br /&gt;
A &amp;quot;shim&amp;quot; implementation in JavaScript (that may not fully match the current version of the spec) plus some initial unit tests can be found at:&lt;br /&gt;
&lt;br /&gt;
:http://code.google.com/p/stringencoding/&lt;br /&gt;
&lt;br /&gt;
[[Category:Proposals]]&lt;/div&gt;</summary>
		<author><name>Jsbell</name></author>
	</entry>
	<entry>
		<id>https://wiki.whatwg.org/index.php?title=StringEncoding&amp;diff=8349</id>
		<title>StringEncoding</title>
		<link rel="alternate" type="text/html" href="https://wiki.whatwg.org/index.php?title=StringEncoding&amp;diff=8349"/>
		<updated>2012-06-27T16:53:47Z</updated>

		<summary type="html">&lt;p&gt;Jsbell: HTML typo&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;Proposed Text Encoding Web API for Typed Arrays&lt;br /&gt;
&lt;br /&gt;
== Editors ==&lt;br /&gt;
&lt;br /&gt;
* Joshua Bell (Google, Inc)&lt;br /&gt;
&lt;br /&gt;
== Abstract ==&lt;br /&gt;
&lt;br /&gt;
This specification defines an API for encoding strings to binary data, and decoding strings from binary data.&lt;br /&gt;
&lt;br /&gt;
NOTE: This specification intentionally does not address the opposite scenario of encoding binary data as strings and decoding binary data from strings, for example using Base64 encoding.&lt;br /&gt;
&lt;br /&gt;
Discussion on this topic has so far taken place on the [http://www.khronos.org/webgl/public-mailing-list/ public_webgl@khronos.org] mailing list. See http://www.khronos.org/webgl/public-mailing-list/archives/1111/msg00017.html for the initial discussion thread.&lt;br /&gt;
&lt;br /&gt;
Discussion has since moved to the [http://www.whatwg.org/mailing-list#specs WHATWG spec discussion] mailing list. See http://lists.whatwg.org/htdig.cgi/whatwg-whatwg.org/2012-March/035038.html for the latest discussion thread.&lt;br /&gt;
&lt;br /&gt;
== Open Issues ==&lt;br /&gt;
&lt;br /&gt;
* Encoding errors&lt;br /&gt;
** Define options dict for &amp;lt;code&amp;gt;TextEncoder()&amp;lt;/code&amp;gt; that allows selecting between throw vs. replacement character / replacement callback?&lt;br /&gt;
* The [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding specification] defines the byte order mark as more authoritative than an explicit encoding label when decoding content. &lt;br /&gt;
** Is that desirable here? &lt;br /&gt;
** What does it mean to write, for example, &amp;lt;code&amp;gt;TextDecoder(&#039;iso-2022-kr&#039;).decode(str)&amp;lt;/code&amp;gt; if the input turns out to be UTF-8?&lt;br /&gt;
** What about streaming and other re-uses of the same decoder object?&lt;br /&gt;
* Streaming decode/encode requires keeping buffers between calls.&lt;br /&gt;
** Some encode/decode algorithms require adjusting the &amp;lt;var&amp;gt;code point pointer&amp;lt;/var&amp;gt; or &amp;lt;var&amp;gt;byte pointer&amp;lt;/var&amp;gt; by a negative amount. This could occur across &amp;quot;chunk&amp;quot; boundaries. This implies that when the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag is set on an encoder/decoder, the last N elements of the stream are saved for the next call and used as a prefix for the stream. N is defined by the specific encoding algorithm.&lt;br /&gt;
** &#039;&#039;This is more of a note to implementers than an issue, but it may merit further discussion.&#039;&#039;&lt;br /&gt;
* &amp;lt;b&amp;gt;&amp;lt;i&amp;gt;Converting a sequence of Unicode characters to a DOMString&amp;lt;/i&amp;gt;&amp;lt;/b&amp;gt; needs to be defined. &lt;br /&gt;
** [http://dev.w3.org/2006/webapi/WebIDL/#idl-DOMString WebIDL] currently only defines the reverse, &amp;lt;b&amp;gt;&amp;lt;i&amp;gt;convert a DOMString to a sequence of Unicode characters&amp;lt;/i&amp;gt;&amp;lt;/b&amp;gt;, but that&#039;s probably the right place to define it in the platform.&lt;br /&gt;
&lt;br /&gt;
== Resolved Issues ==&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dl&amp;gt;&lt;br /&gt;
&amp;lt;dt&amp;gt;Encode as many characters as possible into a fixed-size buffer for transmission, and repeat starting with next unencoded character&lt;br /&gt;
&amp;lt;dd&amp;gt;Resolution: not for &amp;quot;v1&amp;quot; - can be implemented using this API, and breaking the string down by &amp;quot;character&amp;quot; is unlikely to be as obvious as it sounds - surrogate pairs, combining sequences, etc&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dt&amp;gt;Support two versions of encode; one which takes target buffer, and one which creates/returns a right-sized buffer&lt;br /&gt;
&amp;lt;dd&amp;gt;Resolution: See above, and wait for developer feedback.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dt&amp;gt;Allow arbitrary end byte sequences (e.g. &amp;lt;code&amp;gt;0xFF&amp;lt;/code&amp;gt; for UTF-8 strings)&lt;br /&gt;
&amp;lt;dd&amp;gt;Resolution: Add &amp;lt;code&amp;gt;indexOf&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;ArrayBufferView&amp;lt;/code&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dt&amp;gt;Remove &#039;&#039;binary&#039;&#039; encoding? &#039;&#039;(a proposed 8-bit-clean encoding for interop with legacy binary data stored in ECMAScript strings)&#039;&#039;&lt;br /&gt;
&amp;lt;dd&amp;gt;The only real use is with &amp;lt;code&amp;gt;atob()&amp;lt;/code&amp;gt;/&amp;lt;code&amp;gt;btoa()&amp;lt;/code&amp;gt;; a better API would be Base64 directly in/out of Typed Arrays. Consensus on WHATWG is to add better APIs e.g. &lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
partial interface ArrayBufferView {&lt;br /&gt;
    DOMString toBase64();&lt;br /&gt;
};&lt;br /&gt;
&lt;br /&gt;
partial interface ArrayBuffer {&lt;br /&gt;
    static ArrayBuffer fromBase64(DOMString string);&lt;br /&gt;
};&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dt&amp;gt;Should this be a standalone API (as written), or live on e.g. DataView, or on e.g. String?&lt;br /&gt;
&amp;lt;dd&amp;gt;There seems to be pretty strong consensus that streaming, stateful coding is high priority and that an object-oriented API is cleanest, and shoe-horning those onto existing objects would be messy.&lt;br /&gt;
&amp;lt;/dl&amp;gt;&lt;br /&gt;
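&lt;br /&gt;
The end-byte resolution above can be illustrated with the &amp;lt;code&amp;gt;indexOf&amp;lt;/code&amp;gt; method that typed arrays later gained in ECMAScript. This is a minimal sketch, assuming the standardized &amp;lt;code&amp;gt;TextDecoder&amp;lt;/code&amp;gt;; &amp;lt;code&amp;gt;decodeTerminated&amp;lt;/code&amp;gt; is a hypothetical helper name.&lt;br /&gt;
&lt;br /&gt;
```javascript
// Sketch only: decodeTerminated is a hypothetical helper name.
function decodeTerminated(bytes, terminator) {
  // indexOf returns -1 when the terminator byte is absent.
  const end = bytes.indexOf(terminator);
  const payload = end === -1 ? bytes : bytes.subarray(0, end);
  return new TextDecoder('utf-8').decode(payload);
}

// "ok" followed by the 0xFF end marker and trailing bytes that are ignored.
const framed = new Uint8Array([0x6F, 0x6B, 0xFF, 0x78, 0x78]);
console.log(decodeTerminated(framed, 0xFF)); // "ok"
```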
&lt;br /&gt;
== API ==&lt;br /&gt;
&lt;br /&gt;
=== TextEncoder ===&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;WebIDL&#039;&#039;&#039;&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
dictionary TextEncodeOptions {&lt;br /&gt;
  boolean stream = false;&lt;br /&gt;
};&lt;br /&gt;
&lt;br /&gt;
[Constructor,&lt;br /&gt;
 Constructor(DOMString encoding)]&lt;br /&gt;
interface TextEncoder {&lt;br /&gt;
  readonly attribute DOMString encoding;&lt;br /&gt;
  ArrayBufferView encode(DOMString? string, optional TextEncodeOptions options);&lt;br /&gt;
};&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The constructor runs the following steps:&lt;br /&gt;
# If the constructor is called with no arguments, let &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt; be &amp;quot;&amp;lt;code&amp;gt;utf-8&amp;lt;/code&amp;gt;&amp;quot;. Otherwise, let &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt; be the value of the &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; argument.&lt;br /&gt;
# The constructor follows the &#039;&#039;&#039;steps to get an encoding&#039;&#039;&#039; from [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding], with &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt; as &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt;.&lt;br /&gt;
#* If the steps result in failure, throw an &amp;quot;&amp;lt;code&amp;gt;EncodingError&amp;lt;/code&amp;gt;&amp;quot; exception and terminate these steps.&lt;br /&gt;
#* Otherwise, set the encoder object&#039;s internal &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; property to the returned encoding.&lt;br /&gt;
# Initialize the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag of the encoder object to false. &lt;br /&gt;
# Initialize the internal &amp;lt;var&amp;gt;encoding algorithm state&amp;lt;/var&amp;gt; to the default values for the encoding &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dl&amp;gt;&lt;br /&gt;
&amp;lt;dt&amp;gt;&amp;lt;code&amp;gt;encoding&amp;lt;/code&amp;gt; of type DOMString, readonly&lt;br /&gt;
&amp;lt;dd&amp;gt;Returns the &amp;lt;var&amp;gt;Name&amp;lt;/var&amp;gt; of the encoder object&#039;s &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;, per [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding]. &lt;br /&gt;
&lt;br /&gt;
:Note that this may differ from the name of the encoding specified during the call to the constructor. For example, if the constructor is called with &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; of &amp;lt;code&amp;gt;&amp;quot;ascii&amp;quot;&amp;lt;/code&amp;gt; the &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; attribute of the &#039;&#039;encoder object&#039;&#039; would have the value &amp;lt;code&amp;gt;&amp;quot;windows-1252&amp;quot;&amp;lt;/code&amp;gt; as &amp;lt;code&amp;gt;&amp;quot;ascii&amp;quot;&amp;lt;/code&amp;gt; is a label for that encoding.&lt;br /&gt;
&amp;lt;/dl&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dl&amp;gt;&lt;br /&gt;
&amp;lt;dt&amp;gt;&amp;lt;code&amp;gt;encode&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;dd&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The &amp;lt;code&amp;gt;encode&amp;lt;/code&amp;gt; method runs these steps:&lt;br /&gt;
&lt;br /&gt;
# If the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag is not set, then reset the &amp;lt;var&amp;gt;encoding algorithm state&amp;lt;/var&amp;gt; to the default values for &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;. Otherwise, the &amp;lt;var&amp;gt;encoding algorithm state&amp;lt;/var&amp;gt; is re-used from the previous call to &amp;lt;code&amp;gt;encode&amp;lt;/code&amp;gt; on this object.&lt;br /&gt;
# If the &amp;lt;var&amp;gt;options&amp;lt;/var&amp;gt; parameter is specified and the &amp;lt;code&amp;gt;stream&amp;lt;/code&amp;gt; option is &#039;&#039;&#039;true&#039;&#039;&#039;, then the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag is set; otherwise the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag is cleared.&lt;br /&gt;
# Run the steps of the &amp;lt;var&amp;gt;encoding algorithm&amp;lt;/var&amp;gt;:&lt;br /&gt;
#* The input to the algorithm is a &amp;lt;var&amp;gt;stream of code points&amp;lt;/var&amp;gt;. The stream is composed of the Unicode code points for the Unicode characters produced by following the steps to &#039;&#039;[http://dev.w3.org/2006/webapi/WebIDL/#dfn-obtain-unicode convert a DOMString to a sequence of Unicode characters]&#039;&#039; in [http://dev.w3.org/2006/webapi/WebIDL/ WebIDL] with &amp;lt;var&amp;gt;string&amp;lt;/var&amp;gt; as the input. If &amp;lt;var&amp;gt;string&amp;lt;/var&amp;gt; is null, the &amp;lt;var&amp;gt;stream of code points&amp;lt;/var&amp;gt; is empty.&lt;br /&gt;
#* If the &amp;lt;var&amp;gt;options&amp;lt;/var&amp;gt; parameter is not specified or the &amp;lt;code&amp;gt;stream&amp;lt;/code&amp;gt; option is false, then after the final code point is yielded by the stream, the &#039;&#039;&#039;EOF code point&#039;&#039;&#039; is yielded.&lt;br /&gt;
#* The output of the algorithm is a &amp;lt;var&amp;gt;sequence of emitted bytes&amp;lt;/var&amp;gt;.&lt;br /&gt;
# Return a &amp;lt;code&amp;gt;Uint8Array&amp;lt;/code&amp;gt; object wrapping an &amp;lt;code&amp;gt;ArrayBuffer&amp;lt;/code&amp;gt; containing the &amp;lt;var&amp;gt;sequence of emitted bytes&amp;lt;/var&amp;gt; produced by the encoding algorithm.&lt;br /&gt;
&amp;lt;/dl&amp;gt;&lt;br /&gt;
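&lt;br /&gt;
As a rough illustration of what &amp;lt;code&amp;gt;encode&amp;lt;/code&amp;gt; returns, here is a minimal sketch against the API as later standardized. It assumes a runtime with that &amp;lt;code&amp;gt;TextEncoder&amp;lt;/code&amp;gt;, which, unlike this draft, accepts no encoding argument and always emits UTF-8.&lt;br /&gt;
&lt;br /&gt;
```javascript
// Sketch only: uses the standardized TextEncoder, which is UTF-8 only.
const encoder = new TextEncoder();
const bytes = encoder.encode('a\u20AC'); // "a" followed by the euro sign

console.log(encoder.encoding);            // "utf-8"
console.log(bytes instanceof Uint8Array); // true
console.log(Array.from(bytes));           // [ 97, 226, 130, 172 ]
```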
&lt;br /&gt;
=== TextDecoder ===&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;WebIDL&#039;&#039;&#039;&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
dictionary TextDecoderOptions {&lt;br /&gt;
  boolean fatal = false;&lt;br /&gt;
};&lt;br /&gt;
&lt;br /&gt;
dictionary TextDecodeOptions {&lt;br /&gt;
  boolean stream = false;&lt;br /&gt;
};&lt;br /&gt;
&lt;br /&gt;
[Constructor,&lt;br /&gt;
 Constructor(optional DOMString encoding, optional TextDecoderOptions options)]&lt;br /&gt;
interface TextDecoder {&lt;br /&gt;
  readonly attribute DOMString encoding;&lt;br /&gt;
  DOMString decode(optional ArrayBufferView view, optional TextDecodeOptions options);&lt;br /&gt;
};&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The constructor runs the following steps:&lt;br /&gt;
# The constructor creates a &#039;&#039;decoder object&#039;&#039;. It has the internal properties &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;, a &amp;lt;var&amp;gt;fatal&amp;lt;/var&amp;gt; flag which is initially unset, and a &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag which is initially unset.&lt;br /&gt;
# If called without an &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; argument, let &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt; be &amp;quot;&amp;lt;code&amp;gt;utf-8&amp;lt;/code&amp;gt;&amp;quot;. Otherwise, let &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt; be the value of the &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; argument.&lt;br /&gt;
# Run the &#039;&#039;&#039;steps to get an encoding&#039;&#039;&#039; from [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding], with &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt; as &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt;. &lt;br /&gt;
## If the steps result in failure, throw an &amp;quot;&amp;lt;code&amp;gt;EncodingError&amp;lt;/code&amp;gt;&amp;quot; exception and terminate these steps.&lt;br /&gt;
## Otherwise, set the &#039;&#039;decoder object&#039;s&#039;&#039; internal &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; property to the returned encoding.&lt;br /&gt;
# If the constructor is called with an &amp;lt;var&amp;gt;options&amp;lt;/var&amp;gt; argument, and the &amp;lt;var&amp;gt;fatal&amp;lt;/var&amp;gt; property of the dictionary is set, set the internal &amp;lt;var&amp;gt;fatal&amp;lt;/var&amp;gt; flag of the &#039;&#039;decoder object&#039;&#039;.&lt;br /&gt;
# Initialize the internal &amp;lt;var&amp;gt;encoding algorithm state&amp;lt;/var&amp;gt; to the default values for the encoding &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;.&lt;br /&gt;
# Return the &#039;&#039;decoder object&#039;&#039;.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dl&amp;gt;&lt;br /&gt;
&amp;lt;dt&amp;gt;&amp;lt;code&amp;gt;encoding&amp;lt;/code&amp;gt; of type DOMString, readonly&lt;br /&gt;
&amp;lt;dd&amp;gt;&lt;br /&gt;
Returns the &amp;lt;var&amp;gt;Name&amp;lt;/var&amp;gt; of the decoder object&#039;s &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;, per [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding].&lt;br /&gt;
&lt;br /&gt;
:Note that this may differ from the name of the encoding specified during the call to the constructor. For example, if the constructor is called with &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; of &amp;lt;code&amp;gt;&amp;quot;ascii&amp;quot;&amp;lt;/code&amp;gt; the &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; attribute of the &#039;&#039;decoder object&#039;&#039; would have the value &amp;lt;code&amp;gt;&amp;quot;windows-1252&amp;quot;&amp;lt;/code&amp;gt; as &amp;lt;code&amp;gt;&amp;quot;ascii&amp;quot;&amp;lt;/code&amp;gt; is a label for that encoding.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dt&amp;gt;&amp;lt;code&amp;gt;decode&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;dd&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The &amp;lt;code&amp;gt;decode&amp;lt;/code&amp;gt; method runs the following steps:&lt;br /&gt;
&lt;br /&gt;
# If the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag of the &#039;&#039;decoder object&#039;&#039; is not set, then reset the &amp;lt;var&amp;gt;encoding algorithm state&amp;lt;/var&amp;gt; to the default values for encoding &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;. Otherwise, the &amp;lt;var&amp;gt;encoding algorithm state&amp;lt;/var&amp;gt; is re-used from the previous call to &amp;lt;code&amp;gt;decode&amp;lt;/code&amp;gt; on this object. &lt;br /&gt;
# If the &amp;lt;var&amp;gt;options&amp;lt;/var&amp;gt; parameter is specified and the &amp;lt;var&amp;gt;stream&amp;lt;/var&amp;gt; option is &#039;&#039;&#039;true&#039;&#039;&#039;, then the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag is set; otherwise the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag is cleared.&lt;br /&gt;
# Run the decoder algorithm of the &#039;&#039;decoder object&#039;s&#039;&#039; &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;:&lt;br /&gt;
#* The input to the algorithm is a &amp;lt;var&amp;gt;byte stream&amp;lt;/var&amp;gt;. The &amp;lt;var&amp;gt;byte stream&amp;lt;/var&amp;gt; is provided by the bytes in &amp;lt;var&amp;gt;view.buffer&amp;lt;/var&amp;gt; starting at offset &amp;lt;var&amp;gt;view.byteOffset&amp;lt;/var&amp;gt;. A maximum of &amp;lt;var&amp;gt;view.byteLength&amp;lt;/var&amp;gt; bytes are yielded by the stream from &amp;lt;var&amp;gt;view.buffer&amp;lt;/var&amp;gt;. If &amp;lt;var&amp;gt;view&amp;lt;/var&amp;gt; is not specified, the stream is empty.&lt;br /&gt;
#* If the &amp;lt;var&amp;gt;options&amp;lt;/var&amp;gt; parameter is not specified or the &#039;&#039;&#039;stream&#039;&#039;&#039; option is &#039;&#039;&#039;false&#039;&#039;&#039;, then after &amp;lt;var&amp;gt;view.byteLength&amp;lt;/var&amp;gt; bytes have been yielded by the stream, the &#039;&#039;&#039;EOF byte&#039;&#039;&#039; is yielded.&lt;br /&gt;
#* If the internal &amp;lt;var&amp;gt;fatal&amp;lt;/var&amp;gt; flag of the &#039;&#039;decoder object&#039;&#039; is set, then a &#039;&#039;&#039;decoder error&#039;&#039;&#039; causes a &amp;lt;code&amp;gt;DOMException&amp;lt;/code&amp;gt; of type &amp;lt;code&amp;gt;EncodingError&amp;lt;/code&amp;gt; to be thrown rather than emitting a fallback code point.&lt;br /&gt;
#* The output of the algorithm is a sequence of &amp;lt;var&amp;gt;emitted code points&amp;lt;/var&amp;gt;.&lt;br /&gt;
# Return a &amp;lt;code&amp;gt;DOMString&amp;lt;/code&amp;gt; by encoding the &amp;lt;var&amp;gt;sequence of emitted code points&amp;lt;/var&amp;gt; following the steps to &amp;lt;i&amp;gt;convert a sequence of Unicode characters to a DOMString&amp;lt;/i&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
:&#039;&#039;ISSUE: Need to handle BOMs. [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding] specifies this by using the caller-specified encoding as a suggestion, and consuming the BOM as a part of selecting the real encoder where BOM takes precedence. At the very least a matching BOM should be ignored. A mismatching BOM could throw, be a decoding error, or could actually switch the decoder for this stream (call or sequence of calls), possibly if-and-only-if the constructor was called without an encoding.&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/dl&amp;gt;&lt;br /&gt;
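&lt;br /&gt;
The interaction of the &amp;lt;code&amp;gt;stream&amp;lt;/code&amp;gt; option and the &amp;lt;var&amp;gt;fatal&amp;lt;/var&amp;gt; flag described in the steps above can be sketched as follows. The sketch assumes the API as later standardized, where a fatal decoding error surfaces as a &amp;lt;code&amp;gt;TypeError&amp;lt;/code&amp;gt; rather than the &amp;lt;code&amp;gt;EncodingError&amp;lt;/code&amp;gt; named in this draft.&lt;br /&gt;
&lt;br /&gt;
```javascript
// Sketch only: relies on the standardized TextDecoder, not on this draft.
const decoder = new TextDecoder('utf-8', { fatal: true });

// The euro sign is E2 82 AC in UTF-8; split it across two chunks.
const first = decoder.decode(new Uint8Array([0xE2, 0x82]), { stream: true });
const second = decoder.decode(new Uint8Array([0xAC])); // no stream option: flush

console.log(JSON.stringify(first)); // "" (the incomplete sequence is held back)
console.log(second === '\u20AC');   // true

// With fatal set, malformed input throws instead of yielding U+FFFD.
let threw = false;
try {
  decoder.decode(new Uint8Array([0xFF]));
} catch (e) {
  threw = true; // TypeError in the standardized API; this draft says EncodingError
}
console.log(threw); // true
```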
&lt;br /&gt;
== Examples ==&lt;br /&gt;
&lt;br /&gt;
=== Example #1 - encoding strings ===&lt;br /&gt;
&lt;br /&gt;
The following example uses the API to encode an array of strings into an ArrayBuffer. The resulting buffer contains the number of strings (as a Uint32), followed by the byte length of the first encoded string (as a Uint32), the string data encoded with the requested encoding, the byte length of the second encoded string (as a Uint32), the string data, and so on.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
function encodeArrayOfStrings(strings, encoding) {&lt;br /&gt;
  var encoder, encoded, len, i, bytes, view, offset;&lt;br /&gt;
&lt;br /&gt;
  encoder = new TextEncoder(encoding);&lt;br /&gt;
  encoded = [];&lt;br /&gt;
&lt;br /&gt;
  // Reserve space for the string count, then for each length prefix and payload.&lt;br /&gt;
  len = Uint32Array.BYTES_PER_ELEMENT;&lt;br /&gt;
  for (i = 0; i &amp;lt; strings.length; i += 1) {&lt;br /&gt;
    len += Uint32Array.BYTES_PER_ELEMENT;&lt;br /&gt;
    encoded[i] = encoder.encode(strings[i]);&lt;br /&gt;
    len += encoded[i].byteLength;&lt;br /&gt;
  }&lt;br /&gt;
&lt;br /&gt;
  bytes = new Uint8Array(len);&lt;br /&gt;
  view = new DataView(bytes.buffer);&lt;br /&gt;
  offset = 0;&lt;br /&gt;
&lt;br /&gt;
  view.setUint32(offset, strings.length);&lt;br /&gt;
  offset += Uint32Array.BYTES_PER_ELEMENT;&lt;br /&gt;
  for (i = 0; i &amp;lt; encoded.length; i += 1) {&lt;br /&gt;
    len = encoded[i].byteLength;&lt;br /&gt;
    view.setUint32(offset, len);&lt;br /&gt;
    offset += Uint32Array.BYTES_PER_ELEMENT;&lt;br /&gt;
    bytes.set(encoded[i], offset);&lt;br /&gt;
    offset += len;&lt;br /&gt;
  }&lt;br /&gt;
  return bytes.buffer;&lt;br /&gt;
}&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Example #2 - decoding strings ===&lt;br /&gt;
&lt;br /&gt;
The following example decodes an ArrayBuffer containing data encoded in the format produced by the previous example back into an array of strings.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
function decodeArrayOfStrings(buffer, encoding) {&lt;br /&gt;
  var decoder, view, offset, num_strings, strings, i, len;&lt;br /&gt;
&lt;br /&gt;
  decoder = new TextDecoder(encoding);&lt;br /&gt;
  view = new DataView(buffer);&lt;br /&gt;
  offset = 0;&lt;br /&gt;
  strings = [];&lt;br /&gt;
&lt;br /&gt;
  num_strings = view.getUint32(offset);&lt;br /&gt;
  offset += Uint32Array.BYTES_PER_ELEMENT;&lt;br /&gt;
  for (i = 0; i &amp;lt; num_strings; i += 1) {&lt;br /&gt;
    len = view.getUint32(offset);&lt;br /&gt;
    offset += Uint32Array.BYTES_PER_ELEMENT;&lt;br /&gt;
    strings[i] = decoder.decode(&lt;br /&gt;
      new DataView(view.buffer, offset, len));&lt;br /&gt;
    offset += len;&lt;br /&gt;
  }&lt;br /&gt;
  return strings;&lt;br /&gt;
}&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Encodings ==&lt;br /&gt;
&lt;br /&gt;
Encodings are defined and implemented per [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding]. This implicitly includes the steps to &amp;lt;em&amp;gt;get an encoding&amp;lt;/em&amp;gt; from a string, and the logic for label matching and case-insensitivity.&lt;br /&gt;
&lt;br /&gt;
User agents MUST NOT support any other encodings or labels than those defined in [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding], and MUST support all encodings and labels defined in that specification, with the additions defined below.&lt;br /&gt;
&lt;br /&gt;
:&#039;&#039;NOTE: In [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding], &amp;quot;ascii&amp;quot; is a label for &#039;&#039;&#039;windows-1252&#039;&#039;&#039;; there is no 7-bit-or-raise-exception encoding. Applications that are required to restrict the content of decoded strings should implement validation after decoding.&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
:&#039;&#039;NOTE: Unicode normalization forms are outside the scope of this specification. No normalization is done prior to encoding or after decoding.&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
:&#039;&#039;NOTE: Handling of encoding-specific issues, e.g. over-long UTF-8 encodings, byte order marks, unmatched surrogate pairs, and so on is defined by [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding].&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
== References ==&lt;br /&gt;
&lt;br /&gt;
* WebIDL http://dev.w3.org/2006/webapi/WebIDL&lt;br /&gt;
* Encoding http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== Acknowledgements ==&lt;br /&gt;
&lt;br /&gt;
* Alan Chaney&lt;br /&gt;
* Ben Noordhuis&lt;br /&gt;
* Glenn Maynard&lt;br /&gt;
* John Tamplin&lt;br /&gt;
* Kenneth Russell (Google, Inc)&lt;br /&gt;
* Robert Mustacchi&lt;br /&gt;
* Ryan Dahl&lt;br /&gt;
* Anne van Kesteren&lt;br /&gt;
&lt;br /&gt;
== Appendix ==&lt;br /&gt;
&lt;br /&gt;
A &amp;quot;shim&amp;quot; implementation in JavaScript (that may not fully match the current version of the spec) plus some initial unit tests can be found at:&lt;br /&gt;
&lt;br /&gt;
:http://code.google.com/p/stringencoding/&lt;br /&gt;
&lt;br /&gt;
[[Category:Proposals]]&lt;/div&gt;</summary>
		<author><name>Jsbell</name></author>
	</entry>
	<entry>
		<id>https://wiki.whatwg.org/index.php?title=StringEncoding&amp;diff=8348</id>
		<title>StringEncoding</title>
		<link rel="alternate" type="text/html" href="https://wiki.whatwg.org/index.php?title=StringEncoding&amp;diff=8348"/>
		<updated>2012-06-27T16:53:04Z</updated>

		<summary type="html">&lt;p&gt;Jsbell: Reference steps for Unicode code points -&amp;gt; DOMString (TBD)&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;Proposed Text Encoding Web API for Typed Arrays&lt;br /&gt;
&lt;br /&gt;
== Editors ==&lt;br /&gt;
&lt;br /&gt;
* Joshua Bell (Google, Inc)&lt;br /&gt;
&lt;br /&gt;
== Abstract ==&lt;br /&gt;
&lt;br /&gt;
This specification defines an API for encoding strings to binary data, and decoding strings from binary data.&lt;br /&gt;
&lt;br /&gt;
NOTE: This specification intentionally does not address the opposite scenario of encoding binary data as strings and decoding binary data from strings, for example using Base64 encoding.&lt;br /&gt;
&lt;br /&gt;
Discussion on this topic has so far taken place on the [http://www.khronos.org/webgl/public-mailing-list/ public_webgl@khronos.org] mailing list. See http://www.khronos.org/webgl/public-mailing-list/archives/1111/msg00017.html for the initial discussion thread.&lt;br /&gt;
&lt;br /&gt;
Discussion has since moved to the [http://www.whatwg.org/mailing-list#specs WHATWG spec discussion] mailing list. See  http://lists.whatwg.org/htdig.cgi/whatwg-whatwg.org/2012-March/035038.html for the latest discussion thread.&lt;br /&gt;
&lt;br /&gt;
== Open Issues ==&lt;br /&gt;
&lt;br /&gt;
* Encoding errors&lt;br /&gt;
** Define options dict for &amp;lt;code&amp;gt;TextEncoder()&amp;lt;/code&amp;gt; that allows selecting between throw vs. replacement character / replacement callback?&lt;br /&gt;
* The [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding specification] defines the byte order mark as more authoritative than an explicit encoding label when decoding content. &lt;br /&gt;
** Is that desirable here? &lt;br /&gt;
** What does it mean to e.g. write &amp;lt;code&amp;gt;TextDecoder(&#039;iso-2022-kr&#039;).decode(str)&amp;lt;/code&amp;gt; if string turns out to be UTF-8?&lt;br /&gt;
** What about streaming and other re-uses of the same decoder object?&lt;br /&gt;
* Streaming decode/encode requires keeping buffers between calls.&lt;br /&gt;
** Some encode/decode algorithms require adjusting the &amp;lt;var&amp;gt;code point pointer&amp;lt;/var&amp;gt; or &amp;lt;var&amp;gt;byte pointer&amp;lt;/var&amp;gt; by a negative amount. This could occur across &amp;quot;chunk&amp;quot; boundaries. This implies that when the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag is set on an encoder/decoder, the last N elements of the stream are saved for the next call and used as a prefix for the stream. N is defined by the specific encoding algorithm.&lt;br /&gt;
** &#039;&#039;This is more of a note to implementers than an issue, but it may merit further discussion.&#039;&#039;&lt;br /&gt;
* &amp;lt;b&amp;gt;&amp;lt;i&amp;gt;Converting a sequence of Unicode characters to a DOMString&amp;lt;/i&amp;gt;&amp;lt;/b&amp;gt; needs to be defined. &lt;br /&gt;
** [http://dev.w3.org/2006/webapi/WebIDL/#idl-DOMString WebIDL] currently only defines the reverse, &amp;lt;b&amp;gt;&amp;lt;i&amp;gt;convert a DOMString to a sequence of Unicode characters&amp;lt;/i&amp;gt;&amp;lt;/b&amp;gt;, but that&#039;s probably the right place to define it in the platform.&lt;br /&gt;
&lt;br /&gt;
== Resolved Issues ==&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dl&amp;gt;&lt;br /&gt;
&amp;lt;dt&amp;gt;Encode as many characters as possible into a fixed-size buffer for transmission, and repeat starting with next unencoded character&lt;br /&gt;
&amp;lt;dd&amp;gt;Resolution: not for &amp;quot;v1&amp;quot; - can be implemented using this API, and breaking the string down by &amp;quot;character&amp;quot; is unlikely to be as obvious as it sounds - surrogate pairs, combining sequences, etc&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dt&amp;gt;Support two versions of encode; one which takes target buffer, and one which creates/returns a right-sized buffer&lt;br /&gt;
&amp;lt;dd&amp;gt;Resolution: See above, and wait for developer feedback.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dt&amp;gt;Allow arbitrary end byte sequences (e.g. &amp;lt;code&amp;gt;0xFF&amp;lt;/code&amp;gt; for UTF-8 strings)&lt;br /&gt;
&amp;lt;dd&amp;gt;Resolution: Add &amp;lt;code&amp;gt;indexOf&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;ArrayBufferView&amp;lt;/code&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dt&amp;gt;Remove &#039;&#039;binary&#039;&#039; encoding? &#039;&#039;(a proposed 8-bit-clean encoding for interop with legacy binary data stored in ECMAScript strings)&#039;&#039;&lt;br /&gt;
&amp;lt;dd&amp;gt;The only real use is with &amp;lt;code&amp;gt;atob()&amp;lt;/code&amp;gt;/&amp;lt;code&amp;gt;btoa()&amp;lt;/code&amp;gt;; a better API would be Base64 directly in/out of Typed Arrays. Consensus on WHATWG is to add better APIs e.g. &lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
partial interface ArrayBufferView {&lt;br /&gt;
    DOMString toBase64();&lt;br /&gt;
};&lt;br /&gt;
&lt;br /&gt;
partial interface ArrayBuffer {&lt;br /&gt;
    static ArrayBuffer fromBase64(DOMString string);&lt;br /&gt;
};&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dt&amp;gt;Should this be a standalone API (as written), or live on e.g. DataView, or on e.g. String?&lt;br /&gt;
&amp;lt;dd&amp;gt;There seems to be pretty strong consensus that streaming, stateful coding is high priority and that an object-oriented API is cleanest, and shoe-horning those onto existing objects would be messy.&lt;br /&gt;
&amp;lt;/dl&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== API ==&lt;br /&gt;
&lt;br /&gt;
=== TextEncoder ===&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;WebIDL&#039;&#039;&#039;&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
dictionary TextEncodeOptions {&lt;br /&gt;
  boolean stream = false;&lt;br /&gt;
};&lt;br /&gt;
&lt;br /&gt;
[Constructor,&lt;br /&gt;
 Constructor(DOMString encoding)]&lt;br /&gt;
interface TextEncoder {&lt;br /&gt;
  readonly attribute DOMString encoding;&lt;br /&gt;
  ArrayBufferView encode(DOMString? string, optional TextEncodeOptions options);&lt;br /&gt;
};&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The constructor runs the following steps:&lt;br /&gt;
# If the constructor is called with no arguments, let &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt; be &amp;quot;&amp;lt;code&amp;gt;utf-8&amp;lt;/code&amp;gt;&amp;quot;. Otherwise, let &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt; be the value of the &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; argument.&lt;br /&gt;
# The constructor follows the &#039;&#039;&#039;steps to get an encoding&#039;&#039;&#039; from [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding], with &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt; as &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt;.&lt;br /&gt;
#* If the steps result in failure, throw an &amp;quot;&amp;lt;code&amp;gt;EncodingError&amp;lt;/code&amp;gt;&amp;quot; exception and terminate these steps.&lt;br /&gt;
#* Otherwise, set the encoder object&#039;s internal &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; property to the returned encoding.&lt;br /&gt;
# Initialize the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag of the encoder object to false. &lt;br /&gt;
# Initialize the internal &amp;lt;var&amp;gt;encoding algorithm state&amp;lt;/var&amp;gt; to the default values for the encoding &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dl&amp;gt;&lt;br /&gt;
&amp;lt;dt&amp;gt;&amp;lt;code&amp;gt;encoding&amp;lt;/code&amp;gt; of type DOMString, readonly&lt;br /&gt;
&amp;lt;dd&amp;gt;Returns the &amp;lt;var&amp;gt;Name&amp;lt;/var&amp;gt; of the encoder object&#039;s &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;, per [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding]. &lt;br /&gt;
&lt;br /&gt;
:Note that this may differ from the name of the encoding specified during the call to the constructor. For example, if the constructor is called with &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; of &amp;lt;code&amp;gt;&amp;quot;ascii&amp;quot;&amp;lt;/code&amp;gt; the &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; attribute of the &#039;&#039;encoder object&#039;&#039; would have the value &amp;lt;code&amp;gt;&amp;quot;windows-1252&amp;quot;&amp;lt;/code&amp;gt; as &amp;lt;code&amp;gt;&amp;quot;ascii&amp;quot;&amp;lt;/code&amp;gt; is a label for that encoding.&lt;br /&gt;
&amp;lt;/dl&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dl&amp;gt;&lt;br /&gt;
&amp;lt;dt&amp;gt;&amp;lt;code&amp;gt;encode&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;dd&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The &amp;lt;code&amp;gt;encode&amp;lt;/code&amp;gt; method runs these steps:&lt;br /&gt;
&lt;br /&gt;
# If the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag is not set, then reset the &amp;lt;var&amp;gt;encoding algorithm state&amp;lt;/var&amp;gt; to the default values for &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;. Otherwise, the &amp;lt;var&amp;gt;encoding algorithm state&amp;lt;/var&amp;gt; is re-used from the previous call to &amp;lt;code&amp;gt;encode&amp;lt;/code&amp;gt; on this object.&lt;br /&gt;
# If the &amp;lt;var&amp;gt;options&amp;lt;/var&amp;gt; parameter is specified and the &amp;lt;code&amp;gt;stream&amp;lt;/code&amp;gt; option is &#039;&#039;&#039;true&#039;&#039;&#039;, then the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag is set; otherwise the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag is cleared.&lt;br /&gt;
# Run the steps of the &amp;lt;var&amp;gt;encoding algorithm&amp;lt;/var&amp;gt;:&lt;br /&gt;
#* The input to the algorithm is a &amp;lt;var&amp;gt;stream of code points&amp;lt;/var&amp;gt;. The stream is composed of the Unicode code points for the Unicode characters produced by following the steps to &#039;&#039;[http://dev.w3.org/2006/webapi/WebIDL/#dfn-obtain-unicode convert a DOMString to a sequence of Unicode characters]&#039;&#039; in [http://dev.w3.org/2006/webapi/WebIDL/ WebIDL] with &amp;lt;var&amp;gt;string&amp;lt;/var&amp;gt; as the input. If &amp;lt;var&amp;gt;string&amp;lt;/var&amp;gt; is null, the &amp;lt;var&amp;gt;stream of code points&amp;lt;/var&amp;gt; is empty.&lt;br /&gt;
#* If the &amp;lt;var&amp;gt;options&amp;lt;/var&amp;gt; parameter is not specified or the &amp;lt;code&amp;gt;stream&amp;lt;/code&amp;gt; option is false, then after the final code point is yielded by the stream, the &#039;&#039;&#039;EOF code point&#039;&#039;&#039; is yielded.&lt;br /&gt;
#* The output of the algorithm is a &amp;lt;var&amp;gt;sequence of emitted bytes&amp;lt;/var&amp;gt;.&lt;br /&gt;
# Return a &amp;lt;code&amp;gt;Uint8Array&amp;lt;/code&amp;gt; object wrapping an &amp;lt;code&amp;gt;ArrayBuffer&amp;lt;/code&amp;gt; containing the &amp;lt;var&amp;gt;sequence of emitted bytes&amp;lt;/var&amp;gt; produced by the encoding algorithm.&lt;br /&gt;
&amp;lt;/dl&amp;gt;&lt;br /&gt;
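&lt;br /&gt;
As a minimal, non-normative sketch of the steps above, the following encodes a single string with the default encoding. It assumes a runtime that exposes a &amp;lt;code&amp;gt;TextEncoder&amp;lt;/code&amp;gt; global whose no-argument constructor selects UTF-8; the variable names are illustrative only.&lt;br /&gt;
&lt;br /&gt;
```javascript
// Non-normative sketch; assumes a TextEncoder global is available.
// With no arguments the constructor selects the utf-8 encoding.
const encoder = new TextEncoder();

// U+20AC EURO SIGN is a single code point that UTF-8 encodes as three bytes.
const bytes = encoder.encode('\u20AC');

// Copy the Uint8Array into a plain array for easy inspection.
const byteList = Array.from(bytes);
```
&lt;br /&gt;
Here &amp;lt;code&amp;gt;byteList&amp;lt;/code&amp;gt; holds the byte values 0xE2, 0x82 and 0xAC, in that order.&lt;br /&gt;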
&lt;br /&gt;
=== TextDecoder ===&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;WebIDL&#039;&#039;&#039;&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
dictionary TextDecoderOptions {&lt;br /&gt;
  boolean fatal = false;&lt;br /&gt;
};&lt;br /&gt;
&lt;br /&gt;
dictionary TextDecodeOptions {&lt;br /&gt;
  boolean stream = false;&lt;br /&gt;
};&lt;br /&gt;
&lt;br /&gt;
[Constructor,&lt;br /&gt;
 Constructor(optional DOMString encoding, optional TextDecoderOptions options)]&lt;br /&gt;
interface TextDecoder {&lt;br /&gt;
  readonly attribute DOMString encoding;&lt;br /&gt;
  DOMString decode(optional ArrayBufferView view, optional TextDecodeOptions options);&lt;br /&gt;
};&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The constructor runs the following steps:&lt;br /&gt;
# The constructor creates a &#039;&#039;decoder object&#039;&#039;. It has the internal properties &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;, a &amp;lt;var&amp;gt;fatal&amp;lt;/var&amp;gt; flag which is initially unset, and a &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag which is initially unset.&lt;br /&gt;
# If called without an &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; argument, let &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt; be &amp;quot;&amp;lt;code&amp;gt;utf-8&amp;lt;/code&amp;gt;&amp;quot;. Otherwise, let &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt; be the value of the &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; argument.&lt;br /&gt;
# Run the &#039;&#039;&#039;steps to get an encoding&#039;&#039;&#039; from [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding], with &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt; as &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt;. &lt;br /&gt;
## If the steps result in failure, throw an &amp;quot;&amp;lt;code&amp;gt;EncodingError&amp;lt;/code&amp;gt;&amp;quot; exception and terminate these steps.&lt;br /&gt;
## Otherwise, set the &#039;&#039;decoder object&#039;s&#039;&#039; internal &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; property to the returned encoding.&lt;br /&gt;
# If the constructor is called with an &amp;lt;var&amp;gt;options&amp;lt;/var&amp;gt; argument, and the &amp;lt;var&amp;gt;fatal&amp;lt;/var&amp;gt; property of the dictionary is set, set the internal &amp;lt;var&amp;gt;fatal&amp;lt;/var&amp;gt; flag of the &#039;&#039;decoder object&#039;&#039;.&lt;br /&gt;
# Initialize the internal &amp;lt;var&amp;gt;encoding algorithm state&amp;lt;/var&amp;gt; to the default values for the encoding &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;.&lt;br /&gt;
# Return the &#039;&#039;decoder object&#039;&#039;.&lt;br /&gt;
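&lt;br /&gt;
The effect of the &amp;lt;var&amp;gt;fatal&amp;lt;/var&amp;gt; flag can be sketched as follows. This is an illustration under assumptions, not part of the proposal: it assumes a runtime that provides a &amp;lt;code&amp;gt;TextDecoder&amp;lt;/code&amp;gt; global accepting a &amp;lt;code&amp;gt;fatal&amp;lt;/code&amp;gt; option, and the helper &amp;lt;code&amp;gt;tryDecode&amp;lt;/code&amp;gt; is hypothetical. The exception type thrown by a given implementation may differ from the &amp;lt;code&amp;gt;EncodingError&amp;lt;/code&amp;gt; named here, so the sketch catches any exception.&lt;br /&gt;
&lt;br /&gt;
```javascript
// Non-normative sketch; tryDecode is a hypothetical helper.
// Assumes a TextDecoder global that accepts a fatal option.
function tryDecode(bytes, fatal) {
  const decoder = new TextDecoder('utf-8', { fatal: fatal });
  try {
    return { ok: true, text: decoder.decode(bytes) };
  } catch (e) {
    // The exact exception type may vary between implementations.
    return { ok: false, text: null };
  }
}

// 0xFF can never appear in well-formed UTF-8.
const invalid = new Uint8Array([0xFF]);

// Without the fatal flag a fallback code point (U+FFFD) is emitted.
const lenient = tryDecode(invalid, false);

// With the fatal flag set the decoder error surfaces as an exception.
const strict = tryDecode(invalid, true);
```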
&lt;br /&gt;
&amp;lt;dl&amp;gt;&lt;br /&gt;
&amp;lt;dt&amp;gt;&amp;lt;code&amp;gt;encoding&amp;lt;/code&amp;gt; of type DOMString, readonly&lt;br /&gt;
&amp;lt;dd&amp;gt;&lt;br /&gt;
Returns the &amp;lt;var&amp;gt;Name&amp;lt;/var&amp;gt; of the decoder object&#039;s &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;, per [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding].&lt;br /&gt;
&lt;br /&gt;
:Note that this may differ from the name of the encoding specified during the call to the constructor. For example, if the constructor is called with &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; of &amp;lt;code&amp;gt;&amp;quot;ascii&amp;quot;&amp;lt;/code&amp;gt; the &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; attribute of the &#039;&#039;decoder object&#039;&#039; would have the value &amp;lt;code&amp;gt;&amp;quot;windows-1252&amp;quot;&amp;lt;/code&amp;gt; as &amp;lt;code&amp;gt;&amp;quot;ascii&amp;quot;&amp;lt;/code&amp;gt; is a label for that encoding.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dt&amp;gt;&amp;lt;code&amp;gt;decode&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;dd&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The &amp;lt;code&amp;gt;decode&amp;lt;/code&amp;gt; method runs the following steps:&lt;br /&gt;
&lt;br /&gt;
# If the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag of the &#039;&#039;decoder object&#039;&#039; is not set, then reset the &amp;lt;var&amp;gt;encoding algorithm state&amp;lt;/var&amp;gt; to the default values for encoding &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;. Otherwise, the &amp;lt;var&amp;gt;encoding algorithm state&amp;lt;/var&amp;gt; is re-used from the previous call to &amp;lt;code&amp;gt;decode&amp;lt;/code&amp;gt; on this object. &lt;br /&gt;
# If the &amp;lt;var&amp;gt;options&amp;lt;/var&amp;gt; parameter is specified and the &amp;lt;var&amp;gt;stream&amp;lt;/var&amp;gt; option is &#039;&#039;&#039;true&#039;&#039;&#039;, then the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag is set; otherwise the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag is cleared.&lt;br /&gt;
# Run the decoder algorithm of the &#039;&#039;decoder object&#039;s&#039;&#039; encoding:&lt;br /&gt;
#* The input to the algorithm is a &amp;lt;var&amp;gt;byte stream&amp;lt;/var&amp;gt;. The &amp;lt;var&amp;gt;byte stream&amp;lt;/var&amp;gt; is provided by the bytes in &amp;lt;var&amp;gt;view.buffer&amp;lt;/var&amp;gt; starting at offset &amp;lt;var&amp;gt;view.byteOffset&amp;lt;/var&amp;gt;. A maximum of &amp;lt;var&amp;gt;view.byteLength&amp;lt;/var&amp;gt; bytes are yielded by the stream from &amp;lt;var&amp;gt;view.buffer&amp;lt;/var&amp;gt;. If &amp;lt;var&amp;gt;view&amp;lt;/var&amp;gt; is not specified, the stream is empty.&lt;br /&gt;
#* If the &amp;lt;var&amp;gt;options&amp;lt;/var&amp;gt; parameter is not specified or the &#039;&#039;&#039;stream&#039;&#039;&#039; option is &#039;&#039;&#039;false&#039;&#039;&#039;, then after &amp;lt;var&amp;gt;view.byteLength&amp;lt;/var&amp;gt; bytes are yielded by the stream the &#039;&#039;&#039;EOF byte&#039;&#039;&#039; is yielded.&lt;br /&gt;
#* If the internal &amp;lt;var&amp;gt;fatal&amp;lt;/var&amp;gt; flag of the &#039;&#039;decoder object&#039;&#039; is set, then a &#039;&#039;&#039;decoder error&#039;&#039;&#039; causes a &amp;lt;code&amp;gt;DOMException&amp;lt;/code&amp;gt; of type &amp;lt;code&amp;gt;EncodingError&amp;lt;/code&amp;gt; to be thrown rather than emitting a fallback code point.&lt;br /&gt;
#* The output of the algorithm is a sequence of &amp;lt;var&amp;gt;emitted code points&amp;lt;/var&amp;gt;.&lt;br /&gt;
# Return a &amp;lt;code&amp;gt;DOMString&amp;lt;/code&amp;gt; by encoding the &amp;lt;var&amp;gt;sequence of emitted code points&amp;lt;/var&amp;gt; following the steps to &amp;lt;i&amp;gt;convert a sequence of Unicode characters to a DOMString&amp;lt;/i&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
:&#039;&#039;ISSUE: Need to handle BOMs. [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding] specifies this by using the caller-specified encoding as a suggestion, and consuming the BOM as a part of selecting the real encoder where BOM takes precedence. At the very least a matching BOM should be ignored. A mismatching BOM could throw, be a decoding error, or could actually switch the decoder for this stream (call or sequence of calls), possibly if-and-only-if the constructor was called without an encoding.&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/dl&amp;gt;&lt;br /&gt;
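&lt;br /&gt;
The streaming behaviour described for &amp;lt;code&amp;gt;decode&amp;lt;/code&amp;gt; can be illustrated with a multi-byte sequence split across two chunks. This is a rough sketch, assuming a runtime that provides a &amp;lt;code&amp;gt;TextDecoder&amp;lt;/code&amp;gt; global supporting the &amp;lt;code&amp;gt;stream&amp;lt;/code&amp;gt; option; the helper &amp;lt;code&amp;gt;decodeChunks&amp;lt;/code&amp;gt; is hypothetical.&lt;br /&gt;
&lt;br /&gt;
```javascript
// Non-normative sketch; decodeChunks is a hypothetical helper.
// Assumes a TextDecoder global that supports the stream option.
function decodeChunks(chunks) {
  const decoder = new TextDecoder('utf-8');
  let text = '';
  for (const chunk of chunks) {
    // stream: true keeps any incomplete byte sequence for the next call.
    text += decoder.decode(chunk, { stream: true });
  }
  // A final call without the stream option signals the end of the input.
  text += decoder.decode();
  return text;
}

// U+20AC EURO SIGN is E2 82 AC in UTF-8; split it across two chunks.
const chunks = [new Uint8Array([0xE2, 0x82]), new Uint8Array([0xAC])];
const decoded = decodeChunks(chunks);
```
&lt;br /&gt;
Decoding each chunk independently, without the &amp;lt;code&amp;gt;stream&amp;lt;/code&amp;gt; option, would instead treat both partial sequences as decoder errors.&lt;br /&gt;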
&lt;br /&gt;
== Examples ==&lt;br /&gt;
&lt;br /&gt;
=== Example #1 - encoding strings ===&lt;br /&gt;
&lt;br /&gt;
The following example uses the API to encode an array of strings into an ArrayBuffer. The result is an ArrayBuffer containing the number of strings (as a Uint32), followed by the length of the first string (as a Uint32), the UTF-8 encoded string data, the length of the second string (as a Uint32), the string data, and so on.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
function encodeArrayOfStrings(strings, encoding) {&lt;br /&gt;
  var encoder, encoded, len, i, bytes, view, offset;&lt;br /&gt;
&lt;br /&gt;
  encoder = new TextEncoder(encoding);&lt;br /&gt;
  encoded = [];&lt;br /&gt;
&lt;br /&gt;
  len = Uint32Array.BYTES_PER_ELEMENT;&lt;br /&gt;
  for (i = 0; i &amp;lt; strings.length; i += 1) {&lt;br /&gt;
    len += Uint32Array.BYTES_PER_ELEMENT;&lt;br /&gt;
    encoded[i] = encoder.encode(strings[i]);&lt;br /&gt;
    len += encoded[i].byteLength;&lt;br /&gt;
  }&lt;br /&gt;
&lt;br /&gt;
  bytes = new Uint8Array(len);&lt;br /&gt;
  view = new DataView(bytes.buffer);&lt;br /&gt;
  offset = 0;&lt;br /&gt;
&lt;br /&gt;
  view.setUint32(offset, strings.length);&lt;br /&gt;
  offset += Uint32Array.BYTES_PER_ELEMENT;&lt;br /&gt;
  for (i = 0; i &amp;lt; encoded.length; i += 1) {&lt;br /&gt;
    len = encoded[i].byteLength;&lt;br /&gt;
    view.setUint32(offset, len);&lt;br /&gt;
    offset += Uint32Array.BYTES_PER_ELEMENT;&lt;br /&gt;
    bytes.set(encoded[i], offset);&lt;br /&gt;
    offset += len;&lt;br /&gt;
  }&lt;br /&gt;
  return bytes.buffer;&lt;br /&gt;
}&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Example #2 - decoding strings ===&lt;br /&gt;
&lt;br /&gt;
The following example decodes an ArrayBuffer containing data encoded in the format produced by the previous example back into an array of strings.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
function decodeArrayOfStrings(buffer, encoding) {&lt;br /&gt;
  var decoder, view, offset, num_strings, strings, i, len;&lt;br /&gt;
&lt;br /&gt;
  decoder = new TextDecoder(encoding);&lt;br /&gt;
  view = new DataView(buffer);&lt;br /&gt;
  offset = 0;&lt;br /&gt;
  strings = [];&lt;br /&gt;
&lt;br /&gt;
  num_strings = view.getUint32(offset);&lt;br /&gt;
  offset += Uint32Array.BYTES_PER_ELEMENT;&lt;br /&gt;
  for (i = 0; i &amp;lt; num_strings; i += 1) {&lt;br /&gt;
    len = view.getUint32(offset);&lt;br /&gt;
    offset += Uint32Array.BYTES_PER_ELEMENT;&lt;br /&gt;
    strings[i] = decoder.decode(&lt;br /&gt;
      new DataView(view.buffer, offset, len));&lt;br /&gt;
    offset += len;&lt;br /&gt;
  }&lt;br /&gt;
  return strings;&lt;br /&gt;
}&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Encodings ==&lt;br /&gt;
&lt;br /&gt;
Encodings are defined and implemented per [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding]. This implicitly includes the steps to &amp;lt;em&amp;gt;get an encoding&amp;lt;/em&amp;gt; from a string, and the logic for label matching and case-insensitivity.&lt;br /&gt;
&lt;br /&gt;
User agents MUST NOT support any other encodings or labels than those defined in [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding], and MUST support all encodings and labels defined in that specification, with the additions defined below.&lt;br /&gt;
&lt;br /&gt;
:&#039;&#039;NOTE: In [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding], &amp;quot;ascii&amp;quot; is a label for &#039;&#039;&#039;windows-1252&#039;&#039;&#039;; there is no 7-bit-or-raise-exception encoding. Applications that are required to restrict the content of decoded strings should implement validation after decoding.&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
:&#039;&#039;NOTE: Unicode normalization forms are outside the scope of this specification. No normalization is done prior to encoding or after decoding.&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
:&#039;&#039;NOTE: Handling of encoding-specific issues, e.g. over-long UTF-8 encodings, byte order marks, unmatched surrogate pairs, and so on is defined by [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding].&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
== References ==&lt;br /&gt;
&lt;br /&gt;
* WebIDL http://dev.w3.org/2006/webapi/WebIDL&lt;br /&gt;
* Encoding http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== Acknowledgements ==&lt;br /&gt;
&lt;br /&gt;
* Alan Chaney&lt;br /&gt;
* Ben Noordhuis&lt;br /&gt;
* Glenn Maynard&lt;br /&gt;
* John Tamplin&lt;br /&gt;
* Kenneth Russell (Google, Inc)&lt;br /&gt;
* Robert Mustacchi&lt;br /&gt;
* Ryan Dahl&lt;br /&gt;
* Anne van Kesteren&lt;br /&gt;
&lt;br /&gt;
== Appendix ==&lt;br /&gt;
&lt;br /&gt;
A &amp;quot;shim&amp;quot; implementation in JavaScript (that may not fully match the current version of the spec) plus some initial unit tests can be found at:&lt;br /&gt;
&lt;br /&gt;
:http://code.google.com/p/stringencoding/&lt;br /&gt;
&lt;br /&gt;
[[Category:Proposals]]&lt;/div&gt;</summary>
		<author><name>Jsbell</name></author>
	</entry>
	<entry>
		<id>https://wiki.whatwg.org/index.php?title=StringEncoding&amp;diff=8347</id>
		<title>StringEncoding</title>
		<link rel="alternate" type="text/html" href="https://wiki.whatwg.org/index.php?title=StringEncoding&amp;diff=8347"/>
		<updated>2012-06-27T16:50:25Z</updated>

		<summary type="html">&lt;p&gt;Jsbell: Reference WebIDL for DOMString -&amp;gt; code points&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;Proposed Text Encoding Web API for Typed Arrays&lt;br /&gt;
&lt;br /&gt;
== Editors ==&lt;br /&gt;
&lt;br /&gt;
* Joshua Bell (Google, Inc)&lt;br /&gt;
&lt;br /&gt;
== Abstract ==&lt;br /&gt;
&lt;br /&gt;
This specification defines an API for encoding strings to binary data, and decoding strings from binary data.&lt;br /&gt;
&lt;br /&gt;
NOTE: This specification intentionally does not address the opposite scenario of encoding binary data as strings and decoding binary data from strings, for example using Base64 encoding.&lt;br /&gt;
&lt;br /&gt;
Discussion on this topic has so far taken place on the [http://www.khronos.org/webgl/public-mailing-list/ public_webgl@khronos.org] mailing list. See http://www.khronos.org/webgl/public-mailing-list/archives/1111/msg00017.html for the initial discussion thread.&lt;br /&gt;
&lt;br /&gt;
Discussion has since moved to the [http://www.whatwg.org/mailing-list#specs WHATWG spec discussion] mailing list. See  http://lists.whatwg.org/htdig.cgi/whatwg-whatwg.org/2012-March/035038.html for the latest discussion thread.&lt;br /&gt;
&lt;br /&gt;
== Open Issues ==&lt;br /&gt;
&lt;br /&gt;
* Encoding errors&lt;br /&gt;
** Define options dict for &amp;lt;code&amp;gt;TextEncoder()&amp;lt;/code&amp;gt; that allows selecting between throw vs. replacement character / replacement callback?&lt;br /&gt;
* The [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding specification] defines the byte order mark as more authoritative than an explicit encoding label when decoding content. &lt;br /&gt;
** Is that desirable here? &lt;br /&gt;
** What does it mean to e.g. write &amp;lt;code&amp;gt;TextDecoder(&#039;iso-2022-kr&#039;).decode(str)&amp;lt;/code&amp;gt; if string turns out to be UTF-8?&lt;br /&gt;
** What about streaming and other re-uses of the same decoder object?&lt;br /&gt;
* Streaming decode/encode requires keeping buffers between calls.&lt;br /&gt;
** Some encode/decode algorithms require adjusting the &amp;lt;var&amp;gt;code point pointer&amp;lt;/var&amp;gt; or &amp;lt;var&amp;gt;byte pointer&amp;lt;/var&amp;gt; by a negative amount. This could occur across &amp;quot;chunk&amp;quot; boundaries. This implies that when the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag is set on an encoder/decoder, the last N elements of the stream are saved for the next call and used as a prefix for the stream. N is defined by the specific encoding algorithm.&lt;br /&gt;
** &#039;&#039;This is more of a note to implementers than an issue, but it may merit further discussion.&#039;&#039;&lt;br /&gt;
* &amp;lt;b&amp;gt;&amp;lt;i&amp;gt;Converting a sequence of Unicode characters to a DOMString&amp;lt;/i&amp;gt;&amp;lt;/b&amp;gt; needs to be defined. &lt;br /&gt;
** [http://dev.w3.org/2006/webapi/WebIDL/#idl-DOMString WebIDL] currently only defines the reverse, &amp;lt;b&amp;gt;&amp;lt;i&amp;gt;convert a DOMString to a sequence of Unicode characters&amp;lt;/i&amp;gt;&amp;lt;/b&amp;gt;, but that&#039;s probably the right place to define it in the platform.&lt;br /&gt;
&lt;br /&gt;
== Resolved Issues ==&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dl&amp;gt;&lt;br /&gt;
&amp;lt;dt&amp;gt;Encode as many characters as possible into a fixed-size buffer for transmission, and repeat starting with next unencoded character&lt;br /&gt;
&amp;lt;dd&amp;gt;Resolution: not for &amp;quot;v1&amp;quot; - can be implemented using this API, and breaking the string down by &amp;quot;character&amp;quot; is unlikely to be as obvious as it sounds - surrogate pairs, combining sequences, etc&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dt&amp;gt;Support two versions of encode; one which takes target buffer, and one which creates/returns a right-sized buffer&lt;br /&gt;
&amp;lt;dd&amp;gt;Resolution: See above, and wait for developer feedback.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dt&amp;gt;Allow arbitrary end byte sequences (e.g. &amp;lt;code&amp;gt;0xFF&amp;lt;/code&amp;gt; for UTF-8 strings)&lt;br /&gt;
&amp;lt;dd&amp;gt;Resolution: Add &amp;lt;code&amp;gt;indexOf&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;ArrayBufferView&amp;lt;/code&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dt&amp;gt;Remove &#039;&#039;binary&#039;&#039; encoding? &#039;&#039;(a proposed 8-bit-clean encoding for interop with legacy binary data stored in ECMAScript strings)&#039;&#039;&lt;br /&gt;
&amp;lt;dd&amp;gt;The only real use is with &amp;lt;code&amp;gt;atob()&amp;lt;/code&amp;gt;/&amp;lt;code&amp;gt;btoa()&amp;lt;/code&amp;gt;; a better API would be Base64 directly in/out of Typed Arrays. Consensus on WHATWG is to add better APIs e.g. &lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
partial interface ArrayBufferView {&lt;br /&gt;
    DOMString toBase64();&lt;br /&gt;
};&lt;br /&gt;
&lt;br /&gt;
partial interface ArrayBuffer {&lt;br /&gt;
    static ArrayBuffer fromBase64(DOMString string);&lt;br /&gt;
};&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dt&amp;gt;Should this be a standalone API (as written), or live on e.g. DataView, or on e.g. String?&lt;br /&gt;
&amp;lt;dd&amp;gt;There seems to be pretty strong consensus that streaming, stateful coding is high priority and that an object-oriented API is cleanest, and shoe-horning those onto existing objects would be messy.&lt;br /&gt;
&amp;lt;/dl&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== API ==&lt;br /&gt;
&lt;br /&gt;
=== TextEncoder ===&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;WebIDL&#039;&#039;&#039;&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
dictionary TextEncodeOptions {&lt;br /&gt;
  boolean stream = false;&lt;br /&gt;
};&lt;br /&gt;
&lt;br /&gt;
[Constructor,&lt;br /&gt;
 Constructor(DOMString encoding)]&lt;br /&gt;
interface TextEncoder {&lt;br /&gt;
  readonly attribute DOMString encoding;&lt;br /&gt;
  ArrayBufferView encode(DOMString? string, optional TextEncodeOptions options);&lt;br /&gt;
};&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The constructor runs the following steps:&lt;br /&gt;
# If the constructor is called with no arguments, let &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt; be &amp;quot;&amp;lt;code&amp;gt;utf-8&amp;lt;/code&amp;gt;&amp;quot;. Otherwise, let &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt; be the value of the &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; argument.&lt;br /&gt;
# The constructor follows the &#039;&#039;&#039;steps to get an encoding&#039;&#039;&#039; from [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding], with &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt; as &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt;.&lt;br /&gt;
#* If the steps result in failure, throw an &amp;quot;&amp;lt;code&amp;gt;EncodingError&amp;lt;/code&amp;gt;&amp;quot; exception and terminate these steps.&lt;br /&gt;
#* Otherwise, set the encoder object&#039;s internal &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; property to the returned encoding.&lt;br /&gt;
# Initialize the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag of the encoder object to false. &lt;br /&gt;
# Initialize the internal &amp;lt;var&amp;gt;encoding algorithm state&amp;lt;/var&amp;gt; to the default values for the encoding &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dl&amp;gt;&lt;br /&gt;
&amp;lt;dt&amp;gt;&amp;lt;code&amp;gt;encoding&amp;lt;/code&amp;gt; of type DOMString, readonly&lt;br /&gt;
&amp;lt;dd&amp;gt;Returns the &amp;lt;var&amp;gt;Name&amp;lt;/var&amp;gt; of the encoder object&#039;s &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;, per [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding]. &lt;br /&gt;
&lt;br /&gt;
:Note that this may differ from the name of the encoding specified during the call to the constructor. For example, if the constructor is called with &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; of &amp;lt;code&amp;gt;&amp;quot;ascii&amp;quot;&amp;lt;/code&amp;gt; the &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; attribute of the &#039;&#039;encoder object&#039;&#039; would have the value &amp;lt;code&amp;gt;&amp;quot;windows-1252&amp;quot;&amp;lt;/code&amp;gt; as &amp;lt;code&amp;gt;&amp;quot;ascii&amp;quot;&amp;lt;/code&amp;gt; is a label for that encoding.&lt;br /&gt;
&amp;lt;/dl&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dl&amp;gt;&lt;br /&gt;
&amp;lt;dt&amp;gt;&amp;lt;code&amp;gt;encode&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;dd&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The &amp;lt;code&amp;gt;encode&amp;lt;/code&amp;gt; method runs these steps:&lt;br /&gt;
&lt;br /&gt;
# If the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag is not set, then reset the &amp;lt;var&amp;gt;encoding algorithm state&amp;lt;/var&amp;gt; to the default values for &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;. Otherwise, the &amp;lt;var&amp;gt;encoding algorithm state&amp;lt;/var&amp;gt; is re-used from the previous call to &amp;lt;code&amp;gt;encode&amp;lt;/code&amp;gt; on this object.&lt;br /&gt;
# If the &amp;lt;var&amp;gt;options&amp;lt;/var&amp;gt; parameter is specified and the &amp;lt;code&amp;gt;stream&amp;lt;/code&amp;gt; option is &#039;&#039;&#039;true&#039;&#039;&#039;, then the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag is set; otherwise the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag is cleared.&lt;br /&gt;
# Run the steps of the &amp;lt;var&amp;gt;encoding algorithm&amp;lt;/var&amp;gt;:&lt;br /&gt;
#* The input to the algorithm is a &amp;lt;var&amp;gt;stream of code points&amp;lt;/var&amp;gt;. The stream is composed of the Unicode code points for the Unicode characters produced by following the steps to &#039;&#039;[http://dev.w3.org/2006/webapi/WebIDL/#dfn-obtain-unicode convert a DOMString to a sequence of Unicode characters]&#039;&#039; in [http://dev.w3.org/2006/webapi/WebIDL/ WebIDL] with &amp;lt;var&amp;gt;string&amp;lt;/var&amp;gt; as the input. If &amp;lt;var&amp;gt;string&amp;lt;/var&amp;gt; is null, the &amp;lt;var&amp;gt;stream of code points&amp;lt;/var&amp;gt; is empty.&lt;br /&gt;
#* If the &amp;lt;var&amp;gt;options&amp;lt;/var&amp;gt; parameter is not specified or the &amp;lt;code&amp;gt;stream&amp;lt;/code&amp;gt; option is &#039;&#039;&#039;false&#039;&#039;&#039;, then after the final code point is yielded by the stream, the &#039;&#039;&#039;EOF code point&#039;&#039;&#039; is yielded.&lt;br /&gt;
#* The output of the algorithm is a &amp;lt;var&amp;gt;sequence of emitted bytes&amp;lt;/var&amp;gt;.&lt;br /&gt;
# Return a &amp;lt;code&amp;gt;Uint8Array&amp;lt;/code&amp;gt; object wrapping an &amp;lt;code&amp;gt;ArrayBuffer&amp;lt;/code&amp;gt; containing the &amp;lt;var&amp;gt;sequence of emitted bytes&amp;lt;/var&amp;gt; produced by the encoding algorithm.&lt;br /&gt;
&amp;lt;/dl&amp;gt;&lt;br /&gt;
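&lt;br /&gt;
A minimal sketch of what a call to &amp;lt;code&amp;gt;encode&amp;lt;/code&amp;gt; yields, assuming an engine that ships the later Encoding Standard version of &amp;lt;code&amp;gt;TextEncoder&amp;lt;/code&amp;gt; rather than this proposal. That version always encodes to UTF-8 and has no &amp;lt;code&amp;gt;stream&amp;lt;/code&amp;gt; option, so only the non-streaming path is shown.&lt;br /&gt;
&lt;br /&gt;
```javascript
// Sketch only: uses the TextEncoder from the later Encoding Standard,
// which always encodes to UTF-8 and takes no constructor label.
const encoder = new TextEncoder();

// The euro sign U+20AC occupies three bytes in UTF-8: E2 82 AC.
const bytes = encoder.encode('\u20AC');

console.log(encoder.encoding);            // utf-8
console.log(bytes instanceof Uint8Array); // true
console.log(Array.from(bytes));           // [ 226, 130, 172 ]
```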
&lt;br /&gt;
=== TextDecoder ===&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;WebIDL&#039;&#039;&#039;&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
dictionary TextDecoderOptions {&lt;br /&gt;
  boolean fatal = false;&lt;br /&gt;
};&lt;br /&gt;
&lt;br /&gt;
dictionary TextDecodeOptions {&lt;br /&gt;
  boolean stream = false;&lt;br /&gt;
};&lt;br /&gt;
&lt;br /&gt;
[Constructor,&lt;br /&gt;
 Constructor(optional DOMString encoding, optional TextDecoderOptions options)]&lt;br /&gt;
interface TextDecoder {&lt;br /&gt;
  readonly attribute DOMString encoding;&lt;br /&gt;
  DOMString decode(optional ArrayBufferView view, optional TextDecodeOptions options);&lt;br /&gt;
};&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The constructor runs the following steps:&lt;br /&gt;
# The constructor creates a &#039;&#039;decoder object&#039;&#039;. It has the internal properties &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;, a &amp;lt;var&amp;gt;fatal&amp;lt;/var&amp;gt; flag which is initially unset, and a &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag which is initially unset.&lt;br /&gt;
# If called without an &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; argument, let &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt; be &amp;quot;&amp;lt;code&amp;gt;utf-8&amp;lt;/code&amp;gt;&amp;quot;. Otherwise, let &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt; be the value of the &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; argument.&lt;br /&gt;
# Run the &#039;&#039;&#039;steps to get an encoding&#039;&#039;&#039; from [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding], with &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt; as &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt;. &lt;br /&gt;
## If the steps result in failure, throw an &amp;quot;&amp;lt;code&amp;gt;EncodingError&amp;lt;/code&amp;gt;&amp;quot; exception and terminate these steps.&lt;br /&gt;
## Otherwise, set the &#039;&#039;decoder object&#039;s&#039;&#039; internal &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; property to the returned encoding.&lt;br /&gt;
# If the constructor is called with an &amp;lt;var&amp;gt;options&amp;lt;/var&amp;gt; argument, and the &amp;lt;var&amp;gt;fatal&amp;lt;/var&amp;gt; property of the dictionary is set, set the internal &amp;lt;var&amp;gt;fatal&amp;lt;/var&amp;gt; flag of the &#039;&#039;decoder object&#039;&#039;.&lt;br /&gt;
# Initialize the internal &amp;lt;var&amp;gt;encoding algorithm state&amp;lt;/var&amp;gt; to the default values for the encoding &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;.&lt;br /&gt;
# Return the &#039;&#039;decoder object&#039;&#039;.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dl&amp;gt;&lt;br /&gt;
&amp;lt;dt&amp;gt;&amp;lt;code&amp;gt;encoding&amp;lt;/code&amp;gt; of type DOMString, readonly&lt;br /&gt;
&amp;lt;dd&amp;gt;&lt;br /&gt;
Returns the &amp;lt;var&amp;gt;Name&amp;lt;/var&amp;gt; of the decoder object&#039;s &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;, per [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding].&lt;br /&gt;
&lt;br /&gt;
:Note that this may differ from the name of the encoding specified during the call to the constructor. For example, if the constructor is called with &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; of &amp;lt;code&amp;gt;&amp;quot;ascii&amp;quot;&amp;lt;/code&amp;gt; the &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; attribute of the &#039;&#039;decoder object&#039;&#039; would have the value &amp;lt;code&amp;gt;&amp;quot;windows-1252&amp;quot;&amp;lt;/code&amp;gt; as &amp;lt;code&amp;gt;&amp;quot;ascii&amp;quot;&amp;lt;/code&amp;gt; is a label for that encoding.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dt&amp;gt;&amp;lt;code&amp;gt;decode&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;dd&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The &amp;lt;code&amp;gt;decode&amp;lt;/code&amp;gt; method runs the following steps:&lt;br /&gt;
&lt;br /&gt;
# If the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag of the &#039;&#039;decoder object&#039;&#039; is not set, then reset the &amp;lt;var&amp;gt;encoding algorithm state&amp;lt;/var&amp;gt; to the default values for encoding &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;. Otherwise, the &amp;lt;var&amp;gt;encoding algorithm state&amp;lt;/var&amp;gt; is re-used from the previous call to &amp;lt;code&amp;gt;decode&amp;lt;/code&amp;gt; on this object. &lt;br /&gt;
# If the &amp;lt;var&amp;gt;options&amp;lt;/var&amp;gt; parameter is specified and the &amp;lt;var&amp;gt;stream&amp;lt;/var&amp;gt; option is &#039;&#039;&#039;true&#039;&#039;&#039;, then the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag is set; otherwise the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag is cleared.&lt;br /&gt;
# Run the decoder algorithm of the &#039;&#039;decoder object&#039;s&#039;&#039; encoding:&lt;br /&gt;
#* The input to the algorithm is a &amp;lt;var&amp;gt;byte stream&amp;lt;/var&amp;gt;. The &amp;lt;var&amp;gt;byte stream&amp;lt;/var&amp;gt; is provided by the bytes in &amp;lt;var&amp;gt;view.buffer&amp;lt;/var&amp;gt; starting at offset &amp;lt;var&amp;gt;view.byteOffset&amp;lt;/var&amp;gt;. A maximum of &amp;lt;var&amp;gt;view.byteLength&amp;lt;/var&amp;gt; bytes are yielded by the stream from &amp;lt;var&amp;gt;view.buffer&amp;lt;/var&amp;gt;. If &amp;lt;var&amp;gt;view&amp;lt;/var&amp;gt; is not specified, the stream is empty.&lt;br /&gt;
#* If the &amp;lt;var&amp;gt;options&amp;lt;/var&amp;gt; parameter is not specified or the &#039;&#039;&#039;stream&#039;&#039;&#039; option is &#039;&#039;&#039;false&#039;&#039;&#039;, then after &amp;lt;var&amp;gt;view.byteLength&amp;lt;/var&amp;gt; bytes are yielded by the stream, the &#039;&#039;&#039;EOF byte&#039;&#039;&#039; is yielded.&lt;br /&gt;
#* If the internal &amp;lt;var&amp;gt;fatal&amp;lt;/var&amp;gt; flag of the &#039;&#039;decoder object&#039;&#039; is set, then a &#039;&#039;&#039;decoder error&#039;&#039;&#039; causes a &amp;lt;code&amp;gt;DOMException&amp;lt;/code&amp;gt; of type &amp;lt;code&amp;gt;EncodingError&amp;lt;/code&amp;gt; to be thrown rather than emitting a fallback code point.&lt;br /&gt;
#* The output of the algorithm is a sequence of &amp;lt;var&amp;gt;emitted code points&amp;lt;/var&amp;gt;.&lt;br /&gt;
# Return a &amp;lt;code&amp;gt;DOMString&amp;lt;/code&amp;gt; by encoding the &amp;lt;var&amp;gt;sequence of emitted code points&amp;lt;/var&amp;gt; as UTF-16 as per [http://dev.w3.org/2006/webapi/WebIDL/#idl-DOMString WebIDL].&lt;br /&gt;
&lt;br /&gt;
:&#039;&#039;ISSUE: Need to handle BOMs. [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding] specifies this by using the caller-specified encoding as a suggestion, and consuming the BOM as a part of selecting the real encoder where BOM takes precedence. At the very least a matching BOM should be ignored. A mismatching BOM could throw, be a decoding error, or could actually switch the decoder for this stream (call or sequence of calls), possibly if-and-only-if the constructor was called without an encoding.&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/dl&amp;gt;&lt;br /&gt;
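&lt;br /&gt;
The effect of the &amp;lt;var&amp;gt;fatal&amp;lt;/var&amp;gt; flag described in the steps above can be sketched as follows. This assumes an engine that ships the later Encoding Standard version of &amp;lt;code&amp;gt;TextDecoder&amp;lt;/code&amp;gt;; shipped engines are understood to raise a &amp;lt;code&amp;gt;TypeError&amp;lt;/code&amp;gt; rather than the &amp;lt;code&amp;gt;EncodingError&amp;lt;/code&amp;gt; named in this proposal, so the sketch only checks that something is thrown.&lt;br /&gt;
&lt;br /&gt;
```javascript
// Sketch only: TextDecoder as shipped by the later Encoding Standard.
// The byte 0xFF can never appear in well-formed UTF-8.
const invalid = new Uint8Array([0x61, 0xFF, 0x62]);

// Without the fatal flag, the decoder error becomes the fallback U+FFFD.
const lenient = new TextDecoder('utf-8');
const replaced = lenient.decode(invalid);
console.log(replaced === 'a\uFFFDb'); // true

// With the fatal flag set, the same input throws instead.
const strict = new TextDecoder('utf-8', { fatal: true });
let threw = false;
try {
  strict.decode(invalid);
} catch (e) {
  threw = true; // exception type differs from this draft; see the note above
}
console.log(threw); // true
```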
&lt;br /&gt;
== Examples ==&lt;br /&gt;
&lt;br /&gt;
=== Example #1 - encoding strings ===&lt;br /&gt;
&lt;br /&gt;
The following example uses the API to encode an array of strings into an ArrayBuffer. The result is an ArrayBuffer containing the number of strings (as a Uint32), followed by the byte length of the first encoded string (as a Uint32), the encoded string data, the byte length of the second encoded string (as a Uint32), the string data, and so on.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
function encodeArrayOfStrings(strings, encoding) {&lt;br /&gt;
  var encoder, encoded, len, i, bytes, view, offset;&lt;br /&gt;
&lt;br /&gt;
  // Create one encoder and reuse it for every string.&lt;br /&gt;
  encoder = new TextEncoder(encoding);&lt;br /&gt;
  encoded = [];&lt;br /&gt;
&lt;br /&gt;
  // First pass: encode each string and add up the total size&lt;br /&gt;
  // (one Uint32 for the count, plus one Uint32 length prefix per string).&lt;br /&gt;
  len = Uint32Array.BYTES_PER_ELEMENT;&lt;br /&gt;
  for (i = 0; i &amp;lt; strings.length; i += 1) {&lt;br /&gt;
    len += Uint32Array.BYTES_PER_ELEMENT;&lt;br /&gt;
    encoded[i] = encoder.encode(strings[i]);&lt;br /&gt;
    len += encoded[i].byteLength;&lt;br /&gt;
  }&lt;br /&gt;
&lt;br /&gt;
  bytes = new Uint8Array(len);&lt;br /&gt;
  view = new DataView(bytes.buffer);&lt;br /&gt;
  offset = 0;&lt;br /&gt;
&lt;br /&gt;
  // Second pass: write the count, then each length prefix and its data.&lt;br /&gt;
  // DataView writes big-endian unless told otherwise.&lt;br /&gt;
  view.setUint32(offset, strings.length);&lt;br /&gt;
  offset += Uint32Array.BYTES_PER_ELEMENT;&lt;br /&gt;
  for (i = 0; i &amp;lt; encoded.length; i += 1) {&lt;br /&gt;
    len = encoded[i].byteLength;&lt;br /&gt;
    view.setUint32(offset, len);&lt;br /&gt;
    offset += Uint32Array.BYTES_PER_ELEMENT;&lt;br /&gt;
    bytes.set(encoded[i], offset);&lt;br /&gt;
    offset += len;&lt;br /&gt;
  }&lt;br /&gt;
  return bytes.buffer;&lt;br /&gt;
}&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Example #2 - decoding strings ===&lt;br /&gt;
&lt;br /&gt;
The following example decodes an ArrayBuffer containing data encoded in the format produced by the previous example back into an array of strings.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
function decodeArrayOfStrings(buffer, encoding) {&lt;br /&gt;
  var decoder, view, offset, num_strings, strings, i, len;&lt;br /&gt;
&lt;br /&gt;
  decoder = new TextDecoder(encoding);&lt;br /&gt;
  view = new DataView(buffer);&lt;br /&gt;
  offset = 0;&lt;br /&gt;
  strings = [];&lt;br /&gt;
&lt;br /&gt;
  num_strings = view.getUint32(offset);&lt;br /&gt;
  offset += Uint32Array.BYTES_PER_ELEMENT;&lt;br /&gt;
  for (i = 0; i &amp;lt; num_strings; i += 1) {&lt;br /&gt;
    len = view.getUint32(offset);&lt;br /&gt;
    offset += Uint32Array.BYTES_PER_ELEMENT;&lt;br /&gt;
    strings[i] = decoder.decode(&lt;br /&gt;
      new DataView(view.buffer, offset, len));&lt;br /&gt;
    offset += len;&lt;br /&gt;
  }&lt;br /&gt;
  return strings;&lt;br /&gt;
}&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Encodings ==&lt;br /&gt;
&lt;br /&gt;
Encodings are defined and implemented per [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding]. This implicitly includes the steps to &amp;lt;em&amp;gt;get an encoding&amp;lt;/em&amp;gt; from a string, and the logic for label matching and case-insensitivity.&lt;br /&gt;
&lt;br /&gt;
User agents MUST NOT support any other encodings or labels than those defined in [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding], and MUST support all encodings and labels defined in that specification, with the additions defined below.&lt;br /&gt;
&lt;br /&gt;
:&#039;&#039;NOTE: In [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding], &amp;quot;ascii&amp;quot; is a label for &#039;&#039;&#039;windows-1252&#039;&#039;&#039;; there is no 7-bit-or-raise-exception encoding. Applications that are required to restrict the content of decoded strings should implement validation after decoding.&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
:&#039;&#039;NOTE: Unicode normalization forms are outside the scope of this specification. No normalization is done prior to encoding or after decoding.&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
:&#039;&#039;NOTE: Handling of encoding-specific issues, e.g. over-long UTF-8 encodings, byte order marks, unmatched surrogate pairs, and so on is defined by [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding].&#039;&#039;&lt;br /&gt;
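&lt;br /&gt;
The note on &amp;quot;ascii&amp;quot; above recommends validating after decoding. One possible shape for such a check is sketched below. The helper name &amp;lt;code&amp;gt;decodeStrictAscii&amp;lt;/code&amp;gt; is hypothetical and not part of this API, and the sketch assumes an engine that ships &amp;lt;code&amp;gt;TextDecoder&amp;lt;/code&amp;gt; with &#039;&#039;&#039;windows-1252&#039;&#039;&#039; support.&lt;br /&gt;
&lt;br /&gt;
```javascript
// Sketch only: decodeStrictAscii is a hypothetical helper, not part of the API.
function decodeStrictAscii(view) {
  // "ascii" is only a label for windows-1252, so bytes above 0x7F
  // decode without error and have to be rejected afterwards.
  const text = new TextDecoder('windows-1252').decode(view);
  for (const ch of text) {
    if (ch.codePointAt(0) > 0x7F) {
      throw new RangeError('decoded text contains a non-ASCII character');
    }
  }
  return text;
}

console.log(decodeStrictAscii(new Uint8Array([0x48, 0x69]))); // Hi
```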
&lt;br /&gt;
== References ==&lt;br /&gt;
&lt;br /&gt;
* WebIDL http://dev.w3.org/2006/webapi/WebIDL&lt;br /&gt;
* Encoding http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== Acknowledgements ==&lt;br /&gt;
&lt;br /&gt;
* Alan Chaney&lt;br /&gt;
* Ben Noordhuis&lt;br /&gt;
* Glenn Maynard&lt;br /&gt;
* John Tamplin&lt;br /&gt;
* Kenneth Russell (Google, Inc)&lt;br /&gt;
* Robert Mustacchi&lt;br /&gt;
* Ryan Dahl&lt;br /&gt;
* Anne van Kesteren&lt;br /&gt;
&lt;br /&gt;
== Appendix ==&lt;br /&gt;
&lt;br /&gt;
A &amp;quot;shim&amp;quot; implementation in JavaScript (that may not fully match the current version of the spec) plus some initial unit tests can be found at:&lt;br /&gt;
&lt;br /&gt;
:http://code.google.com/p/stringencoding/&lt;br /&gt;
&lt;br /&gt;
[[Category:Proposals]]&lt;/div&gt;</summary>
		<author><name>Jsbell</name></author>
	</entry>
	<entry>
		<id>https://wiki.whatwg.org/index.php?title=StringEncoding&amp;diff=8346</id>
		<title>StringEncoding</title>
		<link rel="alternate" type="text/html" href="https://wiki.whatwg.org/index.php?title=StringEncoding&amp;diff=8346"/>
		<updated>2012-06-27T16:44:53Z</updated>

		<summary type="html">&lt;p&gt;Jsbell: /* Open Issues */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;Proposed Text Encoding Web API for Typed Arrays&lt;br /&gt;
&lt;br /&gt;
== Editors ==&lt;br /&gt;
&lt;br /&gt;
* Joshua Bell (Google, Inc)&lt;br /&gt;
&lt;br /&gt;
== Abstract ==&lt;br /&gt;
&lt;br /&gt;
This specification defines an API for encoding strings to binary data, and decoding strings from binary data.&lt;br /&gt;
&lt;br /&gt;
NOTE: This specification intentionally does not address the opposite scenario of encoding binary data as strings and decoding binary data from strings, for example using Base64 encoding.&lt;br /&gt;
&lt;br /&gt;
Discussion on this topic has so far taken place on the [http://www.khronos.org/webgl/public-mailing-list/ public_webgl@khronos.org] mailing list. See http://www.khronos.org/webgl/public-mailing-list/archives/1111/msg00017.html for the initial discussion thread.&lt;br /&gt;
&lt;br /&gt;
Discussion has since moved to the [http://www.whatwg.org/mailing-list#specs WHATWG spec discussion] mailing list. See http://lists.whatwg.org/htdig.cgi/whatwg-whatwg.org/2012-March/035038.html for the latest discussion thread.&lt;br /&gt;
&lt;br /&gt;
== Open Issues ==&lt;br /&gt;
&lt;br /&gt;
* Encoding errors&lt;br /&gt;
** Define options dict for &amp;lt;code&amp;gt;TextEncoder()&amp;lt;/code&amp;gt; that allows selecting between throw vs. replacement character / replacement callback?&lt;br /&gt;
* The [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding specification] defines the byte order mark as more authoritative than an explicit encoding label when decoding content. &lt;br /&gt;
** Is that desirable here? &lt;br /&gt;
** What does it mean to e.g. write &amp;lt;code&amp;gt;TextDecoder(&#039;iso-2022-kr&#039;).decode(str)&amp;lt;/code&amp;gt; if the input turns out to be UTF-8?&lt;br /&gt;
** What about streaming and other re-uses of the same decoder object?&lt;br /&gt;
* Streaming decode/encode requires keeping buffers between calls.&lt;br /&gt;
** Some encode/decode algorithms require adjusting the &amp;lt;var&amp;gt;code point pointer&amp;lt;/var&amp;gt; or &amp;lt;var&amp;gt;byte pointer&amp;lt;/var&amp;gt; by a negative amount. This could occur across &amp;quot;chunk&amp;quot; boundaries. This implies that when the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag is set on an encoder/decoder, the last N elements of the stream are saved and used as a prefix for the stream in the next call. N is defined by the specific encoding algorithm.&lt;br /&gt;
** &#039;&#039;This is more of a note to implementers than an issue, but it may merit further discussion.&#039;&#039;&lt;br /&gt;
* &amp;lt;b&amp;gt;&amp;lt;i&amp;gt;Converting a sequence of Unicode characters to a DOMString&amp;lt;/i&amp;gt;&amp;lt;/b&amp;gt; needs to be defined. &lt;br /&gt;
** [http://dev.w3.org/2006/webapi/WebIDL/#idl-DOMString WebIDL] currently only defines the reverse, &amp;lt;b&amp;gt;&amp;lt;i&amp;gt;convert a DOMString to a sequence of Unicode characters&amp;lt;/i&amp;gt;&amp;lt;/b&amp;gt;, but that&#039;s probably the right place to define it in the platform.&lt;br /&gt;
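&lt;br /&gt;
The chunk-boundary concern in the streaming issue above can be illustrated with a multi-byte sequence split across two calls. This is a sketch that assumes an engine shipping the later Encoding Standard version of &amp;lt;code&amp;gt;TextDecoder&amp;lt;/code&amp;gt;, whose &amp;lt;code&amp;gt;stream&amp;lt;/code&amp;gt; option carries the unfinished bytes over to the next call.&lt;br /&gt;
&lt;br /&gt;
```javascript
// Sketch only: TextDecoder as shipped by the later Encoding Standard.
// U+20AC encodes to the three bytes E2 82 AC; split them across two chunks.
const chunk1 = new Uint8Array([0xE2, 0x82]);
const chunk2 = new Uint8Array([0xAC]);

const decoder = new TextDecoder('utf-8');

// With stream: true the incomplete sequence is held back, not reported as an error.
const first = decoder.decode(chunk1, { stream: true });

// The final call omits the stream option, which ends the stream.
const second = decoder.decode(chunk2);

console.log(first.length);        // 0
console.log(second === '\u20AC'); // true
```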
&lt;br /&gt;
== Resolved Issues ==&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dl&amp;gt;&lt;br /&gt;
&amp;lt;dt&amp;gt;Encode as many characters as possible into a fixed-size buffer for transmission, and repeat starting with next unencoded character&lt;br /&gt;
&amp;lt;dd&amp;gt;Resolution: not for &amp;quot;v1&amp;quot; - can be implemented using this API, and breaking the string down by &amp;quot;character&amp;quot; is unlikely to be as obvious as it sounds - surrogate pairs, combining sequences, etc&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dt&amp;gt;Support two versions of encode; one which takes target buffer, and one which creates/returns a right-sized buffer&lt;br /&gt;
&amp;lt;dd&amp;gt;Resolution: See above, and wait for developer feedback.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dt&amp;gt;Allow arbitrary end byte sequences (e.g. &amp;lt;code&amp;gt;0xFF&amp;lt;/code&amp;gt; for UTF-8 strings)&lt;br /&gt;
&amp;lt;dd&amp;gt;Resolution: Add &amp;lt;code&amp;gt;indexOf&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;ArrayBufferView&amp;lt;/code&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dt&amp;gt;Remove &#039;&#039;binary&#039;&#039; encoding? &#039;&#039;(a proposed 8-bit-clean encoding for interop with legacy binary data stored in ECMAScript strings)&#039;&#039;&lt;br /&gt;
&amp;lt;dd&amp;gt;The only real use is with &amp;lt;code&amp;gt;atob()&amp;lt;/code&amp;gt;/&amp;lt;code&amp;gt;btoa()&amp;lt;/code&amp;gt;; a better API would be Base64 directly in/out of Typed Arrays. Consensus on WHATWG is to add better APIs e.g. &lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
partial interface ArrayBufferView {&lt;br /&gt;
    DOMString toBase64();&lt;br /&gt;
};&lt;br /&gt;
&lt;br /&gt;
partial interface ArrayBuffer {&lt;br /&gt;
    static ArrayBuffer fromBase64(DOMString string);&lt;br /&gt;
};&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dt&amp;gt;Should this be a standalone API (as written), or live on e.g. DataView, or on e.g. String?&lt;br /&gt;
&amp;lt;dd&amp;gt;There seems to be pretty strong consensus that streaming, stateful coding is high priority and that an object-oriented API is cleanest, and shoe-horning those onto existing objects would be messy.&lt;br /&gt;
&amp;lt;/dl&amp;gt;&lt;br /&gt;
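&lt;br /&gt;
The first resolved issue above, filling a fixed-size buffer, was deferred in this proposal. For comparison only, the later Encoding Standard added an &amp;lt;code&amp;gt;encodeInto&amp;lt;/code&amp;gt; method that covers this case; it is not part of the API defined here. The sketch below assumes an engine that provides it.&lt;br /&gt;
&lt;br /&gt;
```javascript
// Sketch only: encodeInto comes from the later Encoding Standard,
// not from the API defined in this document.
const encoder = new TextEncoder();
const target = new Uint8Array(4);

// 'a' needs 1 byte and U+20AC needs 3, filling the buffer; 'b' is left over.
const result = encoder.encodeInto('a\u20ACb', target);

console.log(result.read);        // 2 (UTF-16 code units consumed)
console.log(result.written);     // 4 (bytes written)
console.log(Array.from(target)); // [ 97, 226, 130, 172 ]
```

The caller can resume from &amp;lt;code&amp;gt;result.read&amp;lt;/code&amp;gt; with the remainder of the string and a fresh buffer.&lt;br /&gt;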
&lt;br /&gt;
== API ==&lt;br /&gt;
&lt;br /&gt;
=== TextEncoder ===&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;WebIDL&#039;&#039;&#039;&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
dictionary TextEncodeOptions {&lt;br /&gt;
  boolean stream = false;&lt;br /&gt;
};&lt;br /&gt;
&lt;br /&gt;
[Constructor,&lt;br /&gt;
 Constructor(DOMString encoding)]&lt;br /&gt;
interface TextEncoder {&lt;br /&gt;
  readonly attribute DOMString encoding;&lt;br /&gt;
  ArrayBufferView encode(DOMString? string, optional TextEncodeOptions options);&lt;br /&gt;
};&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The constructor runs the following steps:&lt;br /&gt;
# If the constructor is called with no arguments, let &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt; be &amp;quot;&amp;lt;code&amp;gt;utf-8&amp;lt;/code&amp;gt;&amp;quot;. Otherwise, let &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt; be the value of the &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; argument.&lt;br /&gt;
# The constructor follows the &#039;&#039;&#039;steps to get an encoding&#039;&#039;&#039; from [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding], with &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt; as &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt;.&lt;br /&gt;
#* If the steps result in failure, throw an &amp;quot;&amp;lt;code&amp;gt;EncodingError&amp;lt;/code&amp;gt;&amp;quot; exception and terminate these steps.&lt;br /&gt;
#* Otherwise, set the encoder object&#039;s internal &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; property to the returned encoding.&lt;br /&gt;
# Initialize the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag of the encoder object to false. &lt;br /&gt;
# Initialize the internal &amp;lt;var&amp;gt;encoding algorithm state&amp;lt;/var&amp;gt; to the default values for the encoding &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dl&amp;gt;&lt;br /&gt;
&amp;lt;dt&amp;gt;&amp;lt;code&amp;gt;encoding&amp;lt;/code&amp;gt; of type DOMString, readonly&lt;br /&gt;
&amp;lt;dd&amp;gt;Returns the &amp;lt;var&amp;gt;Name&amp;lt;/var&amp;gt; of the encoder object&#039;s &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;, per [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding]. &lt;br /&gt;
&lt;br /&gt;
:Note that this may differ from the name of the encoding specified during the call to the constructor. For example, if the constructor is called with &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; of &amp;lt;code&amp;gt;&amp;quot;ascii&amp;quot;&amp;lt;/code&amp;gt; the &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; attribute of the &#039;&#039;encoder object&#039;&#039; would have the value &amp;lt;code&amp;gt;&amp;quot;windows-1252&amp;quot;&amp;lt;/code&amp;gt; as &amp;lt;code&amp;gt;&amp;quot;ascii&amp;quot;&amp;lt;/code&amp;gt; is a label for that encoding.&lt;br /&gt;
&amp;lt;/dl&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dl&amp;gt;&lt;br /&gt;
&amp;lt;dt&amp;gt;&amp;lt;code&amp;gt;encode&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;dd&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The &amp;lt;code&amp;gt;encode&amp;lt;/code&amp;gt; method runs these steps:&lt;br /&gt;
&lt;br /&gt;
# If the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag is not set, then reset the &amp;lt;var&amp;gt;encoding algorithm state&amp;lt;/var&amp;gt; to the default values for &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;. Otherwise, the &amp;lt;var&amp;gt;encoding algorithm state&amp;lt;/var&amp;gt; is re-used from the previous call to &amp;lt;code&amp;gt;encode&amp;lt;/code&amp;gt; on this object.&lt;br /&gt;
# If the &amp;lt;var&amp;gt;options&amp;lt;/var&amp;gt; parameter is specified and the &amp;lt;code&amp;gt;stream&amp;lt;/code&amp;gt; option is &#039;&#039;&#039;true&#039;&#039;&#039;, then the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag is set; otherwise the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag is cleared.&lt;br /&gt;
# Run the steps of the &amp;lt;var&amp;gt;encoding algorithm&amp;lt;/var&amp;gt;:&lt;br /&gt;
#* The input to the algorithm is a &amp;lt;var&amp;gt;stream of code points&amp;lt;/var&amp;gt;. The code units within the DOMString &amp;lt;var&amp;gt;string&amp;lt;/var&amp;gt; are interpreted as UTF-16 code units, to produce the &amp;lt;var&amp;gt;stream of code points&amp;lt;/var&amp;gt;; if &amp;lt;var&amp;gt;string&amp;lt;/var&amp;gt; is null, the &amp;lt;var&amp;gt;stream of code points&amp;lt;/var&amp;gt; is empty.&lt;br /&gt;
#*:&#039;&#039;ISSUE: Interpreting a DOMString as UTF-16 code units to yield a code point stream needs to be defined, including unpaired surrogates. [http://dev.w3.org/2006/webapi/WebIDL/#idl-DOMString WebIDL] only defines the reverse.&#039;&#039;&lt;br /&gt;
#* If the &amp;lt;var&amp;gt;options&amp;lt;/var&amp;gt; parameter is not specified or the &amp;lt;code&amp;gt;stream&amp;lt;/code&amp;gt; option is &#039;&#039;&#039;false&#039;&#039;&#039;, then after the final code point is yielded by the stream, the &#039;&#039;&#039;EOF code point&#039;&#039;&#039; is yielded.&lt;br /&gt;
#* The output of the algorithm is a &amp;lt;var&amp;gt;sequence of emitted bytes&amp;lt;/var&amp;gt;.&lt;br /&gt;
# Return a &amp;lt;code&amp;gt;Uint8Array&amp;lt;/code&amp;gt; object wrapping an &amp;lt;code&amp;gt;ArrayBuffer&amp;lt;/code&amp;gt; containing the &amp;lt;var&amp;gt;sequence of emitted bytes&amp;lt;/var&amp;gt; produced by the encoding algorithm.&lt;br /&gt;
&amp;lt;/dl&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== TextDecoder ===&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;WebIDL&#039;&#039;&#039;&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
dictionary TextDecoderOptions {&lt;br /&gt;
  boolean fatal = false;&lt;br /&gt;
};&lt;br /&gt;
&lt;br /&gt;
dictionary TextDecodeOptions {&lt;br /&gt;
  boolean stream = false;&lt;br /&gt;
};&lt;br /&gt;
&lt;br /&gt;
[Constructor,&lt;br /&gt;
 Constructor(optional DOMString encoding, optional TextDecoderOptions options)]&lt;br /&gt;
interface TextDecoder {&lt;br /&gt;
  readonly attribute DOMString encoding;&lt;br /&gt;
  DOMString decode(optional ArrayBufferView view, optional TextDecodeOptions options);&lt;br /&gt;
};&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The constructor runs the following steps:&lt;br /&gt;
# The constructor creates a &#039;&#039;decoder object&#039;&#039;. It has the internal properties &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;, a &amp;lt;var&amp;gt;fatal&amp;lt;/var&amp;gt; flag which is initially unset, and a &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag which is initially unset.&lt;br /&gt;
# If called without an &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; argument, let &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt; be &amp;quot;&amp;lt;code&amp;gt;utf-8&amp;lt;/code&amp;gt;&amp;quot;. Otherwise, let &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt; be the value of the &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; argument.&lt;br /&gt;
# Run the &#039;&#039;&#039;steps to get an encoding&#039;&#039;&#039; from [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding], with &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt; as &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt;. &lt;br /&gt;
## If the steps result in failure, throw an &amp;quot;&amp;lt;code&amp;gt;EncodingError&amp;lt;/code&amp;gt;&amp;quot; exception and terminate these steps.&lt;br /&gt;
## Otherwise, set the &#039;&#039;decoder object&#039;s&#039;&#039; internal &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; property to the returned encoding.&lt;br /&gt;
# If the constructor is called with an &amp;lt;var&amp;gt;options&amp;lt;/var&amp;gt; argument, and the &amp;lt;var&amp;gt;fatal&amp;lt;/var&amp;gt; property of the dictionary is set, set the internal &amp;lt;var&amp;gt;fatal&amp;lt;/var&amp;gt; flag of the &#039;&#039;decoder object&#039;&#039;.&lt;br /&gt;
# Initialize the internal &amp;lt;var&amp;gt;encoding algorithm state&amp;lt;/var&amp;gt; to the default values for the encoding &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;.&lt;br /&gt;
# Return the &#039;&#039;decoder object&#039;&#039;.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dl&amp;gt;&lt;br /&gt;
&amp;lt;dt&amp;gt;&amp;lt;code&amp;gt;encoding&amp;lt;/code&amp;gt; of type DOMString, readonly&lt;br /&gt;
&amp;lt;dd&amp;gt;&lt;br /&gt;
Returns the &amp;lt;var&amp;gt;Name&amp;lt;/var&amp;gt; of the decoder object&#039;s &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;, per [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding].&lt;br /&gt;
&lt;br /&gt;
:Note that this may differ from the name of the encoding specified during the call to the constructor. For example, if the constructor is called with &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; of &amp;lt;code&amp;gt;&amp;quot;ascii&amp;quot;&amp;lt;/code&amp;gt; the &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; attribute of the &#039;&#039;decoder object&#039;&#039; would have the value &amp;lt;code&amp;gt;&amp;quot;windows-1252&amp;quot;&amp;lt;/code&amp;gt; as &amp;lt;code&amp;gt;&amp;quot;ascii&amp;quot;&amp;lt;/code&amp;gt; is a label for that encoding.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dt&amp;gt;&amp;lt;code&amp;gt;decode&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;dd&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The &amp;lt;code&amp;gt;decode&amp;lt;/code&amp;gt; method runs the following steps:&lt;br /&gt;
&lt;br /&gt;
# If the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag of the &#039;&#039;decoder object&#039;&#039; is not set, then reset the &amp;lt;var&amp;gt;encoding algorithm state&amp;lt;/var&amp;gt; to the default values for encoding &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;. Otherwise, the &amp;lt;var&amp;gt;encoding algorithm state&amp;lt;/var&amp;gt; is re-used from the previous call to &amp;lt;code&amp;gt;decode&amp;lt;/code&amp;gt; on this object. &lt;br /&gt;
# If the &amp;lt;var&amp;gt;options&amp;lt;/var&amp;gt; parameter is specified and the &amp;lt;var&amp;gt;stream&amp;lt;/var&amp;gt; option is &#039;&#039;&#039;true&#039;&#039;&#039;, then the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag is set; otherwise the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag is cleared.&lt;br /&gt;
# Run the decoder algorithm of the &#039;&#039;decoder object&#039;s&#039;&#039; encoding:&lt;br /&gt;
#* The input to the algorithm is a &amp;lt;var&amp;gt;byte stream&amp;lt;/var&amp;gt;. The &amp;lt;var&amp;gt;byte stream&amp;lt;/var&amp;gt; is provided by the bytes in &amp;lt;var&amp;gt;view.buffer&amp;lt;/var&amp;gt; starting at offset &amp;lt;var&amp;gt;view.byteOffset&amp;lt;/var&amp;gt;. A maximum of &amp;lt;var&amp;gt;view.byteLength&amp;lt;/var&amp;gt; bytes are yielded by the stream from &amp;lt;var&amp;gt;view.buffer&amp;lt;/var&amp;gt;. If &amp;lt;var&amp;gt;view&amp;lt;/var&amp;gt; is not specified, the stream is empty.&lt;br /&gt;
#* If the &amp;lt;var&amp;gt;options&amp;lt;/var&amp;gt; parameter is not specified or the &#039;&#039;&#039;stream&#039;&#039;&#039; option is &#039;&#039;&#039;false&#039;&#039;&#039;, then after &amp;lt;var&amp;gt;view.byteLength&amp;lt;/var&amp;gt; bytes are yielded by the stream, the &#039;&#039;&#039;EOF byte&#039;&#039;&#039; is yielded.&lt;br /&gt;
#* If the internal &amp;lt;var&amp;gt;fatal&amp;lt;/var&amp;gt; flag of the &#039;&#039;decoder object&#039;&#039; is set, then a &#039;&#039;&#039;decoder error&#039;&#039;&#039; causes a &amp;lt;code&amp;gt;DOMException&amp;lt;/code&amp;gt; of type &amp;lt;code&amp;gt;EncodingError&amp;lt;/code&amp;gt; to be thrown rather than emitting a fallback code point.&lt;br /&gt;
#* The output of the algorithm is a sequence of &amp;lt;var&amp;gt;emitted code points&amp;lt;/var&amp;gt;.&lt;br /&gt;
# Return a &amp;lt;code&amp;gt;DOMString&amp;lt;/code&amp;gt; by encoding the &amp;lt;var&amp;gt;sequence of emitted code points&amp;lt;/var&amp;gt; as UTF-16 as per [http://dev.w3.org/2006/webapi/WebIDL/#idl-DOMString WebIDL].&lt;br /&gt;
&lt;br /&gt;
:&#039;&#039;ISSUE: Need to handle BOMs. [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding] specifies this by using the caller-specified encoding as a suggestion, and consuming the BOM as a part of selecting the real encoder where BOM takes precedence. At the very least a matching BOM should be ignored. A mismatching BOM could throw, be a decoding error, or could actually switch the decoder for this stream (call or sequence of calls), possibly if-and-only-if the constructor was called without an encoding.&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/dl&amp;gt;&lt;br /&gt;
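&lt;br /&gt;
The label resolution described for the &amp;lt;code&amp;gt;encoding&amp;lt;/code&amp;gt; attribute above can be observed directly. This is a sketch that assumes an engine shipping the later Encoding Standard version of &amp;lt;code&amp;gt;TextDecoder&amp;lt;/code&amp;gt; with legacy encoding support; such engines are understood to raise a &amp;lt;code&amp;gt;RangeError&amp;lt;/code&amp;gt; for an unknown label rather than the &amp;lt;code&amp;gt;EncodingError&amp;lt;/code&amp;gt; named in this proposal.&lt;br /&gt;
&lt;br /&gt;
```javascript
// Sketch only: TextDecoder as shipped by the later Encoding Standard.
// "ascii" is a label, so the attribute reports the name of the encoding it maps to.
const decoder = new TextDecoder('ascii');
console.log(decoder.encoding); // windows-1252

// A label that matches no encoding makes the constructor throw.
let rejected = false;
try {
  new TextDecoder('no-such-encoding');
} catch (e) {
  rejected = true; // exception type differs from this draft; see the note above
}
console.log(rejected); // true
```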
&lt;br /&gt;
== Examples ==&lt;br /&gt;
&lt;br /&gt;
=== Example #1 - encoding strings ===&lt;br /&gt;
&lt;br /&gt;
The following example uses the API to encode an array of strings into an ArrayBuffer. The result is an ArrayBuffer containing the number of strings (as a Uint32), followed by the byte length of the first encoded string (as a Uint32), the encoded string data, the byte length of the second encoded string (as a Uint32), the string data, and so on.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
function encodeArrayOfStrings(strings, encoding) {&lt;br /&gt;
  var encoder, encoded, len, i, bytes, view, offset;&lt;br /&gt;
&lt;br /&gt;
  encoder = new TextEncoder(encoding);&lt;br /&gt;
  encoded = [];&lt;br /&gt;
&lt;br /&gt;
  len = Uint32Array.BYTES_PER_ELEMENT;&lt;br /&gt;
  for (i = 0; i &amp;lt; strings.length; i += 1) {&lt;br /&gt;
    len += Uint32Array.BYTES_PER_ELEMENT;&lt;br /&gt;
    encoded[i] = encoder.encode(strings[i]);&lt;br /&gt;
    len += encoded[i].byteLength;&lt;br /&gt;
  }&lt;br /&gt;
&lt;br /&gt;
  bytes = new Uint8Array(len);&lt;br /&gt;
  view = new DataView(bytes.buffer);&lt;br /&gt;
  offset = 0;&lt;br /&gt;
&lt;br /&gt;
  view.setUint32(offset, strings.length);&lt;br /&gt;
  offset += Uint32Array.BYTES_PER_ELEMENT;&lt;br /&gt;
  for (i = 0; i &amp;lt; encoded.length; i += 1) {&lt;br /&gt;
    len = encoded[i].byteLength;&lt;br /&gt;
    view.setUint32(offset, len);&lt;br /&gt;
    offset += Uint32Array.BYTES_PER_ELEMENT;&lt;br /&gt;
    bytes.set(encoded[i], offset);&lt;br /&gt;
    offset += len;&lt;br /&gt;
  }&lt;br /&gt;
  return bytes.buffer;&lt;br /&gt;
}&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Example #2 - decoding strings ===&lt;br /&gt;
&lt;br /&gt;
The following example decodes an ArrayBuffer containing data encoded in the format produced by the previous example back into an array of strings.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
function decodeArrayOfStrings(buffer, encoding) {&lt;br /&gt;
  var decoder, view, offset, num_strings, strings, i, len;&lt;br /&gt;
&lt;br /&gt;
  decoder = new TextDecoder(encoding);&lt;br /&gt;
  view = new DataView(buffer);&lt;br /&gt;
  offset = 0;&lt;br /&gt;
  strings = [];&lt;br /&gt;
&lt;br /&gt;
  num_strings = view.getUint32(offset);&lt;br /&gt;
  offset += Uint32Array.BYTES_PER_ELEMENT;&lt;br /&gt;
  for (i = 0; i &amp;lt; num_strings; i += 1) {&lt;br /&gt;
    len = view.getUint32(offset);&lt;br /&gt;
    offset += Uint32Array.BYTES_PER_ELEMENT;&lt;br /&gt;
    strings[i] = decoder.decode(&lt;br /&gt;
      new DataView(view.buffer, offset, len));&lt;br /&gt;
    offset += len;&lt;br /&gt;
  }&lt;br /&gt;
  return strings;&lt;br /&gt;
}&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
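&lt;br /&gt;
A round trip through the two examples can be sketched as below. This is a condensed restatement, not the draft&#039;s own code: it assumes the API as it later shipped, where constructors require &amp;lt;code&amp;gt;new&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;TextEncoder&amp;lt;/code&amp;gt; always produces UTF-8, so the &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; parameter is dropped. &amp;lt;code&amp;gt;WORD&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;input&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;roundTripped&amp;lt;/code&amp;gt; are names introduced for the illustration.&lt;br /&gt;
&lt;br /&gt;
```javascript
// Assumption: shipped TextEncoder/TextDecoder (UTF-8 only encoder).
const WORD = Uint32Array.BYTES_PER_ELEMENT; // 4 bytes per count/length field

function encodeArrayOfStrings(strings) {
  const encoder = new TextEncoder();
  const encoded = strings.map((s) => encoder.encode(s));
  // One count field, then one length field plus data per string.
  const total = encoded.reduce((sum, e) => sum + WORD + e.byteLength, WORD);

  const bytes = new Uint8Array(total);
  const view = new DataView(bytes.buffer);
  let offset = 0;
  view.setUint32(offset, encoded.length); // number of strings
  offset += WORD;
  for (const e of encoded) {
    view.setUint32(offset, e.byteLength); // length prefix in bytes
    offset += WORD;
    bytes.set(e, offset); // encoded string data
    offset += e.byteLength;
  }
  return bytes.buffer;
}

function decodeArrayOfStrings(buffer) {
  const decoder = new TextDecoder('utf-8');
  const view = new DataView(buffer);
  const strings = [];
  let offset = 0;
  const count = view.getUint32(offset);
  offset += WORD;
  for (let i = 0; i !== count; i += 1) {
    const len = view.getUint32(offset);
    offset += WORD;
    strings.push(decoder.decode(new DataView(buffer, offset, len)));
    offset += len;
  }
  return strings;
}

const input = ['hello', 'na\u00EFve', '\u20AC5'];
const roundTripped = decodeArrayOfStrings(encodeArrayOfStrings(input));
```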
&lt;br /&gt;
== Encodings ==&lt;br /&gt;
&lt;br /&gt;
Encodings are defined and implemented per [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding]. This implicitly includes the steps to &amp;lt;em&amp;gt;get an encoding&amp;lt;/em&amp;gt; from a string, and the logic for label matching and case-insensitivity.&lt;br /&gt;
&lt;br /&gt;
User agents MUST NOT support any other encodings or labels than those defined in [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding], and MUST support all encodings and labels defined in that specification, with the additions defined below.&lt;br /&gt;
&lt;br /&gt;
:&#039;&#039;NOTE: In [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding], &amp;quot;ascii&amp;quot; is a label for &#039;&#039;&#039;windows-1252&#039;&#039;&#039;; there is no 7-bit-or-raise-exception encoding. Applications that are required to restrict the content of decoded strings should implement validation after decoding.&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
:&#039;&#039;NOTE: Unicode normalization forms are outside the scope of this specification. No normalization is done prior to encoding or after decoding.&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
:&#039;&#039;NOTE: Handling of encoding-specific issues, e.g. over-long UTF-8 encodings, byte order marks, unmatched surrogate pairs, and so on is defined by [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding].&#039;&#039;&lt;br /&gt;
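&lt;br /&gt;
Label matching can be observed through the &amp;lt;code&amp;gt;encoding&amp;lt;/code&amp;gt; attribute. The sketch below assumes the API as it later shipped in browsers and Node.js, where an unknown label is rejected with a &amp;lt;code&amp;gt;RangeError&amp;lt;/code&amp;gt; rather than the &amp;lt;code&amp;gt;EncodingError&amp;lt;/code&amp;gt; of this draft; the label &amp;lt;code&amp;gt;not-a-real-encoding&amp;lt;/code&amp;gt; is made up for the example.&lt;br /&gt;
&lt;br /&gt;
```javascript
// Assumption: shipped TextDecoder (Encoding Standard), not this draft.
// Several labels, in any letter case, resolve to the same encoding name.
const labels = ['utf-8', 'UTF8', 'unicode-1-1-utf-8'];
const names = labels.map((label) => new TextDecoder(label).encoding);

// A label that matches no encoding makes the constructor throw.
let rejected = false;
try {
  new TextDecoder('not-a-real-encoding');
} catch (e) {
  rejected = true;
}
```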
&lt;br /&gt;
== References ==&lt;br /&gt;
&lt;br /&gt;
* WebIDL http://dev.w3.org/2006/webapi/WebIDL&lt;br /&gt;
* Encoding http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== Acknowledgements ==&lt;br /&gt;
&lt;br /&gt;
* Alan Chaney&lt;br /&gt;
* Ben Noordhuis&lt;br /&gt;
* Glenn Maynard&lt;br /&gt;
* John Tamplin&lt;br /&gt;
* Kenneth Russell (Google, Inc)&lt;br /&gt;
* Robert Mustacchi&lt;br /&gt;
* Ryan Dahl&lt;br /&gt;
* Anne van Kesteren&lt;br /&gt;
&lt;br /&gt;
== Appendix ==&lt;br /&gt;
&lt;br /&gt;
A &amp;quot;shim&amp;quot; implementation in JavaScript (that may not fully match the current version of the spec) plus some initial unit tests can be found at:&lt;br /&gt;
&lt;br /&gt;
:http://code.google.com/p/stringencoding/&lt;br /&gt;
&lt;br /&gt;
[[Category:Proposals]]&lt;/div&gt;</summary>
		<author><name>Jsbell</name></author>
	</entry>
	<entry>
		<id>https://wiki.whatwg.org/index.php?title=StringEncoding&amp;diff=8345</id>
		<title>StringEncoding</title>
		<link rel="alternate" type="text/html" href="https://wiki.whatwg.org/index.php?title=StringEncoding&amp;diff=8345"/>
		<updated>2012-06-27T16:18:39Z</updated>

		<summary type="html">&lt;p&gt;Jsbell: html typo&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;Proposed Text Encoding Web API for Typed Arrays&lt;br /&gt;
&lt;br /&gt;
== Editors ==&lt;br /&gt;
&lt;br /&gt;
* Joshua Bell (Google, Inc)&lt;br /&gt;
&lt;br /&gt;
== Abstract ==&lt;br /&gt;
&lt;br /&gt;
This specification defines an API for encoding strings to binary data, and decoding strings from binary data.&lt;br /&gt;
&lt;br /&gt;
NOTE: This specification intentionally does not address the opposite scenario of encoding binary data as strings and decoding binary data from strings, for example using Base64 encoding.&lt;br /&gt;
&lt;br /&gt;
Discussion on this topic has so far taken place on the [http://www.khronos.org/webgl/public-mailing-list/ public_webgl@khronos.org] mailing list. See http://www.khronos.org/webgl/public-mailing-list/archives/1111/msg00017.html for the initial discussion thread.&lt;br /&gt;
&lt;br /&gt;
Discussion has since moved to the [http://www.whatwg.org/mailing-list#specs WHATWG spec discussion] mailing list. See http://lists.whatwg.org/htdig.cgi/whatwg-whatwg.org/2012-March/035038.html for the latest discussion thread.&lt;br /&gt;
&lt;br /&gt;
== Open Issues ==&lt;br /&gt;
&lt;br /&gt;
* Encoding errors&lt;br /&gt;
** Define options dict for &amp;lt;code&amp;gt;TextEncoder()&amp;lt;/code&amp;gt; that allows selecting between throw vs. replacement character / replacement callback?&lt;br /&gt;
* The [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding specification] defines the byte order mark as more authoritative than an explicit encoding label when decoding content. &lt;br /&gt;
** Is that desirable here? &lt;br /&gt;
** What does it mean to e.g. write &amp;lt;code&amp;gt;TextDecoder(&#039;iso-2022-kr&#039;).decode(str)&amp;lt;/code&amp;gt; if string turns out to be UTF-8?&lt;br /&gt;
** What about streaming and other re-uses of the same decoder object?&lt;br /&gt;
* Streaming decode/encode requires keeping buffers between calls.&lt;br /&gt;
** Some encode/decode algorithms require adjusting the &amp;lt;var&amp;gt;code point pointer&amp;lt;/var&amp;gt; or &amp;lt;var&amp;gt;byte pointer&amp;lt;/var&amp;gt; by a negative amount, and this could occur across &amp;quot;chunk&amp;quot; boundaries. It follows that when the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag is set on an encoder/decoder, the last N elements of the stream must be saved and used as a prefix for the stream on the next call, where N is defined by the specific encoding algorithm.&lt;br /&gt;
** &#039;&#039;This is more of a note to implementers than an issue, but it may merit further discussion.&#039;&#039;&lt;br /&gt;
* Interpreting a DOMString as UTF-16 code units to yield a code point stream needs to be defined, including unpaired surrogates. &lt;br /&gt;
** [http://dev.w3.org/2006/webapi/WebIDL/#idl-DOMString WebIDL] currently only defines the reverse, but that&#039;s probably the right place to define it in the platform.&lt;br /&gt;
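&lt;br /&gt;
For comparison, the API as later standardized settles the unpaired-surrogate question by converting the input to Unicode scalar values first, so each unpaired surrogate is replaced with U+FFFD before encoding. The sketch below shows that behaviour; it assumes a runtime with the shipped &amp;lt;code&amp;gt;TextEncoder&amp;lt;/code&amp;gt; and says nothing about what this draft would have specified.&lt;br /&gt;
&lt;br /&gt;
```javascript
// Assumption: shipped TextEncoder (UTF-8 only), not this draft.
const encoder = new TextEncoder();

// A lone high surrogate is not a valid code point sequence on its own.
const loneBytes = Array.from(encoder.encode('\uD800'));
// It is encoded as U+FFFD: 0xEF 0xBF 0xBD.

// A well-formed surrogate pair is one code point (U+1F600).
const pairBytes = Array.from(encoder.encode('\uD83D\uDE00'));
// It is encoded as four bytes: 0xF0 0x9F 0x98 0x80.
```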
&lt;br /&gt;
== Resolved Issues ==&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dl&amp;gt;&lt;br /&gt;
&amp;lt;dt&amp;gt;Encode as many characters as possible into a fixed-size buffer for transmission, and repeat starting with next unencoded character&lt;br /&gt;
&amp;lt;dd&amp;gt;Resolution: not for &amp;quot;v1&amp;quot; - can be implemented using this API, and breaking the string down by &amp;quot;character&amp;quot; is unlikely to be as obvious as it sounds - surrogate pairs, combining sequences, etc&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dt&amp;gt;Support two versions of encode; one which takes target buffer, and one which creates/returns a right-sized buffer&lt;br /&gt;
&amp;lt;dd&amp;gt;Resolution: See above, and wait for developer feedback.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dt&amp;gt;Allow arbitrary end byte sequences (e.g. &amp;lt;code&amp;gt;0xFF&amp;lt;/code&amp;gt; for UTF-8 strings)&lt;br /&gt;
&amp;lt;dd&amp;gt;Resolution: Add &amp;lt;code&amp;gt;indexOf&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;ArrayBufferView&amp;lt;/code&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dt&amp;gt;Remove &#039;&#039;binary&#039;&#039; encoding? &#039;&#039;(a proposed 8-bit-clean encoding for interop with legacy binary data stored in ECMAScript strings)&#039;&#039;&lt;br /&gt;
&amp;lt;dd&amp;gt;The only real use is with &amp;lt;code&amp;gt;atob()&amp;lt;/code&amp;gt;/&amp;lt;code&amp;gt;btoa()&amp;lt;/code&amp;gt;; a better API would be Base64 directly in/out of Typed Arrays. Consensus on WHATWG is to add better APIs e.g. &lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
partial interface ArrayBufferView {&lt;br /&gt;
    DOMString toBase64();&lt;br /&gt;
};&lt;br /&gt;
&lt;br /&gt;
partial interface ArrayBuffer {&lt;br /&gt;
    static ArrayBuffer fromBase64(DOMString string);&lt;br /&gt;
};&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dt&amp;gt;Should this be a standalone API (as written), or live on e.g. DataView, or on e.g. String?&lt;br /&gt;
&amp;lt;dd&amp;gt;There seems to be pretty strong consensus that streaming, stateful coding is high priority and that an object-oriented API is cleanest, and shoe-horning those onto existing objects would be messy.&lt;br /&gt;
&amp;lt;/dl&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== API ==&lt;br /&gt;
&lt;br /&gt;
=== TextEncoder ===&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;WebIDL&#039;&#039;&#039;&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
dictionary TextEncodeOptions {&lt;br /&gt;
  boolean stream = false;&lt;br /&gt;
};&lt;br /&gt;
&lt;br /&gt;
[Constructor,&lt;br /&gt;
 Constructor(DOMString encoding)]&lt;br /&gt;
interface TextEncoder {&lt;br /&gt;
  readonly attribute DOMString encoding;&lt;br /&gt;
  ArrayBufferView encode(DOMString? string, optional TextEncodeOptions options);&lt;br /&gt;
};&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The constructor runs the following steps:&lt;br /&gt;
# If the constructor is called with no arguments, let &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt; be &amp;quot;&amp;lt;code&amp;gt;utf-8&amp;lt;/code&amp;gt;&amp;quot;. Otherwise, let &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt; be the value of the &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; argument.&lt;br /&gt;
# The constructor follows the &#039;&#039;&#039;steps to get an encoding&#039;&#039;&#039; from [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding], with &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt; as &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt;.&lt;br /&gt;
#* If the steps result in failure, throw an &amp;quot;&amp;lt;code&amp;gt;EncodingError&amp;lt;/code&amp;gt;&amp;quot; exception and terminate these steps.&lt;br /&gt;
#* Otherwise, set the encoder object&#039;s internal &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; property to the returned encoding.&lt;br /&gt;
# Initialize the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag of the encoder object to false. &lt;br /&gt;
# Initialize the internal &amp;lt;var&amp;gt;encoding algorithm state&amp;lt;/var&amp;gt; to the default values for the encoding &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dl&amp;gt;&lt;br /&gt;
&amp;lt;dt&amp;gt;&amp;lt;code&amp;gt;encoding&amp;lt;/code&amp;gt; of type DOMString, readonly&lt;br /&gt;
&amp;lt;dd&amp;gt;Returns the &amp;lt;var&amp;gt;Name&amp;lt;/var&amp;gt; of the encoder object&#039;s &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;, per [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding]. &lt;br /&gt;
&lt;br /&gt;
:Note that this may differ from the name of the encoding specified during the call to the constructor. For example, if the constructor is called with &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; of &amp;lt;code&amp;gt;&amp;quot;ascii&amp;quot;&amp;lt;/code&amp;gt; the &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; attribute of the &#039;&#039;encoder object&#039;&#039; would have the value &amp;lt;code&amp;gt;&amp;quot;windows-1252&amp;quot;&amp;lt;/code&amp;gt; as &amp;lt;code&amp;gt;&amp;quot;ascii&amp;quot;&amp;lt;/code&amp;gt; is a label for that encoding.&lt;br /&gt;
&amp;lt;/dl&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dl&amp;gt;&lt;br /&gt;
&amp;lt;dt&amp;gt;&amp;lt;code&amp;gt;encode&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;dd&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The &amp;lt;code&amp;gt;encode&amp;lt;/code&amp;gt; method runs these steps:&lt;br /&gt;
&lt;br /&gt;
# If the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag is not set, then reset the &amp;lt;var&amp;gt;encoding algorithm state&amp;lt;/var&amp;gt; to the default values for &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;. Otherwise, the &amp;lt;var&amp;gt;encoding algorithm state&amp;lt;/var&amp;gt; is re-used from the previous call to &amp;lt;code&amp;gt;encode&amp;lt;/code&amp;gt; on this object.&lt;br /&gt;
# If the &amp;lt;var&amp;gt;options&amp;lt;/var&amp;gt; parameter is specified and the &amp;lt;code&amp;gt;stream&amp;lt;/code&amp;gt; option is &#039;&#039;&#039;true&#039;&#039;&#039;, then the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag is set; otherwise the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag is cleared.&lt;br /&gt;
# Run the steps of the &amp;lt;var&amp;gt;encoding algorithm&amp;lt;/var&amp;gt;:&lt;br /&gt;
#* The input to the algorithm is a &amp;lt;var&amp;gt;stream of code points&amp;lt;/var&amp;gt;. The code units within the DOMString &amp;lt;var&amp;gt;string&amp;lt;/var&amp;gt; are interpreted as UTF-16 code units, to produce the &amp;lt;var&amp;gt;stream of code points&amp;lt;/var&amp;gt;; if &amp;lt;var&amp;gt;string&amp;lt;/var&amp;gt; is null, the &amp;lt;var&amp;gt;stream of code points&amp;lt;/var&amp;gt; is empty.&lt;br /&gt;
#*:&#039;&#039;ISSUE: Interpreting a DOMString as UTF-16 code units to yield a code point stream needs to be defined, including unpaired surrogates. [http://dev.w3.org/2006/webapi/WebIDL/#idl-DOMString WebIDL] only defines the reverse.&#039;&#039;&lt;br /&gt;
#* If the &amp;lt;var&amp;gt;options&amp;lt;/var&amp;gt; parameter is not specified or the &amp;lt;code&amp;gt;stream&amp;lt;/code&amp;gt; option is false, then after the final code point is yielded by the stream, the &#039;&#039;&#039;EOF code point&#039;&#039;&#039; is yielded.&lt;br /&gt;
#* The output of the algorithm is a &amp;lt;var&amp;gt;sequence of emitted bytes&amp;lt;/var&amp;gt;.&lt;br /&gt;
# Return a &amp;lt;code&amp;gt;Uint8Array&amp;lt;/code&amp;gt; object wrapping an &amp;lt;code&amp;gt;ArrayBuffer&amp;lt;/code&amp;gt; containing the &amp;lt;var&amp;gt;sequence of emitted bytes&amp;lt;/var&amp;gt; produced by the encoding algorithm.&lt;br /&gt;
&amp;lt;/dl&amp;gt;&lt;br /&gt;
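&lt;br /&gt;
A minimal use of the encoder might look like the following. It is a sketch against the API as it later shipped (constructor called with &amp;lt;code&amp;gt;new&amp;lt;/code&amp;gt;, UTF-8 only), so it exercises only the default case of the constructor steps above.&lt;br /&gt;
&lt;br /&gt;
```javascript
// Assumption: shipped TextEncoder; only the no-argument (UTF-8) case is shown.
const encoder = new TextEncoder();

// The encoding attribute reports the name of the selected encoding.
const name = encoder.encoding;

// U+20AC EURO SIGN takes three bytes in UTF-8: 0xE2 0x82 0xAC.
const euro = encoder.encode('\u20AC');
const euroBytes = Array.from(euro);
```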
&lt;br /&gt;
=== TextDecoder ===&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;WebIDL&#039;&#039;&#039;&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
dictionary TextDecoderOptions {&lt;br /&gt;
  boolean fatal = false;&lt;br /&gt;
};&lt;br /&gt;
&lt;br /&gt;
dictionary TextDecodeOptions {&lt;br /&gt;
  boolean stream = false;&lt;br /&gt;
};&lt;br /&gt;
&lt;br /&gt;
[Constructor,&lt;br /&gt;
 Constructor(optional DOMString encoding, optional TextDecoderOptions options)]&lt;br /&gt;
interface TextDecoder {&lt;br /&gt;
  readonly attribute DOMString encoding;&lt;br /&gt;
  DOMString decode(optional ArrayBufferView view, optional TextDecodeOptions options);&lt;br /&gt;
};&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The constructor runs the following steps:&lt;br /&gt;
# The constructor creates a &#039;&#039;decoder object&#039;&#039;. It has the internal properties &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;, a &amp;lt;var&amp;gt;fatal&amp;lt;/var&amp;gt; flag which is initially unset, and a &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag which is initially unset.&lt;br /&gt;
# If called without an &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; argument, let &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt; be &amp;quot;&amp;lt;code&amp;gt;utf-8&amp;lt;/code&amp;gt;&amp;quot;. Otherwise, let &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt; be the value of the &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; argument.&lt;br /&gt;
# Run the &#039;&#039;&#039;steps to get an encoding&#039;&#039;&#039; from [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding], with &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt; as &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt;.&lt;br /&gt;
## If the steps result in failure, throw an &amp;quot;&amp;lt;code&amp;gt;EncodingError&amp;lt;/code&amp;gt;&amp;quot; exception and terminate these steps.&lt;br /&gt;
## Otherwise, set the &#039;&#039;decoder object&#039;s&#039;&#039; internal &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; property to the returned encoding.&lt;br /&gt;
# If the constructor is called with an &amp;lt;var&amp;gt;options&amp;lt;/var&amp;gt; argument, and the &amp;lt;var&amp;gt;fatal&amp;lt;/var&amp;gt; property of the dictionary is set, set the internal &amp;lt;var&amp;gt;fatal&amp;lt;/var&amp;gt; flag of the &#039;&#039;decoder object&#039;&#039;.&lt;br /&gt;
# Initialize the internal &amp;lt;var&amp;gt;encoding algorithm state&amp;lt;/var&amp;gt; to the default values for the encoding &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;.&lt;br /&gt;
# Return the &#039;&#039;decoder object&#039;&#039;.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dl&amp;gt;&lt;br /&gt;
&amp;lt;dt&amp;gt;&amp;lt;code&amp;gt;encoding&amp;lt;/code&amp;gt; of type DOMString, readonly&lt;br /&gt;
&amp;lt;dd&amp;gt;&lt;br /&gt;
Returns the &amp;lt;var&amp;gt;Name&amp;lt;/var&amp;gt; of the decoder object&#039;s &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;, per [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding].&lt;br /&gt;
&lt;br /&gt;
:Note that this may differ from the name of the encoding specified during the call to the constructor. For example, if the constructor is called with &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; of &amp;lt;code&amp;gt;&amp;quot;ascii&amp;quot;&amp;lt;/code&amp;gt; the &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; attribute of the &#039;&#039;decoder object&#039;&#039; would have the value &amp;lt;code&amp;gt;&amp;quot;windows-1252&amp;quot;&amp;lt;/code&amp;gt; as &amp;lt;code&amp;gt;&amp;quot;ascii&amp;quot;&amp;lt;/code&amp;gt; is a label for that encoding.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dt&amp;gt;&amp;lt;code&amp;gt;decode&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;dd&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The &amp;lt;code&amp;gt;decode&amp;lt;/code&amp;gt; method runs the following steps:&lt;br /&gt;
&lt;br /&gt;
# If the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag of the &#039;&#039;decoder object&#039;&#039; is not set, then reset the &amp;lt;var&amp;gt;encoding algorithm state&amp;lt;/var&amp;gt; to the default values for encoding &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;. Otherwise, the &amp;lt;var&amp;gt;encoding algorithm state&amp;lt;/var&amp;gt; is re-used from the previous call to &amp;lt;code&amp;gt;decode&amp;lt;/code&amp;gt; on this object. &lt;br /&gt;
# If the &amp;lt;var&amp;gt;options&amp;lt;/var&amp;gt; parameter is specified and the &amp;lt;var&amp;gt;stream&amp;lt;/var&amp;gt; option is &#039;&#039;&#039;true&#039;&#039;&#039;, then the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag is set; otherwise the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag is cleared.&lt;br /&gt;
# Run the decoder algorithm of the &#039;&#039;decoder object&#039;s&#039;&#039; encoding:&lt;br /&gt;
#* The input to the algorithm is a &amp;lt;var&amp;gt;byte stream&amp;lt;/var&amp;gt;. The &amp;lt;var&amp;gt;byte stream&amp;lt;/var&amp;gt; is provided by the bytes in &amp;lt;var&amp;gt;view.buffer&amp;lt;/var&amp;gt; starting at offset &amp;lt;var&amp;gt;view.byteOffset&amp;lt;/var&amp;gt;. A maximum of &amp;lt;var&amp;gt;view.byteLength&amp;lt;/var&amp;gt; bytes are yielded by the stream from &amp;lt;var&amp;gt;view.buffer&amp;lt;/var&amp;gt;. If &amp;lt;var&amp;gt;view&amp;lt;/var&amp;gt; is not specified, the stream is empty.&lt;br /&gt;
#* If the &amp;lt;var&amp;gt;options&amp;lt;/var&amp;gt; parameter is not specified or the &#039;&#039;&#039;stream&#039;&#039;&#039; option is &#039;&#039;&#039;false&#039;&#039;&#039;, then after &amp;lt;var&amp;gt;view.byteLength&amp;lt;/var&amp;gt; bytes are yielded by the stream the &#039;&#039;&#039;EOF byte&#039;&#039;&#039; is yielded.&lt;br /&gt;
#* If the internal &amp;lt;var&amp;gt;fatal&amp;lt;/var&amp;gt; flag of the &#039;&#039;decoder object&#039;&#039; is set, then a &#039;&#039;&#039;decoder error&#039;&#039;&#039; causes a &amp;lt;code&amp;gt;DOMException&amp;lt;/code&amp;gt; of type &amp;lt;code&amp;gt;EncodingError&amp;lt;/code&amp;gt; to be thrown rather than emitting a fallback code point.&lt;br /&gt;
#* The output of the algorithm is a sequence of &amp;lt;var&amp;gt;emitted code points&amp;lt;/var&amp;gt;.&lt;br /&gt;
# Return a &amp;lt;code&amp;gt;DOMString&amp;lt;/code&amp;gt; by encoding the &amp;lt;var&amp;gt;sequence of emitted code points&amp;lt;/var&amp;gt; as UTF-16 as per [http://dev.w3.org/2006/webapi/WebIDL/#idl-DOMString WebIDL].&lt;br /&gt;
&lt;br /&gt;
:&#039;&#039;ISSUE: Need to handle BOMs. [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding] specifies this by using the caller-specified encoding as a suggestion, and consuming the BOM as a part of selecting the real encoder where BOM takes precedence. At the very least a matching BOM should be ignored. A mismatching BOM could throw, be a decoding error, or could actually switch the decoder for this stream (call or sequence of calls), possibly if-and-only-if the constructor was called without an encoding.&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/dl&amp;gt;&lt;br /&gt;
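&lt;br /&gt;
The role of the &amp;lt;code&amp;gt;stream&amp;lt;/code&amp;gt; option can be sketched with a multi-byte character split across two chunks. The example assumes a runtime with the shipped &amp;lt;code&amp;gt;TextDecoder&amp;lt;/code&amp;gt;, whose &amp;lt;code&amp;gt;stream&amp;lt;/code&amp;gt; option behaves like the one described in the steps above; the chunk boundaries are chosen for illustration.&lt;br /&gt;
&lt;br /&gt;
```javascript
// Assumption: shipped TextDecoder (Encoding Standard), not this draft.
// U+20AC EURO SIGN is 0xE2 0x82 0xAC in UTF-8; split it across two chunks.
const chunk1 = new Uint8Array([0xE2, 0x82]);
const chunk2 = new Uint8Array([0xAC]);

const decoder = new TextDecoder('utf-8');
// stream: true keeps the incomplete sequence buffered, so nothing is emitted.
const first = decoder.decode(chunk1, { stream: true });
// The final call (no stream option) completes the character and flushes.
const second = decoder.decode(chunk2);

// Without streaming, the first chunk is treated as complete input and the
// truncated sequence is reported as a replacement character.
const broken = new TextDecoder('utf-8').decode(chunk1);
```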
&lt;br /&gt;
== Examples ==&lt;br /&gt;
&lt;br /&gt;
=== Example #1 - encoding strings ===&lt;br /&gt;
&lt;br /&gt;
The following example uses the API to encode an array of strings into an ArrayBuffer. The resulting buffer contains the number of strings (as a Uint32), followed by the byte length of the first string (as a Uint32), the string data in the requested encoding, the byte length of the second string (as a Uint32), its string data, and so on.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
function encodeArrayOfStrings(strings, encoding) {&lt;br /&gt;
  var encoder, encoded, len, i, bytes, view, offset;&lt;br /&gt;
&lt;br /&gt;
  encoder = new TextEncoder(encoding);&lt;br /&gt;
  encoded = [];&lt;br /&gt;
&lt;br /&gt;
  len = Uint32Array.BYTES_PER_ELEMENT;&lt;br /&gt;
  for (i = 0; i &amp;lt; strings.length; i += 1) {&lt;br /&gt;
    len += Uint32Array.BYTES_PER_ELEMENT;&lt;br /&gt;
    encoded[i] = encoder.encode(strings[i]);&lt;br /&gt;
    len += encoded[i].byteLength;&lt;br /&gt;
  }&lt;br /&gt;
&lt;br /&gt;
  bytes = new Uint8Array(len);&lt;br /&gt;
  view = new DataView(bytes.buffer);&lt;br /&gt;
  offset = 0;&lt;br /&gt;
&lt;br /&gt;
  view.setUint32(offset, strings.length);&lt;br /&gt;
  offset += Uint32Array.BYTES_PER_ELEMENT;&lt;br /&gt;
  for (i = 0; i &amp;lt; encoded.length; i += 1) {&lt;br /&gt;
    len = encoded[i].byteLength;&lt;br /&gt;
    view.setUint32(offset, len);&lt;br /&gt;
    offset += Uint32Array.BYTES_PER_ELEMENT;&lt;br /&gt;
    bytes.set(encoded[i], offset);&lt;br /&gt;
    offset += len;&lt;br /&gt;
  }&lt;br /&gt;
  return bytes.buffer;&lt;br /&gt;
}&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
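&lt;br /&gt;
One detail of the format is worth making explicit: &amp;lt;code&amp;gt;DataView.setUint32&amp;lt;/code&amp;gt; writes big-endian unless its third argument is &amp;lt;code&amp;gt;true&amp;lt;/code&amp;gt;, so the count and length fields above are stored in big-endian order regardless of platform. A small sketch of the difference, with variable names chosen for illustration:&lt;br /&gt;
&lt;br /&gt;
```javascript
const bytes = new Uint8Array(Uint32Array.BYTES_PER_ELEMENT);
const view = new DataView(bytes.buffer);

// Default byte order, as used by the example: most significant byte first.
view.setUint32(0, 1);
const bigEndian = Array.from(bytes);

// Passing true as the third argument selects little-endian instead.
view.setUint32(0, 1, true);
const littleEndian = Array.from(bytes);
```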
&lt;br /&gt;
=== Example #2 - decoding strings ===&lt;br /&gt;
&lt;br /&gt;
The following example decodes an ArrayBuffer containing data encoded in the format produced by the previous example back into an array of strings.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
function decodeArrayOfStrings(buffer, encoding) {&lt;br /&gt;
  var decoder, view, offset, num_strings, strings, i, len;&lt;br /&gt;
&lt;br /&gt;
  decoder = new TextDecoder(encoding);&lt;br /&gt;
  view = new DataView(buffer);&lt;br /&gt;
  offset = 0;&lt;br /&gt;
  strings = [];&lt;br /&gt;
&lt;br /&gt;
  num_strings = view.getUint32(offset);&lt;br /&gt;
  offset += Uint32Array.BYTES_PER_ELEMENT;&lt;br /&gt;
  for (i = 0; i &amp;lt; num_strings; i += 1) {&lt;br /&gt;
    len = view.getUint32(offset);&lt;br /&gt;
    offset += Uint32Array.BYTES_PER_ELEMENT;&lt;br /&gt;
    strings[i] = decoder.decode(&lt;br /&gt;
      new DataView(view.buffer, offset, len));&lt;br /&gt;
    offset += len;&lt;br /&gt;
  }&lt;br /&gt;
  return strings;&lt;br /&gt;
}&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Encodings ==&lt;br /&gt;
&lt;br /&gt;
Encodings are defined and implemented per [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding]. This implicitly includes the steps to &amp;lt;em&amp;gt;get an encoding&amp;lt;/em&amp;gt; from a string, and the logic for label matching and case-insensitivity.&lt;br /&gt;
&lt;br /&gt;
User agents MUST NOT support any other encodings or labels than those defined in [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding], and MUST support all encodings and labels defined in that specification, with the additions defined below.&lt;br /&gt;
&lt;br /&gt;
:&#039;&#039;NOTE: In [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding], &amp;quot;ascii&amp;quot; is a label for &#039;&#039;&#039;windows-1252&#039;&#039;&#039;; there is no 7-bit-or-raise-exception encoding. Applications that are required to restrict the content of decoded strings should implement validation after decoding.&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
:&#039;&#039;NOTE: Unicode normalization forms are outside the scope of this specification. No normalization is done prior to encoding or after decoding.&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
:&#039;&#039;NOTE: Handling of encoding-specific issues, e.g. over-long UTF-8 encodings, byte order marks, unmatched surrogate pairs, and so on is defined by [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding].&#039;&#039;&lt;br /&gt;
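&lt;br /&gt;
The validation-after-decoding approach suggested in the first note can be sketched as below. &amp;lt;code&amp;gt;decodeAscii&amp;lt;/code&amp;gt; is a hypothetical helper, not part of this API, and the sketch assumes the shipped &amp;lt;code&amp;gt;TextDecoder&amp;lt;/code&amp;gt;; the choice of &amp;lt;code&amp;gt;RangeError&amp;lt;/code&amp;gt; is arbitrary.&lt;br /&gt;
&lt;br /&gt;
```javascript
// Hypothetical helper: decode bytes, then reject anything outside 7-bit ASCII.
// Assumption: shipped TextDecoder (Encoding Standard), not this draft.
function decodeAscii(bytes) {
  const text = new TextDecoder('utf-8', { fatal: true }).decode(bytes);
  // Validation happens on the decoded string, not in the decoder itself.
  if (!/^[\x00-\x7F]*$/.test(text)) {
    throw new RangeError('decoded text contains non-ASCII characters');
  }
  return text;
}

const ok = decodeAscii(new Uint8Array([0x6f, 0x6b])); // the bytes of "ok"

let rejectedNonAscii = false;
try {
  decodeAscii(new Uint8Array([0xc3, 0xa9])); // valid UTF-8 for U+00E9
} catch (e) {
  rejectedNonAscii = e instanceof RangeError;
}
```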
&lt;br /&gt;
== References ==&lt;br /&gt;
&lt;br /&gt;
* WebIDL http://dev.w3.org/2006/webapi/WebIDL&lt;br /&gt;
* Encoding http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== Acknowledgements ==&lt;br /&gt;
&lt;br /&gt;
* Alan Chaney&lt;br /&gt;
* Ben Noordhuis&lt;br /&gt;
* Glenn Maynard&lt;br /&gt;
* John Tamplin&lt;br /&gt;
* Kenneth Russell (Google, Inc)&lt;br /&gt;
* Robert Mustacchi&lt;br /&gt;
* Ryan Dahl&lt;br /&gt;
* Anne van Kesteren&lt;br /&gt;
&lt;br /&gt;
== Appendix ==&lt;br /&gt;
&lt;br /&gt;
A &amp;quot;shim&amp;quot; implementation in JavaScript (that may not fully match the current version of the spec) plus some initial unit tests can be found at:&lt;br /&gt;
&lt;br /&gt;
:http://code.google.com/p/stringencoding/&lt;br /&gt;
&lt;br /&gt;
[[Category:Proposals]]&lt;/div&gt;</summary>
		<author><name>Jsbell</name></author>
	</entry>
	<entry>
		<id>https://wiki.whatwg.org/index.php?title=StringEncoding&amp;diff=8319</id>
		<title>StringEncoding</title>
		<link rel="alternate" type="text/html" href="https://wiki.whatwg.org/index.php?title=StringEncoding&amp;diff=8319"/>
		<updated>2012-06-15T21:50:26Z</updated>

		<summary type="html">&lt;p&gt;Jsbell: /* Resolved Issues */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;Proposed Text Encoding Web API for Typed Arrays&lt;br /&gt;
&lt;br /&gt;
== Editors ==&lt;br /&gt;
&lt;br /&gt;
* Joshua Bell (Google, Inc)&lt;br /&gt;
&lt;br /&gt;
== Abstract ==&lt;br /&gt;
&lt;br /&gt;
This specification defines an API for encoding strings to binary data, and decoding strings from binary data.&lt;br /&gt;
&lt;br /&gt;
NOTE: This specification intentionally does not address the opposite scenario of encoding binary data as strings and decoding binary data from strings, for example using Base64 encoding.&lt;br /&gt;
&lt;br /&gt;
Discussion on this topic has so far taken place on the [http://www.khronos.org/webgl/public-mailing-list/ public_webgl@khronos.org] mailing list. See http://www.khronos.org/webgl/public-mailing-list/archives/1111/msg00017.html for the initial discussion thread.&lt;br /&gt;
&lt;br /&gt;
Discussion has since moved to the [http://www.whatwg.org/mailing-list#specs WHATWG spec discussion] mailing list. See http://lists.whatwg.org/htdig.cgi/whatwg-whatwg.org/2012-March/035038.html for the latest discussion thread.&lt;br /&gt;
&lt;br /&gt;
== Open Issues ==&lt;br /&gt;
&lt;br /&gt;
* Encoding errors&lt;br /&gt;
** Define options dict for &amp;lt;code&amp;gt;TextEncoder()&amp;lt;/code&amp;gt; that allows selecting between throw vs. replacement character / replacement callback?&lt;br /&gt;
* The [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding specification] defines the byte order mark as more authoritative than an explicit encoding label when decoding content. &lt;br /&gt;
** Is that desirable here? &lt;br /&gt;
** What does it mean to e.g. write &amp;lt;code&amp;gt;TextDecoder(&#039;iso-2022-kr&#039;).decode(str)&amp;lt;/code&amp;gt; if string turns out to be UTF-8?&lt;br /&gt;
** What about streaming and other re-uses of the same decoder object?&lt;br /&gt;
* Streaming decode/encode requires keeping buffers between calls.&lt;br /&gt;
** Some encode/decode algorithms require adjusting the &amp;lt;var&amp;gt;code point pointer&amp;lt;/var&amp;gt; or &amp;lt;var&amp;gt;byte pointer&amp;lt;/var&amp;gt; by a negative amount, and this could occur across &amp;quot;chunk&amp;quot; boundaries. It follows that when the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag is set on an encoder/decoder, the last N elements of the stream must be saved and used as a prefix for the stream on the next call, where N is defined by the specific encoding algorithm.&lt;br /&gt;
** &#039;&#039;This is more of a note to implementers than an issue, but it may merit further discussion.&#039;&#039;&lt;br /&gt;
* Interpreting a DOMString as UTF-16 code units to yield a code point stream needs to be defined, including unpaired surrogates. &lt;br /&gt;
** [http://dev.w3.org/2006/webapi/WebIDL/#idl-DOMString WebIDL] currently only defines the reverse, but that&#039;s probably the right place to define it in the platform.&lt;br /&gt;
&lt;br /&gt;
== Resolved Issues ==&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dl&amp;gt;&lt;br /&gt;
&amp;lt;dt&amp;gt;Encode as many characters as possible into a fixed-size buffer for transmission, and repeat starting with next unencoded character&lt;br /&gt;
&amp;lt;dd&amp;gt;Resolution: not for &amp;quot;v1&amp;quot; - can be implemented using this API, and breaking the string down by &amp;quot;character&amp;quot; is unlikely to be as obvious as it sounds - surrogate pairs, combining sequences, etc&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dt&amp;gt;Support two versions of encode; one which takes target buffer, and one which creates/returns a right-sized buffer&lt;br /&gt;
&amp;lt;dd&amp;gt;Resolution: See above, and wait for developer feedback.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dt&amp;gt;Allow arbitrary end byte sequences (e.g. &amp;lt;code&amp;gt;0xFF&amp;lt;/code&amp;gt; for UTF-8 strings)&lt;br /&gt;
&amp;lt;dd&amp;gt;Resolution: Add &amp;lt;code&amp;gt;indexOf&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;ArrayBufferView&amp;lt;/code&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dt&amp;gt;Remove &#039;&#039;binary&#039;&#039; encoding? &#039;&#039;(a proposed 8-bit-clean encoding for interop with legacy binary data stored in ECMAScript strings)&#039;&#039;&lt;br /&gt;
&amp;lt;dd&amp;gt;The only real use is with &amp;lt;code&amp;gt;atob()&amp;lt;/code&amp;gt;/&amp;lt;code&amp;gt;btoa()&amp;lt;/code&amp;gt;; a better API would be Base64 directly in/out of Typed Arrays. Consensus on WHATWG is to add better APIs e.g. &lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
partial interface ArrayBufferView {&lt;br /&gt;
    DOMString toBase64();&lt;br /&gt;
};&lt;br /&gt;
&lt;br /&gt;
partial interface ArrayBuffer {&lt;br /&gt;
    static ArrayBuffer fromBase64(DOMString string);&lt;br /&gt;
};&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dt&amp;gt;Should this be a standalone API (as written), or live on e.g. DataView, or on e.g. String?&lt;br /&gt;
&amp;lt;dd&amp;gt;There seems to be pretty strong consensus that streaming, stateful coding is high priority and that an object-oriented API is cleanest, and shoe-horning those onto existing objects would be messy.&lt;br /&gt;
&amp;lt;/dl&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== API ==&lt;br /&gt;
&lt;br /&gt;
=== TextEncoder ===&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;WebIDL&#039;&#039;&#039;&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
dictionary TextEncodeOptions {&lt;br /&gt;
  boolean stream = false;&lt;br /&gt;
};&lt;br /&gt;
&lt;br /&gt;
[Constructor,&lt;br /&gt;
 Constructor(DOMString encoding)]&lt;br /&gt;
interface TextEncoder {&lt;br /&gt;
  readonly attribute DOMString encoding;&lt;br /&gt;
  ArrayBufferView encode(DOMString? string, optional TextEncodeOptions options);&lt;br /&gt;
};&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The constructor runs the following steps:&lt;br /&gt;
# If the constructor is called with no arguments, let &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt; be &amp;quot;&amp;lt;code&amp;gt;utf-8&amp;lt;/code&amp;gt;&amp;quot;. Otherwise, let &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt; be the value of the &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; argument.&lt;br /&gt;
# The constructor follows the &#039;&#039;&#039;steps to get an encoding&#039;&#039;&#039; from [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding], with &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt; as &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt;.&lt;br /&gt;
#* If the steps result in failure, throw an &amp;quot;&amp;lt;code&amp;gt;EncodingError&amp;lt;/code&amp;gt;&amp;quot; exception and terminate these steps.&lt;br /&gt;
#* Otherwise, set the encoder object&#039;s internal &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; property to the returned encoding.&lt;br /&gt;
# Initialize the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag of the encoder object to false. &lt;br /&gt;
# Initialize the internal &amp;lt;var&amp;gt;encoding algorithm state&amp;lt;/var&amp;gt; to the default values for the encoding &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dl&amp;gt;&lt;br /&gt;
&amp;lt;dt&amp;gt;&amp;lt;code&amp;gt;encoding&amp;lt;/code&amp;gt; of type DOMString, readonly&lt;br /&gt;
&amp;lt;dd&amp;gt;Returns the &amp;lt;var&amp;gt;Name&amp;lt;/var&amp;gt; of the encoder object&#039;s &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;, per [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding]. &lt;br /&gt;
&lt;br /&gt;
:Note that this may differ from the name of the encoding specified during the call to the constructor. For example, if the constructor is called with &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; of &amp;lt;code&amp;gt;&amp;quot;ascii&amp;quot;&amp;lt;/code&amp;gt; the &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; attribute of the &#039;&#039;encoder object&#039;&#039; would have the value &amp;lt;code&amp;gt;&amp;quot;windows-1252&amp;quot;&amp;lt;/code&amp;gt; as &amp;lt;code&amp;gt;&amp;quot;ascii&amp;quot;&amp;lt;/code&amp;gt; is a label for that encoding.&lt;br /&gt;
&amp;lt;/dl&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dl&amp;gt;&lt;br /&gt;
&amp;lt;dt&amp;gt;&amp;lt;code&amp;gt;encode&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;dd&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The &amp;lt;code&amp;gt;encode&amp;lt;/code&amp;gt; method runs these steps:&lt;br /&gt;
&lt;br /&gt;
# If the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag is not set, then reset the &amp;lt;var&amp;gt;encoding algorithm state&amp;lt;/var&amp;gt; to the default values for &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;. Otherwise, the &amp;lt;var&amp;gt;encoding algorithm state&amp;lt;/var&amp;gt; is re-used from the previous call to &amp;lt;code&amp;gt;encode&amp;lt;/code&amp;gt; on this object.&lt;br /&gt;
# If the &amp;lt;var&amp;gt;options&amp;lt;/var&amp;gt; parameter is specified and the &amp;lt;code&amp;gt;stream&amp;lt;/code&amp;gt; option is &#039;&#039;&#039;true&#039;&#039;&#039;, then the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag is set; otherwise the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag is cleared.&lt;br /&gt;
# Run the steps of the &amp;lt;var&amp;gt;encoding algorithm&amp;lt;/var&amp;gt;:&lt;br /&gt;
#* The input to the algorithm is a &amp;lt;var&amp;gt;stream of code points&amp;lt;/var&amp;gt;. The code units within the DOMString &amp;lt;var&amp;gt;string&amp;lt;/var&amp;gt; are interpreted as UTF-16 code units, to produce the &amp;lt;var&amp;gt;stream of code points&amp;lt;/var&amp;gt;; if &amp;lt;var&amp;gt;string&amp;lt;/var&amp;gt; is null, the &amp;lt;var&amp;gt;stream of code points&amp;lt;/var&amp;gt; is empty.&lt;br /&gt;
#*:&#039;&#039;ISSUE: Interpreting a DOMString as UTF-16 code units to yield a code point stream needs to be defined, including unpaired surrogates. [http://dev.w3.org/2006/webapi/WebIDL/#idl-DOMString WebIDL] only defines the reverse.&#039;&#039;&lt;br /&gt;
#* If the &amp;lt;var&amp;gt;options&amp;lt;/var&amp;gt; parameter is not specified or the &amp;lt;code&amp;gt;stream&amp;lt;/code&amp;gt; option is false, then after the final code point is yielded by the stream, the &#039;&#039;&#039;EOF code point&#039;&#039;&#039; is yielded.&lt;br /&gt;
#* The output of the algorithm is a &amp;lt;var&amp;gt;sequence of emitted bytes&amp;lt;/var&amp;gt;.&lt;br /&gt;
# Return a &amp;lt;code&amp;gt;Uint8Array&amp;lt;/code&amp;gt; object wrapping an &amp;lt;code&amp;gt;ArrayBuffer&amp;lt;/code&amp;gt; containing the &amp;lt;var&amp;gt;sequence of emitted bytes&amp;lt;/var&amp;gt; produced by the encoding algorithm.&lt;br /&gt;
&amp;lt;/dl&amp;gt;&lt;br /&gt;
&lt;br /&gt;
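A minimal usage sketch of the encoder interface above, assuming a runtime that exposes a &amp;lt;code&amp;gt;TextEncoder&amp;lt;/code&amp;gt; global and using only the no-argument (UTF-8) constructor. The sample string and the variable names are illustrative and not part of this proposal.&lt;br /&gt;
&lt;br /&gt;
```javascript
// Sketch only: relies on a TextEncoder global being available.
const encoder = new TextEncoder();

// With no constructor argument the label defaults to "utf-8".
const encodingName = encoder.encoding;

// encode() returns a Uint8Array over the emitted bytes.
// U+00E9 takes two bytes in UTF-8, so five characters yield six bytes.
const encodedBytes = encoder.encode("h\u00e9llo");
const byteList = Array.from(encodedBytes);
```
&lt;br /&gt;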
=== TextDecoder ===&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;WebIDL&#039;&#039;&#039;&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
dictionary TextDecoderOptions {&lt;br /&gt;
  boolean fatal = false;&lt;br /&gt;
};&lt;br /&gt;
&lt;br /&gt;
dictionary TextDecodeOptions {&lt;br /&gt;
  boolean stream = false;&lt;br /&gt;
};&lt;br /&gt;
&lt;br /&gt;
[Constructor,&lt;br /&gt;
 Constructor(optional DOMString encoding, optional TextDecoderOptions options)]&lt;br /&gt;
interface TextDecoder {&lt;br /&gt;
  readonly attribute DOMString encoding;&lt;br /&gt;
  DOMString decode(optional ArrayBufferView view, optional TextDecodeOptions options);&lt;br /&gt;
};&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The constructor runs the following steps:&lt;br /&gt;
# The constructor creates a &#039;&#039;decoder object&#039;&#039;. It has the internal properties &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;, a &amp;lt;var&amp;gt;fatal&amp;lt;/var&amp;gt; flag which is initially unset, and a &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag which is initially unset.&lt;br /&gt;
# If called without an &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; argument, let &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt; be &amp;quot;&amp;lt;code&amp;gt;utf-8&amp;lt;/code&amp;gt;&amp;quot;. Otherwise, let &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt; be the value of the &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; argument.&lt;br /&gt;
# Run the &#039;&#039;&#039;steps to get an encoding&#039;&#039;&#039; from [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding], with &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt; as &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt;. &lt;br /&gt;
## If the steps result in failure, throw an &amp;quot;&amp;lt;code&amp;gt;EncodingError&amp;lt;/code&amp;gt;&amp;quot; exception and terminate these steps.&lt;br /&gt;
## Otherwise, set the &#039;&#039;decoder object&#039;s&#039;&#039; internal &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; property to the returned encoding.&lt;br /&gt;
# If the constructor is called with an &amp;lt;var&amp;gt;options&amp;lt;/var&amp;gt; argument, and the &amp;lt;var&amp;gt;fatal&amp;lt;/var&amp;gt; property of the dictionary is set, set the internal &amp;lt;var&amp;gt;fatal&amp;lt;/var&amp;gt; flag of the &#039;&#039;decoder object&#039;&#039;.&lt;br /&gt;
# Initialize the internal &amp;lt;var&amp;gt;encoding algorithm state&amp;lt;/var&amp;gt; to the default values for the encoding &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;.&lt;br /&gt;
# Return the &#039;&#039;decoder object&#039;&#039;.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dl&amp;gt;&lt;br /&gt;
&amp;lt;dt&amp;gt;&amp;lt;code&amp;gt;encoding&amp;lt;/code&amp;gt; of type DOMString, readonly&lt;br /&gt;
&amp;lt;dd&amp;gt;&lt;br /&gt;
Returns the &amp;lt;var&amp;gt;Name&amp;lt;/var&amp;gt; of the decoder object&#039;s &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;, per [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding].&lt;br /&gt;
&lt;br /&gt;
:Note that this may differ from the name of the encoding specified during the call to the constructor. For example, if the constructor is called with &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; of &amp;lt;code&amp;gt;&amp;quot;ascii&amp;quot;&amp;lt;/code&amp;gt; the &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; attribute of the &#039;&#039;decoder object&#039;&#039; would have the value &amp;lt;code&amp;gt;&amp;quot;windows-1252&amp;quot;&amp;lt;/code&amp;gt; as &amp;lt;code&amp;gt;&amp;quot;ascii&amp;quot;&amp;lt;/code&amp;gt; is a label for that encoding.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dt&amp;gt;&amp;lt;code&amp;gt;decode&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;dd&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The &amp;lt;code&amp;gt;decode&amp;lt;/code&amp;gt; method runs the following steps:&lt;br /&gt;
&lt;br /&gt;
# If the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag of the &#039;&#039;decoder object&#039;&#039; is not set, then reset the &amp;lt;var&amp;gt;encoding algorithm state&amp;lt;/var&amp;gt; to the default values for encoding &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;. Otherwise, the &amp;lt;var&amp;gt;encoding algorithm state&amp;lt;/var&amp;gt; is re-used from the previous call to &amp;lt;code&amp;gt;decode&amp;lt;/code&amp;gt; on this object. &lt;br /&gt;
# If the &amp;lt;var&amp;gt;options&amp;lt;/var&amp;gt; parameter is specified and the &amp;lt;var&amp;gt;stream&amp;lt;/var&amp;gt; option is &#039;&#039;&#039;true&#039;&#039;&#039;, then the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag is set; otherwise the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag is cleared.&lt;br /&gt;
# Run the decoder algorithm of the &#039;&#039;decoder object&#039;s&#039;&#039; &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;:&lt;br /&gt;
#* The input to the algorithm is a &amp;lt;var&amp;gt;byte stream&amp;lt;/var&amp;gt;. The &amp;lt;var&amp;gt;byte stream&amp;lt;/var&amp;gt; is provided by the bytes in &amp;lt;var&amp;gt;view.buffer&amp;lt;/var&amp;gt; starting at offset &amp;lt;var&amp;gt;view.byteOffset&amp;lt;/var&amp;gt;. A maximum of &amp;lt;var&amp;gt;view.byteLength&amp;lt;/var&amp;gt; bytes are yielded by the stream from &amp;lt;var&amp;gt;view.buffer&amp;lt;/var&amp;gt;. If &amp;lt;var&amp;gt;view&amp;lt;/var&amp;gt; is not specified, the stream is empty.&lt;br /&gt;
#* If the &amp;lt;var&amp;gt;options&amp;lt;/var&amp;gt; parameter is not specified or the &#039;&#039;&#039;stream&#039;&#039;&#039; option is &#039;&#039;&#039;false&#039;&#039;&#039;, then after &amp;lt;var&amp;gt;view.byteLength&amp;lt;/var&amp;gt; bytes are yielded by the stream, the &#039;&#039;&#039;EOF byte&#039;&#039;&#039; is yielded.&lt;br /&gt;
#* If the internal &amp;lt;var&amp;gt;fatal&amp;lt;/var&amp;gt; flag of the &#039;&#039;decoder object&#039;&#039; is set, then a &#039;&#039;&#039;decoder error&#039;&#039;&#039; causes a &amp;lt;code&amp;gt;DOMException&amp;lt;/code&amp;gt; of type &amp;lt;code&amp;gt;EncodingError&amp;lt;/code&amp;gt; to be thrown rather than emitting a fallback code point.&lt;br /&gt;
#* The output of the algorithm is a sequence of &amp;lt;var&amp;gt;emitted code points&amp;lt;/var&amp;gt;.&lt;br /&gt;
# Return a &amp;lt;code&amp;gt;DOMString&amp;lt;/code&amp;gt; by encoding the &amp;lt;var&amp;gt;sequence of emitted code points&amp;lt;/var&amp;gt; as UTF-16 as per [http://dev.w3.org/2006/webapi/WebIDL/#idl-DOMString WebIDL].&lt;br /&gt;
&lt;br /&gt;
:&#039;&#039;ISSUE: Need to handle BOMs. [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding] specifies this by using the caller-specified encoding as a suggestion, and consuming the BOM as a part of selecting the real encoder where BOM takes precedence. At the very least a matching BOM should be ignored. A mismatching BOM could throw, be a decoding error, or could actually switch the decoder for this stream (call or sequence of calls), possibly if-and-only-if the constructor was called without an encoding.&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/dl&amp;gt;&lt;br /&gt;
&lt;br /&gt;
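The streaming and fatal behaviour described above can be sketched as follows. This assumes a runtime that exposes a &amp;lt;code&amp;gt;TextDecoder&amp;lt;/code&amp;gt; global accepting the &amp;lt;code&amp;gt;stream&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;fatal&amp;lt;/code&amp;gt; options; the helper names &amp;lt;code&amp;gt;decodeChunks&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;decodeStrict&amp;lt;/code&amp;gt; are hypothetical, and the exception type a given runtime throws may differ from the &amp;lt;code&amp;gt;EncodingError&amp;lt;/code&amp;gt; named in this draft.&lt;br /&gt;
&lt;br /&gt;
```javascript
// Sketch only: relies on a TextDecoder global being available.

// Decode a list of byte chunks that may split a character between chunks.
function decodeChunks(chunks, label) {
  const decoder = new TextDecoder(label);
  let text = "";
  for (const chunk of chunks) {
    // stream: true keeps an incomplete trailing byte sequence buffered
    // inside the decoder instead of treating it as an error.
    text += decoder.decode(chunk, { stream: true });
  }
  // A final call without the stream option signals end of input.
  text += decoder.decode();
  return text;
}

// Decode with the fatal flag set; returns null when the bytes are malformed.
function decodeStrict(bytes) {
  const strict = new TextDecoder("utf-8", { fatal: true });
  try {
    return strict.decode(bytes);
  } catch (error) {
    return null;
  }
}

// U+20AC EURO SIGN is E2 82 AC in UTF-8; split it across two chunks.
const euroChunks = [new Uint8Array([0xE2, 0x82]), new Uint8Array([0xAC])];
const euroText = decodeChunks(euroChunks, "utf-8");

// 0xFF can never appear in well-formed UTF-8.
const rejected = decodeStrict(new Uint8Array([0xFF]));
```
&lt;br /&gt;
Creating a fresh decoder per logical stream, as &amp;lt;code&amp;gt;decodeChunks&amp;lt;/code&amp;gt; does, avoids carrying buffered state from one stream into the next.&lt;br /&gt;
&lt;br /&gt;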
== Examples ==&lt;br /&gt;
&lt;br /&gt;
=== Example #1 - encoding strings ===&lt;br /&gt;
&lt;br /&gt;
The following example uses the API to encode an array of strings into an ArrayBuffer. The returned ArrayBuffer contains the number of strings (as a Uint32), followed by the length of the first string (as a Uint32), the UTF-8 encoded string data, the length of the second string (as a Uint32), the string data, and so on.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
function encodeArrayOfStrings(strings, encoding) {&lt;br /&gt;
  var encoder, encoded, len, i, bytes, view, offset;&lt;br /&gt;
&lt;br /&gt;
  encoder = new TextEncoder(encoding);&lt;br /&gt;
  encoded = [];&lt;br /&gt;
&lt;br /&gt;
  len = Uint32Array.BYTES_PER_ELEMENT;&lt;br /&gt;
  for (i = 0; i &amp;lt; strings.length; i += 1) {&lt;br /&gt;
    len += Uint32Array.BYTES_PER_ELEMENT;&lt;br /&gt;
    encoded[i] = encoder.encode(strings[i]);&lt;br /&gt;
    len += encoded[i].byteLength;&lt;br /&gt;
  }&lt;br /&gt;
&lt;br /&gt;
  bytes = new Uint8Array(len);&lt;br /&gt;
  view = new DataView(bytes.buffer);&lt;br /&gt;
  offset = 0;&lt;br /&gt;
&lt;br /&gt;
  view.setUint32(offset, strings.length);&lt;br /&gt;
  offset += Uint32Array.BYTES_PER_ELEMENT;&lt;br /&gt;
  for (i = 0; i &amp;lt; encoded.length; i += 1) {&lt;br /&gt;
    len = encoded[i].byteLength;&lt;br /&gt;
    view.setUint32(offset, len);&lt;br /&gt;
    offset += Uint32Array.BYTES_PER_ELEMENT;&lt;br /&gt;
    bytes.set(encoded[i], offset);&lt;br /&gt;
    offset += len;&lt;br /&gt;
  }&lt;br /&gt;
  return bytes.buffer;&lt;br /&gt;
}&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Example #2 - decoding strings ===&lt;br /&gt;
&lt;br /&gt;
The following example decodes an ArrayBuffer containing data encoded in the format produced by the previous example back into an array of strings.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
function decodeArrayOfStrings(buffer, encoding) {&lt;br /&gt;
  var decoder, view, offset, num_strings, strings, i, len;&lt;br /&gt;
&lt;br /&gt;
  decoder = new TextDecoder(encoding);&lt;br /&gt;
  view = new DataView(buffer);&lt;br /&gt;
  offset = 0;&lt;br /&gt;
  strings = [];&lt;br /&gt;
&lt;br /&gt;
  num_strings = view.getUint32(offset);&lt;br /&gt;
  offset += Uint32Array.BYTES_PER_ELEMENT;&lt;br /&gt;
  for (i = 0; i &amp;lt; num_strings; i += 1) {&lt;br /&gt;
    len = view.getUint32(offset);&lt;br /&gt;
    offset += Uint32Array.BYTES_PER_ELEMENT;&lt;br /&gt;
    strings[i] = decoder.decode(&lt;br /&gt;
      new DataView(view.buffer, offset, len));&lt;br /&gt;
    offset += len;&lt;br /&gt;
  }&lt;br /&gt;
  return strings;&lt;br /&gt;
}&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Encodings ==&lt;br /&gt;
&lt;br /&gt;
Encodings are defined and implemented per [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding]. This implicitly includes the steps to &amp;lt;em&amp;gt;get an encoding&amp;lt;/em&amp;gt; from a string, and the logic for label matching and case-insensitivity.&lt;br /&gt;
&lt;br /&gt;
User agents MUST NOT support any other encodings or labels than those defined in [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding], and MUST support all encodings and labels defined in that specification, with the additions defined below.&lt;br /&gt;
&lt;br /&gt;
:&#039;&#039;NOTE: In [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding], &amp;quot;ascii&amp;quot; is a label for &#039;&#039;&#039;windows-1252&#039;&#039;&#039;; there is no 7-bit-or-raise-exception encoding. Applications that are required to restrict the content of decoded strings should implement validation after decoding.&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
:&#039;&#039;NOTE: Unicode normalization forms are outside the scope of this specification. No normalization is done prior to encoding or after decoding.&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
:&#039;&#039;NOTE: Handling of encoding-specific issues, e.g. over-long UTF-8 encodings, byte order marks, unmatched surrogate pairs, and so on is defined by [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding].&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
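The first note above leaves content validation to the application. A minimal sketch of such a post-decoding check, assuming the restriction in question is &amp;quot;7-bit ASCII only&amp;quot;; the function name &amp;lt;code&amp;gt;isSevenBitClean&amp;lt;/code&amp;gt; is hypothetical and not part of this proposal.&lt;br /&gt;
&lt;br /&gt;
```javascript
// Returns true only if every UTF-16 code unit in the decoded string
// falls inside the 7-bit ASCII range (U+0000 through U+007F).
function isSevenBitClean(text) {
  for (let i = 0; i !== text.length; i += 1) {
    if (text.charCodeAt(i) > 0x7F) {
      return false;
    }
  }
  return true;
}

// An application would call this on the result of decode() and
// reject or report the input when it returns false.
const plainOk = isSevenBitClean("plain text");
const accentedOk = isSevenBitClean("caf\u00e9");
```
&lt;br /&gt;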
== References ==&lt;br /&gt;
&lt;br /&gt;
* WebIDL http://dev.w3.org/2006/webapi/WebIDL&lt;br /&gt;
* Encoding http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== Acknowledgements ==&lt;br /&gt;
&lt;br /&gt;
* Alan Chaney&lt;br /&gt;
* Ben Noordhuis&lt;br /&gt;
* Glenn Maynard&lt;br /&gt;
* John Tamplin&lt;br /&gt;
* Kenneth Russell (Google, Inc)&lt;br /&gt;
* Robert Mustacchi&lt;br /&gt;
* Ryan Dahl&lt;br /&gt;
* Anne van Kesteren&lt;br /&gt;
&lt;br /&gt;
== Appendix ==&lt;br /&gt;
&lt;br /&gt;
A &amp;quot;shim&amp;quot; implementation in JavaScript (that may not fully match the current version of the spec) plus some initial unit tests can be found at:&lt;br /&gt;
&lt;br /&gt;
:http://code.google.com/p/stringencoding/&lt;br /&gt;
&lt;br /&gt;
[[Category:Proposals]]&lt;/div&gt;</summary>
		<author><name>Jsbell</name></author>
	</entry>
	<entry>
		<id>https://wiki.whatwg.org/index.php?title=StringEncoding&amp;diff=8318</id>
		<title>StringEncoding</title>
		<link rel="alternate" type="text/html" href="https://wiki.whatwg.org/index.php?title=StringEncoding&amp;diff=8318"/>
		<updated>2012-06-15T21:48:24Z</updated>

		<summary type="html">&lt;p&gt;Jsbell: /* Open Issues */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;Proposed Text Encoding Web API for Typed Arrays&lt;br /&gt;
&lt;br /&gt;
== Editors ==&lt;br /&gt;
&lt;br /&gt;
* Joshua Bell (Google, Inc)&lt;br /&gt;
&lt;br /&gt;
== Abstract ==&lt;br /&gt;
&lt;br /&gt;
This specification defines an API for encoding strings to binary data, and decoding strings from binary data.&lt;br /&gt;
&lt;br /&gt;
NOTE: This specification intentionally does not address the opposite scenario of encoding binary data as strings and decoding binary data from strings, for example using Base64 encoding.&lt;br /&gt;
&lt;br /&gt;
Discussion on this topic has so far taken place on the [http://www.khronos.org/webgl/public-mailing-list/ public_webgl@khronos.org] mailing list. See http://www.khronos.org/webgl/public-mailing-list/archives/1111/msg00017.html for the initial discussion thread.&lt;br /&gt;
&lt;br /&gt;
Discussion has since moved to the [http://www.whatwg.org/mailing-list#specs WHATWG spec discussion] mailing list. See  http://lists.whatwg.org/htdig.cgi/whatwg-whatwg.org/2012-March/035038.html for the latest discussion thread.&lt;br /&gt;
&lt;br /&gt;
== Open Issues ==&lt;br /&gt;
&lt;br /&gt;
* Encoding errors&lt;br /&gt;
** Define options dict for &amp;lt;code&amp;gt;TextEncoder()&amp;lt;/code&amp;gt; that allows selecting between throw vs. replacement character / replacement callback?&lt;br /&gt;
* The [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding specification] defines the byte order mark as more authoritative than an explicit encoding label when decoding content. &lt;br /&gt;
** Is that desirable here? &lt;br /&gt;
** What does it mean to e.g. write &amp;lt;code&amp;gt;TextDecoder(&#039;iso-2022-kr&#039;).decode(str)&amp;lt;/code&amp;gt; if string turns out to be UTF-8?&lt;br /&gt;
** What about streaming and other re-uses of the same decoder object?&lt;br /&gt;
* Streaming decode/encode requires keeping buffers between calls.&lt;br /&gt;
** Some encode/decode algorithms require adjusting the &amp;lt;var&amp;gt;code point pointer&amp;lt;/var&amp;gt; or &amp;lt;var&amp;gt;byte pointer&amp;lt;/var&amp;gt; by a negative amount. This could occur across &amp;quot;chunk&amp;quot; boundaries. This implies that when the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag is set on an encoder/decoder, the last N elements of the stream are saved for the next call and used as a prefix for the stream. N is defined by the specific encoding algorithm.&lt;br /&gt;
** &#039;&#039;This is more of a note to implementers than an issue, but it may merit further discussion.&#039;&#039;&lt;br /&gt;
* Interpreting a DOMString as UTF-16 code units to yield a code point stream needs to be defined, including unpaired surrogates. &lt;br /&gt;
** [http://dev.w3.org/2006/webapi/WebIDL/#idl-DOMString WebIDL] currently only defines the reverse, but that&#039;s probably the right place to define it in the platform.&lt;br /&gt;
&lt;br /&gt;
== Resolved Issues ==&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dl&amp;gt;&lt;br /&gt;
&amp;lt;dt&amp;gt;Encode as many characters as possible into a fixed-size buffer for transmission, and repeat starting with next unencoded character&lt;br /&gt;
&amp;lt;dd&amp;gt;Resolution: not for &amp;quot;v1&amp;quot; - can be implemented using this API, and breaking the string down by &amp;quot;character&amp;quot; is unlikely to be as obvious as it sounds - surrogate pairs, combining sequences, etc&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dt&amp;gt;Support two versions of encode; one which takes target buffer, and one which creates/returns a right-sized buffer&lt;br /&gt;
&amp;lt;dd&amp;gt;Resolution: See above, and wait for developer feedback.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dt&amp;gt;Allow arbitrary end byte sequences (e.g. &amp;lt;code&amp;gt;0xFF&amp;lt;/code&amp;gt; for UTF-8 strings)&lt;br /&gt;
&amp;lt;dd&amp;gt;Resolution: Add &amp;lt;code&amp;gt;indexOf&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;ArrayBufferView&amp;lt;/code&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dt&amp;gt;Remove &#039;&#039;binary&#039;&#039; encoding?&lt;br /&gt;
&amp;lt;dd&amp;gt;The only real use is with &amp;lt;code&amp;gt;atob()&amp;lt;/code&amp;gt;/&amp;lt;code&amp;gt;btoa()&amp;lt;/code&amp;gt;; a better API would be Base64 directly in/out of Typed Arrays. Consensus on WHATWG is to add better APIs e.g. &lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
partial interface ArrayBufferView {&lt;br /&gt;
    DOMString toBase64();&lt;br /&gt;
};&lt;br /&gt;
&lt;br /&gt;
partial interface ArrayBuffer {&lt;br /&gt;
    static ArrayBuffer fromBase64(DOMString string);&lt;br /&gt;
};&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dt&amp;gt;Should this be a standalone API (as written), or live on e.g. DataView, or on e.g. String?&lt;br /&gt;
&amp;lt;dd&amp;gt;There seems to be pretty strong consensus that streaming, stateful coding is high priority and that an object-oriented API is cleanest, and shoe-horning those onto existing objects would be messy.&lt;br /&gt;
&amp;lt;/dl&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== API ==&lt;br /&gt;
&lt;br /&gt;
=== TextEncoder ===&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;WebIDL&#039;&#039;&#039;&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
dictionary TextEncodeOptions {&lt;br /&gt;
  boolean stream = false;&lt;br /&gt;
};&lt;br /&gt;
&lt;br /&gt;
[Constructor,&lt;br /&gt;
 Constructor(DOMString encoding)]&lt;br /&gt;
interface TextEncoder {&lt;br /&gt;
  readonly attribute DOMString encoding;&lt;br /&gt;
  ArrayBufferView encode(DOMString? string, optional TextEncodeOptions options);&lt;br /&gt;
};&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The constructor runs the following steps:&lt;br /&gt;
# If the constructor is called with no arguments, let &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt; be &amp;quot;&amp;lt;code&amp;gt;utf-8&amp;lt;/code&amp;gt;&amp;quot;. Otherwise, let &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt; be the value of the &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; argument.&lt;br /&gt;
# The constructor follows the &#039;&#039;&#039;steps to get an encoding&#039;&#039;&#039; from [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding], with &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt; as &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt;.&lt;br /&gt;
#* If the steps result in failure, throw an &amp;quot;&amp;lt;code&amp;gt;EncodingError&amp;lt;/code&amp;gt;&amp;quot; exception and terminate these steps.&lt;br /&gt;
#* Otherwise, set the encoder object&#039;s internal &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; property to the returned encoding.&lt;br /&gt;
# Initialize the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag of the encoder object to false. &lt;br /&gt;
# Initialize the internal &amp;lt;var&amp;gt;encoding algorithm state&amp;lt;/var&amp;gt; to the default values for the encoding &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dl&amp;gt;&lt;br /&gt;
&amp;lt;dt&amp;gt;&amp;lt;code&amp;gt;encoding&amp;lt;/code&amp;gt; of type DOMString, readonly&lt;br /&gt;
&amp;lt;dd&amp;gt;Returns the &amp;lt;var&amp;gt;Name&amp;lt;/var&amp;gt; of the encoder object&#039;s &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;, per [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding]. &lt;br /&gt;
&lt;br /&gt;
:Note that this may differ from the name of the encoding specified during the call to the constructor. For example, if the constructor is called with &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; of &amp;lt;code&amp;gt;&amp;quot;ascii&amp;quot;&amp;lt;/code&amp;gt; the &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; attribute of the &#039;&#039;encoder object&#039;&#039; would have the value &amp;lt;code&amp;gt;&amp;quot;windows-1252&amp;quot;&amp;lt;/code&amp;gt; as &amp;lt;code&amp;gt;&amp;quot;ascii&amp;quot;&amp;lt;/code&amp;gt; is a label for that encoding.&lt;br /&gt;
&amp;lt;/dl&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dl&amp;gt;&lt;br /&gt;
&amp;lt;dt&amp;gt;&amp;lt;code&amp;gt;encode&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;dd&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The &amp;lt;code&amp;gt;encode&amp;lt;/code&amp;gt; method runs these steps:&lt;br /&gt;
&lt;br /&gt;
# If the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag is not set, then reset the &amp;lt;var&amp;gt;encoding algorithm state&amp;lt;/var&amp;gt; to the default values for &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;. Otherwise, the &amp;lt;var&amp;gt;encoding algorithm state&amp;lt;/var&amp;gt; is re-used from the previous call to &amp;lt;code&amp;gt;encode&amp;lt;/code&amp;gt; on this object.&lt;br /&gt;
# If the &amp;lt;var&amp;gt;options&amp;lt;/var&amp;gt; parameter is specified and the &amp;lt;code&amp;gt;stream&amp;lt;/code&amp;gt; option is &#039;&#039;&#039;true&#039;&#039;&#039;, then the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag is set; otherwise the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag is cleared.&lt;br /&gt;
# Run the steps of the &amp;lt;var&amp;gt;encoding algorithm&amp;lt;/var&amp;gt;:&lt;br /&gt;
#* The input to the algorithm is a &amp;lt;var&amp;gt;stream of code points&amp;lt;/var&amp;gt;. The code units within the DOMString &amp;lt;var&amp;gt;string&amp;lt;/var&amp;gt; are interpreted as UTF-16 code units, to produce the &amp;lt;var&amp;gt;stream of code points&amp;lt;/var&amp;gt;; if &amp;lt;var&amp;gt;string&amp;lt;/var&amp;gt; is null, the &amp;lt;var&amp;gt;stream of code points&amp;lt;/var&amp;gt; is empty.&lt;br /&gt;
#*:&#039;&#039;ISSUE: Interpreting a DOMString as UTF-16 code units to yield a code point stream needs to be defined, including unpaired surrogates. [http://dev.w3.org/2006/webapi/WebIDL/#idl-DOMString WebIDL] only defines the reverse.&#039;&#039;&lt;br /&gt;
#* If the &amp;lt;var&amp;gt;options&amp;lt;/var&amp;gt; parameter is not specified or the &amp;lt;code&amp;gt;stream&amp;lt;/code&amp;gt; option is false, then after the final code point is yielded by the stream, the &#039;&#039;&#039;EOF code point&#039;&#039;&#039; is yielded.&lt;br /&gt;
#* The output of the algorithm is a &amp;lt;var&amp;gt;sequence of emitted bytes&amp;lt;/var&amp;gt;.&lt;br /&gt;
# Return a &amp;lt;code&amp;gt;Uint8Array&amp;lt;/code&amp;gt; object wrapping an &amp;lt;code&amp;gt;ArrayBuffer&amp;lt;/code&amp;gt; containing the &amp;lt;var&amp;gt;sequence of emitted bytes&amp;lt;/var&amp;gt; produced by the encoding algorithm.&lt;br /&gt;
&amp;lt;/dl&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== TextDecoder ===&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;WebIDL&#039;&#039;&#039;&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
dictionary TextDecoderOptions {&lt;br /&gt;
  boolean fatal = false;&lt;br /&gt;
};&lt;br /&gt;
&lt;br /&gt;
dictionary TextDecodeOptions {&lt;br /&gt;
  boolean stream = false;&lt;br /&gt;
};&lt;br /&gt;
&lt;br /&gt;
[Constructor,&lt;br /&gt;
 Constructor(optional DOMString encoding, optional TextDecoderOptions options)]&lt;br /&gt;
interface TextDecoder {&lt;br /&gt;
  readonly attribute DOMString encoding;&lt;br /&gt;
  DOMString decode(optional ArrayBufferView view, optional TextDecodeOptions options);&lt;br /&gt;
};&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The constructor runs the following steps:&lt;br /&gt;
# The constructor creates a &#039;&#039;decoder object&#039;&#039;. It has the internal properties &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;, a &amp;lt;var&amp;gt;fatal&amp;lt;/var&amp;gt; flag which is initially unset, and a &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag which is initially unset.&lt;br /&gt;
# If called without an &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; argument, let &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt; be &amp;quot;&amp;lt;code&amp;gt;utf-8&amp;lt;/code&amp;gt;&amp;quot;. Otherwise, let &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt; be the value of the &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; argument.&lt;br /&gt;
# Run the &#039;&#039;&#039;steps to get an encoding&#039;&#039;&#039; from [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding], with &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt; as &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt;. &lt;br /&gt;
## If the steps result in failure, throw an &amp;quot;&amp;lt;code&amp;gt;EncodingError&amp;lt;/code&amp;gt;&amp;quot; exception and terminate these steps.&lt;br /&gt;
## Otherwise, set the &#039;&#039;decoder object&#039;s&#039;&#039; internal &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; property to the returned encoding.&lt;br /&gt;
# If the constructor is called with an &amp;lt;var&amp;gt;options&amp;lt;/var&amp;gt; argument, and the &amp;lt;var&amp;gt;fatal&amp;lt;/var&amp;gt; property of the dictionary is set, set the internal &amp;lt;var&amp;gt;fatal&amp;lt;/var&amp;gt; flag of the &#039;&#039;decoder object&#039;&#039;.&lt;br /&gt;
# Initialize the internal &amp;lt;var&amp;gt;encoding algorithm state&amp;lt;/var&amp;gt; to the default values for the encoding &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;.&lt;br /&gt;
# Return the &#039;&#039;decoder object&#039;&#039;.&lt;br /&gt;
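&lt;br /&gt;
The constructor steps above can be exercised against &amp;lt;code&amp;gt;TextDecoder&amp;lt;/code&amp;gt; as it later shipped. This is a minimal sketch, assuming a runtime that implements the Encoding Standard rather than this draft; in that API an unknown label throws a &amp;lt;code&amp;gt;RangeError&amp;lt;/code&amp;gt; instead of the &amp;lt;code&amp;gt;EncodingError&amp;lt;/code&amp;gt; described here. The variable names are illustrative only.&lt;br /&gt;
&lt;br /&gt;
```javascript
// Step 2: with no argument the label defaults to "utf-8".
const defaultDecoder = new TextDecoder();
const defaultName = defaultDecoder.encoding;

// Step 3: a label is resolved to the name of its encoding
// ("utf8" is one of the labels for utf-8).
const resolvedName = new TextDecoder("utf8").encoding;

// Step 3.1: an unknown label makes the constructor throw.
let labelError = null;
try {
  new TextDecoder("no-such-encoding");
} catch (error) {
  // Assumption: shipped behavior (RangeError), not this draft's EncodingError.
  labelError = error;
}

// Step 4: the fatal option sets the decoder's fatal flag.
const strictDecoder = new TextDecoder("utf-8", { fatal: true });
const isFatal = strictDecoder.fatal;
```
&lt;br /&gt;
Both &amp;lt;code&amp;gt;defaultName&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;resolvedName&amp;lt;/code&amp;gt; hold the encoding name rather than the label that was passed in.&lt;br /&gt;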
&lt;br /&gt;
&amp;lt;dl&amp;gt;&lt;br /&gt;
&amp;lt;dt&amp;gt;&amp;lt;code&amp;gt;encoding&amp;lt;/code&amp;gt; of type DOMString, readonly&lt;br /&gt;
&amp;lt;dd&amp;gt;&lt;br /&gt;
Returns the &amp;lt;var&amp;gt;Name&amp;lt;/var&amp;gt; of the decoder object&#039;s &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;, per [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding].&lt;br /&gt;
&lt;br /&gt;
:Note that this may differ from the name of the encoding specified during the call to the constructor. For example, if the constructor is called with &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; of &amp;lt;code&amp;gt;&amp;quot;ascii&amp;quot;&amp;lt;/code&amp;gt; the &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; attribute of the &#039;&#039;decoder object&#039;&#039; would have the value &amp;lt;code&amp;gt;&amp;quot;windows-1252&amp;quot;&amp;lt;/code&amp;gt; as &amp;lt;code&amp;gt;&amp;quot;ascii&amp;quot;&amp;lt;/code&amp;gt; is a label for that encoding.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dt&amp;gt;&amp;lt;code&amp;gt;decode&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;dd&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The &amp;lt;code&amp;gt;decode&amp;lt;/code&amp;gt; method runs the following steps:&lt;br /&gt;
&lt;br /&gt;
# If the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag of the &#039;&#039;decoder object&#039;&#039; is not set, then reset the &amp;lt;var&amp;gt;encoding algorithm state&amp;lt;/var&amp;gt; to the default values for encoding &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;. Otherwise, the &amp;lt;var&amp;gt;encoding algorithm state&amp;lt;/var&amp;gt; is re-used from the previous call to &amp;lt;code&amp;gt;decode&amp;lt;/code&amp;gt; on this object. &lt;br /&gt;
# If the &amp;lt;var&amp;gt;options&amp;lt;/var&amp;gt; parameter is specified and the &amp;lt;var&amp;gt;stream&amp;lt;/var&amp;gt; option is &#039;&#039;&#039;true&#039;&#039;&#039;, then the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag is set; otherwise the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag is cleared.&lt;br /&gt;
# Run the decoder algorithm of the &#039;&#039;decoder object&#039;s&#039;&#039; encoding:&lt;br /&gt;
#* The input to the algorithm is a &amp;lt;var&amp;gt;byte stream&amp;lt;/var&amp;gt;. The &amp;lt;var&amp;gt;byte stream&amp;lt;/var&amp;gt; is provided by the bytes in &amp;lt;var&amp;gt;view.buffer&amp;lt;/var&amp;gt; starting at offset &amp;lt;var&amp;gt;view.byteOffset&amp;lt;/var&amp;gt;. A maximum of &amp;lt;var&amp;gt;view.byteLength&amp;lt;/var&amp;gt; bytes are yielded by the stream from &amp;lt;var&amp;gt;view.buffer&amp;lt;/var&amp;gt;. If &amp;lt;var&amp;gt;view&amp;lt;/var&amp;gt; is not specified, the stream is empty.&lt;br /&gt;
#* If the &amp;lt;var&amp;gt;options&amp;lt;/var&amp;gt; parameter is not specified or the &#039;&#039;&#039;stream&#039;&#039;&#039; option is &#039;&#039;&#039;false&#039;&#039;&#039;, then after &amp;lt;var&amp;gt;view.byteLength&amp;lt;/var&amp;gt; bytes are yielded by the stream the &#039;&#039;&#039;EOF byte&#039;&#039;&#039; is yielded.&lt;br /&gt;
#* If the internal &amp;lt;var&amp;gt;fatal&amp;lt;/var&amp;gt; flag of the &#039;&#039;decoder object&#039;&#039; is set, then a &#039;&#039;&#039;decoder error&#039;&#039;&#039; causes a &amp;lt;code&amp;gt;DOMException&amp;lt;/code&amp;gt; of type &amp;lt;code&amp;gt;EncodingError&amp;lt;/code&amp;gt; to be thrown rather than emitting a fallback code point.&lt;br /&gt;
#* The output of the algorithm is a sequence of &amp;lt;var&amp;gt;emitted code points&amp;lt;/var&amp;gt;.&lt;br /&gt;
# Return a &amp;lt;code&amp;gt;DOMString&amp;lt;/code&amp;gt; by encoding the &amp;lt;var&amp;gt;sequence of emitted code points&amp;lt;/var&amp;gt; as UTF-16 as per [http://dev.w3.org/2006/webapi/WebIDL/#idl-DOMString WebIDL].&lt;br /&gt;
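&lt;br /&gt;
The effect of the &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; and &amp;lt;var&amp;gt;fatal&amp;lt;/var&amp;gt; flags in the steps above can be sketched as follows. The sketch assumes the &amp;lt;code&amp;gt;TextDecoder&amp;lt;/code&amp;gt; that later shipped with the Encoding Standard, where a fatal decoder error surfaces as a &amp;lt;code&amp;gt;TypeError&amp;lt;/code&amp;gt; rather than the &amp;lt;code&amp;gt;EncodingError&amp;lt;/code&amp;gt; named in this draft. All variable names are illustrative.&lt;br /&gt;
&lt;br /&gt;
```javascript
// U+20AC EURO SIGN is three bytes in UTF-8; split it across two chunks.
const euroBytes = new Uint8Array([0xe2, 0x82, 0xac]);

const decoder = new TextDecoder("utf-8");
// stream: true keeps the partial sequence as decoder state and emits nothing yet.
const firstPart = decoder.decode(euroBytes.subarray(0, 2), { stream: true });
// The final call omits the option, so the end of the stream is signalled.
const secondPart = decoder.decode(euroBytes.subarray(2));
const streamed = firstPart + secondPart;

// Without the fatal flag, a decoder error emits the fallback code point U+FFFD.
const lenient = new TextDecoder("utf-8").decode(new Uint8Array([0xff]));

// With the fatal flag, the same input throws instead.
let fatalError = null;
try {
  new TextDecoder("utf-8", { fatal: true }).decode(new Uint8Array([0xff]));
} catch (error) {
  // Assumption: shipped behavior (TypeError), not this draft's EncodingError.
  fatalError = error;
}
```
&lt;br /&gt;
The first call returns an empty string because the two bytes it receives do not yet form a complete code point; the state carried between calls is what lets the second call complete it.&lt;br /&gt;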
&lt;br /&gt;
:&#039;&#039;ISSUE: Need to handle BOMs. [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding] specifies this by using the caller-specified encoding as a suggestion, and consuming the BOM as a part of selecting the real encoder where BOM takes precedence. At the very least a matching BOM should be ignored. A mismatching BOM could throw, be a decoding error, or could actually switch the decoder for this stream (call or sequence of calls), possibly if-and-only-if the constructor was called without an encoding.&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/dl&amp;gt;&lt;br /&gt;
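&lt;br /&gt;
For comparison with the BOM issue noted above, the following sketch shows how the &amp;lt;code&amp;gt;TextDecoder&amp;lt;/code&amp;gt; that later shipped with the Encoding Standard treats a matching byte order mark. It assumes such a runtime; the &amp;lt;code&amp;gt;ignoreBOM&amp;lt;/code&amp;gt; option belongs to that later API and is not part of this draft. In that API a BOM never switches the decoder to a different encoding.&lt;br /&gt;
&lt;br /&gt;
```javascript
// UTF-8 BOM (EF BB BF) followed by the letter "A".
const withBom = new Uint8Array([0xef, 0xbb, 0xbf, 0x41]);

// Default: a BOM matching the decoder's encoding is consumed.
const stripped = new TextDecoder("utf-8").decode(withBom);

// ignoreBOM: true leaves it in the output as U+FEFF.
const kept = new TextDecoder("utf-8", { ignoreBOM: true }).decode(withBom);
```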
&lt;br /&gt;
== Examples ==&lt;br /&gt;
&lt;br /&gt;
=== Example #1 - encoding strings ===&lt;br /&gt;
&lt;br /&gt;
The following example uses the API to encode an array of strings into an ArrayBuffer. The buffer contains the number of strings (as a Uint32), followed by the length in bytes of the first string (as a Uint32), the encoded string data (UTF-8 unless another encoding is given), the length of the second string (as a Uint32), its string data, and so on.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
function encodeArrayOfStrings(strings, encoding) {&lt;br /&gt;
  var encoder, encoded, len, i, bytes, view, offset;&lt;br /&gt;
&lt;br /&gt;
  // Create one encoder and reuse it for every string.&lt;br /&gt;
  encoder = new TextEncoder(encoding);&lt;br /&gt;
  encoded = [];&lt;br /&gt;
&lt;br /&gt;
  // Total size: the string count, plus a length prefix and the bytes of each string.&lt;br /&gt;
  len = Uint32Array.BYTES_PER_ELEMENT;&lt;br /&gt;
  for (i = 0; i &amp;lt; strings.length; i += 1) {&lt;br /&gt;
    len += Uint32Array.BYTES_PER_ELEMENT;&lt;br /&gt;
    encoded[i] = encoder.encode(strings[i]);&lt;br /&gt;
    len += encoded[i].byteLength;&lt;br /&gt;
  }&lt;br /&gt;
&lt;br /&gt;
  bytes = new Uint8Array(len);&lt;br /&gt;
  view = new DataView(bytes.buffer);&lt;br /&gt;
  offset = 0;&lt;br /&gt;
&lt;br /&gt;
  view.setUint32(offset, strings.length);&lt;br /&gt;
  offset += Uint32Array.BYTES_PER_ELEMENT;&lt;br /&gt;
  for (i = 0; i &amp;lt; encoded.length; i += 1) {&lt;br /&gt;
    len = encoded[i].byteLength;&lt;br /&gt;
    view.setUint32(offset, len);&lt;br /&gt;
    offset += Uint32Array.BYTES_PER_ELEMENT;&lt;br /&gt;
    bytes.set(encoded[i], offset);&lt;br /&gt;
    offset += len;&lt;br /&gt;
  }&lt;br /&gt;
  return bytes.buffer;&lt;br /&gt;
}&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Example #2 - decoding strings ===&lt;br /&gt;
&lt;br /&gt;
The following example decodes an ArrayBuffer containing data encoded in the format produced by the previous example back into an array of strings.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
function decodeArrayOfStrings(buffer, encoding) {&lt;br /&gt;
  var decoder, view, offset, num_strings, strings, i, len;&lt;br /&gt;
&lt;br /&gt;
  decoder = new TextDecoder(encoding);&lt;br /&gt;
  view = new DataView(buffer);&lt;br /&gt;
  offset = 0;&lt;br /&gt;
  strings = [];&lt;br /&gt;
&lt;br /&gt;
  num_strings = view.getUint32(offset);&lt;br /&gt;
  offset += Uint32Array.BYTES_PER_ELEMENT;&lt;br /&gt;
  for (i = 0; i &amp;lt; num_strings; i += 1) {&lt;br /&gt;
    len = view.getUint32(offset);&lt;br /&gt;
    offset += Uint32Array.BYTES_PER_ELEMENT;&lt;br /&gt;
    strings[i] = decoder.decode(&lt;br /&gt;
      new DataView(view.buffer, offset, len));&lt;br /&gt;
    offset += len;&lt;br /&gt;
  }&lt;br /&gt;
  return strings;&lt;br /&gt;
}&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Encodings ==&lt;br /&gt;
&lt;br /&gt;
Encodings are defined and implemented per [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding]. This implicitly includes the steps to &amp;lt;em&amp;gt;get an encoding&amp;lt;/em&amp;gt; from a string, and the logic for label matching and case-insensitivity.&lt;br /&gt;
&lt;br /&gt;
User agents MUST NOT support any other encodings or labels than those defined in [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding], and MUST support all encodings and labels defined in that specification, with the additions defined below.&lt;br /&gt;
&lt;br /&gt;
:&#039;&#039;NOTE: In [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding], &amp;quot;ascii&amp;quot; is a label for &#039;&#039;&#039;windows-1252&#039;&#039;&#039;; there is no 7-bit-or-raise-exception encoding. Applications that are required to restrict the content of decoded strings should implement validation after decoding.&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
:&#039;&#039;NOTE: Unicode normalization forms are outside the scope of this specification. No normalization is done prior to encoding or after decoding.&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
:&#039;&#039;NOTE: Handling of encoding-specific issues, e.g. over-long UTF-8 encodings, byte order marks, unmatched surrogate pairs, and so on is defined by [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding].&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
== References ==&lt;br /&gt;
&lt;br /&gt;
* WebIDL http://dev.w3.org/2006/webapi/WebIDL&lt;br /&gt;
* Encoding http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== Acknowledgements ==&lt;br /&gt;
&lt;br /&gt;
* Alan Chaney&lt;br /&gt;
* Ben Noordhuis&lt;br /&gt;
* Glenn Maynard&lt;br /&gt;
* John Tamplin&lt;br /&gt;
* Kenneth Russell (Google, Inc)&lt;br /&gt;
* Robert Mustacchi&lt;br /&gt;
* Ryan Dahl&lt;br /&gt;
* Anne van Kesteren&lt;br /&gt;
&lt;br /&gt;
== Appendix ==&lt;br /&gt;
&lt;br /&gt;
A &amp;quot;shim&amp;quot; implementation in JavaScript (that may not fully match the current version of the spec) plus some initial unit tests can be found at:&lt;br /&gt;
&lt;br /&gt;
:http://code.google.com/p/stringencoding/&lt;br /&gt;
&lt;br /&gt;
[[Category:Proposals]]&lt;/div&gt;</summary>
		<author><name>Jsbell</name></author>
	</entry>
	<entry>
		<id>https://wiki.whatwg.org/index.php?title=StringEncoding&amp;diff=8317</id>
		<title>StringEncoding</title>
		<link rel="alternate" type="text/html" href="https://wiki.whatwg.org/index.php?title=StringEncoding&amp;diff=8317"/>
		<updated>2012-06-15T21:47:14Z</updated>

		<summary type="html">&lt;p&gt;Jsbell: /* Open Issues */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;Proposed Text Encoding Web API for Typed Arrays&lt;br /&gt;
&lt;br /&gt;
== Editors ==&lt;br /&gt;
&lt;br /&gt;
* Joshua Bell (Google, Inc)&lt;br /&gt;
&lt;br /&gt;
== Abstract ==&lt;br /&gt;
&lt;br /&gt;
This specification defines an API for encoding strings to binary data, and decoding strings from binary data.&lt;br /&gt;
&lt;br /&gt;
NOTE: This specification intentionally does not address the opposite scenario of encoding binary data as strings and decoding binary data from strings, for example using Base64 encoding.&lt;br /&gt;
&lt;br /&gt;
Discussion on this topic has so far taken place on the [http://www.khronos.org/webgl/public-mailing-list/ public_webgl@khronos.org] mailing list. See http://www.khronos.org/webgl/public-mailing-list/archives/1111/msg00017.html for the initial discussion thread.&lt;br /&gt;
&lt;br /&gt;
Discussion has since moved to the [http://www.whatwg.org/mailing-list#specs WHATWG spec discussion] mailing list. See http://lists.whatwg.org/htdig.cgi/whatwg-whatwg.org/2012-March/035038.html for the latest discussion thread.&lt;br /&gt;
&lt;br /&gt;
== Open Issues ==&lt;br /&gt;
&lt;br /&gt;
* Encoding errors&lt;br /&gt;
** Define options dict for &amp;lt;code&amp;gt;TextEncoder()&amp;lt;/code&amp;gt; that allows selecting between throw vs. replacement character / replacement callback?&lt;br /&gt;
* The [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding specification] defines the byte order mark as more authoritative than an explicit encoding label when decoding content. &lt;br /&gt;
** Is that desirable here? &lt;br /&gt;
** What does it mean to e.g. write &amp;lt;code&amp;gt;TextDecoder(&#039;iso-2022-kr&#039;).decode(str)&amp;lt;/code&amp;gt; if the input turns out to be UTF-8?&lt;br /&gt;
** What about streaming and other re-uses of the same decoder object?&lt;br /&gt;
* Streaming decode/encode requires keeping buffers between calls.&lt;br /&gt;
** Some encode/decode algorithms require adjusting the &amp;lt;var&amp;gt;code point pointer&amp;lt;/var&amp;gt; or &amp;lt;var&amp;gt;byte pointer&amp;lt;/var&amp;gt; by a negative amount. This could occur across &amp;quot;chunk&amp;quot; boundaries. This implies that when the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag is set on an encoder/decoder, the last N elements of the stream are saved for the next call and used as a prefix for the stream. N is defined by the specific encoding algorithm.&lt;br /&gt;
** &#039;&#039;This is more of a note to implementers than an issue, but it may merit further discussion.&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
== Resolved Issues ==&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dl&amp;gt;&lt;br /&gt;
&amp;lt;dt&amp;gt;Encode as many characters as possible into a fixed-size buffer for transmission, and repeat starting with next unencoded character&lt;br /&gt;
&amp;lt;dd&amp;gt;Resolution: not for &amp;quot;v1&amp;quot; - can be implemented using this API, and breaking the string down by &amp;quot;character&amp;quot; is unlikely to be as obvious as it sounds - surrogate pairs, combining sequences, etc&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dt&amp;gt;Support two versions of encode; one which takes target buffer, and one which creates/returns a right-sized buffer&lt;br /&gt;
&amp;lt;dd&amp;gt;Resolution: See above, and wait for developer feedback.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dt&amp;gt;Allow arbitrary end byte sequences (e.g. &amp;lt;code&amp;gt;0xFF&amp;lt;/code&amp;gt; for UTF-8 strings)&lt;br /&gt;
&amp;lt;dd&amp;gt;Resolution: Add &amp;lt;code&amp;gt;indexOf&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;ArrayBufferView&amp;lt;/code&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dt&amp;gt;Remove &#039;&#039;binary&#039;&#039; encoding?&lt;br /&gt;
&amp;lt;dd&amp;gt;The only real use is with &amp;lt;code&amp;gt;atob()&amp;lt;/code&amp;gt;/&amp;lt;code&amp;gt;btoa()&amp;lt;/code&amp;gt;; a better API would be Base64 directly in/out of Typed Arrays. Consensus on WHATWG is to add better APIs e.g. &lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
partial interface ArrayBufferView {&lt;br /&gt;
    DOMString toBase64();&lt;br /&gt;
};&lt;br /&gt;
&lt;br /&gt;
partial interface ArrayBuffer {&lt;br /&gt;
    static ArrayBuffer fromBase64(DOMString string);&lt;br /&gt;
};&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dt&amp;gt;Should this be a standalone API (as written), or live on e.g. DataView, or on e.g. String?&lt;br /&gt;
&amp;lt;dd&amp;gt;There seems to be pretty strong consensus that streaming, stateful coding is high priority and that an object-oriented API is cleanest, and shoe-horning those onto existing objects would be messy.&lt;br /&gt;
&amp;lt;/dl&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== API ==&lt;br /&gt;
&lt;br /&gt;
=== TextEncoder ===&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;WebIDL&#039;&#039;&#039;&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
dictionary TextEncodeOptions {&lt;br /&gt;
  boolean stream = false;&lt;br /&gt;
};&lt;br /&gt;
&lt;br /&gt;
[Constructor,&lt;br /&gt;
 Constructor(DOMString encoding)]&lt;br /&gt;
interface TextEncoder {&lt;br /&gt;
  readonly attribute DOMString encoding;&lt;br /&gt;
  ArrayBufferView encode(DOMString? string, optional TextEncodeOptions options);&lt;br /&gt;
};&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The constructor runs the following steps:&lt;br /&gt;
# If the constructor is called with no arguments, let &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt; be &amp;quot;&amp;lt;code&amp;gt;utf-8&amp;lt;/code&amp;gt;&amp;quot;. Otherwise, let &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt; be the value of the &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; argument.&lt;br /&gt;
# The constructor follows the &#039;&#039;&#039;steps to get an encoding&#039;&#039;&#039; from [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding], with &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt; as &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt;.&lt;br /&gt;
#* If the steps result in failure, throw an &amp;quot;&amp;lt;code&amp;gt;EncodingError&amp;lt;/code&amp;gt;&amp;quot; exception and terminate these steps.&lt;br /&gt;
#* Otherwise, set the encoder object&#039;s internal &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; property to the returned encoding.&lt;br /&gt;
# Initialize the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag of the encoder object to false. &lt;br /&gt;
# Initialize the internal &amp;lt;var&amp;gt;encoding algorithm state&amp;lt;/var&amp;gt; to the default values for the encoding &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dl&amp;gt;&lt;br /&gt;
&amp;lt;dt&amp;gt;&amp;lt;code&amp;gt;encoding&amp;lt;/code&amp;gt; of type DOMString, readonly&lt;br /&gt;
&amp;lt;dd&amp;gt;Returns the &amp;lt;var&amp;gt;Name&amp;lt;/var&amp;gt; of the encoder object&#039;s &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;, per [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding]. &lt;br /&gt;
&lt;br /&gt;
:Note that this may differ from the name of the encoding specified during the call to the constructor. For example, if the constructor is called with &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; of &amp;lt;code&amp;gt;&amp;quot;ascii&amp;quot;&amp;lt;/code&amp;gt; the &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; attribute of the &#039;&#039;encoder object&#039;&#039; would have the value &amp;lt;code&amp;gt;&amp;quot;windows-1252&amp;quot;&amp;lt;/code&amp;gt; as &amp;lt;code&amp;gt;&amp;quot;ascii&amp;quot;&amp;lt;/code&amp;gt; is a label for that encoding.&lt;br /&gt;
&amp;lt;/dl&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dl&amp;gt;&lt;br /&gt;
&amp;lt;dt&amp;gt;&amp;lt;code&amp;gt;encode&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;dd&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The &amp;lt;code&amp;gt;encode&amp;lt;/code&amp;gt; method runs these steps:&lt;br /&gt;
&lt;br /&gt;
# If the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag is not set, then reset the &amp;lt;var&amp;gt;encoding algorithm state&amp;lt;/var&amp;gt; to the default values for &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;. Otherwise, the &amp;lt;var&amp;gt;encoding algorithm state&amp;lt;/var&amp;gt; is re-used from the previous call to &amp;lt;code&amp;gt;encode&amp;lt;/code&amp;gt; on this object.&lt;br /&gt;
# If the &amp;lt;var&amp;gt;options&amp;lt;/var&amp;gt; parameter is specified and the &amp;lt;code&amp;gt;stream&amp;lt;/code&amp;gt; option is &#039;&#039;&#039;true&#039;&#039;&#039;, then the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag is set; otherwise the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag is cleared.&lt;br /&gt;
# Run the steps of the &amp;lt;var&amp;gt;encoding algorithm&amp;lt;/var&amp;gt;:&lt;br /&gt;
#* The input to the algorithm is a &amp;lt;var&amp;gt;stream of code points&amp;lt;/var&amp;gt;. The code units within the DOMString &amp;lt;var&amp;gt;string&amp;lt;/var&amp;gt; are interpreted as UTF-16 code units, to produce the &amp;lt;var&amp;gt;stream of code points&amp;lt;/var&amp;gt;; if &amp;lt;var&amp;gt;string&amp;lt;/var&amp;gt; is null, the &amp;lt;var&amp;gt;stream of code points&amp;lt;/var&amp;gt; is empty.&lt;br /&gt;
#*:&#039;&#039;ISSUE: Interpreting a DOMString as UTF-16 code units to yield a code point stream needs to be defined, including unpaired surrogates. [http://dev.w3.org/2006/webapi/WebIDL/#idl-DOMString WebIDL] only defines the reverse.&#039;&#039;&lt;br /&gt;
#* If the &amp;lt;var&amp;gt;options&amp;lt;/var&amp;gt; parameter is not specified or the &amp;lt;code&amp;gt;stream&amp;lt;/code&amp;gt; option is false, then after the final code point is yielded by the stream, the &#039;&#039;&#039;EOF code point&#039;&#039;&#039; is yielded.&lt;br /&gt;
#* The output of the algorithm is a &amp;lt;var&amp;gt;sequence of emitted bytes&amp;lt;/var&amp;gt;.&lt;br /&gt;
# Return a &amp;lt;code&amp;gt;Uint8Array&amp;lt;/code&amp;gt; object wrapping an &amp;lt;code&amp;gt;ArrayBuffer&amp;lt;/code&amp;gt; containing the &amp;lt;var&amp;gt;sequence of emitted bytes&amp;lt;/var&amp;gt; produced by the encoding algorithm.&lt;br /&gt;
&amp;lt;/dl&amp;gt;&lt;br /&gt;
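&lt;br /&gt;
The &amp;lt;code&amp;gt;encode&amp;lt;/code&amp;gt; steps above can be tried against the &amp;lt;code&amp;gt;TextEncoder&amp;lt;/code&amp;gt; that later shipped. This is a minimal sketch under the assumption of a runtime implementing the Encoding Standard, where the encoder is UTF-8 only, takes no &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; argument, and has no &amp;lt;code&amp;gt;stream&amp;lt;/code&amp;gt; option. It also shows how that API settled the unpaired-surrogate issue raised above: a lone surrogate is encoded as U+FFFD.&lt;br /&gt;
&lt;br /&gt;
```javascript
// Assumption: the shipped TextEncoder is UTF-8 only and takes no encoding argument.
const encoder = new TextEncoder();
const encoderName = encoder.encoding;

// U+20AC EURO SIGN encodes to the three bytes E2 82 AC.
const euro = Array.from(encoder.encode("\u20ac"));

// A lone surrogate is not a valid code point on its own;
// it is encoded as U+FFFD, i.e. the bytes EF BF BD.
const lone = Array.from(encoder.encode("\ud800"));

// An empty string yields an empty byte sequence.
const emptyLength = encoder.encode("").length;
```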
&lt;br /&gt;
=== TextDecoder ===&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;WebIDL&#039;&#039;&#039;&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
dictionary TextDecoderOptions {&lt;br /&gt;
  boolean fatal = false;&lt;br /&gt;
};&lt;br /&gt;
&lt;br /&gt;
dictionary TextDecodeOptions {&lt;br /&gt;
  boolean stream = false;&lt;br /&gt;
};&lt;br /&gt;
&lt;br /&gt;
[Constructor,&lt;br /&gt;
 Constructor(optional DOMString encoding, optional TextDecoderOptions options)]&lt;br /&gt;
interface TextDecoder {&lt;br /&gt;
  readonly attribute DOMString encoding;&lt;br /&gt;
  DOMString decode(optional ArrayBufferView view, optional TextDecodeOptions options);&lt;br /&gt;
};&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The constructor runs the following steps:&lt;br /&gt;
# The constructor creates a &#039;&#039;decoder object&#039;&#039;. It has the internal properties &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;, a &amp;lt;var&amp;gt;fatal&amp;lt;/var&amp;gt; flag which is initially unset, and a &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag which is initially unset.&lt;br /&gt;
# If called without an &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; argument, let &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt; be &amp;quot;&amp;lt;code&amp;gt;utf-8&amp;lt;/code&amp;gt;&amp;quot;. Otherwise, let &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt; be the value of the &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; argument.&lt;br /&gt;
# Run the &#039;&#039;&#039;steps to get an encoding&#039;&#039;&#039; from [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding], with &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt; as &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt;. &lt;br /&gt;
## If the steps result in failure, throw an &amp;quot;&amp;lt;code&amp;gt;EncodingError&amp;lt;/code&amp;gt;&amp;quot; exception and terminate these steps.&lt;br /&gt;
## Otherwise, set the &#039;&#039;decoder object&#039;s&#039;&#039; internal &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; property to the returned encoding.&lt;br /&gt;
# If the constructor is called with an &amp;lt;var&amp;gt;options&amp;lt;/var&amp;gt; argument, and the &amp;lt;var&amp;gt;fatal&amp;lt;/var&amp;gt; property of the dictionary is set, set the internal &amp;lt;var&amp;gt;fatal&amp;lt;/var&amp;gt; flag of the &#039;&#039;decoder object&#039;&#039;.&lt;br /&gt;
# Initialize the internal &amp;lt;var&amp;gt;encoding algorithm state&amp;lt;/var&amp;gt; to the default values for the encoding &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;.&lt;br /&gt;
# Return the &#039;&#039;decoder object&#039;&#039;.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dl&amp;gt;&lt;br /&gt;
&amp;lt;dt&amp;gt;&amp;lt;code&amp;gt;encoding&amp;lt;/code&amp;gt; of type DOMString, readonly&lt;br /&gt;
&amp;lt;dd&amp;gt;&lt;br /&gt;
Returns the &amp;lt;var&amp;gt;Name&amp;lt;/var&amp;gt; of the decoder object&#039;s &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;, per [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding].&lt;br /&gt;
&lt;br /&gt;
:Note that this may differ from the name of the encoding specified during the call to the constructor. For example, if the constructor is called with &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; of &amp;lt;code&amp;gt;&amp;quot;ascii&amp;quot;&amp;lt;/code&amp;gt; the &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; attribute of the &#039;&#039;decoder object&#039;&#039; would have the value &amp;lt;code&amp;gt;&amp;quot;windows-1252&amp;quot;&amp;lt;/code&amp;gt; as &amp;lt;code&amp;gt;&amp;quot;ascii&amp;quot;&amp;lt;/code&amp;gt; is a label for that encoding.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dt&amp;gt;&amp;lt;code&amp;gt;decode&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;dd&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The &amp;lt;code&amp;gt;decode&amp;lt;/code&amp;gt; method runs the following steps:&lt;br /&gt;
&lt;br /&gt;
# If the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag of the &#039;&#039;decoder object&#039;&#039; is not set, then reset the &amp;lt;var&amp;gt;encoding algorithm state&amp;lt;/var&amp;gt; to the default values for encoding &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;. Otherwise, the &amp;lt;var&amp;gt;encoding algorithm state&amp;lt;/var&amp;gt; is re-used from the previous call to &amp;lt;code&amp;gt;decode&amp;lt;/code&amp;gt; on this object. &lt;br /&gt;
# If the &amp;lt;var&amp;gt;options&amp;lt;/var&amp;gt; parameter is specified and the &amp;lt;var&amp;gt;stream&amp;lt;/var&amp;gt; option is &#039;&#039;&#039;true&#039;&#039;&#039;, then the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag is set; otherwise the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag is cleared.&lt;br /&gt;
# Run the decoder algorithm of the &#039;&#039;decoder object&#039;s&#039;&#039; encoding:&lt;br /&gt;
#* The input to the algorithm is a &amp;lt;var&amp;gt;byte stream&amp;lt;/var&amp;gt;. The &amp;lt;var&amp;gt;byte stream&amp;lt;/var&amp;gt; is provided by the bytes in &amp;lt;var&amp;gt;view.buffer&amp;lt;/var&amp;gt; starting at offset &amp;lt;var&amp;gt;view.byteOffset&amp;lt;/var&amp;gt;. A maximum of &amp;lt;var&amp;gt;view.byteLength&amp;lt;/var&amp;gt; bytes are yielded by the stream from &amp;lt;var&amp;gt;view.buffer&amp;lt;/var&amp;gt;. If &amp;lt;var&amp;gt;view&amp;lt;/var&amp;gt; is not specified, the stream is empty.&lt;br /&gt;
#* If the &amp;lt;var&amp;gt;options&amp;lt;/var&amp;gt; parameter is not specified or the &#039;&#039;&#039;stream&#039;&#039;&#039; option is &#039;&#039;&#039;false&#039;&#039;&#039;, then after &amp;lt;var&amp;gt;view.byteLength&amp;lt;/var&amp;gt; bytes are yielded by the stream the &#039;&#039;&#039;EOF byte&#039;&#039;&#039; is yielded.&lt;br /&gt;
#* If the internal &amp;lt;var&amp;gt;fatal&amp;lt;/var&amp;gt; flag of the &#039;&#039;decoder object&#039;&#039; is set, then a &#039;&#039;&#039;decoder error&#039;&#039;&#039; causes a &amp;lt;code&amp;gt;DOMException&amp;lt;/code&amp;gt; of type &amp;lt;code&amp;gt;EncodingError&amp;lt;/code&amp;gt; to be thrown rather than emitting a fallback code point.&lt;br /&gt;
#* The output of the algorithm is a sequence of &amp;lt;var&amp;gt;emitted code points&amp;lt;/var&amp;gt;.&lt;br /&gt;
# Return a &amp;lt;code&amp;gt;DOMString&amp;lt;/code&amp;gt; by encoding the &amp;lt;var&amp;gt;sequence of emitted code points&amp;lt;/var&amp;gt; as UTF-16 as per [http://dev.w3.org/2006/webapi/WebIDL/#idl-DOMString WebIDL].&lt;br /&gt;
&lt;br /&gt;
:&#039;&#039;ISSUE: Need to handle BOMs. [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding] specifies this by using the caller-specified encoding as a suggestion, and consuming the BOM as a part of selecting the real encoder where BOM takes precedence. At the very least a matching BOM should be ignored. A mismatching BOM could throw, be a decoding error, or could actually switch the decoder for this stream (call or sequence of calls), possibly if-and-only-if the constructor was called without an encoding.&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/dl&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Examples ==&lt;br /&gt;
&lt;br /&gt;
=== Example #1 - encoding strings ===&lt;br /&gt;
&lt;br /&gt;
The following example uses the API to encode an array of strings into an ArrayBuffer. The buffer contains the number of strings (as a Uint32), followed by the length in bytes of the first string (as a Uint32), the encoded string data (UTF-8 unless another encoding is given), the length of the second string (as a Uint32), its string data, and so on.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
function encodeArrayOfStrings(strings, encoding) {&lt;br /&gt;
  var encoder, encoded, len, i, bytes, view, offset;&lt;br /&gt;
&lt;br /&gt;
  // Create one encoder and reuse it for every string.&lt;br /&gt;
  encoder = new TextEncoder(encoding);&lt;br /&gt;
  encoded = [];&lt;br /&gt;
&lt;br /&gt;
  // Total size: the string count, plus a length prefix and the bytes of each string.&lt;br /&gt;
  len = Uint32Array.BYTES_PER_ELEMENT;&lt;br /&gt;
  for (i = 0; i &amp;lt; strings.length; i += 1) {&lt;br /&gt;
    len += Uint32Array.BYTES_PER_ELEMENT;&lt;br /&gt;
    encoded[i] = encoder.encode(strings[i]);&lt;br /&gt;
    len += encoded[i].byteLength;&lt;br /&gt;
  }&lt;br /&gt;
&lt;br /&gt;
  bytes = new Uint8Array(len);&lt;br /&gt;
  view = new DataView(bytes.buffer);&lt;br /&gt;
  offset = 0;&lt;br /&gt;
&lt;br /&gt;
  view.setUint32(offset, strings.length);&lt;br /&gt;
  offset += Uint32Array.BYTES_PER_ELEMENT;&lt;br /&gt;
  for (i = 0; i &amp;lt; encoded.length; i += 1) {&lt;br /&gt;
    len = encoded[i].byteLength;&lt;br /&gt;
    view.setUint32(offset, len);&lt;br /&gt;
    offset += Uint32Array.BYTES_PER_ELEMENT;&lt;br /&gt;
    bytes.set(encoded[i], offset);&lt;br /&gt;
    offset += len;&lt;br /&gt;
  }&lt;br /&gt;
  return bytes.buffer;&lt;br /&gt;
}&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Example #2 - decoding strings ===&lt;br /&gt;
&lt;br /&gt;
The following example decodes an ArrayBuffer containing data encoded in the format produced by the previous example back into an array of strings.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
function decodeArrayOfStrings(buffer, encoding) {&lt;br /&gt;
  var decoder, view, offset, num_strings, strings, i, len;&lt;br /&gt;
&lt;br /&gt;
  decoder = new TextDecoder(encoding);&lt;br /&gt;
  view = new DataView(buffer);&lt;br /&gt;
  offset = 0;&lt;br /&gt;
  strings = [];&lt;br /&gt;
&lt;br /&gt;
  num_strings = view.getUint32(offset);&lt;br /&gt;
  offset += Uint32Array.BYTES_PER_ELEMENT;&lt;br /&gt;
  for (i = 0; i &amp;lt; num_strings; i += 1) {&lt;br /&gt;
    len = view.getUint32(offset);&lt;br /&gt;
    offset += Uint32Array.BYTES_PER_ELEMENT;&lt;br /&gt;
    strings[i] = decoder.decode(&lt;br /&gt;
      new DataView(view.buffer, offset, len));&lt;br /&gt;
    offset += len;&lt;br /&gt;
  }&lt;br /&gt;
  return strings;&lt;br /&gt;
}&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Encodings ==&lt;br /&gt;
&lt;br /&gt;
Encodings are defined and implemented per [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding]. This implicitly includes the steps to &amp;lt;em&amp;gt;get an encoding&amp;lt;/em&amp;gt; from a string, and the logic for label matching and case-insensitivity.&lt;br /&gt;
&lt;br /&gt;
User agents MUST NOT support any other encodings or labels than those defined in [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding], and MUST support all encodings and labels defined in that specification, with the additions defined below.&lt;br /&gt;
&lt;br /&gt;
:&#039;&#039;NOTE: In [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding], &amp;quot;ascii&amp;quot; is a label for &#039;&#039;&#039;windows-1252&#039;&#039;&#039;; there is no 7-bit-or-raise-exception encoding. Applications that are required to restrict the content of decoded strings should implement validation after decoding.&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
:&#039;&#039;NOTE: Unicode normalization forms are outside the scope of this specification. No normalization is done prior to encoding or after decoding.&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
:&#039;&#039;NOTE: Handling of encoding-specific issues, e.g. over-long UTF-8 encodings, byte order marks, unmatched surrogate pairs, and so on is defined by [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding].&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
== References ==&lt;br /&gt;
&lt;br /&gt;
* WebIDL http://dev.w3.org/2006/webapi/WebIDL&lt;br /&gt;
* Encoding http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== Acknowledgements ==&lt;br /&gt;
&lt;br /&gt;
* Alan Chaney&lt;br /&gt;
* Ben Noordhuis&lt;br /&gt;
* Glenn Maynard&lt;br /&gt;
* John Tamplin&lt;br /&gt;
* Kenneth Russell (Google, Inc)&lt;br /&gt;
* Robert Mustacchi&lt;br /&gt;
* Ryan Dahl&lt;br /&gt;
* Anne van Kesteren&lt;br /&gt;
&lt;br /&gt;
== Appendix ==&lt;br /&gt;
&lt;br /&gt;
A &amp;quot;shim&amp;quot; implementation in JavaScript (that may not fully match the current version of the spec) plus some initial unit tests can be found at:&lt;br /&gt;
&lt;br /&gt;
:http://code.google.com/p/stringencoding/&lt;br /&gt;
&lt;br /&gt;
[[Category:Proposals]]&lt;/div&gt;</summary>
		<author><name>Jsbell</name></author>
	</entry>
	<entry>
		<id>https://wiki.whatwg.org/index.php?title=StringEncoding&amp;diff=8316</id>
		<title>StringEncoding</title>
		<link rel="alternate" type="text/html" href="https://wiki.whatwg.org/index.php?title=StringEncoding&amp;diff=8316"/>
		<updated>2012-06-15T21:46:57Z</updated>

		<summary type="html">&lt;p&gt;Jsbell: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;Proposed Text Encoding Web API for Typed Arrays&lt;br /&gt;
&lt;br /&gt;
== Editors ==&lt;br /&gt;
&lt;br /&gt;
* Joshua Bell (Google, Inc)&lt;br /&gt;
&lt;br /&gt;
== Abstract ==&lt;br /&gt;
&lt;br /&gt;
This specification defines an API for encoding strings to binary data, and decoding strings from binary data.&lt;br /&gt;
&lt;br /&gt;
NOTE: This specification intentionally does not address the opposite scenario of encoding binary data as strings and decoding binary data from strings, for example using Base64 encoding.&lt;br /&gt;
&lt;br /&gt;
Discussion on this topic has so far taken place on the [http://www.khronos.org/webgl/public-mailing-list/ public_webgl@khronos.org] mailing list. See http://www.khronos.org/webgl/public-mailing-list/archives/1111/msg00017.html for the initial discussion thread.&lt;br /&gt;
&lt;br /&gt;
Discussion has since moved to the [http://www.whatwg.org/mailing-list#specs WHATWG spec discussion] mailing list. See  http://lists.whatwg.org/htdig.cgi/whatwg-whatwg.org/2012-March/035038.html for the latest discussion thread.&lt;br /&gt;
&lt;br /&gt;
== Open Issues ==&lt;br /&gt;
&lt;br /&gt;
* Encoding errors&lt;br /&gt;
** Define options dict for &amp;lt;code&amp;gt;TextEncoder()&amp;lt;/code&amp;gt; that allows selecting between throw vs. replacement character / replacement callback?&lt;br /&gt;
* The [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding specification] defines the byte order mark as more authoritative than an explicit encoding label when decoding content. &lt;br /&gt;
** Is that desirable here? &lt;br /&gt;
** What does it mean to e.g. write TextDecoder(&#039;iso-2022-kr&#039;).decode(str) if string turns out to be UTF-8?&lt;br /&gt;
** What about streaming and other re-uses of the same decoder object?&lt;br /&gt;
* Streaming decode/encode requires keeping buffers between calls.&lt;br /&gt;
** Some encode/decode algorithms require adjusting the &amp;lt;var&amp;gt;code point pointer&amp;lt;/var&amp;gt; or &amp;lt;var&amp;gt;byte pointer&amp;lt;/var&amp;gt; by a negative amount. This could occur across &amp;quot;chunk&amp;quot; boundaries. This implies that when the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag is set on an encoder/decoder that the last N elements of the stream are saved for the next call and used as a prefix for the stream. N is defined by the specific encoding algorithm.&lt;br /&gt;
** &#039;&#039;This is more of a note to implementers than an issue, but it may merit further discussion.&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
== Resolved Issues ==&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dl&amp;gt;&lt;br /&gt;
&amp;lt;dt&amp;gt;Encode as many characters as possible into a fixed-size buffer for transmission, and repeat starting with next unencoded character&lt;br /&gt;
&amp;lt;dd&amp;gt;Resolution: not for &amp;quot;v1&amp;quot; - can be implemented using this API, and breaking the string down by &amp;quot;character&amp;quot; is unlikely to be as obvious as it sounds - surrogate pairs, combining sequences, etc&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dt&amp;gt;Support two versions of encode; one which takes target buffer, and one which creates/returns a right-sized buffer&lt;br /&gt;
&amp;lt;dd&amp;gt;Resolution: See above, and wait for developer feedback.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dt&amp;gt;Allow arbitrary end byte sequences (e.g. &amp;lt;code&amp;gt;0xFF&amp;lt;/code&amp;gt; for UTF-8 strings)&lt;br /&gt;
&amp;lt;dd&amp;gt;Resolution: Add &amp;lt;code&amp;gt;indexOf&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;ArrayBufferView&amp;lt;/code&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dt&amp;gt;Remove &#039;&#039;binary&#039;&#039; encoding?&lt;br /&gt;
&amp;lt;dd&amp;gt;The only real use is with &amp;lt;code&amp;gt;atob()&amp;lt;/code&amp;gt;/&amp;lt;code&amp;gt;btoa()&amp;lt;/code&amp;gt;; a better API would be Base64 directly in/out of Typed Arrays. Consensus on WHATWG is to add better APIs e.g. &lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
partial interface ArrayBufferView {&lt;br /&gt;
    DOMString toBase64();&lt;br /&gt;
};&lt;br /&gt;
&lt;br /&gt;
partial interface ArrayBuffer {&lt;br /&gt;
    static ArrayBuffer fromBase64(DOMString string);&lt;br /&gt;
};&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dt&amp;gt;Should this be a standalone API (as written), or live on e.g. DataView, or on e.g. String?&lt;br /&gt;
&amp;lt;dd&amp;gt;There seems to be pretty strong consensus that streaming, stateful coding is high priority and that an object-oriented API is cleanest, and shoe-horning those onto existing objects would be messy.&lt;br /&gt;
&amp;lt;/dl&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== API ==&lt;br /&gt;
&lt;br /&gt;
=== TextEncoder ===&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;WebIDL&#039;&#039;&#039;&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
dictionary TextEncodeOptions {&lt;br /&gt;
  boolean stream = false;&lt;br /&gt;
};&lt;br /&gt;
&lt;br /&gt;
[Constructor,&lt;br /&gt;
 Constructor(DOMString encoding)]&lt;br /&gt;
interface TextEncoder {&lt;br /&gt;
  readonly attribute DOMString encoding;&lt;br /&gt;
  ArrayBufferView encode(DOMString? string, optional TextEncodeOptions options);&lt;br /&gt;
};&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The constructor runs the following steps:&lt;br /&gt;
# If the constructor is called with no arguments, let &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt; be &amp;quot;&amp;lt;code&amp;gt;utf-8&amp;lt;/code&amp;gt;&amp;quot;. Otherwise, let &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt; be the value of the &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; argument.&lt;br /&gt;
# The constructor follows the &#039;&#039;&#039;steps to get an encoding&#039;&#039;&#039; from [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding], with &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt; as &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt;.&lt;br /&gt;
#* If the steps result in failure, throw an &amp;quot;&amp;lt;code&amp;gt;EncodingError&amp;lt;/code&amp;gt;&amp;quot; exception and terminate these steps.&lt;br /&gt;
#* Otherwise, set the encoder object&#039;s internal &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; property to the returned encoding.&lt;br /&gt;
# Initialize the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag of the encoder object to false. &lt;br /&gt;
# Initialize the internal &amp;lt;var&amp;gt;encoding algorithm state&amp;lt;/var&amp;gt; to the default values for the encoding &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dl&amp;gt;&lt;br /&gt;
&amp;lt;dt&amp;gt;&amp;lt;code&amp;gt;encoding&amp;lt;/code&amp;gt; of type DOMString, readonly&lt;br /&gt;
&amp;lt;dd&amp;gt;Returns the &amp;lt;var&amp;gt;Name&amp;lt;/var&amp;gt; of the encoder object&#039;s &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;, per [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding]. &lt;br /&gt;
&lt;br /&gt;
:Note that this may differ from the name of the encoding specified during the call to the constructor. For example, if the constructor is called with &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; of &amp;lt;code&amp;gt;&amp;quot;ascii&amp;quot;&amp;lt;/code&amp;gt; the &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; attribute of the &#039;&#039;encoder object&#039;&#039; would have the value &amp;lt;code&amp;gt;&amp;quot;windows-1252&amp;quot;&amp;lt;/code&amp;gt; as &amp;lt;code&amp;gt;&amp;quot;ascii&amp;quot;&amp;lt;/code&amp;gt; is a label for that encoding.&lt;br /&gt;
&amp;lt;/dl&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dl&amp;gt;&lt;br /&gt;
&amp;lt;dt&amp;gt;&amp;lt;code&amp;gt;encode&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;dd&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The &amp;lt;code&amp;gt;encode&amp;lt;/code&amp;gt; method runs these steps:&lt;br /&gt;
&lt;br /&gt;
# If the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag is not set, then reset the &amp;lt;var&amp;gt;encoding algorithm state&amp;lt;/var&amp;gt; to the default values for &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;. Otherwise, the &amp;lt;var&amp;gt;encoding algorithm state&amp;lt;/var&amp;gt; is re-used from the previous call to &amp;lt;code&amp;gt;encode&amp;lt;/code&amp;gt; on this object.&lt;br /&gt;
# If the &amp;lt;var&amp;gt;options&amp;lt;/var&amp;gt; parameter is specified and the &amp;lt;code&amp;gt;stream&amp;lt;/code&amp;gt; option is &#039;&#039;&#039;true&#039;&#039;&#039;, then the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag is set; otherwise the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag is cleared.&lt;br /&gt;
# Run the steps of the &amp;lt;var&amp;gt;encoding algorithm&amp;lt;/var&amp;gt;:&lt;br /&gt;
#* The input to the algorithm is a &amp;lt;var&amp;gt;stream of code points&amp;lt;/var&amp;gt;. The code units within the DOMString &amp;lt;var&amp;gt;string&amp;lt;/var&amp;gt; are interpreted as UTF-16 code units, to produce the &amp;lt;var&amp;gt;stream of code points&amp;lt;/var&amp;gt;; if &amp;lt;var&amp;gt;string&amp;lt;/var&amp;gt; is null, the &amp;lt;var&amp;gt;stream of code points&amp;lt;/var&amp;gt; is empty.&lt;br /&gt;
#*:&#039;&#039;ISSUE: Interpreting a DOMString as UTF-16 code units to yield a code point stream needs to be defined, including unpaired surrogates. [http://dev.w3.org/2006/webapi/WebIDL/#idl-DOMString WebIDL] only defines the reverse.&#039;&#039;&lt;br /&gt;
#* If the &amp;lt;var&amp;gt;options&amp;lt;/var&amp;gt; parameter is not specified or the &amp;lt;code&amp;gt;stream&amp;lt;/code&amp;gt; option is false, then the &#039;&#039;&#039;EOF code point&#039;&#039;&#039; is yielded after the final code point is yielded by the stream.&lt;br /&gt;
#* The output of the algorithm is a &amp;lt;var&amp;gt;sequence of emitted bytes&amp;lt;/var&amp;gt;.&lt;br /&gt;
# Return a &amp;lt;code&amp;gt;Uint8Array&amp;lt;/code&amp;gt; object wrapping an &amp;lt;code&amp;gt;ArrayBuffer&amp;lt;/code&amp;gt; containing the &amp;lt;var&amp;gt;sequence of emitted bytes&amp;lt;/var&amp;gt; produced by the encoding algorithm.&lt;br /&gt;
&amp;lt;/dl&amp;gt;&lt;br /&gt;
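&lt;br /&gt;
The encode steps above can be sketched as follows. This is a minimal illustration, not part of the proposal: it assumes the TextEncoder that later shipped in browsers and Node.js, which supports only UTF-8 and takes neither an encoding argument nor a stream option. The handling of the lone surrogate shown here is the shipped behavior, not something this draft defines.&lt;br /&gt;

```javascript
// Assumption: the shipped TextEncoder (UTF-8 only), not the
// multi-encoding constructor described in this draft.
const encoder = new TextEncoder();

// U+00E9 needs two bytes in UTF-8, so five code points yield six bytes.
const bytes = encoder.encode("h\u00e9llo");
console.log(encoder.encoding); // "utf-8"
console.log(bytes instanceof Uint8Array, bytes.length);

// A lone surrogate is replaced by U+FFFD (bytes EF BF BD) in the
// shipped API before encoding.
const lone = encoder.encode("\ud800");
console.log(Array.from(lone));
```

The returned object is a Uint8Array view, so the underlying ArrayBuffer is available as bytes.buffer.&lt;br /&gt;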
&lt;br /&gt;
=== TextDecoder ===&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;WebIDL&#039;&#039;&#039;&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
dictionary TextDecoderOptions {&lt;br /&gt;
  boolean fatal = false;&lt;br /&gt;
};&lt;br /&gt;
&lt;br /&gt;
dictionary TextDecodeOptions {&lt;br /&gt;
  boolean stream = false;&lt;br /&gt;
};&lt;br /&gt;
&lt;br /&gt;
[Constructor,&lt;br /&gt;
 Constructor(optional DOMString encoding, optional TextDecoderOptions options)]&lt;br /&gt;
interface TextDecoder {&lt;br /&gt;
  readonly attribute DOMString encoding;&lt;br /&gt;
  DOMString decode(optional ArrayBufferView view, optional TextDecodeOptions options);&lt;br /&gt;
};&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The constructor runs the following steps:&lt;br /&gt;
# The constructor creates a &#039;&#039;decoder object&#039;&#039;. It has the internal properties &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;, a &amp;lt;var&amp;gt;fatal&amp;lt;/var&amp;gt; flag which is initially unset, and a &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag which is initially unset.&lt;br /&gt;
# If called without an &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; argument, let &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt; be &amp;quot;&amp;lt;code&amp;gt;utf-8&amp;lt;/code&amp;gt;&amp;quot;. Otherwise, let &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt; be the value of the &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; argument.&lt;br /&gt;
# Run the &#039;&#039;&#039;steps to get an encoding&#039;&#039;&#039; from [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding], with &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt; as &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt;. &lt;br /&gt;
## If the steps result in failure, throw an &amp;quot;&amp;lt;code&amp;gt;EncodingError&amp;lt;/code&amp;gt;&amp;quot; exception and terminate these steps.&lt;br /&gt;
## Otherwise, set the &#039;&#039;decoder object&#039;s&#039;&#039; internal &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; property to the returned encoding.&lt;br /&gt;
# If the constructor is called with an &amp;lt;var&amp;gt;options&amp;lt;/var&amp;gt; argument, and the &amp;lt;var&amp;gt;fatal&amp;lt;/var&amp;gt; property of the dictionary is set, set the internal &amp;lt;var&amp;gt;fatal&amp;lt;/var&amp;gt; flag of the &#039;&#039;decoder object&#039;&#039;.&lt;br /&gt;
# Initialize the internal &amp;lt;var&amp;gt;encoding algorithm state&amp;lt;/var&amp;gt; to the default values for the encoding &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;.&lt;br /&gt;
# Return the &#039;&#039;decoder object&#039;&#039;.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dl&amp;gt;&lt;br /&gt;
&amp;lt;dt&amp;gt;&amp;lt;code&amp;gt;encoding&amp;lt;/code&amp;gt; of type DOMString, readonly&lt;br /&gt;
&amp;lt;dd&amp;gt;&lt;br /&gt;
Returns the &amp;lt;var&amp;gt;Name&amp;lt;/var&amp;gt; of the decoder object&#039;s &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;, per [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding].&lt;br /&gt;
&lt;br /&gt;
:Note that this may differ from the name of the encoding specified during the call to the constructor. For example, if the constructor is called with &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; of &amp;lt;code&amp;gt;&amp;quot;ascii&amp;quot;&amp;lt;/code&amp;gt; the &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; attribute of the &#039;&#039;decoder object&#039;&#039; would have the value &amp;lt;code&amp;gt;&amp;quot;windows-1252&amp;quot;&amp;lt;/code&amp;gt; as &amp;lt;code&amp;gt;&amp;quot;ascii&amp;quot;&amp;lt;/code&amp;gt; is a label for that encoding.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dt&amp;gt;&amp;lt;code&amp;gt;decode&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;dd&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The &amp;lt;code&amp;gt;decode&amp;lt;/code&amp;gt; method runs the following steps:&lt;br /&gt;
&lt;br /&gt;
# If the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag of the &#039;&#039;decoder object&#039;&#039; is not set, then reset the &amp;lt;var&amp;gt;encoding algorithm state&amp;lt;/var&amp;gt; to the default values for encoding &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;. Otherwise, the &amp;lt;var&amp;gt;encoding algorithm state&amp;lt;/var&amp;gt; is re-used from the previous call to &amp;lt;code&amp;gt;decode&amp;lt;/code&amp;gt; on this object. &lt;br /&gt;
# If the &amp;lt;var&amp;gt;options&amp;lt;/var&amp;gt; parameter is specified and the &amp;lt;var&amp;gt;stream&amp;lt;/var&amp;gt; option is &#039;&#039;&#039;true&#039;&#039;&#039;, then the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag is set; otherwise the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag is cleared.&lt;br /&gt;
# Run the decoder algorithm of the &#039;&#039;decoder object&#039;s&#039;&#039; encoding:&lt;br /&gt;
#* The input to the algorithm is a &amp;lt;var&amp;gt;byte stream&amp;lt;/var&amp;gt;. The &amp;lt;var&amp;gt;byte stream&amp;lt;/var&amp;gt; is provided by the bytes in &amp;lt;var&amp;gt;view.buffer&amp;lt;/var&amp;gt; starting at offset &amp;lt;var&amp;gt;view.byteOffset&amp;lt;/var&amp;gt;. A maximum of &amp;lt;var&amp;gt;view.byteLength&amp;lt;/var&amp;gt; bytes are yielded by the stream from &amp;lt;var&amp;gt;view.buffer&amp;lt;/var&amp;gt;. If &amp;lt;var&amp;gt;view&amp;lt;/var&amp;gt; is not specified, the stream is empty.&lt;br /&gt;
#* If the &amp;lt;var&amp;gt;options&amp;lt;/var&amp;gt; parameter is not specified or the &#039;&#039;&#039;stream&#039;&#039;&#039; option is &#039;&#039;&#039;false&#039;&#039;&#039;, then the &#039;&#039;&#039;EOF byte&#039;&#039;&#039; is yielded after &amp;lt;var&amp;gt;view.byteLength&amp;lt;/var&amp;gt; bytes are yielded by the stream.&lt;br /&gt;
#* If the internal &amp;lt;var&amp;gt;fatal&amp;lt;/var&amp;gt; flag of the &#039;&#039;decoder object&#039;&#039; is set, then a &#039;&#039;&#039;decoder error&#039;&#039;&#039; causes a &amp;lt;code&amp;gt;DOMException&amp;lt;/code&amp;gt; of type &amp;lt;code&amp;gt;EncodingError&amp;lt;/code&amp;gt; to be thrown rather than emitting a fallback code point.&lt;br /&gt;
#* The output of the algorithm is a sequence of &amp;lt;var&amp;gt;emitted code points&amp;lt;/var&amp;gt;.&lt;br /&gt;
# Return a &amp;lt;code&amp;gt;DOMString&amp;lt;/code&amp;gt; by encoding the &amp;lt;var&amp;gt;sequence of emitted code points&amp;lt;/var&amp;gt; as UTF-16 as per [http://dev.w3.org/2006/webapi/WebIDL/#idl-DOMString WebIDL].&lt;br /&gt;
&lt;br /&gt;
:&#039;&#039;ISSUE: Need to handle BOMs. [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding] specifies this by using the caller-specified encoding as a suggestion, and consuming the BOM as a part of selecting the real encoder where BOM takes precedence. At the very least a matching BOM should be ignored. A mismatching BOM could throw, be a decoding error, or could actually switch the decoder for this stream (call or sequence of calls), possibly if-and-only-if the constructor was called without an encoding.&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/dl&amp;gt;&lt;br /&gt;
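&lt;br /&gt;
The streaming and fatal behaviors described above can be sketched as follows. This is an illustration under stated assumptions rather than the draft&#039;s definitive behavior: it uses the TextDecoder that later shipped in browsers and Node.js, where a decoder error with the fatal flag set throws a TypeError instead of the EncodingError named in this draft.&lt;br /&gt;

```javascript
// The euro sign U+20AC is E2 82 AC in UTF-8; split it across two chunks.
const chunk1 = new Uint8Array([0xe2, 0x82]);
const chunk2 = new Uint8Array([0xac]);

const decoder = new TextDecoder("utf-8");
// With stream set, the incomplete sequence is held until the next call.
let text = decoder.decode(chunk1, { stream: true });
text += decoder.decode(chunk2, { stream: true });
// No view and no stream option: the stream is empty and EOF is yielded.
text += decoder.decode();
console.log(text === "\u20ac"); // true

// With the fatal flag set, malformed input throws instead of
// emitting the fallback code point U+FFFD.
const strict = new TextDecoder("utf-8", { fatal: true });
let threw = false;
try {
  strict.decode(new Uint8Array([0xff]));
} catch (e) {
  threw = true;
}
console.log(threw); // true
```

Ending the sequence with an argument-free decode() call is one way to make sure any buffered bytes are flushed before the decoder is reused.&lt;br /&gt;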
&lt;br /&gt;
== Examples ==&lt;br /&gt;
&lt;br /&gt;
=== Example #1 - encoding strings ===&lt;br /&gt;
&lt;br /&gt;
The following example uses the API to encode an array of strings into an ArrayBuffer. The resulting buffer contains the number of strings (as a Uint32), followed by the length of the first string (as a Uint32), the UTF-8 encoded string data, the length of the second string (as a Uint32), the string data, and so on.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
function encodeArrayOfStrings(strings, encoding) {&lt;br /&gt;
  var encoder, encoded, len, i, bytes, view, offset;&lt;br /&gt;
&lt;br /&gt;
  encoder = new TextEncoder(encoding);&lt;br /&gt;
  encoded = [];&lt;br /&gt;
&lt;br /&gt;
  len = Uint32Array.BYTES_PER_ELEMENT;&lt;br /&gt;
  for (i = 0; i &amp;lt; strings.length; i += 1) {&lt;br /&gt;
    len += Uint32Array.BYTES_PER_ELEMENT;&lt;br /&gt;
    encoded[i] = encoder.encode(strings[i]);&lt;br /&gt;
    len += encoded[i].byteLength;&lt;br /&gt;
  }&lt;br /&gt;
&lt;br /&gt;
  bytes = new Uint8Array(len);&lt;br /&gt;
  view = new DataView(bytes.buffer);&lt;br /&gt;
  offset = 0;&lt;br /&gt;
&lt;br /&gt;
  view.setUint32(offset, strings.length);&lt;br /&gt;
  offset += Uint32Array.BYTES_PER_ELEMENT;&lt;br /&gt;
  for (i = 0; i &amp;lt; encoded.length; i += 1) {&lt;br /&gt;
    len = encoded[i].byteLength;&lt;br /&gt;
    view.setUint32(offset, len);&lt;br /&gt;
    offset += Uint32Array.BYTES_PER_ELEMENT;&lt;br /&gt;
    bytes.set(encoded[i], offset);&lt;br /&gt;
    offset += len;&lt;br /&gt;
  }&lt;br /&gt;
  return bytes.buffer;&lt;br /&gt;
}&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Example #2 - decoding strings ===&lt;br /&gt;
&lt;br /&gt;
The following example decodes an ArrayBuffer containing data encoded in the format produced by the previous example back into an array of strings.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
function decodeArrayOfStrings(buffer, encoding) {&lt;br /&gt;
  var decoder, view, offset, num_strings, strings, i, len;&lt;br /&gt;
&lt;br /&gt;
  decoder = new TextDecoder(encoding);&lt;br /&gt;
  view = new DataView(buffer);&lt;br /&gt;
  offset = 0;&lt;br /&gt;
  strings = [];&lt;br /&gt;
&lt;br /&gt;
  num_strings = view.getUint32(offset);&lt;br /&gt;
  offset += Uint32Array.BYTES_PER_ELEMENT;&lt;br /&gt;
  for (i = 0; i &amp;lt; num_strings; i += 1) {&lt;br /&gt;
    len = view.getUint32(offset);&lt;br /&gt;
    offset += Uint32Array.BYTES_PER_ELEMENT;&lt;br /&gt;
    strings[i] = decoder.decode(&lt;br /&gt;
      new DataView(view.buffer, offset, len));&lt;br /&gt;
    offset += len;&lt;br /&gt;
  }&lt;br /&gt;
  return strings;&lt;br /&gt;
}&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Encodings ==&lt;br /&gt;
&lt;br /&gt;
Encodings are defined and implemented per [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding]. This implicitly includes the steps to &amp;lt;em&amp;gt;get an encoding&amp;lt;/em&amp;gt; from a string, and the logic for label matching and case-insensitivity.&lt;br /&gt;
&lt;br /&gt;
User agents MUST NOT support any other encodings or labels than those defined in [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding], and MUST support all encodings and labels defined in that specification, with the additions defined below.&lt;br /&gt;
&lt;br /&gt;
:&#039;&#039;NOTE: In [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding], &amp;quot;ascii&amp;quot; is a label for &#039;&#039;&#039;windows-1252&#039;&#039;&#039;; there is no 7-bit-or-raise-exception encoding. Applications that are required to restrict the content of decoded strings should implement validation after decoding.&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
:&#039;&#039;NOTE: Unicode normalization forms are outside the scope of this specification. No normalization is done prior to encoding or after decoding.&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
:&#039;&#039;NOTE: Handling of encoding-specific issues, e.g. over-long UTF-8 encodings, byte order marks, unmatched surrogate pairs, and so on is defined by [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding].&#039;&#039;&lt;br /&gt;
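&lt;br /&gt;
The note above about the &amp;quot;ascii&amp;quot; label can be sketched as follows. This is a minimal illustration that assumes the TextDecoder later shipped in browsers and in Node.js builds with full ICU; isSevenBit is a hypothetical helper written for this example and is not part of any API.&lt;br /&gt;

```javascript
// "ascii" is only a label; the decoder reports the name of the encoding.
const decoder = new TextDecoder("ascii");
console.log(decoder.encoding); // "windows-1252"

// Hypothetical helper: validation applied after decoding, since no
// 7-bit-or-raise-exception encoding exists.
function isSevenBit(str) {
  return /^[\x00-\x7f]*$/.test(str);
}

// Byte 0x80 decodes to U+20AC under windows-1252, so validation fails.
const decoded = decoder.decode(new Uint8Array([0x68, 0x69, 0x80]));
console.log(isSevenBit(decoded)); // false
console.log(isSevenBit("hi")); // true
```

An application that must reject such input can throw or discard the string when the check returns false.&lt;br /&gt;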
&lt;br /&gt;
== References ==&lt;br /&gt;
&lt;br /&gt;
* WebIDL http://dev.w3.org/2006/webapi/WebIDL&lt;br /&gt;
* Encoding http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== Acknowledgements ==&lt;br /&gt;
&lt;br /&gt;
* Alan Chaney&lt;br /&gt;
* Ben Noordhuis&lt;br /&gt;
* Glenn Maynard&lt;br /&gt;
* John Tamplin&lt;br /&gt;
* Kenneth Russell (Google, Inc)&lt;br /&gt;
* Robert Mustacchi&lt;br /&gt;
* Ryan Dahl&lt;br /&gt;
* Anne van Kesteren&lt;br /&gt;
&lt;br /&gt;
== Appendix ==&lt;br /&gt;
&lt;br /&gt;
A &amp;quot;shim&amp;quot; implementation in JavaScript (that may not fully match the current version of the spec) plus some initial unit tests can be found at:&lt;br /&gt;
&lt;br /&gt;
:http://code.google.com/p/stringencoding/&lt;br /&gt;
&lt;br /&gt;
[[Category:Proposals]]&lt;/div&gt;</summary>
		<author><name>Jsbell</name></author>
	</entry>
	<entry>
		<id>https://wiki.whatwg.org/index.php?title=StringEncoding&amp;diff=8315</id>
		<title>StringEncoding</title>
		<link rel="alternate" type="text/html" href="https://wiki.whatwg.org/index.php?title=StringEncoding&amp;diff=8315"/>
		<updated>2012-06-15T21:45:08Z</updated>

		<summary type="html">&lt;p&gt;Jsbell: /* Open Issues */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;Proposed Text Encoding Web API for Typed Arrays&lt;br /&gt;
&lt;br /&gt;
== Editors ==&lt;br /&gt;
&lt;br /&gt;
* Joshua Bell (Google, Inc)&lt;br /&gt;
&lt;br /&gt;
== Abstract ==&lt;br /&gt;
&lt;br /&gt;
This specification defines an API for encoding strings to binary data, and decoding strings from binary data.&lt;br /&gt;
&lt;br /&gt;
NOTE: This specification intentionally does not address the opposite scenario of encoding binary data as strings and decoding binary data from strings, for example using Base64 encoding.&lt;br /&gt;
&lt;br /&gt;
Discussion on this topic has so far taken place on the [http://www.khronos.org/webgl/public-mailing-list/ public_webgl@khronos.org] mailing list. See http://www.khronos.org/webgl/public-mailing-list/archives/1111/msg00017.html for the initial discussion thread.&lt;br /&gt;
&lt;br /&gt;
Discussion has since moved to the [http://www.whatwg.org/mailing-list#specs WHATWG spec discussion] mailing list. See  http://lists.whatwg.org/htdig.cgi/whatwg-whatwg.org/2012-March/035038.html for the latest discussion thread.&lt;br /&gt;
&lt;br /&gt;
== Open Issues ==&lt;br /&gt;
&lt;br /&gt;
* Should this be a standalone API (as written), or live on e.g. DataView, or on e.g. String?&lt;br /&gt;
* Encoding errors&lt;br /&gt;
** Define options dict for &amp;lt;code&amp;gt;TextEncoder()&amp;lt;/code&amp;gt; that allows selecting between throw vs. replacement character / replacement callback?&lt;br /&gt;
* The [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding specification] defines the byte order mark as more authoritative than an explicit encoding label when decoding content. &lt;br /&gt;
** Is that desirable here? &lt;br /&gt;
** What does it mean to e.g. write TextDecoder(&#039;iso-2022-kr&#039;).decode(str) if string turns out to be UTF-8?&lt;br /&gt;
** What about streaming and other re-uses of the same decoder object?&lt;br /&gt;
* Streaming decode/encode requires keeping buffers between calls.&lt;br /&gt;
** Some encode/decode algorithms require adjusting the &amp;lt;var&amp;gt;code point pointer&amp;lt;/var&amp;gt; or &amp;lt;var&amp;gt;byte pointer&amp;lt;/var&amp;gt; by a negative amount. This could occur across &amp;quot;chunk&amp;quot; boundaries. This implies that when the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag is set on an encoder/decoder that the last N elements of the stream are saved for the next call and used as a prefix for the stream. N is defined by the specific encoding algorithm.&lt;br /&gt;
** &#039;&#039;This is more of a note to implementers than an issue, but it may merit further discussion.&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
== Resolved Issues ==&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dl&amp;gt;&lt;br /&gt;
&amp;lt;dt&amp;gt;Encode as many characters as possible into a fixed-size buffer for transmission, and repeat starting with next unencoded character&lt;br /&gt;
&amp;lt;dd&amp;gt;Resolution: not for &amp;quot;v1&amp;quot; - can be implemented using this API, and breaking the string down by &amp;quot;character&amp;quot; is unlikely to be as obvious as it sounds - surrogate pairs, combining sequences, etc&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dt&amp;gt;Support two versions of encode; one which takes target buffer, and one which creates/returns a right-sized buffer&lt;br /&gt;
&amp;lt;dd&amp;gt;Resolution: See above, and wait for developer feedback.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dt&amp;gt;Allow arbitrary end byte sequences (e.g. &amp;lt;code&amp;gt;0xFF&amp;lt;/code&amp;gt; for UTF-8 strings)&lt;br /&gt;
&amp;lt;dd&amp;gt;Resolution: Add &amp;lt;code&amp;gt;indexOf&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;ArrayBufferView&amp;lt;/code&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dt&amp;gt;Remove &#039;&#039;binary&#039;&#039; encoding?&lt;br /&gt;
&amp;lt;dd&amp;gt;The only real use is with &amp;lt;code&amp;gt;atob()&amp;lt;/code&amp;gt;/&amp;lt;code&amp;gt;btoa()&amp;lt;/code&amp;gt;; a better API would be Base64 directly in/out of Typed Arrays. Consensus on WHATWG is to add better APIs e.g. &lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
partial interface ArrayBufferView {&lt;br /&gt;
    DOMString toBase64();&lt;br /&gt;
};&lt;br /&gt;
&lt;br /&gt;
partial interface ArrayBuffer {&lt;br /&gt;
    static ArrayBuffer fromBase64(DOMString string);&lt;br /&gt;
};&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&amp;lt;/dl&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== API ==&lt;br /&gt;
&lt;br /&gt;
=== TextEncoder ===&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;WebIDL&#039;&#039;&#039;&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
dictionary TextEncodeOptions {&lt;br /&gt;
  boolean stream = false;&lt;br /&gt;
};&lt;br /&gt;
&lt;br /&gt;
[Constructor,&lt;br /&gt;
 Constructor(DOMString encoding)]&lt;br /&gt;
interface TextEncoder {&lt;br /&gt;
  readonly attribute DOMString encoding;&lt;br /&gt;
  ArrayBufferView encode(DOMString? string, optional TextEncodeOptions options);&lt;br /&gt;
};&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The constructor runs the following steps:&lt;br /&gt;
# If the constructor is called with no arguments, let &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt; be &amp;quot;&amp;lt;code&amp;gt;utf-8&amp;lt;/code&amp;gt;&amp;quot;. Otherwise, let &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt; be the value of the &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; argument.&lt;br /&gt;
# The constructor follows the &#039;&#039;&#039;steps to get an encoding&#039;&#039;&#039; from [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding], with &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt; as &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt;.&lt;br /&gt;
#* If the steps result in failure, throw an &amp;quot;&amp;lt;code&amp;gt;EncodingError&amp;lt;/code&amp;gt;&amp;quot; exception and terminate these steps.&lt;br /&gt;
#* Otherwise, set the encoder object&#039;s internal &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; property to the returned encoding.&lt;br /&gt;
# Initialize the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag of the encoder object to false. &lt;br /&gt;
# Initialize the internal &amp;lt;var&amp;gt;encoding algorithm state&amp;lt;/var&amp;gt; to the default values for the encoding &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dl&amp;gt;&lt;br /&gt;
&amp;lt;dt&amp;gt;&amp;lt;code&amp;gt;encoding&amp;lt;/code&amp;gt; of type DOMString, readonly&lt;br /&gt;
&amp;lt;dd&amp;gt;Returns the &amp;lt;var&amp;gt;Name&amp;lt;/var&amp;gt; of the encoder object&#039;s &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;, per [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding]. &lt;br /&gt;
&lt;br /&gt;
:Note that this may differ from the name of the encoding specified during the call to the constructor. For example, if the constructor is called with &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; of &amp;lt;code&amp;gt;&amp;quot;ascii&amp;quot;&amp;lt;/code&amp;gt; the &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; attribute of the &#039;&#039;encoder object&#039;&#039; would have the value &amp;lt;code&amp;gt;&amp;quot;windows-1252&amp;quot;&amp;lt;/code&amp;gt; as &amp;lt;code&amp;gt;&amp;quot;ascii&amp;quot;&amp;lt;/code&amp;gt; is a label for that encoding.&lt;br /&gt;
&amp;lt;/dl&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dl&amp;gt;&lt;br /&gt;
&amp;lt;dt&amp;gt;&amp;lt;code&amp;gt;encode&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;dd&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The &amp;lt;code&amp;gt;encode&amp;lt;/code&amp;gt; method runs these steps:&lt;br /&gt;
&lt;br /&gt;
# If the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag is not set, then reset the &amp;lt;var&amp;gt;encoding algorithm state&amp;lt;/var&amp;gt; to the default values for &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;. Otherwise, the &amp;lt;var&amp;gt;encoding algorithm state&amp;lt;/var&amp;gt; is re-used from the previous call to &amp;lt;code&amp;gt;encode&amp;lt;/code&amp;gt; on this object.&lt;br /&gt;
# If the &amp;lt;var&amp;gt;options&amp;lt;/var&amp;gt; parameter is specified and the &amp;lt;code&amp;gt;stream&amp;lt;/code&amp;gt; option is &#039;&#039;&#039;true&#039;&#039;&#039;, then the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag is set; otherwise the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag is cleared.&lt;br /&gt;
# Run the steps of the &amp;lt;var&amp;gt;encoding algorithm&amp;lt;/var&amp;gt;:&lt;br /&gt;
#* The input to the algorithm is a &amp;lt;var&amp;gt;stream of code points&amp;lt;/var&amp;gt;. The code units within the DOMString &amp;lt;var&amp;gt;string&amp;lt;/var&amp;gt; are interpreted as UTF-16 code units, to produce the &amp;lt;var&amp;gt;stream of code points&amp;lt;/var&amp;gt;; if &amp;lt;var&amp;gt;string&amp;lt;/var&amp;gt; is null, the &amp;lt;var&amp;gt;stream of code points&amp;lt;/var&amp;gt; is empty.&lt;br /&gt;
#*:&#039;&#039;ISSUE: Interpreting a DOMString as UTF-16 code units to yield a code point stream needs to be defined, including unpaired surrogates. [http://dev.w3.org/2006/webapi/WebIDL/#idl-DOMString WebIDL] only defines the reverse.&#039;&#039;&lt;br /&gt;
#* If the &amp;lt;var&amp;gt;options&amp;lt;/var&amp;gt; parameter is not specified or the &amp;lt;code&amp;gt;stream&amp;lt;/code&amp;gt; option is &#039;&#039;&#039;false&#039;&#039;&#039;, then after the final code point is yielded by the stream, the &#039;&#039;&#039;EOF code point&#039;&#039;&#039; is yielded.&lt;br /&gt;
#* The output of the algorithm is a &amp;lt;var&amp;gt;sequence of emitted bytes&amp;lt;/var&amp;gt;.&lt;br /&gt;
# Return a &amp;lt;code&amp;gt;Uint8Array&amp;lt;/code&amp;gt; object wrapping an &amp;lt;code&amp;gt;ArrayBuffer&amp;lt;/code&amp;gt; containing the &amp;lt;var&amp;gt;sequence of emitted bytes&amp;lt;/var&amp;gt; produced by the encoding algorithm.&lt;br /&gt;
&amp;lt;/dl&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== TextDecoder ===&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;WebIDL&#039;&#039;&#039;&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
dictionary TextDecoderOptions {&lt;br /&gt;
  boolean fatal = false;&lt;br /&gt;
};&lt;br /&gt;
&lt;br /&gt;
dictionary TextDecodeOptions {&lt;br /&gt;
  boolean stream = false;&lt;br /&gt;
};&lt;br /&gt;
&lt;br /&gt;
[Constructor,&lt;br /&gt;
 Constructor(optional DOMString encoding, optional TextDecoderOptions options)]&lt;br /&gt;
interface TextDecoder {&lt;br /&gt;
  readonly attribute DOMString encoding;&lt;br /&gt;
  DOMString decode(optional ArrayBufferView view, optional TextDecodeOptions options);&lt;br /&gt;
};&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The constructor runs the following steps:&lt;br /&gt;
# The constructor creates a &#039;&#039;decoder object&#039;&#039;. It has the internal properties &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;, a &amp;lt;var&amp;gt;fatal&amp;lt;/var&amp;gt; flag which is initially unset, and a &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag which is initially unset.&lt;br /&gt;
# If called without an &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; argument, let &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt; be &amp;quot;&amp;lt;code&amp;gt;utf-8&amp;lt;/code&amp;gt;&amp;quot;. Otherwise, let &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt; be the value of the &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; argument.&lt;br /&gt;
# Run the &#039;&#039;&#039;steps to get an encoding&#039;&#039;&#039; from [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding], with &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt; as &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt;. &lt;br /&gt;
## If the steps result in failure, throw an &amp;quot;&amp;lt;code&amp;gt;EncodingError&amp;lt;/code&amp;gt;&amp;quot; exception and terminate these steps.&lt;br /&gt;
## Otherwise, set the &#039;&#039;decoder object&#039;s&#039;&#039; internal &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; property to the returned encoding.&lt;br /&gt;
# If the constructor is called with an &amp;lt;var&amp;gt;options&amp;lt;/var&amp;gt; argument, and the &amp;lt;var&amp;gt;fatal&amp;lt;/var&amp;gt; property of the dictionary is set, set the internal &amp;lt;var&amp;gt;fatal&amp;lt;/var&amp;gt; flag of the &#039;&#039;decoder object&#039;&#039;.&lt;br /&gt;
# Initialize the internal &amp;lt;var&amp;gt;encoding algorithm state&amp;lt;/var&amp;gt; to the default values for the encoding &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;.&lt;br /&gt;
# Return the &#039;&#039;decoder object&#039;&#039;.&lt;br /&gt;
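&lt;br /&gt;
The label handling in steps 2 and 3 can be sketched as follows. This is only an illustration: it assumes an environment that provides the &amp;lt;code&amp;gt;TextDecoder&amp;lt;/code&amp;gt; constructor as later standardized in the Encoding Standard (for example a current browser or Node.js), not the draft defined here. In that later API an unknown label is reported with a &amp;lt;code&amp;gt;RangeError&amp;lt;/code&amp;gt; rather than the &amp;quot;&amp;lt;code&amp;gt;EncodingError&amp;lt;/code&amp;gt;&amp;quot; named above, and the variable names are hypothetical.&lt;br /&gt;
&lt;br /&gt;
```javascript
// Assumption: a global TextDecoder as later standardized (current browsers, Node.js).
const defaultName = new TextDecoder().encoding;      // no label given, so UTF-8 is used
const asciiName = new TextDecoder("ascii").encoding; // "ascii" is a label for windows-1252

// An unknown label makes the constructor throw. The standardized API uses
// RangeError; this draft names "EncodingError" instead.
let unknownLabelError = null;
try {
  new TextDecoder("no-such-encoding");
} catch (e) {
  unknownLabelError = e;
}
```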
&lt;br /&gt;
&amp;lt;dl&amp;gt;&lt;br /&gt;
&amp;lt;dt&amp;gt;&amp;lt;code&amp;gt;encoding&amp;lt;/code&amp;gt; of type DOMString, readonly&lt;br /&gt;
&amp;lt;dd&amp;gt;&lt;br /&gt;
Returns the &amp;lt;var&amp;gt;Name&amp;lt;/var&amp;gt; of the decoder object&#039;s &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;, per [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding].&lt;br /&gt;
&lt;br /&gt;
:Note that this may differ from the name of the encoding specified during the call to the constructor. For example, if the constructor is called with &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; of &amp;lt;code&amp;gt;&amp;quot;ascii&amp;quot;&amp;lt;/code&amp;gt; the &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; attribute of the &#039;&#039;decoder object&#039;&#039; would have the value &amp;lt;code&amp;gt;&amp;quot;windows-1252&amp;quot;&amp;lt;/code&amp;gt; as &amp;lt;code&amp;gt;&amp;quot;ascii&amp;quot;&amp;lt;/code&amp;gt; is a label for that encoding.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dt&amp;gt;&amp;lt;code&amp;gt;decode&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;dd&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The &amp;lt;code&amp;gt;decode&amp;lt;/code&amp;gt; method runs the following steps:&lt;br /&gt;
&lt;br /&gt;
# If the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag of the &#039;&#039;decoder object&#039;&#039; is not set, then reset the &amp;lt;var&amp;gt;encoding algorithm state&amp;lt;/var&amp;gt; to the default values for encoding &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;. Otherwise, the &amp;lt;var&amp;gt;encoding algorithm state&amp;lt;/var&amp;gt; is re-used from the previous call to &amp;lt;code&amp;gt;decode&amp;lt;/code&amp;gt; on this object. &lt;br /&gt;
# If the &amp;lt;var&amp;gt;options&amp;lt;/var&amp;gt; parameter is specified and the &amp;lt;code&amp;gt;stream&amp;lt;/code&amp;gt; option is &#039;&#039;&#039;true&#039;&#039;&#039;, then the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag is set; otherwise the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag is cleared.&lt;br /&gt;
# Run the decoder algorithm of the &#039;&#039;decoder object&#039;s&#039;&#039; encoding:&lt;br /&gt;
#* The input to the algorithm is a &amp;lt;var&amp;gt;byte stream&amp;lt;/var&amp;gt;. The &amp;lt;var&amp;gt;byte stream&amp;lt;/var&amp;gt; is provided by the bytes in &amp;lt;var&amp;gt;view.buffer&amp;lt;/var&amp;gt; starting at offset &amp;lt;var&amp;gt;view.byteOffset&amp;lt;/var&amp;gt;. A maximum of &amp;lt;var&amp;gt;view.byteLength&amp;lt;/var&amp;gt; bytes are yielded by the stream from &amp;lt;var&amp;gt;view.buffer&amp;lt;/var&amp;gt;. If &amp;lt;var&amp;gt;view&amp;lt;/var&amp;gt; is not specified, the stream is empty.&lt;br /&gt;
#* If the &amp;lt;var&amp;gt;options&amp;lt;/var&amp;gt; parameter is not specified or the &amp;lt;code&amp;gt;stream&amp;lt;/code&amp;gt; option is &#039;&#039;&#039;false&#039;&#039;&#039;, then after &amp;lt;var&amp;gt;view.byteLength&amp;lt;/var&amp;gt; bytes are yielded by the stream, the &#039;&#039;&#039;EOF byte&#039;&#039;&#039; is yielded.&lt;br /&gt;
#* If the internal &amp;lt;var&amp;gt;fatal&amp;lt;/var&amp;gt; flag of the &#039;&#039;decoder object&#039;&#039; is set, then a &#039;&#039;&#039;decoder error&#039;&#039;&#039; causes a &amp;lt;code&amp;gt;DOMException&amp;lt;/code&amp;gt; of type &amp;lt;code&amp;gt;EncodingError&amp;lt;/code&amp;gt; to be thrown rather than emitting a fallback code point.&lt;br /&gt;
#* The output of the algorithm is a &amp;lt;var&amp;gt;sequence of emitted code points&amp;lt;/var&amp;gt;.&lt;br /&gt;
# Return a &amp;lt;code&amp;gt;DOMString&amp;lt;/code&amp;gt; by encoding the &amp;lt;var&amp;gt;sequence of emitted code points&amp;lt;/var&amp;gt; as UTF-16 as per [http://dev.w3.org/2006/webapi/WebIDL/#idl-DOMString WebIDL].&lt;br /&gt;
&lt;br /&gt;
:&#039;&#039;ISSUE: Need to handle BOMs. [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding] specifies this by using the caller-specified encoding as a suggestion, and consuming the BOM as a part of selecting the real encoder where BOM takes precedence. At the very least a matching BOM should be ignored. A mismatching BOM could throw, be a decoding error, or could actually switch the decoder for this stream (call or sequence of calls), possibly if-and-only-if the constructor was called without an encoding.&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/dl&amp;gt;&lt;br /&gt;
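&lt;br /&gt;
The effect of the &amp;lt;code&amp;gt;stream&amp;lt;/code&amp;gt; option and the &amp;lt;var&amp;gt;fatal&amp;lt;/var&amp;gt; flag can be sketched as below. The sketch assumes the &amp;lt;code&amp;gt;TextDecoder&amp;lt;/code&amp;gt; constructor as later standardized (current browsers, Node.js) rather than this draft; in that later API a fatal decoding failure surfaces as a &amp;lt;code&amp;gt;TypeError&amp;lt;/code&amp;gt; instead of the &amp;lt;code&amp;gt;EncodingError&amp;lt;/code&amp;gt; described here. All variable names are hypothetical.&lt;br /&gt;
&lt;br /&gt;
```javascript
// Assumption: a global TextDecoder as later standardized (current browsers, Node.js).
// U+20AC EURO SIGN is three bytes in UTF-8 (E2 82 AC); split it across two chunks.
const firstChunk = new Uint8Array([0xe2, 0x82]);
const secondChunk = new Uint8Array([0xac]);

const decoder = new TextDecoder("utf-8");
// With stream set, the incomplete sequence is held back for the next call.
const partial = decoder.decode(firstChunk, { stream: true });
// Without stream, this call ends the stream and flushes the pending state.
const rest = decoder.decode(secondChunk);
const streamed = partial + rest;

// Decoding the truncated chunk on its own is a decoder error: a non-fatal
// decoder emits U+FFFD, while a fatal decoder throws instead.
const lenient = new TextDecoder("utf-8").decode(firstChunk);
let fatalThrew = false;
try {
  new TextDecoder("utf-8", { fatal: true }).decode(firstChunk);
} catch (e) {
  fatalThrew = true;
}
```

In this sketch &amp;lt;code&amp;gt;partial&amp;lt;/code&amp;gt; is the empty string and &amp;lt;code&amp;gt;rest&amp;lt;/code&amp;gt; holds the completed character.&lt;br /&gt;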
&lt;br /&gt;
== Examples ==&lt;br /&gt;
&lt;br /&gt;
=== Example #1 - encoding strings ===&lt;br /&gt;
&lt;br /&gt;
The following example uses the API to encode an array of strings into an ArrayBuffer. The result is an ArrayBuffer containing the number of strings (as a Uint32), followed by the length of the first string (as a Uint32), the encoded string data, the length of the second string (as a Uint32), the string data, and so on.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
function encodeArrayOfStrings(strings, encoding) {&lt;br /&gt;
  var encoder, encoded, len, i, bytes, view, offset;&lt;br /&gt;
&lt;br /&gt;
  encoder = new TextEncoder(encoding);&lt;br /&gt;
  encoded = [];&lt;br /&gt;
&lt;br /&gt;
  len = Uint32Array.BYTES_PER_ELEMENT;&lt;br /&gt;
  for (i = 0; i &amp;lt; strings.length; i += 1) {&lt;br /&gt;
    len += Uint32Array.BYTES_PER_ELEMENT;&lt;br /&gt;
    encoded[i] = encoder.encode(strings[i]);&lt;br /&gt;
    len += encoded[i].byteLength;&lt;br /&gt;
  }&lt;br /&gt;
&lt;br /&gt;
  bytes = new Uint8Array(len);&lt;br /&gt;
  view = new DataView(bytes.buffer);&lt;br /&gt;
  offset = 0;&lt;br /&gt;
&lt;br /&gt;
  view.setUint32(offset, strings.length);&lt;br /&gt;
  offset += Uint32Array.BYTES_PER_ELEMENT;&lt;br /&gt;
  for (i = 0; i &amp;lt; encoded.length; i += 1) {&lt;br /&gt;
    len = encoded[i].byteLength;&lt;br /&gt;
    view.setUint32(offset, len);&lt;br /&gt;
    offset += Uint32Array.BYTES_PER_ELEMENT;&lt;br /&gt;
    bytes.set(encoded[i], offset);&lt;br /&gt;
    offset += len;&lt;br /&gt;
  }&lt;br /&gt;
  return bytes.buffer;&lt;br /&gt;
}&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Example #2 - decoding strings ===&lt;br /&gt;
&lt;br /&gt;
The following example decodes an ArrayBuffer containing data encoded in the format produced by the previous example back into an array of strings.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
function decodeArrayOfStrings(buffer, encoding) {&lt;br /&gt;
  var decoder, view, offset, num_strings, strings, i, len;&lt;br /&gt;
&lt;br /&gt;
  decoder = new TextDecoder(encoding);&lt;br /&gt;
  view = new DataView(buffer);&lt;br /&gt;
  offset = 0;&lt;br /&gt;
  strings = [];&lt;br /&gt;
&lt;br /&gt;
  num_strings = view.getUint32(offset);&lt;br /&gt;
  offset += Uint32Array.BYTES_PER_ELEMENT;&lt;br /&gt;
  for (i = 0; i &amp;lt; num_strings; i += 1) {&lt;br /&gt;
    len = view.getUint32(offset);&lt;br /&gt;
    offset += Uint32Array.BYTES_PER_ELEMENT;&lt;br /&gt;
    strings[i] = decoder.decode(&lt;br /&gt;
      new DataView(view.buffer, offset, len));&lt;br /&gt;
    offset += len;&lt;br /&gt;
  }&lt;br /&gt;
  return strings;&lt;br /&gt;
}&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Encodings ==&lt;br /&gt;
&lt;br /&gt;
Encodings are defined and implemented per [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding]. This implicitly includes the steps to &amp;lt;em&amp;gt;get an encoding&amp;lt;/em&amp;gt; from a string, and the logic for label matching and case-insensitivity.&lt;br /&gt;
&lt;br /&gt;
User agents MUST NOT support any encodings or labels other than those defined in [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding], and MUST support all encodings and labels defined in that specification, with the additions defined below.&lt;br /&gt;
&lt;br /&gt;
:&#039;&#039;NOTE: In [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding], &amp;quot;ascii&amp;quot; is a label for &#039;&#039;&#039;windows-1252&#039;&#039;&#039;; there is no 7-bit-or-raise-exception encoding. Applications that are required to restrict the content of decoded strings should implement validation after decoding.&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
:&#039;&#039;NOTE: Unicode normalization forms are outside the scope of this specification. No normalization is done prior to encoding or after decoding.&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
:&#039;&#039;NOTE: Handling of encoding-specific issues, e.g. over-long UTF-8 encodings, byte order marks, unmatched surrogate pairs, and so on is defined by [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding].&#039;&#039;&lt;br /&gt;
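&lt;br /&gt;
The note on normalization can be illustrated with a short sketch. It assumes the &amp;lt;code&amp;gt;TextEncoder&amp;lt;/code&amp;gt; constructor as later standardized (UTF-8 only, no constructor argument) and the ECMAScript &amp;lt;code&amp;gt;String.prototype.normalize&amp;lt;/code&amp;gt; method; neither is defined by this draft, and the variable names are hypothetical.&lt;br /&gt;
&lt;br /&gt;
```javascript
// Assumption: a global TextEncoder as later standardized (UTF-8 only).
const encoder = new TextEncoder();

// The same visible character in precomposed and decomposed form.
const composedBytes = Array.from(encoder.encode("\u00e9"));    // C3 A9
const decomposedBytes = Array.from(encoder.encode("e\u0301")); // 65 CC 81

// The encoder does not normalize; a caller that needs a canonical byte
// sequence has to normalize the string before encoding it.
const normalizedBytes = Array.from(encoder.encode("e\u0301".normalize("NFC")));
```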
&lt;br /&gt;
== References ==&lt;br /&gt;
&lt;br /&gt;
* WebIDL http://dev.w3.org/2006/webapi/WebIDL&lt;br /&gt;
* Encoding http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== Acknowledgements ==&lt;br /&gt;
&lt;br /&gt;
* Alan Chaney&lt;br /&gt;
* Ben Noordhuis&lt;br /&gt;
* Glenn Maynard&lt;br /&gt;
* John Tamplin&lt;br /&gt;
* Kenneth Russell (Google, Inc)&lt;br /&gt;
* Robert Mustacchi&lt;br /&gt;
* Ryan Dahl&lt;br /&gt;
* Anne van Kesteren&lt;br /&gt;
&lt;br /&gt;
== Appendix ==&lt;br /&gt;
&lt;br /&gt;
A &amp;quot;shim&amp;quot; implementation in JavaScript (that may not fully match the current version of the spec) plus some initial unit tests can be found at:&lt;br /&gt;
&lt;br /&gt;
:http://code.google.com/p/stringencoding/&lt;br /&gt;
&lt;br /&gt;
[[Category:Proposals]]&lt;/div&gt;</summary>
		<author><name>Jsbell</name></author>
	</entry>
	<entry>
		<id>https://wiki.whatwg.org/index.php?title=StringEncoding&amp;diff=8314</id>
		<title>StringEncoding</title>
		<link rel="alternate" type="text/html" href="https://wiki.whatwg.org/index.php?title=StringEncoding&amp;diff=8314"/>
		<updated>2012-06-15T21:43:42Z</updated>

		<summary type="html">&lt;p&gt;Jsbell: Removed &amp;quot;binary&amp;quot; encoding&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;Proposed Text Encoding Web API for Typed Arrays&lt;br /&gt;
&lt;br /&gt;
== Editors ==&lt;br /&gt;
&lt;br /&gt;
* Joshua Bell (Google, Inc)&lt;br /&gt;
&lt;br /&gt;
== Abstract ==&lt;br /&gt;
&lt;br /&gt;
This specification defines an API for encoding strings to binary data, and decoding strings from binary data.&lt;br /&gt;
&lt;br /&gt;
NOTE: This specification intentionally does not address the opposite scenario of encoding binary data as strings and decoding binary data from strings, for example using Base64 encoding.&lt;br /&gt;
&lt;br /&gt;
Discussion on this topic has so far taken place on the [http://www.khronos.org/webgl/public-mailing-list/ public_webgl@khronos.org] mailing list. See http://www.khronos.org/webgl/public-mailing-list/archives/1111/msg00017.html for the initial discussion thread.&lt;br /&gt;
&lt;br /&gt;
Discussion has since moved to the [http://www.whatwg.org/mailing-list#specs WHATWG spec discussion] mailing list. See  http://lists.whatwg.org/htdig.cgi/whatwg-whatwg.org/2012-March/035038.html for the latest discussion thread.&lt;br /&gt;
&lt;br /&gt;
== Open Issues ==&lt;br /&gt;
&lt;br /&gt;
* Should this be a standalone API (as written), or live on e.g. DataView, or on e.g. String?&lt;br /&gt;
* Encoding errors&lt;br /&gt;
** Define options dict for &amp;lt;code&amp;gt;TextEncoder()&amp;lt;/code&amp;gt; that allows selecting between throw vs. replacement character / replacement callback?&lt;br /&gt;
* The [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding specification] defines the byte order mark as more authoritative than an explicit encoding label when decoding content. &lt;br /&gt;
** Is that desirable here? &lt;br /&gt;
** What does it mean to e.g. write TextDecoder(&#039;iso-2022-kr&#039;).decode(str) if string turns out to be UTF-8?&lt;br /&gt;
** What about streaming and other re-uses of the same decoder object?&lt;br /&gt;
* Streaming decode/encode requires keeping buffers between calls.&lt;br /&gt;
** Some encode/decode algorithms require adjusting the &amp;lt;var&amp;gt;code point pointer&amp;lt;/var&amp;gt; or &amp;lt;var&amp;gt;byte pointer&amp;lt;/var&amp;gt; by a negative amount. This could occur across &amp;quot;chunk&amp;quot; boundaries. This implies that when the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag is set on an encoder/decoder that the last N elements of the stream are saved for the next call and used as a prefix for the stream. N is defined by the specific encoding algorithm.&lt;br /&gt;
&lt;br /&gt;
== Resolved Issues ==&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dl&amp;gt;&lt;br /&gt;
&amp;lt;dt&amp;gt;Encode as many characters as possible into a fixed-size buffer for transmission, and repeat starting with next unencoded character&lt;br /&gt;
&amp;lt;dd&amp;gt;Resolution: not for &amp;quot;v1&amp;quot; - can be implemented using this API, and breaking the string down by &amp;quot;character&amp;quot; is unlikely to be as obvious as it sounds - surrogate pairs, combining sequences, etc&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dt&amp;gt;Support two versions of encode; one which takes target buffer, and one which creates/returns a right-sized buffer&lt;br /&gt;
&amp;lt;dd&amp;gt;Resolution: See above, and wait for developer feedback.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dt&amp;gt;Allow arbitrary end byte sequences (e.g. &amp;lt;code&amp;gt;0xFF&amp;lt;/code&amp;gt; for UTF-8 strings)&lt;br /&gt;
&amp;lt;dd&amp;gt;Resolution: Add &amp;lt;code&amp;gt;indexOf&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;ArrayBufferView&amp;lt;/code&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dt&amp;gt;Remove &#039;&#039;binary&#039;&#039; encoding?&lt;br /&gt;
&amp;lt;dd&amp;gt;The only real use is with &amp;lt;code&amp;gt;atob()&amp;lt;/code&amp;gt;/&amp;lt;code&amp;gt;btoa()&amp;lt;/code&amp;gt;; a better API would be Base64 directly in/out of Typed Arrays. Consensus on WHATWG is to add better APIs e.g. &lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
partial interface ArrayBufferView {&lt;br /&gt;
    DOMString toBase64();&lt;br /&gt;
};&lt;br /&gt;
&lt;br /&gt;
partial interface ArrayBuffer {&lt;br /&gt;
    static ArrayBuffer fromBase64(DOMString string);&lt;br /&gt;
};&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&amp;lt;/dl&amp;gt;&lt;br /&gt;
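&lt;br /&gt;
As a rough sketch of the existing route through &amp;lt;code&amp;gt;atob()&amp;lt;/code&amp;gt;/&amp;lt;code&amp;gt;btoa()&amp;lt;/code&amp;gt; that this resolution refers to, the hypothetical helpers below convert between a &amp;lt;code&amp;gt;Uint8Array&amp;lt;/code&amp;gt; and Base64 text by way of an intermediate binary string. The names &amp;lt;code&amp;gt;bytesToBase64&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;base64ToBytes&amp;lt;/code&amp;gt; are illustrative only and are not the proposed methods; the sketch assumes global &amp;lt;code&amp;gt;btoa&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;atob&amp;lt;/code&amp;gt; functions.&lt;br /&gt;
&lt;br /&gt;
```javascript
// Hypothetical helpers, not the proposed toBase64()/fromBase64() methods.
// Assumption: global btoa() and atob() are available.
function bytesToBase64(bytes) {
  let binary = "";
  for (const b of bytes) {
    binary += String.fromCharCode(b); // one code unit per byte
  }
  return btoa(binary);
}

function base64ToBytes(text) {
  // atob() yields a binary string; map each code unit back to a byte.
  return Uint8Array.from(atob(text), function (ch) {
    return ch.charCodeAt(0);
  });
}

const sampleText = bytesToBase64(new Uint8Array([72, 105])); // the bytes of "Hi"
const sampleBytes = Array.from(base64ToBytes(sampleText));
```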
&lt;br /&gt;
== API ==&lt;br /&gt;
&lt;br /&gt;
=== TextEncoder ===&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;WebIDL&#039;&#039;&#039;&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
dictionary TextEncodeOptions {&lt;br /&gt;
  boolean stream = false;&lt;br /&gt;
};&lt;br /&gt;
&lt;br /&gt;
[Constructor,&lt;br /&gt;
 Constructor(DOMString encoding)]&lt;br /&gt;
interface TextEncoder {&lt;br /&gt;
  readonly attribute DOMString encoding;&lt;br /&gt;
  ArrayBufferView encode(DOMString? string, optional TextEncodeOptions options);&lt;br /&gt;
};&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The constructor runs the following steps:&lt;br /&gt;
# If the constructor is called with no arguments, let &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt; be &amp;quot;&amp;lt;code&amp;gt;utf-8&amp;lt;/code&amp;gt;&amp;quot;. Otherwise, let &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt; be the value of the &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; argument.&lt;br /&gt;
# The constructor follows the &#039;&#039;&#039;steps to get an encoding&#039;&#039;&#039; from [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding], with &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt; as &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt;.&lt;br /&gt;
#* If the steps result in failure, throw an &amp;quot;&amp;lt;code&amp;gt;EncodingError&amp;lt;/code&amp;gt;&amp;quot; exception and terminate these steps.&lt;br /&gt;
#* Otherwise, set the encoder object&#039;s internal &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; property to the returned encoding.&lt;br /&gt;
# Initialize the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag of the encoder object to false. &lt;br /&gt;
# Initialize the internal &amp;lt;var&amp;gt;encoding algorithm state&amp;lt;/var&amp;gt; to the default values for the encoding &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dl&amp;gt;&lt;br /&gt;
&amp;lt;dt&amp;gt;&amp;lt;code&amp;gt;encoding&amp;lt;/code&amp;gt; of type DOMString, readonly&lt;br /&gt;
&amp;lt;dd&amp;gt;Returns the &amp;lt;var&amp;gt;Name&amp;lt;/var&amp;gt; of the encoder object&#039;s &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;, per [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding]. &lt;br /&gt;
&lt;br /&gt;
:Note that this may differ from the name of the encoding specified during the call to the constructor. For example, if the constructor is called with &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; of &amp;lt;code&amp;gt;&amp;quot;ascii&amp;quot;&amp;lt;/code&amp;gt; the &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; attribute of the &#039;&#039;encoder object&#039;&#039; would have the value &amp;lt;code&amp;gt;&amp;quot;windows-1252&amp;quot;&amp;lt;/code&amp;gt; as &amp;lt;code&amp;gt;&amp;quot;ascii&amp;quot;&amp;lt;/code&amp;gt; is a label for that encoding.&lt;br /&gt;
&amp;lt;/dl&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dl&amp;gt;&lt;br /&gt;
&amp;lt;dt&amp;gt;&amp;lt;code&amp;gt;encode&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;dd&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The &amp;lt;code&amp;gt;encode&amp;lt;/code&amp;gt; method runs these steps:&lt;br /&gt;
&lt;br /&gt;
# If the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag is not set, then reset the &amp;lt;var&amp;gt;encoding algorithm state&amp;lt;/var&amp;gt; to the default values for &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;. Otherwise, the &amp;lt;var&amp;gt;encoding algorithm state&amp;lt;/var&amp;gt; is re-used from the previous call to &amp;lt;code&amp;gt;encode&amp;lt;/code&amp;gt; on this object.&lt;br /&gt;
# If the &amp;lt;var&amp;gt;options&amp;lt;/var&amp;gt; parameter is specified and the &amp;lt;code&amp;gt;stream&amp;lt;/code&amp;gt; option is &#039;&#039;&#039;true&#039;&#039;&#039;, then the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag is set; otherwise the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag is cleared.&lt;br /&gt;
# Run the steps of the &amp;lt;var&amp;gt;encoding algorithm&amp;lt;/var&amp;gt;:&lt;br /&gt;
#* The input to the algorithm is a &amp;lt;var&amp;gt;stream of code points&amp;lt;/var&amp;gt;. The code units within the DOMString &amp;lt;var&amp;gt;string&amp;lt;/var&amp;gt; are interpreted as UTF-16 code units, to produce the &amp;lt;var&amp;gt;stream of code points&amp;lt;/var&amp;gt;; if &amp;lt;var&amp;gt;string&amp;lt;/var&amp;gt; is null, the &amp;lt;var&amp;gt;stream of code points&amp;lt;/var&amp;gt; is empty.&lt;br /&gt;
#*:&#039;&#039;ISSUE: Interpreting a DOMString as UTF-16 code units to yield a code point stream needs to be defined, including unpaired surrogates. [http://dev.w3.org/2006/webapi/WebIDL/#idl-DOMString WebIDL] only defines the reverse.&#039;&#039;&lt;br /&gt;
#* If the &amp;lt;var&amp;gt;options&amp;lt;/var&amp;gt; parameter is not specified or the &amp;lt;code&amp;gt;stream&amp;lt;/code&amp;gt; option is &#039;&#039;&#039;false&#039;&#039;&#039;, then after the final code point is yielded by the stream, the &#039;&#039;&#039;EOF code point&#039;&#039;&#039; is yielded.&lt;br /&gt;
#* The output of the algorithm is a &amp;lt;var&amp;gt;sequence of emitted bytes&amp;lt;/var&amp;gt;.&lt;br /&gt;
# Return a &amp;lt;code&amp;gt;Uint8Array&amp;lt;/code&amp;gt; object wrapping an &amp;lt;code&amp;gt;ArrayBuffer&amp;lt;/code&amp;gt; containing the &amp;lt;var&amp;gt;sequence of emitted bytes&amp;lt;/var&amp;gt; produced by the encoding algorithm.&lt;br /&gt;
&amp;lt;/dl&amp;gt;&lt;br /&gt;
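&lt;br /&gt;
For the unpaired-surrogate issue noted in step 3, the API as later standardized converts the input to Unicode scalar values first, so an unpaired surrogate is encoded as U+FFFD. The sketch below assumes that later &amp;lt;code&amp;gt;TextEncoder&amp;lt;/code&amp;gt; (UTF-8 only, no constructor argument) rather than the draft defined here; the variable names are hypothetical.&lt;br /&gt;
&lt;br /&gt;
```javascript
// Assumption: a global TextEncoder as later standardized (UTF-8 only).
const encoder = new TextEncoder();

// A well-formed surrogate pair encodes as one four-byte sequence (U+1F600).
const pairBytes = Array.from(encoder.encode("\ud83d\ude00")); // F0 9F 98 80

// A lone high surrogate is replaced by U+FFFD before encoding.
const loneBytes = Array.from(encoder.encode("\ud83d"));       // EF BF BD
```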
&lt;br /&gt;
=== TextDecoder ===&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;WebIDL&#039;&#039;&#039;&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
dictionary TextDecoderOptions {&lt;br /&gt;
  boolean fatal = false;&lt;br /&gt;
};&lt;br /&gt;
&lt;br /&gt;
dictionary TextDecodeOptions {&lt;br /&gt;
  boolean stream = false;&lt;br /&gt;
};&lt;br /&gt;
&lt;br /&gt;
[Constructor,&lt;br /&gt;
 Constructor(optional DOMString encoding, optional TextDecoderOptions options)]&lt;br /&gt;
interface TextDecoder {&lt;br /&gt;
  readonly attribute DOMString encoding;&lt;br /&gt;
  DOMString decode(optional ArrayBufferView view, optional TextDecodeOptions options);&lt;br /&gt;
};&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The constructor runs the following steps:&lt;br /&gt;
# The constructor creates a &#039;&#039;decoder object&#039;&#039;. It has the internal properties &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;, a &amp;lt;var&amp;gt;fatal&amp;lt;/var&amp;gt; flag which is initially unset, and a &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag which is initially unset.&lt;br /&gt;
# If called without an &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; argument, let &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt; be &amp;quot;&amp;lt;code&amp;gt;utf-8&amp;lt;/code&amp;gt;&amp;quot;. Otherwise, let &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt; be the value of the &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; argument.&lt;br /&gt;
# Run the &#039;&#039;&#039;steps to get an encoding&#039;&#039;&#039; from [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding], with &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt; as &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt;. &lt;br /&gt;
## If the steps result in failure, throw an &amp;quot;&amp;lt;code&amp;gt;EncodingError&amp;lt;/code&amp;gt;&amp;quot; exception and terminate these steps.&lt;br /&gt;
## Otherwise, set the &#039;&#039;decoder object&#039;s&#039;&#039; internal &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; property to the returned encoding.&lt;br /&gt;
# If the constructor is called with an &amp;lt;var&amp;gt;options&amp;lt;/var&amp;gt; argument, and the &amp;lt;var&amp;gt;fatal&amp;lt;/var&amp;gt; property of the dictionary is set, set the internal &amp;lt;var&amp;gt;fatal&amp;lt;/var&amp;gt; flag of the &#039;&#039;decoder object&#039;&#039;.&lt;br /&gt;
# Initialize the internal &amp;lt;var&amp;gt;encoding algorithm state&amp;lt;/var&amp;gt; to the default values for the encoding &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;.&lt;br /&gt;
# Return the &#039;&#039;decoder object&#039;&#039;.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dl&amp;gt;&lt;br /&gt;
&amp;lt;dt&amp;gt;&amp;lt;code&amp;gt;encoding&amp;lt;/code&amp;gt; of type DOMString, readonly&lt;br /&gt;
&amp;lt;dd&amp;gt;&lt;br /&gt;
Returns the &amp;lt;var&amp;gt;Name&amp;lt;/var&amp;gt; of the decoder object&#039;s &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;, per [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding].&lt;br /&gt;
&lt;br /&gt;
:Note that this may differ from the name of the encoding specified during the call to the constructor. For example, if the constructor is called with &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; of &amp;lt;code&amp;gt;&amp;quot;ascii&amp;quot;&amp;lt;/code&amp;gt; the &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; attribute of the &#039;&#039;decoder object&#039;&#039; would have the value &amp;lt;code&amp;gt;&amp;quot;windows-1252&amp;quot;&amp;lt;/code&amp;gt; as &amp;lt;code&amp;gt;&amp;quot;ascii&amp;quot;&amp;lt;/code&amp;gt; is a label for that encoding.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dt&amp;gt;&amp;lt;code&amp;gt;decode&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;dd&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The &amp;lt;code&amp;gt;decode&amp;lt;/code&amp;gt; method runs the following steps:&lt;br /&gt;
&lt;br /&gt;
# If the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag of the &#039;&#039;decoder object&#039;&#039; is not set, then reset the &amp;lt;var&amp;gt;encoding algorithm state&amp;lt;/var&amp;gt; to the default values for encoding &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;. Otherwise, the &amp;lt;var&amp;gt;encoding algorithm state&amp;lt;/var&amp;gt; is re-used from the previous call to &amp;lt;code&amp;gt;decode&amp;lt;/code&amp;gt; on this object. &lt;br /&gt;
# If the &amp;lt;var&amp;gt;options&amp;lt;/var&amp;gt; parameter is specified and the &amp;lt;code&amp;gt;stream&amp;lt;/code&amp;gt; option is &#039;&#039;&#039;true&#039;&#039;&#039;, then the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag is set; otherwise the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag is cleared.&lt;br /&gt;
# Run the decoder algorithm of the &#039;&#039;decoder object&#039;s&#039;&#039; encoding:&lt;br /&gt;
#* The input to the algorithm is a &amp;lt;var&amp;gt;byte stream&amp;lt;/var&amp;gt;. The &amp;lt;var&amp;gt;byte stream&amp;lt;/var&amp;gt; is provided by the bytes in &amp;lt;var&amp;gt;view.buffer&amp;lt;/var&amp;gt; starting at offset &amp;lt;var&amp;gt;view.byteOffset&amp;lt;/var&amp;gt;. A maximum of &amp;lt;var&amp;gt;view.byteLength&amp;lt;/var&amp;gt; bytes are yielded by the stream from &amp;lt;var&amp;gt;view.buffer&amp;lt;/var&amp;gt;. If &amp;lt;var&amp;gt;view&amp;lt;/var&amp;gt; is not specified, the stream is empty.&lt;br /&gt;
#* If the &amp;lt;var&amp;gt;options&amp;lt;/var&amp;gt; parameter is not specified or the &amp;lt;code&amp;gt;stream&amp;lt;/code&amp;gt; option is &#039;&#039;&#039;false&#039;&#039;&#039;, then after &amp;lt;var&amp;gt;view.byteLength&amp;lt;/var&amp;gt; bytes are yielded by the stream, the &#039;&#039;&#039;EOF byte&#039;&#039;&#039; is yielded.&lt;br /&gt;
#* If the internal &amp;lt;var&amp;gt;fatal&amp;lt;/var&amp;gt; flag of the &#039;&#039;decoder object&#039;&#039; is set, then a &#039;&#039;&#039;decoder error&#039;&#039;&#039; causes a &amp;lt;code&amp;gt;DOMException&amp;lt;/code&amp;gt; of type &amp;lt;code&amp;gt;EncodingError&amp;lt;/code&amp;gt; to be thrown rather than emitting a fallback code point.&lt;br /&gt;
#* The output of the algorithm is a &amp;lt;var&amp;gt;sequence of emitted code points&amp;lt;/var&amp;gt;.&lt;br /&gt;
# Return a &amp;lt;code&amp;gt;DOMString&amp;lt;/code&amp;gt; by encoding the &amp;lt;var&amp;gt;sequence of emitted code points&amp;lt;/var&amp;gt; as UTF-16 as per [http://dev.w3.org/2006/webapi/WebIDL/#idl-DOMString WebIDL].&lt;br /&gt;
&lt;br /&gt;
:&#039;&#039;ISSUE: Need to handle BOMs. [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding] specifies this by using the caller-specified encoding as a suggestion, and consuming the BOM as a part of selecting the real encoder where BOM takes precedence. At the very least a matching BOM should be ignored. A mismatching BOM could throw, be a decoding error, or could actually switch the decoder for this stream (call or sequence of calls), possibly if-and-only-if the constructor was called without an encoding.&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/dl&amp;gt;&lt;br /&gt;
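&lt;br /&gt;
For comparison with the BOM issue above, the API as later standardized removes a leading BOM that matches the encoding of the decoder unless an &amp;lt;code&amp;gt;ignoreBOM&amp;lt;/code&amp;gt; option is set, and it does not switch encodings when the BOM does not match. The sketch below assumes that later &amp;lt;code&amp;gt;TextDecoder&amp;lt;/code&amp;gt; (current browsers, Node.js) rather than this draft; the variable names are hypothetical.&lt;br /&gt;
&lt;br /&gt;
```javascript
// Assumption: a global TextDecoder as later standardized (current browsers, Node.js).
const withBom = new Uint8Array([0xef, 0xbb, 0xbf, 0x41]); // UTF-8 BOM followed by "A"

// By default a matching BOM is consumed and does not appear in the output.
const stripped = new TextDecoder("utf-8").decode(withBom);

// With ignoreBOM set, the BOM is kept in the output as U+FEFF.
const kept = new TextDecoder("utf-8", { ignoreBOM: true }).decode(withBom);

// A UTF-16LE BOM does not switch a UTF-8 decoder to another encoding; its
// bytes are simply malformed UTF-8 and decode to U+FFFD.
const mismatched = new TextDecoder("utf-8").decode(
  new Uint8Array([0xff, 0xfe, 0x41, 0x00]));
```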
&lt;br /&gt;
== Examples ==&lt;br /&gt;
&lt;br /&gt;
=== Example #1 - encoding strings ===&lt;br /&gt;
&lt;br /&gt;
The following example uses the API to encode an array of strings into an ArrayBuffer. The result is an ArrayBuffer containing the number of strings (as a Uint32), followed by the length of the first string (as a Uint32), the encoded string data, the length of the second string (as a Uint32), the string data, and so on.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
function encodeArrayOfStrings(strings, encoding) {&lt;br /&gt;
  var encoder, encoded, len, i, bytes, view, offset;&lt;br /&gt;
&lt;br /&gt;
  encoder = new TextEncoder(encoding);&lt;br /&gt;
  encoded = [];&lt;br /&gt;
&lt;br /&gt;
  len = Uint32Array.BYTES_PER_ELEMENT;&lt;br /&gt;
  for (i = 0; i &amp;lt; strings.length; i += 1) {&lt;br /&gt;
    len += Uint32Array.BYTES_PER_ELEMENT;&lt;br /&gt;
    encoded[i] = encoder.encode(strings[i]);&lt;br /&gt;
    len += encoded[i].byteLength;&lt;br /&gt;
  }&lt;br /&gt;
&lt;br /&gt;
  bytes = new Uint8Array(len);&lt;br /&gt;
  view = new DataView(bytes.buffer);&lt;br /&gt;
  offset = 0;&lt;br /&gt;
&lt;br /&gt;
  view.setUint32(offset, strings.length);&lt;br /&gt;
  offset += Uint32Array.BYTES_PER_ELEMENT;&lt;br /&gt;
  for (i = 0; i &amp;lt; encoded.length; i += 1) {&lt;br /&gt;
    len = encoded[i].byteLength;&lt;br /&gt;
    view.setUint32(offset, len);&lt;br /&gt;
    offset += Uint32Array.BYTES_PER_ELEMENT;&lt;br /&gt;
    bytes.set(encoded[i], offset);&lt;br /&gt;
    offset += len;&lt;br /&gt;
  }&lt;br /&gt;
  return bytes.buffer;&lt;br /&gt;
}&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Example #2 - decoding strings ===&lt;br /&gt;
&lt;br /&gt;
The following example decodes an ArrayBuffer containing data encoded in the format produced by the previous example back into an array of strings.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
function decodeArrayOfStrings(buffer, encoding) {&lt;br /&gt;
  var decoder, view, offset, num_strings, strings, i, len;&lt;br /&gt;
&lt;br /&gt;
  decoder = new TextDecoder(encoding);&lt;br /&gt;
  view = new DataView(buffer);&lt;br /&gt;
  offset = 0;&lt;br /&gt;
  strings = [];&lt;br /&gt;
&lt;br /&gt;
  num_strings = view.getUint32(offset);&lt;br /&gt;
  offset += Uint32Array.BYTES_PER_ELEMENT;&lt;br /&gt;
  for (i = 0; i &amp;lt; num_strings; i += 1) {&lt;br /&gt;
    len = view.getUint32(offset);&lt;br /&gt;
    offset += Uint32Array.BYTES_PER_ELEMENT;&lt;br /&gt;
    strings[i] = decoder.decode(&lt;br /&gt;
      new DataView(view.buffer, offset, len));&lt;br /&gt;
    offset += len;&lt;br /&gt;
  }&lt;br /&gt;
  return strings;&lt;br /&gt;
}&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Encodings ==&lt;br /&gt;
&lt;br /&gt;
Encodings are defined and implemented per [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding]. This implicitly includes the steps to &amp;lt;em&amp;gt;get an encoding&amp;lt;/em&amp;gt; from a string, and the logic for label matching and case-insensitivity.&lt;br /&gt;
&lt;br /&gt;
User agents MUST NOT support any other encodings or labels than those defined in [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding], and MUST support all encodings and labels defined in that specification.&lt;br /&gt;
&lt;br /&gt;
:&#039;&#039;NOTE: In [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding], &amp;quot;ascii&amp;quot; is a label for &#039;&#039;&#039;windows-1252&#039;&#039;&#039;; there is no 7-bit-or-raise-exception encoding. Applications that are required to restrict the content of decoded strings should implement validation after decoding.&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
:&#039;&#039;NOTE: Unicode normalization forms are outside the scope of this specification. No normalization is done prior to encoding or after decoding.&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
:&#039;&#039;NOTE: Handling of encoding-specific issues, e.g. over-long UTF-8 encodings, byte order marks, unmatched surrogate pairs, and so on is defined by [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding].&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
== References ==&lt;br /&gt;
&lt;br /&gt;
* WebIDL http://dev.w3.org/2006/webapi/WebIDL&lt;br /&gt;
* Encoding http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== Acknowledgements ==&lt;br /&gt;
&lt;br /&gt;
* Alan Chaney&lt;br /&gt;
* Ben Noordhuis&lt;br /&gt;
* Glenn Maynard&lt;br /&gt;
* John Tamplin&lt;br /&gt;
* Kenneth Russell (Google, Inc)&lt;br /&gt;
* Robert Mustacchi&lt;br /&gt;
* Ryan Dahl&lt;br /&gt;
* Anne van Kesteren&lt;br /&gt;
&lt;br /&gt;
== Appendix ==&lt;br /&gt;
&lt;br /&gt;
A &amp;quot;shim&amp;quot; implementation in JavaScript (that may not fully match the current version of the spec) plus some initial unit tests can be found at:&lt;br /&gt;
&lt;br /&gt;
:http://code.google.com/p/stringencoding/&lt;br /&gt;
&lt;br /&gt;
[[Category:Proposals]]&lt;/div&gt;</summary>
		<author><name>Jsbell</name></author>
	</entry>
	<entry>
		<id>https://wiki.whatwg.org/index.php?title=StringEncoding&amp;diff=8282</id>
		<title>StringEncoding</title>
		<link rel="alternate" type="text/html" href="https://wiki.whatwg.org/index.php?title=StringEncoding&amp;diff=8282"/>
		<updated>2012-06-08T17:01:47Z</updated>

		<summary type="html">&lt;p&gt;Jsbell: /* Open Issues */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;Proposed Text Encoding Web API for Typed Arrays&lt;br /&gt;
&lt;br /&gt;
== Editors ==&lt;br /&gt;
&lt;br /&gt;
* Joshua Bell (Google, Inc)&lt;br /&gt;
&lt;br /&gt;
== Abstract ==&lt;br /&gt;
&lt;br /&gt;
This specification defines an API for encoding strings to binary data, and decoding strings from binary data.&lt;br /&gt;
&lt;br /&gt;
NOTE: This specification intentionally does not address the opposite scenario of encoding binary data as strings and decoding binary data from strings, for example using Base64 encoding.&lt;br /&gt;
&lt;br /&gt;
Discussion on this topic has so far taken place on the [http://www.khronos.org/webgl/public-mailing-list/ public_webgl@khronos.org] mailing list. See http://www.khronos.org/webgl/public-mailing-list/archives/1111/msg00017.html for the initial discussion thread.&lt;br /&gt;
&lt;br /&gt;
Discussion has since moved to the [http://www.whatwg.org/mailing-list#specs WHATWG spec discussion] mailing list. See http://lists.whatwg.org/htdig.cgi/whatwg-whatwg.org/2012-March/035038.html for the latest discussion thread.&lt;br /&gt;
&lt;br /&gt;
== Open Issues ==&lt;br /&gt;
&lt;br /&gt;
* Should this be a standalone API (as written), or live on e.g. DataView, or on e.g. String?&lt;br /&gt;
* Encoding errors&lt;br /&gt;
** Define options dict for &amp;lt;code&amp;gt;TextEncoder()&amp;lt;/code&amp;gt; that allows selecting between throw vs. replacement character / replacement callback?&lt;br /&gt;
* The [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding specification] defines the byte order mark as more authoritative than an explicit encoding label when decoding content. &lt;br /&gt;
** Is that desirable here? &lt;br /&gt;
** What does it mean to e.g. write TextDecoder(&#039;iso-2022-kr&#039;).decode(view) if the data in the view turns out to be UTF-8?&lt;br /&gt;
** What about streaming and other re-uses of the same decoder object?&lt;br /&gt;
* Remove &#039;&#039;binary&#039;&#039; encoding? The only real use is with &amp;lt;code&amp;gt;atob()&amp;lt;/code&amp;gt;/&amp;lt;code&amp;gt;btoa()&amp;lt;/code&amp;gt;; a better API would be Base64 directly in/out of Typed Arrays&lt;br /&gt;
* Streaming decode/encode requires keeping buffers between calls.&lt;br /&gt;
** Some encode/decode algorithms require adjusting the &amp;lt;var&amp;gt;code point pointer&amp;lt;/var&amp;gt; or &amp;lt;var&amp;gt;byte pointer&amp;lt;/var&amp;gt; by a negative amount. This could occur across &amp;quot;chunk&amp;quot; boundaries. This implies that when the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag is set on an encoder/decoder, the last N elements of the stream are saved for the next call and used as a prefix for the stream, where N is defined by the specific encoding algorithm.&lt;br /&gt;
&lt;br /&gt;
== Resolved Issues ==&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dl&amp;gt;&lt;br /&gt;
&amp;lt;dt&amp;gt;Encode as many characters as possible into a fixed-size buffer for transmission, and repeat starting with next unencoded character&lt;br /&gt;
&amp;lt;dd&amp;gt;Resolution: not for &amp;quot;v1&amp;quot; - can be implemented using this API, and breaking the string down by &amp;quot;character&amp;quot; is unlikely to be as obvious as it sounds - surrogate pairs, combining sequences, etc&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dt&amp;gt;Support two versions of encode; one which takes target buffer, and one which creates/returns a right-sized buffer&lt;br /&gt;
&amp;lt;dd&amp;gt;Resolution: See above, and wait for developer feedback.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dt&amp;gt;Allow arbitrary end byte sequences (e.g. &amp;lt;code&amp;gt;0xFF&amp;lt;/code&amp;gt; for UTF-8 strings)&lt;br /&gt;
&amp;lt;dd&amp;gt;Resolution: Add &amp;lt;code&amp;gt;indexOf&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;ArrayBufferView&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;/dl&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== API ==&lt;br /&gt;
&lt;br /&gt;
=== TextEncoder ===&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;WebIDL&#039;&#039;&#039;&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
dictionary TextEncodeOptions {&lt;br /&gt;
  boolean stream = false;&lt;br /&gt;
};&lt;br /&gt;
&lt;br /&gt;
[Constructor,&lt;br /&gt;
 Constructor(DOMString encoding)]&lt;br /&gt;
interface TextEncoder {&lt;br /&gt;
  readonly attribute DOMString encoding;&lt;br /&gt;
  ArrayBufferView encode(DOMString? string, optional TextEncodeOptions options);&lt;br /&gt;
};&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The constructor runs the following steps:&lt;br /&gt;
# If the constructor is called with no arguments, let &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt; be &amp;quot;&amp;lt;code&amp;gt;utf-8&amp;lt;/code&amp;gt;&amp;quot;. Otherwise, let &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt; be the value of the &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; argument.&lt;br /&gt;
# The constructor follows the &#039;&#039;&#039;steps to get an encoding&#039;&#039;&#039; from [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding], with &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt; as &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt;.&lt;br /&gt;
#* If the steps result in failure, throw an &amp;quot;&amp;lt;code&amp;gt;EncodingError&amp;lt;/code&amp;gt;&amp;quot; exception and terminate these steps.&lt;br /&gt;
#* Otherwise, set the encoder object&#039;s internal &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; property to the returned encoding.&lt;br /&gt;
# Initialize the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag of the encoder object to false. &lt;br /&gt;
# Initialize the internal &amp;lt;var&amp;gt;encoding algorithm state&amp;lt;/var&amp;gt; to the default values for the encoding &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dl&amp;gt;&lt;br /&gt;
&amp;lt;dt&amp;gt;&amp;lt;code&amp;gt;encoding&amp;lt;/code&amp;gt; of type DOMString, readonly&lt;br /&gt;
&amp;lt;dd&amp;gt;Returns the &amp;lt;var&amp;gt;Name&amp;lt;/var&amp;gt; of the encoder object&#039;s &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;, per [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding]. &lt;br /&gt;
&lt;br /&gt;
:Note that this may differ from the name of the encoding specified during the call to the constructor. For example, if the constructor is called with &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; of &amp;lt;code&amp;gt;&amp;quot;ascii&amp;quot;&amp;lt;/code&amp;gt; the &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; attribute of the &#039;&#039;encoder object&#039;&#039; would have the value &amp;lt;code&amp;gt;&amp;quot;windows-1252&amp;quot;&amp;lt;/code&amp;gt; as &amp;lt;code&amp;gt;&amp;quot;ascii&amp;quot;&amp;lt;/code&amp;gt; is a label for that encoding.&lt;br /&gt;
&amp;lt;/dl&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dl&amp;gt;&lt;br /&gt;
&amp;lt;dt&amp;gt;&amp;lt;code&amp;gt;encode&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;dd&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The &amp;lt;code&amp;gt;encode&amp;lt;/code&amp;gt; method runs these steps:&lt;br /&gt;
&lt;br /&gt;
# If the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag is not set, then reset the &amp;lt;var&amp;gt;encoding algorithm state&amp;lt;/var&amp;gt; to the default values for &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;. Otherwise, the &amp;lt;var&amp;gt;encoding algorithm state&amp;lt;/var&amp;gt; is re-used from the previous call to &amp;lt;code&amp;gt;encode&amp;lt;/code&amp;gt; on this object.&lt;br /&gt;
# If the &amp;lt;var&amp;gt;options&amp;lt;/var&amp;gt; parameter is specified and the &amp;lt;code&amp;gt;stream&amp;lt;/code&amp;gt; option is &#039;&#039;&#039;true&#039;&#039;&#039;, then the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag is set; otherwise the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag is cleared.&lt;br /&gt;
# Run the steps of the &amp;lt;var&amp;gt;encoding algorithm&amp;lt;/var&amp;gt;:&lt;br /&gt;
#* The input to the algorithm is a &amp;lt;var&amp;gt;stream of code points&amp;lt;/var&amp;gt;. The code units within the DOMString &amp;lt;var&amp;gt;string&amp;lt;/var&amp;gt; are interpreted as UTF-16 code units, to produce the &amp;lt;var&amp;gt;stream of code points&amp;lt;/var&amp;gt;; if &amp;lt;var&amp;gt;string&amp;lt;/var&amp;gt; is null, the &amp;lt;var&amp;gt;stream of code points&amp;lt;/var&amp;gt; is empty.&lt;br /&gt;
#*:&#039;&#039;ISSUE: Interpreting a DOMString as UTF-16 code units to yield a code point stream needs to be defined, including unpaired surrogates. [http://dev.w3.org/2006/webapi/WebIDL/#idl-DOMString WebIDL] only defines the reverse.&#039;&#039;&lt;br /&gt;
#* If the &amp;lt;var&amp;gt;options&amp;lt;/var&amp;gt; parameter is not specified or the &amp;lt;code&amp;gt;stream&amp;lt;/code&amp;gt; option is false, then after the final code point is yielded by the stream, the &#039;&#039;&#039;EOF code point&#039;&#039;&#039; is yielded.&lt;br /&gt;
#* The output of the algorithm is a &amp;lt;var&amp;gt;sequence of emitted bytes&amp;lt;/var&amp;gt;.&lt;br /&gt;
# Return a &amp;lt;code&amp;gt;Uint8Array&amp;lt;/code&amp;gt; object wrapping an &amp;lt;code&amp;gt;ArrayBuffer&amp;lt;/code&amp;gt; containing the &amp;lt;var&amp;gt;sequence of emitted bytes&amp;lt;/var&amp;gt; produced by the encoding algorithm.&lt;br /&gt;
&amp;lt;/dl&amp;gt;&lt;br /&gt;
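&lt;br /&gt;
The byte output described by the steps above can be illustrated with a minimal sketch. It assumes the &amp;lt;code&amp;gt;TextEncoder&amp;lt;/code&amp;gt; that later shipped in browsers and Node.js, which differs from this draft: it supports UTF-8 only, takes no constructor argument, and has no &amp;lt;code&amp;gt;stream&amp;lt;/code&amp;gt; option. The variable names are illustrative only.&lt;br /&gt;
&lt;br /&gt;
```javascript
// Assumption: the shipped UTF-8-only TextEncoder, not the draft API above.
const encoder = new TextEncoder();

// U+20AC EURO SIGN encodes to three UTF-8 bytes.
const euroBytes = Array.from(encoder.encode('\u20AC'));

// U+1F600 is stored in the string as a surrogate pair, but the pair is
// interpreted as one code point and encoded as four UTF-8 bytes.
const pairBytes = Array.from(encoder.encode('\uD83D\uDE00'));

// In the shipped API an unpaired surrogate is replaced with U+FFFD
// before encoding; the draft above leaves this case as an open ISSUE.
const loneBytes = Array.from(encoder.encode('\uD83D'));

console.log(euroBytes, pairBytes, loneBytes);
```
&lt;br /&gt;
The result of each call is a &amp;lt;code&amp;gt;Uint8Array&amp;lt;/code&amp;gt;; it is copied into a plain array here only to make the byte values easy to inspect.&lt;br /&gt;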
&lt;br /&gt;
=== TextDecoder ===&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;WebIDL&#039;&#039;&#039;&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
dictionary TextDecoderOptions {&lt;br /&gt;
  boolean fatal = false;&lt;br /&gt;
};&lt;br /&gt;
&lt;br /&gt;
dictionary TextDecodeOptions {&lt;br /&gt;
  boolean stream = false;&lt;br /&gt;
};&lt;br /&gt;
&lt;br /&gt;
[Constructor,&lt;br /&gt;
 Constructor(optional DOMString encoding, optional TextDecoderOptions options)]&lt;br /&gt;
interface TextDecoder {&lt;br /&gt;
  readonly attribute DOMString encoding;&lt;br /&gt;
  DOMString decode(optional ArrayBufferView view, optional TextDecodeOptions options);&lt;br /&gt;
};&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The constructor runs the following steps:&lt;br /&gt;
# The constructor creates a &#039;&#039;decoder object&#039;&#039;. It has the internal properties &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;, a &amp;lt;var&amp;gt;fatal&amp;lt;/var&amp;gt; flag which is initially unset, and a &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag which is initially unset.&lt;br /&gt;
# If called without an &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; argument, let &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt; be &amp;quot;&amp;lt;code&amp;gt;utf-8&amp;lt;/code&amp;gt;&amp;quot;. Otherwise, let &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt; be the value of the &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; argument.&lt;br /&gt;
# Run the &#039;&#039;&#039;steps to get an encoding&#039;&#039;&#039; from [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding], with &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt; as &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt;. &lt;br /&gt;
## If the steps result in failure, throw an &amp;quot;&amp;lt;code&amp;gt;EncodingError&amp;lt;/code&amp;gt;&amp;quot; exception and terminate these steps.&lt;br /&gt;
## Otherwise, set the &#039;&#039;decoder object&#039;s&#039;&#039; internal &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; property to the returned encoding.&lt;br /&gt;
# If the constructor is called with an &amp;lt;var&amp;gt;options&amp;lt;/var&amp;gt; argument, and the &amp;lt;var&amp;gt;fatal&amp;lt;/var&amp;gt; property of the dictionary is set, set the internal &amp;lt;var&amp;gt;fatal&amp;lt;/var&amp;gt; flag of the &#039;&#039;decoder object&#039;&#039;.&lt;br /&gt;
# Initialize the internal &amp;lt;var&amp;gt;encoding algorithm state&amp;lt;/var&amp;gt; to the default values for the encoding &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;.&lt;br /&gt;
# Return the &#039;&#039;decoder object&#039;&#039;.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dl&amp;gt;&lt;br /&gt;
&amp;lt;dt&amp;gt;&amp;lt;code&amp;gt;encoding&amp;lt;/code&amp;gt; of type DOMString, readonly&lt;br /&gt;
&amp;lt;dd&amp;gt;&lt;br /&gt;
Returns the &amp;lt;var&amp;gt;Name&amp;lt;/var&amp;gt; of the decoder object&#039;s &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;, per [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding].&lt;br /&gt;
&lt;br /&gt;
:Note that this may differ from the name of the encoding specified during the call to the constructor. For example, if the constructor is called with &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; of &amp;lt;code&amp;gt;&amp;quot;ascii&amp;quot;&amp;lt;/code&amp;gt; the &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; attribute of the &#039;&#039;decoder object&#039;&#039; would have the value &amp;lt;code&amp;gt;&amp;quot;windows-1252&amp;quot;&amp;lt;/code&amp;gt; as &amp;lt;code&amp;gt;&amp;quot;ascii&amp;quot;&amp;lt;/code&amp;gt; is a label for that encoding.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dt&amp;gt;&amp;lt;code&amp;gt;decode&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;dd&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The &amp;lt;code&amp;gt;decode&amp;lt;/code&amp;gt; method runs the following steps:&lt;br /&gt;
&lt;br /&gt;
# If the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag of the &#039;&#039;decoder object&#039;&#039; is not set, then reset the &amp;lt;var&amp;gt;encoding algorithm state&amp;lt;/var&amp;gt; to the default values for encoding &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;. Otherwise, the &amp;lt;var&amp;gt;encoding algorithm state&amp;lt;/var&amp;gt; is re-used from the previous call to &amp;lt;code&amp;gt;decode&amp;lt;/code&amp;gt; on this object. &lt;br /&gt;
# If the &amp;lt;var&amp;gt;options&amp;lt;/var&amp;gt; parameter is specified and the &amp;lt;var&amp;gt;stream&amp;lt;/var&amp;gt; option is &#039;&#039;&#039;true&#039;&#039;&#039;, then the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag is set; otherwise the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag is cleared.&lt;br /&gt;
# Run the decoder algorithm of the &#039;&#039;decoder object&#039;s&#039;&#039; encoding:&lt;br /&gt;
#* The input to the algorithm is a &amp;lt;var&amp;gt;byte stream&amp;lt;/var&amp;gt;. The &amp;lt;var&amp;gt;byte stream&amp;lt;/var&amp;gt; is provided by the bytes in &amp;lt;var&amp;gt;view.buffer&amp;lt;/var&amp;gt; starting at offset &amp;lt;var&amp;gt;view.byteOffset&amp;lt;/var&amp;gt;. A maximum of &amp;lt;var&amp;gt;view.byteLength&amp;lt;/var&amp;gt; bytes are yielded by the stream from &amp;lt;var&amp;gt;view.buffer&amp;lt;/var&amp;gt;. If &amp;lt;var&amp;gt;view&amp;lt;/var&amp;gt; is not specified, the stream is empty.&lt;br /&gt;
#* If the &amp;lt;var&amp;gt;options&amp;lt;/var&amp;gt; parameter is not specified or the &#039;&#039;&#039;stream&#039;&#039;&#039; option is &#039;&#039;&#039;false&#039;&#039;&#039;, then after &amp;lt;var&amp;gt;view.byteLength&amp;lt;/var&amp;gt; bytes are yielded by the stream, the &#039;&#039;&#039;EOF byte&#039;&#039;&#039; is yielded.&lt;br /&gt;
#* If the internal &amp;lt;var&amp;gt;fatal&amp;lt;/var&amp;gt; flag of the &#039;&#039;decoder object&#039;&#039; is set, then a &#039;&#039;&#039;decoder error&#039;&#039;&#039; causes a &amp;lt;code&amp;gt;DOMException&amp;lt;/code&amp;gt; of type &amp;lt;code&amp;gt;EncodingError&amp;lt;/code&amp;gt; to be thrown rather than emitting a fallback code point.&lt;br /&gt;
#* The output of the algorithm is a sequence of &amp;lt;var&amp;gt;emitted code points&amp;lt;/var&amp;gt;.&lt;br /&gt;
# Return a &amp;lt;code&amp;gt;DOMString&amp;lt;/code&amp;gt; by encoding the &amp;lt;var&amp;gt;sequence of emitted code points&amp;lt;/var&amp;gt; as UTF-16 as per [http://dev.w3.org/2006/webapi/WebIDL/#idl-DOMString WebIDL].&lt;br /&gt;
&lt;br /&gt;
:&#039;&#039;ISSUE: Need to handle BOMs. [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding] specifies this by using the caller-specified encoding as a suggestion, and consuming the BOM as a part of selecting the real encoder where BOM takes precedence. At the very least a matching BOM should be ignored. A mismatching BOM could throw, be a decoding error, or could actually switch the decoder for this stream (call or sequence of calls), possibly if-and-only-if the constructor was called without an encoding.&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/dl&amp;gt;&lt;br /&gt;
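&lt;br /&gt;
The effect of the &amp;lt;code&amp;gt;stream&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;fatal&amp;lt;/code&amp;gt; options can be sketched as follows. The sketch assumes the &amp;lt;code&amp;gt;TextDecoder&amp;lt;/code&amp;gt; that later shipped in browsers and Node.js, whose options are taken here to behave like the ones in this draft; one known difference is that the shipped API throws a &amp;lt;code&amp;gt;TypeError&amp;lt;/code&amp;gt; rather than an &amp;lt;code&amp;gt;EncodingError&amp;lt;/code&amp;gt;. The helper name &amp;lt;code&amp;gt;decodeChunks&amp;lt;/code&amp;gt; is hypothetical.&lt;br /&gt;
&lt;br /&gt;
```javascript
// Hypothetical helper: decode a sequence of byte chunks as one stream.
function decodeChunks(chunks, label) {
  const decoder = new TextDecoder(label);
  let text = '';
  for (const chunk of chunks) {
    // stream: true keeps incomplete byte sequences buffered for the next call.
    text += decoder.decode(chunk, { stream: true });
  }
  // A final call without the stream option signals end of input and flushes.
  text += decoder.decode();
  return text;
}

// U+20AC in UTF-8, split so that the first chunk ends mid-character.
const euro = new Uint8Array([0xE2, 0x82, 0xAC]);
const streamed = decodeChunks([euro.subarray(0, 1), euro.subarray(1)], 'utf-8');

// Without the fatal flag, an invalid byte decodes to the fallback U+FFFD.
const lenient = new TextDecoder('utf-8').decode(new Uint8Array([0xFF]));

// With the fatal flag, the same input throws instead.
let fatalThrew = false;
try {
  new TextDecoder('utf-8', { fatal: true }).decode(new Uint8Array([0xFF]));
} catch (e) {
  fatalThrew = true;
}

console.log(streamed, lenient, fatalThrew);
```
&lt;br /&gt;
Decoding the two chunks independently, without the &amp;lt;code&amp;gt;stream&amp;lt;/code&amp;gt; option, would treat each chunk as a complete input and yield fallback code points instead of the single character.&lt;br /&gt;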
&lt;br /&gt;
== Examples ==&lt;br /&gt;
&lt;br /&gt;
=== Example #1 - encoding strings ===&lt;br /&gt;
&lt;br /&gt;
The following example uses the API to encode an array of strings into an ArrayBuffer. The result is an ArrayBuffer containing the number of strings (as a Uint32), followed by the byte length of the first encoded string (as a Uint32), the encoded string data, the byte length of the second encoded string (as a Uint32), the string data, and so on.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
function encodeArrayOfStrings(strings, encoding) {&lt;br /&gt;
  var encoder, encoded, len, i, bytes, view, offset;&lt;br /&gt;
&lt;br /&gt;
  encoder = new TextEncoder(encoding);&lt;br /&gt;
  encoded = [];&lt;br /&gt;
&lt;br /&gt;
  len = Uint32Array.BYTES_PER_ELEMENT;&lt;br /&gt;
  for (i = 0; i &amp;lt; strings.length; i += 1) {&lt;br /&gt;
    len += Uint32Array.BYTES_PER_ELEMENT;&lt;br /&gt;
    encoded[i] = encoder.encode(strings[i]);&lt;br /&gt;
    len += encoded[i].byteLength;&lt;br /&gt;
  }&lt;br /&gt;
&lt;br /&gt;
  bytes = new Uint8Array(len);&lt;br /&gt;
  view = new DataView(bytes.buffer);&lt;br /&gt;
  offset = 0;&lt;br /&gt;
&lt;br /&gt;
  view.setUint32(offset, strings.length);&lt;br /&gt;
  offset += Uint32Array.BYTES_PER_ELEMENT;&lt;br /&gt;
  for (i = 0; i &amp;lt; encoded.length; i += 1) {&lt;br /&gt;
    len = encoded[i].byteLength;&lt;br /&gt;
    view.setUint32(offset, len);&lt;br /&gt;
    offset += Uint32Array.BYTES_PER_ELEMENT;&lt;br /&gt;
    bytes.set(encoded[i], offset);&lt;br /&gt;
    offset += len;&lt;br /&gt;
  }&lt;br /&gt;
  return bytes.buffer;&lt;br /&gt;
}&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Example #2 - decoding strings ===&lt;br /&gt;
&lt;br /&gt;
The following example decodes an ArrayBuffer containing data encoded in the format produced by the previous example back into an array of strings.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
function decodeArrayOfStrings(buffer, encoding) {&lt;br /&gt;
  var decoder, view, offset, num_strings, strings, i, len;&lt;br /&gt;
&lt;br /&gt;
  decoder = new TextDecoder(encoding);&lt;br /&gt;
  view = new DataView(buffer);&lt;br /&gt;
  offset = 0;&lt;br /&gt;
  strings = [];&lt;br /&gt;
&lt;br /&gt;
  num_strings = view.getUint32(offset);&lt;br /&gt;
  offset += Uint32Array.BYTES_PER_ELEMENT;&lt;br /&gt;
  for (i = 0; i &amp;lt; num_strings; i += 1) {&lt;br /&gt;
    len = view.getUint32(offset);&lt;br /&gt;
    offset += Uint32Array.BYTES_PER_ELEMENT;&lt;br /&gt;
    strings[i] = decoder.decode(&lt;br /&gt;
      new DataView(view.buffer, offset, len));&lt;br /&gt;
    offset += len;&lt;br /&gt;
  }&lt;br /&gt;
  return strings;&lt;br /&gt;
}&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
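&lt;br /&gt;
A round trip through the length-prefixed format used by the two examples can be sketched as below. This is a compact variant, not the code above: it assumes the &amp;lt;code&amp;gt;TextEncoder&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;TextDecoder&amp;lt;/code&amp;gt; that later shipped in browsers and Node.js, so it is limited to UTF-8, and the names &amp;lt;code&amp;gt;packStrings&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;unpackStrings&amp;lt;/code&amp;gt; are hypothetical.&lt;br /&gt;
&lt;br /&gt;
```javascript
// Hypothetical helper: count, then (byte length, UTF-8 bytes) per string.
function packStrings(strings) {
  const encoder = new TextEncoder(); // assumption: shipped API, UTF-8 only
  const encoded = strings.map((s) => encoder.encode(s));
  // 4 bytes for the count plus 4 bytes of length prefix per string.
  const total = encoded.reduce((sum, e) => sum + 4 + e.byteLength, 4);
  const bytes = new Uint8Array(total);
  const view = new DataView(bytes.buffer);
  let offset = 0;
  view.setUint32(offset, encoded.length); // DataView defaults to big-endian
  offset += 4;
  for (const chunk of encoded) {
    view.setUint32(offset, chunk.byteLength);
    offset += 4;
    bytes.set(chunk, offset);
    offset += chunk.byteLength;
  }
  return bytes.buffer;
}

// Hypothetical helper: inverse of packStrings.
function unpackStrings(buffer) {
  const decoder = new TextDecoder('utf-8');
  const view = new DataView(buffer);
  const count = view.getUint32(0);
  const strings = [];
  let offset = 4;
  while (strings.length !== count) {
    const len = view.getUint32(offset);
    offset += 4;
    strings.push(decoder.decode(new Uint8Array(buffer, offset, len)));
    offset += len;
  }
  return strings;
}

const roundTripped = unpackStrings(packStrings(['hello', 'na\u00EFve', '']));
console.log(roundTripped);
```
&lt;br /&gt;
Because both sides read and write the prefixes through a &amp;lt;code&amp;gt;DataView&amp;lt;/code&amp;gt; with its default byte order, the format does not depend on the endianness of the host platform.&lt;br /&gt;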
&lt;br /&gt;
== Encodings ==&lt;br /&gt;
&lt;br /&gt;
Encodings are defined and implemented per [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding]. This implicitly includes the steps to &amp;lt;em&amp;gt;get an encoding&amp;lt;/em&amp;gt; from a string, and the logic for label matching and case-insensitivity.&lt;br /&gt;
&lt;br /&gt;
User agents MUST NOT support any other encodings or labels than those defined in [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding], and MUST support all encodings and labels defined in that specification, with the additions defined below.&lt;br /&gt;
&lt;br /&gt;
:&#039;&#039;NOTE: In [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding], &amp;quot;ascii&amp;quot; is a label for &#039;&#039;&#039;windows-1252&#039;&#039;&#039;; there is no 7-bit-or-raise-exception encoding. Applications that are required to restrict the content of decoded strings should implement validation after decoding.&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
:&#039;&#039;NOTE: Unicode normalization forms are outside the scope of this specification. No normalization is done prior to encoding or after decoding.&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
:&#039;&#039;NOTE: Handling of encoding-specific issues, e.g. over-long UTF-8 encodings, byte order marks, unmatched surrogate pairs, and so on is defined by [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding].&#039;&#039;&lt;br /&gt;
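&lt;br /&gt;
The validation after decoding that the first note recommends could look like the following minimal sketch. It assumes the &amp;lt;code&amp;gt;TextDecoder&amp;lt;/code&amp;gt; that later shipped in browsers and Node.js and decodes as UTF-8 for portability; the function name &amp;lt;code&amp;gt;decodeSevenBit&amp;lt;/code&amp;gt; and the choice of &amp;lt;code&amp;gt;RangeError&amp;lt;/code&amp;gt; are illustrative assumptions, not part of this specification.&lt;br /&gt;
&lt;br /&gt;
```javascript
// Hypothetical helper: decode, then reject anything outside U+0000..U+007F.
function decodeSevenBit(bytes) {
  const text = new TextDecoder('utf-8').decode(bytes);
  // The decoder itself never rejects 8-bit input, so check the result.
  if (/[^\x00-\x7F]/.test(text)) {
    throw new RangeError('decoded text contains non-ASCII characters');
  }
  return text;
}

// Pure 7-bit input passes through unchanged.
const okText = decodeSevenBit(new Uint8Array([0x68, 0x69]));

// A byte above 0x7F produces a non-ASCII code point and is rejected.
let rejected = false;
try {
  decodeSevenBit(new Uint8Array([0x68, 0xE9]));
} catch (e) {
  rejected = true;
}

console.log(okText, rejected);
```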
&lt;br /&gt;
=== Additional Encodings ===&lt;br /&gt;
&lt;br /&gt;
The following additional encodings are defined by this specification. They are specific to the methods defined herein.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;table border=1 cellpadding=5&amp;gt;&lt;br /&gt;
&amp;lt;tr&amp;gt;&amp;lt;th&amp;gt;Name&amp;lt;th&amp;gt;Labels&lt;br /&gt;
&amp;lt;tr&amp;gt;&amp;lt;td&amp;gt;binary&amp;lt;td&amp;gt;&amp;quot;binary&amp;quot;&lt;br /&gt;
&amp;lt;/table&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== binary ====&lt;br /&gt;
&lt;br /&gt;
The &#039;&#039;&#039;binary&#039;&#039;&#039; encoding is a single-byte encoding where the input code point and output byte are identical for the range &#039;&#039;&#039;U+0000&#039;&#039;&#039; to &#039;&#039;&#039;U+00ff&#039;&#039;&#039;. &lt;br /&gt;
&lt;br /&gt;
:&#039;&#039;NOTE: This encoding is intended to allow interoperation with legacy code that encodes binary data in ECMAScript strings, for example the WindowBase64 methods [http://dev.w3.org/html5/spec-author-view/webappapis.html#dom-windowbase64-atob atob()]/[http://dev.w3.org/html5/spec-author-view/webappapis.html#dom-windowbase64-btoa btoa()]. It is recommended that new Web applications use Typed Arrays for transmission and storage of binary data.&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
If &#039;&#039;&#039;binary&#039;&#039;&#039; is selected as the encoding, then step 3 of the steps to &#039;&#039;decode a byte stream&#039;&#039; is skipped; no BOM detection is performed.&lt;br /&gt;
&lt;br /&gt;
:&#039;&#039;NOTE: If &amp;lt;code&amp;gt;&amp;quot;binary&amp;quot;&amp;lt;/code&amp;gt; is specified, byte order marks must be ignored.&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
The &#039;&#039;&#039;binary decoder&#039;&#039;&#039; is:&lt;br /&gt;
# Let &amp;lt;var&amp;gt;byte&amp;lt;/var&amp;gt; be byte pointer.&lt;br /&gt;
# If &amp;lt;var&amp;gt;byte&amp;lt;/var&amp;gt; is the EOF byte, emit the EOF code point.&lt;br /&gt;
# Increase the byte pointer by one.&lt;br /&gt;
# Emit a code point whose value is &amp;lt;var&amp;gt;byte&amp;lt;/var&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
The &#039;&#039;&#039;binary encoder&#039;&#039;&#039; is:&lt;br /&gt;
# Let &amp;lt;var&amp;gt;code point&amp;lt;/var&amp;gt; be the code point pointer.&lt;br /&gt;
# If &amp;lt;var&amp;gt;code point&amp;lt;/var&amp;gt; is the EOF code point, emit the EOF byte.&lt;br /&gt;
# Increase the code point pointer by one.&lt;br /&gt;
# If &amp;lt;var&amp;gt;code point&amp;lt;/var&amp;gt; is in the range U+0000 to U+00FF, emit a byte whose value is &amp;lt;var&amp;gt;code point&amp;lt;/var&amp;gt;.&lt;br /&gt;
# Otherwise, emit an encoder error.&lt;br /&gt;
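&lt;br /&gt;
The two algorithms above can be sketched in plain ECMAScript as follows. This is an illustration of the mapping only, not a conforming implementation: the function names &amp;lt;code&amp;gt;binaryDecode&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;binaryEncode&amp;lt;/code&amp;gt; are hypothetical, whole inputs are processed in one call, and an encoder error is modelled as a thrown &amp;lt;code&amp;gt;RangeError&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
```javascript
// Hypothetical helper: each byte becomes the code point of the same value.
function binaryDecode(bytes) {
  let text = '';
  for (const byte of bytes) {
    text += String.fromCharCode(byte);
  }
  return text;
}

// Hypothetical helper: each code point in U+0000..U+00FF becomes one byte.
function binaryEncode(string) {
  const out = [];
  for (const ch of string) { // string iteration proceeds by code point
    const codePoint = ch.codePointAt(0);
    if (codePoint > 0xFF) {
      // Assumption: an encoder error is surfaced as an exception.
      throw new RangeError('encoder error: code point above U+00FF');
    }
    out.push(codePoint);
  }
  return Uint8Array.from(out);
}

const sample = new Uint8Array([0x00, 0x41, 0xFF]);
const binaryText = binaryDecode(sample);
const binaryBytes = binaryEncode(binaryText);
console.log(binaryText.length, Array.from(binaryBytes));
```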
&lt;br /&gt;
== References ==&lt;br /&gt;
&lt;br /&gt;
* WebIDL http://dev.w3.org/2006/webapi/WebIDL&lt;br /&gt;
* Encoding http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== Acknowledgements ==&lt;br /&gt;
&lt;br /&gt;
* Alan Chaney&lt;br /&gt;
* Ben Noordhuis&lt;br /&gt;
* Glenn Maynard&lt;br /&gt;
* John Tamplin&lt;br /&gt;
* Kenneth Russell (Google, Inc)&lt;br /&gt;
* Robert Mustacchi&lt;br /&gt;
* Ryan Dahl&lt;br /&gt;
* Anne van Kesteren&lt;br /&gt;
&lt;br /&gt;
== Appendix ==&lt;br /&gt;
&lt;br /&gt;
A &amp;quot;shim&amp;quot; implementation in JavaScript (that may not fully match the current version of the spec) plus some initial unit tests can be found at:&lt;br /&gt;
&lt;br /&gt;
:http://code.google.com/p/stringencoding/&lt;br /&gt;
&lt;br /&gt;
[[Category:Proposals]]&lt;/div&gt;</summary>
		<author><name>Jsbell</name></author>
	</entry>
	<entry>
		<id>https://wiki.whatwg.org/index.php?title=StringEncoding&amp;diff=8281</id>
		<title>StringEncoding</title>
		<link rel="alternate" type="text/html" href="https://wiki.whatwg.org/index.php?title=StringEncoding&amp;diff=8281"/>
		<updated>2012-06-08T16:58:44Z</updated>

		<summary type="html">&lt;p&gt;Jsbell: /* TextEncoder */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;Proposed Text Encoding Web API for Typed Arrays&lt;br /&gt;
&lt;br /&gt;
== Editors ==&lt;br /&gt;
&lt;br /&gt;
* Joshua Bell (Google, Inc)&lt;br /&gt;
&lt;br /&gt;
== Abstract ==&lt;br /&gt;
&lt;br /&gt;
This specification defines an API for encoding strings to binary data, and decoding strings from binary data.&lt;br /&gt;
&lt;br /&gt;
NOTE: This specification intentionally does not address the opposite scenario of encoding binary data as strings and decoding binary data from strings, for example using Base64 encoding.&lt;br /&gt;
&lt;br /&gt;
Discussion on this topic has so far taken place on the [http://www.khronos.org/webgl/public-mailing-list/ public_webgl@khronos.org] mailing list. See http://www.khronos.org/webgl/public-mailing-list/archives/1111/msg00017.html for the initial discussion thread.&lt;br /&gt;
&lt;br /&gt;
Discussion has since moved to the [http://www.whatwg.org/mailing-list#specs WHATWG spec discussion] mailing list. See http://lists.whatwg.org/htdig.cgi/whatwg-whatwg.org/2012-March/035038.html for the latest discussion thread.&lt;br /&gt;
&lt;br /&gt;
== Open Issues ==&lt;br /&gt;
&lt;br /&gt;
* Should this be a standalone API (as written), or live on e.g. DataView, or on e.g. String?&lt;br /&gt;
* Encoding errors&lt;br /&gt;
** Define options dict for &amp;lt;code&amp;gt;TextEncoder()&amp;lt;/code&amp;gt; that allows selecting between throw vs. replacement character / replacement callback?&lt;br /&gt;
* The [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding specification] defines the byte order mark as more authoritative than an explicit encoding label when decoding content. &lt;br /&gt;
** Is that desirable here? &lt;br /&gt;
** What does it mean to e.g. write TextDecoder(&#039;iso-2022-kr&#039;).decode(view) if the data in the view turns out to be UTF-8?&lt;br /&gt;
** What about streaming and other re-uses of the same decoder object?&lt;br /&gt;
* Remove &#039;&#039;binary&#039;&#039; encoding? The only real use is with &amp;lt;code&amp;gt;atob()&amp;lt;/code&amp;gt;/&amp;lt;code&amp;gt;btoa()&amp;lt;/code&amp;gt;; a better API would be Base64 directly in/out of Typed Arrays&lt;br /&gt;
&lt;br /&gt;
== Resolved Issues ==&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dl&amp;gt;&lt;br /&gt;
&amp;lt;dt&amp;gt;Encode as many characters as possible into a fixed-size buffer for transmission, and repeat starting with next unencoded character&lt;br /&gt;
&amp;lt;dd&amp;gt;Resolution: not for &amp;quot;v1&amp;quot; - can be implemented using this API, and breaking the string down by &amp;quot;character&amp;quot; is unlikely to be as obvious as it sounds - surrogate pairs, combining sequences, etc&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dt&amp;gt;Support two versions of encode; one which takes target buffer, and one which creates/returns a right-sized buffer&lt;br /&gt;
&amp;lt;dd&amp;gt;Resolution: See above, and wait for developer feedback.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dt&amp;gt;Allow arbitrary end byte sequences (e.g. &amp;lt;code&amp;gt;0xFF&amp;lt;/code&amp;gt; for UTF-8 strings)&lt;br /&gt;
&amp;lt;dd&amp;gt;Resolution: Add &amp;lt;code&amp;gt;indexOf&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;ArrayBufferView&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;/dl&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== API ==&lt;br /&gt;
&lt;br /&gt;
=== TextEncoder ===&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;WebIDL&#039;&#039;&#039;&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
dictionary TextEncodeOptions {&lt;br /&gt;
  boolean stream = false;&lt;br /&gt;
};&lt;br /&gt;
&lt;br /&gt;
[Constructor,&lt;br /&gt;
 Constructor(DOMString encoding)]&lt;br /&gt;
interface TextEncoder {&lt;br /&gt;
  readonly attribute DOMString encoding;&lt;br /&gt;
  ArrayBufferView encode(DOMString? string, optional TextEncodeOptions options);&lt;br /&gt;
};&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The constructor runs the following steps:&lt;br /&gt;
# If the constructor is called with no arguments, let &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt; be &amp;quot;&amp;lt;code&amp;gt;utf-8&amp;lt;/code&amp;gt;&amp;quot;. Otherwise, let &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt; be the value of the &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; argument.&lt;br /&gt;
# The constructor follows the &#039;&#039;&#039;steps to get an encoding&#039;&#039;&#039; from [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding], with &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt; as &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt;.&lt;br /&gt;
#* If the steps result in failure, throw an &amp;quot;&amp;lt;code&amp;gt;EncodingError&amp;lt;/code&amp;gt;&amp;quot; exception and terminate these steps.&lt;br /&gt;
#* Otherwise, set the encoder object&#039;s internal &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; property to the returned encoding.&lt;br /&gt;
# Initialize the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag of the encoder object to false. &lt;br /&gt;
# Initialize the internal &amp;lt;var&amp;gt;encoding algorithm state&amp;lt;/var&amp;gt; to the default values for the encoding &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dl&amp;gt;&lt;br /&gt;
&amp;lt;dt&amp;gt;&amp;lt;code&amp;gt;encoding&amp;lt;/code&amp;gt; of type DOMString, readonly&lt;br /&gt;
&amp;lt;dd&amp;gt;Returns the &amp;lt;var&amp;gt;Name&amp;lt;/var&amp;gt; of the encoder object&#039;s &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;, per [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding]. &lt;br /&gt;
&lt;br /&gt;
:Note that this may differ from the name of the encoding specified during the call to the constructor. For example, if the constructor is called with &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; of &amp;lt;code&amp;gt;&amp;quot;ascii&amp;quot;&amp;lt;/code&amp;gt; the &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; attribute of the &#039;&#039;encoder object&#039;&#039; would have the value &amp;lt;code&amp;gt;&amp;quot;windows-1252&amp;quot;&amp;lt;/code&amp;gt; as &amp;lt;code&amp;gt;&amp;quot;ascii&amp;quot;&amp;lt;/code&amp;gt; is a label for that encoding.&lt;br /&gt;
&amp;lt;/dl&amp;gt;&lt;br /&gt;
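&lt;br /&gt;
As an illustration of the note above, the following minimal sketch resolves a label to an encoding name. The table &amp;lt;code&amp;gt;LABELS&amp;lt;/code&amp;gt; and the function &amp;lt;code&amp;gt;getEncodingName&amp;lt;/code&amp;gt; are hypothetical names introduced here, and the table lists only a handful of the labels defined in Encoding; a conforming implementation would follow the steps to get an encoding in full.&lt;br /&gt;

```javascript
// Hypothetical, partial label table; the Encoding specification defines many more labels.
const LABELS = {
  "utf-8": "utf-8",
  "utf8": "utf-8",
  "unicode-1-1-utf-8": "utf-8",
  "ascii": "windows-1252",
  "us-ascii": "windows-1252",
  "latin1": "windows-1252",
  "iso-8859-1": "windows-1252",
  "windows-1252": "windows-1252"
};

// Returns the encoding name for a label, or null when the label is unknown.
// trim() and toLowerCase() approximate the whitespace stripping and
// case-insensitive matching described by the specification.
function getEncodingName(label) {
  const key = String(label).trim().toLowerCase();
  return Object.prototype.hasOwnProperty.call(LABELS, key) ? LABELS[key] : null;
}

console.log(getEncodingName("ascii"));   // windows-1252
console.log(getEncodingName(" UTF8 "));  // utf-8
console.log(getEncodingName("klingon")); // null
```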
&lt;br /&gt;
&amp;lt;dl&amp;gt;&lt;br /&gt;
&amp;lt;dt&amp;gt;&amp;lt;code&amp;gt;encode&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;dd&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The &amp;lt;code&amp;gt;encode&amp;lt;/code&amp;gt; method runs these steps:&lt;br /&gt;
&lt;br /&gt;
# If the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag is not set, then reset the &amp;lt;var&amp;gt;encoding algorithm state&amp;lt;/var&amp;gt; to the default values for &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;. Otherwise, the &amp;lt;var&amp;gt;encoding algorithm state&amp;lt;/var&amp;gt; is re-used from the previous call to &amp;lt;code&amp;gt;encode&amp;lt;/code&amp;gt; on this object.&lt;br /&gt;
# If the &amp;lt;var&amp;gt;options&amp;lt;/var&amp;gt; parameter is specified and the &amp;lt;code&amp;gt;stream&amp;lt;/code&amp;gt; option is &#039;&#039;&#039;true&#039;&#039;&#039;, then the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag is set; otherwise the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag is cleared.&lt;br /&gt;
# Run the steps of the &amp;lt;var&amp;gt;encoding algorithm&amp;lt;/var&amp;gt;:&lt;br /&gt;
#* The input to the algorithm is a &amp;lt;var&amp;gt;stream of code points&amp;lt;/var&amp;gt;. The code units within the DOMString &amp;lt;var&amp;gt;string&amp;lt;/var&amp;gt; are interpreted as UTF-16 code units, to produce the &amp;lt;var&amp;gt;stream of code points&amp;lt;/var&amp;gt;; if &amp;lt;var&amp;gt;string&amp;lt;/var&amp;gt; is null, the &amp;lt;var&amp;gt;stream of code points&amp;lt;/var&amp;gt; is empty.&lt;br /&gt;
#*:&#039;&#039;ISSUE: Interpreting a DOMString as UTF-16 code units to yield a code point stream needs to be defined, including unpaired surrogates. [http://dev.w3.org/2006/webapi/WebIDL/#idl-DOMString WebIDL] only defines the reverse.&#039;&#039;&lt;br /&gt;
#* If the &amp;lt;var&amp;gt;options&amp;lt;/var&amp;gt; parameter is not specified or the &amp;lt;code&amp;gt;stream&amp;lt;/code&amp;gt; option is false, then after the final code point is yielded by the stream, the &#039;&#039;&#039;EOF code point&#039;&#039;&#039; is yielded.&lt;br /&gt;
#* The output of the algorithm is a &amp;lt;var&amp;gt;sequence of emitted bytes&amp;lt;/var&amp;gt;.&lt;br /&gt;
# Return a &amp;lt;code&amp;gt;Uint8Array&amp;lt;/code&amp;gt; object wrapping an &amp;lt;code&amp;gt;ArrayBuffer&amp;lt;/code&amp;gt; containing the &amp;lt;var&amp;gt;sequence of emitted bytes&amp;lt;/var&amp;gt; produced by the encoding algorithm.&lt;br /&gt;
&amp;lt;/dl&amp;gt;&lt;br /&gt;
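&lt;br /&gt;
A minimal usage sketch follows. It assumes an environment that exposes a &amp;lt;code&amp;gt;TextEncoder&amp;lt;/code&amp;gt; global compatible with the interface above, and it relies only on the default &amp;lt;code&amp;gt;utf-8&amp;lt;/code&amp;gt; encoding; the &amp;lt;code&amp;gt;stream&amp;lt;/code&amp;gt; option and other encodings are not exercised, since an implementation may not support them.&lt;br /&gt;

```javascript
// Assumes a TextEncoder global is available (for example, in current browsers or Node.js).
const encoder = new TextEncoder(); // no argument: the label defaults to "utf-8"

// U+20AC EURO SIGN occupies three bytes in UTF-8: 0xE2 0x82 0xAC.
const bytes = encoder.encode("\u20AC");

console.log(encoder.encoding);            // utf-8
console.log(bytes instanceof Uint8Array); // true
console.log(Array.from(bytes));           // [ 226, 130, 172 ]
```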
&lt;br /&gt;
=== TextDecoder ===&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;WebIDL&#039;&#039;&#039;&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
dictionary TextDecoderOptions {&lt;br /&gt;
  boolean fatal = false;&lt;br /&gt;
};&lt;br /&gt;
&lt;br /&gt;
dictionary TextDecodeOptions {&lt;br /&gt;
  boolean stream = false;&lt;br /&gt;
};&lt;br /&gt;
&lt;br /&gt;
[Constructor,&lt;br /&gt;
 Constructor(optional DOMString encoding, optional TextDecoderOptions options)]&lt;br /&gt;
interface TextDecoder {&lt;br /&gt;
  readonly attribute DOMString encoding;&lt;br /&gt;
  DOMString decode(optional ArrayBufferView view, optional TextDecodeOptions options);&lt;br /&gt;
};&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The constructor runs the following steps:&lt;br /&gt;
# The constructor creates a &#039;&#039;decoder object&#039;&#039;. It has the internal properties &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;, a &amp;lt;var&amp;gt;fatal&amp;lt;/var&amp;gt; flag which is initially unset, and a &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag which is initially unset.&lt;br /&gt;
# If called without an &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; argument, let &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt; be &amp;quot;&amp;lt;code&amp;gt;utf-8&amp;lt;/code&amp;gt;&amp;quot;. Otherwise, let &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt; be the value of the &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; argument.&lt;br /&gt;
# Run the &#039;&#039;&#039;steps to get an encoding&#039;&#039;&#039; from [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding], with &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt; as &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt;. &lt;br /&gt;
## If the steps result in failure, throw an &amp;quot;&amp;lt;code&amp;gt;EncodingError&amp;lt;/code&amp;gt;&amp;quot; exception and terminate these steps.&lt;br /&gt;
## Otherwise, set the &#039;&#039;decoder object&#039;s&#039;&#039; internal &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; property to the returned encoding.&lt;br /&gt;
# If the constructor is called with an &amp;lt;var&amp;gt;options&amp;lt;/var&amp;gt; argument, and the &amp;lt;var&amp;gt;fatal&amp;lt;/var&amp;gt; property of the dictionary is set, set the internal &amp;lt;var&amp;gt;fatal&amp;lt;/var&amp;gt; flag of the &#039;&#039;decoder object&#039;&#039;.&lt;br /&gt;
# Initialize the internal &amp;lt;var&amp;gt;encoding algorithm state&amp;lt;/var&amp;gt; to the default values for the encoding &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;.&lt;br /&gt;
# Return the &#039;&#039;decoder object&#039;&#039;.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dl&amp;gt;&lt;br /&gt;
&amp;lt;dt&amp;gt;&amp;lt;code&amp;gt;encoding&amp;lt;/code&amp;gt; of type DOMString, readonly&lt;br /&gt;
&amp;lt;dd&amp;gt;&lt;br /&gt;
Returns the &amp;lt;var&amp;gt;Name&amp;lt;/var&amp;gt; of the decoder object&#039;s &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;, per [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding].&lt;br /&gt;
&lt;br /&gt;
:Note that this may differ from the name of the encoding specified during the call to the constructor. For example, if the constructor is called with &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; of &amp;lt;code&amp;gt;&amp;quot;ascii&amp;quot;&amp;lt;/code&amp;gt; the &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; attribute of the &#039;&#039;decoder object&#039;&#039; would have the value &amp;lt;code&amp;gt;&amp;quot;windows-1252&amp;quot;&amp;lt;/code&amp;gt; as &amp;lt;code&amp;gt;&amp;quot;ascii&amp;quot;&amp;lt;/code&amp;gt; is a label for that encoding.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dt&amp;gt;&amp;lt;code&amp;gt;decode&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;dd&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The &amp;lt;code&amp;gt;decode&amp;lt;/code&amp;gt; method runs the following steps:&lt;br /&gt;
&lt;br /&gt;
# If the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag of the &#039;&#039;decoder object&#039;&#039; is not set, then reset the &amp;lt;var&amp;gt;encoding algorithm state&amp;lt;/var&amp;gt; to the default values for encoding &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;. Otherwise, the &amp;lt;var&amp;gt;encoding algorithm state&amp;lt;/var&amp;gt; is re-used from the previous call to &amp;lt;code&amp;gt;decode&amp;lt;/code&amp;gt; on this object. &lt;br /&gt;
# If the &amp;lt;var&amp;gt;options&amp;lt;/var&amp;gt; parameter is specified and the &amp;lt;var&amp;gt;stream&amp;lt;/var&amp;gt; option is &#039;&#039;&#039;true&#039;&#039;&#039;, then the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag is set; otherwise the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag is cleared.&lt;br /&gt;
# Run the decoder algorithm of the &#039;&#039;decoder object&#039;s&#039;&#039; encoding:&lt;br /&gt;
#* The input to the algorithm is a &amp;lt;var&amp;gt;byte stream&amp;lt;/var&amp;gt;. The &amp;lt;var&amp;gt;byte stream&amp;lt;/var&amp;gt; is provided by the bytes in &amp;lt;var&amp;gt;view.buffer&amp;lt;/var&amp;gt; starting at offset &amp;lt;var&amp;gt;view.byteOffset&amp;lt;/var&amp;gt;. A maximum of &amp;lt;var&amp;gt;view.byteLength&amp;lt;/var&amp;gt; bytes are yielded by the stream from &amp;lt;var&amp;gt;view.buffer&amp;lt;/var&amp;gt;. If &amp;lt;var&amp;gt;view&amp;lt;/var&amp;gt; is not specified, the stream is empty.&lt;br /&gt;
#* If the &amp;lt;var&amp;gt;options&amp;lt;/var&amp;gt; parameter is not specified or the &#039;&#039;&#039;stream&#039;&#039;&#039; option is &#039;&#039;&#039;false&#039;&#039;&#039;, then after &amp;lt;var&amp;gt;view.byteLength&amp;lt;/var&amp;gt; bytes are yielded by the stream, the &#039;&#039;&#039;EOF byte&#039;&#039;&#039; is yielded.&lt;br /&gt;
#* If the internal &amp;lt;var&amp;gt;fatal&amp;lt;/var&amp;gt; flag of the &#039;&#039;decoder object&#039;&#039; is set, then a &#039;&#039;&#039;decoder error&#039;&#039;&#039; causes a &amp;lt;code&amp;gt;DOMException&amp;lt;/code&amp;gt; of type &amp;lt;code&amp;gt;EncodingError&amp;lt;/code&amp;gt; to be thrown rather than emitting a fallback code point.&lt;br /&gt;
#* The output of the algorithm is a sequence of &amp;lt;var&amp;gt;emitted code points&amp;lt;/var&amp;gt;.&lt;br /&gt;
# Return a &amp;lt;code&amp;gt;DOMString&amp;lt;/code&amp;gt; by encoding the &amp;lt;var&amp;gt;sequence of emitted code points&amp;lt;/var&amp;gt; as UTF-16 as per [http://dev.w3.org/2006/webapi/WebIDL/#idl-DOMString WebIDL].&lt;br /&gt;
&lt;br /&gt;
:&#039;&#039;ISSUE: Need to handle BOMs. [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding] specifies this by using the caller-specified encoding as a suggestion, and consuming the BOM as a part of selecting the real encoder where BOM takes precedence. At the very least a matching BOM should be ignored. A mismatching BOM could throw, be a decoding error, or could actually switch the decoder for this stream (call or sequence of calls), possibly if-and-only-if the constructor was called without an encoding.&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/dl&amp;gt;&lt;br /&gt;
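&lt;br /&gt;
The effect of the &amp;lt;code&amp;gt;stream&amp;lt;/code&amp;gt; option and the &amp;lt;code&amp;gt;fatal&amp;lt;/code&amp;gt; flag can be sketched as follows, assuming an environment that exposes a &amp;lt;code&amp;gt;TextDecoder&amp;lt;/code&amp;gt; global compatible with the interface above. The exception type thrown by an implementation may differ from the &amp;lt;code&amp;gt;EncodingError&amp;lt;/code&amp;gt; named in this draft, so the sketch only checks that an exception occurs.&lt;br /&gt;

```javascript
// Assumes a TextDecoder global is available (for example, in current browsers or Node.js).
const euro = [0xE2, 0x82, 0xAC]; // UTF-8 bytes for U+20AC EURO SIGN

// Streaming: the first call ends in the middle of a character, so nothing is
// emitted yet; the decoder keeps its state until the next call.
const streaming = new TextDecoder("utf-8");
const first = streaming.decode(new Uint8Array(euro.slice(0, 2)), { stream: true });
const second = streaming.decode(new Uint8Array(euro.slice(2)));
console.log(JSON.stringify(first)); // ""
console.log(second === "\u20AC");   // true

// Non-fatal (default): an invalid byte becomes the U+FFFD replacement character.
const lenient = new TextDecoder("utf-8");
console.log(lenient.decode(new Uint8Array([0xFF])) === "\uFFFD"); // true

// Fatal: the same input throws instead of emitting a fallback code point.
const strict = new TextDecoder("utf-8", { fatal: true });
let threw = false;
try {
  strict.decode(new Uint8Array([0xFF]));
} catch (e) {
  threw = true; // the exception type is implementation-dependent here
}
console.log(threw); // true
```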
&lt;br /&gt;
== Examples ==&lt;br /&gt;
&lt;br /&gt;
=== Example #1 - encoding strings ===&lt;br /&gt;
&lt;br /&gt;
The following example uses the API to encode an array of strings into an ArrayBuffer. The result is an ArrayBuffer containing the number of strings (as a Uint32), followed by the length in bytes of the first encoded string (as a Uint32), the encoded string data (UTF-8 unless another encoding is given), the length of the second string (as a Uint32), the string data, and so on.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
function encodeArrayOfStrings(strings, encoding) {&lt;br /&gt;
  var encoder, encoded, len, i, bytes, view, offset;&lt;br /&gt;
&lt;br /&gt;
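  // Reuse a single encoder; without the stream option each call starts from a fresh state.&lt;br /&gt;
  encoder = new TextEncoder(encoding);&lt;br /&gt;
  encoded = [];&lt;br /&gt;
&lt;br /&gt;
  // First pass: encode each string and total the bytes needed, including the length prefixes.&lt;br /&gt;
  len = Uint32Array.BYTES_PER_ELEMENT;&lt;br /&gt;
  for (i = 0; i &amp;lt; strings.length; i += 1) {&lt;br /&gt;
    len += Uint32Array.BYTES_PER_ELEMENT;&lt;br /&gt;
    encoded[i] = encoder.encode(strings[i]);&lt;br /&gt;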
    len += encoded[i].byteLength;&lt;br /&gt;
  }&lt;br /&gt;
&lt;br /&gt;
  bytes = new Uint8Array(len);&lt;br /&gt;
  view = new DataView(bytes.buffer);&lt;br /&gt;
  offset = 0;&lt;br /&gt;
&lt;br /&gt;
  view.setUint32(offset, strings.length);&lt;br /&gt;
  offset += Uint32Array.BYTES_PER_ELEMENT;&lt;br /&gt;
  for (i = 0; i &amp;lt; encoded.length; i += 1) {&lt;br /&gt;
    len = encoded[i].byteLength;&lt;br /&gt;
    view.setUint32(offset, len);&lt;br /&gt;
    offset += Uint32Array.BYTES_PER_ELEMENT;&lt;br /&gt;
    bytes.set(encoded[i], offset);&lt;br /&gt;
    offset += len;&lt;br /&gt;
  }&lt;br /&gt;
  return bytes.buffer;&lt;br /&gt;
}&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Example #2 - decoding strings ===&lt;br /&gt;
&lt;br /&gt;
The following example decodes an ArrayBuffer containing data encoded in the format produced by the previous example back into an array of strings.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
function decodeArrayOfStrings(buffer, encoding) {&lt;br /&gt;
  var decoder, view, offset, num_strings, strings, i, len;&lt;br /&gt;
&lt;br /&gt;
  decoder = new TextDecoder(encoding);&lt;br /&gt;
  view = new DataView(buffer);&lt;br /&gt;
  offset = 0;&lt;br /&gt;
  strings = [];&lt;br /&gt;
&lt;br /&gt;
  num_strings = view.getUint32(offset);&lt;br /&gt;
  offset += Uint32Array.BYTES_PER_ELEMENT;&lt;br /&gt;
  for (i = 0; i &amp;lt; num_strings; i += 1) {&lt;br /&gt;
    len = view.getUint32(offset);&lt;br /&gt;
    offset += Uint32Array.BYTES_PER_ELEMENT;&lt;br /&gt;
    strings[i] = decoder.decode(&lt;br /&gt;
      new DataView(view.buffer, offset, len));&lt;br /&gt;
    offset += len;&lt;br /&gt;
  }&lt;br /&gt;
  return strings;&lt;br /&gt;
}&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
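&lt;br /&gt;
The two examples can be checked against each other with a round trip. The sketch below restates both functions compactly and restricts them to UTF-8, on the assumption that &amp;lt;code&amp;gt;TextEncoder&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;TextDecoder&amp;lt;/code&amp;gt; globals are available; the sample strings are illustrative only.&lt;br /&gt;

```javascript
// Compact restatements of the two examples above, restricted to UTF-8.
function encodeArrayOfStrings(strings) {
  const encoder = new TextEncoder();
  const encoded = strings.map(function (s) { return encoder.encode(s); });
  // One Uint32 for the count, then one Uint32 length prefix per string plus its bytes.
  let len = Uint32Array.BYTES_PER_ELEMENT;
  encoded.forEach(function (e) { len += Uint32Array.BYTES_PER_ELEMENT + e.byteLength; });

  const bytes = new Uint8Array(len);
  const view = new DataView(bytes.buffer);
  let offset = 0;
  view.setUint32(offset, strings.length);
  offset += Uint32Array.BYTES_PER_ELEMENT;
  encoded.forEach(function (e) {
    view.setUint32(offset, e.byteLength);
    offset += Uint32Array.BYTES_PER_ELEMENT;
    bytes.set(e, offset);
    offset += e.byteLength;
  });
  return bytes.buffer;
}

function decodeArrayOfStrings(buffer) {
  const decoder = new TextDecoder("utf-8");
  const view = new DataView(buffer);
  const count = view.getUint32(0);
  const strings = [];
  let offset = Uint32Array.BYTES_PER_ELEMENT;
  while (strings.length !== count) {
    const len = view.getUint32(offset);
    offset += Uint32Array.BYTES_PER_ELEMENT;
    strings.push(decoder.decode(new DataView(buffer, offset, len)));
    offset += len;
  }
  return strings;
}

const input = ["hello", "na\u00EFve", "\u20AC"];
const buffer = encodeArrayOfStrings(input);
console.log(buffer.byteLength); // 4 + (4 + 5) + (4 + 6) + (4 + 3) = 30
console.log(JSON.stringify(decodeArrayOfStrings(buffer)) === JSON.stringify(input)); // true
```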
&lt;br /&gt;
== Encodings ==&lt;br /&gt;
&lt;br /&gt;
Encodings are defined and implemented per [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding]. This implicitly includes the steps to &amp;lt;em&amp;gt;get an encoding&amp;lt;/em&amp;gt; from a string, and the logic for label matching and case-insensitivity.&lt;br /&gt;
&lt;br /&gt;
User agents MUST NOT support any other encodings or labels than those defined in [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding], and MUST support all encodings and labels defined in that specification, with the additions defined below.&lt;br /&gt;
&lt;br /&gt;
:&#039;&#039;NOTE: In [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding], &amp;quot;ascii&amp;quot; is a label for &#039;&#039;&#039;windows-1252&#039;&#039;&#039;; there is no 7-bit-or-raise-exception encoding. Applications that are required to restrict the content of decoded strings should implement validation after decoding.&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
:&#039;&#039;NOTE: Unicode normalization forms are outside the scope of this specification. No normalization is done prior to encoding or after decoding.&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
:&#039;&#039;NOTE: Handling of encoding-specific issues, e.g. over-long UTF-8 encodings, byte order marks, unmatched surrogate pairs, and so on is defined by [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding].&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
=== Additional Encodings ===&lt;br /&gt;
&lt;br /&gt;
The following additional encodings are defined by this specification. They are specific to the methods defined herein.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;table border=1 cellpadding=5&amp;gt;&lt;br /&gt;
&amp;lt;tr&amp;gt;&amp;lt;th&amp;gt;Name&amp;lt;th&amp;gt;Labels&lt;br /&gt;
&amp;lt;tr&amp;gt;&amp;lt;td&amp;gt;binary&amp;lt;td&amp;gt;&amp;quot;binary&amp;quot;&lt;br /&gt;
&amp;lt;/table&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== binary ====&lt;br /&gt;
&lt;br /&gt;
The &#039;&#039;&#039;binary&#039;&#039;&#039; encoding is a single-byte encoding where the input code point and output byte are identical for the range &#039;&#039;&#039;U+0000&#039;&#039;&#039; to &#039;&#039;&#039;U+00ff&#039;&#039;&#039;. &lt;br /&gt;
&lt;br /&gt;
:&#039;&#039;NOTE: This encoding is intended to allow interoperation with legacy code that encodes binary data in ECMAScript strings, for example the WindowBase64 methods [http://dev.w3.org/html5/spec-author-view/webappapis.html#dom-windowbase64-atob atob()]/[http://dev.w3.org/html5/spec-author-view/webappapis.html#dom-windowbase64-btoa btoa()]. It is recommended that new Web applications use Typed Arrays for transmission and storage of binary data.&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
If &#039;&#039;&#039;binary&#039;&#039;&#039; is selected as the encoding, then step 3 of the steps to &#039;&#039;decode a byte stream&#039;&#039; is skipped; no BOM detection is performed.&lt;br /&gt;
&lt;br /&gt;
:&#039;&#039;NOTE: If &amp;lt;code&amp;gt;&amp;quot;binary&amp;quot;&amp;lt;/code&amp;gt; is specified, byte order marks must be ignored.&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
The &#039;&#039;&#039;binary decoder&#039;&#039;&#039; is:&lt;br /&gt;
# Let &amp;lt;var&amp;gt;byte&amp;lt;/var&amp;gt; be byte pointer.&lt;br /&gt;
# If &amp;lt;var&amp;gt;byte&amp;lt;/var&amp;gt; is the EOF byte, emit the EOF code point.&lt;br /&gt;
# Increase the byte pointer by one.&lt;br /&gt;
# Emit a code point whose value is &amp;lt;var&amp;gt;byte&amp;lt;/var&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The &#039;&#039;&#039;binary encoder&#039;&#039;&#039; is:&lt;br /&gt;
# Let &amp;lt;var&amp;gt;code point&amp;lt;/var&amp;gt; be the code point pointer&lt;br /&gt;
# If &amp;lt;var&amp;gt;code point&amp;lt;/var&amp;gt; is the EOF code point, emit the EOF byte.&lt;br /&gt;
# Increase the code point pointer by one.&lt;br /&gt;
# If &amp;lt;var&amp;gt;code point&amp;lt;/var&amp;gt; is in the range U+0000 to U+00FF, emit a byte whose value is &amp;lt;var&amp;gt;code point&amp;lt;/var&amp;gt;&lt;br /&gt;
# Emit an encoder error&lt;br /&gt;
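&lt;br /&gt;
The binary decoder and encoder steps above can be sketched as a pair of plain functions. The names &amp;lt;code&amp;gt;binaryDecode&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;binaryEncode&amp;lt;/code&amp;gt; are hypothetical, and modelling an encoder error as a thrown &amp;lt;code&amp;gt;RangeError&amp;lt;/code&amp;gt; is an assumption made for illustration only.&lt;br /&gt;

```javascript
// Hypothetical helpers illustrating the "binary" encoding; not part of any shipped API.

// Decoder: each byte becomes the code point with the same value (U+0000 to U+00FF).
function binaryDecode(bytes) {
  let result = "";
  bytes.forEach(function (byte) {
    result += String.fromCharCode(byte);
  });
  return result;
}

// Encoder: each code point in U+0000 to U+00FF becomes one byte; anything else
// is an encoder error, modelled here as a thrown RangeError.
function binaryEncode(string) {
  // Array.from iterates the string by code point rather than by UTF-16 code unit.
  const codePoints = Array.from(string, function (ch) { return ch.codePointAt(0); });
  codePoints.forEach(function (codePoint) {
    if (codePoint > 0xFF) {
      throw new RangeError("code point outside U+0000 to U+00FF");
    }
  });
  return Uint8Array.from(codePoints);
}

const raw = binaryEncode("caf\u00E9");
console.log(Array.from(raw));                   // [ 99, 97, 102, 233 ]
console.log(binaryDecode(raw) === "caf\u00E9"); // true
```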
&lt;br /&gt;
== References ==&lt;br /&gt;
&lt;br /&gt;
* WebIDL http://dev.w3.org/2006/webapi/WebIDL&lt;br /&gt;
* Encoding http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== Acknowledgements ==&lt;br /&gt;
&lt;br /&gt;
* Alan Chaney&lt;br /&gt;
* Ben Noordhuis&lt;br /&gt;
* Glenn Maynard&lt;br /&gt;
* John Tamplin&lt;br /&gt;
* Kenneth Russell (Google, Inc)&lt;br /&gt;
* Robert Mustacchi&lt;br /&gt;
* Ryan Dahl&lt;br /&gt;
* Anne van Kesteren&lt;br /&gt;
&lt;br /&gt;
== Appendix ==&lt;br /&gt;
&lt;br /&gt;
A &amp;quot;shim&amp;quot; implementation in JavaScript (that may not fully match the current version of the spec) plus some initial unit tests can be found at:&lt;br /&gt;
&lt;br /&gt;
:http://code.google.com/p/stringencoding/&lt;br /&gt;
&lt;br /&gt;
[[Category:Proposals]]&lt;/div&gt;</summary>
		<author><name>Jsbell</name></author>
	</entry>
	<entry>
		<id>https://wiki.whatwg.org/index.php?title=StringEncoding&amp;diff=8280</id>
		<title>StringEncoding</title>
		<link rel="alternate" type="text/html" href="https://wiki.whatwg.org/index.php?title=StringEncoding&amp;diff=8280"/>
		<updated>2012-06-08T16:52:15Z</updated>

		<summary type="html">&lt;p&gt;Jsbell: /* TextEncoder */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;Proposed Text Encoding Web API for Typed Arrays&lt;br /&gt;
&lt;br /&gt;
== Editors ==&lt;br /&gt;
&lt;br /&gt;
* Joshua Bell (Google, Inc)&lt;br /&gt;
&lt;br /&gt;
== Abstract ==&lt;br /&gt;
&lt;br /&gt;
This specification defines an API for encoding strings to binary data, and decoding strings from binary data.&lt;br /&gt;
&lt;br /&gt;
NOTE: This specification intentionally does not address the opposite scenario of encoding binary data as strings and decoding binary data from strings, for example using Base64 encoding.&lt;br /&gt;
&lt;br /&gt;
Discussion on this topic has so far taken place on the [http://www.khronos.org/webgl/public-mailing-list/ public_webgl@khronos.org] mailing list. See http://www.khronos.org/webgl/public-mailing-list/archives/1111/msg00017.html for the initial discussion thread.&lt;br /&gt;
&lt;br /&gt;
Discussion has since moved to the [http://www.whatwg.org/mailing-list#specs WHATWG spec discussion] mailing list. See  http://lists.whatwg.org/htdig.cgi/whatwg-whatwg.org/2012-March/035038.html for the latest discussion thread.&lt;br /&gt;
&lt;br /&gt;
== Open Issues ==&lt;br /&gt;
&lt;br /&gt;
* Should this be a standalone API (as written), or live on e.g. DataView, or on e.g. String?&lt;br /&gt;
* Encoding errors&lt;br /&gt;
** Define options dict for &amp;lt;code&amp;gt;TextEncoder()&amp;lt;/code&amp;gt; that allows selecting between throw vs. replacement character / replacement callback?&lt;br /&gt;
* The [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding specification] defines the byte order mark as more authoritative than an explicit encoding label when decoding content. &lt;br /&gt;
** Is that desirable here? &lt;br /&gt;
** What does it mean to e.g. write TextDecoder(&#039;iso-2022-kr&#039;).decode(str) if string turns out to be UTF-8?&lt;br /&gt;
** What about streaming and other re-uses of the same decoder object?&lt;br /&gt;
* Remove &#039;&#039;binary&#039;&#039; encoding? The only real use is with &amp;lt;code&amp;gt;atob()&amp;lt;/code&amp;gt;/&amp;lt;code&amp;gt;btoa()&amp;lt;/code&amp;gt;; a better API would be Base64 directly in/out of Typed Arrays&lt;br /&gt;
&lt;br /&gt;
== Resolved Issues ==&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dl&amp;gt;&lt;br /&gt;
&amp;lt;dt&amp;gt;Encode as many characters as possible into a fixed-size buffer for transmission, and repeat starting with next unencoded character&lt;br /&gt;
&amp;lt;dd&amp;gt;Resolution: not for &amp;quot;v1&amp;quot; - can be implemented using this API, and breaking the string down by &amp;quot;character&amp;quot; is unlikely to be as obvious as it sounds - surrogate pairs, combining sequences, etc&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dt&amp;gt;Support two versions of encode; one which takes target buffer, and one which creates/returns a right-sized buffer&lt;br /&gt;
&amp;lt;dd&amp;gt;Resolution: See above, and wait for developer feedback.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dt&amp;gt;Allow arbitrary end byte sequences (e.g. &amp;lt;code&amp;gt;0xFF&amp;lt;/code&amp;gt; for UTF-8 strings)&lt;br /&gt;
&amp;lt;dd&amp;gt;Resolution: Add &amp;lt;code&amp;gt;indexOf&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;ArrayBufferView&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;/dl&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== API ==&lt;br /&gt;
&lt;br /&gt;
=== TextEncoder ===&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;WebIDL&#039;&#039;&#039;&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
dictionary TextEncodeOptions {&lt;br /&gt;
  boolean stream = false;&lt;br /&gt;
};&lt;br /&gt;
&lt;br /&gt;
[Constructor,&lt;br /&gt;
 Constructor(DOMString encoding)]&lt;br /&gt;
interface TextEncoder {&lt;br /&gt;
  readonly attribute DOMString encoding;&lt;br /&gt;
  ArrayBufferView encode(DOMString? string, optional TextEncodeOptions options);&lt;br /&gt;
};&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The constructor runs the following steps:&lt;br /&gt;
# If the constructor is called with no arguments, let &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt; be &amp;quot;&amp;lt;code&amp;gt;utf-8&amp;lt;/code&amp;gt;&amp;quot;. Otherwise, let &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt; be the value of the &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; argument.&lt;br /&gt;
# The constructor follows the &#039;&#039;&#039;steps to get an encoding&#039;&#039;&#039; from [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding], with &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt; as &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt;.&lt;br /&gt;
#* If the steps result in failure, throw an &amp;quot;&amp;lt;code&amp;gt;EncodingError&amp;lt;/code&amp;gt;&amp;quot; exception and terminate these steps.&lt;br /&gt;
#* Otherwise, set the encoder object&#039;s internal &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; property to the returned encoding.&lt;br /&gt;
# Initialize the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag of the encoder object to false. &lt;br /&gt;
# Initialize the internal &amp;lt;var&amp;gt;encoding algorithm state&amp;lt;/var&amp;gt; to the default values for the encoding &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dl&amp;gt;&lt;br /&gt;
&amp;lt;dt&amp;gt;&amp;lt;code&amp;gt;encoding&amp;lt;/code&amp;gt; of type DOMString, readonly&lt;br /&gt;
&amp;lt;dd&amp;gt;Returns the &amp;lt;var&amp;gt;Name&amp;lt;/var&amp;gt; of the encoder object&#039;s &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;, per [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding]. &lt;br /&gt;
&lt;br /&gt;
:Note that this may differ from the name of the encoding specified during the call to the constructor. For example, if the constructor is called with &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; of &amp;lt;code&amp;gt;&amp;quot;ascii&amp;quot;&amp;lt;/code&amp;gt; the &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; attribute of the &#039;&#039;encoder object&#039;&#039; would have the value &amp;lt;code&amp;gt;&amp;quot;windows-1252&amp;quot;&amp;lt;/code&amp;gt; as &amp;lt;code&amp;gt;&amp;quot;ascii&amp;quot;&amp;lt;/code&amp;gt; is a label for that encoding.&lt;br /&gt;
&amp;lt;/dl&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dl&amp;gt;&lt;br /&gt;
&amp;lt;dt&amp;gt;&amp;lt;code&amp;gt;encode&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;dd&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The &amp;lt;code&amp;gt;encode&amp;lt;/code&amp;gt; method runs these steps:&lt;br /&gt;
&lt;br /&gt;
# If the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag is not set, then reset the &amp;lt;var&amp;gt;encoding algorithm state&amp;lt;/var&amp;gt; to the default values for &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;. Otherwise, the &amp;lt;var&amp;gt;encoding algorithm state&amp;lt;/var&amp;gt; is re-used from the previous call to &amp;lt;code&amp;gt;encode&amp;lt;/code&amp;gt; on this object.&lt;br /&gt;
# If the &amp;lt;var&amp;gt;options&amp;lt;/var&amp;gt; parameter is specified and the &amp;lt;code&amp;gt;stream&amp;lt;/code&amp;gt; option is &#039;&#039;&#039;true&#039;&#039;&#039;, then the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag is set; otherwise the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag is cleared.&lt;br /&gt;
# Run the steps of the &amp;lt;var&amp;gt;encoding algorithm&amp;lt;/var&amp;gt;:&lt;br /&gt;
#* The input to the algorithm is a &amp;lt;var&amp;gt;stream of code points&amp;lt;/var&amp;gt;. The code units within the DOMString &amp;lt;var&amp;gt;string&amp;lt;/var&amp;gt; are interpreted as UTF-16 code units, to produce the &amp;lt;var&amp;gt;stream of code points&amp;lt;/var&amp;gt;; if &amp;lt;var&amp;gt;string&amp;lt;/var&amp;gt; is null, the &amp;lt;var&amp;gt;stream of code points&amp;lt;/var&amp;gt; is empty.&lt;br /&gt;
#*:&#039;&#039;ISSUE: Interpreting a DOMString as UTF-16 code units to yield a code point stream needs to be defined, including unpaired surrogates. [http://dev.w3.org/2006/webapi/WebIDL/#idl-DOMString WebIDL] only defines the reverse.&#039;&#039;&lt;br /&gt;
#* If the &amp;lt;var&amp;gt;options&amp;lt;/var&amp;gt; parameter is not specified or the &amp;lt;code&amp;gt;stream&amp;lt;/code&amp;gt; option is false, then after the final code point is yielded by the stream, the &#039;&#039;&#039;EOF code point&#039;&#039;&#039; is yielded.&lt;br /&gt;
#* The output of the algorithm is a &amp;lt;var&amp;gt;sequence of emitted bytes&amp;lt;/var&amp;gt;.&lt;br /&gt;
# Return a &amp;lt;code&amp;gt;Uint8Array&amp;lt;/code&amp;gt; object wrapping an &amp;lt;code&amp;gt;ArrayBuffer&amp;lt;/code&amp;gt; containing the &amp;lt;var&amp;gt;sequence of emitted bytes&amp;lt;/var&amp;gt; produced by the encoding algorithm.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/dl&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== TextDecoder ===&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;WebIDL&#039;&#039;&#039;&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
dictionary TextDecoderOptions {&lt;br /&gt;
  boolean fatal = false;&lt;br /&gt;
};&lt;br /&gt;
&lt;br /&gt;
dictionary TextDecodeOptions {&lt;br /&gt;
  boolean stream = false;&lt;br /&gt;
};&lt;br /&gt;
&lt;br /&gt;
[Constructor,&lt;br /&gt;
 Constructor(optional DOMString encoding, optional TextDecoderOptions options)]&lt;br /&gt;
interface TextDecoder {&lt;br /&gt;
  readonly attribute DOMString encoding;&lt;br /&gt;
  DOMString decode(optional ArrayBufferView view, optional TextDecodeOptions options);&lt;br /&gt;
};&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The constructor runs the following steps:&lt;br /&gt;
# The constructor creates a &#039;&#039;decoder object&#039;&#039;. It has the internal properties &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;, a &amp;lt;var&amp;gt;fatal&amp;lt;/var&amp;gt; flag which is initially unset, and a &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag which is initially unset.&lt;br /&gt;
# If called without an &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; argument, let &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt; be the &amp;quot;&amp;lt;code&amp;gt;utf-8&amp;lt;/code&amp;gt;&amp;quot;. Otherwise, let &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt; be the value of the &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; argument.&lt;br /&gt;
# Run the &#039;&#039;&#039;steps to get an encoding&#039;&#039;&#039; from [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding], with &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt; as &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt;. &lt;br /&gt;
## If the steps result in failure, throw an &amp;quot;&amp;lt;code&amp;gt;EncodingError&amp;lt;/code&amp;gt;&amp;quot; exception and terminate these steps.&lt;br /&gt;
## Otherwise, set the &#039;&#039;decoder object&#039;s&#039;&#039; internal &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; property to the returned encoding.&lt;br /&gt;
# If the constructor is called with an &amp;lt;var&amp;gt;options&amp;lt;/var&amp;gt; argument, and the &amp;lt;var&amp;gt;fatal&amp;lt;/var&amp;gt; property of the dictionary is set, set the internal &amp;lt;var&amp;gt;fatal&amp;lt;/var&amp;gt; flag of the &#039;&#039;decoder object&#039;&#039;.&lt;br /&gt;
# Initialize the internal &amp;lt;var&amp;gt;encoding algorithm state&amp;lt;/var&amp;gt; to the default values for the encoding &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;.&lt;br /&gt;
# Return the &#039;&#039;decoder object&#039;&#039;.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dl&amp;gt;&lt;br /&gt;
&amp;lt;dt&amp;gt;&amp;lt;code&amp;gt;encoding&amp;lt;/code&amp;gt; of type DOMString, readonly&lt;br /&gt;
&amp;lt;dd&amp;gt;&lt;br /&gt;
Returns the &amp;lt;var&amp;gt;Name&amp;lt;/var&amp;gt; of the decoder object&#039;s &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;, per [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding].&lt;br /&gt;
&lt;br /&gt;
:Note that this may differ from the name of the encoding specified during the call to the constructor. For example, if the constructor is called with &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; of &amp;lt;code&amp;gt;&amp;quot;ascii&amp;quot;&amp;lt;/code&amp;gt; the &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; attribute of the &#039;&#039;decoder object&#039;&#039; would have the value &amp;lt;code&amp;gt;&amp;quot;windows-1252&amp;quot;&amp;lt;/code&amp;gt; as &amp;lt;code&amp;gt;&amp;quot;ascii&amp;quot;&amp;lt;/code&amp;gt; is a label for that encoding.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dt&amp;gt;&amp;lt;code&amp;gt;decode&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;dd&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The &amp;lt;code&amp;gt;decode&amp;lt;/code&amp;gt; method runs the following steps:&lt;br /&gt;
&lt;br /&gt;
# If the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag of the &#039;&#039;decoder object&#039;&#039; is not set, then reset the &amp;lt;var&amp;gt;encoding algorithm state&amp;lt;/var&amp;gt; to the default values for encoding &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;. Otherwise, the &amp;lt;var&amp;gt;encoding algorithm state&amp;lt;/var&amp;gt; is re-used from the previous call to &amp;lt;code&amp;gt;decode&amp;lt;/code&amp;gt; on this object. &lt;br /&gt;
# If the &amp;lt;var&amp;gt;options&amp;lt;/var&amp;gt; parameter is specified and the &amp;lt;var&amp;gt;stream&amp;lt;/var&amp;gt; option is &#039;&#039;&#039;true&#039;&#039;&#039;, then the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag is set; otherwise the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag is cleared.&lt;br /&gt;
# Run the decoder algorithm of the &#039;&#039;decoder object&#039;s&#039;&#039; encoding:&lt;br /&gt;
#* The input to the algorithm is a &amp;lt;var&amp;gt;byte stream&amp;lt;/var&amp;gt;. The &amp;lt;var&amp;gt;byte stream&amp;lt;/var&amp;gt; is provided by the bytes in &amp;lt;var&amp;gt;view.buffer&amp;lt;/var&amp;gt; starting at offset &amp;lt;var&amp;gt;view.byteOffset&amp;lt;/var&amp;gt;. A maximum of &amp;lt;var&amp;gt;view.byteLength&amp;lt;/var&amp;gt; bytes are yielded by the stream from &amp;lt;var&amp;gt;view.buffer&amp;lt;/var&amp;gt;. If &amp;lt;var&amp;gt;view&amp;lt;/var&amp;gt; is not specified, the stream is empty.&lt;br /&gt;
#* If the &amp;lt;var&amp;gt;options&amp;lt;/var&amp;gt; parameter is not specified or the &#039;&#039;&#039;stream&#039;&#039;&#039; option is &#039;&#039;&#039;false&#039;&#039;&#039;, then after &amp;lt;var&amp;gt;view.byteLength&amp;lt;/var&amp;gt; bytes are yielded by the stream the &#039;&#039;&#039;EOF byte&#039;&#039;&#039; is yielded.&lt;br /&gt;
#* If the internal &amp;lt;var&amp;gt;fatal&amp;lt;/var&amp;gt; flag of the &#039;&#039;decoder object&#039;&#039; is set, then a &#039;&#039;&#039;decoder error&#039;&#039;&#039; causes a &amp;lt;code&amp;gt;DOMException&amp;lt;/code&amp;gt; of type &amp;lt;code&amp;gt;EncodingError&amp;lt;/code&amp;gt; to be thrown rather than emitting a fallback code point.&lt;br /&gt;
#* The output of the algorithm is a sequence of &amp;lt;var&amp;gt;emitted code points&amp;lt;/var&amp;gt;.&lt;br /&gt;
# Return a &amp;lt;code&amp;gt;DOMString&amp;lt;/code&amp;gt; by encoding the &amp;lt;var&amp;gt;sequence of emitted code points&amp;lt;/var&amp;gt; as UTF-16 as per [http://dev.w3.org/2006/webapi/WebIDL/#idl-DOMString WebIDL].&lt;br /&gt;
&lt;br /&gt;
:&#039;&#039;ISSUE: Need to handle BOMs. [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding] specifies this by using the caller-specified encoding as a suggestion, and consuming the BOM as a part of selecting the real encoder where BOM takes precedence. At the very least a matching BOM should be ignored. A mismatching BOM could throw, be a decoding error, or could actually switch the decoder for this stream (call or sequence of calls), possibly if-and-only-if the constructor was called without an encoding.&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/dl&amp;gt;&lt;br /&gt;
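&lt;br /&gt;
The streaming and fatal behaviour of the decode steps can be illustrated with a short sketch. It assumes the TextDecoder available in current browsers and Node.js, where a fatal decoder error surfaces as a TypeError rather than the EncodingError described in this draft. The helper names decodeChunks and decodeOrNull are hypothetical.&lt;br /&gt;

```javascript
// Decodes a sequence of byte chunks as one UTF-8 stream.
function decodeChunks(chunks) {
  const decoder = new TextDecoder("utf-8");
  let text = "";
  for (const chunk of chunks) {
    // stream: true keeps an incomplete multi-byte sequence pending for the next call.
    text += decoder.decode(chunk, { stream: true });
  }
  // A final call without the stream option signals end of input and flushes the state.
  text += decoder.decode();
  return text;
}

// Returns the decoded text, or null when a fatal decoder rejects the input.
function decodeOrNull(bytes) {
  try {
    return new TextDecoder("utf-8", { fatal: true }).decode(bytes);
  } catch (e) {
    return null;
  }
}

// U+20AC is E2 82 AC in UTF-8; here the sequence is split across two chunks.
const euroChunks = [new Uint8Array([0xE2, 0x82]), new Uint8Array([0xAC])];
console.log(decodeChunks(euroChunks));
console.log(decodeOrNull(new Uint8Array([0xFF])));
```

Without the fatal flag the same invalid byte decodes to the replacement character U+FFFD instead of raising an error.&lt;br /&gt;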
&lt;br /&gt;
== Examples ==&lt;br /&gt;
&lt;br /&gt;
=== Example #1 - encoding strings ===&lt;br /&gt;
&lt;br /&gt;
The following example uses the API to encode an array of strings into an ArrayBuffer. The resulting buffer contains the number of strings (as a Uint32), followed by the length of the first string (as a Uint32), the UTF-8 encoded string data, the length of the second string (as a Uint32), the string data, and so on.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
function encodeArrayOfStrings(strings, encoding) {&lt;br /&gt;
  var encoder, encoded, len, i, bytes, view, offset;&lt;br /&gt;
&lt;br /&gt;
  encoder = new TextEncoder(encoding);&lt;br /&gt;
  encoded = [];&lt;br /&gt;
&lt;br /&gt;
  // First pass: encode each string and total the bytes required.&lt;br /&gt;
  len = Uint32Array.BYTES_PER_ELEMENT;&lt;br /&gt;
  for (i = 0; i &amp;lt; strings.length; i += 1) {&lt;br /&gt;
    len += Uint32Array.BYTES_PER_ELEMENT;&lt;br /&gt;
    encoded[i] = encoder.encode(strings[i]);&lt;br /&gt;
    len += encoded[i].byteLength;&lt;br /&gt;
  }&lt;br /&gt;
&lt;br /&gt;
  bytes = new Uint8Array(len);&lt;br /&gt;
  view = new DataView(bytes.buffer);&lt;br /&gt;
  offset = 0;&lt;br /&gt;
&lt;br /&gt;
  view.setUint32(offset, strings.length);&lt;br /&gt;
  offset += Uint32Array.BYTES_PER_ELEMENT;&lt;br /&gt;
  for (i = 0; i &amp;lt; encoded.length; i += 1) {&lt;br /&gt;
    len = encoded[i].byteLength;&lt;br /&gt;
    view.setUint32(offset, len);&lt;br /&gt;
    offset += Uint32Array.BYTES_PER_ELEMENT;&lt;br /&gt;
    bytes.set(encoded[i], offset);&lt;br /&gt;
    offset += len;&lt;br /&gt;
  }&lt;br /&gt;
  return bytes.buffer;&lt;br /&gt;
}&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Example #2 - decoding strings ===&lt;br /&gt;
&lt;br /&gt;
The following example decodes an ArrayBuffer containing data encoded in the format produced by the previous example back into an array of strings.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
function decodeArrayOfStrings(buffer, encoding) {&lt;br /&gt;
  var decoder, view, offset, num_strings, strings, i, len;&lt;br /&gt;
&lt;br /&gt;
  decoder = new TextDecoder(encoding);&lt;br /&gt;
  view = new DataView(buffer);&lt;br /&gt;
  offset = 0;&lt;br /&gt;
  strings = [];&lt;br /&gt;
&lt;br /&gt;
  num_strings = view.getUint32(offset);&lt;br /&gt;
  offset += Uint32Array.BYTES_PER_ELEMENT;&lt;br /&gt;
  for (i = 0; i &amp;lt; num_strings; i += 1) {&lt;br /&gt;
    len = view.getUint32(offset);&lt;br /&gt;
    offset += Uint32Array.BYTES_PER_ELEMENT;&lt;br /&gt;
    strings[i] = decoder.decode(&lt;br /&gt;
      new DataView(view.buffer, offset, len));&lt;br /&gt;
    offset += len;&lt;br /&gt;
  }&lt;br /&gt;
  return strings;&lt;br /&gt;
}&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Encodings ==&lt;br /&gt;
&lt;br /&gt;
Encodings are defined and implemented per [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding]. This implicitly includes the steps to &amp;lt;em&amp;gt;get an encoding&amp;lt;/em&amp;gt; from a string, and the logic for label matching and case-insensitivity.&lt;br /&gt;
&lt;br /&gt;
User agents MUST NOT support any other encodings or labels than those defined in [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding], and MUST support all encodings and labels defined in that specification, with the additions defined below.&lt;br /&gt;
&lt;br /&gt;
:&#039;&#039;NOTE: In [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding], &amp;quot;ascii&amp;quot; is a label for &#039;&#039;&#039;windows-1252&#039;&#039;&#039;; there is no 7-bit-or-raise-exception encoding. Applications that are required to restrict the content of decoded strings should implement validation after decoding.&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
:&#039;&#039;NOTE: Unicode normalization forms are outside the scope of this specification. No normalization is done prior to encoding or after decoding.&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
:&#039;&#039;NOTE: Handling of encoding-specific issues, e.g. over-long UTF-8 encodings, byte order marks, unmatched surrogate pairs, and so on is defined by [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding].&#039;&#039;&lt;br /&gt;
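&lt;br /&gt;
The first note above recommends validating after decoding when an application needs 7-bit content. A minimal sketch of that approach, assuming the TextDecoder available in current browsers and Node.js and decoding as UTF-8; the helper names isSevenBit and decodeStrictAscii are hypothetical.&lt;br /&gt;

```javascript
// True only when every UTF-16 code unit lies in the range U+0000 to U+007F.
function isSevenBit(text) {
  return /^[\u0000-\u007F]*$/.test(text);
}

// Decodes the bytes, then rejects any result that is not pure 7-bit text.
function decodeStrictAscii(bytes) {
  const text = new TextDecoder("utf-8").decode(bytes);
  if (!isSevenBit(text)) {
    throw new Error("decoded text contains non-ASCII characters");
  }
  return text;
}

console.log(decodeStrictAscii(new Uint8Array([0x68, 0x69])));
```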
&lt;br /&gt;
=== Additional Encodings ===&lt;br /&gt;
&lt;br /&gt;
The following additional encodings are defined by this specification. They are specific to the methods defined herein.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;table border=1 cellpadding=5&amp;gt;&lt;br /&gt;
&amp;lt;tr&amp;gt;&amp;lt;th&amp;gt;Name&amp;lt;th&amp;gt;Labels&lt;br /&gt;
&amp;lt;tr&amp;gt;&amp;lt;td&amp;gt;binary&amp;lt;td&amp;gt;&amp;quot;binary&amp;quot;&lt;br /&gt;
&amp;lt;/table&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== binary ====&lt;br /&gt;
&lt;br /&gt;
The &#039;&#039;&#039;binary&#039;&#039;&#039; encoding is a single-byte encoding where the input code point and output byte are identical for the range &#039;&#039;&#039;U+0000&#039;&#039;&#039; to &#039;&#039;&#039;U+00FF&#039;&#039;&#039;.&lt;br /&gt;
&lt;br /&gt;
:&#039;&#039;NOTE: This encoding is intended to allow interoperation with legacy code that encodes binary data in ECMAScript strings, for example the WindowBase64 methods [http://dev.w3.org/html5/spec-author-view/webappapis.html#dom-windowbase64-atob atob()]/[http://dev.w3.org/html5/spec-author-view/webappapis.html#dom-windowbase64-btoa btoa()]. It is recommended that new Web applications use Typed Arrays for transmission and storage of binary data.&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
If &#039;&#039;&#039;binary&#039;&#039;&#039; is selected as the encoding, then step 3 of the steps to &#039;&#039;decode a byte stream&#039;&#039; is skipped; no BOM detection is performed.&lt;br /&gt;
&lt;br /&gt;
:&#039;&#039;NOTE: If &amp;lt;code&amp;gt;&amp;quot;binary&amp;quot;&amp;lt;/code&amp;gt; is specified, byte order marks must be ignored.&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
The &#039;&#039;&#039;binary decoder&#039;&#039;&#039; is:&lt;br /&gt;
# Let &amp;lt;var&amp;gt;byte&amp;lt;/var&amp;gt; be byte pointer.&lt;br /&gt;
# If &amp;lt;var&amp;gt;byte&amp;lt;/var&amp;gt; is the EOF byte, emit the EOF code point.&lt;br /&gt;
# Increase the byte pointer by one.&lt;br /&gt;
# Emit a code point whose value is &amp;lt;var&amp;gt;byte&amp;lt;/var&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
The &#039;&#039;&#039;binary encoder&#039;&#039;&#039; is:&lt;br /&gt;
# Let &amp;lt;var&amp;gt;code point&amp;lt;/var&amp;gt; be the code point pointer.&lt;br /&gt;
# If &amp;lt;var&amp;gt;code point&amp;lt;/var&amp;gt; is the EOF code point, emit the EOF byte.&lt;br /&gt;
# Increase the code point pointer by one.&lt;br /&gt;
# If &amp;lt;var&amp;gt;code point&amp;lt;/var&amp;gt; is in the range U+0000 to U+00FF, emit a byte whose value is &amp;lt;var&amp;gt;code point&amp;lt;/var&amp;gt;.&lt;br /&gt;
# Emit an encoder error.&lt;br /&gt;
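&lt;br /&gt;
The binary decoder and encoder steps above can be sketched in plain JavaScript. This is an illustration only, not part of any shipped API: the function names binaryDecode and binaryEncode are hypothetical, and the encoder error is modelled here as a thrown Error.&lt;br /&gt;

```javascript
// Each input byte becomes the code point with the same value (U+0000 to U+00FF).
function binaryDecode(bytes) {
  let text = "";
  for (const byte of bytes) {
    text += String.fromCharCode(byte);
  }
  return text;
}

// Each code unit in the range U+0000 to U+00FF becomes the byte with the same value.
function binaryEncode(text) {
  const bytes = new Uint8Array(text.length);
  for (let i = 0; i !== text.length; i += 1) {
    const unit = text.charCodeAt(i);
    if (unit > 0xFF) {
      // Assumption: the encoder error is surfaced as an exception.
      throw new Error("encoder error: code point outside U+0000 to U+00FF");
    }
    bytes[i] = unit;
  }
  return bytes;
}

console.log(binaryDecode(new Uint8Array([0x41, 0xFF])));
console.log(Array.from(binaryEncode("A\u00FF")));
```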
&lt;br /&gt;
== References ==&lt;br /&gt;
&lt;br /&gt;
* WebIDL http://dev.w3.org/2006/webapi/WebIDL&lt;br /&gt;
* Encoding http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== Acknowledgements ==&lt;br /&gt;
&lt;br /&gt;
* Alan Chaney&lt;br /&gt;
* Ben Noordhuis&lt;br /&gt;
* Glenn Maynard&lt;br /&gt;
* John Tamplin&lt;br /&gt;
* Kenneth Russell (Google, Inc)&lt;br /&gt;
* Robert Mustacchi&lt;br /&gt;
* Ryan Dahl&lt;br /&gt;
* Anne van Kesteren&lt;br /&gt;
&lt;br /&gt;
== Appendix ==&lt;br /&gt;
&lt;br /&gt;
A &amp;quot;shim&amp;quot; implementation in JavaScript (that may not fully match the current version of the spec) plus some initial unit tests can be found at:&lt;br /&gt;
&lt;br /&gt;
:http://code.google.com/p/stringencoding/&lt;br /&gt;
&lt;br /&gt;
[[Category:Proposals]]&lt;/div&gt;</summary>
		<author><name>Jsbell</name></author>
	</entry>
	<entry>
		<id>https://wiki.whatwg.org/index.php?title=StringEncoding&amp;diff=8236</id>
		<title>StringEncoding</title>
		<link rel="alternate" type="text/html" href="https://wiki.whatwg.org/index.php?title=StringEncoding&amp;diff=8236"/>
		<updated>2012-05-24T17:40:58Z</updated>

		<summary type="html">&lt;p&gt;Jsbell: /* Open Issues */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;Proposed Text Encoding Web API for Typed Arrays&lt;br /&gt;
&lt;br /&gt;
== Editors ==&lt;br /&gt;
&lt;br /&gt;
* Joshua Bell (Google, Inc)&lt;br /&gt;
&lt;br /&gt;
== Abstract ==&lt;br /&gt;
&lt;br /&gt;
This specification defines an API for encoding strings to binary data, and decoding strings from binary data.&lt;br /&gt;
&lt;br /&gt;
NOTE: This specification intentionally does not address the opposite scenario of encoding binary data as strings and decoding binary data from strings, for example using Base64 encoding.&lt;br /&gt;
&lt;br /&gt;
Discussion on this topic has so far taken place on the [http://www.khronos.org/webgl/public-mailing-list/ public_webgl@khronos.org] mailing list. See http://www.khronos.org/webgl/public-mailing-list/archives/1111/msg00017.html for the initial discussion thread.&lt;br /&gt;
&lt;br /&gt;
Discussion has since moved to the [http://www.whatwg.org/mailing-list#specs WHATWG spec discussion] mailing list. See http://lists.whatwg.org/htdig.cgi/whatwg-whatwg.org/2012-March/035038.html for the latest discussion thread.&lt;br /&gt;
&lt;br /&gt;
== Open Issues ==&lt;br /&gt;
&lt;br /&gt;
* Should this be a standalone API (as written), or live on e.g. DataView, or on e.g. String?&lt;br /&gt;
* Encoding errors&lt;br /&gt;
** Define options dict for &amp;lt;code&amp;gt;TextEncoder()&amp;lt;/code&amp;gt; that allows selecting between throw vs. replacement character / replacement callback?&lt;br /&gt;
* The [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding specification] defines the byte order mark as more authoritative than an explicit encoding label when decoding content. &lt;br /&gt;
** Is that desirable here? &lt;br /&gt;
** What does it mean to e.g. write TextDecoder(&#039;iso-2022-kr&#039;).decode(str) if string turns out to be UTF-8?&lt;br /&gt;
** What about streaming and other re-uses of the same decoder object?&lt;br /&gt;
* Remove &#039;&#039;binary&#039;&#039; encoding? The only real use is with &amp;lt;code&amp;gt;atob()&amp;lt;/code&amp;gt;/&amp;lt;code&amp;gt;btoa()&amp;lt;/code&amp;gt;; a better API would be Base64 directly in/out of Typed Arrays&lt;br /&gt;
&lt;br /&gt;
== Resolved Issues ==&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dl&amp;gt;&lt;br /&gt;
&amp;lt;dt&amp;gt;Encode as many characters as possible into a fixed-size buffer for transmission, and repeat starting with next unencoded character&lt;br /&gt;
&amp;lt;dd&amp;gt;Resolution: not for &amp;quot;v1&amp;quot; - can be implemented using this API, and breaking the string down by &amp;quot;character&amp;quot; is unlikely to be as obvious as it sounds - surrogate pairs, combining sequences, etc&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dt&amp;gt;Support two versions of encode; one which takes target buffer, and one which creates/returns a right-sized buffer&lt;br /&gt;
&amp;lt;dd&amp;gt;Resolution: See above, and wait for developer feedback.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dt&amp;gt;Allow arbitrary end byte sequences (e.g. &amp;lt;code&amp;gt;0xFF&amp;lt;/code&amp;gt; for UTF-8 strings)&lt;br /&gt;
&amp;lt;dd&amp;gt;Resolution: Add &amp;lt;code&amp;gt;indexOf&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;ArrayBufferView&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;/dl&amp;gt;&lt;br /&gt;
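&lt;br /&gt;
The first resolution above notes that filling a fixed-size buffer can be layered on top of this API. A minimal sketch of one way to do so, assuming the TextEncoder available in current browsers and Node.js (which encodes UTF-8 only) and a hypothetical helper named encodePrefix. It splits by code point only and makes no attempt to keep combining sequences together, which is the difficulty the resolution points out.&lt;br /&gt;

```javascript
// Encodes as many whole code points of text as fit in capacity bytes.
// Returns the bytes written and the number of UTF-16 code units consumed,
// so the caller can continue from text.slice(consumed).
function encodePrefix(text, capacity) {
  const encoder = new TextEncoder();
  const parts = [];
  let used = 0;
  let consumed = 0;
  for (const ch of text) { // string iteration yields whole code points
    const bytes = encoder.encode(ch);
    if (used + bytes.length > capacity) {
      break;
    }
    parts.push(bytes);
    used += bytes.length;
    consumed += ch.length;
  }
  const out = new Uint8Array(used);
  let offset = 0;
  for (const part of parts) {
    out.set(part, offset);
    offset += part.length;
  }
  return { bytes: out, consumed: consumed };
}

const firstChunk = encodePrefix("a\u20ACb", 3);
console.log(firstChunk.bytes.length, firstChunk.consumed);
```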
&lt;br /&gt;
== API ==&lt;br /&gt;
&lt;br /&gt;
=== TextEncoder ===&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;WebIDL&#039;&#039;&#039;&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
dictionary TextEncodeOptions {&lt;br /&gt;
  boolean stream = false;&lt;br /&gt;
};&lt;br /&gt;
&lt;br /&gt;
[Constructor,&lt;br /&gt;
 Constructor(DOMString encoding)]&lt;br /&gt;
interface TextEncoder {&lt;br /&gt;
  readonly attribute DOMString encoding;&lt;br /&gt;
  ArrayBufferView encode(DOMString? string, optional TextEncodeOptions options);&lt;br /&gt;
};&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The constructor runs the following steps:&lt;br /&gt;
# If the constructor is called with no arguments, let &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt; be &amp;quot;&amp;lt;code&amp;gt;utf-8&amp;lt;/code&amp;gt;&amp;quot;. Otherwise, let &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt; be the value of the &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; argument.&lt;br /&gt;
# The constructor follows the &#039;&#039;&#039;steps to get an encoding&#039;&#039;&#039; from [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding], with &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt; as &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt;.&lt;br /&gt;
#* If the steps result in failure, throw an &amp;quot;&amp;lt;code&amp;gt;EncodingError&amp;lt;/code&amp;gt;&amp;quot; exception and terminate these steps.&lt;br /&gt;
#* Otherwise, set the encoder object&#039;s internal &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; property to the returned encoding.&lt;br /&gt;
# Initialize the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag of the encoder object to false. &lt;br /&gt;
# Initialize the internal &amp;lt;var&amp;gt;encoding algorithm state&amp;lt;/var&amp;gt; to the default values for the encoding &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dl&amp;gt;&lt;br /&gt;
&amp;lt;dt&amp;gt;&amp;lt;code&amp;gt;encoding&amp;lt;/code&amp;gt; of type DOMString, readonly&lt;br /&gt;
&amp;lt;dd&amp;gt;Returns the &amp;lt;var&amp;gt;Name&amp;lt;/var&amp;gt; of the encoder object&#039;s &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;, per [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding]. &lt;br /&gt;
&lt;br /&gt;
:Note that this may differ from the name of the encoding specified during the call to the constructor. For example, if the constructor is called with &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; of &amp;lt;code&amp;gt;&amp;quot;ascii&amp;quot;&amp;lt;/code&amp;gt; the &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; attribute of the &#039;&#039;encoder object&#039;&#039; would have the value &amp;lt;code&amp;gt;&amp;quot;windows-1252&amp;quot;&amp;lt;/code&amp;gt; as &amp;lt;code&amp;gt;&amp;quot;ascii&amp;quot;&amp;lt;/code&amp;gt; is a label for that encoding.&lt;br /&gt;
&amp;lt;/dl&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dl&amp;gt;&lt;br /&gt;
&amp;lt;dt&amp;gt;&amp;lt;code&amp;gt;encode&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;dd&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The &amp;lt;code&amp;gt;encode&amp;lt;/code&amp;gt; method runs these steps:&lt;br /&gt;
&lt;br /&gt;
# If the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag is not set, then reset the &amp;lt;var&amp;gt;encoding algorithm state&amp;lt;/var&amp;gt; to the default values for &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;. Otherwise, the &amp;lt;var&amp;gt;encoding algorithm state&amp;lt;/var&amp;gt; is re-used from the previous call to &amp;lt;code&amp;gt;encode&amp;lt;/code&amp;gt; on this object.&lt;br /&gt;
# If the &amp;lt;var&amp;gt;options&amp;lt;/var&amp;gt; parameter is specified and the &amp;lt;code&amp;gt;stream&amp;lt;/code&amp;gt; option is &#039;&#039;&#039;true&#039;&#039;&#039;, then the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag is set; otherwise the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag is cleared.&lt;br /&gt;
# Run the steps of the &amp;lt;var&amp;gt;encoding algorithm&amp;lt;/var&amp;gt;:&lt;br /&gt;
#* The input to the algorithm is a &amp;lt;var&amp;gt;stream of code points&amp;lt;/var&amp;gt;. The code units within the DOMString &amp;lt;var&amp;gt;string&amp;lt;/var&amp;gt; are interpreted as UTF-16 code units, to produce the &amp;lt;var&amp;gt;stream of code points&amp;lt;/var&amp;gt;; if &amp;lt;var&amp;gt;string&amp;lt;/var&amp;gt; is null, the &amp;lt;var&amp;gt;stream of code points&amp;lt;/var&amp;gt; is empty.&lt;br /&gt;
#*:&#039;&#039;ISSUE: Interpreting a DOMString as UTF-16 to yield a code point stream needs to be defined, including unpaired surrogates. [http://dev.w3.org/2006/webapi/WebIDL/#idl-DOMString WebIDL] only defines the reverse.&#039;&#039;&lt;br /&gt;
#* If the &amp;lt;var&amp;gt;options&amp;lt;/var&amp;gt; parameter is not specified or the &amp;lt;code&amp;gt;stream&amp;lt;/code&amp;gt; option is false, then after the final code point is yielded by the stream, the &#039;&#039;&#039;EOF code point&#039;&#039;&#039; is yielded.&lt;br /&gt;
#* The output of the algorithm is a &amp;lt;var&amp;gt;sequence of emitted bytes&amp;lt;/var&amp;gt;.&lt;br /&gt;
# Return a &amp;lt;code&amp;gt;Uint8Array&amp;lt;/code&amp;gt; object wrapping an &amp;lt;code&amp;gt;ArrayBuffer&amp;lt;/code&amp;gt; containing the &amp;lt;var&amp;gt;sequence of emitted bytes&amp;lt;/var&amp;gt; produced by the encoding algorithm.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/dl&amp;gt;&lt;br /&gt;
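&lt;br /&gt;
The result of the encode steps can be illustrated with a short sketch. It assumes the TextEncoder available in current browsers and Node.js, which takes no encoding argument and always produces UTF-8, unlike the constructor drafted above. The helper name utf8Bytes is hypothetical.&lt;br /&gt;

```javascript
// Returns the UTF-8 bytes of text as a plain array for easy inspection.
function utf8Bytes(text) {
  const encoder = new TextEncoder();
  // encode() returns a Uint8Array wrapping the emitted bytes.
  return Array.from(encoder.encode(text));
}

// U+20AC encodes to the three bytes E2 82 AC.
console.log(utf8Bytes("\u20AC"));
console.log(new TextEncoder().encoding);
```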
&lt;br /&gt;
=== TextDecoder ===&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;WebIDL&#039;&#039;&#039;&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
dictionary TextDecoderOptions {&lt;br /&gt;
  boolean fatal = false;&lt;br /&gt;
};&lt;br /&gt;
&lt;br /&gt;
dictionary TextDecodeOptions {&lt;br /&gt;
  boolean stream = false;&lt;br /&gt;
};&lt;br /&gt;
&lt;br /&gt;
[Constructor,&lt;br /&gt;
 Constructor(optional DOMString encoding, optional TextDecoderOptions options)]&lt;br /&gt;
interface TextDecoder {&lt;br /&gt;
  readonly attribute DOMString encoding;&lt;br /&gt;
  DOMString decode(optional ArrayBufferView view, optional TextDecodeOptions options);&lt;br /&gt;
};&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The constructor runs the following steps:&lt;br /&gt;
# The constructor creates a &#039;&#039;decoder object&#039;&#039;. It has the internal properties &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;, a &amp;lt;var&amp;gt;fatal&amp;lt;/var&amp;gt; flag which is initially unset, and a &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag which is initially unset.&lt;br /&gt;
# If called without an &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; argument, let &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt; be &amp;quot;&amp;lt;code&amp;gt;utf-8&amp;lt;/code&amp;gt;&amp;quot;. Otherwise, let &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt; be the value of the &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; argument.&lt;br /&gt;
# Run the &#039;&#039;&#039;steps to get an encoding&#039;&#039;&#039; from [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding], with &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt; as &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt;. &lt;br /&gt;
## If the steps result in failure, throw an &amp;quot;&amp;lt;code&amp;gt;EncodingError&amp;lt;/code&amp;gt;&amp;quot; exception and terminate these steps.&lt;br /&gt;
## Otherwise, set the &#039;&#039;decoder object&#039;s&#039;&#039; internal &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; property to the returned encoding.&lt;br /&gt;
# If the constructor is called with an &amp;lt;var&amp;gt;options&amp;lt;/var&amp;gt; argument, and the &amp;lt;var&amp;gt;fatal&amp;lt;/var&amp;gt; property of the dictionary is set, set the internal &amp;lt;var&amp;gt;fatal&amp;lt;/var&amp;gt; flag of the &#039;&#039;decoder object&#039;&#039;.&lt;br /&gt;
# Initialize the internal &amp;lt;var&amp;gt;encoding algorithm state&amp;lt;/var&amp;gt; to the default values for the encoding &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;.&lt;br /&gt;
# Return the &#039;&#039;decoder object&#039;&#039;.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dl&amp;gt;&lt;br /&gt;
&amp;lt;dt&amp;gt;&amp;lt;code&amp;gt;encoding&amp;lt;/code&amp;gt; of type DOMString, readonly&lt;br /&gt;
&amp;lt;dd&amp;gt;&lt;br /&gt;
Returns the &amp;lt;var&amp;gt;Name&amp;lt;/var&amp;gt; of the decoder object&#039;s &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;, per [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding].&lt;br /&gt;
&lt;br /&gt;
:Note that this may differ from the name of the encoding specified during the call to the constructor. For example, if the constructor is called with &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; of &amp;lt;code&amp;gt;&amp;quot;ascii&amp;quot;&amp;lt;/code&amp;gt; the &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; attribute of the &#039;&#039;decoder object&#039;&#039; would have the value &amp;lt;code&amp;gt;&amp;quot;windows-1252&amp;quot;&amp;lt;/code&amp;gt; as &amp;lt;code&amp;gt;&amp;quot;ascii&amp;quot;&amp;lt;/code&amp;gt; is a label for that encoding.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dt&amp;gt;&amp;lt;code&amp;gt;decode&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;dd&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The &amp;lt;code&amp;gt;decode&amp;lt;/code&amp;gt; method runs the following steps:&lt;br /&gt;
&lt;br /&gt;
# If the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag of the &#039;&#039;decoder object&#039;&#039; is not set, then reset the &amp;lt;var&amp;gt;encoding algorithm state&amp;lt;/var&amp;gt; to the default values for encoding &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;. Otherwise, the &amp;lt;var&amp;gt;encoding algorithm state&amp;lt;/var&amp;gt; is re-used from the previous call to &amp;lt;code&amp;gt;decode&amp;lt;/code&amp;gt; on this object. &lt;br /&gt;
# If the &amp;lt;var&amp;gt;options&amp;lt;/var&amp;gt; parameter is specified and the &amp;lt;var&amp;gt;stream&amp;lt;/var&amp;gt; option is &#039;&#039;&#039;true&#039;&#039;&#039;, then the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag is set; otherwise the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag is cleared.&lt;br /&gt;
# Run the decoder algorithm of the &#039;&#039;decoder object&#039;s&#039;&#039; encoding:&lt;br /&gt;
#* The input to the algorithm is a &amp;lt;var&amp;gt;byte stream&amp;lt;/var&amp;gt;. The &amp;lt;var&amp;gt;byte stream&amp;lt;/var&amp;gt; is provided by the bytes in &amp;lt;var&amp;gt;view.buffer&amp;lt;/var&amp;gt; starting at offset &amp;lt;var&amp;gt;view.byteOffset&amp;lt;/var&amp;gt;. A maximum of &amp;lt;var&amp;gt;view.byteLength&amp;lt;/var&amp;gt; bytes are yielded by the stream from &amp;lt;var&amp;gt;view.buffer&amp;lt;/var&amp;gt;. If &amp;lt;var&amp;gt;view&amp;lt;/var&amp;gt; is not specified, the stream is empty.&lt;br /&gt;
#* If the &amp;lt;var&amp;gt;options&amp;lt;/var&amp;gt; parameter is not specified or the &#039;&#039;&#039;stream&#039;&#039;&#039; option is &#039;&#039;&#039;false&#039;&#039;&#039;, then after &amp;lt;var&amp;gt;view.byteLength&amp;lt;/var&amp;gt; bytes are yielded by the stream the &#039;&#039;&#039;EOF byte&#039;&#039;&#039; is yielded.&lt;br /&gt;
#* If the internal &amp;lt;var&amp;gt;fatal&amp;lt;/var&amp;gt; flag of the &#039;&#039;decoder object&#039;&#039; is set, then a &#039;&#039;&#039;decoder error&#039;&#039;&#039; causes a &amp;lt;code&amp;gt;DOMException&amp;lt;/code&amp;gt; of type &amp;lt;code&amp;gt;EncodingError&amp;lt;/code&amp;gt; to be thrown rather than emitting a fallback code point.&lt;br /&gt;
#* The output of the algorithm is a sequence of &amp;lt;var&amp;gt;emitted code points&amp;lt;/var&amp;gt;.&lt;br /&gt;
# Return a &amp;lt;code&amp;gt;DOMString&amp;lt;/code&amp;gt; by encoding the &amp;lt;var&amp;gt;sequence of emitted code points&amp;lt;/var&amp;gt; as UTF-16 as per [http://dev.w3.org/2006/webapi/WebIDL/#idl-DOMString WebIDL].&lt;br /&gt;
&lt;br /&gt;
:&#039;&#039;ISSUE: Need to handle BOMs. [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding] specifies this by using the caller-specified encoding as a suggestion, and consuming the BOM as a part of selecting the real encoder where BOM takes precedence. At the very least a matching BOM should be ignored. A mismatching BOM could throw, be a decoding error, or could actually switch the decoder for this stream (call or sequence of calls), possibly if-and-only-if the constructor was called without an encoding.&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/dl&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Examples ==&lt;br /&gt;
&lt;br /&gt;
=== Example #1 - encoding strings ===&lt;br /&gt;
&lt;br /&gt;
The following example uses the API to encode an array of strings into an ArrayBuffer. The resulting buffer contains the number of strings (as a Uint32), followed by the length of the first string (as a Uint32), the UTF-8 encoded string data, the length of the second string (as a Uint32), the string data, and so on.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
function encodeArrayOfStrings(strings, encoding) {&lt;br /&gt;
  var encoder, encoded, len, i, bytes, view, offset;&lt;br /&gt;
&lt;br /&gt;
  encoder = new TextEncoder(encoding);&lt;br /&gt;
  encoded = [];&lt;br /&gt;
&lt;br /&gt;
  // First pass: encode each string and total the bytes required.&lt;br /&gt;
  len = Uint32Array.BYTES_PER_ELEMENT;&lt;br /&gt;
  for (i = 0; i &amp;lt; strings.length; i += 1) {&lt;br /&gt;
    len += Uint32Array.BYTES_PER_ELEMENT;&lt;br /&gt;
    encoded[i] = encoder.encode(strings[i]);&lt;br /&gt;
    len += encoded[i].byteLength;&lt;br /&gt;
  }&lt;br /&gt;
&lt;br /&gt;
  bytes = new Uint8Array(len);&lt;br /&gt;
  view = new DataView(bytes.buffer);&lt;br /&gt;
  offset = 0;&lt;br /&gt;
&lt;br /&gt;
  view.setUint32(offset, strings.length);&lt;br /&gt;
  offset += Uint32Array.BYTES_PER_ELEMENT;&lt;br /&gt;
  for (i = 0; i &amp;lt; encoded.length; i += 1) {&lt;br /&gt;
    len = encoded[i].byteLength;&lt;br /&gt;
    view.setUint32(offset, len);&lt;br /&gt;
    offset += Uint32Array.BYTES_PER_ELEMENT;&lt;br /&gt;
    bytes.set(encoded[i], offset);&lt;br /&gt;
    offset += len;&lt;br /&gt;
  }&lt;br /&gt;
  return bytes.buffer;&lt;br /&gt;
}&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Example #2 - decoding strings ===&lt;br /&gt;
&lt;br /&gt;
The following example decodes an ArrayBuffer containing data encoded in the format produced by the previous example back into an array of strings.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
function decodeArrayOfStrings(buffer, encoding) {&lt;br /&gt;
  var decoder, view, offset, num_strings, strings, i, len;&lt;br /&gt;
&lt;br /&gt;
  decoder = new TextDecoder(encoding);&lt;br /&gt;
  view = new DataView(buffer);&lt;br /&gt;
  offset = 0;&lt;br /&gt;
  strings = [];&lt;br /&gt;
&lt;br /&gt;
  num_strings = view.getUint32(offset);&lt;br /&gt;
  offset += Uint32Array.BYTES_PER_ELEMENT;&lt;br /&gt;
  for (i = 0; i &amp;lt; num_strings; i += 1) {&lt;br /&gt;
    len = view.getUint32(offset);&lt;br /&gt;
    offset += Uint32Array.BYTES_PER_ELEMENT;&lt;br /&gt;
    strings[i] = decoder.decode(&lt;br /&gt;
      new DataView(view.buffer, offset, len));&lt;br /&gt;
    offset += len;&lt;br /&gt;
  }&lt;br /&gt;
  return strings;&lt;br /&gt;
}&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Encodings ==&lt;br /&gt;
&lt;br /&gt;
Encodings are defined and implemented per [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding]. This implicitly includes the steps to &amp;lt;em&amp;gt;get an encoding&amp;lt;/em&amp;gt; from a string, and the logic for label matching and case-insensitivity.&lt;br /&gt;
&lt;br /&gt;
User agents MUST NOT support any other encodings or labels than those defined in [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding], and MUST support all encodings and labels defined in that specification, with the additions defined below.&lt;br /&gt;
&lt;br /&gt;
:&#039;&#039;NOTE: In [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding], &amp;quot;ascii&amp;quot; is a label for &#039;&#039;&#039;windows-1252&#039;&#039;&#039;; there is no 7-bit-or-raise-exception encoding. Applications that are required to restrict the content of decoded strings should implement validation after decoding.&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
:&#039;&#039;NOTE: Unicode normalization forms are outside the scope of this specification. No normalization is done prior to encoding or after decoding.&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
:&#039;&#039;NOTE: Handling of encoding-specific issues, e.g. over-long UTF-8 encodings, byte order marks, unmatched surrogate pairs, and so on is defined by [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding].&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
=== Additional Encodings ===&lt;br /&gt;
&lt;br /&gt;
The following additional encodings are defined by this specification. They are specific to the methods defined herein.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;table border=1 cellpadding=5&amp;gt;&lt;br /&gt;
&amp;lt;tr&amp;gt;&amp;lt;th&amp;gt;Name&amp;lt;th&amp;gt;Labels&lt;br /&gt;
&amp;lt;tr&amp;gt;&amp;lt;td&amp;gt;binary&amp;lt;td&amp;gt;&amp;quot;binary&amp;quot;&lt;br /&gt;
&amp;lt;/table&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== binary ====&lt;br /&gt;
&lt;br /&gt;
The &#039;&#039;&#039;binary&#039;&#039;&#039; encoding is a single-byte encoding where the input code point and output byte are identical for the range &#039;&#039;&#039;U+0000&#039;&#039;&#039; to &#039;&#039;&#039;U+00ff&#039;&#039;&#039;. &lt;br /&gt;
&lt;br /&gt;
:&#039;&#039;NOTE: This encoding is intended to allow interoperation with legacy code that encodes binary data in ECMAScript strings, for example the WindowBase64 methods [http://dev.w3.org/html5/spec-author-view/webappapis.html#dom-windowbase64-atob atob()]/[http://dev.w3.org/html5/spec-author-view/webappapis.html#dom-windowbase64-btoa btoa()]. It is recommended that new Web applications use Typed Arrays for transmission and storage of binary data.&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
If &#039;&#039;&#039;binary&#039;&#039;&#039; is selected as the encoding, then step 3 of the steps to &#039;&#039;decode a byte stream&#039;&#039; is skipped; no BOM detection is performed.&lt;br /&gt;
&lt;br /&gt;
:&#039;&#039;NOTE: If &amp;lt;code&amp;gt;&amp;quot;binary&amp;quot;&amp;lt;/code&amp;gt; is specified, byte order marks must be ignored.&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
The &#039;&#039;&#039;binary decoder&#039;&#039;&#039; is:&lt;br /&gt;
# Let &amp;lt;var&amp;gt;byte&amp;lt;/var&amp;gt; be byte pointer.&lt;br /&gt;
# If &amp;lt;var&amp;gt;byte&amp;lt;/var&amp;gt; is the EOF byte, emit the EOF code point.&lt;br /&gt;
# Increase the byte pointer by one.&lt;br /&gt;
# Emit a code point whose value is &amp;lt;var&amp;gt;byte&amp;lt;/var&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The &#039;&#039;&#039;binary encoder&#039;&#039;&#039; is:&lt;br /&gt;
# Let &amp;lt;var&amp;gt;code point&amp;lt;/var&amp;gt; be the code point pointer&lt;br /&gt;
# If &amp;lt;var&amp;gt;code point&amp;lt;/var&amp;gt; is the EOF code point, emit the EOF byte.&lt;br /&gt;
# Increase the code point pointer by one.&lt;br /&gt;
# If &amp;lt;var&amp;gt;code point&amp;lt;/var&amp;gt; is in the range U+0000 to U+00FF, emit a byte whose value is &amp;lt;var&amp;gt;code point&amp;lt;/var&amp;gt;&lt;br /&gt;
# Emit an encoder error&lt;br /&gt;
&lt;br /&gt;
== References ==&lt;br /&gt;
&lt;br /&gt;
* WebIDL http://dev.w3.org/2006/webapi/WebIDL&lt;br /&gt;
* Encoding http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== Acknowledgements ==&lt;br /&gt;
&lt;br /&gt;
* Alan Chaney&lt;br /&gt;
* Ben Noordhuis&lt;br /&gt;
* Glenn Maynard&lt;br /&gt;
* John Tamplin&lt;br /&gt;
* Kenneth Russell (Google, Inc)&lt;br /&gt;
* Robert Mustacchi&lt;br /&gt;
* Ryan Dahl&lt;br /&gt;
* Anne van Kesteren&lt;br /&gt;
&lt;br /&gt;
== Appendix ==&lt;br /&gt;
&lt;br /&gt;
A &amp;quot;shim&amp;quot; implementation in JavaScript (that may not fully match the current version of the spec) plus some initial unit tests can be found at:&lt;br /&gt;
&lt;br /&gt;
:http://code.google.com/p/stringencoding/&lt;br /&gt;
&lt;br /&gt;
[[Category:Proposals]]&lt;/div&gt;</summary>
		<author><name>Jsbell</name></author>
	</entry>
	<entry>
		<id>https://wiki.whatwg.org/index.php?title=StringEncoding&amp;diff=8235</id>
		<title>StringEncoding</title>
		<link rel="alternate" type="text/html" href="https://wiki.whatwg.org/index.php?title=StringEncoding&amp;diff=8235"/>
		<updated>2012-05-24T17:40:27Z</updated>

		<summary type="html">&lt;p&gt;Jsbell: /* Open Issues */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;Proposed Text Encoding Web API for Typed Arrays&lt;br /&gt;
&lt;br /&gt;
== Editors ==&lt;br /&gt;
&lt;br /&gt;
* Joshua Bell (Google, Inc)&lt;br /&gt;
&lt;br /&gt;
== Abstract ==&lt;br /&gt;
&lt;br /&gt;
This specification defines an API for encoding strings to binary data, and decoding strings from binary data.&lt;br /&gt;
&lt;br /&gt;
NOTE: This specification intentionally does not address the opposite scenario of encoding binary data as strings and decoding binary data from strings, for example using Base64 encoding.&lt;br /&gt;
&lt;br /&gt;
Discussion on this topic has so far taken place on the [http://www.khronos.org/webgl/public-mailing-list/ public_webgl@khronos.org] mailing list. See http://www.khronos.org/webgl/public-mailing-list/archives/1111/msg00017.html for the initial discussion thread.&lt;br /&gt;
&lt;br /&gt;
Discussion has since moved to the [http://www.whatwg.org/mailing-list#specs WHATWG spec discussion] mailing list. See  http://lists.whatwg.org/htdig.cgi/whatwg-whatwg.org/2012-March/035038.html for the latest discussion thread.&lt;br /&gt;
&lt;br /&gt;
== Open Issues ==&lt;br /&gt;
&lt;br /&gt;
* Should this be a standalone API (as written), or live on e.g. DataView, or on e.g. String?&lt;br /&gt;
* Encoding errors&lt;br /&gt;
** TODO: Define options dict for &amp;lt;code&amp;gt;TextEncoder()&amp;lt;/code&amp;gt; that allows selecting between throw vs. replacement character?&lt;br /&gt;
* The [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding specification] defines the byte order mark as more authoritative than an explicit encoding label when decoding content. &lt;br /&gt;
** Is that desirable here? &lt;br /&gt;
** What does it mean to e.g. write TextDecoder(&#039;iso-2022-kr&#039;).decode(str) if string turns out to be UTF-8?&lt;br /&gt;
** What about streaming and other re-uses of the same decoder object?&lt;br /&gt;
* Remove &#039;&#039;binary&#039;&#039; encoding? The only real use is with &amp;lt;code&amp;gt;atob()&amp;lt;/code&amp;gt;/&amp;lt;code&amp;gt;btoa()&amp;lt;/code&amp;gt;; a better API would be Base64 directly in/out of Typed Arrays&lt;br /&gt;
&lt;br /&gt;
== Resolved Issues ==&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dl&amp;gt;&lt;br /&gt;
&amp;lt;dt&amp;gt;Encode as many characters as possible into a fixed-size buffer for transmission, and repeat starting with next unencoded character&lt;br /&gt;
&amp;lt;dd&amp;gt;Resolution: not for &amp;quot;v1&amp;quot; - can be implemented using this API, and breaking the string down by &amp;quot;character&amp;quot; is unlikely to be as obvious as it sounds - surrogate pairs, combining sequences, etc&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dt&amp;gt;Support two versions of encode; one which takes target buffer, and one which creates/returns a right-sized buffer&lt;br /&gt;
&amp;lt;dd&amp;gt;Resolution: See above, and wait for developer feedback.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dt&amp;gt;Allow arbitrary end byte sequences (e.g. &amp;lt;code&amp;gt;0xFF&amp;lt;/code&amp;gt; for UTF-8 strings)&lt;br /&gt;
&amp;lt;dd&amp;gt;Resolution: Add &amp;lt;code&amp;gt;indexOf&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;ArrayBufferView&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;/dl&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== API ==&lt;br /&gt;
&lt;br /&gt;
=== TextEncoder ===&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;WebIDL&#039;&#039;&#039;&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
dictionary TextEncodeOptions {&lt;br /&gt;
  boolean stream = false;&lt;br /&gt;
};&lt;br /&gt;
&lt;br /&gt;
[Constructor,&lt;br /&gt;
 Constructor(DOMString encoding)]&lt;br /&gt;
interface TextEncoder {&lt;br /&gt;
  readonly attribute DOMString encoding;&lt;br /&gt;
  ArrayBufferView encode(DOMString? string, optional TextEncodeOptions options);&lt;br /&gt;
};&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The constructor runs the following steps:&lt;br /&gt;
# If the constructor is called with no arguments, let &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt; be &amp;quot;&amp;lt;code&amp;gt;utf-8&amp;lt;/code&amp;gt;&amp;quot;. Otherwise, let &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt; be the value of the &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; argument.&lt;br /&gt;
# The constructor follows the &#039;&#039;&#039;steps to get an encoding&#039;&#039;&#039; from [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding], with &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt; as &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt;.&lt;br /&gt;
#* If the steps result in failure, throw an &amp;quot;&amp;lt;code&amp;gt;EncodingError&amp;lt;/code&amp;gt;&amp;quot; exception and terminate these steps.&lt;br /&gt;
#* Otherwise, set the encoder object&#039;s internal &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; property to the returned encoding.&lt;br /&gt;
# Initialize the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag of the encoder object to false. &lt;br /&gt;
# Initialize the internal &amp;lt;var&amp;gt;encoding algorithm state&amp;lt;/var&amp;gt; to the default values for the encoding &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dl&amp;gt;&lt;br /&gt;
&amp;lt;dt&amp;gt;&amp;lt;code&amp;gt;encoding&amp;lt;/code&amp;gt; of type DOMString, readonly&lt;br /&gt;
&amp;lt;dd&amp;gt;Returns the &amp;lt;var&amp;gt;Name&amp;lt;/var&amp;gt; of the encoder object&#039;s &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;, per [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding]. &lt;br /&gt;
&lt;br /&gt;
:Note that this may differ from the name of the encoding specified during the call to the constructor. For example, if the constructor is called with &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; of &amp;lt;code&amp;gt;&amp;quot;ascii&amp;quot;&amp;lt;/code&amp;gt; the &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; attribute of the &#039;&#039;encoder object&#039;&#039; would have the value &amp;lt;code&amp;gt;&amp;quot;windows-1252&amp;quot;&amp;lt;/code&amp;gt; as &amp;lt;code&amp;gt;&amp;quot;ascii&amp;quot;&amp;lt;/code&amp;gt; is a label for that encoding.&lt;br /&gt;
&amp;lt;/dl&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dl&amp;gt;&lt;br /&gt;
&amp;lt;dt&amp;gt;&amp;lt;code&amp;gt;encode&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;dd&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The &amp;lt;code&amp;gt;encode&amp;lt;/code&amp;gt; method runs these steps:&lt;br /&gt;
&lt;br /&gt;
# If the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag is not set, then reset the &amp;lt;var&amp;gt;encoding algorithm state&amp;lt;/var&amp;gt; to the default values for &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;. Otherwise, the &amp;lt;var&amp;gt;encoding algorithm state&amp;lt;/var&amp;gt; is re-used from the previous call to &amp;lt;code&amp;gt;encode&amp;lt;/code&amp;gt; on this object.&lt;br /&gt;
# If the &amp;lt;var&amp;gt;options&amp;lt;/var&amp;gt; parameter is specified and the &amp;lt;code&amp;gt;stream&amp;lt;/code&amp;gt; option is &#039;&#039;&#039;true&#039;&#039;&#039;, then the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag is set; otherwise the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag is cleared.&lt;br /&gt;
# Run the steps of the &amp;lt;var&amp;gt;encoding algorithm&amp;lt;/var&amp;gt;:&lt;br /&gt;
#* The input to the algorithm is a &amp;lt;var&amp;gt;stream of code points&amp;lt;/var&amp;gt;. The code units within the DOMString &amp;lt;var&amp;gt;string&amp;lt;/var&amp;gt; are interpreted as UTF-16 code units, to produce the &amp;lt;var&amp;gt;stream of code points&amp;lt;/var&amp;gt;; if &amp;lt;var&amp;gt;string&amp;lt;/var&amp;gt; is null, the &amp;lt;var&amp;gt;stream of code points&amp;lt;/var&amp;gt; is empty.&lt;br /&gt;
#*:&#039;&#039;ISSUE: Interpreting a DOMString as UTF-16 to yield a code unit stream needs to be defined, including unpaired surrogates. [http://dev.w3.org/2006/webapi/WebIDL/#idl-DOMString WebIDL] only defines the reverse.&#039;&#039;&lt;br /&gt;
#* If the &amp;lt;var&amp;gt;options&amp;lt;/var&amp;gt; parameter is not specified or the &amp;lt;code&amp;gt;stream&amp;lt;/code&amp;gt; option is false, then after the final code point is yielded by the stream, the &#039;&#039;&#039;EOF code point&#039;&#039;&#039; is yielded.&lt;br /&gt;
#* The output of the algorithm is a &amp;lt;var&amp;gt;sequence of emitted bytes&amp;lt;/var&amp;gt;.&lt;br /&gt;
# Return a &amp;lt;code&amp;gt;Uint8Array&amp;lt;/code&amp;gt; object wrapping an &amp;lt;code&amp;gt;ArrayBuffer&amp;lt;/code&amp;gt; containing the &amp;lt;var&amp;gt;sequence of emitted bytes&amp;lt;/var&amp;gt; produced by the encoding algorithm.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/dl&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== TextDecoder ===&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;WebIDL&#039;&#039;&#039;&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
dictionary TextDecoderOptions {&lt;br /&gt;
  boolean fatal = false;&lt;br /&gt;
};&lt;br /&gt;
&lt;br /&gt;
dictionary TextDecodeOptions {&lt;br /&gt;
  boolean stream = false;&lt;br /&gt;
};&lt;br /&gt;
&lt;br /&gt;
[Constructor,&lt;br /&gt;
 Constructor(optional DOMString encoding, optional TextDecoderOptions options)]&lt;br /&gt;
interface TextDecoder {&lt;br /&gt;
  readonly attribute DOMString encoding;&lt;br /&gt;
  DOMString decode(optional ArrayBufferView view, optional TextDecodeOptions options);&lt;br /&gt;
};&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The constructor runs the following steps:&lt;br /&gt;
# The constructor creates a &#039;&#039;decoder object&#039;&#039;. It has the internal properties &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;, a &amp;lt;var&amp;gt;fatal&amp;lt;/var&amp;gt; flag which is initially unset, and a &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag which is initially unset.&lt;br /&gt;
# If called without an &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; argument, let &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt; be &amp;quot;&amp;lt;code&amp;gt;utf-8&amp;lt;/code&amp;gt;&amp;quot;. Otherwise, let &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt; be the value of the &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; argument.&lt;br /&gt;
# Run the &#039;&#039;&#039;steps to get an encoding&#039;&#039;&#039; from [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding], with &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt; as &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt;. &lt;br /&gt;
## If the steps result in failure, throw an &amp;quot;&amp;lt;code&amp;gt;EncodingError&amp;lt;/code&amp;gt;&amp;quot; exception and terminate these steps.&lt;br /&gt;
## Otherwise, set the &#039;&#039;decoder object&#039;s&#039;&#039; internal &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; property to the returned encoding.&lt;br /&gt;
# If the constructor is called with an &amp;lt;var&amp;gt;options&amp;lt;/var&amp;gt; argument, and the &amp;lt;var&amp;gt;fatal&amp;lt;/var&amp;gt; property of the dictionary is set, set the internal &amp;lt;var&amp;gt;fatal&amp;lt;/var&amp;gt; flag of the &#039;&#039;decoder object&#039;&#039;.&lt;br /&gt;
# Initialize the internal &amp;lt;var&amp;gt;encoding algorithm state&amp;lt;/var&amp;gt; to the default values for the encoding &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;.&lt;br /&gt;
# Return the &#039;&#039;decoder object&#039;&#039;.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dl&amp;gt;&lt;br /&gt;
&amp;lt;dt&amp;gt;&amp;lt;code&amp;gt;encoding&amp;lt;/code&amp;gt; of type DOMString, readonly&lt;br /&gt;
&amp;lt;dd&amp;gt;&lt;br /&gt;
Returns the &amp;lt;var&amp;gt;Name&amp;lt;/var&amp;gt; of the decoder object&#039;s &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;, per [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding].&lt;br /&gt;
&lt;br /&gt;
:Note that this may differ from the name of the encoding specified during the call to the constructor. For example, if the constructor is called with &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; of &amp;lt;code&amp;gt;&amp;quot;ascii&amp;quot;&amp;lt;/code&amp;gt; the &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; attribute of the &#039;&#039;decoder object&#039;&#039; would have the value &amp;lt;code&amp;gt;&amp;quot;windows-1252&amp;quot;&amp;lt;/code&amp;gt; as &amp;lt;code&amp;gt;&amp;quot;ascii&amp;quot;&amp;lt;/code&amp;gt; is a label for that encoding.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dt&amp;gt;&amp;lt;code&amp;gt;decode&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;dd&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The &amp;lt;code&amp;gt;decode&amp;lt;/code&amp;gt; method runs the following steps:&lt;br /&gt;
&lt;br /&gt;
# If the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag of the &#039;&#039;decoder object&#039;&#039; is not set, then reset the &amp;lt;var&amp;gt;encoding algorithm state&amp;lt;/var&amp;gt; to the default values for encoding &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;. Otherwise, the &amp;lt;var&amp;gt;encoding algorithm state&amp;lt;/var&amp;gt; is re-used from the previous call to &amp;lt;code&amp;gt;decode&amp;lt;/code&amp;gt; on this object. &lt;br /&gt;
# If the &amp;lt;var&amp;gt;options&amp;lt;/var&amp;gt; parameter is specified and the &amp;lt;var&amp;gt;stream&amp;lt;/var&amp;gt; option is &#039;&#039;&#039;true&#039;&#039;&#039;, then the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag is set; otherwise the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag is cleared.&lt;br /&gt;
# Run the decoder algorithm of the &#039;&#039;decoder object&#039;s&#039;&#039; encoding:&lt;br /&gt;
#* The input to the algorithm is a &amp;lt;var&amp;gt;byte stream&amp;lt;/var&amp;gt;. The &amp;lt;var&amp;gt;byte stream&amp;lt;/var&amp;gt; is provided by the bytes in &amp;lt;var&amp;gt;view.buffer&amp;lt;/var&amp;gt; starting at offset &amp;lt;var&amp;gt;view.byteOffset&amp;lt;/var&amp;gt;. A maximum of &amp;lt;var&amp;gt;view.byteLength&amp;lt;/var&amp;gt; bytes are yielded by the stream from &amp;lt;var&amp;gt;view.buffer&amp;lt;/var&amp;gt;. If &amp;lt;var&amp;gt;view&amp;lt;/var&amp;gt; is not specified, the stream is empty.&lt;br /&gt;
#* If the &amp;lt;var&amp;gt;options&amp;lt;/var&amp;gt; parameter is not specified or the &#039;&#039;&#039;stream&#039;&#039;&#039; option is &#039;&#039;&#039;false&#039;&#039;&#039;, then after &amp;lt;var&amp;gt;view.byteLength&amp;lt;/var&amp;gt; bytes are yielded by the stream the &#039;&#039;&#039;EOF byte&#039;&#039;&#039; is yielded.&lt;br /&gt;
#* If the internal &amp;lt;var&amp;gt;fatal&amp;lt;/var&amp;gt; flag of the &#039;&#039;decoder object&#039;&#039; is set, then a &#039;&#039;&#039;decoder error&#039;&#039;&#039; causes a &amp;lt;code&amp;gt;DOMException&amp;lt;/code&amp;gt; of type &amp;lt;code&amp;gt;EncodingError&amp;lt;/code&amp;gt; to be thrown rather than emitting a fallback code point.&lt;br /&gt;
#* The output of the algorithm is a sequence of &amp;lt;var&amp;gt;emitted code points&amp;lt;/var&amp;gt;.&lt;br /&gt;
# Return a &amp;lt;code&amp;gt;DOMString&amp;lt;/code&amp;gt; by encoding the &amp;lt;var&amp;gt;sequence of emitted code points&amp;lt;/var&amp;gt; as UTF-16 as per [http://dev.w3.org/2006/webapi/WebIDL/#idl-DOMString WebIDL].&lt;br /&gt;
&lt;br /&gt;
:&#039;&#039;ISSUE: Need to handle BOMs. [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding] specifies this by using the caller-specified encoding as a suggestion, and consuming the BOM as a part of selecting the real encoder where BOM takes precedence. At the very least a matching BOM should be ignored. A mismatching BOM could throw, be a decoding error, or could actually switch the decoder for this stream (call or sequence of calls), possibly if-and-only-if the constructor was called without an encoding.&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/dl&amp;gt;&lt;br /&gt;
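&lt;br /&gt;
The streaming behaviour of the decode steps above can be sketched as follows. This is an illustration only, assuming a runtime that ships TextDecoder as it was later standardized in the Encoding Standard, where the stream option keeps decoder state between calls much as described here. The chunk contents are made up for the example.&lt;br /&gt;
&lt;br /&gt;
```javascript
// Sketch only: decode a UTF-8 byte sequence that arrives in two chunks.
// The euro sign U+20AC is encoded in UTF-8 as the bytes E2 82 AC.
var decoder = new TextDecoder('utf-8');
var firstChunk = new Uint8Array([0xE2, 0x82]); // incomplete sequence
var secondChunk = new Uint8Array([0xAC]);      // final byte of the sequence

// With stream set, the partial sequence is held back instead of being
// treated as a decoder error at the end of the chunk.
var decoded = decoder.decode(firstChunk, { stream: true });

// Without the stream option, this call ends the stream and flushes state.
decoded += decoder.decode(secondChunk);

console.log(decoded === '\u20AC');
```
&lt;br /&gt;
Calling decode on the first chunk without the stream option would instead end the stream after two bytes, and the incomplete sequence would be handled as a decoder error.&lt;br /&gt;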
&lt;br /&gt;
== Examples ==&lt;br /&gt;
&lt;br /&gt;
=== Example #1 - encoding strings ===&lt;br /&gt;
&lt;br /&gt;
The following example uses the API to encode an array of strings into an ArrayBuffer. The buffer contains the number of strings (as a Uint32), followed by the length of the first string (as a Uint32), the encoded string data, the length of the second string (as a Uint32), the string data, and so on.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
function encodeArrayOfStrings(strings, encoding) {&lt;br /&gt;
  var encoder, encoded, len, i, bytes, view, offset;&lt;br /&gt;
&lt;br /&gt;
  encoder = new TextEncoder(encoding);&lt;br /&gt;
  encoded = [];&lt;br /&gt;
&lt;br /&gt;
  len = Uint32Array.BYTES_PER_ELEMENT;&lt;br /&gt;
  for (i = 0; i &amp;lt; strings.length; i += 1) {&lt;br /&gt;
    len += Uint32Array.BYTES_PER_ELEMENT;&lt;br /&gt;
    encoded[i] = encoder.encode(strings[i]);&lt;br /&gt;
    len += encoded[i].byteLength;&lt;br /&gt;
  }&lt;br /&gt;
&lt;br /&gt;
  bytes = new Uint8Array(len);&lt;br /&gt;
  view = new DataView(bytes.buffer);&lt;br /&gt;
  offset = 0;&lt;br /&gt;
&lt;br /&gt;
  view.setUint32(offset, strings.length);&lt;br /&gt;
  offset += Uint32Array.BYTES_PER_ELEMENT;&lt;br /&gt;
  for (i = 0; i &amp;lt; encoded.length; i += 1) {&lt;br /&gt;
    len = encoded[i].byteLength;&lt;br /&gt;
    view.setUint32(offset, len);&lt;br /&gt;
    offset += Uint32Array.BYTES_PER_ELEMENT;&lt;br /&gt;
    bytes.set(encoded[i], offset);&lt;br /&gt;
    offset += len;&lt;br /&gt;
  }&lt;br /&gt;
  return bytes.buffer;&lt;br /&gt;
}&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Example #2 - decoding strings ===&lt;br /&gt;
&lt;br /&gt;
The following example decodes an ArrayBuffer containing data encoded in the format produced by the previous example back into an array of strings.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
function decodeArrayOfStrings(buffer, encoding) {&lt;br /&gt;
  var decoder, view, offset, num_strings, strings, i, len;&lt;br /&gt;
&lt;br /&gt;
  decoder = new TextDecoder(encoding);&lt;br /&gt;
  view = new DataView(buffer);&lt;br /&gt;
  offset = 0;&lt;br /&gt;
  strings = [];&lt;br /&gt;
&lt;br /&gt;
  num_strings = view.getUint32(offset);&lt;br /&gt;
  offset += Uint32Array.BYTES_PER_ELEMENT;&lt;br /&gt;
  for (i = 0; i &amp;lt; num_strings; i += 1) {&lt;br /&gt;
    len = view.getUint32(offset);&lt;br /&gt;
    offset += Uint32Array.BYTES_PER_ELEMENT;&lt;br /&gt;
    strings[i] = decoder.decode(&lt;br /&gt;
      new DataView(view.buffer, offset, len));&lt;br /&gt;
    offset += len;&lt;br /&gt;
  }&lt;br /&gt;
  return strings;&lt;br /&gt;
}&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Encodings ==&lt;br /&gt;
&lt;br /&gt;
Encodings are defined and implemented per [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding]. This implicitly includes the steps to &amp;lt;em&amp;gt;get an encoding&amp;lt;/em&amp;gt; from a string, and the logic for label matching and case-insensitivity.&lt;br /&gt;
&lt;br /&gt;
User agents MUST NOT support any other encodings or labels than those defined in [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding], and MUST support all encodings and labels defined in that specification, with the additions defined below.&lt;br /&gt;
&lt;br /&gt;
:&#039;&#039;NOTE: In [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding], &amp;quot;ascii&amp;quot; is a label for &#039;&#039;&#039;windows-1252&#039;&#039;&#039;; there is no 7-bit-or-raise-exception encoding. Applications that are required to restrict the content of decoded strings should implement validation after decoding.&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
:&#039;&#039;NOTE: Unicode normalization forms are outside the scope of this specification. No normalization is done prior to encoding or after decoding.&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
:&#039;&#039;NOTE: Handling of encoding-specific issues, e.g. over-long UTF-8 encodings, byte order marks, unmatched surrogate pairs, and so on is defined by [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding].&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
=== Additional Encodings ===&lt;br /&gt;
&lt;br /&gt;
The following additional encodings are defined by this specification. They are specific to the methods defined herein.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;table border=1 cellpadding=5&amp;gt;&lt;br /&gt;
&amp;lt;tr&amp;gt;&amp;lt;th&amp;gt;Name&amp;lt;th&amp;gt;Labels&lt;br /&gt;
&amp;lt;tr&amp;gt;&amp;lt;td&amp;gt;binary&amp;lt;td&amp;gt;&amp;quot;binary&amp;quot;&lt;br /&gt;
&amp;lt;/table&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== binary ====&lt;br /&gt;
&lt;br /&gt;
The &#039;&#039;&#039;binary&#039;&#039;&#039; encoding is a single-byte encoding where the input code point and output byte are identical for the range &#039;&#039;&#039;U+0000&#039;&#039;&#039; to &#039;&#039;&#039;U+00ff&#039;&#039;&#039;. &lt;br /&gt;
&lt;br /&gt;
:&#039;&#039;NOTE: This encoding is intended to allow interoperation with legacy code that encodes binary data in ECMAScript strings, for example the WindowBase64 methods [http://dev.w3.org/html5/spec-author-view/webappapis.html#dom-windowbase64-atob atob()]/[http://dev.w3.org/html5/spec-author-view/webappapis.html#dom-windowbase64-btoa btoa()]. It is recommended that new Web applications use Typed Arrays for transmission and storage of binary data.&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
If &#039;&#039;&#039;binary&#039;&#039;&#039; is selected as the encoding, then step 3 of the steps to &#039;&#039;decode a byte stream&#039;&#039; is skipped; no BOM detection is performed.&lt;br /&gt;
&lt;br /&gt;
:&#039;&#039;NOTE: If &amp;lt;code&amp;gt;&amp;quot;binary&amp;quot;&amp;lt;/code&amp;gt; is specified, byte order marks must be ignored.&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
The &#039;&#039;&#039;binary decoder&#039;&#039;&#039; is:&lt;br /&gt;
# Let &amp;lt;var&amp;gt;byte&amp;lt;/var&amp;gt; be byte pointer.&lt;br /&gt;
# If &amp;lt;var&amp;gt;byte&amp;lt;/var&amp;gt; is the EOF byte, emit the EOF code point.&lt;br /&gt;
# Increase the byte pointer by one.&lt;br /&gt;
# Emit a code point whose value is &amp;lt;var&amp;gt;byte&amp;lt;/var&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The &#039;&#039;&#039;binary encoder&#039;&#039;&#039; is:&lt;br /&gt;
# Let &amp;lt;var&amp;gt;code point&amp;lt;/var&amp;gt; be the code point pointer&lt;br /&gt;
# If &amp;lt;var&amp;gt;code point&amp;lt;/var&amp;gt; is the EOF code point, emit the EOF byte.&lt;br /&gt;
# Increase the code point pointer by one.&lt;br /&gt;
# If &amp;lt;var&amp;gt;code point&amp;lt;/var&amp;gt; is in the range U+0000 to U+00FF, emit a byte whose value is &amp;lt;var&amp;gt;code point&amp;lt;/var&amp;gt;&lt;br /&gt;
# Emit an encoder error&lt;br /&gt;
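&lt;br /&gt;
The binary decoder and encoder above amount to a one-to-one mapping between byte values and the code points U+0000 to U+00FF. A minimal sketch follows, using the hypothetical helper names binaryDecode and binaryEncode, which are not part of this API, and a RangeError standing in for the encoder error:&lt;br /&gt;
&lt;br /&gt;
```javascript
// Hypothetical helpers for illustration; they are not defined by this API.

// Each byte becomes the code point with the same value.
function binaryDecode(bytes) {
  var out = '';
  bytes.forEach(function (byte) {
    out += String.fromCharCode(byte);
  });
  return out;
}

// Each code point in U+0000..U+00FF becomes the byte with the same value.
// Anything above U+00FF is reported as an encoder error (RangeError here,
// chosen for the sketch only).
function binaryEncode(string) {
  var bytes = new Uint8Array(string.length);
  var i, codeUnit;
  for (i = 0; i !== string.length; i += 1) {
    codeUnit = string.charCodeAt(i);
    if (codeUnit > 0xFF) {
      throw new RangeError('encoder error at index ' + i);
    }
    bytes[i] = codeUnit;
  }
  return bytes;
}

var roundTripped = binaryEncode(binaryDecode(new Uint8Array([0, 65, 255])));
console.log(roundTripped);
```
&lt;br /&gt;
Under these assumptions, passing the result of atob() to binaryEncode would yield the decoded Base64 data as bytes, which is the legacy interoperation case this encoding targets.&lt;br /&gt;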
&lt;br /&gt;
== References ==&lt;br /&gt;
&lt;br /&gt;
* WebIDL http://dev.w3.org/2006/webapi/WebIDL&lt;br /&gt;
* Encoding http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== Acknowledgements ==&lt;br /&gt;
&lt;br /&gt;
* Alan Chaney&lt;br /&gt;
* Ben Noordhuis&lt;br /&gt;
* Glenn Maynard&lt;br /&gt;
* John Tamplin&lt;br /&gt;
* Kenneth Russell (Google, Inc)&lt;br /&gt;
* Robert Mustacchi&lt;br /&gt;
* Ryan Dahl&lt;br /&gt;
* Anne van Kesteren&lt;br /&gt;
&lt;br /&gt;
== Appendix ==&lt;br /&gt;
&lt;br /&gt;
A &amp;quot;shim&amp;quot; implementation in JavaScript (that may not fully match the current version of the spec) plus some initial unit tests can be found at:&lt;br /&gt;
&lt;br /&gt;
:http://code.google.com/p/stringencoding/&lt;br /&gt;
&lt;br /&gt;
[[Category:Proposals]]&lt;/div&gt;</summary>
		<author><name>Jsbell</name></author>
	</entry>
	<entry>
		<id>https://wiki.whatwg.org/index.php?title=StringEncoding&amp;diff=8234</id>
		<title>StringEncoding</title>
		<link rel="alternate" type="text/html" href="https://wiki.whatwg.org/index.php?title=StringEncoding&amp;diff=8234"/>
		<updated>2012-05-24T17:38:29Z</updated>

		<summary type="html">&lt;p&gt;Jsbell: /* Open Issues */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;Proposed Text Encoding Web API for Typed Arrays&lt;br /&gt;
&lt;br /&gt;
== Editors ==&lt;br /&gt;
&lt;br /&gt;
* Joshua Bell (Google, Inc)&lt;br /&gt;
&lt;br /&gt;
== Abstract ==&lt;br /&gt;
&lt;br /&gt;
This specification defines an API for encoding strings to binary data, and decoding strings from binary data.&lt;br /&gt;
&lt;br /&gt;
NOTE: This specification intentionally does not address the opposite scenario of encoding binary data as strings and decoding binary data from strings, for example using Base64 encoding.&lt;br /&gt;
&lt;br /&gt;
Discussion on this topic has so far taken place on the [http://www.khronos.org/webgl/public-mailing-list/ public_webgl@khronos.org] mailing list. See http://www.khronos.org/webgl/public-mailing-list/archives/1111/msg00017.html for the initial discussion thread.&lt;br /&gt;
&lt;br /&gt;
Discussion has since moved to the [http://www.whatwg.org/mailing-list#specs WHATWG spec discussion] mailing list. See  http://lists.whatwg.org/htdig.cgi/whatwg-whatwg.org/2012-March/035038.html for the latest discussion thread.&lt;br /&gt;
&lt;br /&gt;
== Open Issues ==&lt;br /&gt;
&lt;br /&gt;
* Should this be a standalone API (as written), or live on e.g. DataView, or on e.g. String?&lt;br /&gt;
* Encoding errors&lt;br /&gt;
* The [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding specification] defines the byte order mark as more authoritative than an explicit encoding label when decoding content. &lt;br /&gt;
** Is that desirable here? &lt;br /&gt;
** What does it mean to e.g. write TextDecoder(&#039;iso-2022-kr&#039;).decode(str) if string turns out to be UTF-8?&lt;br /&gt;
** What about streaming and other re-uses of the same decoder object?&lt;br /&gt;
* Remove &#039;&#039;binary&#039;&#039; encoding? The only real use is with &amp;lt;code&amp;gt;atob()&amp;lt;/code&amp;gt;/&amp;lt;code&amp;gt;btoa()&amp;lt;/code&amp;gt;; a better API would be Base64 directly in/out of Typed Arrays&lt;br /&gt;
&lt;br /&gt;
== Resolved Issues ==&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dl&amp;gt;&lt;br /&gt;
&amp;lt;dt&amp;gt;Encode as many characters as possible into a fixed-size buffer for transmission, and repeat starting with next unencoded character&lt;br /&gt;
&amp;lt;dd&amp;gt;Resolution: not for &amp;quot;v1&amp;quot; - can be implemented using this API, and breaking the string down by &amp;quot;character&amp;quot; is unlikely to be as obvious as it sounds - surrogate pairs, combining sequences, etc&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dt&amp;gt;Support two versions of encode; one which takes target buffer, and one which creates/returns a right-sized buffer&lt;br /&gt;
&amp;lt;dd&amp;gt;Resolution: See above, and wait for developer feedback.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dt&amp;gt;Allow arbitrary end byte sequences (e.g. &amp;lt;code&amp;gt;0xFF&amp;lt;/code&amp;gt; for UTF-8 strings)&lt;br /&gt;
&amp;lt;dd&amp;gt;Resolution: Add &amp;lt;code&amp;gt;indexOf&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;ArrayBufferView&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;/dl&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== API ==&lt;br /&gt;
&lt;br /&gt;
=== TextEncoder ===&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;WebIDL&#039;&#039;&#039;&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
dictionary TextEncodeOptions {&lt;br /&gt;
  boolean stream = false;&lt;br /&gt;
};&lt;br /&gt;
&lt;br /&gt;
[Constructor,&lt;br /&gt;
 Constructor(DOMString encoding)]&lt;br /&gt;
interface TextEncoder {&lt;br /&gt;
  readonly attribute DOMString encoding;&lt;br /&gt;
  ArrayBufferView encode(DOMString? string, optional TextEncodeOptions options);&lt;br /&gt;
};&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The constructor runs the following steps:&lt;br /&gt;
# If the constructor is called with no arguments, let &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt; be &amp;quot;&amp;lt;code&amp;gt;utf-8&amp;lt;/code&amp;gt;&amp;quot;. Otherwise, let &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt; be the value of the &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; argument.&lt;br /&gt;
# The constructor follows the &#039;&#039;&#039;steps to get an encoding&#039;&#039;&#039; from [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding], with &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt; as &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt;.&lt;br /&gt;
#* If the steps result in failure, throw an &amp;quot;&amp;lt;code&amp;gt;EncodingError&amp;lt;/code&amp;gt;&amp;quot; exception and terminate these steps.&lt;br /&gt;
#* Otherwise, set the encoder object&#039;s internal &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; property to the returned encoding.&lt;br /&gt;
# Initialize the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag of the encoder object to false. &lt;br /&gt;
# Initialize the internal &amp;lt;var&amp;gt;encoding algorithm state&amp;lt;/var&amp;gt; to the default values for the encoding &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dl&amp;gt;&lt;br /&gt;
&amp;lt;dt&amp;gt;&amp;lt;code&amp;gt;encoding&amp;lt;/code&amp;gt; of type DOMString, readonly&lt;br /&gt;
&amp;lt;dd&amp;gt;Returns the &amp;lt;var&amp;gt;Name&amp;lt;/var&amp;gt; of the encoder object&#039;s &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;, per [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding]. &lt;br /&gt;
&lt;br /&gt;
:Note that this may differ from the name of the encoding specified during the call to the constructor. For example, if the constructor is called with &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; of &amp;lt;code&amp;gt;&amp;quot;ascii&amp;quot;&amp;lt;/code&amp;gt; the &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; attribute of the &#039;&#039;encoder object&#039;&#039; would have the value &amp;lt;code&amp;gt;&amp;quot;windows-1252&amp;quot;&amp;lt;/code&amp;gt; as &amp;lt;code&amp;gt;&amp;quot;ascii&amp;quot;&amp;lt;/code&amp;gt; is a label for that encoding.&lt;br /&gt;
&amp;lt;/dl&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dl&amp;gt;&lt;br /&gt;
&amp;lt;dt&amp;gt;&amp;lt;code&amp;gt;encode&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;dd&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The &amp;lt;code&amp;gt;encode&amp;lt;/code&amp;gt; method runs these steps:&lt;br /&gt;
&lt;br /&gt;
# If the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag is not set, then reset the &amp;lt;var&amp;gt;encoding algorithm state&amp;lt;/var&amp;gt; to the default values for &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;. Otherwise, the &amp;lt;var&amp;gt;encoding algorithm state&amp;lt;/var&amp;gt; is re-used from the previous call to &amp;lt;code&amp;gt;encode&amp;lt;/code&amp;gt; on this object.&lt;br /&gt;
# If the &amp;lt;var&amp;gt;options&amp;lt;/var&amp;gt; parameter is specified and the &amp;lt;code&amp;gt;stream&amp;lt;/code&amp;gt; option is &#039;&#039;&#039;true&#039;&#039;&#039;, then the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag is set; otherwise the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag is cleared.&lt;br /&gt;
# Run the steps of the &amp;lt;var&amp;gt;encoding algorithm&amp;lt;/var&amp;gt;:&lt;br /&gt;
#* The input to the algorithm is a &amp;lt;var&amp;gt;stream of code points&amp;lt;/var&amp;gt;. The code units within the DOMString &amp;lt;var&amp;gt;string&amp;lt;/var&amp;gt; are interpreted as UTF-16 code units, to produce the &amp;lt;var&amp;gt;stream of code points&amp;lt;/var&amp;gt;; if &amp;lt;var&amp;gt;string&amp;lt;/var&amp;gt; is null, the &amp;lt;var&amp;gt;stream of code points&amp;lt;/var&amp;gt; is empty.&lt;br /&gt;
#*:&#039;&#039;ISSUE: Interpreting a DOMString as UTF-16 to yield a code point stream needs to be defined, including unpaired surrogates. [http://dev.w3.org/2006/webapi/WebIDL/#idl-DOMString WebIDL] only defines the reverse.&#039;&#039;&lt;br /&gt;
#* If the &amp;lt;var&amp;gt;options&amp;lt;/var&amp;gt; parameter is not specified or the &amp;lt;code&amp;gt;stream&amp;lt;/code&amp;gt; option is false, then after the final code point is yielded by the stream, the &#039;&#039;&#039;EOF code point&#039;&#039;&#039; is yielded.&lt;br /&gt;
#* The output of the algorithm is a &amp;lt;var&amp;gt;sequence of emitted bytes&amp;lt;/var&amp;gt;.&lt;br /&gt;
# Return a &amp;lt;code&amp;gt;Uint8Array&amp;lt;/code&amp;gt; object wrapping an &amp;lt;code&amp;gt;ArrayBuffer&amp;lt;/code&amp;gt; containing the &amp;lt;var&amp;gt;sequence of emitted bytes&amp;lt;/var&amp;gt; produced by the encoding algorithm.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/dl&amp;gt;&lt;br /&gt;
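&lt;br /&gt;
The following is a minimal usage sketch, not part of the proposal. It assumes a runtime that ships the later standardized &amp;lt;code&amp;gt;TextEncoder&amp;lt;/code&amp;gt; (modern browsers, Node.js), which always encodes as UTF-8 and accepts neither a constructor label nor a &amp;lt;code&amp;gt;stream&amp;lt;/code&amp;gt; option, so it only illustrates the default no-argument case described above.&lt;br /&gt;
&lt;br /&gt;
```javascript
// Assumption: the global TextEncoder is the standardized, UTF-8-only one,
// not the multi-encoding interface drafted in this document.
const encoder = new TextEncoder();

// U+00E9 needs two bytes in UTF-8, so 5 characters become 6 bytes.
const bytes = encoder.encode("h\u00e9llo");

console.log(encoder.encoding);            // "utf-8"
console.log(bytes instanceof Uint8Array); // true
console.log(Array.from(bytes));           // [104, 195, 169, 108, 108, 111]
```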
&lt;br /&gt;
=== TextDecoder ===&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;WebIDL&#039;&#039;&#039;&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
dictionary TextDecoderOptions {&lt;br /&gt;
  boolean fatal = false;&lt;br /&gt;
};&lt;br /&gt;
&lt;br /&gt;
dictionary TextDecodeOptions {&lt;br /&gt;
  boolean stream = false;&lt;br /&gt;
};&lt;br /&gt;
&lt;br /&gt;
[Constructor,&lt;br /&gt;
 Constructor(optional DOMString encoding, optional TextDecoderOptions options)]&lt;br /&gt;
interface TextDecoder {&lt;br /&gt;
  readonly attribute DOMString encoding;&lt;br /&gt;
  DOMString decode(optional ArrayBufferView view, optional TextDecodeOptions options);&lt;br /&gt;
};&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The constructor runs the following steps:&lt;br /&gt;
# The constructor creates a &#039;&#039;decoder object&#039;&#039;. It has the internal properties &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;, a &amp;lt;var&amp;gt;fatal&amp;lt;/var&amp;gt; flag which is initially unset, and a &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag which is initially unset.&lt;br /&gt;
# If called without an &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; argument, let &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt; be &amp;quot;&amp;lt;code&amp;gt;utf-8&amp;lt;/code&amp;gt;&amp;quot;. Otherwise, let &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt; be the value of the &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; argument.&lt;br /&gt;
# Run the &#039;&#039;&#039;steps to get an encoding&#039;&#039;&#039; from [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding], with &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt; as &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt;. &lt;br /&gt;
## If the steps result in failure, throw an &amp;quot;&amp;lt;code&amp;gt;EncodingError&amp;lt;/code&amp;gt;&amp;quot; exception and terminate these steps.&lt;br /&gt;
## Otherwise, set the &#039;&#039;decoder object&#039;s&#039;&#039; internal &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; property to the returned encoding.&lt;br /&gt;
# If the constructor is called with an &amp;lt;var&amp;gt;options&amp;lt;/var&amp;gt; argument, and the &amp;lt;var&amp;gt;fatal&amp;lt;/var&amp;gt; property of the dictionary is set, set the internal &amp;lt;var&amp;gt;fatal&amp;lt;/var&amp;gt; flag of the &#039;&#039;decoder object&#039;&#039;.&lt;br /&gt;
# Initialize the internal &amp;lt;var&amp;gt;encoding algorithm state&amp;lt;/var&amp;gt; to the default values for the encoding &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;.&lt;br /&gt;
# Return the &#039;&#039;decoder object&#039;&#039;.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dl&amp;gt;&lt;br /&gt;
&amp;lt;dt&amp;gt;&amp;lt;code&amp;gt;encoding&amp;lt;/code&amp;gt; of type DOMString, readonly&lt;br /&gt;
&amp;lt;dd&amp;gt;&lt;br /&gt;
Returns the &amp;lt;var&amp;gt;Name&amp;lt;/var&amp;gt; of the decoder object&#039;s &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;, per [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding].&lt;br /&gt;
&lt;br /&gt;
:Note that this may differ from the name of the encoding specified during the call to the constructor. For example, if the constructor is called with &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; of &amp;lt;code&amp;gt;&amp;quot;ascii&amp;quot;&amp;lt;/code&amp;gt; the &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; attribute of the &#039;&#039;decoder object&#039;&#039; would have the value &amp;lt;code&amp;gt;&amp;quot;windows-1252&amp;quot;&amp;lt;/code&amp;gt; as &amp;lt;code&amp;gt;&amp;quot;ascii&amp;quot;&amp;lt;/code&amp;gt; is a label for that encoding.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dt&amp;gt;&amp;lt;code&amp;gt;decode&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;dd&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The &amp;lt;code&amp;gt;decode&amp;lt;/code&amp;gt; method runs the following steps:&lt;br /&gt;
&lt;br /&gt;
# If the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag of the &#039;&#039;decoder object&#039;&#039; is not set, then reset the &amp;lt;var&amp;gt;encoding algorithm state&amp;lt;/var&amp;gt; to the default values for encoding &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;. Otherwise, the &amp;lt;var&amp;gt;encoding algorithm state&amp;lt;/var&amp;gt; is re-used from the previous call to &amp;lt;code&amp;gt;decode&amp;lt;/code&amp;gt; on this object. &lt;br /&gt;
# If the &amp;lt;var&amp;gt;options&amp;lt;/var&amp;gt; parameter is specified and the &amp;lt;var&amp;gt;stream&amp;lt;/var&amp;gt; option is &#039;&#039;&#039;true&#039;&#039;&#039;, then the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag is set; otherwise the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag is cleared.&lt;br /&gt;
# Run the decoder algorithm of the &#039;&#039;decoder object&#039;s&#039;&#039; encoding:&lt;br /&gt;
#* The input to the algorithm is a &amp;lt;var&amp;gt;byte stream&amp;lt;/var&amp;gt;. The &amp;lt;var&amp;gt;byte stream&amp;lt;/var&amp;gt; is provided by the bytes in &amp;lt;var&amp;gt;view.buffer&amp;lt;/var&amp;gt; starting at offset &amp;lt;var&amp;gt;view.byteOffset&amp;lt;/var&amp;gt;. A maximum of &amp;lt;var&amp;gt;view.byteLength&amp;lt;/var&amp;gt; bytes are yielded by the stream from &amp;lt;var&amp;gt;view.buffer&amp;lt;/var&amp;gt;. If &amp;lt;var&amp;gt;view&amp;lt;/var&amp;gt; is not specified, the stream is empty.&lt;br /&gt;
#* If the &amp;lt;var&amp;gt;options&amp;lt;/var&amp;gt; parameter is not specified or the &#039;&#039;&#039;stream&#039;&#039;&#039; option is &#039;&#039;&#039;false&#039;&#039;&#039;, then after &amp;lt;var&amp;gt;view.byteLength&amp;lt;/var&amp;gt; bytes are yielded by the stream the &#039;&#039;&#039;EOF byte&#039;&#039;&#039; is yielded.&lt;br /&gt;
#* If the internal &amp;lt;var&amp;gt;fatal&amp;lt;/var&amp;gt; flag of the &#039;&#039;decoder object&#039;&#039; is set, then a &#039;&#039;&#039;decoder error&#039;&#039;&#039; causes a &amp;lt;code&amp;gt;DOMException&amp;lt;/code&amp;gt; of type &amp;lt;code&amp;gt;EncodingError&amp;lt;/code&amp;gt; to be thrown rather than emitting a fallback code point.&lt;br /&gt;
#* The output of the algorithm is a sequence of &amp;lt;var&amp;gt;emitted code points&amp;lt;/var&amp;gt;.&lt;br /&gt;
# Return a &amp;lt;code&amp;gt;DOMString&amp;lt;/code&amp;gt; by encoding the &amp;lt;var&amp;gt;sequence of emitted code points&amp;lt;/var&amp;gt; as UTF-16 as per [http://dev.w3.org/2006/webapi/WebIDL/#idl-DOMString WebIDL].&lt;br /&gt;
&lt;br /&gt;
:&#039;&#039;ISSUE: Need to handle BOMs. [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding] specifies this by using the caller-specified encoding as a suggestion, and consuming the BOM as a part of selecting the real encoder where BOM takes precedence. At the very least a matching BOM should be ignored. A mismatching BOM could throw, be a decoding error, or could actually switch the decoder for this stream (call or sequence of calls), possibly if-and-only-if the constructor was called without an encoding.&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/dl&amp;gt;&lt;br /&gt;
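&lt;br /&gt;
The streaming and fatal behaviours described above can be sketched as follows. This is an illustration only, assuming a runtime with the later standardized &amp;lt;code&amp;gt;TextDecoder&amp;lt;/code&amp;gt; (modern browsers, Node.js); that API reports fatal errors as a &amp;lt;code&amp;gt;TypeError&amp;lt;/code&amp;gt; rather than the &amp;lt;code&amp;gt;EncodingError&amp;lt;/code&amp;gt; named in this draft, so the sketch catches any exception. The helper &amp;lt;code&amp;gt;decodeStrict&amp;lt;/code&amp;gt; is hypothetical.&lt;br /&gt;
&lt;br /&gt;
```javascript
// Assumption: the global TextDecoder is the standardized one.
const euro = new Uint8Array([0xE2, 0x82, 0xAC]); // UTF-8 bytes for U+20AC

// Streaming: the first call ends mid-sequence, so the decoder keeps its
// state and emits nothing; the second call (no stream option) finishes
// the sequence and yields the EOF byte.
const decoder = new TextDecoder("utf-8");
const first = decoder.decode(euro.subarray(0, 2), { stream: true });
const second = decoder.decode(euro.subarray(2));

// Fatal mode: a decoder error throws instead of emitting U+FFFD.
// Hypothetical convention for this sketch: null signals invalid input.
function decodeStrict(bytes) {
  try {
    return new TextDecoder("utf-8", { fatal: true }).decode(bytes);
  } catch (e) {
    return null;
  }
}

console.log(JSON.stringify(first));                 // ""
console.log(second === "\u20ac");                   // true
console.log(decodeStrict(new Uint8Array([0xFF])));  // null
```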
&lt;br /&gt;
== Examples ==&lt;br /&gt;
&lt;br /&gt;
=== Example #1 - encoding strings ===&lt;br /&gt;
&lt;br /&gt;
The following example uses the API to encode an array of strings into an ArrayBuffer. The resulting buffer contains the number of strings (as a Uint32), followed by the length of the first string (as a Uint32), the first string&#039;s data in the given encoding, the length of the second string (as a Uint32), the string data, and so on.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
function encodeArrayOfStrings(strings, encoding) {&lt;br /&gt;
  var encoder, encoded, len, i, bytes, view, offset;&lt;br /&gt;
&lt;br /&gt;
  encoder = new TextEncoder(encoding);&lt;br /&gt;
  encoded = [];&lt;br /&gt;
&lt;br /&gt;
  // Total size: one Uint32 count, plus a Uint32 length prefix and the data for each string&lt;br /&gt;
  len = Uint32Array.BYTES_PER_ELEMENT;&lt;br /&gt;
  for (i = 0; i &amp;lt; strings.length; i += 1) {&lt;br /&gt;
    len += Uint32Array.BYTES_PER_ELEMENT;&lt;br /&gt;
    encoded[i] = encoder.encode(strings[i]);&lt;br /&gt;
    len += encoded[i].byteLength;&lt;br /&gt;
  }&lt;br /&gt;
&lt;br /&gt;
  bytes = new Uint8Array(len);&lt;br /&gt;
  view = new DataView(bytes.buffer);&lt;br /&gt;
  offset = 0;&lt;br /&gt;
&lt;br /&gt;
  view.setUint32(offset, strings.length);&lt;br /&gt;
  offset += Uint32Array.BYTES_PER_ELEMENT;&lt;br /&gt;
  for (i = 0; i &amp;lt; encoded.length; i += 1) {&lt;br /&gt;
    len = encoded[i].byteLength;&lt;br /&gt;
    view.setUint32(offset, len);&lt;br /&gt;
    offset += Uint32Array.BYTES_PER_ELEMENT;&lt;br /&gt;
    bytes.set(encoded[i], offset);&lt;br /&gt;
    offset += len;&lt;br /&gt;
  }&lt;br /&gt;
  return bytes.buffer;&lt;br /&gt;
}&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Example #2 - decoding strings ===&lt;br /&gt;
&lt;br /&gt;
The following example decodes an ArrayBuffer containing data encoded in the format produced by the previous example back into an array of strings.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
function decodeArrayOfStrings(buffer, encoding) {&lt;br /&gt;
  var decoder, view, offset, num_strings, strings, i, len;&lt;br /&gt;
&lt;br /&gt;
  decoder = new TextDecoder(encoding);&lt;br /&gt;
  view = new DataView(buffer);&lt;br /&gt;
  offset = 0;&lt;br /&gt;
  strings = [];&lt;br /&gt;
&lt;br /&gt;
  num_strings = view.getUint32(offset);&lt;br /&gt;
  offset += Uint32Array.BYTES_PER_ELEMENT;&lt;br /&gt;
  for (i = 0; i &amp;lt; num_strings; i += 1) {&lt;br /&gt;
    len = view.getUint32(offset);&lt;br /&gt;
    offset += Uint32Array.BYTES_PER_ELEMENT;&lt;br /&gt;
    strings[i] = decoder.decode(&lt;br /&gt;
      new DataView(view.buffer, offset, len));&lt;br /&gt;
    offset += len;&lt;br /&gt;
  }&lt;br /&gt;
  return strings;&lt;br /&gt;
}&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Encodings ==&lt;br /&gt;
&lt;br /&gt;
Encodings are defined and implemented per [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding]. This implicitly includes the steps to &amp;lt;em&amp;gt;get an encoding&amp;lt;/em&amp;gt; from a string, and the logic for label matching and case-insensitivity.&lt;br /&gt;
&lt;br /&gt;
User agents MUST NOT support any other encodings or labels than those defined in [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding], and MUST support all encodings and labels defined in that specification, with the additions defined below.&lt;br /&gt;
&lt;br /&gt;
:&#039;&#039;NOTE: In [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding], &amp;quot;ascii&amp;quot; is a label for &#039;&#039;&#039;windows-1252&#039;&#039;&#039;; there is no 7-bit-or-raise-exception encoding. Applications that are required to restrict the content of decoded strings should implement validation after decoding.&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
:&#039;&#039;NOTE: Unicode normalization forms are outside the scope of this specification. No normalization is done prior to encoding or after decoding.&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
:&#039;&#039;NOTE: Handling of encoding-specific issues, e.g. over-long UTF-8 encodings, byte order marks, unmatched surrogate pairs, and so on is defined by [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding].&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
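The post-decoding validation suggested in the first note could look like the sketch below. The helper name &amp;lt;code&amp;gt;isSevenBit&amp;lt;/code&amp;gt; is hypothetical and not part of this specification; it only inspects an already decoded string.&lt;br /&gt;
&lt;br /&gt;
```javascript
// Hypothetical helper: true if every UTF-16 code unit of the decoded
// text lies in the 7-bit range U+0000 to U+007F.
function isSevenBit(text) {
  for (let i = 0; i !== text.length; i += 1) {
    if (text.charCodeAt(i) > 0x7F) {
      return false;
    }
  }
  return true;
}

// An application would apply this to the string returned by decode().
console.log(isSevenBit("plain text")); // true
console.log(isSevenBit("caf\u00e9"));  // false
```

&lt;br /&gt;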
=== Additional Encodings ===&lt;br /&gt;
&lt;br /&gt;
The following additional encodings are defined by this specification. They are specific to the methods defined herein.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;table border=1 cellpadding=5&amp;gt;&lt;br /&gt;
&amp;lt;tr&amp;gt;&amp;lt;th&amp;gt;Name&amp;lt;th&amp;gt;Labels&lt;br /&gt;
&amp;lt;tr&amp;gt;&amp;lt;td&amp;gt;binary&amp;lt;td&amp;gt;&amp;quot;binary&amp;quot;&lt;br /&gt;
&amp;lt;/table&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== binary ====&lt;br /&gt;
&lt;br /&gt;
The &#039;&#039;&#039;binary&#039;&#039;&#039; encoding is a single-byte encoding where the input code point and output byte are identical for the range &#039;&#039;&#039;U+0000&#039;&#039;&#039; to &#039;&#039;&#039;U+00FF&#039;&#039;&#039;.&lt;br /&gt;
&lt;br /&gt;
:&#039;&#039;NOTE: This encoding is intended to allow interoperation with legacy code that encodes binary data in ECMAScript strings, for example the WindowBase64 methods [http://dev.w3.org/html5/spec-author-view/webappapis.html#dom-windowbase64-atob atob()]/[http://dev.w3.org/html5/spec-author-view/webappapis.html#dom-windowbase64-btoa btoa()]. It is recommended that new Web applications use Typed Arrays for transmission and storage of binary data.&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
If &#039;&#039;&#039;binary&#039;&#039;&#039; is selected as the encoding, then step 3 of the steps to &#039;&#039;decode a byte stream&#039;&#039; is skipped; no BOM detection is performed.&lt;br /&gt;
&lt;br /&gt;
:&#039;&#039;NOTE: If &amp;lt;code&amp;gt;&amp;quot;binary&amp;quot;&amp;lt;/code&amp;gt; is specified, byte order marks must be ignored.&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
The &#039;&#039;&#039;binary decoder&#039;&#039;&#039; is:&lt;br /&gt;
# Let &amp;lt;var&amp;gt;byte&amp;lt;/var&amp;gt; be byte pointer.&lt;br /&gt;
# If &amp;lt;var&amp;gt;byte&amp;lt;/var&amp;gt; is the EOF byte, emit the EOF code point.&lt;br /&gt;
# Increase the byte pointer by one.&lt;br /&gt;
# Emit a code point whose value is &amp;lt;var&amp;gt;byte&amp;lt;/var&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
The &#039;&#039;&#039;binary encoder&#039;&#039;&#039; is:&lt;br /&gt;
# Let &amp;lt;var&amp;gt;code point&amp;lt;/var&amp;gt; be the code point pointer.&lt;br /&gt;
# If &amp;lt;var&amp;gt;code point&amp;lt;/var&amp;gt; is the EOF code point, emit the EOF byte.&lt;br /&gt;
# Increase the code point pointer by one.&lt;br /&gt;
# If &amp;lt;var&amp;gt;code point&amp;lt;/var&amp;gt; is in the range U+0000 to U+00FF, emit a byte whose value is &amp;lt;var&amp;gt;code point&amp;lt;/var&amp;gt;.&lt;br /&gt;
# Otherwise, emit an encoder error.&lt;br /&gt;
&lt;br /&gt;
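The binary decoder and encoder steps above can be sketched in plain JavaScript as follows. The function names &amp;lt;code&amp;gt;binaryDecode&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;binaryEncode&amp;lt;/code&amp;gt; are hypothetical, and throwing a &amp;lt;code&amp;gt;RangeError&amp;lt;/code&amp;gt; for an encoder error is a choice made for this sketch rather than behaviour defined here.&lt;br /&gt;
&lt;br /&gt;
```javascript
// Decoder: each byte becomes the code point with the same value.
function binaryDecode(bytes) {
  let text = "";
  for (const byte of bytes) {
    text += String.fromCharCode(byte);
  }
  return text;
}

// Encoder: code points U+0000 to U+00FF become the byte with the same
// value. Anything above U+00FF (including surrogate code units, which
// are all above 0xFF) is treated as an encoder error.
function binaryEncode(text) {
  const bytes = new Uint8Array(text.length);
  for (let i = 0; i !== text.length; i += 1) {
    const codePoint = text.charCodeAt(i);
    if (codePoint > 0xFF) {
      throw new RangeError("code point outside U+0000..U+00FF");
    }
    bytes[i] = codePoint;
  }
  return bytes;
}

const decoded = binaryDecode(new Uint8Array([0x00, 0x80, 0xFF]));
console.log(decoded === "\u0000\u0080\u00ff");  // true
console.log(Array.from(binaryEncode(decoded))); // [0, 128, 255]
```

&lt;br /&gt;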
== References ==&lt;br /&gt;
&lt;br /&gt;
* WebIDL http://dev.w3.org/2006/webapi/WebIDL&lt;br /&gt;
* Encoding http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== Acknowledgements ==&lt;br /&gt;
&lt;br /&gt;
* Alan Chaney&lt;br /&gt;
* Ben Noordhuis&lt;br /&gt;
* Glenn Maynard&lt;br /&gt;
* John Tamplin&lt;br /&gt;
* Kenneth Russell (Google, Inc)&lt;br /&gt;
* Robert Mustacchi&lt;br /&gt;
* Ryan Dahl&lt;br /&gt;
* Anne van Kesteren&lt;br /&gt;
&lt;br /&gt;
== Appendix ==&lt;br /&gt;
&lt;br /&gt;
A &amp;quot;shim&amp;quot; implementation in JavaScript (that may not fully match the current version of the spec) plus some initial unit tests can be found at:&lt;br /&gt;
&lt;br /&gt;
:http://code.google.com/p/stringencoding/&lt;br /&gt;
&lt;br /&gt;
[[Category:Proposals]]&lt;/div&gt;</summary>
		<author><name>Jsbell</name></author>
	</entry>
	<entry>
		<id>https://wiki.whatwg.org/index.php?title=StringEncoding&amp;diff=8233</id>
		<title>StringEncoding</title>
		<link rel="alternate" type="text/html" href="https://wiki.whatwg.org/index.php?title=StringEncoding&amp;diff=8233"/>
		<updated>2012-05-24T17:30:48Z</updated>

		<summary type="html">&lt;p&gt;Jsbell: Inline &amp;quot;utf-8&amp;quot; as default encoding&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;Proposed Text Encoding Web API for Typed Arrays&lt;br /&gt;
&lt;br /&gt;
== Editors ==&lt;br /&gt;
&lt;br /&gt;
* Joshua Bell (Google, Inc)&lt;br /&gt;
&lt;br /&gt;
== Abstract ==&lt;br /&gt;
&lt;br /&gt;
This specification defines an API for encoding strings to binary data, and decoding strings from binary data.&lt;br /&gt;
&lt;br /&gt;
NOTE: This specification intentionally does not address the opposite scenario of encoding binary data as strings and decoding binary data from strings, for example using Base64 encoding.&lt;br /&gt;
&lt;br /&gt;
Discussion on this topic has so far taken place on the [http://www.khronos.org/webgl/public-mailing-list/ public_webgl@khronos.org] mailing list. See http://www.khronos.org/webgl/public-mailing-list/archives/1111/msg00017.html for the initial discussion thread.&lt;br /&gt;
&lt;br /&gt;
Discussion has since moved to the [http://www.whatwg.org/mailing-list#specs WHATWG spec discussion] mailing list. See http://lists.whatwg.org/htdig.cgi/whatwg-whatwg.org/2012-March/035038.html for the latest discussion thread.&lt;br /&gt;
&lt;br /&gt;
== Open Issues ==&lt;br /&gt;
&lt;br /&gt;
General: Should this be a standalone API (as written), or live on e.g. DataView, or on e.g. String?&lt;br /&gt;
&lt;br /&gt;
=== Scenarios ===&lt;br /&gt;
&lt;br /&gt;
* Encode as many characters as possible into a fixed-size buffer for transmission, and repeat starting with next unencoded character&lt;br /&gt;
&lt;br /&gt;
=== Desired Features ===&lt;br /&gt;
&lt;br /&gt;
* Allow arbitrary end byte sequences (e.g. &amp;lt;code&amp;gt;0xFF&amp;lt;/code&amp;gt; for UTF-8 strings)&lt;br /&gt;
** &#039;&#039;Tentative Resolution: Add &amp;lt;code&amp;gt;indexOf&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;ArrayBufferView&amp;lt;/code&amp;gt;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
* Encoding errors&lt;br /&gt;
&lt;br /&gt;
=== API cleanup ===&lt;br /&gt;
&lt;br /&gt;
* Support two versions of encode; one which takes target buffer, and one which creates/returns a right-sized buffer&lt;br /&gt;
** &#039;&#039;Tentative Resolution: Wait for developer feedback.&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
=== Spec Issues ===&lt;br /&gt;
&lt;br /&gt;
:&#039;&#039;ISSUE: [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding] defines the byte order mark as more authoritative than anything else. Is that desirable here?&#039;&#039;&lt;br /&gt;
* Remove &#039;&#039;binary&#039;&#039; encoding? The only real use is with &amp;lt;code&amp;gt;atob()&amp;lt;/code&amp;gt;/&amp;lt;code&amp;gt;btoa()&amp;lt;/code&amp;gt;; a better API would be Base64 directly in/out of Typed Arrays&lt;br /&gt;
&lt;br /&gt;
== API ==&lt;br /&gt;
&lt;br /&gt;
=== TextEncoder ===&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;WebIDL&#039;&#039;&#039;&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
dictionary TextEncodeOptions {&lt;br /&gt;
  boolean stream = false;&lt;br /&gt;
};&lt;br /&gt;
&lt;br /&gt;
[Constructor,&lt;br /&gt;
 Constructor(DOMString encoding)]&lt;br /&gt;
interface TextEncoder {&lt;br /&gt;
  readonly attribute DOMString encoding;&lt;br /&gt;
  ArrayBufferView encode(DOMString? string, optional TextEncodeOptions options);&lt;br /&gt;
};&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The constructor runs the following steps:&lt;br /&gt;
# If the constructor is called with no arguments, let &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt; be &amp;quot;&amp;lt;code&amp;gt;utf-8&amp;lt;/code&amp;gt;&amp;quot;. Otherwise, let &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt; be the value of the &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; argument.&lt;br /&gt;
# The constructor follows the &#039;&#039;&#039;steps to get an encoding&#039;&#039;&#039; from [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding], with &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt; as &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt;.&lt;br /&gt;
#* If the steps result in failure, throw an &amp;quot;&amp;lt;code&amp;gt;EncodingError&amp;lt;/code&amp;gt;&amp;quot; exception and terminate these steps.&lt;br /&gt;
#* Otherwise, set the encoder object&#039;s internal &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; property to the returned encoding.&lt;br /&gt;
# Initialize the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag of the encoder object to false. &lt;br /&gt;
# Initialize the internal &amp;lt;var&amp;gt;encoding algorithm state&amp;lt;/var&amp;gt; to the default values for the encoding &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dl&amp;gt;&lt;br /&gt;
&amp;lt;dt&amp;gt;&amp;lt;code&amp;gt;encoding&amp;lt;/code&amp;gt; of type DOMString, readonly&lt;br /&gt;
&amp;lt;dd&amp;gt;Returns the &amp;lt;var&amp;gt;Name&amp;lt;/var&amp;gt; of the encoder object&#039;s &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;, per [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding]. &lt;br /&gt;
&lt;br /&gt;
:Note that this may differ from the name of the encoding specified during the call to the constructor. For example, if the constructor is called with &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; of &amp;lt;code&amp;gt;&amp;quot;ascii&amp;quot;&amp;lt;/code&amp;gt; the &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; attribute of the &#039;&#039;encoder object&#039;&#039; would have the value &amp;lt;code&amp;gt;&amp;quot;windows-1252&amp;quot;&amp;lt;/code&amp;gt; as &amp;lt;code&amp;gt;&amp;quot;ascii&amp;quot;&amp;lt;/code&amp;gt; is a label for that encoding.&lt;br /&gt;
&amp;lt;/dl&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dl&amp;gt;&lt;br /&gt;
&amp;lt;dt&amp;gt;&amp;lt;code&amp;gt;encode&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;dd&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The &amp;lt;code&amp;gt;encode&amp;lt;/code&amp;gt; method runs these steps:&lt;br /&gt;
&lt;br /&gt;
# If the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag is not set, then reset the &amp;lt;var&amp;gt;encoding algorithm state&amp;lt;/var&amp;gt; to the default values for &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;. Otherwise, the &amp;lt;var&amp;gt;encoding algorithm state&amp;lt;/var&amp;gt; is re-used from the previous call to &amp;lt;code&amp;gt;encode&amp;lt;/code&amp;gt; on this object.&lt;br /&gt;
# If the &amp;lt;var&amp;gt;options&amp;lt;/var&amp;gt; parameter is specified and the &amp;lt;code&amp;gt;stream&amp;lt;/code&amp;gt; option is &#039;&#039;&#039;true&#039;&#039;&#039;, then the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag is set; otherwise the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag is cleared.&lt;br /&gt;
# Run the steps of the &amp;lt;var&amp;gt;encoding algorithm&amp;lt;/var&amp;gt;:&lt;br /&gt;
#* The input to the algorithm is a &amp;lt;var&amp;gt;stream of code points&amp;lt;/var&amp;gt;. The code units within the DOMString &amp;lt;var&amp;gt;string&amp;lt;/var&amp;gt; are interpreted as UTF-16 code units, to produce the &amp;lt;var&amp;gt;stream of code points&amp;lt;/var&amp;gt;; if &amp;lt;var&amp;gt;string&amp;lt;/var&amp;gt; is null, the &amp;lt;var&amp;gt;stream of code points&amp;lt;/var&amp;gt; is empty.&lt;br /&gt;
#*:&#039;&#039;ISSUE: Interpreting a DOMString as UTF-16 to yield a code point stream needs to be defined, including unpaired surrogates. [http://dev.w3.org/2006/webapi/WebIDL/#idl-DOMString WebIDL] only defines the reverse.&#039;&#039;&lt;br /&gt;
#* If the &amp;lt;var&amp;gt;options&amp;lt;/var&amp;gt; parameter is not specified or the &amp;lt;code&amp;gt;stream&amp;lt;/code&amp;gt; option is false, then after the final code point is yielded by the stream, the &#039;&#039;&#039;EOF code point&#039;&#039;&#039; is yielded.&lt;br /&gt;
#* The output of the algorithm is a &amp;lt;var&amp;gt;sequence of emitted bytes&amp;lt;/var&amp;gt;.&lt;br /&gt;
# Return a &amp;lt;code&amp;gt;Uint8Array&amp;lt;/code&amp;gt; object wrapping an &amp;lt;code&amp;gt;ArrayBuffer&amp;lt;/code&amp;gt; containing the &amp;lt;var&amp;gt;sequence of emitted bytes&amp;lt;/var&amp;gt; produced by the encoding algorithm.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/dl&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== TextDecoder ===&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;WebIDL&#039;&#039;&#039;&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
dictionary TextDecoderOptions {&lt;br /&gt;
  boolean fatal = false;&lt;br /&gt;
};&lt;br /&gt;
&lt;br /&gt;
dictionary TextDecodeOptions {&lt;br /&gt;
  boolean stream = false;&lt;br /&gt;
};&lt;br /&gt;
&lt;br /&gt;
[Constructor,&lt;br /&gt;
 Constructor(optional DOMString encoding, optional TextDecoderOptions options)]&lt;br /&gt;
interface TextDecoder {&lt;br /&gt;
  readonly attribute DOMString encoding;&lt;br /&gt;
  DOMString decode(optional ArrayBufferView view, optional TextDecodeOptions options);&lt;br /&gt;
};&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The constructor runs the following steps:&lt;br /&gt;
# The constructor creates a &#039;&#039;decoder object&#039;&#039;. It has the internal properties &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;, a &amp;lt;var&amp;gt;fatal&amp;lt;/var&amp;gt; flag which is initially unset, and a &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag which is initially unset.&lt;br /&gt;
# If called without an &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; argument, let &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt; be &amp;quot;&amp;lt;code&amp;gt;utf-8&amp;lt;/code&amp;gt;&amp;quot;. Otherwise, let &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt; be the value of the &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; argument.&lt;br /&gt;
# Run the &#039;&#039;&#039;steps to get an encoding&#039;&#039;&#039; from [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding], with &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt; as &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt;. &lt;br /&gt;
## If the steps result in failure, throw an &amp;quot;&amp;lt;code&amp;gt;EncodingError&amp;lt;/code&amp;gt;&amp;quot; exception and terminate these steps.&lt;br /&gt;
## Otherwise, set the &#039;&#039;decoder object&#039;s&#039;&#039; internal &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; property to the returned encoding.&lt;br /&gt;
# If the constructor is called with an &amp;lt;var&amp;gt;options&amp;lt;/var&amp;gt; argument, and the &amp;lt;var&amp;gt;fatal&amp;lt;/var&amp;gt; property of the dictionary is set, set the internal &amp;lt;var&amp;gt;fatal&amp;lt;/var&amp;gt; flag of the &#039;&#039;decoder object&#039;&#039;.&lt;br /&gt;
# Initialize the internal &amp;lt;var&amp;gt;encoding algorithm state&amp;lt;/var&amp;gt; to the default values for the encoding &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;.&lt;br /&gt;
# Return the &#039;&#039;decoder object&#039;&#039;.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dl&amp;gt;&lt;br /&gt;
&amp;lt;dt&amp;gt;&amp;lt;code&amp;gt;encoding&amp;lt;/code&amp;gt; of type DOMString, readonly&lt;br /&gt;
&amp;lt;dd&amp;gt;&lt;br /&gt;
Returns the &amp;lt;var&amp;gt;Name&amp;lt;/var&amp;gt; of the decoder object&#039;s &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;, per [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding].&lt;br /&gt;
&lt;br /&gt;
:Note that this may differ from the name of the encoding specified during the call to the constructor. For example, if the constructor is called with &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; of &amp;lt;code&amp;gt;&amp;quot;ascii&amp;quot;&amp;lt;/code&amp;gt; the &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; attribute of the &#039;&#039;decoder object&#039;&#039; would have the value &amp;lt;code&amp;gt;&amp;quot;windows-1252&amp;quot;&amp;lt;/code&amp;gt; as &amp;lt;code&amp;gt;&amp;quot;ascii&amp;quot;&amp;lt;/code&amp;gt; is a label for that encoding.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dt&amp;gt;&amp;lt;code&amp;gt;decode&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;dd&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The &amp;lt;code&amp;gt;decode&amp;lt;/code&amp;gt; method runs the following steps:&lt;br /&gt;
&lt;br /&gt;
# If the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag of the &#039;&#039;decoder object&#039;&#039; is not set, then reset the &amp;lt;var&amp;gt;encoding algorithm state&amp;lt;/var&amp;gt; to the default values for encoding &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;. Otherwise, the &amp;lt;var&amp;gt;encoding algorithm state&amp;lt;/var&amp;gt; is re-used from the previous call to &amp;lt;code&amp;gt;decode&amp;lt;/code&amp;gt; on this object. &lt;br /&gt;
# If the &amp;lt;var&amp;gt;options&amp;lt;/var&amp;gt; parameter is specified and the &amp;lt;var&amp;gt;stream&amp;lt;/var&amp;gt; option is &#039;&#039;&#039;true&#039;&#039;&#039;, then the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag is set; otherwise the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag is cleared.&lt;br /&gt;
# Run the decoder algorithm of the &#039;&#039;decoder object&#039;s&#039;&#039; encoding:&lt;br /&gt;
#* The input to the algorithm is a &amp;lt;var&amp;gt;byte stream&amp;lt;/var&amp;gt;. The &amp;lt;var&amp;gt;byte stream&amp;lt;/var&amp;gt; is provided by the bytes in &amp;lt;var&amp;gt;view.buffer&amp;lt;/var&amp;gt; starting at offset &amp;lt;var&amp;gt;view.byteOffset&amp;lt;/var&amp;gt;. A maximum of &amp;lt;var&amp;gt;view.byteLength&amp;lt;/var&amp;gt; bytes are yielded by the stream from &amp;lt;var&amp;gt;view.buffer&amp;lt;/var&amp;gt;. If &amp;lt;var&amp;gt;view&amp;lt;/var&amp;gt; is not specified, the stream is empty.&lt;br /&gt;
#* If the &amp;lt;var&amp;gt;options&amp;lt;/var&amp;gt; parameter is not specified or the &#039;&#039;&#039;stream&#039;&#039;&#039; option is &#039;&#039;&#039;false&#039;&#039;&#039;, then after &amp;lt;var&amp;gt;view.byteLength&amp;lt;/var&amp;gt; bytes are yielded by the stream, the &#039;&#039;&#039;EOF byte&#039;&#039;&#039; is yielded.&lt;br /&gt;
#* If the internal &amp;lt;var&amp;gt;fatal&amp;lt;/var&amp;gt; flag of the &#039;&#039;decoder object&#039;&#039; is set, then a &#039;&#039;&#039;decoder error&#039;&#039;&#039; causes a &amp;lt;code&amp;gt;DOMException&amp;lt;/code&amp;gt; of type &amp;lt;code&amp;gt;EncodingError&amp;lt;/code&amp;gt; to be thrown rather than emitting a fallback code point.&lt;br /&gt;
#* The output of the algorithm is a sequence of &amp;lt;var&amp;gt;emitted code points&amp;lt;/var&amp;gt;.&lt;br /&gt;
# Return a &amp;lt;code&amp;gt;DOMString&amp;lt;/code&amp;gt; by encoding the &amp;lt;var&amp;gt;sequence of emitted code points&amp;lt;/var&amp;gt; as UTF-16 as per [http://dev.w3.org/2006/webapi/WebIDL/#idl-DOMString WebIDL].&lt;br /&gt;
&lt;br /&gt;
:&#039;&#039;ISSUE: Need to handle BOMs. [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding] specifies this by using the caller-specified encoding as a suggestion, and consuming the BOM as a part of selecting the real encoder where BOM takes precedence. At the very least a matching BOM should be ignored. A mismatching BOM could throw, be a decoding error, or could actually switch the decoder for this stream (call or sequence of calls), possibly if-and-only-if the constructor was called without an encoding.&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/dl&amp;gt;&lt;br /&gt;
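&lt;br /&gt;
The streaming behaviour in the &amp;lt;code&amp;gt;decode&amp;lt;/code&amp;gt; steps above can be sketched as follows. This is a minimal illustration, assuming the &amp;lt;code&amp;gt;TextDecoder&amp;lt;/code&amp;gt; available in current browsers and Node.js, which may differ in detail from this draft; the helper name &amp;lt;code&amp;gt;decodeChunks&amp;lt;/code&amp;gt; is hypothetical.&lt;br /&gt;

```javascript
// Sketch only: decodeChunks is a hypothetical helper, not part of the proposal.
function decodeChunks(chunks, label) {
  const decoder = new TextDecoder(label);
  let result = "";
  for (const chunk of chunks) {
    // stream: true keeps the decoder state between calls, so a character
    // split across two chunks is completed by the next call.
    result += decoder.decode(chunk, { stream: true });
  }
  // A final call without the stream option signals end of input (EOF).
  result += decoder.decode();
  return result;
}

// U+20AC is E2 82 AC in UTF-8; here it is split across two chunks.
const chunks = [
  new Uint8Array([0x41, 0xe2, 0x82]),
  new Uint8Array([0xac, 0x42])
];
const streamed = decodeChunks(chunks, "utf-8");
```

Decoding each chunk without the &amp;lt;code&amp;gt;stream&amp;lt;/code&amp;gt; option would instead treat the truncated sequence at the end of the first chunk as a decoder error.&lt;br /&gt;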
&lt;br /&gt;
== Examples ==&lt;br /&gt;
&lt;br /&gt;
=== Example #1 - encoding strings ===&lt;br /&gt;
&lt;br /&gt;
The following example uses the API to encode an array of strings into an ArrayBuffer. The result is an ArrayBuffer containing the number of strings (as a Uint32), followed by the byte length of the first encoded string (as a Uint32), the encoded string data, the byte length of the second encoded string (as a Uint32), the encoded string data, and so on.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
function encodeArrayOfStrings(strings, encoding) {&lt;br /&gt;
  var encoder, encoded, len, i, bytes, view, offset;&lt;br /&gt;
&lt;br /&gt;
  encoder = new TextEncoder(encoding);&lt;br /&gt;
  encoded = [];&lt;br /&gt;
&lt;br /&gt;
  len = Uint32Array.BYTES_PER_ELEMENT;&lt;br /&gt;
  for (i = 0; i &amp;lt; strings.length; i += 1) {&lt;br /&gt;
    len += Uint32Array.BYTES_PER_ELEMENT;&lt;br /&gt;
    encoded[i] = encoder.encode(strings[i]);&lt;br /&gt;
    len += encoded[i].byteLength;&lt;br /&gt;
  }&lt;br /&gt;
&lt;br /&gt;
  bytes = new Uint8Array(len);&lt;br /&gt;
  view = new DataView(bytes.buffer);&lt;br /&gt;
  offset = 0;&lt;br /&gt;
&lt;br /&gt;
  view.setUint32(offset, strings.length);&lt;br /&gt;
  offset += Uint32Array.BYTES_PER_ELEMENT;&lt;br /&gt;
  for (i = 0; i &amp;lt; encoded.length; i += 1) {&lt;br /&gt;
    len = encoded[i].byteLength;&lt;br /&gt;
    view.setUint32(offset, len);&lt;br /&gt;
    offset += Uint32Array.BYTES_PER_ELEMENT;&lt;br /&gt;
    bytes.set(encoded[i], offset);&lt;br /&gt;
    offset += len;&lt;br /&gt;
  }&lt;br /&gt;
  return bytes.buffer;&lt;br /&gt;
}&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Example #2 - decoding strings ===&lt;br /&gt;
&lt;br /&gt;
The following example decodes an ArrayBuffer containing data encoded in the format produced by the previous example back into an array of strings.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
function decodeArrayOfStrings(buffer, encoding) {&lt;br /&gt;
  var decoder, view, offset, num_strings, strings, i, len;&lt;br /&gt;
&lt;br /&gt;
  decoder = new TextDecoder(encoding);&lt;br /&gt;
  view = new DataView(buffer);&lt;br /&gt;
  offset = 0;&lt;br /&gt;
  strings = [];&lt;br /&gt;
&lt;br /&gt;
  num_strings = view.getUint32(offset);&lt;br /&gt;
  offset += Uint32Array.BYTES_PER_ELEMENT;&lt;br /&gt;
  for (i = 0; i &amp;lt; num_strings; i += 1) {&lt;br /&gt;
    len = view.getUint32(offset);&lt;br /&gt;
    offset += Uint32Array.BYTES_PER_ELEMENT;&lt;br /&gt;
    strings[i] = decoder.decode(&lt;br /&gt;
      new DataView(view.buffer, offset, len));&lt;br /&gt;
    offset += len;&lt;br /&gt;
  }&lt;br /&gt;
  return strings;&lt;br /&gt;
}&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Encodings ==&lt;br /&gt;
&lt;br /&gt;
Encodings are defined and implemented per [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding]. This implicitly includes the steps to &amp;lt;em&amp;gt;get an encoding&amp;lt;/em&amp;gt; from a string, and the logic for label matching and case-insensitivity.&lt;br /&gt;
&lt;br /&gt;
User agents MUST NOT support any encodings or labels other than those defined in [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding], and MUST support all encodings and labels defined in that specification, with the additions defined below.&lt;br /&gt;
&lt;br /&gt;
:&#039;&#039;NOTE: In [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding], &amp;quot;ascii&amp;quot; is a label for &#039;&#039;&#039;windows-1252&#039;&#039;&#039;; there is no 7-bit-or-raise-exception encoding. Applications that are required to restrict the content of decoded strings should implement validation after decoding.&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
:&#039;&#039;NOTE: Unicode normalization forms are outside the scope of this specification. No normalization is done prior to encoding or after decoding.&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
:&#039;&#039;NOTE: Handling of encoding-specific issues, e.g. over-long UTF-8 encodings, byte order marks, unmatched surrogate pairs, and so on is defined by [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding].&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
=== Additional Encodings ===&lt;br /&gt;
&lt;br /&gt;
The following additional encodings are defined by this specification. They are specific to the methods defined herein.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;table border=1 cellpadding=5&amp;gt;&lt;br /&gt;
&amp;lt;tr&amp;gt;&amp;lt;th&amp;gt;Name&amp;lt;th&amp;gt;Labels&lt;br /&gt;
&amp;lt;tr&amp;gt;&amp;lt;td&amp;gt;binary&amp;lt;td&amp;gt;&amp;quot;binary&amp;quot;&lt;br /&gt;
&amp;lt;/table&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== binary ====&lt;br /&gt;
&lt;br /&gt;
The &#039;&#039;&#039;binary&#039;&#039;&#039; encoding is a single-byte encoding where the input code point and output byte are identical for the range &#039;&#039;&#039;U+0000&#039;&#039;&#039; to &#039;&#039;&#039;U+00ff&#039;&#039;&#039;. &lt;br /&gt;
&lt;br /&gt;
:&#039;&#039;NOTE: This encoding is intended to allow interoperation with legacy code that encodes binary data in ECMAScript strings, for example the WindowBase64 methods [http://dev.w3.org/html5/spec-author-view/webappapis.html#dom-windowbase64-atob atob()]/[http://dev.w3.org/html5/spec-author-view/webappapis.html#dom-windowbase64-btoa btoa()]. It is recommended that new Web applications use Typed Arrays for transmission and storage of binary data.&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
If &#039;&#039;&#039;binary&#039;&#039;&#039; is selected as the encoding, then step 3 of the steps to &#039;&#039;decode a byte stream&#039;&#039; is skipped; no BOM detection is performed.&lt;br /&gt;
&lt;br /&gt;
:&#039;&#039;NOTE: If &amp;lt;code&amp;gt;&amp;quot;binary&amp;quot;&amp;lt;/code&amp;gt; is specified, byte order marks must be ignored.&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
The &#039;&#039;&#039;binary decoder&#039;&#039;&#039; is:&lt;br /&gt;
# Let &amp;lt;var&amp;gt;byte&amp;lt;/var&amp;gt; be the byte pointer.&lt;br /&gt;
# If &amp;lt;var&amp;gt;byte&amp;lt;/var&amp;gt; is the EOF byte, emit the EOF code point.&lt;br /&gt;
# Increase the byte pointer by one.&lt;br /&gt;
# Emit a code point whose value is &amp;lt;var&amp;gt;byte&amp;lt;/var&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
The &#039;&#039;&#039;binary encoder&#039;&#039;&#039; is:&lt;br /&gt;
# Let &amp;lt;var&amp;gt;code point&amp;lt;/var&amp;gt; be the code point pointer.&lt;br /&gt;
# If &amp;lt;var&amp;gt;code point&amp;lt;/var&amp;gt; is the EOF code point, emit the EOF byte.&lt;br /&gt;
# Increase the code point pointer by one.&lt;br /&gt;
# If &amp;lt;var&amp;gt;code point&amp;lt;/var&amp;gt; is in the range U+0000 to U+00FF, emit a byte whose value is &amp;lt;var&amp;gt;code point&amp;lt;/var&amp;gt;.&lt;br /&gt;
# Otherwise, emit an encoder error.&lt;br /&gt;
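&lt;br /&gt;
The two algorithms above can be sketched in plain JavaScript as follows. This is an illustration only: the names &amp;lt;code&amp;gt;binaryDecode&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;binaryEncode&amp;lt;/code&amp;gt; are hypothetical, and reporting the encoder error as a &amp;lt;code&amp;gt;RangeError&amp;lt;/code&amp;gt; is an assumption made for the sketch rather than something this proposal defines.&lt;br /&gt;

```javascript
// Sketch only: helper names and the RangeError are assumptions.
function binaryDecode(bytes) {
  let out = "";
  for (const byte of bytes) {
    // Each byte 0x00..0xFF becomes the code point with the same value.
    out += String.fromCharCode(byte);
  }
  return out;
}

function binaryEncode(string) {
  const bytes = new Uint8Array(string.length);
  for (let i = 0; i !== string.length; i += 1) {
    const codeUnit = string.charCodeAt(i);
    // Values above U+00FF (including surrogates) have no byte
    // representation in this encoding: treat as an encoder error.
    if (codeUnit > 0xff) {
      throw new RangeError("binary encoder error at index " + i);
    }
    bytes[i] = codeUnit;
  }
  return bytes;
}

const binaryBytes = binaryEncode("\u0000\u00ff\u0080");
const binaryString = binaryDecode(binaryBytes);
```

A string of this shape is what legacy code typically obtains from &amp;lt;code&amp;gt;atob()&amp;lt;/code&amp;gt;, so the sketch shows how such data could be moved into a Typed Array.&lt;br /&gt;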
&lt;br /&gt;
== References ==&lt;br /&gt;
&lt;br /&gt;
* WebIDL http://dev.w3.org/2006/webapi/WebIDL&lt;br /&gt;
* Encoding http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== Acknowledgements ==&lt;br /&gt;
&lt;br /&gt;
* Alan Chaney&lt;br /&gt;
* Ben Noordhuis&lt;br /&gt;
* Glenn Maynard&lt;br /&gt;
* John Tamplin&lt;br /&gt;
* Kenneth Russell (Google, Inc)&lt;br /&gt;
* Robert Mustacchi&lt;br /&gt;
* Ryan Dahl&lt;br /&gt;
* Anne van Kesteren&lt;br /&gt;
&lt;br /&gt;
== Appendix ==&lt;br /&gt;
&lt;br /&gt;
A &amp;quot;shim&amp;quot; implementation in JavaScript (that may not fully match the current version of the spec) plus some initial unit tests can be found at:&lt;br /&gt;
&lt;br /&gt;
:http://code.google.com/p/stringencoding/&lt;br /&gt;
&lt;br /&gt;
[[Category:Proposals]]&lt;/div&gt;</summary>
		<author><name>Jsbell</name></author>
	</entry>
	<entry>
		<id>https://wiki.whatwg.org/index.php?title=StringEncoding&amp;diff=8232</id>
		<title>StringEncoding</title>
		<link rel="alternate" type="text/html" href="https://wiki.whatwg.org/index.php?title=StringEncoding&amp;diff=8232"/>
		<updated>2012-05-24T17:29:49Z</updated>

		<summary type="html">&lt;p&gt;Jsbell: Define TextDecoder constructor algorithmically&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;Proposed Text Encoding Web API for Typed Arrays&lt;br /&gt;
&lt;br /&gt;
== Editors ==&lt;br /&gt;
&lt;br /&gt;
* Joshua Bell (Google, Inc)&lt;br /&gt;
&lt;br /&gt;
== Abstract ==&lt;br /&gt;
&lt;br /&gt;
This specification defines an API for encoding strings to binary data, and decoding strings from binary data.&lt;br /&gt;
&lt;br /&gt;
NOTE: This specification intentionally does not address the opposite scenario of encoding binary data as strings and decoding binary data from strings, for example using Base64 encoding.&lt;br /&gt;
&lt;br /&gt;
Discussion on this topic has so far taken place on the [http://www.khronos.org/webgl/public-mailing-list/ public_webgl@khronos.org] mailing list. See http://www.khronos.org/webgl/public-mailing-list/archives/1111/msg00017.html for the initial discussion thread.&lt;br /&gt;
&lt;br /&gt;
Discussion has since moved to the [http://www.whatwg.org/mailing-list#specs WHATWG spec discussion] mailing list. See http://lists.whatwg.org/htdig.cgi/whatwg-whatwg.org/2012-March/035038.html for the latest discussion thread.&lt;br /&gt;
&lt;br /&gt;
== Open Issues ==&lt;br /&gt;
&lt;br /&gt;
General: Should this be a standalone API (as written), or live on e.g. DataView, or on e.g. String?&lt;br /&gt;
&lt;br /&gt;
=== Scenarios ===&lt;br /&gt;
&lt;br /&gt;
* Encode as many characters as possible into a fixed-size buffer for transmission, and repeat starting with next unencoded character&lt;br /&gt;
&lt;br /&gt;
=== Desired Features ===&lt;br /&gt;
&lt;br /&gt;
* Allow arbitrary end byte sequences (e.g. &amp;lt;code&amp;gt;0xFF&amp;lt;/code&amp;gt; for UTF-8 strings)&lt;br /&gt;
** &#039;&#039;Tentative Resolution: Add &amp;lt;code&amp;gt;indexOf&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;ArrayBufferView&amp;lt;/code&amp;gt;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
* Encoding errors&lt;br /&gt;
&lt;br /&gt;
=== API cleanup ===&lt;br /&gt;
&lt;br /&gt;
* Support two versions of encode; one which takes target buffer, and one which creates/returns a right-sized buffer&lt;br /&gt;
** &#039;&#039;Tentative Resolution: Wait for developer feedback.&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
=== Spec Issues ===&lt;br /&gt;
&lt;br /&gt;
:&#039;&#039;ISSUE: [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding] defines the byte order mark as more authoritative than anything else. Is that desirable here?&#039;&#039;&lt;br /&gt;
* Remove &#039;&#039;binary&#039;&#039; encoding? The only real use is with &amp;lt;code&amp;gt;atob()&amp;lt;/code&amp;gt;/&amp;lt;code&amp;gt;btoa()&amp;lt;/code&amp;gt;; a better API would be Base64 directly in/out of Typed Arrays&lt;br /&gt;
&lt;br /&gt;
== API ==&lt;br /&gt;
&lt;br /&gt;
The &#039;&#039;&#039;default encoding&#039;&#039;&#039; is &amp;lt;code&amp;gt;&amp;quot;utf-8&amp;quot;&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== TextEncoder ===&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;WebIDL&#039;&#039;&#039;&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
dictionary TextEncodeOptions {&lt;br /&gt;
  boolean stream = false;&lt;br /&gt;
};&lt;br /&gt;
&lt;br /&gt;
[Constructor,&lt;br /&gt;
 Constructor(DOMString encoding)]&lt;br /&gt;
interface TextEncoder {&lt;br /&gt;
  readonly attribute DOMString encoding;&lt;br /&gt;
  ArrayBufferView encode(DOMString? string, optional TextEncodeOptions options);&lt;br /&gt;
};&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The constructor runs the following steps:&lt;br /&gt;
# If the constructor is called with no arguments, let &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt; be the &#039;&#039;&#039;default encoding&#039;&#039;&#039;. Otherwise, let &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt; be the value of the &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; argument.&lt;br /&gt;
# The constructor follows the &#039;&#039;&#039;steps to get an encoding&#039;&#039;&#039; from [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding], with &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt; as &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt;.&lt;br /&gt;
#* If the steps result in failure, throw an &amp;quot;&amp;lt;code&amp;gt;EncodingError&amp;lt;/code&amp;gt;&amp;quot; exception and terminate these steps.&lt;br /&gt;
#* Otherwise, set the encoder object&#039;s internal &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; property to the returned encoding.&lt;br /&gt;
# Initialize the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag of the encoder object to false. &lt;br /&gt;
# Initialize the internal &amp;lt;var&amp;gt;encoding algorithm state&amp;lt;/var&amp;gt; to the default values for the encoding &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dl&amp;gt;&lt;br /&gt;
&amp;lt;dt&amp;gt;&amp;lt;code&amp;gt;encoding&amp;lt;/code&amp;gt; of type DOMString, readonly&lt;br /&gt;
&amp;lt;dd&amp;gt;Returns the &amp;lt;var&amp;gt;Name&amp;lt;/var&amp;gt; of the encoder object&#039;s &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;, per [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding]. &lt;br /&gt;
&lt;br /&gt;
:Note that this may differ from the name of the encoding specified during the call to the constructor. For example, if the constructor is called with &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; of &amp;lt;code&amp;gt;&amp;quot;ascii&amp;quot;&amp;lt;/code&amp;gt; the &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; attribute of the &#039;&#039;encoder object&#039;&#039; would have the value &amp;lt;code&amp;gt;&amp;quot;windows-1252&amp;quot;&amp;lt;/code&amp;gt; as &amp;lt;code&amp;gt;&amp;quot;ascii&amp;quot;&amp;lt;/code&amp;gt; is a label for that encoding.&lt;br /&gt;
&amp;lt;/dl&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dl&amp;gt;&lt;br /&gt;
&amp;lt;dt&amp;gt;&amp;lt;code&amp;gt;encode&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;dd&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The &amp;lt;code&amp;gt;encode&amp;lt;/code&amp;gt; method runs these steps:&lt;br /&gt;
&lt;br /&gt;
# If the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag is not set, then reset the &amp;lt;var&amp;gt;encoding algorithm state&amp;lt;/var&amp;gt; to the default values for &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;. Otherwise, the &amp;lt;var&amp;gt;encoding algorithm state&amp;lt;/var&amp;gt; is re-used from the previous call to &amp;lt;code&amp;gt;encode&amp;lt;/code&amp;gt; on this object.&lt;br /&gt;
# If the &amp;lt;var&amp;gt;options&amp;lt;/var&amp;gt; parameter is specified and the &amp;lt;code&amp;gt;stream&amp;lt;/code&amp;gt; option is &#039;&#039;&#039;true&#039;&#039;&#039;, then the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag is set; otherwise the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag is cleared.&lt;br /&gt;
# Run the steps of the &amp;lt;var&amp;gt;encoding algorithm&amp;lt;/var&amp;gt;:&lt;br /&gt;
#* The input to the algorithm is a &amp;lt;var&amp;gt;stream of code points&amp;lt;/var&amp;gt;. The code units within the DOMString &amp;lt;var&amp;gt;string&amp;lt;/var&amp;gt; are interpreted as UTF-16 code units, to produce the &amp;lt;var&amp;gt;stream of code points&amp;lt;/var&amp;gt;; if &amp;lt;var&amp;gt;string&amp;lt;/var&amp;gt; is null, the &amp;lt;var&amp;gt;stream of code points&amp;lt;/var&amp;gt; is empty.&lt;br /&gt;
#*:&#039;&#039;ISSUE: Interpreting a DOMString as UTF-16 to yield a code unit stream needs to be defined, including unpaired surrogates. [http://dev.w3.org/2006/webapi/WebIDL/#idl-DOMString WebIDL] only defines the reverse.&#039;&#039;&lt;br /&gt;
#* If the &amp;lt;var&amp;gt;options&amp;lt;/var&amp;gt; parameter is not specified or the &amp;lt;code&amp;gt;stream&amp;lt;/code&amp;gt; option is false, then after the final code point is yielded by the stream, the &#039;&#039;&#039;EOF code point&#039;&#039;&#039; is yielded.&lt;br /&gt;
#* The output of the algorithm is a &amp;lt;var&amp;gt;sequence of emitted bytes&amp;lt;/var&amp;gt;.&lt;br /&gt;
# Return a &amp;lt;code&amp;gt;Uint8Array&amp;lt;/code&amp;gt; object wrapping an &amp;lt;code&amp;gt;ArrayBuffer&amp;lt;/code&amp;gt; containing the &amp;lt;var&amp;gt;sequence of emitted bytes&amp;lt;/var&amp;gt; produced by the encoder algorithm.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/dl&amp;gt;&lt;br /&gt;
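&lt;br /&gt;
The ISSUE in the &amp;lt;code&amp;gt;encode&amp;lt;/code&amp;gt; steps above leaves open how a DOMString is turned into a stream of code points. One possible reading, sketched below, pairs surrogates where it can and replaces each unpaired surrogate with U+FFFD. This is an assumption for illustration, not a rule stated by this draft; the helper names &amp;lt;code&amp;gt;inRange&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;toCodePoints&amp;lt;/code&amp;gt; are hypothetical.&lt;br /&gt;

```javascript
// Sketch only: replacing unpaired surrogates with U+FFFD is an assumption.
function inRange(value, low, high) {
  return value >= low ? high >= value : false;
}

function toCodePoints(string) {
  const codePoints = [];
  for (let i = 0; i !== string.length; i += 1) {
    const unit = string.charCodeAt(i);
    if (inRange(unit, 0xd800, 0xdbff)) {
      // Lead surrogate: look ahead for a trail surrogate.
      const next = string.length > i + 1 ? string.charCodeAt(i + 1) : -1;
      if (inRange(next, 0xdc00, 0xdfff)) {
        codePoints.push(0x10000 + (unit - 0xd800) * 0x400 + (next - 0xdc00));
        i += 1; // the trail surrogate has been consumed
      } else {
        codePoints.push(0xfffd);
      }
    } else if (inRange(unit, 0xdc00, 0xdfff)) {
      // Trail surrogate with no preceding lead surrogate.
      codePoints.push(0xfffd);
    } else {
      codePoints.push(unit);
    }
  }
  return codePoints;
}

const paired = toCodePoints("a\ud83d\ude00");
const unpaired = toCodePoints("\ud800x");
```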
&lt;br /&gt;
=== TextDecoder ===&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;WebIDL&#039;&#039;&#039;&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
dictionary TextDecoderOptions {&lt;br /&gt;
  boolean fatal = false;&lt;br /&gt;
};&lt;br /&gt;
&lt;br /&gt;
dictionary TextDecodeOptions {&lt;br /&gt;
  boolean stream = false;&lt;br /&gt;
};&lt;br /&gt;
&lt;br /&gt;
[Constructor,&lt;br /&gt;
 Constructor(optional DOMString encoding, optional TextDecoderOptions options)]&lt;br /&gt;
interface TextDecoder {&lt;br /&gt;
  readonly attribute DOMString encoding;&lt;br /&gt;
  DOMString decode(optional ArrayBufferView view, optional TextDecodeOptions options);&lt;br /&gt;
};&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The constructor runs the following steps:&lt;br /&gt;
# The constructor creates a &#039;&#039;decoder object&#039;&#039;. It has the internal properties &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;, a &amp;lt;var&amp;gt;fatal&amp;lt;/var&amp;gt; flag which is initially unset, and a &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag which is initially unset.&lt;br /&gt;
# If called without an &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; argument, let &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt; be the &#039;&#039;&#039;default encoding&#039;&#039;&#039;. Otherwise, let &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt; be the value of the &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; argument.&lt;br /&gt;
# Run the &#039;&#039;&#039;steps to get an encoding&#039;&#039;&#039; from [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding], with &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt; as &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt;. &lt;br /&gt;
## If the steps result in failure, throw an &amp;quot;&amp;lt;code&amp;gt;EncodingError&amp;lt;/code&amp;gt;&amp;quot; exception and terminate these steps.&lt;br /&gt;
## Otherwise, set the &#039;&#039;decoder object&#039;s&#039;&#039; internal &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; property to the returned encoding.&lt;br /&gt;
# If the constructor is called with an &amp;lt;var&amp;gt;options&amp;lt;/var&amp;gt; argument, and the &amp;lt;var&amp;gt;fatal&amp;lt;/var&amp;gt; property of the dictionary is set, set the internal &amp;lt;var&amp;gt;fatal&amp;lt;/var&amp;gt; flag of the &#039;&#039;decoder object&#039;&#039;.&lt;br /&gt;
# Initialize the internal &amp;lt;var&amp;gt;encoding algorithm state&amp;lt;/var&amp;gt; to the default values for the encoding &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;.&lt;br /&gt;
# Return the &#039;&#039;decoder object&#039;&#039;.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dl&amp;gt;&lt;br /&gt;
&amp;lt;dt&amp;gt;&amp;lt;code&amp;gt;encoding&amp;lt;/code&amp;gt; of type DOMString, readonly&lt;br /&gt;
&amp;lt;dd&amp;gt;&lt;br /&gt;
Returns the &amp;lt;var&amp;gt;Name&amp;lt;/var&amp;gt; of the decoder object&#039;s &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;, per [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding].&lt;br /&gt;
&lt;br /&gt;
:Note that this may differ from the name of the encoding specified during the call to the constructor. For example, if the constructor is called with &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; of &amp;lt;code&amp;gt;&amp;quot;ascii&amp;quot;&amp;lt;/code&amp;gt; the &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; attribute of the &#039;&#039;decoder object&#039;&#039; would have the value &amp;lt;code&amp;gt;&amp;quot;windows-1252&amp;quot;&amp;lt;/code&amp;gt; as &amp;lt;code&amp;gt;&amp;quot;ascii&amp;quot;&amp;lt;/code&amp;gt; is a label for that encoding.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dt&amp;gt;&amp;lt;code&amp;gt;decode&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;dd&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The &amp;lt;code&amp;gt;decode&amp;lt;/code&amp;gt; method runs the following steps:&lt;br /&gt;
&lt;br /&gt;
# If the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag of the &#039;&#039;decoder object&#039;&#039; is not set, then reset the &amp;lt;var&amp;gt;encoding algorithm state&amp;lt;/var&amp;gt; to the default values for encoding &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;. Otherwise, the &amp;lt;var&amp;gt;encoding algorithm state&amp;lt;/var&amp;gt; is re-used from the previous call to &amp;lt;code&amp;gt;decode&amp;lt;/code&amp;gt; on this object. &lt;br /&gt;
# If the &amp;lt;var&amp;gt;options&amp;lt;/var&amp;gt; parameter is specified and the &amp;lt;var&amp;gt;stream&amp;lt;/var&amp;gt; option is &#039;&#039;&#039;true&#039;&#039;&#039;, then the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag is set; otherwise the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag is cleared.&lt;br /&gt;
# Run the decoder algorithm of the &#039;&#039;decoder object&#039;s&#039;&#039; encoding:&lt;br /&gt;
#* The input to the algorithm is a &amp;lt;var&amp;gt;byte stream&amp;lt;/var&amp;gt;. The &amp;lt;var&amp;gt;byte stream&amp;lt;/var&amp;gt; is provided by the bytes in &amp;lt;var&amp;gt;view.buffer&amp;lt;/var&amp;gt; starting at offset &amp;lt;var&amp;gt;view.byteOffset&amp;lt;/var&amp;gt;. A maximum of &amp;lt;var&amp;gt;view.byteLength&amp;lt;/var&amp;gt; bytes are yielded by the stream from &amp;lt;var&amp;gt;view.buffer&amp;lt;/var&amp;gt;. If &amp;lt;var&amp;gt;view&amp;lt;/var&amp;gt; is not specified, the stream is empty.&lt;br /&gt;
#* If the &amp;lt;var&amp;gt;options&amp;lt;/var&amp;gt; parameter is not specified or the &#039;&#039;&#039;stream&#039;&#039;&#039; option is &#039;&#039;&#039;false&#039;&#039;&#039;, then after &amp;lt;var&amp;gt;view.byteLength&amp;lt;/var&amp;gt; bytes are yielded by the stream, the &#039;&#039;&#039;EOF byte&#039;&#039;&#039; is yielded.&lt;br /&gt;
#* If the internal &amp;lt;var&amp;gt;fatal&amp;lt;/var&amp;gt; flag of the &#039;&#039;decoder object&#039;&#039; is set, then a &#039;&#039;&#039;decoder error&#039;&#039;&#039; causes a &amp;lt;code&amp;gt;DOMException&amp;lt;/code&amp;gt; of type &amp;lt;code&amp;gt;EncodingError&amp;lt;/code&amp;gt; to be thrown rather than emitting a fallback code point.&lt;br /&gt;
#* The output of the algorithm is a sequence of &amp;lt;var&amp;gt;emitted code points&amp;lt;/var&amp;gt;.&lt;br /&gt;
# Return a &amp;lt;code&amp;gt;DOMString&amp;lt;/code&amp;gt; by encoding the &amp;lt;var&amp;gt;sequence of emitted code points&amp;lt;/var&amp;gt; as UTF-16 as per [http://dev.w3.org/2006/webapi/WebIDL/#idl-DOMString WebIDL].&lt;br /&gt;
&lt;br /&gt;
:&#039;&#039;ISSUE: Need to handle BOMs. [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding] specifies this by using the caller-specified encoding as a suggestion, and consuming the BOM as a part of selecting the real encoder where BOM takes precedence. At the very least a matching BOM should be ignored. A mismatching BOM could throw, be a decoding error, or could actually switch the decoder for this stream (call or sequence of calls), possibly if-and-only-if the constructor was called without an encoding.&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/dl&amp;gt;&lt;br /&gt;
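&lt;br /&gt;
The effect of the &amp;lt;var&amp;gt;fatal&amp;lt;/var&amp;gt; flag described above can be sketched as follows, assuming the &amp;lt;code&amp;gt;TextDecoder&amp;lt;/code&amp;gt; available in current browsers and Node.js. The helper name &amp;lt;code&amp;gt;tryDecodeUtf8&amp;lt;/code&amp;gt; is hypothetical, and the exception type in shipped implementations may not match the &amp;lt;code&amp;gt;EncodingError&amp;lt;/code&amp;gt; named in this draft, so the sketch catches any exception.&lt;br /&gt;

```javascript
// Sketch only: tryDecodeUtf8 is a hypothetical helper.
function tryDecodeUtf8(bytes) {
  const decoder = new TextDecoder("utf-8", { fatal: true });
  try {
    return decoder.decode(bytes);
  } catch (error) {
    // With the fatal flag set, malformed input is reported by exception
    // instead of being replaced by a fallback code point.
    return null;
  }
}

const valid = tryDecodeUtf8(new Uint8Array([0x68, 0x69]));
const invalid = tryDecodeUtf8(new Uint8Array([0xff]));

// Without the fatal flag the same byte decodes to the fallback U+FFFD.
const replaced = new TextDecoder("utf-8").decode(new Uint8Array([0xff]));
```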
&lt;br /&gt;
== Examples ==&lt;br /&gt;
&lt;br /&gt;
=== Example #1 - encoding strings ===&lt;br /&gt;
&lt;br /&gt;
The following example uses the API to encode an array of strings into an ArrayBuffer. The result is an ArrayBuffer containing the number of strings (as a Uint32), followed by the byte length of the first encoded string (as a Uint32), the encoded string data, the byte length of the second encoded string (as a Uint32), the encoded string data, and so on.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
function encodeArrayOfStrings(strings, encoding) {&lt;br /&gt;
  var encoder, encoded, len, i, bytes, view, offset;&lt;br /&gt;
&lt;br /&gt;
  encoder = new TextEncoder(encoding);&lt;br /&gt;
  encoded = [];&lt;br /&gt;
&lt;br /&gt;
  len = Uint32Array.BYTES_PER_ELEMENT;&lt;br /&gt;
  for (i = 0; i &amp;lt; strings.length; i += 1) {&lt;br /&gt;
    len += Uint32Array.BYTES_PER_ELEMENT;&lt;br /&gt;
    encoded[i] = encoder.encode(strings[i]);&lt;br /&gt;
    len += encoded[i].byteLength;&lt;br /&gt;
  }&lt;br /&gt;
&lt;br /&gt;
  bytes = new Uint8Array(len);&lt;br /&gt;
  view = new DataView(bytes.buffer);&lt;br /&gt;
  offset = 0;&lt;br /&gt;
&lt;br /&gt;
  view.setUint32(offset, strings.length);&lt;br /&gt;
  offset += Uint32Array.BYTES_PER_ELEMENT;&lt;br /&gt;
  for (i = 0; i &amp;lt; encoded.length; i += 1) {&lt;br /&gt;
    len = encoded[i].byteLength;&lt;br /&gt;
    view.setUint32(offset, len);&lt;br /&gt;
    offset += Uint32Array.BYTES_PER_ELEMENT;&lt;br /&gt;
    bytes.set(encoded[i], offset);&lt;br /&gt;
    offset += len;&lt;br /&gt;
  }&lt;br /&gt;
  return bytes.buffer;&lt;br /&gt;
}&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Example #2 - decoding strings ===&lt;br /&gt;
&lt;br /&gt;
The following example decodes an ArrayBuffer containing data encoded in the format produced by the previous example back into an array of strings.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
function decodeArrayOfStrings(buffer, encoding) {&lt;br /&gt;
  var decoder, view, offset, num_strings, strings, i, len;&lt;br /&gt;
&lt;br /&gt;
  decoder = new TextDecoder(encoding);&lt;br /&gt;
  view = new DataView(buffer);&lt;br /&gt;
  offset = 0;&lt;br /&gt;
  strings = [];&lt;br /&gt;
&lt;br /&gt;
  num_strings = view.getUint32(offset);&lt;br /&gt;
  offset += Uint32Array.BYTES_PER_ELEMENT;&lt;br /&gt;
  for (i = 0; i &amp;lt; num_strings; i += 1) {&lt;br /&gt;
    len = view.getUint32(offset);&lt;br /&gt;
    offset += Uint32Array.BYTES_PER_ELEMENT;&lt;br /&gt;
    strings[i] = decoder.decode(&lt;br /&gt;
      new DataView(view.buffer, offset, len));&lt;br /&gt;
    offset += len;&lt;br /&gt;
  }&lt;br /&gt;
  return strings;&lt;br /&gt;
}&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Encodings ==&lt;br /&gt;
&lt;br /&gt;
Encodings are defined and implemented per [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding]. This implicitly includes the steps to &amp;lt;em&amp;gt;get an encoding&amp;lt;/em&amp;gt; from a string, and the logic for label matching and case-insensitivity.&lt;br /&gt;
&lt;br /&gt;
User agents MUST NOT support any encodings or labels other than those defined in [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding], and MUST support all encodings and labels defined in that specification, with the additions defined below.&lt;br /&gt;
&lt;br /&gt;
:&#039;&#039;NOTE: In [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding], &amp;quot;ascii&amp;quot; is a label for &#039;&#039;&#039;windows-1252&#039;&#039;&#039;; there is no 7-bit-or-raise-exception encoding. Applications that are required to restrict the content of decoded strings should implement validation after decoding.&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
:&#039;&#039;NOTE: Unicode normalization forms are outside the scope of this specification. No normalization is done prior to encoding or after decoding.&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
:&#039;&#039;NOTE: Handling of encoding-specific issues, e.g. over-long UTF-8 encodings, byte order marks, unmatched surrogate pairs, and so on is defined by [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding].&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
=== Additional Encodings ===&lt;br /&gt;
&lt;br /&gt;
The following additional encodings are defined by this specification. They are specific to the methods defined herein.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;table border=1 cellpadding=5&amp;gt;&lt;br /&gt;
&amp;lt;tr&amp;gt;&amp;lt;th&amp;gt;Name&amp;lt;th&amp;gt;Labels&lt;br /&gt;
&amp;lt;tr&amp;gt;&amp;lt;td&amp;gt;binary&amp;lt;td&amp;gt;&amp;quot;binary&amp;quot;&lt;br /&gt;
&amp;lt;/table&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== binary ====&lt;br /&gt;
&lt;br /&gt;
The &#039;&#039;&#039;binary&#039;&#039;&#039; encoding is a single-byte encoding where the input code point and output byte are identical for the range &#039;&#039;&#039;U+0000&#039;&#039;&#039; to &#039;&#039;&#039;U+00ff&#039;&#039;&#039;. &lt;br /&gt;
&lt;br /&gt;
:&#039;&#039;NOTE: This encoding is intended to allow interoperation with legacy code that encodes binary data in ECMAScript strings, for example the WindowBase64 methods [http://dev.w3.org/html5/spec-author-view/webappapis.html#dom-windowbase64-atob atob()]/[http://dev.w3.org/html5/spec-author-view/webappapis.html#dom-windowbase64-btoa btoa()]. It is recommended that new Web applications use Typed Arrays for transmission and storage of binary data.&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
If &#039;&#039;&#039;binary&#039;&#039;&#039; is selected as the encoding, then step 3 of the steps to &#039;&#039;decode a byte stream&#039;&#039; is skipped; no BOM detection is performed.&lt;br /&gt;
&lt;br /&gt;
:&#039;&#039;NOTE: If &amp;lt;code&amp;gt;&amp;quot;binary&amp;quot;&amp;lt;/code&amp;gt; is specified, byte order marks must be ignored.&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
The &#039;&#039;&#039;binary decoder&#039;&#039;&#039; is:&lt;br /&gt;
# Let &amp;lt;var&amp;gt;byte&amp;lt;/var&amp;gt; be byte pointer.&lt;br /&gt;
# If &amp;lt;var&amp;gt;byte&amp;lt;/var&amp;gt; is the EOF byte, emit the EOF code point.&lt;br /&gt;
# Increase the byte pointer by one.&lt;br /&gt;
# Emit a code point whose value is &amp;lt;var&amp;gt;byte&amp;lt;/var&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
The &#039;&#039;&#039;binary encoder&#039;&#039;&#039; is:&lt;br /&gt;
# Let &amp;lt;var&amp;gt;code point&amp;lt;/var&amp;gt; be the code point pointer.&lt;br /&gt;
# If &amp;lt;var&amp;gt;code point&amp;lt;/var&amp;gt; is the EOF code point, emit the EOF byte.&lt;br /&gt;
# Increase the code point pointer by one.&lt;br /&gt;
# If &amp;lt;var&amp;gt;code point&amp;lt;/var&amp;gt; is in the range U+0000 to U+00FF, emit a byte whose value is &amp;lt;var&amp;gt;code point&amp;lt;/var&amp;gt;.&lt;br /&gt;
# Otherwise, emit an encoder error.&lt;br /&gt;
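&lt;br /&gt;
As an illustration only, the binary decoder and encoder steps above could be sketched in JavaScript roughly as follows. This is not part of the proposal: the function names binaryDecode and binaryEncode are hypothetical, and the RangeError merely stands in for the encoder error, whose exception type this section does not specify.&lt;br /&gt;
&lt;br /&gt;
```javascript
// Sketch of the "binary" encoding: bytes and code points share the same values.
// binaryDecode and binaryEncode are hypothetical names, not part of the proposal.
function binaryDecode(bytes) {
  let result = "";
  for (const byte of bytes) {
    // Every byte value 0x00..0xFF becomes the code point with the same value.
    result += String.fromCharCode(byte);
  }
  return result;
}

function binaryEncode(string) {
  const bytes = new Uint8Array(string.length);
  for (let i = 0; i !== string.length; i += 1) {
    const codeUnit = string.charCodeAt(i);
    if (codeUnit > 0xFF) {
      // Assumption: a RangeError stands in for the "encoder error".
      throw new RangeError("code point outside U+0000 to U+00FF at index " + i);
    }
    bytes[i] = codeUnit;
  }
  return bytes;
}
```
&lt;br /&gt;
Under these assumptions, binaryEncode applied to a string produced by binaryDecode returns the original bytes, which is the round trip that legacy binary-string code relies on.&lt;br /&gt;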
&lt;br /&gt;
== References ==&lt;br /&gt;
&lt;br /&gt;
* WebIDL http://dev.w3.org/2006/webapi/WebIDL&lt;br /&gt;
* Encoding http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== Acknowledgements ==&lt;br /&gt;
&lt;br /&gt;
* Alan Chaney&lt;br /&gt;
* Ben Noordhuis&lt;br /&gt;
* Glenn Maynard&lt;br /&gt;
* John Tamplin&lt;br /&gt;
* Kenneth Russell (Google, Inc)&lt;br /&gt;
* Robert Mustacchi&lt;br /&gt;
* Ryan Dahl&lt;br /&gt;
* Anne van Kesteren&lt;br /&gt;
&lt;br /&gt;
== Appendix ==&lt;br /&gt;
&lt;br /&gt;
A &amp;quot;shim&amp;quot; implementation in JavaScript (that may not fully match the current version of the spec) plus some initial unit tests can be found at:&lt;br /&gt;
&lt;br /&gt;
:http://code.google.com/p/stringencoding/&lt;br /&gt;
&lt;br /&gt;
[[Category:Proposals]]&lt;/div&gt;</summary>
		<author><name>Jsbell</name></author>
	</entry>
	<entry>
		<id>https://wiki.whatwg.org/index.php?title=StringEncoding&amp;diff=8231</id>
		<title>StringEncoding</title>
		<link rel="alternate" type="text/html" href="https://wiki.whatwg.org/index.php?title=StringEncoding&amp;diff=8231"/>
		<updated>2012-05-24T17:25:01Z</updated>

		<summary type="html">&lt;p&gt;Jsbell: Specify TextEncoder constructor algorithmically&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;Proposed Text Encoding Web API for Typed Arrays&lt;br /&gt;
&lt;br /&gt;
== Editors ==&lt;br /&gt;
&lt;br /&gt;
* Joshua Bell (Google, Inc)&lt;br /&gt;
&lt;br /&gt;
== Abstract ==&lt;br /&gt;
&lt;br /&gt;
This specification defines an API for encoding strings to binary data, and decoding strings from binary data.&lt;br /&gt;
&lt;br /&gt;
NOTE: This specification intentionally does not address the opposite scenario of encoding binary data as strings and decoding binary data from strings, for example using Base64 encoding.&lt;br /&gt;
&lt;br /&gt;
Discussion on this topic has so far taken place on the [http://www.khronos.org/webgl/public-mailing-list/ public_webgl@khronos.org] mailing list. See http://www.khronos.org/webgl/public-mailing-list/archives/1111/msg00017.html for the initial discussion thread.&lt;br /&gt;
&lt;br /&gt;
Discussion has since moved to the [http://www.whatwg.org/mailing-list#specs WHATWG spec discussion] mailing list. See http://lists.whatwg.org/htdig.cgi/whatwg-whatwg.org/2012-March/035038.html for the latest discussion thread.&lt;br /&gt;
&lt;br /&gt;
== Open Issues ==&lt;br /&gt;
&lt;br /&gt;
General: Should this be a standalone API (as written), or live on e.g. DataView, or on e.g. String?&lt;br /&gt;
&lt;br /&gt;
=== Scenarios ===&lt;br /&gt;
&lt;br /&gt;
* Encode as many characters as possible into a fixed-size buffer for transmission, and repeat starting with next unencoded character&lt;br /&gt;
&lt;br /&gt;
=== Desired Features ===&lt;br /&gt;
&lt;br /&gt;
* Allow arbitrary end byte sequences (e.g. &amp;lt;code&amp;gt;0xFF&amp;lt;/code&amp;gt; for UTF-8 strings)&lt;br /&gt;
** &#039;&#039;Tentative Resolution: Add &amp;lt;code&amp;gt;indexOf&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;ArrayBufferView&amp;lt;/code&amp;gt;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
* Encoding errors&lt;br /&gt;
&lt;br /&gt;
=== API cleanup ===&lt;br /&gt;
&lt;br /&gt;
* Support two versions of encode; one which takes target buffer, and one which creates/returns a right-sized buffer&lt;br /&gt;
** &#039;&#039;Tentative Resolution: Wait for developer feedback.&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
=== Spec Issues ===&lt;br /&gt;
&lt;br /&gt;
:&#039;&#039;ISSUE: [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding] defines the byte order mark as more authoritative than anything else. Is that desirable here?&#039;&#039;&lt;br /&gt;
* Remove &#039;&#039;binary&#039;&#039; encoding? The only real use is with &amp;lt;code&amp;gt;atob()&amp;lt;/code&amp;gt;/&amp;lt;code&amp;gt;btoa()&amp;lt;/code&amp;gt;; a better API would be Base64 directly in/out of Typed Arrays&lt;br /&gt;
&lt;br /&gt;
== API ==&lt;br /&gt;
&lt;br /&gt;
The &#039;&#039;&#039;default encoding&#039;&#039;&#039; is &amp;lt;code&amp;gt;&amp;quot;utf-8&amp;quot;&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== TextEncoder ===&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;WebIDL&#039;&#039;&#039;&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
dictionary TextEncodeOptions {&lt;br /&gt;
  boolean stream = false;&lt;br /&gt;
};&lt;br /&gt;
&lt;br /&gt;
[Constructor,&lt;br /&gt;
 Constructor(DOMString encoding)]&lt;br /&gt;
interface TextEncoder {&lt;br /&gt;
  readonly attribute DOMString encoding;&lt;br /&gt;
  ArrayBufferView encode(DOMString? string, optional TextEncodeOptions options);&lt;br /&gt;
};&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The constructor runs the following steps:&lt;br /&gt;
# If the constructor is called with no arguments, let &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt; be the &#039;&#039;&#039;default encoding&#039;&#039;&#039;. Otherwise, let &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt; be the value of the &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; argument.&lt;br /&gt;
# The constructor follows the &#039;&#039;&#039;steps to get an encoding&#039;&#039;&#039; from [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding], with &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt; as &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt;.&lt;br /&gt;
#* If the steps result in failure, throw an &amp;quot;&amp;lt;code&amp;gt;EncodingError&amp;lt;/code&amp;gt;&amp;quot; exception and terminate these steps.&lt;br /&gt;
#* Otherwise, set the encoder object&#039;s internal &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; property to the returned encoding.&lt;br /&gt;
# Initialize the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag of the encoder object to false. &lt;br /&gt;
# Initialize the internal &amp;lt;var&amp;gt;encoding algorithm state&amp;lt;/var&amp;gt; to the default values for the encoding &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dl&amp;gt;&lt;br /&gt;
&amp;lt;dt&amp;gt;&amp;lt;code&amp;gt;encoding&amp;lt;/code&amp;gt; of type DOMString, readonly&lt;br /&gt;
&amp;lt;dd&amp;gt;Returns the &amp;lt;var&amp;gt;Name&amp;lt;/var&amp;gt; of the encoder object&#039;s &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;, per [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding]. &lt;br /&gt;
&lt;br /&gt;
:Note that this may differ from the name of the encoding specified during the call to the constructor. For example, if the constructor is called with &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; of &amp;lt;code&amp;gt;&amp;quot;ascii&amp;quot;&amp;lt;/code&amp;gt; the &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; attribute of the &#039;&#039;encoder object&#039;&#039; would have the value &amp;lt;code&amp;gt;&amp;quot;windows-1252&amp;quot;&amp;lt;/code&amp;gt; as &amp;lt;code&amp;gt;&amp;quot;ascii&amp;quot;&amp;lt;/code&amp;gt; is a label for that encoding.&lt;br /&gt;
&amp;lt;/dl&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dl&amp;gt;&lt;br /&gt;
&amp;lt;dt&amp;gt;&amp;lt;code&amp;gt;encode&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;dd&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The &amp;lt;code&amp;gt;encode&amp;lt;/code&amp;gt; method runs these steps:&lt;br /&gt;
&lt;br /&gt;
# If the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag is not set, then reset the &amp;lt;var&amp;gt;encoding algorithm state&amp;lt;/var&amp;gt; to the default values for &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;. Otherwise, the &amp;lt;var&amp;gt;encoding algorithm state&amp;lt;/var&amp;gt; is re-used from the previous call to &amp;lt;code&amp;gt;encode&amp;lt;/code&amp;gt; on this object.&lt;br /&gt;
# If the &amp;lt;var&amp;gt;options&amp;lt;/var&amp;gt; parameter is specified and the &amp;lt;code&amp;gt;stream&amp;lt;/code&amp;gt; option is &#039;&#039;&#039;true&#039;&#039;&#039;, then the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag is set; otherwise the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag is cleared.&lt;br /&gt;
# Run the steps of the &amp;lt;var&amp;gt;encoding algorithm&amp;lt;/var&amp;gt;:&lt;br /&gt;
#* The input to the algorithm is a &amp;lt;var&amp;gt;stream of code points&amp;lt;/var&amp;gt;. The code units within the DOMString &amp;lt;var&amp;gt;string&amp;lt;/var&amp;gt; are interpreted as UTF-16 code units, to produce the &amp;lt;var&amp;gt;stream of code points&amp;lt;/var&amp;gt;; if &amp;lt;var&amp;gt;string&amp;lt;/var&amp;gt; is null, the &amp;lt;var&amp;gt;stream of code points&amp;lt;/var&amp;gt; is empty.&lt;br /&gt;
#*:&#039;&#039;ISSUE: Interpreting a DOMString as UTF-16 to yield a code point stream needs to be defined, including the handling of unpaired surrogates. [http://dev.w3.org/2006/webapi/WebIDL/#idl-DOMString WebIDL] only defines the reverse.&#039;&#039;&lt;br /&gt;
#* If the &amp;lt;var&amp;gt;options&amp;lt;/var&amp;gt; parameter is not specified or the &amp;lt;code&amp;gt;stream&amp;lt;/code&amp;gt; option is false, then after the final code point is yielded by the stream, the &#039;&#039;&#039;EOF code point&#039;&#039;&#039; is yielded.&lt;br /&gt;
#* The output of the algorithm is a &amp;lt;var&amp;gt;sequence of emitted bytes&amp;lt;/var&amp;gt;.&lt;br /&gt;
# Return a &amp;lt;code&amp;gt;Uint8Array&amp;lt;/code&amp;gt; object wrapping an &amp;lt;code&amp;gt;ArrayBuffer&amp;lt;/code&amp;gt; containing the &amp;lt;var&amp;gt;sequence of emitted bytes&amp;lt;/var&amp;gt; produced by the encoding algorithm.&lt;br /&gt;
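&lt;br /&gt;
The conversion of a DOMString into a stream of code points might look like the following sketch. It is only an illustration: toCodePoints is a hypothetical helper, and replacing unpaired surrogates with U+FFFD is an assumption, since the ISSUE above notes that this behavior is not yet defined.&lt;br /&gt;
&lt;br /&gt;
```javascript
// Hypothetical helper: interpret the UTF-16 code units of a string as code points.
function toCodePoints(string) {
  const codePoints = [];
  if (string === null) {
    return codePoints; // A null string yields an empty stream of code points.
  }
  // String iteration joins valid surrogate pairs and yields lone surrogates singly.
  for (const ch of string) {
    const cp = ch.codePointAt(0);
    const isSurrogate = cp >= 0xD800 ? 0xDFFF >= cp : false;
    // Assumption: an unpaired surrogate is replaced with U+FFFD.
    codePoints.push(isSurrogate ? 0xFFFD : cp);
  }
  return codePoints;
}
```
&lt;br /&gt;
A user agent could equally treat an unpaired surrogate as an encoder error; the sketch picks replacement only to keep the example short.&lt;br /&gt;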
&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/dl&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== TextDecoder ===&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;WebIDL&#039;&#039;&#039;&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
dictionary TextDecoderOptions {&lt;br /&gt;
  boolean fatal = false;&lt;br /&gt;
};&lt;br /&gt;
&lt;br /&gt;
dictionary TextDecodeOptions {&lt;br /&gt;
  boolean stream = false;&lt;br /&gt;
};&lt;br /&gt;
&lt;br /&gt;
[Constructor,&lt;br /&gt;
 Constructor(optional DOMString encoding, optional TextDecoderOptions options)]&lt;br /&gt;
interface TextDecoder {&lt;br /&gt;
  readonly attribute DOMString encoding;&lt;br /&gt;
  DOMString decode(optional ArrayBufferView view, optional TextDecodeOptions options);&lt;br /&gt;
};&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The constructor creates a &#039;&#039;decoder object&#039;&#039;. It has the internal properties &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;, a &amp;lt;var&amp;gt;fatal&amp;lt;/var&amp;gt; flag which is initially unset, and a &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag which is initially unset.&lt;br /&gt;
&lt;br /&gt;
If called without an &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; argument, let &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; be the &#039;&#039;&#039;default encoding&#039;&#039;&#039;.&lt;br /&gt;
&lt;br /&gt;
The constructor follows the steps to get an encoding from [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding], with &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; as &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt;. If the steps result in failure, a &amp;lt;code&amp;gt;DOMException&amp;lt;/code&amp;gt; of type &amp;lt;code&amp;gt;EncodingError&amp;lt;/code&amp;gt; is thrown. Otherwise, set the &#039;&#039;decoder object&#039;s&#039;&#039; internal &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; property to the returned encoding. &lt;br /&gt;
&lt;br /&gt;
If the constructor is called with an &amp;lt;var&amp;gt;options&amp;lt;/var&amp;gt; argument, and the &amp;lt;var&amp;gt;fatal&amp;lt;/var&amp;gt; property of the dictionary is set, the internal &amp;lt;var&amp;gt;fatal&amp;lt;/var&amp;gt; flag of the &#039;&#039;decoder object&#039;&#039; is set; otherwise the internal &amp;lt;var&amp;gt;fatal&amp;lt;/var&amp;gt; flag is cleared.&lt;br /&gt;
&lt;br /&gt;
Initialize the internal &amp;lt;var&amp;gt;encoding algorithm state&amp;lt;/var&amp;gt; to the default values for the encoding &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;.&lt;br /&gt;
&amp;lt;dl&amp;gt;&lt;br /&gt;
&amp;lt;dt&amp;gt;&amp;lt;code&amp;gt;encoding&amp;lt;/code&amp;gt; of type DOMString, readonly&lt;br /&gt;
&amp;lt;dd&amp;gt;&lt;br /&gt;
Returns the &amp;lt;var&amp;gt;Name&amp;lt;/var&amp;gt; of the decoder object&#039;s &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;, per [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding].&lt;br /&gt;
&lt;br /&gt;
:Note that this may differ from the name of the encoding specified during the call to the constructor. For example, if the constructor is called with &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; of &amp;lt;code&amp;gt;&amp;quot;ascii&amp;quot;&amp;lt;/code&amp;gt; the &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; attribute of the &#039;&#039;decoder object&#039;&#039; would have the value &amp;lt;code&amp;gt;&amp;quot;windows-1252&amp;quot;&amp;lt;/code&amp;gt; as &amp;lt;code&amp;gt;&amp;quot;ascii&amp;quot;&amp;lt;/code&amp;gt; is a label for that encoding.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dt&amp;gt;&amp;lt;code&amp;gt;decode&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;dd&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The &amp;lt;code&amp;gt;decode&amp;lt;/code&amp;gt; method runs the following steps:&lt;br /&gt;
&lt;br /&gt;
# If the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag of the &#039;&#039;decoder object&#039;&#039; is not set, then reset the &amp;lt;var&amp;gt;encoding algorithm state&amp;lt;/var&amp;gt; to the default values for encoding &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;. Otherwise, the &amp;lt;var&amp;gt;encoding algorithm state&amp;lt;/var&amp;gt; is re-used from the previous call to &amp;lt;code&amp;gt;decode&amp;lt;/code&amp;gt; on this object. &lt;br /&gt;
# If the &amp;lt;var&amp;gt;options&amp;lt;/var&amp;gt; parameter is specified and the &amp;lt;var&amp;gt;stream&amp;lt;/var&amp;gt; option is &#039;&#039;&#039;true&#039;&#039;&#039;, then the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag is set; otherwise the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag is cleared.&lt;br /&gt;
# Run the decoder algorithm of the &#039;&#039;decoder object&#039;s&#039;&#039; encoding:&lt;br /&gt;
#* The input to the algorithm is a &amp;lt;var&amp;gt;byte stream&amp;lt;/var&amp;gt;. The &amp;lt;var&amp;gt;byte stream&amp;lt;/var&amp;gt; is provided by the bytes in &amp;lt;var&amp;gt;view.buffer&amp;lt;/var&amp;gt; starting at offset &amp;lt;var&amp;gt;view.byteOffset&amp;lt;/var&amp;gt;. A maximum of &amp;lt;var&amp;gt;view.byteLength&amp;lt;/var&amp;gt; bytes are yielded by the stream from &amp;lt;var&amp;gt;view.buffer&amp;lt;/var&amp;gt;. If &amp;lt;var&amp;gt;view&amp;lt;/var&amp;gt; is not specified, the stream is empty.&lt;br /&gt;
#* If the &amp;lt;var&amp;gt;options&amp;lt;/var&amp;gt; parameter is not specified or the &#039;&#039;&#039;stream&#039;&#039;&#039; option is &#039;&#039;&#039;false&#039;&#039;&#039;, then after &amp;lt;var&amp;gt;view.byteLength&amp;lt;/var&amp;gt; bytes are yielded by the stream the &#039;&#039;&#039;EOF byte&#039;&#039;&#039; is yielded.&lt;br /&gt;
#* If the internal &amp;lt;var&amp;gt;fatal&amp;lt;/var&amp;gt; flag of the &#039;&#039;decoder object&#039;&#039; is set, then a &#039;&#039;&#039;decoder error&#039;&#039;&#039; causes a &amp;lt;code&amp;gt;DOMException&amp;lt;/code&amp;gt; of type &amp;lt;code&amp;gt;EncodingError&amp;lt;/code&amp;gt; to be thrown rather than emitting a fallback code point.&lt;br /&gt;
#* The output of the algorithm is a &amp;lt;var&amp;gt;sequence of emitted code points&amp;lt;/var&amp;gt;.&lt;br /&gt;
# Return a &amp;lt;code&amp;gt;DOMString&amp;lt;/code&amp;gt; by encoding the &amp;lt;var&amp;gt;sequence of emitted code points&amp;lt;/var&amp;gt; as UTF-16 as per [http://dev.w3.org/2006/webapi/WebIDL/#idl-DOMString WebIDL].&lt;br /&gt;
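&lt;br /&gt;
A minimal sketch of how the stream option could be used to decode data that arrives in chunks. It assumes a runtime that already ships a TextDecoder accepting a stream option (current browsers and Node.js do), whose behavior may differ from this draft; decodeChunks is a hypothetical helper name.&lt;br /&gt;
&lt;br /&gt;
```javascript
// Hypothetical helper: decode an array of Uint8Array chunks as one byte stream.
function decodeChunks(chunks, label) {
  const decoder = new TextDecoder(label);
  let text = "";
  chunks.forEach((chunk, index) => {
    const isLast = index === chunks.length - 1;
    // With stream set, an incomplete trailing byte sequence is kept as decoder
    // state and completed by the next call instead of being reported as an error.
    text += decoder.decode(chunk, { stream: !isLast });
  });
  return text;
}

// U+20AC (euro sign) is E2 82 AC in UTF-8; here it is split across two chunks.
const firstChunk = new Uint8Array([0xE2, 0x82]);
const secondChunk = new Uint8Array([0xAC]);
const decoded = decodeChunks([firstChunk, secondChunk], "utf-8");
```
&lt;br /&gt;
The final call omits the stream option so that the end of the input is signalled, which corresponds to the EOF byte being yielded in the steps above.&lt;br /&gt;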
&lt;br /&gt;
:&#039;&#039;ISSUE: Need to handle BOMs. [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding] specifies this by using the caller-specified encoding as a suggestion, and consuming the BOM as a part of selecting the real encoder where BOM takes precedence. At the very least a matching BOM should be ignored. A mismatching BOM could throw, be a decoding error, or could actually switch the decoder for this stream (call or sequence of calls), possibly if-and-only-if the constructor was called without an encoding.&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/dl&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Examples ==&lt;br /&gt;
&lt;br /&gt;
=== Example #1 - encoding strings ===&lt;br /&gt;
&lt;br /&gt;
The following example uses the API to encode an array of strings into an ArrayBuffer. The buffer contains the number of strings (as a Uint32), followed by the length of the first string (as a Uint32), the UTF-8 encoded string data, the length of the second string (as a Uint32), the string data, and so on.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
function encodeArrayOfStrings(strings, encoding) {&lt;br /&gt;
  var encoder, encoded, len, i, bytes, view, offset;&lt;br /&gt;
&lt;br /&gt;
  encoder = new TextEncoder(encoding);&lt;br /&gt;
  encoded = [];&lt;br /&gt;
&lt;br /&gt;
  len = Uint32Array.BYTES_PER_ELEMENT;&lt;br /&gt;
  for (i = 0; i &amp;lt; strings.length; i += 1) {&lt;br /&gt;
    len += Uint32Array.BYTES_PER_ELEMENT;&lt;br /&gt;
    encoded[i] = encoder.encode(strings[i]);&lt;br /&gt;
    len += encoded[i].byteLength;&lt;br /&gt;
  }&lt;br /&gt;
&lt;br /&gt;
  bytes = new Uint8Array(len);&lt;br /&gt;
  view = new DataView(bytes.buffer);&lt;br /&gt;
  offset = 0;&lt;br /&gt;
&lt;br /&gt;
  view.setUint32(offset, strings.length);&lt;br /&gt;
  offset += Uint32Array.BYTES_PER_ELEMENT;&lt;br /&gt;
  for (i = 0; i &amp;lt; encoded.length; i += 1) {&lt;br /&gt;
    len = encoded[i].byteLength;&lt;br /&gt;
    view.setUint32(offset, len);&lt;br /&gt;
    offset += Uint32Array.BYTES_PER_ELEMENT;&lt;br /&gt;
    bytes.set(encoded[i], offset);&lt;br /&gt;
    offset += len;&lt;br /&gt;
  }&lt;br /&gt;
  return bytes.buffer;&lt;br /&gt;
}&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Example #2 - decoding strings ===&lt;br /&gt;
&lt;br /&gt;
The following example decodes an ArrayBuffer containing data encoded in the format produced by the previous example back into an array of strings.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
function decodeArrayOfStrings(buffer, encoding) {&lt;br /&gt;
  var decoder, view, offset, num_strings, strings, i, len;&lt;br /&gt;
&lt;br /&gt;
  decoder = new TextDecoder(encoding);&lt;br /&gt;
  view = new DataView(buffer);&lt;br /&gt;
  offset = 0;&lt;br /&gt;
  strings = [];&lt;br /&gt;
&lt;br /&gt;
  num_strings = view.getUint32(offset);&lt;br /&gt;
  offset += Uint32Array.BYTES_PER_ELEMENT;&lt;br /&gt;
  for (i = 0; i &amp;lt; num_strings; i += 1) {&lt;br /&gt;
    len = view.getUint32(offset);&lt;br /&gt;
    offset += Uint32Array.BYTES_PER_ELEMENT;&lt;br /&gt;
    strings[i] = decoder.decode(&lt;br /&gt;
      new DataView(view.buffer, offset, len));&lt;br /&gt;
    offset += len;&lt;br /&gt;
  }&lt;br /&gt;
  return strings;&lt;br /&gt;
}&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Encodings ==&lt;br /&gt;
&lt;br /&gt;
Encodings are defined and implemented per [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding]. This implicitly includes the steps to &amp;lt;em&amp;gt;get an encoding&amp;lt;/em&amp;gt; from a string, and the logic for label matching and case-insensitivity.&lt;br /&gt;
&lt;br /&gt;
User agents MUST NOT support any other encodings or labels than those defined in [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding], and MUST support all encodings and labels defined in that specification, with the additions defined below.&lt;br /&gt;
&lt;br /&gt;
:&#039;&#039;NOTE: In [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding], &amp;quot;ascii&amp;quot; is a label for &#039;&#039;&#039;windows-1252&#039;&#039;&#039;; there is no 7-bit-or-raise-exception encoding. Applications that are required to restrict the content of decoded strings should implement validation after decoding.&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
:&#039;&#039;NOTE: Unicode normalization forms are outside the scope of this specification. No normalization is done prior to encoding or after decoding.&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
:&#039;&#039;NOTE: Handling of encoding-specific issues, e.g. over-long UTF-8 encodings, byte order marks, unmatched surrogate pairs, and so on is defined by [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding].&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
=== Additional Encodings ===&lt;br /&gt;
&lt;br /&gt;
The following additional encodings are defined by this specification. They are specific to the methods defined herein.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;table border=1 cellpadding=5&amp;gt;&lt;br /&gt;
&amp;lt;tr&amp;gt;&amp;lt;th&amp;gt;Name&amp;lt;th&amp;gt;Labels&lt;br /&gt;
&amp;lt;tr&amp;gt;&amp;lt;td&amp;gt;binary&amp;lt;td&amp;gt;&amp;quot;binary&amp;quot;&lt;br /&gt;
&amp;lt;/table&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== binary ====&lt;br /&gt;
&lt;br /&gt;
The &#039;&#039;&#039;binary&#039;&#039;&#039; encoding is a single-byte encoding where the input code point and output byte are identical for the range &#039;&#039;&#039;U+0000&#039;&#039;&#039; to &#039;&#039;&#039;U+00ff&#039;&#039;&#039;. &lt;br /&gt;
&lt;br /&gt;
:&#039;&#039;NOTE: This encoding is intended to allow interoperation with legacy code that encodes binary data in ECMAScript strings, for example the WindowBase64 methods [http://dev.w3.org/html5/spec-author-view/webappapis.html#dom-windowbase64-atob atob()]/[http://dev.w3.org/html5/spec-author-view/webappapis.html#dom-windowbase64-btoa btoa()]. It is recommended that new Web applications use Typed Arrays for transmission and storage of binary data.&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
If &#039;&#039;&#039;binary&#039;&#039;&#039; is selected as the encoding, then step 3 of the steps to &#039;&#039;decode a byte stream&#039;&#039; is skipped; no BOM detection is performed.&lt;br /&gt;
&lt;br /&gt;
:&#039;&#039;NOTE: If &amp;lt;code&amp;gt;&amp;quot;binary&amp;quot;&amp;lt;/code&amp;gt; is specified, byte order marks must be ignored.&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
The &#039;&#039;&#039;binary decoder&#039;&#039;&#039; is:&lt;br /&gt;
# Let &amp;lt;var&amp;gt;byte&amp;lt;/var&amp;gt; be byte pointer.&lt;br /&gt;
# If &amp;lt;var&amp;gt;byte&amp;lt;/var&amp;gt; is the EOF byte, emit the EOF code point.&lt;br /&gt;
# Increase the byte pointer by one.&lt;br /&gt;
# Emit a code point whose value is &amp;lt;var&amp;gt;byte&amp;lt;/var&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
The &#039;&#039;&#039;binary encoder&#039;&#039;&#039; is:&lt;br /&gt;
# Let &amp;lt;var&amp;gt;code point&amp;lt;/var&amp;gt; be the code point pointer.&lt;br /&gt;
# If &amp;lt;var&amp;gt;code point&amp;lt;/var&amp;gt; is the EOF code point, emit the EOF byte.&lt;br /&gt;
# Increase the code point pointer by one.&lt;br /&gt;
# If &amp;lt;var&amp;gt;code point&amp;lt;/var&amp;gt; is in the range U+0000 to U+00FF, emit a byte whose value is &amp;lt;var&amp;gt;code point&amp;lt;/var&amp;gt;.&lt;br /&gt;
# Otherwise, emit an encoder error.&lt;br /&gt;
&lt;br /&gt;
== References ==&lt;br /&gt;
&lt;br /&gt;
* WebIDL http://dev.w3.org/2006/webapi/WebIDL&lt;br /&gt;
* Encoding http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== Acknowledgements ==&lt;br /&gt;
&lt;br /&gt;
* Alan Chaney&lt;br /&gt;
* Ben Noordhuis&lt;br /&gt;
* Glenn Maynard&lt;br /&gt;
* John Tamplin&lt;br /&gt;
* Kenneth Russell (Google, Inc)&lt;br /&gt;
* Robert Mustacchi&lt;br /&gt;
* Ryan Dahl&lt;br /&gt;
* Anne van Kesteren&lt;br /&gt;
&lt;br /&gt;
== Appendix ==&lt;br /&gt;
&lt;br /&gt;
A &amp;quot;shim&amp;quot; implementation in JavaScript (that may not fully match the current version of the spec) plus some initial unit tests can be found at:&lt;br /&gt;
&lt;br /&gt;
:http://code.google.com/p/stringencoding/&lt;br /&gt;
&lt;br /&gt;
[[Category:Proposals]]&lt;/div&gt;</summary>
		<author><name>Jsbell</name></author>
	</entry>
	<entry>
		<id>https://wiki.whatwg.org/index.php?title=StringEncoding&amp;diff=8230</id>
		<title>StringEncoding</title>
		<link rel="alternate" type="text/html" href="https://wiki.whatwg.org/index.php?title=StringEncoding&amp;diff=8230"/>
		<updated>2012-05-24T17:15:17Z</updated>

		<summary type="html">&lt;p&gt;Jsbell: Define encode algorithm inputs/outputs more pedantically&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;Proposed Text Encoding Web API for Typed Arrays&lt;br /&gt;
&lt;br /&gt;
== Editors ==&lt;br /&gt;
&lt;br /&gt;
* Joshua Bell (Google, Inc)&lt;br /&gt;
&lt;br /&gt;
== Abstract ==&lt;br /&gt;
&lt;br /&gt;
This specification defines an API for encoding strings to binary data, and decoding strings from binary data.&lt;br /&gt;
&lt;br /&gt;
NOTE: This specification intentionally does not address the opposite scenario of encoding binary data as strings and decoding binary data from strings, for example using Base64 encoding.&lt;br /&gt;
&lt;br /&gt;
Discussion on this topic has so far taken place on the [http://www.khronos.org/webgl/public-mailing-list/ public_webgl@khronos.org] mailing list. See http://www.khronos.org/webgl/public-mailing-list/archives/1111/msg00017.html for the initial discussion thread.&lt;br /&gt;
&lt;br /&gt;
Discussion has since moved to the [http://www.whatwg.org/mailing-list#specs WHATWG spec discussion] mailing list. See http://lists.whatwg.org/htdig.cgi/whatwg-whatwg.org/2012-March/035038.html for the latest discussion thread.&lt;br /&gt;
&lt;br /&gt;
== Open Issues ==&lt;br /&gt;
&lt;br /&gt;
General: Should this be a standalone API (as written), or live on e.g. DataView, or on e.g. String?&lt;br /&gt;
&lt;br /&gt;
=== Scenarios ===&lt;br /&gt;
&lt;br /&gt;
* Encode as many characters as possible into a fixed-size buffer for transmission, and repeat starting with next unencoded character&lt;br /&gt;
&lt;br /&gt;
=== Desired Features ===&lt;br /&gt;
&lt;br /&gt;
* Allow arbitrary end byte sequences (e.g. &amp;lt;code&amp;gt;0xFF&amp;lt;/code&amp;gt; for UTF-8 strings)&lt;br /&gt;
** &#039;&#039;Tentative Resolution: Add &amp;lt;code&amp;gt;indexOf&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;ArrayBufferView&amp;lt;/code&amp;gt;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
* Encoding errors&lt;br /&gt;
&lt;br /&gt;
=== API cleanup ===&lt;br /&gt;
&lt;br /&gt;
* Support two versions of encode; one which takes target buffer, and one which creates/returns a right-sized buffer&lt;br /&gt;
** &#039;&#039;Tentative Resolution: Wait for developer feedback.&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
=== Spec Issues ===&lt;br /&gt;
&lt;br /&gt;
:&#039;&#039;ISSUE: [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding] defines the byte order mark as more authoritative than anything else. Is that desirable here?&#039;&#039;&lt;br /&gt;
* Remove &#039;&#039;binary&#039;&#039; encoding? The only real use is with &amp;lt;code&amp;gt;atob()&amp;lt;/code&amp;gt;/&amp;lt;code&amp;gt;btoa()&amp;lt;/code&amp;gt;; a better API would be Base64 directly in/out of Typed Arrays&lt;br /&gt;
&lt;br /&gt;
== API ==&lt;br /&gt;
&lt;br /&gt;
The &#039;&#039;&#039;default encoding&#039;&#039;&#039; is &amp;lt;code&amp;gt;&amp;quot;utf-8&amp;quot;&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== TextEncoder ===&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;WebIDL&#039;&#039;&#039;&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
dictionary TextEncodeOptions {&lt;br /&gt;
  boolean stream = false;&lt;br /&gt;
};&lt;br /&gt;
&lt;br /&gt;
[Constructor,&lt;br /&gt;
 Constructor(DOMString encoding)]&lt;br /&gt;
interface TextEncoder {&lt;br /&gt;
  readonly attribute DOMString encoding;&lt;br /&gt;
  ArrayBufferView encode(DOMString? string, optional TextEncodeOptions options);&lt;br /&gt;
};&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
If the constructor is called with no arguments, let &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; be the &#039;&#039;&#039;default encoding&#039;&#039;&#039;.&lt;br /&gt;
&lt;br /&gt;
The constructor follows the &#039;&#039;&#039;steps to get an encoding&#039;&#039;&#039; from [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding], with &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; as &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt;. If the steps result in failure, a &amp;lt;code&amp;gt;DOMException&amp;lt;/code&amp;gt; of type &amp;lt;code&amp;gt;EncodingError&amp;lt;/code&amp;gt; is thrown. Otherwise, set the encoder object&#039;s internal &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; property to the returned encoding. Initialize the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag of the encoder object to false. Initialize the internal &amp;lt;var&amp;gt;encoding algorithm state&amp;lt;/var&amp;gt; to the default values for the encoding &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dl&amp;gt;&lt;br /&gt;
&amp;lt;dt&amp;gt;&amp;lt;code&amp;gt;encoding&amp;lt;/code&amp;gt; of type DOMString, readonly&lt;br /&gt;
&amp;lt;dd&amp;gt;Returns the &amp;lt;var&amp;gt;Name&amp;lt;/var&amp;gt; of the encoder object&#039;s &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;, per [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding]. &lt;br /&gt;
&lt;br /&gt;
:Note that this may differ from the name of the encoding specified during the call to the constructor. For example, if the constructor is called with &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; of &amp;lt;code&amp;gt;&amp;quot;ascii&amp;quot;&amp;lt;/code&amp;gt; the &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; attribute of the &#039;&#039;encoder object&#039;&#039; would have the value &amp;lt;code&amp;gt;&amp;quot;windows-1252&amp;quot;&amp;lt;/code&amp;gt; as &amp;lt;code&amp;gt;&amp;quot;ascii&amp;quot;&amp;lt;/code&amp;gt; is a label for that encoding.&lt;br /&gt;
&amp;lt;/dl&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dl&amp;gt;&lt;br /&gt;
&amp;lt;dt&amp;gt;&amp;lt;code&amp;gt;encode&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;dd&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The &amp;lt;code&amp;gt;encode&amp;lt;/code&amp;gt; method runs these steps:&lt;br /&gt;
&lt;br /&gt;
# If the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag is not set, then reset the &amp;lt;var&amp;gt;encoding algorithm state&amp;lt;/var&amp;gt; to the default values for &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;. Otherwise, the &amp;lt;var&amp;gt;encoding algorithm state&amp;lt;/var&amp;gt; is re-used from the previous call to &amp;lt;code&amp;gt;encode&amp;lt;/code&amp;gt; on this object.&lt;br /&gt;
# If the &amp;lt;var&amp;gt;options&amp;lt;/var&amp;gt; parameter is specified and the &amp;lt;code&amp;gt;stream&amp;lt;/code&amp;gt; option is &#039;&#039;&#039;true&#039;&#039;&#039;, then the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag is set; otherwise the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag is cleared.&lt;br /&gt;
# Run the steps of the &amp;lt;var&amp;gt;encoding algorithm&amp;lt;/var&amp;gt;:&lt;br /&gt;
#* The input to the algorithm is a &amp;lt;var&amp;gt;stream of code points&amp;lt;/var&amp;gt;. The code units within the DOMString &amp;lt;var&amp;gt;string&amp;lt;/var&amp;gt; are interpreted as UTF-16 code units, to produce the &amp;lt;var&amp;gt;stream of code points&amp;lt;/var&amp;gt;; if &amp;lt;var&amp;gt;string&amp;lt;/var&amp;gt; is null, the &amp;lt;var&amp;gt;stream of code points&amp;lt;/var&amp;gt; is empty.&lt;br /&gt;
#*:&#039;&#039;ISSUE: Interpreting a DOMString as UTF-16 to yield a code point stream needs to be defined, including the handling of unpaired surrogates. [http://dev.w3.org/2006/webapi/WebIDL/#idl-DOMString WebIDL] only defines the reverse.&#039;&#039;&lt;br /&gt;
#* If the &amp;lt;var&amp;gt;options&amp;lt;/var&amp;gt; parameter is not specified or the &amp;lt;code&amp;gt;stream&amp;lt;/code&amp;gt; option is false, then after the final code point is yielded by the stream, the &#039;&#039;&#039;EOF code point&#039;&#039;&#039; is yielded.&lt;br /&gt;
#* The output of the algorithm is a &amp;lt;var&amp;gt;sequence of emitted bytes&amp;lt;/var&amp;gt;.&lt;br /&gt;
# Return a &amp;lt;code&amp;gt;Uint8Array&amp;lt;/code&amp;gt; object wrapping an &amp;lt;code&amp;gt;ArrayBuffer&amp;lt;/code&amp;gt; containing the &amp;lt;var&amp;gt;sequence of emitted bytes&amp;lt;/var&amp;gt; produced by the encoding algorithm.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/dl&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== TextDecoder ===&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;WebIDL&#039;&#039;&#039;&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
dictionary TextDecoderOptions {&lt;br /&gt;
  boolean fatal = false;&lt;br /&gt;
};&lt;br /&gt;
&lt;br /&gt;
dictionary TextDecodeOptions {&lt;br /&gt;
  boolean stream = false;&lt;br /&gt;
};&lt;br /&gt;
&lt;br /&gt;
[Constructor,&lt;br /&gt;
 Constructor(optional DOMString encoding, optional TextDecoderOptions options)]&lt;br /&gt;
interface TextDecoder {&lt;br /&gt;
  readonly attribute DOMString encoding;&lt;br /&gt;
  DOMString decode(optional ArrayBufferView view, optional TextDecodeOptions options);&lt;br /&gt;
};&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The constructor creates a &#039;&#039;decoder object&#039;&#039;. It has the internal properties &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;, a &amp;lt;var&amp;gt;fatal&amp;lt;/var&amp;gt; flag which is initially unset, and a &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag which is initially unset.&lt;br /&gt;
&lt;br /&gt;
If called without an &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; argument, let &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; be the &#039;&#039;&#039;default encoding&#039;&#039;&#039;.&lt;br /&gt;
&lt;br /&gt;
The constructor follows the steps to get an encoding from [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding], with &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; as &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt;. If the steps result in failure, a &amp;lt;code&amp;gt;DOMException&amp;lt;/code&amp;gt; of type &amp;lt;code&amp;gt;EncodingError&amp;lt;/code&amp;gt; is thrown. Otherwise, set the &#039;&#039;decoder object&#039;s&#039;&#039; internal &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; property to the returned encoding. &lt;br /&gt;
&lt;br /&gt;
If the constructor is called with an &amp;lt;var&amp;gt;options&amp;lt;/var&amp;gt; argument, and the &amp;lt;var&amp;gt;fatal&amp;lt;/var&amp;gt; property of the dictionary is set, the internal &amp;lt;var&amp;gt;fatal&amp;lt;/var&amp;gt; flag of the &#039;&#039;decoder object&#039;&#039; is set, otherwise the internal &amp;lt;var&amp;gt;fatal&amp;lt;/var&amp;gt; flag is cleared.&lt;br /&gt;
&lt;br /&gt;
Initialize the internal &amp;lt;var&amp;gt;encoding algorithm state&amp;lt;/var&amp;gt; to the default values for the encoding &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;.&lt;br /&gt;
&amp;lt;dl&amp;gt;&lt;br /&gt;
&amp;lt;dt&amp;gt;&amp;lt;code&amp;gt;encoding&amp;lt;/code&amp;gt; of type DOMString, readonly&lt;br /&gt;
&amp;lt;dd&amp;gt;&lt;br /&gt;
Returns the &amp;lt;var&amp;gt;Name&amp;lt;/var&amp;gt; of the decoder object&#039;s &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;, per [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding].&lt;br /&gt;
&lt;br /&gt;
:Note that this may differ from the name of the encoding specified during the call to the constructor. For example, if the constructor is called with &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; of &amp;lt;code&amp;gt;&amp;quot;ascii&amp;quot;&amp;lt;/code&amp;gt; the &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; attribute of the &#039;&#039;decoder object&#039;&#039; would have the value &amp;lt;code&amp;gt;&amp;quot;windows-1252&amp;quot;&amp;lt;/code&amp;gt; as &amp;lt;code&amp;gt;&amp;quot;ascii&amp;quot;&amp;lt;/code&amp;gt; is a label for that encoding.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dt&amp;gt;&amp;lt;code&amp;gt;decode&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;dd&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The &amp;lt;code&amp;gt;decode&amp;lt;/code&amp;gt; method runs the following steps:&lt;br /&gt;
&lt;br /&gt;
# If the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag of the &#039;&#039;decoder object&#039;&#039; is not set, then reset the &amp;lt;var&amp;gt;encoding algorithm state&amp;lt;/var&amp;gt; to the default values for encoding &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;. Otherwise, the &amp;lt;var&amp;gt;encoding algorithm state&amp;lt;/var&amp;gt; is re-used from the previous call to &amp;lt;code&amp;gt;decode&amp;lt;/code&amp;gt; on this object. &lt;br /&gt;
# If the &amp;lt;var&amp;gt;options&amp;lt;/var&amp;gt; parameter is specified and the &amp;lt;var&amp;gt;stream&amp;lt;/var&amp;gt; option is &#039;&#039;&#039;true&#039;&#039;&#039;, then the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag is set; otherwise the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag is cleared.&lt;br /&gt;
# Run the decoder algorithm of the &#039;&#039;decoder object&#039;s&#039;&#039; encoding:&lt;br /&gt;
#* The input to the algorithm is a &amp;lt;var&amp;gt;byte stream&amp;lt;/var&amp;gt;. The &amp;lt;var&amp;gt;byte stream&amp;lt;/var&amp;gt; is provided by the bytes in &amp;lt;var&amp;gt;view.buffer&amp;lt;/var&amp;gt; starting at offset &amp;lt;var&amp;gt;view.byteOffset&amp;lt;/var&amp;gt;. A maximum of &amp;lt;var&amp;gt;view.byteLength&amp;lt;/var&amp;gt; bytes are yielded by the stream from &amp;lt;var&amp;gt;view.buffer&amp;lt;/var&amp;gt;. If &amp;lt;var&amp;gt;view&amp;lt;/var&amp;gt; is not specified, the stream is empty.&lt;br /&gt;
#* If the &amp;lt;var&amp;gt;options&amp;lt;/var&amp;gt; parameter is not specified or the &#039;&#039;&#039;stream&#039;&#039;&#039; option is &#039;&#039;&#039;false&#039;&#039;&#039;, then after &amp;lt;var&amp;gt;view.byteLength&amp;lt;/var&amp;gt; bytes are yielded by the stream, the &#039;&#039;&#039;EOF byte&#039;&#039;&#039; is yielded.&lt;br /&gt;
#* If the internal &amp;lt;var&amp;gt;fatal&amp;lt;/var&amp;gt; flag of the &#039;&#039;decoder object&#039;&#039; is set, then a &#039;&#039;&#039;decoder error&#039;&#039;&#039; causes a &amp;lt;code&amp;gt;DOMException&amp;lt;/code&amp;gt; of type &amp;lt;code&amp;gt;EncodingError&amp;lt;/code&amp;gt; to be thrown rather than emitting a fallback code point.&lt;br /&gt;
#* The output of the algorithm is a sequence of &amp;lt;var&amp;gt;emitted code points&amp;lt;/var&amp;gt;.&lt;br /&gt;
# Return a &amp;lt;code&amp;gt;DOMString&amp;lt;/code&amp;gt; by encoding the &amp;lt;var&amp;gt;sequence of emitted code points&amp;lt;/var&amp;gt; as UTF-16 as per [http://dev.w3.org/2006/webapi/WebIDL/#idl-DOMString WebIDL].&lt;br /&gt;
&lt;br /&gt;
:&#039;&#039;ISSUE: Need to handle BOMs. [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding] specifies this by using the caller-specified encoding as a suggestion, and consuming the BOM as a part of selecting the real encoder where BOM takes precedence. At the very least a matching BOM should be ignored. A mismatching BOM could throw, be a decoding error, or could actually switch the decoder for this stream (call or sequence of calls), possibly if-and-only-if the constructor was called without an encoding.&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/dl&amp;gt;&lt;br /&gt;
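&lt;br /&gt;
The streaming behaviour of the decode steps above can be sketched as follows. This is a minimal illustration rather than part of the proposal: it assumes a runtime that exposes a TextDecoder global accepting the stream option as described, and the helper name decodeChunks is hypothetical.&lt;br /&gt;
&lt;br /&gt;
```javascript
// Hypothetical helper: decode a sequence of byte chunks as one logical stream.
function decodeChunks(chunks, encoding) {
  const decoder = new TextDecoder(encoding);
  let text = "";
  for (const chunk of chunks) {
    // stream: true keeps the decoder state between calls, so a multi-byte
    // sequence split across two chunks is completed by the next call.
    text += decoder.decode(chunk, { stream: true });
  }
  // A final call without the stream option signals end of input (EOF).
  text += decoder.decode();
  return text;
}

// U+20AC EURO SIGN is E2 82 AC in UTF-8; here it is split across two chunks.
const parts = [new Uint8Array([0xe2, 0x82]), new Uint8Array([0xac])];
console.log(decodeChunks(parts, "utf-8") === "\u20ac");
```
&lt;br /&gt;
Decoding each chunk with a fresh decoder instead would treat the truncated sequence at the end of the first chunk as a decoder error.&lt;br /&gt;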
&lt;br /&gt;
== Examples ==&lt;br /&gt;
&lt;br /&gt;
=== Example #1 - encoding strings ===&lt;br /&gt;
&lt;br /&gt;
The following example uses the API to encode an array of strings into an ArrayBuffer. The result is an ArrayBuffer containing the number of strings (as a Uint32), followed by the length in bytes of the first string (as a Uint32), the encoded string data, the length of the second string (as a Uint32), the string data, and so on.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
function encodeArrayOfStrings(strings, encoding) {&lt;br /&gt;
  var encoder, encoded, len, i, bytes, view, offset;&lt;br /&gt;
&lt;br /&gt;
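  // Create one encoder and reuse it for every string.&lt;br /&gt;
  encoder = new TextEncoder(encoding);&lt;br /&gt;
  encoded = [];&lt;br /&gt;
&lt;br /&gt;
  // Total size: one Uint32 for the count, plus a Uint32 length prefix and the bytes of each string.&lt;br /&gt;
  len = Uint32Array.BYTES_PER_ELEMENT;&lt;br /&gt;
  for (i = 0; i &amp;lt; strings.length; i += 1) {&lt;br /&gt;
    len += Uint32Array.BYTES_PER_ELEMENT;&lt;br /&gt;
    encoded[i] = encoder.encode(strings[i]);&lt;br /&gt;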
    len += encoded[i].byteLength;&lt;br /&gt;
  }&lt;br /&gt;
&lt;br /&gt;
  bytes = new Uint8Array(len);&lt;br /&gt;
  view = new DataView(bytes.buffer);&lt;br /&gt;
  offset = 0;&lt;br /&gt;
&lt;br /&gt;
  view.setUint32(offset, strings.length);&lt;br /&gt;
  offset += Uint32Array.BYTES_PER_ELEMENT;&lt;br /&gt;
  for (i = 0; i &amp;lt; encoded.length; i += 1) {&lt;br /&gt;
    len = encoded[i].byteLength;&lt;br /&gt;
    view.setUint32(offset, len);&lt;br /&gt;
    offset += Uint32Array.BYTES_PER_ELEMENT;&lt;br /&gt;
    bytes.set(encoded[i], offset);&lt;br /&gt;
    offset += len;&lt;br /&gt;
  }&lt;br /&gt;
  return bytes.buffer;&lt;br /&gt;
}&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Example #2 - decoding strings ===&lt;br /&gt;
&lt;br /&gt;
The following example decodes an ArrayBuffer containing data encoded in the format produced by the previous example back into an array of strings.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
function decodeArrayOfStrings(buffer, encoding) {&lt;br /&gt;
  var decoder, view, offset, num_strings, strings, i, len;&lt;br /&gt;
&lt;br /&gt;
  decoder = new TextDecoder(encoding);&lt;br /&gt;
  view = new DataView(buffer);&lt;br /&gt;
  offset = 0;&lt;br /&gt;
  strings = [];&lt;br /&gt;
&lt;br /&gt;
  num_strings = view.getUint32(offset);&lt;br /&gt;
  offset += Uint32Array.BYTES_PER_ELEMENT;&lt;br /&gt;
  for (i = 0; i &amp;lt; num_strings; i += 1) {&lt;br /&gt;
    len = view.getUint32(offset);&lt;br /&gt;
    offset += Uint32Array.BYTES_PER_ELEMENT;&lt;br /&gt;
    strings[i] = decoder.decode(&lt;br /&gt;
      new DataView(view.buffer, offset, len));&lt;br /&gt;
    offset += len;&lt;br /&gt;
  }&lt;br /&gt;
  return strings;&lt;br /&gt;
}&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Encodings ==&lt;br /&gt;
&lt;br /&gt;
Encodings are defined and implemented per [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding]. This implicitly includes the steps to &amp;lt;em&amp;gt;get an encoding&amp;lt;/em&amp;gt; from a string, and the logic for label matching and case-insensitivity.&lt;br /&gt;
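&lt;br /&gt;
As a rough illustration of label matching, the sketch below resolves a label to the name reported by the encoding attribute. It assumes a runtime with a TextDecoder global; the helper name canonicalName is hypothetical, and the sketch catches any exception rather than relying on the specific exception type named in this draft.&lt;br /&gt;
&lt;br /&gt;
```javascript
// Hypothetical helper: resolve a label to the canonical encoding name,
// or return null when the label is not recognised.
function canonicalName(label) {
  try {
    return new TextDecoder(label).encoding;
  } catch (e) {
    return null;
  }
}

// Labels are matched case-insensitively; "UTF8" is a label for utf-8.
console.log(canonicalName("UTF8"));
console.log(canonicalName("no-such-encoding"));
```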
&lt;br /&gt;
User agents MUST NOT support any other encodings or labels than those defined in [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding], and MUST support all encodings and labels defined in that specification, with the additions defined below.&lt;br /&gt;
&lt;br /&gt;
:&#039;&#039;NOTE: In [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding], &amp;quot;ascii&amp;quot; is a label for &#039;&#039;&#039;windows-1252&#039;&#039;&#039;; there is no 7-bit-or-raise-exception encoding. Applications that are required to restrict the content of decoded strings should implement validation after decoding.&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
:&#039;&#039;NOTE: Unicode normalization forms are outside the scope of this specification. No normalization is done prior to encoding or after decoding.&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
:&#039;&#039;NOTE: Handling of encoding-specific issues, e.g. over-long UTF-8 encodings, byte order marks, unmatched surrogate pairs, and so on is defined by [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding].&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
=== Additional Encodings ===&lt;br /&gt;
&lt;br /&gt;
The following additional encodings are defined by this specification. They are specific to the methods defined herein.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;table border=1 cellpadding=5&amp;gt;&lt;br /&gt;
&amp;lt;tr&amp;gt;&amp;lt;th&amp;gt;Name&amp;lt;th&amp;gt;Labels&lt;br /&gt;
&amp;lt;tr&amp;gt;&amp;lt;td&amp;gt;binary&amp;lt;td&amp;gt;&amp;quot;binary&amp;quot;&lt;br /&gt;
&amp;lt;/table&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== binary ====&lt;br /&gt;
&lt;br /&gt;
The &#039;&#039;&#039;binary&#039;&#039;&#039; encoding is a single-byte encoding where the input code point and output byte are identical for the range &#039;&#039;&#039;U+0000&#039;&#039;&#039; to &#039;&#039;&#039;U+00ff&#039;&#039;&#039;. &lt;br /&gt;
&lt;br /&gt;
:&#039;&#039;NOTE: This encoding is intended to allow interoperation with legacy code that encodes binary data in ECMAScript strings, for example the WindowBase64 methods [http://dev.w3.org/html5/spec-author-view/webappapis.html#dom-windowbase64-atob atob()]/[http://dev.w3.org/html5/spec-author-view/webappapis.html#dom-windowbase64-btoa btoa()]. It is recommended that new Web applications use Typed Arrays for transmission and storage of binary data.&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
If &#039;&#039;&#039;binary&#039;&#039;&#039; is selected as the encoding, then step 3 of the steps to &#039;&#039;decode a byte stream&#039;&#039; is skipped; no BOM detection is performed.&lt;br /&gt;
&lt;br /&gt;
:&#039;&#039;NOTE: If &amp;lt;code&amp;gt;&amp;quot;binary&amp;quot;&amp;lt;/code&amp;gt; is specified, byte order marks must be ignored.&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
The &#039;&#039;&#039;binary decoder&#039;&#039;&#039; is:&lt;br /&gt;
# Let &amp;lt;var&amp;gt;byte&amp;lt;/var&amp;gt; be the byte pointer.&lt;br /&gt;
# If &amp;lt;var&amp;gt;byte&amp;lt;/var&amp;gt; is the EOF byte, emit the EOF code point.&lt;br /&gt;
# Increase the byte pointer by one.&lt;br /&gt;
# Emit a code point whose value is &amp;lt;var&amp;gt;byte&amp;lt;/var&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
The &#039;&#039;&#039;binary encoder&#039;&#039;&#039; is:&lt;br /&gt;
# Let &amp;lt;var&amp;gt;code point&amp;lt;/var&amp;gt; be the code point pointer.&lt;br /&gt;
# If &amp;lt;var&amp;gt;code point&amp;lt;/var&amp;gt; is the EOF code point, emit the EOF byte.&lt;br /&gt;
# Increase the code point pointer by one.&lt;br /&gt;
# If &amp;lt;var&amp;gt;code point&amp;lt;/var&amp;gt; is in the range U+0000 to U+00FF, emit a byte whose value is &amp;lt;var&amp;gt;code point&amp;lt;/var&amp;gt;.&lt;br /&gt;
# Otherwise, emit an encoder error.&lt;br /&gt;
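&lt;br /&gt;
The binary decoder and encoder steps above can be sketched in plain ECMAScript as follows. The function names binaryDecode and binaryEncode are hypothetical, and reporting an encoder error by throwing a RangeError is an assumption of this sketch; the draft does not say how an encoder error surfaces to the caller.&lt;br /&gt;
&lt;br /&gt;
```javascript
// Hypothetical sketch of the binary decoder: each byte becomes the
// code point with the same value (U+0000 to U+00FF).
function binaryDecode(bytes) {
  let out = "";
  for (const byte of bytes) {
    out += String.fromCharCode(byte);
  }
  return out;
}

// Hypothetical sketch of the binary encoder: each code point in the range
// U+0000 to U+00FF becomes one byte; anything else is an encoder error,
// modelled here (as an assumption) by throwing a RangeError.
function binaryEncode(string) {
  const out = [];
  for (const ch of string) {
    const codePoint = ch.codePointAt(0);
    if (codePoint > 0xff) {
      throw new RangeError("code point outside U+0000 to U+00FF");
    }
    out.push(codePoint);
  }
  return Uint8Array.from(out);
}

const roundTrip = binaryDecode(binaryEncode("A\u00ff"));
console.log(roundTrip === "A\u00ff");
```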
&lt;br /&gt;
== References ==&lt;br /&gt;
&lt;br /&gt;
* WebIDL http://dev.w3.org/2006/webapi/WebIDL&lt;br /&gt;
* Encoding http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== Acknowledgements ==&lt;br /&gt;
&lt;br /&gt;
* Alan Chaney&lt;br /&gt;
* Ben Noordhuis&lt;br /&gt;
* Glenn Maynard&lt;br /&gt;
* John Tamplin&lt;br /&gt;
* Kenneth Russell (Google, Inc)&lt;br /&gt;
* Robert Mustacchi&lt;br /&gt;
* Ryan Dahl&lt;br /&gt;
* Anne van Kesteren&lt;br /&gt;
&lt;br /&gt;
== Appendix ==&lt;br /&gt;
&lt;br /&gt;
A &amp;quot;shim&amp;quot; implementation in JavaScript (that may not fully match the current version of the spec) plus some initial unit tests can be found at:&lt;br /&gt;
&lt;br /&gt;
:http://code.google.com/p/stringencoding/&lt;br /&gt;
&lt;br /&gt;
[[Category:Proposals]]&lt;/div&gt;</summary>
		<author><name>Jsbell</name></author>
	</entry>
	<entry>
		<id>https://wiki.whatwg.org/index.php?title=StringEncoding&amp;diff=8229</id>
		<title>StringEncoding</title>
		<link rel="alternate" type="text/html" href="https://wiki.whatwg.org/index.php?title=StringEncoding&amp;diff=8229"/>
		<updated>2012-05-24T17:13:42Z</updated>

		<summary type="html">&lt;p&gt;Jsbell: Rewrote decode method algorithmically&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;Proposed Text Encoding Web API for Typed Arrays&lt;br /&gt;
&lt;br /&gt;
== Editors ==&lt;br /&gt;
&lt;br /&gt;
* Joshua Bell (Google, Inc)&lt;br /&gt;
&lt;br /&gt;
== Abstract ==&lt;br /&gt;
&lt;br /&gt;
This specification defines an API for encoding strings to binary data, and decoding strings from binary data.&lt;br /&gt;
&lt;br /&gt;
NOTE: This specification intentionally does not address the opposite scenario of encoding binary data as strings and decoding binary data from strings, for example using Base64 encoding.&lt;br /&gt;
&lt;br /&gt;
Discussion on this topic has so far taken place on the [http://www.khronos.org/webgl/public-mailing-list/ public_webgl@khronos.org] mailing list. See http://www.khronos.org/webgl/public-mailing-list/archives/1111/msg00017.html for the initial discussion thread.&lt;br /&gt;
&lt;br /&gt;
Discussion has since moved to the [http://www.whatwg.org/mailing-list#specs WHATWG spec discussion] mailing list. See http://lists.whatwg.org/htdig.cgi/whatwg-whatwg.org/2012-March/035038.html for the latest discussion thread.&lt;br /&gt;
&lt;br /&gt;
== Open Issues ==&lt;br /&gt;
&lt;br /&gt;
General: Should this be a standalone API (as written), or live on e.g. DataView, or on e.g. String?&lt;br /&gt;
&lt;br /&gt;
=== Scenarios ===&lt;br /&gt;
&lt;br /&gt;
* Encode as many characters as possible into a fixed-size buffer for transmission, and repeat starting with next unencoded character&lt;br /&gt;
&lt;br /&gt;
=== Desired Features ===&lt;br /&gt;
&lt;br /&gt;
* Allow arbitrary end byte sequences (e.g. &amp;lt;code&amp;gt;0xFF&amp;lt;/code&amp;gt; for UTF-8 strings)&lt;br /&gt;
** &#039;&#039;Tentative Resolution: Add &amp;lt;code&amp;gt;indexOf&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;ArrayBufferView&amp;lt;/code&amp;gt;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
* Encoding errors&lt;br /&gt;
&lt;br /&gt;
=== API cleanup ===&lt;br /&gt;
&lt;br /&gt;
* Support two versions of encode; one which takes target buffer, and one which creates/returns a right-sized buffer&lt;br /&gt;
** &#039;&#039;Tentative Resolution: Wait for developer feedback.&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
=== Spec Issues ===&lt;br /&gt;
&lt;br /&gt;
:&#039;&#039;ISSUE: [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding] defines the byte order mark as more authoritative than anything else. Is that desirable here?&#039;&#039;&lt;br /&gt;
* Remove &#039;&#039;binary&#039;&#039; encoding? The only real use is with &amp;lt;code&amp;gt;atob()&amp;lt;/code&amp;gt;/&amp;lt;code&amp;gt;btoa()&amp;lt;/code&amp;gt;; a better API would be Base64 directly in/out of Typed Arrays&lt;br /&gt;
&lt;br /&gt;
== API ==&lt;br /&gt;
&lt;br /&gt;
The &#039;&#039;&#039;default encoding&#039;&#039;&#039; is &amp;lt;code&amp;gt;&amp;quot;utf-8&amp;quot;&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== TextEncoder ===&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;WebIDL&#039;&#039;&#039;&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
dictionary TextEncodeOptions {&lt;br /&gt;
  boolean stream = false;&lt;br /&gt;
};&lt;br /&gt;
&lt;br /&gt;
[Constructor,&lt;br /&gt;
 Constructor(DOMString encoding)]&lt;br /&gt;
interface TextEncoder {&lt;br /&gt;
  readonly attribute DOMString encoding;&lt;br /&gt;
  ArrayBufferView encode(DOMString? string, optional TextEncodeOptions options);&lt;br /&gt;
};&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
If the constructor is called with no arguments, let &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; be the &#039;&#039;&#039;default encoding&#039;&#039;&#039;.&lt;br /&gt;
&lt;br /&gt;
The constructor follows the &#039;&#039;&#039;steps to get an encoding&#039;&#039;&#039; from [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding], with &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; as &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt;. If the steps result in failure, a &amp;lt;code&amp;gt;DOMException&amp;lt;/code&amp;gt; of type &amp;lt;code&amp;gt;EncodingError&amp;lt;/code&amp;gt; is thrown. Otherwise, set the encoder object&#039;s internal &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; property to the returned encoding. Initialize the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag of the encoder object to false. Initialize the internal &amp;lt;var&amp;gt;encoding algorithm state&amp;lt;/var&amp;gt; to the default values for the encoding &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dl&amp;gt;&lt;br /&gt;
&amp;lt;dt&amp;gt;&amp;lt;code&amp;gt;encoding&amp;lt;/code&amp;gt; of type DOMString, readonly&lt;br /&gt;
&amp;lt;dd&amp;gt;Returns the &amp;lt;var&amp;gt;Name&amp;lt;/var&amp;gt; of the encoder object&#039;s &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;, per [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding]. &lt;br /&gt;
&lt;br /&gt;
:Note that this may differ from the name of the encoding specified during the call to the constructor. For example, if the constructor is called with &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; of &amp;lt;code&amp;gt;&amp;quot;ascii&amp;quot;&amp;lt;/code&amp;gt; the &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; attribute of the &#039;&#039;encoder object&#039;&#039; would have the value &amp;lt;code&amp;gt;&amp;quot;windows-1252&amp;quot;&amp;lt;/code&amp;gt; as &amp;lt;code&amp;gt;&amp;quot;ascii&amp;quot;&amp;lt;/code&amp;gt; is a label for that encoding.&lt;br /&gt;
&amp;lt;/dl&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dl&amp;gt;&lt;br /&gt;
&amp;lt;dt&amp;gt;&amp;lt;code&amp;gt;encode&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;dd&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The &amp;lt;code&amp;gt;encode&amp;lt;/code&amp;gt; method runs these steps:&lt;br /&gt;
&lt;br /&gt;
# If the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag is not set, then reset the &amp;lt;var&amp;gt;encoding algorithm state&amp;lt;/var&amp;gt; to the default values for &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;. Otherwise, the &amp;lt;var&amp;gt;encoding algorithm state&amp;lt;/var&amp;gt; is re-used from the previous call to &amp;lt;code&amp;gt;encode&amp;lt;/code&amp;gt; on this object.&lt;br /&gt;
# If the &amp;lt;var&amp;gt;options&amp;lt;/var&amp;gt; parameter is specified and the &amp;lt;code&amp;gt;stream&amp;lt;/code&amp;gt; option is &#039;&#039;&#039;true&#039;&#039;&#039;, then the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag is set; otherwise the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag is cleared.&lt;br /&gt;
# Run the steps of the &amp;lt;var&amp;gt;encoding algorithm&amp;lt;/var&amp;gt;:&lt;br /&gt;
#* The input to the algorithm is a stream of code points. The code units within the DOMString &amp;lt;var&amp;gt;string&amp;lt;/var&amp;gt; are interpreted as UTF-16 code units, to produce a stream of code points; if &amp;lt;var&amp;gt;string&amp;lt;/var&amp;gt; is null, the stream is empty.&lt;br /&gt;
#*:&#039;&#039;ISSUE: Interpreting a DOMString as UTF-16 to yield a code point stream needs to be defined, including the handling of unpaired surrogates. [http://dev.w3.org/2006/webapi/WebIDL/#idl-DOMString WebIDL] only defines the reverse.&#039;&#039;&lt;br /&gt;
#* If the &amp;lt;var&amp;gt;options&amp;lt;/var&amp;gt; parameter is not specified or the &amp;lt;code&amp;gt;stream&amp;lt;/code&amp;gt; option is false, then after the final code point is yielded by the stream, the &#039;&#039;&#039;EOF code point&#039;&#039;&#039; is yielded.&lt;br /&gt;
#* The output of the algorithm is a sequence of emitted bytes.&lt;br /&gt;
# Return a &amp;lt;code&amp;gt;Uint8Array&amp;lt;/code&amp;gt; object wrapping an &amp;lt;code&amp;gt;ArrayBuffer&amp;lt;/code&amp;gt; containing the sequence of bytes emitted by the encoding algorithm.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/dl&amp;gt;&lt;br /&gt;
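&lt;br /&gt;
A minimal sketch of the encode method in use follows. It assumes a runtime with a TextEncoder global and uses only the no-argument constructor, which selects the default encoding; the encoding argument and the stream option described in this draft are not assumed to be available.&lt;br /&gt;
&lt;br /&gt;
```javascript
// Called with no arguments, the constructor selects the default encoding, utf-8.
const encoder = new TextEncoder();
const bytes = encoder.encode("h\u00e9");

console.log(encoder.encoding);
console.log(bytes instanceof Uint8Array);
// U+0068 encodes as one byte (68); U+00E9 encodes as two bytes (C3 A9).
console.log(Array.from(bytes));
```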
&lt;br /&gt;
=== TextDecoder ===&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;WebIDL&#039;&#039;&#039;&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
dictionary TextDecoderOptions {&lt;br /&gt;
  boolean fatal = false;&lt;br /&gt;
};&lt;br /&gt;
&lt;br /&gt;
dictionary TextDecodeOptions {&lt;br /&gt;
  boolean stream = false;&lt;br /&gt;
};&lt;br /&gt;
&lt;br /&gt;
[Constructor,&lt;br /&gt;
 Constructor(optional DOMString encoding, optional TextDecoderOptions options)]&lt;br /&gt;
interface TextDecoder {&lt;br /&gt;
  readonly attribute DOMString encoding;&lt;br /&gt;
  DOMString decode(optional ArrayBufferView view, optional TextDecodeOptions options);&lt;br /&gt;
};&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The constructor creates a &#039;&#039;decoder object&#039;&#039;. It has the internal properties &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;, a &amp;lt;var&amp;gt;fatal&amp;lt;/var&amp;gt; flag which is initially unset, and a &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag which is initially unset.&lt;br /&gt;
&lt;br /&gt;
If called without an &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; argument, let &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; be the &#039;&#039;&#039;default encoding&#039;&#039;&#039;.&lt;br /&gt;
&lt;br /&gt;
The constructor follows the steps to get an encoding from [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding], with &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; as &amp;lt;var&amp;gt;label&amp;lt;/var&amp;gt;. If the steps result in failure, a &amp;lt;code&amp;gt;DOMException&amp;lt;/code&amp;gt; of type &amp;lt;code&amp;gt;EncodingError&amp;lt;/code&amp;gt; is thrown. Otherwise, set the &#039;&#039;decoder object&#039;s&#039;&#039; internal &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; property to the returned encoding. &lt;br /&gt;
&lt;br /&gt;
If the constructor is called with an &amp;lt;var&amp;gt;options&amp;lt;/var&amp;gt; argument, and the &amp;lt;var&amp;gt;fatal&amp;lt;/var&amp;gt; property of the dictionary is set, the internal &amp;lt;var&amp;gt;fatal&amp;lt;/var&amp;gt; flag of the &#039;&#039;decoder object&#039;&#039; is set, otherwise the internal &amp;lt;var&amp;gt;fatal&amp;lt;/var&amp;gt; flag is cleared.&lt;br /&gt;
&lt;br /&gt;
Initialize the internal &amp;lt;var&amp;gt;encoding algorithm state&amp;lt;/var&amp;gt; to the default values for the encoding &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;.&lt;br /&gt;
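&lt;br /&gt;
The effect of the fatal flag can be sketched as follows. This assumes a runtime with a TextDecoder global that accepts the options dictionary described above; the helper name decodeStrict is hypothetical, and the sketch catches any exception rather than relying on the specific exception type named in this draft.&lt;br /&gt;
&lt;br /&gt;
```javascript
// Hypothetical helper: returns the decoded string, or null on a decoder error.
function decodeStrict(bytes) {
  try {
    return new TextDecoder("utf-8", { fatal: true }).decode(bytes);
  } catch (e) {
    return null;
  }
}

const invalid = new Uint8Array([0xff]); // 0xFF never appears in valid UTF-8

// With the fatal flag set, the decoder error is thrown and caught above.
console.log(decodeStrict(invalid) === null);
// Without the fatal flag, the decoder emits the fallback code point U+FFFD.
console.log(new TextDecoder("utf-8").decode(invalid) === "\ufffd");
```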
&amp;lt;dl&amp;gt;&lt;br /&gt;
&amp;lt;dt&amp;gt;&amp;lt;code&amp;gt;encoding&amp;lt;/code&amp;gt; of type DOMString, readonly&lt;br /&gt;
&amp;lt;dd&amp;gt;&lt;br /&gt;
Returns the &amp;lt;var&amp;gt;Name&amp;lt;/var&amp;gt; of the decoder object&#039;s &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;, per [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding].&lt;br /&gt;
&lt;br /&gt;
:Note that this may differ from the name of the encoding specified during the call to the constructor. For example, if the constructor is called with &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; of &amp;lt;code&amp;gt;&amp;quot;ascii&amp;quot;&amp;lt;/code&amp;gt; the &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt; attribute of the &#039;&#039;decoder object&#039;&#039; would have the value &amp;lt;code&amp;gt;&amp;quot;windows-1252&amp;quot;&amp;lt;/code&amp;gt; as &amp;lt;code&amp;gt;&amp;quot;ascii&amp;quot;&amp;lt;/code&amp;gt; is a label for that encoding.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;dt&amp;gt;&amp;lt;code&amp;gt;decode&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;dd&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The &amp;lt;code&amp;gt;decode&amp;lt;/code&amp;gt; method runs the following steps:&lt;br /&gt;
&lt;br /&gt;
# If the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag of the &#039;&#039;decoder object&#039;&#039; is not set, then reset the &amp;lt;var&amp;gt;encoding algorithm state&amp;lt;/var&amp;gt; to the default values for encoding &amp;lt;var&amp;gt;encoding&amp;lt;/var&amp;gt;. Otherwise, the &amp;lt;var&amp;gt;encoding algorithm state&amp;lt;/var&amp;gt; is re-used from the previous call to &amp;lt;code&amp;gt;decode&amp;lt;/code&amp;gt; on this object. &lt;br /&gt;
# If the &amp;lt;var&amp;gt;options&amp;lt;/var&amp;gt; parameter is specified and the &amp;lt;var&amp;gt;stream&amp;lt;/var&amp;gt; option is &#039;&#039;&#039;true&#039;&#039;&#039;, then the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag is set; otherwise the internal &amp;lt;var&amp;gt;streaming&amp;lt;/var&amp;gt; flag is cleared.&lt;br /&gt;
# Run the decoder algorithm of the &#039;&#039;decoder object&#039;s&#039;&#039; encoding:&lt;br /&gt;
#* The input to the algorithm is a &amp;lt;var&amp;gt;byte stream&amp;lt;/var&amp;gt;. The &amp;lt;var&amp;gt;byte stream&amp;lt;/var&amp;gt; is provided by the bytes in &amp;lt;var&amp;gt;view.buffer&amp;lt;/var&amp;gt; starting at offset &amp;lt;var&amp;gt;view.byteOffset&amp;lt;/var&amp;gt;. A maximum of &amp;lt;var&amp;gt;view.byteLength&amp;lt;/var&amp;gt; bytes are yielded by the stream from &amp;lt;var&amp;gt;view.buffer&amp;lt;/var&amp;gt;. If &amp;lt;var&amp;gt;view&amp;lt;/var&amp;gt; is not specified, the stream is empty.&lt;br /&gt;
#* If the &amp;lt;var&amp;gt;options&amp;lt;/var&amp;gt; parameter is not specified or the &#039;&#039;&#039;stream&#039;&#039;&#039; option is &#039;&#039;&#039;false&#039;&#039;&#039;, then after &amp;lt;var&amp;gt;view.byteLength&amp;lt;/var&amp;gt; bytes are yielded by the stream, the &#039;&#039;&#039;EOF byte&#039;&#039;&#039; is yielded.&lt;br /&gt;
#* If the internal &amp;lt;var&amp;gt;fatal&amp;lt;/var&amp;gt; flag of the &#039;&#039;decoder object&#039;&#039; is set, then a &#039;&#039;&#039;decoder error&#039;&#039;&#039; causes a &amp;lt;code&amp;gt;DOMException&amp;lt;/code&amp;gt; of type &amp;lt;code&amp;gt;EncodingError&amp;lt;/code&amp;gt; to be thrown rather than emitting a fallback code point.&lt;br /&gt;
#* The output of the algorithm is a sequence of &amp;lt;var&amp;gt;emitted code points&amp;lt;/var&amp;gt;.&lt;br /&gt;
# Return a &amp;lt;code&amp;gt;DOMString&amp;lt;/code&amp;gt; by encoding the &amp;lt;var&amp;gt;sequence of emitted code points&amp;lt;/var&amp;gt; as UTF-16 as per [http://dev.w3.org/2006/webapi/WebIDL/#idl-DOMString WebIDL].&lt;br /&gt;
&lt;br /&gt;
:&#039;&#039;ISSUE: Need to handle BOMs. [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding] specifies this by using the caller-specified encoding as a suggestion, and consuming the BOM as a part of selecting the real encoder where BOM takes precedence. At the very least a matching BOM should be ignored. A mismatching BOM could throw, be a decoding error, or could actually switch the decoder for this stream (call or sequence of calls), possibly if-and-only-if the constructor was called without an encoding.&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/dl&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Examples ==&lt;br /&gt;
&lt;br /&gt;
=== Example #1 - encoding strings ===&lt;br /&gt;
&lt;br /&gt;
The following example uses the API to encode an array of strings into an ArrayBuffer. The result is an ArrayBuffer containing the number of strings (as a Uint32), followed by the length in bytes of the first string (as a Uint32), the encoded string data, the length of the second string (as a Uint32), the string data, and so on.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
function encodeArrayOfStrings(strings, encoding) {&lt;br /&gt;
  var encoder, encoded, len, i, bytes, view, offset;&lt;br /&gt;
&lt;br /&gt;
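  // Create one encoder and reuse it for every string.&lt;br /&gt;
  encoder = new TextEncoder(encoding);&lt;br /&gt;
  encoded = [];&lt;br /&gt;
&lt;br /&gt;
  // Total size: one Uint32 for the count, plus a Uint32 length prefix and the bytes of each string.&lt;br /&gt;
  len = Uint32Array.BYTES_PER_ELEMENT;&lt;br /&gt;
  for (i = 0; i &amp;lt; strings.length; i += 1) {&lt;br /&gt;
    len += Uint32Array.BYTES_PER_ELEMENT;&lt;br /&gt;
    encoded[i] = encoder.encode(strings[i]);&lt;br /&gt;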
    len += encoded[i].byteLength;&lt;br /&gt;
  }&lt;br /&gt;
&lt;br /&gt;
  bytes = new Uint8Array(len);&lt;br /&gt;
  view = new DataView(bytes.buffer);&lt;br /&gt;
  offset = 0;&lt;br /&gt;
&lt;br /&gt;
  view.setUint32(offset, strings.length);&lt;br /&gt;
  offset += Uint32Array.BYTES_PER_ELEMENT;&lt;br /&gt;
  for (i = 0; i &amp;lt; encoded.length; i += 1) {&lt;br /&gt;
    len = encoded[i].byteLength;&lt;br /&gt;
    view.setUint32(offset, len);&lt;br /&gt;
    offset += Uint32Array.BYTES_PER_ELEMENT;&lt;br /&gt;
    bytes.set(encoded[i], offset);&lt;br /&gt;
    offset += len;&lt;br /&gt;
  }&lt;br /&gt;
  return bytes.buffer;&lt;br /&gt;
}&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
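&lt;br /&gt;
As a worked illustration of the layout described above, the following sketch hand-builds the buffer for the single string &amp;quot;hi&amp;quot;. It assumes UTF-8 and uses the &amp;lt;code&amp;gt;TextEncoder&amp;lt;/code&amp;gt; constructor without arguments, as in the later Encoding Standard; all variable names are illustrative.&lt;br /&gt;
&lt;br /&gt;
```javascript
// Hand-build the buffer for ['hi'] using the layout described above.
var text = new TextEncoder().encode('hi');  // UTF-8 bytes 0x68 0x69
var bytes = new Uint8Array(4 + 4 + text.length);
var view = new DataView(bytes.buffer);

view.setUint32(0, 1);            // number of strings, big-endian by default
view.setUint32(4, text.length);  // byte length of the first string
bytes.set(text, 8);              // the encoded string data

var layout = Array.from(bytes);
// layout is [0, 0, 0, 1, 0, 0, 0, 2, 0x68, 0x69]
```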
&lt;br /&gt;
=== Example #2 - decoding strings ===&lt;br /&gt;
&lt;br /&gt;
The following example decodes an ArrayBuffer containing data encoded in the format produced by the previous example back into an array of strings.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
function decodeArrayOfStrings(buffer, encoding) {&lt;br /&gt;
  var decoder, view, offset, num_strings, strings, i, len;&lt;br /&gt;
&lt;br /&gt;
  decoder = new TextDecoder(encoding);&lt;br /&gt;
  view = new DataView(buffer);&lt;br /&gt;
  offset = 0;&lt;br /&gt;
  strings = [];&lt;br /&gt;
&lt;br /&gt;
  // Read the string count, then each length-prefixed string in turn.&lt;br /&gt;
  num_strings = view.getUint32(offset);&lt;br /&gt;
  offset += Uint32Array.BYTES_PER_ELEMENT;&lt;br /&gt;
  for (i = 0; i &amp;lt; num_strings; i += 1) {&lt;br /&gt;
    len = view.getUint32(offset);&lt;br /&gt;
    offset += Uint32Array.BYTES_PER_ELEMENT;&lt;br /&gt;
    // Decode only this string&#039;s bytes, without copying them.&lt;br /&gt;
    strings[i] = decoder.decode(&lt;br /&gt;
      new DataView(view.buffer, offset, len));&lt;br /&gt;
    offset += len;&lt;br /&gt;
  }&lt;br /&gt;
  return strings;&lt;br /&gt;
}&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Encodings ==&lt;br /&gt;
&lt;br /&gt;
Encodings are defined and implemented per [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding]. This implicitly includes the steps to &amp;lt;em&amp;gt;get an encoding&amp;lt;/em&amp;gt; from a string, and the logic for label matching and case-insensitivity.&lt;br /&gt;
&lt;br /&gt;
User agents MUST NOT support any encodings or labels other than those defined in [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding] and the additions defined below, and MUST support all encodings and labels defined in that specification.&lt;br /&gt;
&lt;br /&gt;
:&#039;&#039;NOTE: In [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding], &amp;quot;ascii&amp;quot; is a label for &#039;&#039;&#039;windows-1252&#039;&#039;&#039;; there is no 7-bit-or-raise-exception encoding. Applications that are required to restrict the content of decoded strings should implement validation after decoding.&#039;&#039;&lt;br /&gt;
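&lt;br /&gt;
Validation after decoding could look like the following minimal sketch. The helper name is hypothetical and not part of the proposed API, and the sketch assumes a &amp;lt;code&amp;gt;TextDecoder&amp;lt;/code&amp;gt; implementation that supports the &amp;quot;ascii&amp;quot; label as described above.&lt;br /&gt;
&lt;br /&gt;
```javascript
// Hypothetical helper (not part of the proposed API): returns true only if
// every code unit in the decoded string is in the 7-bit range U+0000..U+007F.
function isSevenBitClean(str) {
  return /^[\x00-\x7F]*$/.test(str);
}

// Byte 0x80 is not ASCII, but windows-1252 maps it to U+20AC (euro sign),
// so decoding with the "ascii" label succeeds and the check must come after.
var decoded = new TextDecoder('ascii').decode(new Uint8Array([0x48, 0x69, 0x80]));
var ok = isSevenBitClean(decoded);  // false: the string contains U+20AC
```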
&lt;br /&gt;
:&#039;&#039;NOTE: Unicode normalization forms are outside the scope of this specification. No normalization is done prior to encoding or after decoding.&#039;&#039;&lt;br /&gt;
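&lt;br /&gt;
The practical consequence can be sketched as follows: two canonically equivalent strings encode to different bytes unless the application normalizes them itself. The sketch assumes UTF-8 and the standard ECMAScript &amp;lt;code&amp;gt;String.prototype.normalize&amp;lt;/code&amp;gt; method, which is outside this specification.&lt;br /&gt;
&lt;br /&gt;
```javascript
// Two canonically equivalent spellings of "e with acute accent".
var composed = '\u00E9';     // U+00E9 LATIN SMALL LETTER E WITH ACUTE
var decomposed = 'e\u0301';  // U+0065 followed by U+0301 COMBINING ACUTE ACCENT

var encoder = new TextEncoder();  // UTF-8

// No normalization happens during encoding, so the byte sequences differ.
var a = Array.from(encoder.encode(composed));    // [0xC3, 0xA9]
var b = Array.from(encoder.encode(decomposed));  // [0x65, 0xCC, 0x81]

// An application that needs a canonical form normalizes explicitly first.
var c = Array.from(encoder.encode(decomposed.normalize('NFC')));  // [0xC3, 0xA9]
```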
&lt;br /&gt;
:&#039;&#039;NOTE: Handling of encoding-specific issues, e.g. over-long UTF-8 encodings, byte order marks, unpaired surrogates, and so on, is defined by [http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html Encoding].&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
=== Additional Encodings ===&lt;br /&gt;
&lt;br /&gt;
The following additional encodings are defined by this specification. They are specific to the methods defined herein.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;table border=1 cellpadding=5&amp;gt;&lt;br /&gt;
&amp;lt;tr&amp;gt;&amp;lt;th&amp;gt;Name&amp;lt;th&amp;gt;Labels&lt;br /&gt;
&amp;lt;tr&amp;gt;&amp;lt;td&amp;gt;binary&amp;lt;td&amp;gt;&amp;quot;binary&amp;quot;&lt;br /&gt;
&amp;lt;/table&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== binary ====&lt;br /&gt;
&lt;br /&gt;
The &#039;&#039;&#039;binary&#039;&#039;&#039; encoding is a single-byte encoding where the input code point and output byte are identical for the range &#039;&#039;&#039;U+0000&#039;&#039;&#039; to &#039;&#039;&#039;U+00FF&#039;&#039;&#039;.&lt;br /&gt;
&lt;br /&gt;
:&#039;&#039;NOTE: This encoding is intended to allow interoperation with legacy code that encodes binary data in ECMAScript strings, for example the WindowBase64 methods [http://dev.w3.org/html5/spec-author-view/webappapis.html#dom-windowbase64-atob atob()] and [http://dev.w3.org/html5/spec-author-view/webappapis.html#dom-windowbase64-btoa btoa()]. It is recommended that new Web applications use Typed Arrays for transmission and storage of binary data.&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
If &#039;&#039;&#039;binary&#039;&#039;&#039; is selected as the encoding, then step 3 of the steps to &#039;&#039;decode a byte stream&#039;&#039; is skipped; no BOM detection is performed.&lt;br /&gt;
&lt;br /&gt;
:&#039;&#039;NOTE: If &amp;lt;code&amp;gt;&amp;quot;binary&amp;quot;&amp;lt;/code&amp;gt; is specified, byte order marks receive no special handling; they are decoded like any other bytes.&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
The &#039;&#039;&#039;binary decoder&#039;&#039;&#039; is:&lt;br /&gt;
# Let &amp;lt;var&amp;gt;byte&amp;lt;/var&amp;gt; be the byte pointer.&lt;br /&gt;
# If &amp;lt;var&amp;gt;byte&amp;lt;/var&amp;gt; is the EOF byte, emit the EOF code point.&lt;br /&gt;
# Increase the byte pointer by one.&lt;br /&gt;
# Emit a code point whose value is &amp;lt;var&amp;gt;byte&amp;lt;/var&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
The &#039;&#039;&#039;binary encoder&#039;&#039;&#039; is:&lt;br /&gt;
# Let &amp;lt;var&amp;gt;code point&amp;lt;/var&amp;gt; be the code point pointer.&lt;br /&gt;
# If &amp;lt;var&amp;gt;code point&amp;lt;/var&amp;gt; is the EOF code point, emit the EOF byte.&lt;br /&gt;
# Increase the code point pointer by one.&lt;br /&gt;
# If &amp;lt;var&amp;gt;code point&amp;lt;/var&amp;gt; is in the range U+0000 to U+00FF, emit a byte whose value is &amp;lt;var&amp;gt;code point&amp;lt;/var&amp;gt;.&lt;br /&gt;
# Otherwise, emit an encoder error.&lt;br /&gt;
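&lt;br /&gt;
The two sets of steps above can be sketched in plain JavaScript as follows. This is an illustration under stated assumptions, not the normative algorithm: the function names are hypothetical, whole inputs are processed at once rather than as a stream, and a thrown &amp;lt;code&amp;gt;RangeError&amp;lt;/code&amp;gt; stands in for the encoder error.&lt;br /&gt;
&lt;br /&gt;
```javascript
// Hypothetical helpers sketching the "binary" steps above; the names are
// illustrative and not part of the proposed API.
function binaryDecode(bytes) {
  var out = '';
  bytes.forEach(function (byte) {
    // Each byte 0x00..0xFF becomes the code point with the same value.
    out += String.fromCharCode(byte);
  });
  return out;
}

function binaryEncode(str) {
  var bytes = new Uint8Array(str.length);
  for (var i = 0; i !== str.length; i += 1) {
    var codePoint = str.charCodeAt(i);
    if (codePoint > 0xFF) {
      // Stands in for the "encoder error" of the steps above.
      throw new RangeError('Code point out of range at index ' + i);
    }
    bytes[i] = codePoint;
  }
  return bytes;
}
```

Because no BOM detection is performed, decoding the bytes 0xEF 0xBB 0xBF with this sketch yields the three code points U+00EF, U+00BB and U+00BF rather than an empty string.&lt;br /&gt;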
&lt;br /&gt;
== References ==&lt;br /&gt;
&lt;br /&gt;
* WebIDL http://dev.w3.org/2006/webapi/WebIDL&lt;br /&gt;
* Encoding http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== Acknowledgements ==&lt;br /&gt;
&lt;br /&gt;
* Alan Chaney&lt;br /&gt;
* Ben Noordhuis&lt;br /&gt;
* Glenn Maynard&lt;br /&gt;
* John Tamplin&lt;br /&gt;
* Kenneth Russell (Google, Inc)&lt;br /&gt;
* Robert Mustacchi&lt;br /&gt;
* Ryan Dahl&lt;br /&gt;
* Anne van Kesteren&lt;br /&gt;
&lt;br /&gt;
== Appendix ==&lt;br /&gt;
&lt;br /&gt;
A &amp;quot;shim&amp;quot; implementation in JavaScript (that may not fully match the current version of the spec) plus some initial unit tests can be found at:&lt;br /&gt;
&lt;br /&gt;
:http://code.google.com/p/stringencoding/&lt;br /&gt;
&lt;br /&gt;
[[Category:Proposals]]&lt;/div&gt;</summary>
		<author><name>Jsbell</name></author>
	</entry>
</feed>