HTML Standard Tracker

Short URL: http://html5.org/r/5814

SVN: 5814
Comment: [Authors] [Gecko] [Internet Explorer] [Opera] [Webkit] Specify window.atob() and .btoa(). (ack Aryeh for the reverse-engineering to do this)
Time (UTC): 2011-02-01 08:18
Index: source
===================================================================
--- source	(revision 5813)
+++ source	(revision 5814)
@@ -2267,6 +2267,13 @@
     is passed an Infinity or Not-a-Number (NaN) value, a
     <code>NOT_SUPPORTED_ERR</code> exception must be raised.</p>
 
+    <p>Except where otherwise specified, if a method has an argument
+    of type <code>DOMString</code>, the user agent must <span
+    title="dfn-obtain-unicode">convert the <code>DOMString</code> to a
+    sequence of Unicode characters</span> when the method is invoked,
+    to obtain the string on which the method is to operate. <a
+    href="#refsWEBIDL">[WEBIDL]</a></p>
+
    </dd>
 
    <dt>JavaScript</dt>
@@ -74012,6 +74019,276 @@
   </div>
 
 
+  <h3 id="atob">Base64 utility methods</h3>
+
+  <p>The <code title="dom-windowbase64-atob">atob()</code> and <code
+  title="dom-windowbase64-btoa">btoa()</code> methods allow authors to
+  transform content to and from the base64 encoding.</p>
+
+  <!-- v2: actual binary support -->
+
+  <pre class="idl">[Supplemental, NoInterfaceObject]
+interface <dfn>WindowBase64</dfn> {
+  DOMString <span title="dom-windowbase64-btoa">btoa</span>(in DOMString btoa);
+  DOMString <span title="dom-windowbase64-atob">atob</span>(in DOMString atob);
+};
+<span>Window</span> implements <span>WindowBase64</span>;</pre>
+
+  <p class="note">In these APIs, for mnemonic purposes, the "b" can be
+  considered to stand for "binary", and the "a" for "ASCII". In
+  practice, though, for primarily historical reasons, both the input
+  and output of these functions are Unicode strings.</p>
+
+  <dl class="domintro">
+
+   <dt><var title="">result</var> = <var title="">window</var> . <code title="dom-windowbase64-btoa">btoa</code>( <var title="">data</var> )</dt>
+
+   <dd>
+
+    <p>Takes the input data, in the form of a Unicode string
+    containing only characters in the range U+0000 to U+00FF, each
+    representing a binary byte with values 0x00 to 0xFF respectively,
+    and converts it to its base64 representation, which it returns.</p>
+
+    <p>Throws an <code>INVALID_CHARACTER_ERR</code> exception if the
+    input string contains any out-of-range characters.</p>
+
+   </dd>
+
+   <dt><var title="">result</var> = <var title="">window</var> . <code title="dom-windowbase64-atob">atob</code>( <var title="">data</var> )</dt>
+
+   <dd>
+
+    <p>Takes the input data, in the form of a Unicode string
+    containing base64-encoded binary data, decodes it, and returns a
+    string consisting of characters in the range U+0000 to U+00FF,
+    each representing a binary byte with values 0x00 to 0xFF
+    respectively, corresponding to that binary data.</p>
+
+    <p>Throws an <code>INVALID_CHARACTER_ERR</code> exception if the
+    input string is not valid base64 data.</p>
+
+   </dd>
+
+  </dl>
+
+  <div class="impl">
+
+  <p class="note">The <code>WindowBase64</code> interface adds to the
+  <code>Window</code> interface and the <code>WorkerUtils</code>
+  interface (part of Web Workers).</p>
+
+  <p>The <dfn title="dom-windowbase64-btoa"><code>btoa()</code></dfn>
+  method must throw an <code>INVALID_CHARACTER_ERR</code> exception if
+  the method's first argument contains any character whose code point
+  is greater than U+00FF. Otherwise, the user agent must convert that
+  argument to a sequence of octets whose <var title="">n</var>th octet
+  is the eight-bit representation of the code point of the <var
+  title="">n</var>th character of the argument, and then must apply
+  the base64 algorithm to that sequence of octets, and return the
+  result. <a href="#refsRFC4648">[RFC4648]</a><!--base64--></p>
+  <!-- Aryeh says: This seems to be what all browsers do as of January
+  2011 (except IE, which doesn't support these functions at all). -->
+
+
+  <p>The <dfn title="dom-windowbase64-atob"><code>atob()</code></dfn>
+  method must run the following steps to parse the string passed in
+  the method's first argument:</p>
+
+  <ol>
+
+   <!-- Aryeh says: Copies Firefox behavior as of January 2011
+   (4.0b8). WebKit is somewhat laxer, and Opera throws no exceptions
+   at all. gsnedders reports Opera's behavior causes site-compat
+   problems, and I figure most sites depend on Firefox if on anything,
+   so go with that. -->
+
+   <li><p>Let <var title="">input</var> be the string being
+   parsed.</p></li>
+
+   <li><p>Let <var title="">position</var> be a pointer into <var
+   title="">input</var>, initially pointing at the start of the
+   string.</p></li>
+
+   <li><p>If the length of <var title="">input</var> divides by 4
+   leaving no remainder, then: if <var title="">input</var> ends with
+   one or two U+003D EQUALS SIGN (=) characters, remove them from <var
+   title="">input</var>.</p></li>
+
+   <li><p>If the length of <var title="">input</var> divides by 4
+   leaving a remainder of 1, throw an
+   <code>INVALID_CHARACTER_ERR</code> exception and abort these
+   steps.</p></li>
+
+   <li>
+
+    <p>If <var title="">input</var> contains a character that is not
+    in the following list of characters and character ranges, throw an
+    <code>INVALID_CHARACTER_ERR</code> exception and abort these
+    steps:</p>
+
+    <ul class="brief">
+     <li>U+002B PLUS SIGN (+)
+     <li>U+002F SOLIDUS (/)
+     <li>U+0030 DIGIT ZERO (0) to U+0039 DIGIT NINE (9)
+     <li>U+0041 LATIN CAPITAL LETTER A to U+005A LATIN CAPITAL LETTER Z
+     <li>U+0061 LATIN SMALL LETTER A to U+007A LATIN SMALL LETTER Z
+    </ul>
+
+   </li>
+
+   <li><p>Let <var title="">output</var> be a string, initially
+   empty.</p></li>
+
+   <li><p>Let <var title="">buffer</var> be a buffer that can have
+   bits appended to it, initially empty.</p></li>
+
+   <li>
+
+    <p>While <var title="">position</var> does not point past the end
+    of <var title="">input</var>, run these substeps:</p>
+
+    <ol>
+
+     <li>
+
+      <p>Find the character pointed to by <var title="">position</var>
+      in the first column of the following table. Let <var
+      title="">n</var> be the number given in the second cell of the
+      same row.</p>
+
+      <div id="base64-table">
+       <table>
+        <thead>
+         <tr>
+          <th>Character
+          <th>Number
+        <tbody>
+         <tr><td>A<td>0
+         <tr><td>B<td>1
+         <tr><td>C<td>2
+         <tr><td>D<td>3
+         <tr><td>E<td>4
+         <tr><td>F<td>5
+         <tr><td>G<td>6
+         <tr><td>H<td>7
+         <tr><td>I<td>8
+         <tr><td>J<td>9
+         <tr><td>K<td>10
+         <tr><td>L<td>11
+         <tr><td>M<td>12
+         <tr><td>N<td>13
+         <tr><td>O<td>14
+         <tr><td>P<td>15
+         <tr><td>Q<td>16
+         <tr><td>R<td>17
+         <tr><td>S<td>18
+         <tr><td>T<td>19
+         <tr><td>U<td>20
+         <tr><td>V<td>21
+         <tr><td>W<td>22
+         <tr><td>X<td>23
+         <tr><td>Y<td>24
+         <tr><td>Z<td>25
+         <tr><td>a<td>26
+         <tr><td>b<td>27
+         <tr><td>c<td>28
+         <tr><td>d<td>29
+         <tr><td>e<td>30
+         <tr><td>f<td>31
+         <tr><td>g<td>32
+         <tr><td>h<td>33
+         <tr><td>i<td>34
+         <tr><td>j<td>35
+         <tr><td>k<td>36
+         <tr><td>l<td>37
+         <tr><td>m<td>38
+         <tr><td>n<td>39
+         <tr><td>o<td>40
+         <tr><td>p<td>41
+         <tr><td>q<td>42
+         <tr><td>r<td>43
+         <tr><td>s<td>44
+         <tr><td>t<td>45
+         <tr><td>u<td>46
+         <tr><td>v<td>47
+         <tr><td>w<td>48
+         <tr><td>x<td>49
+         <tr><td>y<td>50
+         <tr><td>z<td>51
+         <tr><td>0<td>52
+         <tr><td>1<td>53
+         <tr><td>2<td>54
+         <tr><td>3<td>55
+         <tr><td>4<td>56
+         <tr><td>5<td>57
+         <tr><td>6<td>58
+         <tr><td>7<td>59
+         <tr><td>8<td>60
+         <tr><td>9<td>61
+         <tr><td>+<td>62
+         <tr><td>/<td>63
+       </table>
+      </div>
+
+     </li>
+
+     <li><p>Append to <var title="">buffer</var> the six bits
+     corresponding to <var title="">n</var>, most significant bit
+     first.</p></li>
+
+     <li><p>If <var title="">buffer</var> has accumulated 24 bits,
+     interpret them as three 8-bit big-endian numbers. Append the
+     three characters with code points equal to those numbers to <var
+     title="">output</var>, in the same order, and then empty <var
+     title="">buffer</var>.</p></li>
+
+     <li><p>Advance <var title="">position</var> by one
+     character.</p></li>
+
+    </ol>
+
+   </li>
+
+   <li>
+
+    <p>If <var title="">buffer</var> is not empty, it contains either
+    12 or 18 bits. If it contains 12 bits, discard the last four and
+    interpret the remaining eight as an 8-bit big-endian number. If it
+    contains 18 bits, discard the last two and interpret the remaining
+    16 as two 8-bit big-endian numbers. Append the one or two
+    characters with code points equal to those one or two numbers to
+    <var title="">output</var>, in the same order.</p>
+
+    <p>The discarded bits mean that, for instance, <code
+    title="">atob("YQ")</code> and <code title="">atob("YR")</code>
+    both return "<code title="">a</code>".</p>
+
+   </li>
+
+   <li><p>Return <var title="">output</var>.</p></li>
+
+  </ol>
+
+  <!-- Note: this function is defined explicitly here because RFC4648
+  does not specify how to handle erroneous input, and no preexisting
+  browser implementation simply throws an exception on all erroneous
+  input. -->
+
+  <p class="note">Some base64 encoders add newlines or other
+  whitespace to their output. The <code
+  title="dom-windowbase64-atob">atob()</code> method throws an
+  exception if its input contains characters other than those
+  described by the regular expression bracket expression <code
+  title="">[+/=0-9A-Za-z]</code>, so other characters need to be
+  removed before <code title="dom-windowbase64-atob">atob()</code> is
+  used for decoding.</p>
+
+  </div>
+
+
+
+
   <h3 id="timers">Timers</h3>
 
   <p>The <code title="dom-windowtimers-setTimeout">setTimeout()</code>
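
The btoa() and atob() steps added in this revision can be sketched in plain JavaScript as follows. This is a rough illustration, not the text of the specification or any browser's implementation: the names sketchBtoa, sketchAtob, invalidCharacter and ALPHABET are hypothetical, and a plain Error with its name set stands in for the INVALID_CHARACTER_ERR exception.

```javascript
// Hypothetical helper names throughout; this is a sketch of the steps in the
// diff above, not an implementation taken from any browser.

// The 64 characters of the table in the diff, indexed by their number.
const ALPHABET =
  "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

// Assumption: a plain Error stands in for the DOM INVALID_CHARACTER_ERR exception.
function invalidCharacter(message) {
  const err = new Error(message);
  err.name = "INVALID_CHARACTER_ERR";
  return err;
}

function sketchBtoa(data) {
  // Convert each character to one octet; reject code points above U+00FF.
  const octets = [];
  for (let i = 0; i < data.length; i++) {
    const code = data.charCodeAt(i);
    if (code > 0xff) throw invalidCharacter("character out of range");
    octets.push(code);
  }
  // Apply the base64 algorithm: three octets become four characters,
  // with "=" padding when the final group is short.
  let out = "";
  for (let i = 0; i < octets.length; i += 3) {
    const remaining = octets.length - i;
    const b0 = octets[i];
    const b1 = remaining > 1 ? octets[i + 1] : 0;
    const b2 = remaining > 2 ? octets[i + 2] : 0;
    out += ALPHABET[b0 >> 2];
    out += ALPHABET[((b0 & 0x03) << 4) | (b1 >> 4)];
    out += remaining > 1 ? ALPHABET[((b1 & 0x0f) << 2) | (b2 >> 6)] : "=";
    out += remaining > 2 ? ALPHABET[b2 & 0x3f] : "=";
  }
  return out;
}

function sketchAtob(input) {
  // If the length is a multiple of 4, drop one or two trailing "=" characters.
  if (input.length % 4 === 0) {
    input = input.replace(/={1,2}$/, "");
  }
  // A remainder of 1 can never hold a whole byte.
  if (input.length % 4 === 1) throw invalidCharacter("invalid length");
  // Any character outside the base64 alphabet is an error.
  if (/[^+\/0-9A-Za-z]/.test(input)) throw invalidCharacter("invalid character");

  let output = "";
  let buffer = 0; // accumulated bits, at most 24
  let bits = 0;   // number of bits currently held in buffer
  for (let position = 0; position < input.length; position++) {
    const n = ALPHABET.indexOf(input[position]);
    buffer = (buffer << 6) | n;
    bits += 6;
    if (bits === 24) {
      output += String.fromCharCode(
        (buffer >> 16) & 0xff, (buffer >> 8) & 0xff, buffer & 0xff);
      buffer = 0;
      bits = 0;
    }
  }
  // Leftover bits: 12 bits yield one byte, 18 bits yield two; the rest is discarded.
  if (bits === 12) {
    output += String.fromCharCode((buffer >> 4) & 0xff);
  } else if (bits === 18) {
    output += String.fromCharCode((buffer >> 10) & 0xff, (buffer >> 2) & 0xff);
  }
  return output;
}
```

With this sketch, sketchAtob("YQ") and sketchAtob("YR") both return "a", matching the example in the diff, and sketchBtoa("a") returns "YQ==". Whitespace in the input makes sketchAtob throw, so callers would need to strip it first.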
