[t] (0) Remove the requirement that the parser deal with raw surrogates, since they can't make it this far.
Fixing http://www.w3.org/Bugs/Public/show_bug.cgi?id=11298

git-svn-id: http://svn.whatwg.org/webapps@5862 340c8d12-0b0e-0410-8428-c7bf67bfef74
whatwg · Feb 9, 2011 · 3accfd8
1 parent 19cf17d
commit 3accfd8

Showing 3 changed files with 9 additions and 33 deletions.
diff --git a/complete.html b/complete.html
@@ -77607,13 +77607,6 @@ <h5 id=preprocessing-the-input-stream><span class=secno>12.2.2.3 </span>Preproce
   motivated by a desire to increase the resilience of user agents in
   the face of na&iuml;ve transcoders.</p>
 
-  <p>Code points in the range U+D800 to U+DFFF<!-- surrogates are not
-  allowed e.g. in UTF-8, and we don't want them to suddenly turn into
-  code points when they go through a UTF-16 pipe --> in the input must
-  be replaced by U+FFFD REPLACEMENT CHARACTERs. Any occurrences of
-  such characters and code points are <a href=#parse-error title="parse error">parse
-  errors</a>.</p>
-
   <p>Any occurrences of any characters in the ranges U+0001 to U+0008,
   <!-- HT, LF allowed --> <!-- U+000B is in the next list --> <!-- FF,
   CR allowed --> U+000E to U+001F, <!-- ASCII allowed --> U+007F
@@ -80255,10 +80248,9 @@ <h5 id=tokenizing-character-references><span class=secno>12.2.4.69 </span>Tokeni
       <tr><td>0x9E <td>U+017E <td>LATIN SMALL LETTER Z WITH CARON (&#382;)
       <tr><td>0x9F <td>U+0178 <td>LATIN CAPITAL LETTER Y WITH DIAERESIS (&Yuml;)
     </table><p>Otherwise, if the number is in the range 0xD800 to 0xDFFF<!--
-    surrogates not allowed; see the comment in the "preprocessing the
-    input stream" section for details --> or is greater than 0x10FFFF,
-    then this is a <a href=#parse-error>parse error</a>. Return a U+FFFD
-    REPLACEMENT CHARACTER.</p>
+    surrogates --> or is greater than 0x10FFFF, then this is a
+    <a href=#parse-error>parse error</a>. Return a U+FFFD REPLACEMENT
+    CHARACTER.</p>
 
     <p>Otherwise, return a character token for the Unicode character
     whose code point is that number.

diff --git a/index b/index
@@ -73578,13 +73578,6 @@ interface <dfn id=messageport>MessagePort</dfn> {
   motivated by a desire to increase the resilience of user agents in
   the face of na&iuml;ve transcoders.</p>
 
-  <p>Code points in the range U+D800 to U+DFFF<!-- surrogates are not
-  allowed e.g. in UTF-8, and we don't want them to suddenly turn into
-  code points when they go through a UTF-16 pipe --> in the input must
-  be replaced by U+FFFD REPLACEMENT CHARACTERs. Any occurrences of
-  such characters and code points are <a href=#parse-error title="parse error">parse
-  errors</a>.</p>
-
   <p>Any occurrences of any characters in the ranges U+0001 to U+0008,
   <!-- HT, LF allowed --> <!-- U+000B is in the next list --> <!-- FF,
   CR allowed --> U+000E to U+001F, <!-- ASCII allowed --> U+007F
@@ -76226,10 +76219,9 @@ interface <dfn id=messageport>MessagePort</dfn> {
       <tr><td>0x9E <td>U+017E <td>LATIN SMALL LETTER Z WITH CARON (&#382;)
       <tr><td>0x9F <td>U+0178 <td>LATIN CAPITAL LETTER Y WITH DIAERESIS (&Yuml;)
     </table><p>Otherwise, if the number is in the range 0xD800 to 0xDFFF<!--
-    surrogates not allowed; see the comment in the "preprocessing the
-    input stream" section for details --> or is greater than 0x10FFFF,
-    then this is a <a href=#parse-error>parse error</a>. Return a U+FFFD
-    REPLACEMENT CHARACTER.</p>
+    surrogates --> or is greater than 0x10FFFF, then this is a
+    <a href=#parse-error>parse error</a>. Return a U+FFFD REPLACEMENT
+    CHARACTER.</p>
 
     <p>Otherwise, return a character token for the Unicode character
     whose code point is that number.

diff --git a/source b/source
@@ -87882,13 +87882,6 @@ interface <span>WindowLocalStorage</span> {
   motivated by a desire to increase the resilience of user agents in
   the face of na&iuml;ve transcoders.</p>
 
-  <p>Code points in the range U+D800 to U+DFFF<!-- surrogates are not
-  allowed e.g. in UTF-8, and we don't want them to suddenly turn into
-  code points when they go through a UTF-16 pipe --> in the input must
-  be replaced by U+FFFD REPLACEMENT CHARACTERs. Any occurrences of
-  such characters and code points are <span title="parse error">parse
-  errors</span>.</p>
-
   <p>Any occurrences of any characters in the ranges U+0001 to U+0008,
   <!-- HT, LF allowed --> <!-- U+000B is in the next list --> <!-- FF,
   CR allowed --> U+000E to U+001F, <!-- ASCII allowed --> U+007F
@@ -90948,10 +90941,9 @@ interface <span>WindowLocalStorage</span> {
     </table>
 
     <p>Otherwise, if the number is in the range 0xD800 to 0xDFFF<!--
-    surrogates not allowed; see the comment in the "preprocessing the
-    input stream" section for details --> or is greater than 0x10FFFF,
-    then this is a <span>parse error</span>. Return a U+FFFD
-    REPLACEMENT CHARACTER.</p>
+    surrogates --> or is greater than 0x10FFFF, then this is a
+    <span>parse error</span>. Return a U+FFFD REPLACEMENT
+    CHARACTER.</p>
 
     <p>Otherwise, return a character token for the Unicode character
     whose code point is that number.
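
The numeric character reference rule kept by this commit can be illustrated with a minimal sketch. It covers only the two branches visible in the hunk above (the surrogate/out-of-range check and the fall-through case) and omits the earlier 0x80 to 0x9F table lookup. The function name `resolve_numeric_reference` and the `(character, is_parse_error)` return shape are assumptions made for illustration; they do not come from the spec text.

```python
from typing import Tuple

REPLACEMENT_CHARACTER = "\uFFFD"


def resolve_numeric_reference(number: int) -> Tuple[str, bool]:
    """Return (character, is_parse_error) for a parsed reference number.

    Hypothetical helper: only the surrogate / out-of-range branch and the
    final "otherwise" branch from the diff are modelled here.
    """
    # Surrogates (0xD800 to 0xDFFF) and numbers beyond the Unicode range
    # are parse errors and yield U+FFFD REPLACEMENT CHARACTER.
    if 0xD800 <= number <= 0xDFFF or number > 0x10FFFF:
        return REPLACEMENT_CHARACTER, True
    # Otherwise, the result is the character whose code point is that number.
    return chr(number), False
```

Under these assumptions, `resolve_numeric_reference(0xD800)` reports a parse error and returns U+FFFD, while `resolve_numeric_reference(0x41)` returns `"A"` with no error.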