[t] (0) Remove the requirement that the parser deal with raw surrogates, since they can't make it this far.

Fixing http://www.w3.org/Bugs/Public/show_bug.cgi?id=11298

git-svn-id: http://svn.whatwg.org/webapps@5862 340c8d12-0b0e-0410-8428-c7bf67bfef74
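
One plausible reading of "can't make it this far" is that the byte-to-character decoder sits in front of the parser and never emits code points in U+D800 to U+DFFF, so the parser has nothing left to replace. The sketch below uses Python's built-in codecs purely as a stand-in for a user agent's decoder. The helper `contains_surrogate` and the sample byte strings are hypothetical, and this reading of the commit message is an assumption rather than something the commit states.

```python
def contains_surrogate(text: str) -> bool:
    """Return True if any code point in text lies in U+D800..U+DFFF."""
    return any(0xD800 <= ord(ch) <= 0xDFFF for ch in text)


# Hypothetical inputs: each carries a lone high surrogate (U+D800)
# between the letters "a" and "b".
# ED A0 80 is what a naive UTF-8 encoding of U+D800 would look like;
# it is ill-formed UTF-8.
utf8_with_surrogate = b"a\xed\xa0\x80b"
# 0xD800 as a single UTF-16LE code unit with no low surrogate after it.
utf16_with_lone_surrogate = b"a\x00\x00\xd8b\x00"

# errors="replace" substitutes U+FFFD for malformed input instead of raising.
decoded_utf8 = utf8_with_surrogate.decode("utf-8", errors="replace")
decoded_utf16 = utf16_with_lone_surrogate.decode("utf-16-le", errors="replace")

# Neither decoded string carries a surrogate code point to the next stage.
print(contains_surrogate(decoded_utf8), contains_surrogate(decoded_utf16))
```

A decoder that behaves this way makes a separate surrogate-replacement step in the input stream preprocessing redundant, which fits the paragraph this commit deletes.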
Hixie committed Feb 9, 2011
1 parent 19cf17d commit 3accfd8
Showing 3 changed files with 9 additions and 33 deletions.
14 changes: 3 additions & 11 deletions complete.html
@@ -77607,13 +77607,6 @@ <h5 id=preprocessing-the-input-stream><span class=secno>12.2.2.3 </span>Preproce
motivated by a desire to increase the resilience of user agents in
the face of na&iuml;ve transcoders.</p>

-<p>Code points in the range U+D800 to U+DFFF<!-- surrogates are not
-allowed e.g. in UTF-8, and we don't want them to suddenly turn into
-code points when they go through a UTF-16 pipe --> in the input must
-be replaced by U+FFFD REPLACEMENT CHARACTERs. Any occurrences of
-such characters and code points are <a href=#parse-error title="parse error">parse
-errors</a>.</p>
-
<p>Any occurrences of any characters in the ranges U+0001 to U+0008,
<!-- HT, LF allowed --> <!-- U+000B is in the next list --> <!-- FF,
CR allowed --> U+000E to U+001F, <!-- ASCII allowed --> U+007F
@@ -80255,10 +80248,9 @@ <h5 id=tokenizing-character-references><span class=secno>12.2.4.69 </span>Tokeni
<tr><td>0x9E <td>U+017E <td>LATIN SMALL LETTER Z WITH CARON (&#382;)
<tr><td>0x9F <td>U+0178 <td>LATIN CAPITAL LETTER Y WITH DIAERESIS (&Yuml;)
</table><p>Otherwise, if the number is in the range 0xD800 to 0xDFFF<!--
-surrogates not allowed; see the comment in the "preprocessing the
-input stream" section for details --> or is greater than 0x10FFFF,
-then this is a <a href=#parse-error>parse error</a>. Return a U+FFFD
-REPLACEMENT CHARACTER.</p>
+surrogates --> or is greater than 0x10FFFF, then this is a
+<a href=#parse-error>parse error</a>. Return a U+FFFD REPLACEMENT
+CHARACTER.</p>

<p>Otherwise, return a character token for the Unicode character
whose code point is that number.
14 changes: 3 additions & 11 deletions index
@@ -73578,13 +73578,6 @@ interface <dfn id=messageport>MessagePort</dfn> {
motivated by a desire to increase the resilience of user agents in
the face of na&iuml;ve transcoders.</p>

-<p>Code points in the range U+D800 to U+DFFF<!-- surrogates are not
-allowed e.g. in UTF-8, and we don't want them to suddenly turn into
-code points when they go through a UTF-16 pipe --> in the input must
-be replaced by U+FFFD REPLACEMENT CHARACTERs. Any occurrences of
-such characters and code points are <a href=#parse-error title="parse error">parse
-errors</a>.</p>
-
<p>Any occurrences of any characters in the ranges U+0001 to U+0008,
<!-- HT, LF allowed --> <!-- U+000B is in the next list --> <!-- FF,
CR allowed --> U+000E to U+001F, <!-- ASCII allowed --> U+007F
@@ -76226,10 +76219,9 @@ interface <dfn id=messageport>MessagePort</dfn> {
<tr><td>0x9E <td>U+017E <td>LATIN SMALL LETTER Z WITH CARON (&#382;)
<tr><td>0x9F <td>U+0178 <td>LATIN CAPITAL LETTER Y WITH DIAERESIS (&Yuml;)
</table><p>Otherwise, if the number is in the range 0xD800 to 0xDFFF<!--
-surrogates not allowed; see the comment in the "preprocessing the
-input stream" section for details --> or is greater than 0x10FFFF,
-then this is a <a href=#parse-error>parse error</a>. Return a U+FFFD
-REPLACEMENT CHARACTER.</p>
+surrogates --> or is greater than 0x10FFFF, then this is a
+<a href=#parse-error>parse error</a>. Return a U+FFFD REPLACEMENT
+CHARACTER.</p>

<p>Otherwise, return a character token for the Unicode character
whose code point is that number.
14 changes: 3 additions & 11 deletions source
@@ -87882,13 +87882,6 @@ interface <span>WindowLocalStorage</span> {
motivated by a desire to increase the resilience of user agents in
the face of na&iuml;ve transcoders.</p>

-<p>Code points in the range U+D800 to U+DFFF<!-- surrogates are not
-allowed e.g. in UTF-8, and we don't want them to suddenly turn into
-code points when they go through a UTF-16 pipe --> in the input must
-be replaced by U+FFFD REPLACEMENT CHARACTERs. Any occurrences of
-such characters and code points are <span title="parse error">parse
-errors</span>.</p>
-
<p>Any occurrences of any characters in the ranges U+0001 to U+0008,
<!-- HT, LF allowed --> <!-- U+000B is in the next list --> <!-- FF,
CR allowed --> U+000E to U+001F, <!-- ASCII allowed --> U+007F
@@ -90948,10 +90941,9 @@ interface <span>WindowLocalStorage</span> {
</table>

<p>Otherwise, if the number is in the range 0xD800 to 0xDFFF<!--
-surrogates not allowed; see the comment in the "preprocessing the
-input stream" section for details --> or is greater than 0x10FFFF,
-then this is a <span>parse error</span>. Return a U+FFFD
-REPLACEMENT CHARACTER.</p>
+surrogates --> or is greater than 0x10FFFF, then this is a
+<span>parse error</span>. Return a U+FFFD REPLACEMENT
+CHARACTER.</p>

<p>Otherwise, return a character token for the Unicode character
whose code point is that number.
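
The numeric character reference rule that survives this change can be sketched as follows. This is a minimal illustration that assumes the earlier branches of the algorithm (such as the 0x80 to 0x9F table lookup) did not match, and it models only the two "Otherwise" branches visible in the hunk. The function name `resolve_numeric_reference` and its return shape are hypothetical, and any checks that follow the visible lines in the spec are not modelled.

```python
from typing import Tuple

REPLACEMENT_CHARACTER = "\uFFFD"


def resolve_numeric_reference(number: int) -> Tuple[str, bool]:
    """Map an already-parsed reference number to (character, parse_error).

    parse_error reports only whether this rule flags one; other rules in
    the spec may flag additional parse errors that are not modelled here.
    """
    # Surrogates (0xD800..0xDFFF) and numbers beyond the Unicode range
    # are a parse error and yield U+FFFD REPLACEMENT CHARACTER.
    if 0xD800 <= number <= 0xDFFF or number > 0x10FFFF:
        return REPLACEMENT_CHARACTER, True
    # Otherwise, return the character whose code point is that number.
    return chr(number), False


# "&#xD800;" would reach this step as 0xD800, "&#x41;" as 0x41.
print(resolve_numeric_reference(0xD800))
print(resolve_numeric_reference(0x41))
```

This check stays in the spec even though the input stream rule is removed: a reference such as `&#xD800;` is written with ordinary ASCII characters, so it passes through decoding untouched and only becomes a surrogate number at this step.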