HTML Standard Tracker

SVN | Bug | Comment | Time (UTC)
943 | | [Conformance Checkers] [Internet Explorer] [Opera] [Webkit] [Tools] Be explicit about what an invalid Unicode character is. | 2007-06-22 01:44
@@ -32330,27 +32330,28 @@ function receiver(e) {
       <tr><td>0x98 <td>U+02DC <td>SMALL TILDE ('&#x02DC')
       <tr><td>0x99 <td>U+2122 <td>TRADE MARK SIGN ('&#x2122')
       <tr><td>0x9A <td>U+0161 <td>LATIN SMALL LETTER S WITH CARON ('&#x0161')
       <tr><td>0x9B <td>U+203A <td>SINGLE RIGHT-POINTING ANGLE QUOTATION MARK ('&#x203A')
       <tr><td>0x9C <td>U+0153 <td>LATIN SMALL LIGATURE OE ('&#x0153')
       <tr><td>0x9D <td>U+FFFD <td>REPLACEMENT CHARACTER
       <tr><td>0x9E <td>U+017E <td>LATIN SMALL LETTER Z WITH CARON ('&#x017E')
       <tr><td>0x9F <td>U+0178 <td>LATIN CAPITAL LETTER Y WITH DIAERESIS ('&#x0178')
     </table>
 
-    <p>Otherwise, if the number is not a valid Unicode character
-    (e.g. if the number is higher than 1114111), or if the number is
-    zero, then return a character token for the U+FFFD REPLACEMENT
+    <p>Otherwise, if the number is zero, if the number is higher than
+    0x10FFFF, or if it's one of the surrogate characters (characters
+    in the range 0xD800 to 0xDFFF), then this is a <span>parse
+    error</span>; return a character token for the U+FFFD REPLACEMENT
     CHARACTER character instead.</p>
 
     <p>Otherwise, return a character token for the Unicode character
-    whose code point is that number.
+    whose code point is that number.</p>
 
    </dd>
 
 
    <dt>Anything else</dt>
 
    <dd>
 
     <p>Consume the maximum number of characters possible, with the
     consumed characters case-sensitively matching one of the

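The revised rule in the hunk above can be sketched in Python as follows. This is an illustrative sketch only, not the specification's algorithm or any browser's implementation. The names `TABLE_OVERRIDES`, `is_invalid_code_point`, and `character_for_number` are hypothetical, and the override table holds only the rows visible in the diff context (0x98 to 0x9F), not the full table. The old upper bound of 1114111 is the same value as 0x10FFFF, so the behavioural change is the explicit treatment of surrogates and the parse error.

```python
REPLACEMENT_CHARACTER = "\uFFFD"

# Partial table: only the rows visible in the diff context above.
# The full table in the specification has more entries.
TABLE_OVERRIDES = {
    0x98: "\u02DC",  # SMALL TILDE
    0x99: "\u2122",  # TRADE MARK SIGN
    0x9A: "\u0161",  # LATIN SMALL LETTER S WITH CARON
    0x9B: "\u203A",  # SINGLE RIGHT-POINTING ANGLE QUOTATION MARK
    0x9C: "\u0153",  # LATIN SMALL LIGATURE OE
    0x9D: "\uFFFD",  # REPLACEMENT CHARACTER
    0x9E: "\u017E",  # LATIN SMALL LETTER Z WITH CARON
    0x9F: "\u0178",  # LATIN CAPITAL LETTER Y WITH DIAERESIS
}


def is_invalid_code_point(number: int) -> bool:
    """True for the cases the revised text treats as a parse error:
    zero, above 0x10FFFF, or a surrogate (0xD800 to 0xDFFF)."""
    return number == 0 or number > 0x10FFFF or 0xD800 <= number <= 0xDFFF


def character_for_number(number: int) -> str:
    """Return the character for a numeric character reference value."""
    # Table lookup comes first, as in the text preceding the changed paragraph.
    if number in TABLE_OVERRIDES:
        return TABLE_OVERRIDES[number]
    # Invalid values map to U+FFFD; a real tokenizer would also report
    # a parse error here.
    if is_invalid_code_point(number):
        return REPLACEMENT_CHARACTER
    # Otherwise, the character whose code point is that number.
    return chr(number)
```

For example, `character_for_number(0x41)` returns `"A"`, while `character_for_number(0xD800)` and `character_for_number(0x110000)` both return U+FFFD.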