HTML5 Tracker

Short URL: http://html5.org/r/2729

SVN: 2729
Bug: (none)
Comment: Minor editorial fixes to the parser section (credit: ey)
Time (UTC): 2009-02-01 01:56
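
The first hunk of this revision moves U+000B out of the range list and into the list of individual characters, without changing which characters are parse errors in the input stream. A minimal sketch of that check follows, assuming the ranges exactly as listed in the revised paragraph; the names `_DISALLOWED_RANGES` and `is_parse_error_code_point` are hypothetical and do not come from the specification.

```python
# Hypothetical helper illustrating the input-stream character list.
# Ranges copied from the revised paragraph (inclusive on both ends).
_DISALLOWED_RANGES = (
    (0x0001, 0x0008),
    (0x000E, 0x001F),
    (0x007F, 0x009F),  # includes U+0085 NEL
    (0xD800, 0xDFFF),  # surrogates
    (0xFDD0, 0xFDEF),
)


def is_parse_error_code_point(cp: int) -> bool:
    """Return True if cp is in the input-stream parse-error list."""
    if cp == 0x000B:  # listed individually after this revision
        return True
    if any(lo <= cp <= hi for lo, hi in _DISALLOWED_RANGES):
        return True
    # U+FFFE and U+FFFF, plus the last two code points of planes 1 to 16.
    return 0 <= cp <= 0x10FFFF and (cp & 0xFFFF) in (0xFFFE, 0xFFFF)


if __name__ == "__main__":
    for cp in (0x0009, 0x000A, 0x000B, 0x000C, 0x000D, 0x0085, 0x1FFFE):
        print(f"U+{cp:04X}", is_parse_error_code_point(cp))
```

The character-reference list changed later in the diff differs only in that its first range starts at 0x0000 and that numbers above 0x10FFFF are also rejected; this sketch covers the input-stream variant only.
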
Index: source
===================================================================
--- source	(revision 2728)
+++ source	(revision 2729)
@@ -54166,17 +54166,18 @@
   a <span>parse error</span>.</p>
 
   <p>Any occurrences of any characters in the ranges U+0001 to U+0008,
-  <!-- HT, LF allowed --> U+000B, <!-- FF, CR allowed --> U+000E to
-  U+001F, <!-- ASCII allowed --> U+007F <!--to U+0084, (U+0085 NEL not
-  allowed), U+0086--> to U+009F, U+D800 to U+DFFF<!-- surrogates not
-  allowed -->, U+FDD0 to U+FDEF, and characters U+FFFE, U+FFFF,
-  U+1FFFE, U+1FFFF, U+2FFFE, U+2FFFF, U+3FFFE, U+3FFFF, U+4FFFE,
-  U+4FFFF, U+5FFFE, U+5FFFF, U+6FFFE, U+6FFFF, U+7FFFE, U+7FFFF,
-  U+8FFFE, U+8FFFF, U+9FFFE, U+9FFFF, U+AFFFE, U+AFFFF, U+BFFFE,
-  U+BFFFF, U+CFFFE, U+CFFFF, U+DFFFE, U+DFFFF, U+EFFFE, U+EFFFF,
-  U+FFFFE, U+FFFFF, U+10FFFE, and U+10FFFF are <span title="parse
-  error">parse errors</span>. (These are all control characters or
-  permanently undefined Unicode characters.)</p>
+  <!-- HT, LF allowed --> <!-- U+000B is in the next list --> <!-- FF,
+  CR allowed --> U+000E to U+001F, <!-- ASCII allowed --> U+007F
+  <!--to U+0084, (U+0085 NEL not allowed), U+0086--> to U+009F, U+D800
+  to U+DFFF<!-- surrogates not allowed -->, U+FDD0 to U+FDEF, and
+  characters U+000B, U+FFFE, U+FFFF, U+1FFFE, U+1FFFF, U+2FFFE,
+  U+2FFFF, U+3FFFE, U+3FFFF, U+4FFFE, U+4FFFF, U+5FFFE, U+5FFFF,
+  U+6FFFE, U+6FFFF, U+7FFFE, U+7FFFF, U+8FFFE, U+8FFFF, U+9FFFE,
+  U+9FFFF, U+AFFFE, U+AFFFF, U+BFFFE, U+BFFFF, U+CFFFE, U+CFFFF,
+  U+DFFFE, U+DFFFF, U+EFFFE, U+EFFFF, U+FFFFE, U+FFFFF, U+10FFFE, and
+  U+10FFFF are <span title="parse error">parse errors</span>. (These
+  are all control characters or permanently undefined Unicode
+  characters.)</p>
 
   <p>U+000D CARRIAGE RETURN (CR) characters and U+000A LINE FEED (LF)
   characters are treated specially. Any CR characters that are
@@ -54188,8 +54189,9 @@
 
   <p>The <dfn>next input character</dfn> is the first character in the
   input stream that has not yet been <dfn>consumed</dfn>. Initially,
-  the <em>next input character</em> is the first character in the
-  input.</p>
+  the <i>next input character</i> is the first character in the
+  input. The <dfn>current input character</dfn> is the last character
+  to have been <i>consumed</i>.</p>
 
   <p>The <dfn>insertion point</dfn> is the position (just before a
   character or just before the end of the input stream) where content
@@ -54915,7 +54917,7 @@
     <p>Consume the <span>next input character</span>. If it is a
     U+002F SOLIDUS (/) character, switch to the <span>close tag open
     state</span>. Otherwise, emit a U+003C LESS-THAN SIGN character
-    token and reconsume the current input character in the
+    token and reconsume the <span>current input character</span> in the
     <span>data state</span>.</p>
 
    </dd>
@@ -54959,7 +54961,7 @@
 
      <dt>Anything else</dt>
      <dd><span>Parse error</span>. Emit a U+003C LESS-THAN SIGN
-     character token and reconsume the current input character in the
+     character token and reconsume the <span>current input character</span> in the
      <span>data state</span>.</dd>
 
     </dl>
@@ -55051,7 +55053,7 @@
    state</span>.</dd>
 
    <dt>U+0041 LATIN CAPITAL LETTER A through to U+005A LATIN CAPITAL LETTER Z</dt>
-   <dd>Append the lowercase version of the current input character
+   <dd>Append the lowercase version of the <span>current input character</span>
    (add 0x0020 to the character's code point) to the current tag
    token's tag name. Stay in the <span>tag name state</span>.</dd>
 
@@ -55061,7 +55063,7 @@
    state</span>.</dd>
 
    <dt>Anything else</dt>
-   <dd>Append the current input character to the current tag token's
+   <dd>Append the <span>current input character</span> to the current tag token's
    tag name. Stay in the <span>tag name state</span>.</dd>
 
   </dl>
@@ -55089,8 +55091,8 @@
 
    <dt>U+0041 LATIN CAPITAL LETTER A through to U+005A LATIN CAPITAL LETTER Z</dt>
    <dd>Start a new attribute in the current tag token. Set that
-   attribute's name to the lowercase version of the current input
-   character (add 0x0020 to the character's code point), and its
+   attribute's name to the lowercase version of the <span>current input
+   character</span> (add 0x0020 to the character's code point), and its
    value to the empty string. Switch to the <span>attribute name
    state</span>.</dd>
 
@@ -55107,7 +55109,7 @@
 
    <dt>Anything else</dt>
    <dd>Start a new attribute in the current tag token. Set that
-   attribute's name to the current input character, and its value to
+   attribute's name to the <span>current input character</span>, and its value to
    the empty string. Switch to the <span>attribute name
    state</span>.</dd>
 
@@ -55138,7 +55140,7 @@
    state</span>.</dd>
 
    <dt>U+0041 LATIN CAPITAL LETTER A through to U+005A LATIN CAPITAL LETTER Z</dt>
-   <dd>Append the lowercase version of the current input character
+   <dd>Append the lowercase version of the <span>current input character</span>
    (add 0x0020 to the character's code point) to the current
    attribute's name. Stay in the <span>attribute name
    state</span>.</dd>
@@ -55154,7 +55156,7 @@
    state</span>.</dd>
 
    <dt>Anything else</dt>
-   <dd>Append the current input character to the current attribute's
+   <dd>Append the <span>current input character</span> to the current attribute's
    name. Stay in the <span>attribute name state</span>.</dd>
 
   </dl>
@@ -55193,7 +55195,7 @@
 
    <dt>U+0041 LATIN CAPITAL LETTER A through to U+005A LATIN CAPITAL LETTER Z</dt>
    <dd>Start a new attribute in the current tag token. Set that
-   attribute's name to the lowercase version of the current input character
+   attribute's name to the lowercase version of the <span>current input character</span>
    (add 0x0020 to the character's code point), and its value to
    the empty string. Switch to the <span>attribute name
    state</span>.</dd>
@@ -55210,7 +55212,7 @@
 
    <dt>Anything else</dt>
    <dd>Start a new attribute in the current tag token. Set that
-   attribute's name to the current input character, and its value to
+   attribute's name to the <span>current input character</span>, and its value to
    the empty string. Switch to the <span>attribute name
    state</span>.</dd>
 
@@ -55254,7 +55256,7 @@
    state</span>.</dd>
 
    <dt>Anything else</dt>
-   <dd>Append the current input character to the current attribute's
+   <dd>Append the <span>current input character</span> to the current attribute's
    value. Switch to the <span>attribute value (unquoted)
    state</span>.</dd>
 
@@ -55282,7 +55284,7 @@
    state</span>.</dd>
 
    <dt>Anything else</dt>
-   <dd>Append the current input character to the current attribute's
+   <dd>Append the <span>current input character</span> to the current attribute's
    value. Stay in the <span>attribute value (double-quoted)
    state</span>.</dd>
 
@@ -55310,7 +55312,7 @@
    state</span>.</dd>
 
    <dt>Anything else</dt>
-   <dd>Append the current input character to the current attribute's
+   <dd>Append the <span>current input character</span> to the current attribute's
    value. Stay in the <span>attribute value (single-quoted)
    state</span>.</dd>
 
@@ -55351,7 +55353,7 @@
    state</span>.</dd>
 
    <dt>Anything else</dt>
-   <dd>Append the current input character to the current attribute's
+   <dd>Append the <span>current input character</span> to the current attribute's
    value. Stay in the <span>attribute value (unquoted)
    state</span>.</dd>
 
@@ -55658,7 +55660,7 @@
 
    <dt>Anything else</dt>
    <dd>Create a new DOCTYPE token. Set the token's name to the
-   current input character. Switch to the <span>DOCTYPE name
+   <span>current input character</span>. Switch to the <span>DOCTYPE name
    state</span>.</dd>
 
   </dl>
@@ -55692,7 +55694,7 @@
    Reconsume the EOF character in the <span>data state</span>.</dd>
 
    <dt>Anything else</dt>
-   <dd>Append the current input character to the current DOCTYPE
+   <dd>Append the <span>current input character</span> to the current DOCTYPE
    token's name. Stay in the <span>DOCTYPE name state</span>.</dd>
 
   </dl>
@@ -55723,12 +55725,13 @@
    <dt>Anything else</dt>
    <dd>
 
-    <p>If the next six characters are an <span>ASCII
-    case-insensitive</span> match for the word "PUBLIC", then consume
-    those characters and switch to the <span>before DOCTYPE public
-    identifier state</span>.</p>
+    <p>If the six characters starting from the <span>current input
+    character</span> are an <span>ASCII case-insensitive</span> match
+    for the word "PUBLIC", then consume those characters and switch to
+    the <span>before DOCTYPE public identifier state</span>.</p>
 
-    <p>Otherwise, if the next six characters are an <span>ASCII
+    <p>Otherwise, if the six characters starting from the
+    <span>current input character</span> are an <span>ASCII
     case-insensitive</span> match for the word "SYSTEM", then consume
     those characters and switch to the <span>before DOCTYPE system
     identifier state</span>.</p>
@@ -55803,7 +55806,7 @@
    Reconsume the EOF character in the <span>data state</span>.</dd>
 
    <dt>Anything else</dt>
-   <dd>Append the current input character to the current DOCTYPE
+   <dd>Append the <span>current input character</span> to the current DOCTYPE
    token's public identifier. Stay in the <span>DOCTYPE public
    identifier (double-quoted) state</span>.</dd>
 
@@ -55830,7 +55833,7 @@
    Reconsume the EOF character in the <span>data state</span>.</dd>
 
    <dt>Anything else</dt>
-   <dd>Append the current input character to the current DOCTYPE
+   <dd>Append the <span>current input character</span> to the current DOCTYPE
    token's public identifier. Stay in the <span>DOCTYPE public
    identifier (single-quoted) state</span>.</dd>
 
@@ -55938,7 +55941,7 @@
    Reconsume the EOF character in the <span>data state</span>.</dd>
 
    <dt>Anything else</dt>
-   <dd>Append the current input character to the current DOCTYPE
+   <dd>Append the <span>current input character</span> to the current DOCTYPE
    token's system identifier. Stay in the <span>DOCTYPE system
    identifier (double-quoted) state</span>.</dd>
 
@@ -55965,7 +55968,7 @@
    Reconsume the EOF character in the <span>data state</span>.</dd>
 
    <dt>Anything else</dt>
-   <dd>Append the current input character to the current DOCTYPE
+   <dd>Append the <span>current input character</span> to the current DOCTYPE
    token's system identifier. Stay in the <span>DOCTYPE system
    identifier (single-quoted) state</span>.</dd>
 
@@ -56179,18 +56182,18 @@
     <!-- this is the same as the equivalent list in the input stream
     section, except it has 0x0000 included in the first range. -->
     <p>Otherwise, if the number is in the range 0x0000 to 0x0008, <!--
-    HT, LF allowed --> 0x000B, <!-- FF, CR allowed --> 0x000E to
-    0x001F, <!-- ASCII allowed --> 0x007F <!--to 0x0084, (0x0085 NEL
-    not allowed), 0x0086--> to 0x009F, 0xD800 to 0xDFFF<!--
-    surrogates not allowed -->, 0xFDD0 to 0xFDEF, or is one of 0xFFFE,
-    0xFFFF, 0x1FFFE, 0x1FFFF, 0x2FFFE, 0x2FFFF, 0x3FFFE, 0x3FFFF,
-    0x4FFFE, 0x4FFFF, 0x5FFFE, 0x5FFFF, 0x6FFFE, 0x6FFFF, 0x7FFFE,
-    0x7FFFF, 0x8FFFE, 0x8FFFF, 0x9FFFE, 0x9FFFF, 0xAFFFE, 0xAFFFF,
-    0xBFFFE, 0xBFFFF, 0xCFFFE, 0xCFFFF, 0xDFFFE, 0xDFFFF, 0xEFFFE,
-    0xEFFFF, 0xFFFFE, 0xFFFFF, 0x10FFFE, or 0x10FFFF, or is higher
-    than 0x10FFFF, then this is a <span>parse error</span>; return a
-    character token for the U+FFFD REPLACEMENT CHARACTER character
-    instead.</p>
+    HT, LF allowed --> <!-- U+000B is in the next list --> <!-- FF, CR
+    allowed --> 0x000E to 0x001F, <!-- ASCII allowed --> 0x007F <!--to
+    0x0084, (0x0085 NEL not allowed), 0x0086--> to 0x009F, 0xD800 to
+    0xDFFF<!-- surrogates not allowed -->, 0xFDD0 to 0xFDEF, or is one
+    of 0x000B, 0xFFFE, 0xFFFF, 0x1FFFE, 0x1FFFF, 0x2FFFE, 0x2FFFF,
+    0x3FFFE, 0x3FFFF, 0x4FFFE, 0x4FFFF, 0x5FFFE, 0x5FFFF, 0x6FFFE,
+    0x6FFFF, 0x7FFFE, 0x7FFFF, 0x8FFFE, 0x8FFFF, 0x9FFFE, 0x9FFFF,
+    0xAFFFE, 0xAFFFF, 0xBFFFE, 0xBFFFF, 0xCFFFE, 0xCFFFF, 0xDFFFE,
+    0xDFFFF, 0xEFFFE, 0xEFFFF, 0xFFFFE, 0xFFFFF, 0x10FFFE, or
+    0x10FFFF, or is higher than 0x10FFFF, then this is a <span>parse
+    error</span>; return a character token for the U+FFFD REPLACEMENT
+    CHARACTER character instead.</p>
 
     <p>Otherwise, return a character token for the Unicode character
     whose code point is that number.</p>
@@ -61515,33 +61518,33 @@
   Flanagan, David H&aring;s&auml;ther, David Hyatt, David Smith, David
   Woolley, Dean Edridge, Debi Orton, Derek Featherstone, DeWitt
   Clinton, Dimitri Glazkov, dolphinling, Doron Rosenberg, Doug Kramer,
-  Edward O'Connor, Eira Monstad, Elliotte Harold, Eric Carlson, Eric
-  Law, Erik Arvidsson, Evan Martin, Evan Prodromou, fantasai, Felix
-  Sasaki, Franck 'Shift' Qu&eacute;lain, Garrett Smith, Geoffrey
-  Garen, Geoffrey Sneddon, George Lund, H&aring;kon Wium Lie, Hans
-  S. T&oslash;mmerhalt, Henri Sivonen, Henrik Lied, Henry Mason, Hugh
-  Winkler, Ignacio Javier, Ivo Emanuel Gon&ccedil;alves, J. King,
-  Jacques Distler, James Graham, James Justin Harrell, James M Snell,
-  James Perrett, Jan-Klaas Kollhof, Jason White, Jasper Bryant-Greene,
-  Jed Hartman, Jeff Cutsinger, Jeff Schiller, Jeff Walden, Jens
-  Bannmann, Jens Fendler, Jeroen van der Meer, Jim Jewett, Jim Meehan,
-  Joe Clark, John Fallows, Joseph Kesselman, Jjgod Jiang, Joel
-  Spolsky, Johan Herland, John Boyer, John Bussjaeger, John Harding,
-  Johnny Stenback, Jon Gibbins, Jon Perlow, Jonathan Worent, Jorgen
-  Horstink, Josh Levenberg, Joshua Randall, Jukka K. Korpela, Jules
-  Cl&eacute;ment-Ripoche, Julian Reschke, Kai Hendry, Kartikaya Gupta,
-  <!-- Keryx Web, = Lars Gunther --> Kornel Lesinski,
-  &#x9ed2;&#x6fa4;&#x525b;&#x5fd7; (KUROSAWA Takeshi), Kristof
-  Zelechovski, Kyle Hofmann, Lachlan Hunt, Larry Page, Lars Gunther,
-  Laura L. Carlson, Laura Wisewell, Laurens Holst, Lee Kowalkowski,
-  Leif Halvard Silli, Lenny Domnitser, L&eacute;onard Bouchet, Leons
-  Petrazickis, Logan<!-- on moz irc -->, Loune, Maciej Stachowiak,
-  Magnus Kristiansen<!-- Dashiva -->, Maik Merten, Malcolm Rowe, Mark
-  Nottingham, Mark Rowe<!--bdash-->, Mark Schenk, Martijn Wargers,
-  Martin Atkins, Martin D&uuml;rst, Martin Honnen, Masataka Yakura,
-  Mathieu Henri, Matthew Gregan, Matthew Mastracci, Matthew Raymond,
-  Matthew Thomas, Mattias Waldau, Max Romantschuk, Michael 'Ratt'
-  Iannarelli, Michael A. Nachbaur, Michael A. Puls
+  Edward O'Connor, Edward Z. Yang, Eira Monstad, Elliotte Harold, Eric
+  Carlson, Eric Law, Erik Arvidsson, Evan Martin, Evan Prodromou,
+  fantasai, Felix Sasaki, Franck 'Shift' Qu&eacute;lain, Garrett
+  Smith, Geoffrey Garen, Geoffrey Sneddon, George Lund, H&aring;kon
+  Wium Lie, Hans S. T&oslash;mmerhalt, Henri Sivonen, Henrik Lied,
+  Henry Mason, Hugh Winkler, Ignacio Javier, Ivo Emanuel
+  Gon&ccedil;alves, J. King, Jacques Distler, James Graham, James
+  Justin Harrell, James M Snell, James Perrett, Jan-Klaas Kollhof,
+  Jason White, Jasper Bryant-Greene, Jed Hartman, Jeff Cutsinger, Jeff
+  Schiller, Jeff Walden, Jens Bannmann, Jens Fendler, Jeroen van der
+  Meer, Jim Jewett, Jim Meehan, Joe Clark, John Fallows, Joseph
+  Kesselman, Jjgod Jiang, Joel Spolsky, Johan Herland, John Boyer,
+  John Bussjaeger, John Harding, Johnny Stenback, Jon Gibbins, Jon
+  Perlow, Jonathan Worent, Jorgen Horstink, Josh Levenberg, Joshua
+  Randall, Jukka K. Korpela, Jules Cl&eacute;ment-Ripoche, Julian
+  Reschke, Kai Hendry, Kartikaya Gupta, <!-- Keryx Web, = Lars Gunther
+  --> Kornel Lesinski, &#x9ed2;&#x6fa4;&#x525b;&#x5fd7; (KUROSAWA
+  Takeshi), Kristof Zelechovski, Kyle Hofmann, Lachlan Hunt, Larry
+  Page, Lars Gunther, Laura L. Carlson, Laura Wisewell, Laurens Holst,
+  Lee Kowalkowski, Leif Halvard Silli, Lenny Domnitser, L&eacute;onard
+  Bouchet, Leons Petrazickis, Logan<!-- on moz irc -->, Loune, Maciej
+  Stachowiak, Magnus Kristiansen<!-- Dashiva -->, Maik Merten, Malcolm
+  Rowe, Mark Nottingham, Mark Rowe<!--bdash-->, Mark Schenk, Martijn
+  Wargers, Martin Atkins, Martin D&uuml;rst, Martin Honnen, Masataka
+  Yakura, Mathieu Henri, Matthew Gregan, Matthew Mastracci, Matthew
+  Raymond, Matthew Thomas, Mattias Waldau, Max Romantschuk, Michael
+  'Ratt' Iannarelli, Michael A. Nachbaur, Michael A. Puls
   II<!--Shadow2531-->, Michael Carter, Michael Gratton, Michael
   Nordman, Michael Powers, Michael(tm) Smith, Michel Fortin, Michiel
   van der Blonk, Mihai &#x015E;ucan<!-- from ROBO Design -->, Mike
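
The tokenizer hunks above rely on two terms: the next input character (the first character not yet consumed) and the current input character (the last character to have been consumed), which this revision defines. The sketch below is one way to model them, along with "reconsume" and the "add 0x0020" lowercasing rule. It is an illustration under assumptions, not the specification's algorithm: the class `InputStream`, its method names, and the use of `None` for EOF are all hypothetical.

```python
from typing import Optional


class InputStream:
    """Toy input stream; names here are hypothetical, not from the spec."""

    def __init__(self, text: str) -> None:
        self._text = text
        self._pos = 0  # index of the next input character

    def _at(self, index: int) -> Optional[str]:
        # None stands in for "no character" (before the start, or EOF).
        return self._text[index] if 0 <= index < len(self._text) else None

    @property
    def next_input_character(self) -> Optional[str]:
        """First character in the stream that has not yet been consumed."""
        return self._at(self._pos)

    @property
    def current_input_character(self) -> Optional[str]:
        """Last character to have been consumed."""
        return self._at(self._pos - 1)

    def consume(self) -> Optional[str]:
        """Consume and return the next input character."""
        ch = self._at(self._pos)
        self._pos += 1
        return ch

    def reconsume(self) -> None:
        """Step back so the current input character is consumed again."""
        if self._pos > 0:
            self._pos -= 1


def ascii_lowercase(ch: str) -> str:
    """Lowercase A-Z by adding 0x0020 to the code point; leave others alone."""
    if "A" <= ch <= "Z":
        return chr(ord(ch) + 0x0020)
    return ch


if __name__ == "__main__":
    stream = InputStream("<A")
    stream.consume()            # consumes "<"
    stream.consume()            # consumes "A"
    print(ascii_lowercase(stream.current_input_character))
    stream.reconsume()          # "A" becomes the next input character again
    print(stream.next_input_character)
```

Tracking a single position keeps the two terms consistent by construction: the current input character is always the one immediately before the next input character.
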
