Short URL: http://html5.org/r/1936
| SVN | Bug | Comment | Time (UTC) |
|---|---|---|---|
| 1936 | | Case Sensitivity Training | 2008-07-25 09:26 |
Index: source
===================================================================
--- source (revision 1935)
+++ source (revision 1936)
@@ -9,16 +9,15 @@
<p class="big-issue">Some of the more major known issues are marked
like this. There are many other issues that have been raised as
well; the issues given in this document are not the only known
- issues! There are also some spec-wide issues that have not yet been
- addressed: case-sensitivity is a very poorly handled topic right
- now, and the firing of events needs to be unified (right now some
- bubble, some don't, they all use different text to fire events,
- etc). It would also be nice to unify the rules on downloading
- content when attributes change (e.g. <code title="">src</code>
- attributes) - should they initiate downloads when the element
- immediately, is inserted in the document, when active scripts end,
- etc. This matters e.g. if an attribute is set twice in a row (does
- it hit the network twice).</p>
+ issues! Also, firing of events needs to be unified (right now some
+ bubble, some don't, they all use different text to fire events, we
+ don't have an official queueing mechanism, etc). It would also be
+ nice to unify the rules on fetching/downloading content when
+ attributes change (e.g. <code title="">src</code> attributes) -
+ should they initiate downloads immediately, when the element is
+ inserted in the document, when active scripts end, etc. This matters
+ e.g. if an attribute is set twice in a row (does it hit the network
+ twice).</p>
<h2 class="no-num no-toc" id="contents">Table of contents</h2>
@@ -415,9 +414,9 @@
<p>Attribute names are said to be <dfn>XML-compatible</dfn> if they
match the <a href="http://www.w3.org/TR/REC-xml/#NT-Name"><code
title="">Name</code></a> production defined in XML, they contain no
- U+003A COLON (:) characters, and they do not start with three
- characters "<code title="">xml</code>". <a
- href="#refsXML">[XML]</a></p> <!-- XXX case-insensitive ASCII -->
+ U+003A COLON (:) characters, and their first three characters are
+ not an <span>ASCII case-insensitive</span> match for the string
+ "<code title="">xml</code>". <a href="#refsXML">[XML]</a></p>
<h4>DOM trees</h4>
@@ -965,10 +964,41 @@
+ <h3>Case-sensitivity</h3>
+ <p>This specification defines several comparison operators for
+ strings.</p>
+ <p>Comparing two strings in a <dfn>case-sensitive</dfn> manner means
+ comparing them exactly, codepoint for codepoint.</p>
+ <p>Comparing two strings in an <dfn>ASCII case-insensitive</dfn>
+ manner means comparing them exactly, codepoint for codepoint, except
+ that the characters in the range U+0041 .. U+005A (i.e. LATIN
+ CAPITAL LETTER A to LATIN CAPITAL LETTER Z) and the corresponding
+ characters in the range U+0061 .. U+007A (i.e. LATIN SMALL LETTER A
+ to LATIN SMALL LETTER Z) are considered to also match.</p>
+ <p>Comparing two strings in a <dfn>compatibility caseless</dfn>
+ manner means using the Unicode <i>compatibility caseless match</i>
+ operation to compare the two strings. <a
+ href="#refsUNICODECASE">[UNICODECASE]</a></p> <!-- XXX refs to
+ Unicode Standard Annex #21, Case Mappings -->
+
+ <p><dfn title="converted to uppercase">Converting a string to
+ uppercase</dfn> means replacing all characters in the range U+0061
+ .. U+007A (i.e. LATIN SMALL LETTER A to LATIN SMALL LETTER Z) with
+ the corresponding characters in the range U+0041 .. U+005A
+ (i.e. LATIN CAPITAL LETTER A to LATIN CAPITAL LETTER Z).</p>
+
+ <p><dfn title="converted to lowercase">Converting a string to
+ lowercase</dfn> means replacing all characters in the range U+0041
+ .. U+005A (i.e. LATIN CAPITAL LETTER A to LATIN CAPITAL LETTER Z)
+ with the corresponding characters in the range U+0061 .. U+007A
+ (i.e. LATIN SMALL LETTER A to LATIN SMALL LETTER Z).</p>
+
+
+
<h3>URLs</h3>
<p>This specification defines the term <span>URL</span>, and defines
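Illustrative note (not part of the commit): the ASCII-only definitions added in the hunk above could be modelled roughly as below. The function names are hypothetical and Python is used only for illustration. Python's built-in `str.lower()` and `str.upper()` apply Unicode case mappings, so the sketch deliberately avoids them and touches only U+0041 .. U+005A and U+0061 .. U+007A.

```python
import string

# Translation tables that cover only A-Z (U+0041..U+005A) and a-z (U+0061..U+007A).
_ASCII_LOWER = str.maketrans(string.ascii_uppercase, string.ascii_lowercase)
_ASCII_UPPER = str.maketrans(string.ascii_lowercase, string.ascii_uppercase)


def ascii_lowercase(s: str) -> str:
    """'Converted to lowercase' in the diff's sense: only A-Z are replaced."""
    return s.translate(_ASCII_LOWER)


def ascii_uppercase(s: str) -> str:
    """'Converted to uppercase' in the diff's sense: only a-z are replaced."""
    return s.translate(_ASCII_UPPER)


def case_sensitive_match(a: str, b: str) -> bool:
    """Exact comparison, code point for code point."""
    return a == b


def ascii_case_insensitive_match(a: str, b: str) -> bool:
    """Exact comparison, except that A-Z and the corresponding a-z also match."""
    return ascii_lowercase(a) == ascii_lowercase(b)
```

Under this sketch `ascii_case_insensitive_match("XML", "xml")` is true, while U+212A KELVIN SIGN does not match "k" even though Python's `"\u212a".lower()` yields "k". The compatibility caseless match is not modelled here.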
@@ -1709,7 +1739,7 @@
of the attribute represents the false value.</p>
<p>If the attribute is present, its value must either be the empty
- string or a value that is a case-insensitive <!-- XXX ASCII -->
+ string or a value that is an <span>ASCII case-insensitive</span>
match for the attribute's canonical name, with no leading or
trailing whitespace.</p>
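Illustrative note (not part of the commit): a minimal sketch of the boolean-attribute conformance rule in the hunk above, assuming the attribute is present. The function name and the use of Python are assumptions made for illustration only.

```python
import string

# Only A-Z are folded, matching the "ASCII case-insensitive" definition.
_ASCII_LOWER = str.maketrans(string.ascii_uppercase, string.ascii_lowercase)


def is_conforming_boolean_value(canonical_name: str, value: str) -> bool:
    """Check the value of a boolean attribute that is present.

    The value must be the empty string or an ASCII case-insensitive match
    for the canonical name. No whitespace is stripped, so leading or
    trailing whitespace makes the comparison fail.
    """
    if value == "":
        return True
    return value.translate(_ASCII_LOWER) == canonical_name.translate(_ASCII_LOWER)
```

For a hypothetical attribute named "disabled", the values "" and "DISABLED" pass, while " disabled" and "true" do not.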
@@ -3332,20 +3362,18 @@
is the <em>missing value default</em>.</p>
<p>If an enumerated attribute is specified, the attribute's value
- must be one of the given keywords that are not said to be
- non-conforming, with no leading or trailing whitespace. The keyword
- may use any mix of uppercase and lowercase letters.<!-- XXX should
- say "uppercase and lowercase ASCII letters" or some such --></p>
+ must be an <span>ASCII case-insensitive</span> match for one of the
+ given keywords that are not said to be non-conforming, with no
+ leading or trailing whitespace.</p>
- <p>When the attribute is specified, if its value
- <span>case-insensitively</span><!-- XXX ascii case folding -->
- matches one of the given keywords then that keyword's state is the
- state that the attribute represents. If the attribute value matches
- none of the given keywords, but the attribute has an <em>invalid
- value default</em>, then the attribute represents that
- state. Otherwise, if the attribute value matches none of the
- keywords but there is a <em>missing value default</em> state
- defined, then <em>that</em> is the state represented by the
+ <p>When the attribute is specified, if its value is an <span>ASCII
+ case-insensitive</span> match for one of the given keywords then
+ that keyword's state is the state that the attribute represents. If
+ the attribute value matches none of the given keywords, but the
+ attribute has an <em>invalid value default</em>, then the attribute
+ represents that state. Otherwise, if the attribute value matches
+ none of the keywords but there is a <em>missing value default</em>
+ state defined, then <em>that</em> is the state represented by the
attribute. Otherwise, there is no default, and invalid values must
be ignored.</p>
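Illustrative note (not part of the commit): the state lookup for a specified enumerated attribute, as described in the hunk above, could be sketched as follows. The function name, the mapping from keywords to states, and the use of `None` to stand for "no state, value ignored" are all assumptions of this sketch.

```python
import string
from typing import Mapping, Optional

# Only A-Z are folded, matching the "ASCII case-insensitive" definition.
_ASCII_LOWER = str.maketrans(string.ascii_uppercase, string.ascii_lowercase)


def enumerated_state(
    value: str,
    keyword_states: Mapping[str, str],
    invalid_value_default: Optional[str] = None,
    missing_value_default: Optional[str] = None,
) -> Optional[str]:
    """Return the state represented by a specified enumerated attribute."""
    folded = value.translate(_ASCII_LOWER)
    # 1. An ASCII case-insensitive keyword match selects that keyword's state.
    for keyword, state in keyword_states.items():
        if keyword.translate(_ASCII_LOWER) == folded:
            return state
    # 2. No keyword matched: use the invalid value default if there is one.
    if invalid_value_default is not None:
        return invalid_value_default
    # 3. Otherwise fall back to the missing value default if there is one.
    if missing_value_default is not None:
        return missing_value_default
    # 4. No default at all: the invalid value is ignored.
    return None
```

With hypothetical keywords "ltr" and "rtl", the value "RTL" selects the "rtl" state, and an unrecognised value yields whichever default is defined, or nothing.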
@@ -3392,12 +3420,15 @@
<li><p>Return the first element of type <var title="">type</var>
that has an <code title="attr-id">id</code> or <code
- title="">name</code> attribute whose value <!-- Unicode,
- apparently: <annevk> seems IE might be Unicode case-insensitive for
- ID [and name] values (related to <map> anyway, and at least for the
- character ë --> case-insensitively matches <var
- title="">s</var>.</p></li>
-
+ title="">name</code> attribute whose value is a <span>compatibility
+ caseless</span> match for <var title="">s</var>.</p></li>
+ <!--
+ That's what IE does:
+ http://software.hixie.ch/utilities/js/live-dom-viewer/?%3C!DOCTYPE%20html%3E%0A%3Cmap%20name%3D%22T%26eacute%3B%26%23x01F1%3B%26%23x2075%3B%22%3E%3Carea%20href%3D%22%2F%22%20shape%3Drect%20coords%3D0%2C0%2C200%2C200%3E%3C%2Fmap%3E%0A%3Cimg%20usemap%3D%22%23t%26Eacute%3BDZ5%22%20src%3Dimage%3E
+ ...except that doesn't explain why this fails:
+ http://software.hixie.ch/utilities/js/live-dom-viewer/?%3C!DOCTYPE%20html%3E%0A%3Cmap%20name%3D%22T%26eacute%3B%26%23x01F1%3B%26%23x2075%3B%26%23xFB01%3B%22%3E%3Carea%20href%3D%22%2F%22%20shape%3Drect%20coords%3D0%2C0%2C200%2C200%3E%3C%2Fmap%3E%0A%3Cimg%20usemap%3D%22%23t%26Eacute%3BDZ5F%26%23x0131%3B%26%23x0307%3B%22%20src%3Dimage%3E
+ maybe they just don't know about combining dot above?
+ -->
</ol>
@@ -3442,13 +3473,13 @@
getting, the DOM attribute must return the conforming value
associated with the state the attribute is in (in its canonical
case), or the empty string if the attribute is in a state that has
- no associated keyword value; and on setting, if the new value
- case-insensitively <!-- XXX --> matches one of the keywords given
- for that attribute, then the content attribute must be set to the
- conforming value associated with the state that the attribute would
- be in if set to the given new value, otherwise, if the new value is
- the empty string, then the content attribute must be removed,
- otherwise, the setter must raise a <code>SYNTAX_ERR</code>
+ no associated keyword value; and on setting, if the new value is an
+ <span>ASCII case-insensitive</span> match for one of the keywords
+ given for that attribute, then the content attribute must be set to
+ the conforming value associated with the state that the attribute
+ would be in if set to the given new value, otherwise, if the new
+ value is the empty string, then the content attribute must be
+ removed, otherwise, the setter must raise a <code>SYNTAX_ERR</code>
exception.</p>
<p>If a reflecting DOM attribute is a <code>DOMString</code> but
@@ -4089,8 +4120,8 @@
<ol>
<li><p>Find the first seven characters in <var title="">s</var>
- that are a case-insensitive<!-- XXX ASCII--> match for the word
- 'charset'. If no such match is found, return nothing.</p>
+ that are an <span>ASCII case-insensitive</span> match for the word
+ "charset". If no such match is found, return nothing.</p>
<li><p>Skip any U+0009, U+000A, U+000C, U+000D, or U+0020
characters that immediately follow the word 'charset' (there might
@@ -4147,13 +4178,6 @@
<ol>
- <li><p>Let <var title="">official type</var> be the type given by
- the <span title="Content-Type">Content-Type metadata</span> for the
- resource (in lowercase<!-- XXX ASCII case folding -->, ignoring any
- parameters). If there is no such type, jump to the <i
- title="content-type sniffing: unknown type">unknown type</i> step
- below.</p></li>
-
<li><p>If the user agent is configured to strictly obey
Content-Type headers for this resource, then jump to the last step
in this set of steps.</p></li>
@@ -4191,14 +4215,22 @@
</li>
+ <li><p>Let <var title="">official type</var> be the type given by
+ the <span title="Content-Type">Content-Type metadata</span> for the
+ resource, ignoring parameters. If there is no such type, jump to
+ the <i title="content-type sniffing: unknown type">unknown type</i>
+ step below. Comparisons with this type, as defined by MIME
+ specifications, are done in an <span>ASCII case-insensitive</span>
+ manner. <a href="#refsRFC2046">[RFC2046]</a></p></li>
+
<li><p>If <var title="">official type</var> is "unknown/unknown" or
- "application/unknown", jump to the <i title="content-type
- sniffing: unknown type">unknown type</i> step below.</p>
- <!-- In a study looking at many billions of pages whose first five
- characters were "<HTML", "unknown/unknown" was used to label
- documents about once for every 5000 pages labeled "text/html", and
- "application/unknown" was used about once for every 35000 pages
- labeled "text/html". --></li>
+ "application/unknown", jump to the <i title="content-type sniffing:
+ unknown type">unknown type</i> step below.</p> <!-- In a study
+ looking at many billions of pages whose first five characters were
+ "<HTML", "unknown/unknown" was used to label documents about once
+ for every 5000 pages labeled "text/html", and "application/unknown"
+ was used about once for every 35000 pages labeled
+ "text/html". --></li>
<li><p>If <var title="">official type</var> ends in "+xml", or if
it is either "text/xml" or "application/xml", then the sniffed
@@ -4209,7 +4241,8 @@
<li><p>If <var title="">official type</var> is an image type
supported by the user agent (e.g. "image/png", "image/gif",
"image/jpeg", etc), then jump to the <i title="content-type
- sniffing: image">images</i> section below.</p></li>
+ sniffing: image">images</i> section below, passing it the <var
+ title="">official type</var>.</p></li>
<li><p>If <var title="">official type</var> is "text/html", then
jump to the <i title="content-type sniffing: feed or html">feed or
@@ -5296,7 +5329,7 @@
<p>The <dfn
title="dom-document-getElementsByName"><code>getElementsByName(<var
- title="">name</var>)</code></dfn> method a string <var
+ title="">name</var>)</code></dfn> method takes a string <var
title="">name</var>, and must return a live <code>NodeList</code>
containing all the <code>a</code>, <code>applet</code>,
<code>button</code>, <code>form</code>, <!-- frame? frameset?
@@ -5304,8 +5337,9 @@
<code>map</code>, <code>meta</code>, <code>object</code>,<!-- param?
XXX--> <code>select</code>, and <code>textarea</code> elements in
that document that have a <code title="">name</code> attribute whose
- value is equal<!-- XXX case sensitivity --> to the <var
- title="">name</var> argument.</p> <!-- XXX what about XHTML? -->
+ value is equal to the <var title="">name</var> argument (in a
+ <span>case-sensitive</span> manner).</p> <!-- XXX what about XHTML?
+ -->
<p>The <dfn
title="dom-document-getElementsByClassName"><code>getElementsByClassName(<var
@@ -6397,7 +6431,6 @@
<h3>APIs in HTML documents</h3>
- <!-- XXX case-sensitivity training required here. -->
<p>For <span>HTML documents</span>, and for <span>HTML
elements</span> in <span>HTML documents</span>, certain APIs defined
@@ -6417,9 +6450,10 @@
<dd>
- <p>These attributes return tag names in all uppercase<!-- XXX
- xref--> and attribute names in all lowercase<!-- XXX xref -->,
- regardless of the case with which they were created.</p>
+ <p>These attributes must return tag names <span>converted to
+ uppercase</span> and attribute names <span>converted to
+ lowercase</span>, regardless of the case with which they were
+ created.</p>
</dd>
@@ -6429,9 +6463,9 @@
<dd>
<p>The canonical form of HTML markup is all-lowercase; thus, this
- method will lowercase<!-- XXX xref --> the argument before
- creating the requisite element. Also, the element created must be
- in the <span>HTML namespace</span>.</p>
+ method will <span title="converted to lowercase">lowercase</span>
+ the argument before creating the requisite element. Also, the
+ element created must be in the <span>HTML namespace</span>.</p>
<p class="note">This doesn't apply to <code
title="">Document.createElementNS()</code>. Thus, it is possible,
@@ -6450,7 +6484,8 @@
<p>When an <code>Attr</code> node is set on an <span title="HTML
elements">HTML element</span>, it must have its name
- lowercased<!-- XXX xref --> before the element is affected.</p>
+ <span>converted to lowercase</span> before the element is
+ affected.</p>
<p class="note">This doesn't apply to <code
title="">Document.setAttributeNodeNS()</code>.</p>
@@ -6463,8 +6498,8 @@
<dd>
<p>When an attribute is set on an <span title="HTML elements">HTML
- element</span>, the name argument must be lowercased<!-- XXX xref
- --> before the element is affected.</p>
+ element</span>, the name argument must be <span>converted to
+ lowercase</span> before the element is affected.</p>
<p class="note">This doesn't apply to <code
title="">Document.setAttributeNS()</code>.</p>
@@ -6478,9 +6513,10 @@
<dd>
<p>These methods (but not their namespaced counterparts) must
- compare the given argument case-insensitively<!-- XXX xref -->
- when looking at <span title="HTML elements">HTML elements</span>,
- and case-sensitively otherwise.</p>
+ compare the given argument in an <span>ASCII
+ case-insensitive</span> manner when looking at <span title="HTML
+ elements">HTML elements</span>, and in a
+ <span>case-sensitive</span> manner otherwise.</p>
<p class="note">Thus, in an <span title="HTML documents">HTML
document</span> with nodes in multiple namespaces, these methods
@@ -6495,8 +6531,8 @@
<dd>
<p>If the new namespace is the <span>HTML namespace</span>, then
- the new qualified name must be lowercased before the rename takes
- place.<!-- XXX xref --></p>
+ the new qualified name must be <span>converted to lowercase</span>
+ before the rename takes place.</p>
</dd>
@@ -6571,8 +6607,8 @@
otherwise.</p></li>
<li><p>Let <var title="">replace</var> be true if there is a second
- argument and it has the value "replace"<!-- case-insensitive. XXX
- -->, and false otherwise.</p></li>
+ argument and it is an <span>ASCII case-insensitive</span> match for
+ the value "replace", and false otherwise.</p></li>
<li>
@@ -6973,8 +7009,8 @@
character.</li>
<li>A <code>ProcessingInstruction</code> node whose target name is
- the string "<code title="">xml</code>" (case insensitively)<!--
- ASCII -->.</li>
+ an <span>ASCII case-insensitive</span> match for the string "<code
+ title="">xml</code>".</li>
<li>A <code>ProcessingInstruction</code> node whose target name
contains a U+003A COLON (":").</li>
@@ -7916,13 +7952,13 @@
<p>For <code>meta</code> elements in the <span
title="attr-meta-http-equiv-content-type">Encoding declaration
state</span>, the <code title="attr-meta-content">content</code>
- attribute must have a value that is a case-insensitive<!-- ASCII
- XXX--> match of a string that consists of the literal string
- "<code title="">text/html;</code>", optionally followed by any
- number of <span title="space character">space characters</span>,
- followed by the literal string "<code title="">charset=</code>",
- followed by the character encoding name of <a href="#charset">the
- character encoding declaration</a>.</p>
+ attribute must have a value that is an <span>ASCII
+ case-insensitive</span> match for a string that consists of the
+ literal string "<code title="">text/html;</code>", optionally
+ followed by any number of <span title="space character">space
+ characters</span>, followed by the literal string "<code
+ title="">charset=</code>", followed by the character encoding name
+ of <a href="#charset">the character encoding declaration</a>.</p>
<p>If the document contains a <code>meta</code> element in the
<span title="attr-meta-http-equiv-content-type">Encoding
@@ -17483,7 +17519,7 @@
<p>When the UA is passed an empty string or a string specifying a
context that it does not support, then it must return null. String
- comparisons must be literal and case-sensitive.</p>
+ comparisons must be <span>case-sensitive</span>.</p>
<p>Arguments other than the <var title="">contextId</var> must be
ignored, and must not cause the user agent to raise an exception (as
@@ -17996,8 +18032,9 @@
</dl>
<p>These values are all case-sensitive — they must be used
- exactly as shown. User agents must not recognize values that do not
- exactly match the values given above.</p>
+ exactly as shown. User agents must not recognize values that are not
+ a <span>case-sensitive</span> match for one of the values given
+ above.</p>
<p>The operators in the above list must be treated as described by
the Porter-Duff operator given at the start of their description
@@ -28175,8 +28212,8 @@
special keywords.)</p>
<p>A <dfn>valid browsing context name or keyword</dfn> is any string
- that is either a <span>valid browsing context name</span> or that
- case-insensitively <!-- ASCII --> matches one of: <code
+ that is either a <span>valid browsing context name</span> or that is
+ an <span>ASCII case-insensitive</span> match for one of: <code
title="">_blank</code>, <code title="">_self</code>, <code
title="">_parent</code>, or <code title="">_top</code>.</p>
@@ -28569,7 +28606,7 @@
<li><p>Let <var title="">scheme</var> be the <span
title="url-scheme"><scheme></span> component of the URI,
- converted to lowercase<!-- XXX -->.</p></li>
+ <span>converted to lowercase</span>.</p></li>
<li><p>If the UA doesn't support the protocol given by <var
title="">scheme</var>, then return a new globally unique
@@ -28598,7 +28635,8 @@
</li>
<li><p>Let <var title="">host</var> be the result of converting
- <var title="">host</var> to lowercase<!-- XXX -->.</p></li>
+ <var title="">host</var> <span title="converted to lowercase">to
+ lowercase</span>.</p></li>
<li><p>If there is no <span title="url-port"><port></span>
component, then let <var title="">port</var> be the default port
@@ -30267,9 +30305,10 @@
<dd>
<p>A scheme, such as <code>ftp</code> or <code>fax</code>. The
- scheme must be treated case-insensitively by user agents for the
- purposes of comparing with the scheme part of URLs that they
- consider against the list of registered handlers.</p>
+ scheme must be compared in an <span>ASCII case-insensitive</span>
+ manner by user agents for the purposes of comparing with the
+ scheme part of URLs that they consider against the list of
+ registered handlers.</p>
<p>The <var title="">protocol</var> value, if it contains a colon
(as in "<code>ftp:</code>"), will never match anything, since
@@ -30282,10 +30321,10 @@
<dd>
<p>A MIME type, such as <code>model/vrml</code> or
- <code>text/richtext</code>. The MIME type must be treated
- case-insensitively by user agents for the purposes of comparing
- with MIME types of documents that they consider against the list
- of registered handlers.</p>
+ <code>text/richtext</code>. The MIME type must be compared in an
+ <span>ASCII case-insensitive</span> manner by user agents for the
+ purposes of comparing with MIME types of documents that they
+ consider against the list of registered handlers.</p>
<p>User agents must compare the given values only to the MIME
type/subtype parts of content types, not to the complete type
@@ -30968,8 +31007,9 @@
<p>If the resulting <span>absolute URL</span> has a different
<span title="url-scheme"><scheme></span> component than
- the manifest's URL (compared case-insensitively<!-- XXX ASCII
- -->), then jump back to the step labeled "start of line".</p>
+ the manifest's URL (compared in an <span>ASCII
+ case-insensitive</span> manner), then jump back to the step
+ labeled "start of line".</p>
<p>Drop the <span title="url-fragment"><fragment></span>
component of the resulting <span>absolute URL</span>, if it has
@@ -31009,8 +31049,9 @@
<p>If the resulting <span>absolute URL</span> for <var
title="">part two</var> has a different <span
title="url-scheme"><scheme></span> component than the
- manifest's URL (compared case-insensitively<!-- XXX ASCII -->),
- then jump back to the step labeled "start of line".</p>
+ manifest's URL (compared in an <span>ASCII
+ case-insensitive</span> manner), then jump back to the step
+ labeled "start of line".</p>
<p>Drop any the <span
title="url-fragment"><fragment></span> components of the
@@ -31045,8 +31086,9 @@
<p>If the resulting <span>absolute URL</span> has a different
<span title="url-scheme"><scheme></span> component than
- the manifest's URL (compared case-insensitively<!-- XXX ASCII
- -->), then jump back to the step labeled "start of line".</p>
+ the manifest's URL (compared in an <span>ASCII
+ case-insensitive</span> manner), then jump back to the step
+ labeled "start of line".</p>
<p>Drop the <span title="url-fragment"><fragment></span>
component of the resulting <span>absolute URL</span>, if it has
@@ -33888,7 +33930,8 @@
must be created first.</p>
<p>All strings including the empty string are valid database
- names. Database names are case-sensitive.</p>
+ names. Database names must be compared in a
+ <span>case-sensitive</span> manner.</p>
<p class="note">Implementations can support this even in
environments that only support a subset of all strings as database
@@ -36788,16 +36831,16 @@
title="">true</code>" if the content attribute is set to the true
state, <code title="">false</code>" if the content attribute is set
to the false state, and "<code title="">inherit</code>"
- otherwise. On setting, if the new value is case-insensitively<!--
- XXX ascii --> equal to the string "<code title="">inherit</code>"
- then the content attribute must be removed, if the new value is
- case-insensitively<!-- XXX ascii --> equal to the string "<code
- title="">true</code>" then the content attribute must be set to the
- string "<code title="">true</code>", if the new value is
- case-insensitively<!-- XXX ascii --> equal to the string "<code
- title="">false</code>" then the content attribute must be set to the
- string "<code title="">false</code>", and otherwise the attribute
- setter must raise a <code>SYNTAX_ERR</code> exception.</p>
+ otherwise. On setting, if the new value is an <span>ASCII
+ case-insensitive</span> match for the string "<code
+ title="">inherit</code>" then the content attribute must be removed,
+ if the new value is an <span>ASCII case-insensitive</span> match for
+ the string "<code title="">true</code>" then the content attribute
+ must be set to the string "<code title="">true</code>", if the new
+ value is an <span>ASCII case-insensitive</span> match for the string
+ "<code title="">false</code>" then the content attribute must be set
+ to the string "<code title="">false</code>", and otherwise the
+ attribute setter must raise a <code>SYNTAX_ERR</code> exception.</p>
<p>The <dfn
title="dom-isContentEditable"><code>isContentEditable</code></dfn>
@@ -37034,9 +37077,9 @@
<p>The <code title="dom-document-designMode">designMode</code> DOM
attribute on the <code>Document</code> object takes two values,
"<code title="">on</code>" and "<code title="">off</code>". When it
- is set, the new value must be case-insensitively <!-- XXX ASCII
- case-folding --> compared to these two values. If it matches the
- "<code title="">on</code>" value, then <code
+ is set, the new value must be compared in an <span>ASCII
+ case-insensitive</span> manner to these two values. If it matches
+ the "<code title="">on</code>" value, then <code
title="dom-document-designMode">designMode</code> must be enabled,
and if it matches the "<code title="">off</code>" value, then <code
title="dom-document-designMode">designMode</code> must be
@@ -38484,8 +38527,9 @@
title="">commandId</var>.</p>
<p>The possible values for <var title="">commandId</var>, and their
- corresponding meanings, are as follows. These values are <!-- XXX
- ASCII --> case-insensitive.</p>
+ corresponding meanings, are as follows. These values must be
+ compared to the argument in an <span>ASCII case-insensitive</span>
+ manner.</p>
<dl>
@@ -38546,22 +38590,22 @@
GREATER-THAN SIGN character ('>'), then remove the first and last
characters from <var title="">value</var>.</p></li>
<li>
- <p>If <var title="">value</var> is (now) a case-insensitive<!--
- XXX ASCII--> match for the tag name of an element defined by
- this specification that is defined to be a <span>prose
- element</span> but not a <span>phrasing element</span>, then,
- for every position in the selection, take the furthest
- <span>flow content</span> ancestor element of that position that
- contains only <span>phrasing content</span>, and, if that
- element is <span>editable</span>, and has a content model that
- allows it to contain <span>prose content</span> other than
- <span>phrasing content</span>, and has a parent element whose
- content model allows that parent to contain any <span>prose
- content</span>, rename the element (as if the <code
+ <p>If <var title="">value</var> is (now) an <span>ASCII
+ case-insensitive</span> match for the tag name of an element
+ defined by this specification that is defined to be a
+ <span>prose element</span> but not a <span>phrasing
+ element</span>, then, for every position in the selection, take
+ the furthest <span>flow content</span> ancestor element of that
+ position that contains only <span>phrasing content</span>, and,
+ if that element is <span>editable</span>, and has a content
+ model that allows it to contain <span>prose content</span> other
+ than <span>phrasing content</span>, and has a parent element
+ whose content model allows that parent to contain any
+ <span>prose content</span>, rename the element (as if the <code
title="">Element.renameNode()</code> method had been used) to
- <var title="">value</var>, using the HTML namespace.</p>
- <p>If there is no selection, then, where in the description
- above refers to the selection, the user agent must act as if the
+ <var title="">value</var>, using the HTML namespace.</p> <p>If
+ there is no selection, then, where in the description above
+ refers to the selection, the user agent must act as if the
selection was an empty range (with just one position) at the
caret position.</p>
</li>
@@ -39455,8 +39499,8 @@
verify that the URL parses without failure and has a <span
title="url-scheme"><scheme></span> component whose value is
either "<code title="">ws</code>" or "<code title="">wss</code>",
- when compared case-insensitively<!-- XXX ASCII -->. If it does, it
- has, and it is, then the user agent must asynchronously
+ when compared in an <span>ASCII case-insensitive</span> manner. If
+ it does, it has, and it is, then the user agent must asynchronously
<span>establish a Web Socket connection</span> to <var
title="">url</var>. Otherwise, the constructor must raise a
<code>SYNTAX_ERR</code> exception.</p>
@@ -41153,7 +41197,14 @@
</div>
+ <p>Many strings in the HTML syntax (e.g. the names of elements and
+ their attributes) are case-insensitive, but only for characters in
+ the ranges U+0041 .. U+005A (LATIN CAPITAL LETTER A to LATIN CAPITAL
+ LETTER Z) and U+0061 .. U+007A (LATIN SMALL LETTER A to LATIN SMALL
+ LETTER Z). For convenience, in this section this is just referred to
+ as "case-insensitive".</p>
+
<h4>The DOCTYPE</h4>
<p>A <dfn title="syntax-doctype">DOCTYPE</dfn> is a mostly useless,
@@ -41384,8 +41435,8 @@
the control characters, and any characters that are not defined by
Unicode. In the HTML syntax, attribute names may be written with any
mix of lower- and uppercase letters that, when converted to
- all-lowercase<!-- ASCII case-insensitive -->, matches the
- attribute's name; attribute names are case-insensitive.</p>
+ all-lowercase, matches the attribute's name; attribute names are
+ case-insensitive.</p>
<p><dfn title="syntax-attribute-value">Attribute values</dfn> are a
mixture of <span title="syntax-text">text</span> and <span
@@ -41735,10 +41786,10 @@
<p>The text in CDATA and RCDATA elements must not contain any
occurrences of the string "<code title=""></</code>" (U+003C
LESS-THAN SIGN, U+002F SOLIDUS) followed by characters that
- case-insensitively<!--ASCII--> match the tag name of the element
- followed by one of U+0009 CHARACTER TABULATION, U+000A LINE FEED
- (LF), U+000C FORM FEED (FF), U+0020 SPACE, U+003E GREATER-THAN SIGN
- (>), or U+002F SOLIDUS (/), unless that string is part of an <span
+ case-insensitively match the tag name of the element followed by one
+ of U+0009 CHARACTER TABULATION, U+000A LINE FEED (LF), U+000C FORM
+ FEED (FF), U+0020 SPACE, U+003E GREATER-THAN SIGN (>), or U+002F
+ SOLIDUS (/), unless that string is part of an <span
title="syntax-escape">escaping text span</span>.</p>
<p>An <dfn title="syntax-escape">escaping text span</dfn> is a span
@@ -42476,11 +42527,11 @@
<p>When comparing a string specifying a character encoding with the
name or alias of a character encoding to determine if they are
- equal, user agents must ignore the all characters in the ranges
- U+0009 to U+000D, U+0020 to U+002F, U+003A to U+0040, U+005B to
- U+0060, and U+007B to U+007E (all whitespace and punctuation
- characters in ASCII) in both names, and then perform the comparison
- case-insensitively<!-- XXX ASCII -->.</p>
+ equal, user agents must ignore all characters in the ranges U+0009
+ to U+000D, U+0020 to U+002F, U+003A to U+0040, U+005B to U+0060, and
+ U+007B to U+007E (all whitespace and punctuation characters in
+ ASCII) in both names, and then perform the comparison in an
+ <span>ASCII case-insensitive</span> manner.</p>
<p class="example">For instance, "GB_2312-80" and "g.b.2312(80)" are
considered equivalent names.</p>
@@ -43361,11 +43412,11 @@
<p>If the <span>content model flag</span> is set to the RCDATA or
CDATA states but no start tag token has ever been emitted by this
instance of the tokeniser (<span>fragment case</span>), or, if the
- <span>content model flag</span> is set to the RCDATA or CDATA
- states and the next few characters do not match the tag name of
- the last start tag token emitted (case insensitively), or if they
- do but they are not immediately followed by one of the following
- characters:</p>
+ <span>content model flag</span> is set to the RCDATA or CDATA states
+ and the next few characters do not match the tag name of the last
+ start tag token emitted (compared in an <span>ASCII
+ case-insensitive</span> manner), or if they do but they are not
+ immediately followed by one of the following characters:</p>
<ul class="brief">
<li>U+0009 CHARACTER TABULATION</li>
@@ -43840,21 +43891,19 @@
whose data is the empty string, and switch to the <span>comment
start state</span>.</p>
- <p>Otherwise, if the next seven characters are a
- <span>case-insensitive</span><!-- XXX xref, ascii only --> match
- for the word "DOCTYPE", then consume those characters and switch
- to the <span>DOCTYPE state</span>.</p>
+ <p>Otherwise, if the next seven characters are an <span>ASCII
+ case-insensitive</span> match for the word "DOCTYPE", then consume
+ those characters and switch to the <span>DOCTYPE state</span>.</p>
<p>Otherwise, if the <span>insertion mode</span> is "<span
title="insertion mode: in foreign content">in foreign
- content</span>" and the <span>current node</span> is not an
- element in the <span>HTML namespace</span> and the next seven
- characters are a <span>case-sensitive</span><!-- XXX xref, ascii
- only --> match for the string "[CDATA[" (the five uppercase
- letters "CDATA" with a U+005B LEFT SQUARE BRACKET character before
- and after), then consume those characters and switch to the
- <span>CDATA section state</span> (which is unrelated to the
- <span>content model flag</span>'s CDATA state).</p>
+ content</span>" and the <span>current node</span> is not an element
+ in the <span>HTML namespace</span> and the next seven characters are
+ an <span>ASCII case-sensitive</span> match for the string "[CDATA["
+ (the five uppercase letters "CDATA" with a U+005B LEFT SQUARE
+ BRACKET character before and after), then consume those characters
+ and switch to the <span>CDATA section state</span> (which is
+ unrelated to the <span>content model flag</span>'s CDATA state).</p>
<p>Otherwise, this is a <span>parse error</span>. Switch to the
<span>bogus comment state</span>. The next character that is
@@ -44096,15 +44145,15 @@
<dt>Anything else</dt>
<dd>
- <p>If the next six characters are a
- <span>case-insensitive</span><!-- XXX xref, ascii only --> match
- for the word "PUBLIC", then consume those characters and switch
- to the <span>before DOCTYPE public identifier state</span>.</p>
+ <p>If the next six characters are an <span>ASCII
+ case-insensitive</span> match for the word "PUBLIC", then consume
+ those characters and switch to the <span>before DOCTYPE public
+ identifier state</span>.</p>
- <p>Otherwise, if the next six characters are a
- <span>case-insensitive</span><!-- XXX xref, ascii only --> match
- for the word "SYSTEM", then consume those characters and switch
- to the <span>before DOCTYPE system identifier state</span>.</p>
+ <p>Otherwise, if the next six characters are an <span>ASCII
+ case-insensitive</span> match for the word "SYSTEM", then consume
+ those characters and switch to the <span>before DOCTYPE system
+ identifier state</span>.</p>
<p>Otherwise, this is a <span>parse error</span>. Set the
DOCTYPE token's <i>force-quirks flag</i> to <i>on</i>. Switch to
@@ -44575,9 +44624,9 @@
<dd>
<p>Consume the maximum number of characters possible, with the
- consumed characters case-sensitively matching one of the
- identifiers in the first column of the <span>named character
- references</span> table.</p>
+ consumed characters matching one of the identifiers in the first
+ column of the <span>named character references</span> table (in a
+ <span>case-sensitive</span> manner).</p>
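The "maximum number of characters" rule above amounts to a longest-prefix lookup with exact (case-sensitive) comparison. Below is a minimal Python sketch of that idea. `SAMPLE_IDENTIFIERS` is a small hypothetical stand-in for the first column of the named character references table, and `longest_reference_match` is an illustrative name, not something the spec defines.

```python
# Hypothetical subset of the table's first column, for illustration only.
SAMPLE_IDENTIFIERS = {"amp", "amp;", "lt", "lt;", "not", "notin;"}


def longest_reference_match(text, identifiers=SAMPLE_IDENTIFIERS):
    """Return the longest identifier that is a case-sensitive prefix of text,
    or None if no identifier matches."""
    longest = max((len(ident) for ident in identifiers), default=0)
    # Try the longest possible prefix first, then shrink one character at a time.
    for length in range(min(longest, len(text)), 0, -1):
        candidate = text[:length]
        if candidate in identifiers:  # exact comparison, so case matters
            return candidate
    return None
```

Under these assumptions, `longest_reference_match("notit;")` yields `"not"`, while `longest_reference_match("aMp;")` yields `None` because the comparison is case-sensitive.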
<p>If no match can be made, then this is a <span>parse
error</span>. No characters are consumed, and nothing is
@@ -44936,15 +44985,16 @@
<dt>A DOCTYPE token</dt>
<dd>
- <p>If the DOCTYPE token's <code title="">name</code> does not
- case-insensitively match the string "<code title="">HTML</code>",
- or if the token's public identifier is not missing, or if the
- token's system identifier is not missing, then there is a
- <span>parse error</span>. Conformance checkers may, instead of
- reporting this error, switch to a conformance checking mode for
- another language (e.g. based on the DOCTYPE token a conformance
- checker could recognize that the document is an HTML4-era
- document, and defer to an HTML4 conformance checker.)</p>
+ <p>If the DOCTYPE token's <code title="">name</code> is not an
+ <span>ASCII case-insensitive</span> match for the string "<code
+ title="">HTML</code>", or if the token's public identifier is not
+ missing, or if the token's system identifier is not missing, then
+ there is a <span>parse error</span>. Conformance checkers may,
+ instead of reporting this error, switch to a conformance checking
+ mode for another language (e.g. based on the DOCTYPE token a
+ conformance checker could recognize that the document is an
+ HTML4-era document, and defer to an HTML4 conformance
+ checker.)</p>
<p>Append a <code>DocumentType</code> node to the
<code>Document</code> node, with the <code title="">name</code>
@@ -45057,9 +45107,9 @@
</ul>
<p>The name, system identifier, and public identifier strings must
- be compared to the values given in the lists above in a
- case-insensitive<!-- ASCII --> manner. A system identifier whose
- value is the empty string is not considered missing for the
+ be compared to the values given in the lists above in an
+ <span>ASCII case-insensitive</span> manner. A system identifier
+ whose value is the empty string is not considered missing for the
purposes of the conditions above.</p>
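To show why this revision says "ASCII case-insensitive" rather than plain "case-insensitive", here is a small Python sketch of such a comparison. The helper names are invented for this example. Only the letters A-Z are folded, so non-ASCII characters that Unicode case mapping would fold onto ASCII letters are left alone.

```python
def ascii_lowercase(s):
    """Lowercase only U+0041..U+005A (A-Z); leave every other character as-is."""
    return "".join(chr(ord(c) + 0x20) if "A" <= c <= "Z" else c for c in s)


def ascii_case_insensitive_match(a, b):
    """Hypothetical helper: compare two strings, ignoring ASCII case only."""
    return ascii_lowercase(a) == ascii_lowercase(b)
```

For example, `ascii_case_insensitive_match("hTmL", "HTML")` is true, but U+212A KELVIN SIGN does not match "k" here, even though Python's Unicode-aware `str.lower()` maps it to "k".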
<p>Then, switch the <span>insertion mode</span> to "<span
@@ -47011,10 +47061,10 @@
<dd>
<p>If the token does not have an attribute with the name "type",
- or if it does, but that attribute's value is not a
- case-insensitive <!-- XXX ASCII --> match for the string "hidden",
- or, if the <span>current table</span> is <span>tainted</span>,
- then: act as described in the "anything else" entry below.</p>
+ or if it does, but that attribute's value is not an <span>ASCII
+ case-insensitive</span> match for the string "hidden", or, if the
+ <span>current table</span> is <span>tainted</span>, then: act as
+ described in the "anything else" entry below.</p>
<p>Otherwise:</p>
@@ -48335,7 +48385,7 @@
the local names of elements and attributes, then the tool may map
all element and attribute local names that the API wouldn't support
to a set of names that <em>are</em> allowed, by replacing any
- character that isn't supported with the upper case letter U and the
+ character that isn't supported with the uppercase letter U and the
five digits of the character's Unicode codepoint when expressed in
hexadecimal, using digits 0-9 and capital letters A-F as the
symbols, in increasing numeric order.</p>
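The name-mapping rule above can be sketched in Python as follows. Which characters an API supports depends on the API, so the `is_supported` predicate is an assumption supplied by the caller, and `escape_local_name` is an illustrative name only.

```python
def escape_local_name(name, is_supported):
    """Replace each unsupported character with 'U' followed by its code point
    as uppercase hexadecimal, zero-padded to five digits."""
    return "".join(
        ch if is_supported(ch) else "U%05X" % ord(ch)
        for ch in name
    )
```

For instance, if a colon is the only unsupported character, `escape_local_name("a:b", lambda ch: ch != ":")` gives `"aU0003Ab"`.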
@@ -48910,9 +48960,10 @@
document (printing) and what this means for the UA, in particular
creating a new view for the print media.</p>
- <p class="big-issue">Must define that in CSS, tag names in HTML
- documents, and class names in quirks mode documents, are
- case-insensitive.</p>
+ <p class="big-issue">Must define that in CSS, tag and attribute
+ names in HTML documents, and class names in quirks mode documents,
+ are case-insensitive, as well as saying which attribute values must
+ be compared case-insensitively.</p>
<h3>Rendering and the DOM</h3>
@@ -49884,16 +49935,12 @@
<pav> the html spec should say what to do with it
-should we say that elements in HTML must be lowercase? (but with error
-handling for uppercase tags, obviously)? If so, update examples.
-
<title> is for out of context headers
<h1> is for in-context headers
The parsing rules of HTML
media="" is case-insensitive
-case-sensitivity of other attributes, and what it means
empty title attribute is equivalent to missing attribute for purposes
of alternate style sheet processing