HTML Standard Tracker

Short URL: http://html5.org/r/8081

SVN: 8081
Bug: 22661
Comment: Closer integration with encoding.spec.whatwg.org
Time (UTC): 2013-07-23 23:16

Index: source
===================================================================
--- source	(revision 8080)
+++ source	(revision 8081)
@@ -1189,7 +1189,7 @@
     <div class="example">
 
      <p>For example, the restriction on using UTF-7 exists purely to avoid authors falling prey to a
-     known cross-site-scripting attack using UTF-7.</p>
+     known cross-site-scripting attack using UTF-7. <a href="#refsUTF7">[UTF7]</a></p>
 
     </div>
 
@@ -1823,7 +1823,7 @@
   0x3F, 0x41 - 0x5A, and 0x61 - 0x7A<!-- is that list ok? do any character sets we want to support
   do things outside that range? -->, ignoring bytes that are the second and later bytes of multibyte
   sequences, all correspond to single-byte sequences that map to the same Unicode characters as
-  those bytes in ANSI_X3.4-1968 (US-ASCII). <a href="#refsRFC1345">[RFC1345]</a></p>
+  those bytes in Windows-1252<!--ANSI_X3.4-1968 (US-ASCII)-->. <a href="#refsENCODING">[ENCODING]</a></p>
 
   <p class="note">This includes such encodings as Shift_JIS, HZ-GB-2312, and variants of ISO-2022,
   even though it is possible in these encodings for bytes like 0x70 to be part of longer sequences
@@ -1835,9 +1835,8 @@
    different encodings at once, with different <meta charset> elements applying in each case.
   -->
 
-  <p>The term <dfn>a UTF-16 encoding</dfn> refers to any variant of UTF-16: self-describing UTF-16
-  with a BOM, ambiguous UTF-16 without a BOM, raw UTF-16LE, and raw UTF-16BE. <a
-  href="#refsRFC2781">[RFC2781]</a></p>
+  <p>The term <dfn>a UTF-16 encoding</dfn> refers to any variant of UTF-16: UTF-16LE or UTF-16BE,
+  regardless of the presence or absence of a BOM. <a href="#refsENCODING">[ENCODING]</a></p>
 
   <p>The term <dfn>code unit</dfn> is used as defined in the Web IDL specification: a 16 bit
   unsigned integer, the smallest atomic component of a <code>DOMString</code>. (This is a narrower
@@ -2211,6 +2210,10 @@
     algorithm</i>. The latter first strips a Byte Order Mark (BOM), if any, and then invokes the
     former.</p>
 
+    <p>For readability, character encodings are sometimes referenced in this specification with a
+    case that differs from the canonical case given in the encoding standard. (For example,
+    "UTF-16LE" instead of "utf-16le".)</p>
+
    </dd>
 
 
@@ -96680,14 +96683,7 @@
   UTF-32 in its algorithms; support and use of these encodings can thus lead to unexpected behavior
   in implementations of this specification.</p>
 
-  <p>When a user agent is to use the self-describing UTF-16 encoding but no Byte Order Mark (BOM)
-  has been found, user agents must default to little-endian UTF-16.</p>
 
-  <p class="note">The requirement to default UTF-16 to little-endian rather than big-endian is a
-  <span>willful violation</span> of RFC 2781, motivated by a desire for compatibility with legacy
-  content. <a href="#refsRFC2781">[RFC2781]</a></p>
-
-
   <h5>Changing the encoding while parsing</h5>
 
   <p>When the parser requires the user agent to <dfn>change the encoding</dfn>, it must run the
@@ -115757,9 +115753,6 @@
    <dt id="refsBIDI">[BIDI]</dt>
    <dd><cite><a href="http://www.unicode.org/reports/tr9/">UAX #9: Unicode Bidirectional Algorithm</a></cite>, M. Davis. Unicode Consortium.</dd>
 
-   <dt id="refsBIG5">[BIG5]</dt>
-   <dd>(Non-normative) <cite>Chinese Coded Character Set in Computer</cite>. Institute for Information Industry, March 1984.</dd>
-
    <dt id="refsBOCU1">[BOCU1]</dt>
    <dd>(Non-normative) <cite><a href="http://www.unicode.org/notes/tn6/">UTN #6: BOCU-1: MIME-Compatible Unicode Compression</a></cite>, M. Scherer, M. Davis. Unicode Consortium.</dd>
 
@@ -115785,11 +115778,8 @@
    <dd><cite><a href="http://fetch.spec.whatwg.org/">Cross-Origin Resource Sharing</a></cite>, A. van Kesteren. WHATWG.</dd>
 
    <dt id="refsCP50220">[CP50220]</dt>
-   <dd><cite><a href="http://www.iana.org/assignments/charset-reg/CP50220">CP50220</a></cite>, Y. Naruse. IANA.</dd> <!-- really should be "NARUSE, Y." or some such, but there's a western bias to these references for consistency. sorry. -->
+   <dd>(Non-normative) <cite><a href="http://www.iana.org/assignments/charset-reg/CP50220">CP50220</a></cite>, Y. Naruse. IANA.</dd> <!-- really should be "NARUSE, Y." or some such, but there's a western bias to these references for consistency. sorry. -->
 
-   <dt id="refsCP51932">[CP51932]</dt>
-   <dd><cite><a href="http://www.iana.org/assignments/charset-reg/CP51932">CP51932</a></cite>, Y. Naruse. IANA.</dd> <!-- really should be "NARUSE, Y." or some such, but there's a western bias to these references for consistency. sorry. -->
-
    <dt id="refsCSP">[CSP]</dt>
    <dd>(Non-normative) <cite><a href="http://dvcs.w3.org/hg/content-security-policy/raw-file/tip/csp-specification.dev.html">Content Security Policy</a></cite>, B. Sterne, A. Barth. W3C.</dd>
 
@@ -115930,9 +115920,6 @@
    <dt id="refsISO8601">[ISO8601]</dt>
    <dd>(Non-normative) <cite><a href="http://isotc.iso.org/livelink/livelink/4021199/ISO_8601_2004_E.zip?func=doc.Fetch&amp;nodeid=4021199">ISO8601: Data elements and interchange formats &mdash; Information interchange &mdash; Representation of dates and times</a></cite>. ISO.</dd>
 
-   <dt id="refsISO885911">[ISO885911]</dt>
-   <dd><cite><a href="http://std.dkuug.dk/jtc1/sc2/open/02n3333.pdf">ISO-8859-11: Information technology &mdash; 8-bit single-byte coded character sets &mdash; Part 11: Latin/Thai alphabet</a></cite>. ISO.</dd>
-
    <dt id="refsJLREQ">[JLREQ]</dt>
    <dd><cite><a href="http://www.w3.org/TR/jlreq/">Requirements for Japanese Text Layout</a></cite>. W3C.</dd> <!-- too many editors to list -->
 
@@ -116026,25 +116013,25 @@
    <dd><cite><a href="http://tools.ietf.org/html/rfc1321">The MD5 Message-Digest Algorithm</a></cite>, R. Rivest. IETF.</dd>
 
    <dt id="refsRFC1345">[RFC1345]</dt>
-   <dd><cite><a href="http://tools.ietf.org/html/rfc1345">Character Mnemonics and Character Sets</a></cite>, K. Simonsen. IETF.</dd>
+   <dd>(Non-normative) <cite><a href="http://tools.ietf.org/html/rfc1345">Character Mnemonics and Character Sets</a></cite>, K. Simonsen. IETF.</dd>
 
    <dt id="refsRFC1468">[RFC1468]</dt>
-   <dd><cite><a href="http://tools.ietf.org/html/rfc1468">Japanese Character Encoding for Internet Messages</a></cite>, J. Murai, M. Crispin, E. van der Poel. IETF.</dd>
+   <dd>(Non-normative) <cite><a href="http://tools.ietf.org/html/rfc1468">Japanese Character Encoding for Internet Messages</a></cite>, J. Murai, M. Crispin, E. van der Poel. IETF.</dd>
 
    <dt id="refsRFC1494">[RFC1494]</dt>
    <dd>(Non-normative) <cite><a href="http://tools.ietf.org/html/rfc1494">Equivalences between 1988 X.400 and RFC-822 Message Bodies</a></cite>, H. Alvestrand, S. Thompson. IETF.</dd>
 
    <dt id="refsRFC1554">[RFC1554]</dt>
-   <dd><cite><a href="http://tools.ietf.org/html/rfc1554">ISO-2022-JP-2: Multilingual Extension of ISO-2022-JP</a></cite>, M. Ohta, K. Handa. IETF.</dd>
+   <dd>(Non-normative) <cite><a href="http://tools.ietf.org/html/rfc1554">ISO-2022-JP-2: Multilingual Extension of ISO-2022-JP</a></cite>, M. Ohta, K. Handa. IETF.</dd>
 
    <dt id="refsRFC1557">[RFC1557]</dt>
-   <dd><cite><a href="http://tools.ietf.org/html/rfc1557">Korean Character Encoding for Internet Messages</a></cite>, U. Choi, K. Chon, H. Park. IETF.</dd>
+   <dd>(Non-normative) <cite><a href="http://tools.ietf.org/html/rfc1557">Korean Character Encoding for Internet Messages</a></cite>, U. Choi, K. Chon, H. Park. IETF.</dd>
 
    <dt id="refsRFC1842">[RFC1842]</dt>
-   <dd><cite><a href="http://tools.ietf.org/html/rfc1842">ASCII Printable Characters-Based Chinese Character Encoding for Internet Messages</a></cite>, Y. Wei, Y. Zhang, J. Li, J. Ding, Y. Jiang. IETF.</dd>
+   <dd>(Non-normative) <cite><a href="http://tools.ietf.org/html/rfc1842">ASCII Printable Characters-Based Chinese Character Encoding for Internet Messages</a></cite>, Y. Wei, Y. Zhang, J. Li, J. Ding, Y. Jiang. IETF.</dd>
 
    <dt id="refsRFC1922">[RFC1922]</dt>
-   <dd><cite><a href="http://tools.ietf.org/html/rfc1922">Chinese Character Encoding for Internet Messages</a></cite>, HF. Zhu, DY. Hu, ZG. Wang, TC. Kao, WCH. Chang, M. Crispin. IETF.</dd>
+   <dd>(Non-normative) <cite><a href="http://tools.ietf.org/html/rfc1922">Chinese Character Encoding for Internet Messages</a></cite>, HF. Zhu, DY. Hu, ZG. Wang, TC. Kao, WCH. Chang, M. Crispin. IETF.</dd>
 
    <dt id="refsRFC2045">[RFC2045]</dt>
    <dd><cite><a href="http://tools.ietf.org/html/rfc2045">Multipurpose Internet Mail Extensions (MIME) Part One: Format of Internet Message Bodies</a></cite>, N. Freed, N. Borenstein. IETF.</dd>
@@ -116056,7 +116043,7 @@
    <dd><cite><a href="http://tools.ietf.org/html/rfc2119">Key words for use in RFCs to Indicate Requirement Levels</a></cite>, S. Bradner. IETF.</dd>
 
    <dt id="refsRFC2237">[RFC2237]</dt>
-   <dd><cite><a href="http://tools.ietf.org/html/rfc2237">Japanese Character Encoding for Internet Messages</a></cite>, K. Tamaru. IETF.</dd>
+   <dd>(Non-normative) <cite><a href="http://tools.ietf.org/html/rfc2237">Japanese Character Encoding for Internet Messages</a></cite>, K. Tamaru. IETF.</dd>
 
    <dt id="refsRFC2246">[RFC2246]</dt>
    <dd><cite><a href="http://tools.ietf.org/html/rfc2246">The TLS Protocol Version 1.0</a></cite>, T. Dierks, C. Allen. IETF.</dd>
@@ -116079,9 +116066,6 @@
    <dt id="refsRFC2483">[RFC2483]</dt>
    <dd><cite><a href="http://tools.ietf.org/html/rfc2483">URI Resolution Services Necessary for URN Resolution</a></cite>, M. Mealling, R. Daniel. IETF.</dd>
 
-   <dt id="refsRFC2781">[RFC2781]</dt>
-   <dd><cite><a href="http://tools.ietf.org/html/rfc2781">UTF-16, an encoding of ISO 10646</a></cite>, P. Hoffman, F. Yergeau. IETF.</dd>
-
    <dt id="refsRFC3676">[RFC3676]</dt>
    <dd><cite><a href="http://tools.ietf.org/html/rfc3676">The Text/Plain Format and DelSp Parameters</a></cite>, R. Gellens. IETF.</dd>
 
@@ -116197,7 +116181,7 @@
    <dd><cite><a href="http://url.spec.whatwg.org/">URL</a></cite>, A. van Kesteren. WHATWG.</dd>
 
    <dt id="refsUTF7">[UTF7]</dt>
-   <dd><cite><a href="http://tools.ietf.org/html/rfc2152">UTF-7: A Mail-Safe Transformation Format of Unicode</a></cite>, D. Goldsmith, M. Davis. IETF.</dd>
+   <dd>(Non-normative) <cite><a href="http://tools.ietf.org/html/rfc2152">UTF-7: A Mail-Safe Transformation Format of Unicode</a></cite>, D. Goldsmith, M. Davis. IETF.</dd>
 
    <dt id="refsUTF8DET">[UTF8DET]</dt>
    <dd>(Non-normative) <cite><a href="http://www.w3.org/International/questions/qa-forms-utf-8">Multilingual form encoding</a></cite>, M. D&uuml;rst. W3C.</dd>
@@ -116208,9 +116192,6 @@
    <dt id="refsWCAG">[WCAG]</dt>
    <dd>(Non-normative) <cite><a href="http://www.w3.org/TR/WCAG20/">Web Content Accessibility Guidelines (WCAG) 2.0</a></cite>, B. Caldwell, M. Cooper, L. Reid, G. Vanderheiden. W3C.</dd>
 
-   <dt id="refsWEBADDRESSES">[WEBADDRESSES]</dt>
-   <dd><cite><a href="http://www.w3.org/html/wg/href/draft">Web addresses in HTML5</a></cite>, D. Connolly, C. Sperberg-McQueen.</dd>
-
    <dt id="refsWEBGL">[WEBGL]</dt>
    <dd><cite><a href="http://www.khronos.org/registry/webgl/specs/latest/">WebGL Specification</a></cite>, D. Jackson. Khronos Group.</dd>
 
