HTML Standard Tracker

SVN | Bug | Comment | Time (UTC)
3853 | | Add note about why we strip all BOMs. | 2009-09-15 04:06
@@ -76637,20 +76637,26 @@ interface <dfn>MessagePort</dfn> {
   (e.g. invalid UTF-8 byte sequences in a UTF-8 input stream) are
   errors that conformance checkers are expected to report.</p>
 
   <p>Any byte or sequences of bytes in the original byte stream that
   is <span>misinterpreted for compatibility</span> is a <span>parse
   error</span>.</p>
 
   <p>One leading U+FEFF BYTE ORDER MARK character must be ignored if
   any are present.</p>
 
+  <p class="note">The requirement to strip a U+FEFF BYTE ORDER MARK
+  character regardless of whether that character was used to determine
+  the byte order is a <span>willful violation</span> of Unicode,
+  motivated by a desire to increase the resilience of user agents in
+  the face of na&iuml;ve transcoders.</p>
+
   <p>All U+0000 NULL characters in the input must be replaced by
   U+FFFD REPLACEMENT CHARACTERs. Any occurrences of such characters is
   a <span>parse error</span>.</p>
 
   <p>Any occurrences of any characters in the ranges U+0001 to U+0008,
   <!-- HT, LF allowed --> <!-- U+000B is in the next list --> <!-- FF,
   CR allowed --> U+000E to U+001F, <!-- ASCII allowed --> U+007F
   <!--to U+0084, (U+0085 NEL not allowed), U+0086--> to U+009F, U+D800
   to U+DFFF<!-- surrogates not allowed -->, U+FDD0 to U+FDEF, and
   characters U+000B, U+FFFE, U+FFFF, U+1FFFE, U+1FFFF, U+2FFFE,
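
As a rough illustration of two of the requirements quoted in the hunk above (ignore one leading U+FEFF, and replace every U+0000 with U+FFFD while treating each as a parse error), here is a minimal Python sketch. The function name `preprocess_input` and its return shape are assumptions made for this example and do not come from the specification. The sketch works on an already-decoded string and does not cover the other character ranges the hunk goes on to list.

```python
def preprocess_input(text: str) -> tuple[str, int]:
    """Hypothetical helper: return (cleaned text, count of NULL parse errors)."""
    # Ignore exactly one leading U+FEFF BYTE ORDER MARK, whether or not
    # it was actually used to determine the byte order.
    if text.startswith("\ufeff"):
        text = text[1:]

    # Each U+0000 NULL counts as one parse error and is replaced by
    # U+FFFD REPLACEMENT CHARACTER.
    parse_errors = text.count("\x00")
    text = text.replace("\x00", "\ufffd")

    return text, parse_errors


if __name__ == "__main__":
    cleaned, errors = preprocess_input("\ufeff<p>a\x00b</p>")
    print(repr(cleaned), errors)
```

Only the first U+FEFF is dropped; a second one immediately after it is kept, which matches the "one leading" wording in the quoted text.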
