Commit

[e] (0) Some more references to UTF-8.
git-svn-id: http://svn.whatwg.org/webapps@5258 340c8d12-0b0e-0410-8428-c7bf67bfef74
Hixie committed Aug 10, 2010
1 parent 3ef7450 commit 1d34c50
Showing 3 changed files with 36 additions and 31 deletions.
25 changes: 13 additions & 12 deletions complete.html
@@ -13408,12 +13408,12 @@ <h5 id=charset><span class=secno>4.2.5.5 </span>Specifying the document's charac
<a href=#ascii-compatible-character-encoding>ASCII-compatible character encoding</a>.</p>

<p>Authors are encouraged to use UTF-8. Conformance checkers may
-advise authors against using legacy encodings.</p>
+advise authors against using legacy encodings. <a href=#refsRFC3629>[RFC3629]</a></p>

<div class=impl>

<p>Authoring tools should default to using UTF-8 for newly-created
-documents.</p>
+documents. <a href=#refsRFC3629>[RFC3629]</a></p>

</div>

@@ -27759,7 +27759,7 @@ <h6 id=syntax-0><span class=secno>4.8.10.11.1 </span>Syntax</h6>

<p>A <dfn id=websrt-file>WebSRT file</dfn> must consist of a <a href=#websrt-file-body>WebSRT file
body</a> encoded as UTF-8 and labeled with the <a href=#mime-type>MIME
-type</a> <code><a href=#text/srt>text/srt</a></code>.</p>
+type</a> <code><a href=#text/srt>text/srt</a></code>. <a href=#refsRFC3629>[RFC3629]</a></p>

<p>A <dfn id=websrt-file-body>WebSRT file body</dfn> consists of zero or more <a href=#websrt-line-terminator title="WebSRT line terminator">WebSRT line terminators</a>,
followed by zero or more <a href=#websrt-cue title="WebSRT cue">WebSRT cues</a>
@@ -28027,7 +28027,7 @@ <h6 id=parsing-0><span class=secno>4.8.10.11.2 </span>Parsing</h6>
interpreting them as UTF-8, and then must parse the resulting string
according to the <a href=#websrt-parser-algorithm>WebSRT parser algorithm</a> below. This
results in <a href=#timed-track-cue title="timed track cue">timed track cues</a>
-being added to <var title="">output</var>.</p>
+being added to <var title="">output</var>. <a href=#refsRFC3629>[RFC3629]</a></p>

<p>A <a href=#websrt-parser>WebSRT parser</a>, specifically its conversion and
parsing steps, is typically run asynchronously, with the input byte
@@ -61630,7 +61630,7 @@ <h5 id=writing-cache-manifests><span class=secno>6.6.3.2 </span>Writing cache ma
encoded using UTF-8. Data in application cache manifests is
line-based. Newlines must be represented by U+000A LINE FEED (LF)
characters, U+000D CARRIAGE RETURN (CR) characters, or U+000D
-CARRIAGE RETURN (CR) U+000A LINE FEED (LF) pairs.</p>
+CARRIAGE RETURN (CR) U+000A LINE FEED (LF) pairs. <a href=#refsRFC3629>[RFC3629]</a></p>

<p class=note>This is a <a href=#willful-violation>willful violation</a> of two
aspects of RFC 2046, which requires all <code title="">text/*</code>
@@ -61790,7 +61790,7 @@ <h5 id=parsing-cache-manifests><span class=secno>6.6.3.3 </span>Parsing cache ma
a U+FFFD REPLACEMENT CHARACTER. <!--All U+0000 NULL characters must
be replaced by U+FFFD REPLACEMENT CHARACTERs. (this isn't black-box
testable since neither U+0000 nor U+FFFD are valid anywhere in the
-syntax and thus both will be treated the same anyway)--></li>
+syntax and thus both will be treated the same anyway)--> <a href=#refsRFC3629>[RFC3629]</a></li>

<li><p>Let <var title="">base URL</var> be the <a href=#absolute-url>absolute
URL</a> representing the manifest.</li>
@@ -70765,7 +70765,7 @@ <h4 id=processing-model-3><span class=secno>9.2.5 </span>Processing model</h4>
steps.</p>

<p>If the attempt succeeds, then convert the script resource to
-Unicode by assuming it was encoded as UTF-8, to obtain its <var title="">source</var>.</p>
+Unicode by assuming it was encoded as UTF-8, to obtain its <var title="">source</var>. <a href=#refsRFC3629>[RFC3629]</a></p>

<p>Let <var title="">language</var> be JavaScript.</p>

@@ -71510,7 +71510,7 @@ <h4 id=importing-scripts-and-libraries><span class=secno>9.3.1 </span>Importing
steps.</p>

<p>If the attempt succeeds, then convert the script resource to
-Unicode by assuming it was encoded as UTF-8, to obtain its <var title="">source</var>.</p>
+Unicode by assuming it was encoded as UTF-8, to obtain its <var title="">source</var>. <a href=#refsRFC3629>[RFC3629]</a></p>

<p>Let <var title="">language</var> be JavaScript.</p>

@@ -72091,7 +72091,7 @@ <h4 id=parsing-an-event-stream><span class=secno>10.2.4 </span>Parsing an event
; a Unicode character other than U+000A LINE FEED (LF) or U+000D CARRIAGE RETURN (CR)</pre>

<p>Event streams in this format must always be encoded as
-UTF-8.</p>
+UTF-8. <a href=#refsRFC3629>[RFC3629]</a></p>

<p>Lines must be separated by either a U+000D CARRIAGE RETURN U+000A
LINE FEED (CRLF) character pair, a single U+000A LINE FEED (LF)
@@ -72107,8 +72107,9 @@ <h4 id=parsing-an-event-stream><span class=secno>10.2.4 </span>Parsing an event

<h4 id=event-stream-interpretation><span class=secno>10.2.5 </span>Interpreting an event stream</h4>

-<p>Bytes or sequences of bytes that are not valid UTF-8 sequences
-must be interpreted as the U+FFFD REPLACEMENT CHARACTER.</p>
+<p>Streams must be decoded as UTF-8 text. Bytes or sequences of
+bytes that are not valid UTF-8 sequences must be interpreted as the
+U+FFFD REPLACEMENT CHARACTER. <a href=#refsRFC3629>[RFC3629]</a></p>

<p>One leading U+FEFF BYTE ORDER MARK character must be ignored if
any are present.</p>
@@ -78703,7 +78704,7 @@ <h5 id=determining-the-character-encoding><span class=secno>12.2.2.1 </span>Dete
<h5 id=character-encodings-0><span class=secno>12.2.2.2 </span>Character encodings</h5>

<p>User agents must at a minimum support the UTF-8 and Windows-1252
-encodings, but may support more.</p>
+encodings, but may support more. <a href=#refsRFC3629>[RFC3629]</a> <a href=#refsWIN1252>[WIN1252]</a></p>

<p class=note>It is not unusual for Web browsers to support dozens
if not upwards of a hundred distinct character encodings.</p>
14 changes: 7 additions & 7 deletions index
@@ -13332,12 +13332,12 @@ people expect to have work and what is necessary.
<a href=#ascii-compatible-character-encoding>ASCII-compatible character encoding</a>.</p>

<p>Authors are encouraged to use UTF-8. Conformance checkers may
-advise authors against using legacy encodings.</p>
+advise authors against using legacy encodings. <a href=#refsRFC3629>[RFC3629]</a></p>

<div class=impl>

<p>Authoring tools should default to using UTF-8 for newly-created
-documents.</p>
+documents. <a href=#refsRFC3629>[RFC3629]</a></p>

</div>

@@ -27686,7 +27686,7 @@ interface <dfn id=timedtrackcue>TimedTrackCue</dfn> {

<p>A <dfn id=websrt-file>WebSRT file</dfn> must consist of a <a href=#websrt-file-body>WebSRT file
body</a> encoded as UTF-8 and labeled with the <a href=#mime-type>MIME
-type</a> <code><a href=#text/srt>text/srt</a></code>.</p>
+type</a> <code><a href=#text/srt>text/srt</a></code>. <a href=#refsRFC3629>[RFC3629]</a></p>

<p>A <dfn id=websrt-file-body>WebSRT file body</dfn> consists of zero or more <a href=#websrt-line-terminator title="WebSRT line terminator">WebSRT line terminators</a>,
followed by zero or more <a href=#websrt-cue title="WebSRT cue">WebSRT cues</a>
@@ -27954,7 +27954,7 @@ interface <dfn id=timedtrackcue>TimedTrackCue</dfn> {
interpreting them as UTF-8, and then must parse the resulting string
according to the <a href=#websrt-parser-algorithm>WebSRT parser algorithm</a> below. This
results in <a href=#timed-track-cue title="timed track cue">timed track cues</a>
-being added to <var title="">output</var>.</p>
+being added to <var title="">output</var>. <a href=#refsRFC3629>[RFC3629]</a></p>

<p>A <a href=#websrt-parser>WebSRT parser</a>, specifically its conversion and
parsing steps, is typically run asynchronously, with the input byte
@@ -61566,7 +61566,7 @@ NETWORK:
encoded using UTF-8. Data in application cache manifests is
line-based. Newlines must be represented by U+000A LINE FEED (LF)
characters, U+000D CARRIAGE RETURN (CR) characters, or U+000D
-CARRIAGE RETURN (CR) U+000A LINE FEED (LF) pairs.</p>
+CARRIAGE RETURN (CR) U+000A LINE FEED (LF) pairs. <a href=#refsRFC3629>[RFC3629]</a></p>

<p class=note>This is a <a href=#willful-violation>willful violation</a> of two
aspects of RFC 2046, which requires all <code title="">text/*</code>
@@ -61726,7 +61726,7 @@ NETWORK:
a U+FFFD REPLACEMENT CHARACTER. <!--All U+0000 NULL characters must
be replaced by U+FFFD REPLACEMENT CHARACTERs. (this isn't black-box
testable since neither U+0000 nor U+FFFD are valid anywhere in the
-syntax and thus both will be treated the same anyway)--></li>
+syntax and thus both will be treated the same anyway)--> <a href=#refsRFC3629>[RFC3629]</a></li>

<li><p>Let <var title="">base URL</var> be the <a href=#absolute-url>absolute
URL</a> representing the manifest.</li>
@@ -71814,7 +71814,7 @@ interface <dfn id=messageport>MessagePort</dfn> {
<h5 id=character-encodings-0><span class=secno>10.2.2.2 </span>Character encodings</h5>

<p>User agents must at a minimum support the UTF-8 and Windows-1252
-encodings, but may support more.</p>
+encodings, but may support more. <a href=#refsRFC3629>[RFC3629]</a> <a href=#refsWIN1252>[WIN1252]</a></p>

<p class=note>It is not unusual for Web browsers to support dozens
if not upwards of a hundred distinct character encodings.</p>
28 changes: 16 additions & 12 deletions source
@@ -14071,12 +14071,13 @@ people expect to have work and what is necessary.
<span>ASCII-compatible character encoding</span>.</p>

<p>Authors are encouraged to use UTF-8. Conformance checkers may
-advise authors against using legacy encodings.</p>
+advise authors against using legacy encodings. <a
+href="#refsRFC3629">[RFC3629]</a></p>

<div class="impl">

<p>Authoring tools should default to using UTF-8 for newly-created
-documents.</p>
+documents. <a href="#refsRFC3629">[RFC3629]</a></p>

</div>

@@ -30126,7 +30127,7 @@ interface <dfn>TimedTrackCue</dfn> {

<p>A <dfn>WebSRT file</dfn> must consist of a <span>WebSRT file
body</span> encoded as UTF-8 and labeled with the <span>MIME
-type</span> <code>text/srt</code>.</p>
+type</span> <code>text/srt</code>. <a href="#refsRFC3629">[RFC3629]</a></p>

<p>A <dfn>WebSRT file body</dfn> consists of zero or more <span
title="WebSRT line terminator">WebSRT line terminators</span>,
@@ -30474,7 +30475,7 @@ interface <dfn>TimedTrackCue</dfn> {
interpreting them as UTF-8, and then must parse the resulting string
according to the <span>WebSRT parser algorithm</span> below. This
results in <span title="timed track cue">timed track cues</span>
-being added to <var title="">output</var>.</p>
+being added to <var title="">output</var>. <a href="#refsRFC3629">[RFC3629]</a></p>

<p>A <span>WebSRT parser</span>, specifically its conversion and
parsing steps, is typically run asynchronously, with the input byte
@@ -69633,7 +69634,7 @@ NETWORK:
encoded using UTF-8. Data in application cache manifests is
line-based. Newlines must be represented by U+000A LINE FEED (LF)
characters, U+000D CARRIAGE RETURN (CR) characters, or U+000D
-CARRIAGE RETURN (CR) U+000A LINE FEED (LF) pairs.</p>
+CARRIAGE RETURN (CR) U+000A LINE FEED (LF) pairs. <a href="#refsRFC3629">[RFC3629]</a></p>

<p class="note">This is a <span>willful violation</span> of two
aspects of RFC 2046, which requires all <code title="">text/*</code>
@@ -69816,7 +69817,7 @@ NETWORK:
a U+FFFD REPLACEMENT CHARACTER. <!--All U+0000 NULL characters must
be replaced by U+FFFD REPLACEMENT CHARACTERs. (this isn't black-box
testable since neither U+0000 nor U+FFFD are valid anywhere in the
-syntax and thus both will be treated the same anyway)--></p></li>
+syntax and thus both will be treated the same anyway)--> <a href="#refsRFC3629">[RFC3629]</a></p></li>

<li><p>Let <var title="">base URL</var> be the <span>absolute
URL</span> representing the manifest.</p></li>
@@ -79552,7 +79553,7 @@ interface <dfn>SharedWorkerGlobalScope</dfn> : <span>WorkerGlobalScope</span> {

<p>If the attempt succeeds, then convert the script resource to
Unicode by assuming it was encoded as UTF-8, to obtain its <var
-title="">source</var>.</p>
+title="">source</var>. <a href="#refsRFC3629">[RFC3629]</a></p>

<p>Let <var title="">language</var> be JavaScript.</p>

@@ -80425,7 +80426,7 @@ interface <dfn>WorkerUtils</dfn> {

<p>If the attempt succeeds, then convert the script resource to
Unicode by assuming it was encoded as UTF-8, to obtain its <var
-title="">source</var>.</p>
+title="">source</var>. <a href="#refsRFC3629">[RFC3629]</a></p>

<p>Let <var title="">language</var> be JavaScript.</p>

@@ -81105,7 +81106,7 @@ any-char = %x0000-0009 / %x000B-000C / %x000E-10FFFF
; a Unicode character other than U+000A LINE FEED (LF) or U+000D CARRIAGE RETURN (CR)</pre>

<p>Event streams in this format must always be encoded as
-UTF-8.</p>
+UTF-8. <a href="#refsRFC3629">[RFC3629]</a></p>

<p>Lines must be separated by either a U+000D CARRIAGE RETURN U+000A
LINE FEED (CRLF) character pair, a single U+000A LINE FEED (LF)
@@ -81121,8 +81122,9 @@ any-char = %x0000-0009 / %x000B-000C / %x000E-10FFFF

<h4 id="event-stream-interpretation">Interpreting an event stream</h4>

-<p>Bytes or sequences of bytes that are not valid UTF-8 sequences
-must be interpreted as the U+FFFD REPLACEMENT CHARACTER.</p>
+<p>Streams must be decoded as UTF-8 text. Bytes or sequences of
+bytes that are not valid UTF-8 sequences must be interpreted as the
+U+FFFD REPLACEMENT CHARACTER. <a href="#refsRFC3629">[RFC3629]</a></p>

<p>One leading U+FEFF BYTE ORDER MARK character must be ignored if
any are present.</p>
@@ -89841,7 +89843,9 @@ interface <dfn>SQLTransactionSync</dfn> {
<h5>Character encodings</h5>

<p>User agents must at a minimum support the UTF-8 and Windows-1252
-encodings, but may support more.</p>
+encodings, but may support more. <a
+href="#refsRFC3629">[RFC3629]</a> <a
+href="#refsWIN1252">[WIN1252]</a></p>

<p class="note">It is not unusual for Web browsers to support dozens
if not upwards of a hundred distinct character encodings.</p>

0 comments on commit 1d34c50