[giow] (0) Captions - Stage 9.1: More parser rules for WebSRT.
git-svn-id: http://svn.whatwg.org/webapps@5080 340c8d12-0b0e-0410-8428-c7bf67bfef74
Hixie committed May 5, 2010
1 parent 3a7eaea commit 02c2631
Showing 3 changed files with 99 additions and 24 deletions.
41 changes: 33 additions & 8 deletions complete.html
@@ -26175,6 +26175,12 @@ <h6 id=timed-track-api><span class=secno>4.8.10.10.4 </span>Timed track API</h6>

<p class=XXX>...

<!-- XXX
Make sure that .cues and .activeCues don't change while script is
running, except for addCue/removeCue and the removal of all cues in
the face of a dynamic track.src change.
-->

</div>


@@ -26254,22 +26260,41 @@ <h6 id=syntax-0><span class=secno>4.8.10.11.1 </span>Syntax</h6>

<h6 id=parsing-0><span class=secno>4.8.10.11.2 </span>Parsing</h6>

<p>A <dfn id=websrt-parser>WebSRT parser</dfn>, given an input byte stream, must
convert the bytes into Unicode characters by interpreting them as
UTF-8. Bytes or sequences of bytes that are not valid UTF-8
sequences must be interpreted as a U+FFFD REPLACEMENT CHARACTER. All
U+0000 NULL characters must be replaced by U+FFFD REPLACEMENT
CHARACTERs.</p>
<p>A <dfn id=websrt-parser>WebSRT parser</dfn>, given an input byte stream and a
<a href=#timed-track-list-of-cues>timed track list of cues</a> <var title="">output</var>,
must convert the bytes into a string of Unicode characters by
interpreting them as UTF-8, and then must parse the resulting string
according to the <a href=#websrt-parser-algorithm>WebSRT parser algorithm</a> below. A
<a href=#websrt-parser>WebSRT parser</a>, specifically its conversion and parsing
steps, is typically run asynchronously, with the input byte stream
being updated incrementally as the resource is downloaded.</p>

<p>When converting the bytes into Unicode characters, bytes or
sequences of bytes that are not valid UTF-8 sequences must be
interpreted as a U+FFFD REPLACEMENT CHARACTER, and all U+0000 NULL
characters must be replaced by U+FFFD REPLACEMENT CHARACTERs.</p>

<p>The Unicode characters from a string that must be parsed
according to the following algorithm:</p>
<p>The <dfn id=websrt-parser-algorithm>WebSRT parser algorithm</dfn> is as follows:</p>

<ol><li><p>Let <var title="">input</var> be the string being
parsed.</li>

<li><p>Let <var title="">position</var> be a pointer into <var title="">input</var>, initially pointing at the start of the
string.</li>

<li><p><a href=#collect-a-sequence-of-characters>Collect a sequence of characters</a> that are
either U+000D CARRIAGE RETURN (CR) or U+000A LINE FEED (LF)
characters.</li>

<li><p><a href=#collect-a-sequence-of-characters>Collect a sequence of characters</a> that are
<em>not</em> U+000D CARRIAGE RETURN (CR) or U+000A LINE FEED (LF)
characters. Let <var title="">line</var> be those
characters, if any.</li>

<li><p>If <var title="">line</var> is the empty string, then the
file has ended. Abort these steps. The <a href=#websrt-parser>WebSRT parser</a>
has finished.</li>

<li><p class=XXX>...</li>

</ol></div>
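
The conversion rules and the first five steps of the algorithm in this hunk can be sketched as follows. This is a minimal illustration in Python, not part of the commit: the names `decode_websrt_bytes`, `collect`, and `first_line` are hypothetical, and how many U+FFFD characters a malformed byte run produces is left to Python's `errors="replace"` handler, which the spec text does not pin down. The sketch stops where the diff's placeholder step (`...`) begins.

```python
from typing import Callable, Optional, Tuple


def decode_websrt_bytes(data: bytes) -> str:
    """Interpret bytes as UTF-8, mapping invalid sequences and NUL to U+FFFD."""
    # errors="replace" substitutes U+FFFD for byte sequences that are not valid UTF-8.
    text = data.decode("utf-8", errors="replace")
    # The spec additionally replaces every U+0000 NULL with U+FFFD.
    return text.replace("\x00", "\ufffd")


def collect(input: str, position: int,
            predicate: Callable[[str], bool]) -> Tuple[str, int]:
    """Collect characters matching predicate, starting at position."""
    start = position
    while position < len(input) and predicate(input[position]):
        position += 1
    return input[start:position], position


def first_line(input: str) -> Tuple[Optional[str], int]:
    """Steps 1-5: skip leading CR/LF, then read one line.

    Returns (None, position) when the line is empty, i.e. the file has ended.
    """
    position = 0  # step 2: pointer at the start of the string
    # Step 3: skip any run of CR or LF characters.
    _, position = collect(input, position, lambda c: c in "\r\n")
    # Step 4: collect everything up to the next CR or LF.
    line, position = collect(input, position, lambda c: c not in "\r\n")
    # Step 5: an empty line means the parser has finished.
    if line == "":
        return None, position
    return line, position
```

For example, `first_line(decode_websrt_bytes(b"\r\n\nhello\nworld"))` skips the three leading line terminators and yields `"hello"` with the pointer left just before the next LF.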
41 changes: 33 additions & 8 deletions index
@@ -26076,6 +26076,12 @@ interface <dfn id=timedtrackcue>TimedTrackCue</dfn> {

<p class=XXX>...

<!-- XXX
Make sure that .cues and .activeCues don't change while script is
running, except for addCue/removeCue and the removal of all cues in
the face of a dynamic track.src change.
-->

</div>


@@ -26155,22 +26161,41 @@ CueEvent

<h6 id=parsing-0><span class=secno>4.8.10.11.2 </span>Parsing</h6>

<p>A <dfn id=websrt-parser>WebSRT parser</dfn>, given an input byte stream, must
convert the bytes into Unicode characters by interpreting them as
UTF-8. Bytes or sequences of bytes that are not valid UTF-8
sequences must be interpreted as a U+FFFD REPLACEMENT CHARACTER. All
U+0000 NULL characters must be replaced by U+FFFD REPLACEMENT
CHARACTERs.</p>
<p>A <dfn id=websrt-parser>WebSRT parser</dfn>, given an input byte stream and a
<a href=#timed-track-list-of-cues>timed track list of cues</a> <var title="">output</var>,
must convert the bytes into a string of Unicode characters by
interpreting them as UTF-8, and then must parse the resulting string
according to the <a href=#websrt-parser-algorithm>WebSRT parser algorithm</a> below. A
<a href=#websrt-parser>WebSRT parser</a>, specifically its conversion and parsing
steps, is typically run asynchronously, with the input byte stream
being updated incrementally as the resource is downloaded.</p>

<p>When converting the bytes into Unicode characters, bytes or
sequences of bytes that are not valid UTF-8 sequences must be
interpreted as a U+FFFD REPLACEMENT CHARACTER, and all U+0000 NULL
characters must be replaced by U+FFFD REPLACEMENT CHARACTERs.</p>

<p>The Unicode characters from a string that must be parsed
according to the following algorithm:</p>
<p>The <dfn id=websrt-parser-algorithm>WebSRT parser algorithm</dfn> is as follows:</p>

<ol><li><p>Let <var title="">input</var> be the string being
parsed.</li>

<li><p>Let <var title="">position</var> be a pointer into <var title="">input</var>, initially pointing at the start of the
string.</li>

<li><p><a href=#collect-a-sequence-of-characters>Collect a sequence of characters</a> that are
either U+000D CARRIAGE RETURN (CR) or U+000A LINE FEED (LF)
characters.</li>

<li><p><a href=#collect-a-sequence-of-characters>Collect a sequence of characters</a> that are
<em>not</em> U+000D CARRIAGE RETURN (CR) or U+000A LINE FEED (LF)
characters. Let <var title="">line</var> be those
characters, if any.</li>

<li><p>If <var title="">line</var> is the empty string, then the
file has ended. Abort these steps. The <a href=#websrt-parser>WebSRT parser</a>
has finished.</li>

<li><p class=XXX>...</li>

</ol></div>
41 changes: 33 additions & 8 deletions source
@@ -28285,6 +28285,12 @@ interface <dfn>TimedTrackCue</dfn> {

<p class="XXX">...

<!-- XXX
Make sure that .cues and .activeCues don't change while script is
running, except for addCue/removeCue and the removal of all cues in
the face of a dynamic track.src change.
-->

</div>


@@ -28379,15 +28385,21 @@ CueEvent

<h6>Parsing</h6>

<p>A <dfn>WebSRT parser</dfn>, given an input byte stream, must
convert the bytes into Unicode characters by interpreting them as
UTF-8. Bytes or sequences of bytes that are not valid UTF-8
sequences must be interpreted as a U+FFFD REPLACEMENT CHARACTER. All
U+0000 NULL characters must be replaced by U+FFFD REPLACEMENT
CHARACTERs.</p>
<p>A <dfn>WebSRT parser</dfn>, given an input byte stream and a
<span>timed track list of cues</span> <var title="">output</var>,
must convert the bytes into a string of Unicode characters by
interpreting them as UTF-8, and then must parse the resulting string
according to the <span>WebSRT parser algorithm</span> below. A
<span>WebSRT parser</span>, specifically its conversion and parsing
steps, is typically run asynchronously, with the input byte stream
being updated incrementally as the resource is downloaded.</p>

<p>When converting the bytes into Unicode characters, bytes or
sequences of bytes that are not valid UTF-8 sequences must be
interpreted as a U+FFFD REPLACEMENT CHARACTER, and all U+0000 NULL
characters must be replaced by U+FFFD REPLACEMENT CHARACTERs.</p>

<p>The Unicode characters from a string that must be parsed
according to the following algorithm:</p>
<p>The <dfn>WebSRT parser algorithm</dfn> is as follows:</p>

<ol>

@@ -28398,6 +28410,19 @@ CueEvent
title="">input</var>, initially pointing at the start of the
string.</p></li>

<li><p><span>Collect a sequence of characters</span> that are
either U+000D CARRIAGE RETURN (CR) or U+000A LINE FEED (LF)
characters.</p></li>

<li><p><span>Collect a sequence of characters</span> that are
<em>not</em> U+000D CARRIAGE RETURN (CR) or U+000A LINE FEED (LF)
characters. Let <var title="">line</var> be those
characters, if any.</p></li>

<li><p>If <var title="">line</var> is the empty string, then the
file has ended. Abort these steps. The <span>WebSRT parser</span>
has finished.</p></li>

<li><p class="XXX">...</p></li>

</ol>
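
The new prose notes that the conversion step typically runs while the resource is still downloading, so a multi-byte UTF-8 sequence can be split across two network chunks. One way to handle that, sketched here under the assumption that chunks arrive as `bytes` objects, is Python's standard incremental decoder. The class name `IncrementalWebSRTDecoder` and its `feed` method are hypothetical and are not taken from the spec.

```python
import codecs


class IncrementalWebSRTDecoder:
    """Decode a UTF-8 byte stream chunk by chunk (illustrative only)."""

    def __init__(self) -> None:
        # The incremental decoder buffers an incomplete trailing sequence
        # until the next chunk arrives, instead of replacing it immediately.
        self._decoder = codecs.getincrementaldecoder("utf-8")(errors="replace")

    def feed(self, chunk: bytes, final: bool = False) -> str:
        """Return the characters that can be decoded so far.

        Pass final=True with the last chunk so that a truncated trailing
        sequence is turned into U+FFFD rather than held back.
        """
        text = self._decoder.decode(chunk, final)
        # U+0000 NULL is replaced by U+FFFD, as in the non-incremental case.
        return text.replace("\x00", "\ufffd")
```

Feeding the two bytes of `é` (`b"\xc3"` then `b"\xa9"`) in separate calls returns an empty string first and the complete character second, which is the behaviour a parser fed from a partially downloaded resource would need.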