HTML5 Tracker

Short URL: http://html5.org/r/5079

SVN | Bug | Comment | Time (UTC)
5079 | | [Gecko] [Internet Explorer] [Opera] [Webkit] Captions - Stage 9: The beginnings of the syntax and parser rules for WebSRT. | 2010-05-05 20:23
Index: source
===================================================================
--- source	(revision 5078)
+++ source	(revision 5079)
@@ -28303,16 +28303,108 @@
   <p>The WebSRT format (Web Subtitle Resource Tracks) is a format
   intended for marking up external timed track resources.</p>
 
+
   <h6>Syntax</h6>
 
-  <p class="XXX">...
+  <p>A <dfn>WebSRT file</dfn> must consist of a <span>WebSRT file
+  body</span> encoded as UTF-8.</p>
 
+  <p>A <dfn>WebSRT file body</dfn> consists of zero or more <span
+  title="WebSRT cue">WebSRT cues</span> separated from each other by
+  two or more <span title="WebSRT line terminator">WebSRT line
+  terminators</span>.</p>
+
+  <p>A <dfn>WebSRT cue</dfn> consists of the following components, in
+  the given order:</p>
+
+  <ol>
+   <li>Optionally, a <span>WebSRT cue identifier</span>.</li>
+   <li><span>WebSRT cue timings</span>.</li>
+   <li>Optionally, <span>WebSRT cue settings</span>.</li>
+   <li>A <span>WebSRT line terminator</span>.</li>
+   <li>Optionally, a <span>WebSRT voice declaration</span>.</li>
+   <li>One or more <span title="WebSRT cue text line">WebSRT cue text lines</span>, each separated from the next by a <span>WebSRT line terminator</span>.</li>
+   <li>Zero or more <span title="WebSRT line terminator">WebSRT line terminators</span>.</li>
+  </ol>
+
+  <p>A <dfn>WebSRT line terminator</dfn> consists of one of the
+  following:</p>
+
+  <ul class="brief">
+   <li>A U+000D CARRIAGE RETURN U+000A LINE FEED (CRLF) character pair.</li>
+   <li>A single U+000A LINE FEED (LF) character.</li>
+   <li>A single U+000D CARRIAGE RETURN (CR) character.</li>
+  </ul>
+
+  <p>A <dfn>WebSRT cue identifier</dfn> is any sequence of one or more
+  characters not containing the substring "<code title="">--></code>"
+  (U+002D HYPHEN-MINUS, U+002D HYPHEN-MINUS, U+003E GREATER-THAN
+  SIGN).</p>
+
+  <p>The <dfn>WebSRT cue timings</dfn> part of a <span>WebSRT
+  cue</span> consists of the following components, in the given
+  order:</p>
+
+  <ol>
+
+   <li>A <span>WebSRT timestamp</span> representing the start time
+   offset of the cue.</li>
+
+   <li>Optionally, a U+0020 SPACE character.</li>
+
+   <li>The string "<code title="">--></code>" (U+002D HYPHEN-MINUS,
+   U+002D HYPHEN-MINUS, U+003E GREATER-THAN SIGN).</li>
+
+   <li>Optionally, a U+0020 SPACE character.</li>
+
+   <li>A <span>WebSRT timestamp</span> representing the end time
+   offset of the cue.</li>
+
+  </ol>
+
+  <p>The <dfn>WebSRT cue settings</dfn> part of a <span>WebSRT
+  cue</span> consists of the following components, in the given
+  order:</p>
+
+  <ol>
+
+   <li class="XXX">...
+
+  </ol>
+
+  <p class="XXX"><dfn>WebSRT voice declaration</dfn>; <dfn>WebSRT cue text line</dfn>; <dfn>WebSRT timestamp</dfn></p>
+
+
+  <div class="impl">
+
   <h6>Parsing</h6>
 
-  <p class="XXX">...
+  <p>A <dfn>WebSRT parser</dfn>, given an input byte stream, must
+  convert the bytes into Unicode characters by interpreting them as
+  UTF-8. Bytes or sequences of bytes that are not valid UTF-8
+  sequences must be interpreted as a U+FFFD REPLACEMENT CHARACTER. All
+  U+0000 NULL characters must be replaced by U+FFFD REPLACEMENT
+  CHARACTERs.</p>
 
+  <p>The Unicode characters form a string that must be parsed
+  according to the following algorithm:</p>
 
+  <ol>
 
+   <li><p>Let <var title="">input</var> be the string being
+   parsed.</p></li>
+
+   <li><p>Let <var title="">position</var> be a pointer into <var
+   title="">input</var>, initially pointing at the start of the
+   string.</p></li>
+
+   <li><p class="XXX">...</p></li>
+
+  </ol>
+
+  </div>
+
+
   <h5>User interface</h5>
 
   <p>The <dfn title="attr-media-controls"><code>controls</code></dfn>
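
Illustration (not part of the revision): the decoding rule and the cue separation rule added in this diff can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the parser the spec will eventually define, since the parsing algorithm is still a placeholder in this revision. The names decode_websrt, split_cues and LINE_TERMINATOR are hypothetical. Python's "replace" error handler may emit a different number of U+FFFD characters for a malformed byte sequence than a conforming implementation would.

```python
import re

# Hypothetical name: matches any WebSRT line terminator.
# CRLF is listed first so a CR LF pair counts as one terminator, not two.
LINE_TERMINATOR = re.compile(r"\r\n|\n|\r")


def decode_websrt(data):
    """Decode bytes as UTF-8; invalid sequences and U+0000 become U+FFFD."""
    text = data.decode("utf-8", errors="replace")
    return text.replace("\x00", "\ufffd")


def split_cues(body):
    """Split a file body into cues, each returned as a list of its lines.

    Assumption: two or more consecutive line terminators (one or more
    empty lines) separate cues, as the syntax section states.
    """
    cues = []
    current = []
    for line in LINE_TERMINATOR.split(body):
        if line == "":
            # An empty line ends the cue collected so far, if any.
            if current:
                cues.append(current)
                current = []
        else:
            current.append(line)
    if current:
        cues.append(current)
    return cues
```

The sketch stops at splitting: timestamps, cue settings and voice declarations are still marked XXX in this revision, so it makes no attempt to interpret them.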
