HTML Standard Tracker

Short URL: http://html5.org/r/4974

SVN: 4974
Bug: (none listed)
Comment: Move things more towards what people want (details to be sorted out later by change proposal, I expect).
Time (UTC): 2010-04-06 00:08
Index: source
===================================================================
--- source	(revision 4973)
+++ source	(revision 4974)
@@ -5072,39 +5072,44 @@
 
   <h4>Terminology</h4>
 
+  <!-- see also: svn diff -r3244:3245 source -->
+
   <p>A <dfn>URL</dfn> is a string used to identify a resource.</p>
 
-  <p>A <span>URL</span> is a <dfn>valid URL</dfn> if it is a
-  <span>valid Web address</span> as defined by the Web addresses
-  specification. <a href="#refsWEBADDRESSES">[WEBADDRESSES]</a></p>
+  <p>A <span>URL</span> is a <dfn>valid URL</dfn> if at least one of
+  the following conditions holds:</p>
 
-  <p>A <span>URL</span> is a <dfn>valid non-empty URL</dfn> if it is a
-  <span>valid URL</span> but it is not the empty string.</p>
+  <ul>
 
-  <p>A <span>URL</span> is an <dfn>absolute URL</dfn> if it is an
-  <span>absolute Web address</span> as defined by the Web addresses
-  specification. <a href="#refsWEBADDRESSES">[WEBADDRESSES]</a></p>
+   <li><p>The <span>URL</span> is a valid URI reference <a
+   href="#refsRFC3986">[RFC3986]</a>.</p></li>
 
-  <p>An <span>absolute URL</span> is a <dfn>hierarchical URL</dfn> if,
-  when <span title="parse a url">parsed</span>, there is a character
-  immediately after the <span title="url-scheme">&lt;scheme&gt;</span>
-  component and it is a U+002F SOLIDUS character (/).</p>
+   <li><p>The <span>URL</span> is a valid IRI reference and it has no
+   query component. <a href="#refsRFC3987">[RFC3987]</a></p></li>
 
-  <p>An <span>absolute URL</span> is an <dfn>authority-based URL</dfn>
-  if, when <span title="parse a url">parsed</span>, there are two
-  characters immediately after the <span
-  title="url-scheme">&lt;scheme&gt;</span> component and they are both
-  U+002F SOLIDUS characters (//).</p>
+   <li><p>The <span>URL</span> is a valid IRI reference and its query
+   component contains no unescaped non-ASCII characters. <a
+   href="#refsRFC3987">[RFC3987]</a></p></li>
 
+   <li><p>The <span>URL</span> is a valid IRI reference and the <span
+   title="document's character encoding">character encoding</span> of
+   the URL's <code>Document</code> is UTF-8 or UTF-16. <a
+   href="#refsRFC3987">[RFC3987]</a></p></li>
+
+  </ul>
+
+  <p>A <span>URL</span> is a <dfn>valid non-empty URL</dfn> if it is a
+  <span>valid URL</span> but it is not the empty string.</p>
+
   <div class="impl">
 
   <p>To <dfn>parse a URL</dfn> <var title="">url</var> into its
-  component parts, the user agent must use the <span>parse a Web
-  address</span> algorithm defined by the Web addresses
-  specification. <a href="#refsWEBADDRESSES">[WEBADDRESSES]</a></p>
+  component parts, the user agent must use the <span class="XXX">parse
+  an address</span> algorithm defined by the IRI specification. <a
+  href="#refsRFC3987">[RFC3987]</a></p>
 
-  <p>Parsing a URL results in the following components, again as
-  defined by the Web addresses specification:</p>
+  <p>Parsing a URL can fail. If it does not, then results in the
+  following components, again as defined by the IRI specification:</p>
 
   <ul class="brief">
    <li><dfn title="url-scheme">&lt;scheme&gt;</dfn></li>
@@ -5115,63 +5120,152 @@
    <li><dfn title="url-query">&lt;query&gt;</dfn></li>
    <li><dfn title="url-fragment">&lt;fragment&gt;</dfn></li>
    <li><dfn title="url-host-specific">&lt;host-specific&gt;</dfn></li>
-  </ul> 
+  </ul>
 
+  <hr>
+
   <p>To <dfn>resolve a URL</dfn> to an <span>absolute URL</span>
   relative to either another <span>absolute URL</span> or an element,
-  the user agent must use the <span>resolve a Web address</span>
-  algorithm defined by the Web addresses specification. <a
-  href="#refsWEBADDRESSES">[WEBADDRESSES]</a></p>
+  the user agent must use the following steps. Resolving a URL can
+  result in an error, in which case the URL is not resolvable.</p>
 
-  <p>The <dfn>document base URL</dfn> of a <code>Document</code>
-  object is the <span>absolute URL</span> obtained by running these
-  substeps:</p>
-
   <ol>
 
-   <li><p>Let <var title="">fallback base url</var> be <span>the
-   document's address</span>.</p></li>
+   <li><p>Let <var title="">url</var> be the <span>URL</span> being
+   resolved.</p></li>
 
    <li>
 
-    <!-- http://www.hixie.ch/tests/adhoc/html/navigation/javascript-url/ -->
+    <p>Let <var title="">encoding</var> be determined as follows:</p>
 
-    <!-- this should be tested in the case of a browsing context that
-    was navigated to about:blank after having been elsewhere, as
-    opposed to the about:blank used at the time of the browsing
-    context's creation. -->
+    <dl class="switch">
 
-    <p>If <var title="">fallback base url</var> is
-    <code>about:blank</code>, and the <code>Document</code>'s
-    <span>browsing context</span> has a <span>creator browsing
-    context</span>, then let <var title="">fallback base url</var>
-    be the <span>document base URL</span> of the <span>creator
-    <code>Document</code></span> instead.</p>
+     <dt>If the URL had a character encoding defined when the URL was
+     created or defined</dt>
 
+     <dd>The URL character encoding is as defined.</dd>
+
+     <dt>If the URL came from a script (e.g. as an argument to a
+     method)</dt>
+
+     <dd>The URL character encoding is the <span>script's URL character
+     encoding</span>.</dd>
+
+     <dt>If the URL came from a DOM node (e.g. from an element)</dt>
+
+     <dd>The node has a <code>Document</code>, and the URL character
+     encoding is the <span>document's character encoding</span>.</dd>
+
+    </dl>
+
    </li>
 
-   <li><p>If there is no <code>base</code> element that is both a
-   child of <span>the <code>head</code> element</span> and has an
-   <code title="attr-base-href">href</code> attribute, then the
-   <span>document base URL</span> is <var title="">fallback base
-   url</var>.</p></li>
+   <li><p>If <var title="">encoding</var> is a UTF-16 encoding, then
+   change the value of <var title="">encoding</var> to UTF-8.</p></li>
 
-   <li><p>Otherwise, let <var title="">url</var> be the value of the
-   <code title="attr-base-href">href</code> attribute of the first
-   such element.</p></li>
+   <li>
 
-   <li><p><span title="resolve a URL">Resolve</span> <var
-   title="">url</var> relative to <var title="">fallback base
-   url</var> (thus, the <code>base</code> <code
-   title="attr-base-href">href</code> attribute isn't affected by
-   <code title="attr-xml-base">xml:base</code> attributes).</p></li>
+    <p>If the algorithm was invoked with an <span>absolute URL</span>
+    to use as the base URL, let <var title="">base</var> be that
+    <span>absolute URL</span>.</p>
 
-   <li><p>The <span>document base URL</span> is the result of the
-   previous step if it was successful; otherwise it is <var
-   title="">fallback base url</var>.</p></li>
+    <p>Otherwise, let <var title="">base</var> be the <i>base URI of
+    the element</i>, as defined by the XML Base specification, with
+    <i>the base URI of the document entity</i> being defined as the
+    <span>document base URL</span> of the <code>Document</code> that
+    owns the element. <a href="#refsXMLBASE">[XMLBASE]</a></p>
 
+    <p>For the purposes of the XML Base specification, user agents
+    must act as if all <code>Document</code> objects represented XML
+    documents.</p>
+
+    <p class="note">It is possible for <code
+    title="attr-xml-base">xml:base</code> attributes to be present
+    even in HTML fragments, as such attributes can be added
+    dynamically using script. (Such scripts would not be conforming,
+    however, as <code title="attr-xml-base">xml:base</code> attributes
+    are not allowed in <span>HTML documents</span>.)</p>
+
+    <p>The <dfn>document base URL</dfn> of a <code>Document</code>
+    object is the <span>absolute URL</span> obtained by running these
+    substeps:</p>
+
+    <ol>
+
+     <li><p>Let <var title="">fallback base url</var> be <span>the
+     document's address</span>.</p></li>
+
+     <li>
+
+      <!-- http://www.hixie.ch/tests/adhoc/html/navigation/javascript-url/ -->
+
+      <!-- this should be tested in the case of a browsing context that
+      was navigated to about:blank after having been elsewhere, as
+      opposed to the about:blank used at the time of the browsing
+      context's creation. -->
+
+      <p>If <var title="">fallback base url</var> is
+      <code>about:blank</code>, and the <code>Document</code>'s
+      <span>browsing context</span> has a <span>creator browsing
+      context</span>, then let <var title="">fallback base url</var>
+      be the <span>document base URL</span> of the <span>creator
+      <code>Document</code></span> instead.</p>
+
+     </li>
+
+     <li><p>If there is no <code>base</code> element that is both a
+     child of <span>the <code>head</code> element</span> and has an
+     <code title="attr-base-href">href</code> attribute, then the
+     <span>document base URL</span> is <var title="">fallback base
+     url</var>.</p></li>
+
+     <li><p>Otherwise, let <var title="">url</var> be the value of the
+     <code title="attr-base-href">href</code> attribute of the first
+     such element.</p></li>
+
+     <li><p><span title="resolve a URL">Resolve</span> <var
+     title="">url</var> relative to <var title="">fallback base
+     url</var> (thus, the <code>base</code> <code
+     title="attr-base-href">href</code> attribute isn't affected by
+     <code title="attr-xml-base">xml:base</code> attributes).</p></li>
+
+     <li><p>The <span>document base URL</span> is the result of the
+     previous step if it was successful; otherwise it is <var
+     title="">fallback base url</var>.</p></li>
+
+    </ol>
+
+   </li>
+
+   <li><p>Return the result of applying the <span class="XXX">resolve
+   an address</span> algorithm defined by the IRI specification to
+   resolve <var title="">url</var> relative to <var
+   title="">base</var> using encoding <var title="">encoding</var>. <a
+   href="#refsRFC3987">[RFC3987]</a></p></li>
+
   </ol>
 
+  </div>
+
+  <p>A <span>URL</span> is an <dfn>absolute URL</dfn> if <span
+  title="resolve a url">resolving</span> it results in the same output
+  regardless of what it is resolved relative to, and that output is
+  not a failure.</p>
+
+  <p>An <span>absolute URL</span> is a <dfn>hierarchical URL</dfn> if,
+  when <span title="resolve a url">resolved</span> and then <span
+  title="parse a url">parsed</span>, there is a character immediately
+  after the <span title="url-scheme">&lt;scheme&gt;</span> component
+  and it is a U+002F SOLIDUS character (/).</p>
+
+  <p>An <span>absolute URL</span> is an <dfn>authority-based URL</dfn>
+  if, when <span title="resolve a url">resolved</span> and then <span
+  title="parse a url">parsed</span>, there are two characters
+  immediately after the <span title="url-scheme">&lt;scheme&gt;</span>
+  component and they are both U+002F SOLIDUS characters (//).</p>
+
+  <hr>
+
   <p>This specification defines the URL
   <dfn><code>about:legacy-compat</code></dfn> as a reserved, though
   unresolvable, <code title="">about:</code> URI, for use in <span
@@ -5187,8 +5281,6 @@
   title="attr-iframe-srcdoc">srcdoc</code> documents</span>. <a
   href="#refsABOUT">[ABOUT]</a></p>
 
-  </div>
-
   <p class="note">The term "URL" in this specification is used in a
   manner distinct from the precise technical meaning it is given in
   RFC 3986. Readers familiar with that RFC will find it easier to read

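Reviewer's sketch (not part of revision 4974): the encoding-selection steps of the new "resolve a URL" algorithm, and the "hierarchical URL" / "authority-based URL" definitions, can be illustrated roughly as below. This is a minimal sketch under stated assumptions, not the spec's algorithm: the function names and parameters are hypothetical, Python's `urllib.parse.urlsplit` stands in for the still-unspecified "parse an address" step, and "immediately after the `<scheme>` component" is assumed to mean immediately after the `:` delimiter that follows the scheme.

```python
from urllib.parse import urlsplit


def pick_encoding(declared=None, script_encoding=None, document_encoding=None):
    """Hypothetical helper: choose the URL character encoding.

    Mirrors the switch in the diff: an encoding defined with the URL wins,
    then the script's URL character encoding, then the document's encoding.
    """
    encoding = declared or script_encoding or document_encoding
    if encoding is None:
        raise ValueError("no encoding source available")
    # Following step: a UTF-16 encoding is changed to UTF-8.
    if encoding.lower().replace("_", "-").startswith("utf-16"):
        return "UTF-8"
    return encoding


def _after_scheme(url):
    """Return the text after 'scheme:' or None if no scheme is found."""
    scheme = urlsplit(url).scheme
    if not scheme:
        return None
    return url[len(scheme) + 1:]


def is_hierarchical(url):
    """Assumed reading: first character after 'scheme:' is U+002F (/)."""
    rest = _after_scheme(url)
    return rest is not None and rest.startswith("/")


def is_authority_based(url):
    """Assumed reading: first two characters after 'scheme:' are '//'."""
    rest = _after_scheme(url)
    return rest is not None and rest.startswith("//")
```

Under these assumptions, every authority-based URL is also hierarchical, while a URL such as `file:/tmp/x` is hierarchical only and `mailto:user@example.com` is neither. The sketch does not resolve the URL first, which the revised definitions require; it assumes the input is already an absolute URL.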