Fix: allow percent encoded code points in the non-special URLs host #218

Merged: 7 commits, Feb 8, 2017
63 changes: 33 additions & 30 deletions url.bs
@@ -360,9 +360,9 @@ context to be distinguished.

<h3 id=host-parsing>Host parsing</h3>

-<p>The <dfn id=concept-host-parser>host parser</dfn> takes a string <var>input</var> and
-an optional <var>Unicode flag</var> (unset unless stated otherwise), and then runs these
-steps:
+<p>The <dfn id=concept-host-parser>host parser</dfn> takes a string <var>input</var>, a boolean
+<var>isSpecial</var>, and an optional <var>Unicode flag</var> (unset unless stated otherwise), and
+then runs these steps:

<ol>
<li>
@@ -379,6 +379,9 @@ steps:
"<code>]</code>" removed.
</ol>

+<li><p>If <var>isSpecial</var> is false, then return the result of
+<a lt="opaque-host parser">opaque-host parsing</a> <var>input</var>.
+
<li>
<p>Let <var>domain</var> be the result of running <a>UTF-8 decode without BOM</a> on the
<a lt="percent decode">percent decoding</a> of <a>UTF-8 encode</a> on <var>input</var>.
@@ -449,8 +452,10 @@ The <dfn>IPv4 number parser</dfn> takes a string <var>input</var> and a
<!-- XXX well, you know, it works for ECMAScript, kinda -->
</ol>

-The <dfn id=concept-ipv4-parser>IPv4 parser</dfn> takes a string <var>input</var> and then
-runs these steps:
+<hr>
+
+<p>The <dfn id=concept-ipv4-parser>IPv4 parser</dfn> takes a string <var>input</var> and then runs
+these steps:

<ol>
<li><p>Let <var>syntaxViolationFlag</var> be unset.
@@ -521,6 +526,8 @@ runs these steps:
<li><p>Return <var>ipv4</var>.
</ol>

+<hr>
+
<p>The <dfn id=concept-ipv6-parser>IPv6 parser</dfn> takes a string <var>input</var> and
then runs these steps:

@@ -702,6 +709,23 @@ then runs these steps:
<a lt='IPv6 parser IPv4'>IPv4</a>, and <a lt='IPv6 parser Finale'>Finale</a> are markers. They serve
no purpose other than being a location the algorithm can jump to.

+<hr>
+
+<p>The <dfn export id=concept-opaque-host-parser>opaque-host parser</dfn> takes a string
+<var>input</var>, and then runs these steps:
+
+<ol>
+ <li><p>If <var>input</var> contains a <a>forbidden host code point</a> excluding "<code>%</code>",
+ <a>syntax violation</a>, return failure.
+
+ <li><p>Let <var>output</var> be the empty string.
+
+ <li><p>For each code point in <var>input</var>, <a>UTF-8 percent encode</a> it using the
+ <a>simple encode set</a>, and append the result to <var>output</var>.
+
+ <li><p>Return <var>output</var>.
+</ol>
+
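The opaque-host parser steps added in this hunk can be sketched in Python as follows. This is an illustrative sketch, not part of the patch. Two definitions are not shown in this diff and are assumptions here: the forbidden host code point set (taken to be U+0000, tab, LF, CR, space, and `# % / : ? @ [ \ ]`) and the simple encode set (taken to be C0 controls plus all code points above U+007E). The names `opaque_host_parse`, `utf8_percent_encode`, and `in_simple_encode_set` are hypothetical, and failure is modelled as `None`.

```python
# Assumption: the forbidden host code points at the time of this change.
# The set itself is defined elsewhere in url.bs and is not part of this diff.
FORBIDDEN_HOST_CODE_POINTS = set("\x00\t\n\r #%/:?@[\\]")


def in_simple_encode_set(ch: str) -> bool:
    # Assumption: C0 controls (U+0000 to U+001F) and everything above U+007E.
    return ord(ch) <= 0x1F or ord(ch) > 0x7E


def utf8_percent_encode(ch: str) -> str:
    # Code points outside the encode set pass through unchanged.
    if not in_simple_encode_set(ch):
        return ch
    # Otherwise emit each UTF-8 byte as %XX with uppercase hex digits.
    return "".join(f"%{byte:02X}" for byte in ch.encode("utf-8"))


def opaque_host_parse(host: str):
    # Step 1: any forbidden host code point other than "%" is a failure.
    if any(ch in FORBIDDEN_HOST_CODE_POINTS and ch != "%" for ch in host):
        return None
    # Steps 2 to 4: percent-encode code point by code point and concatenate.
    return "".join(utf8_percent_encode(ch) for ch in host)


if __name__ == "__main__":
    print(opaque_host_parse("ex%61mple"))  # "%" is allowed and left as-is
    print(opaque_host_parse("a b"))        # space is forbidden, so failure
    print(opaque_host_parse("\u00e9"))     # non-ASCII is percent-encoded
```

Under these assumptions, an existing percent-encoded sequence such as `%61` survives untouched, because "`%`" is exempt from the forbidden check and is not in the simple encode set. That matches the intent stated in the PR title.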

<h3 id=host-serializing>Host serializing</h3>

@@ -1235,26 +1259,6 @@ different document encoding. Using the <a>UTF-8</a> encoding everywhere solves t

<hr>

-<p>The <dfn export id=concept-url-host-parser>URL-host parser</dfn> takes a string <var>input</var>
-and a boolean <var>isSpecial</var>, and then runs these steps:</p>
-
-<ol>
- <li><p>If <var>isSpecial</var> is true, then return the result of
- <a lt="host parser">host parsing</a> <var>input</var>.
-
- <li><p>If <var>input</var> contains a <a>forbidden host code point</a>, <a>syntax violation</a>,
- return failure.
-
- <li><p>Let <var>output</var> be the empty string.
-
- <li><p>For each code point in <var>input</var>, <a>UTF-8 percent encode</a> it using the
- <a>simple encode set</a>, and append the result to <var>output</var>.
-
- <li><p>Return <var>output</var>.
-</ol>
-
-<hr>
-
<p>The <dfn export id=concept-basic-url-parser lt='basic URL parser'>basic URL parser</dfn> takes a
string <var>input</var>, optionally with a <a>base URL</a> <var>base</var>, optionally with an
<a for=/>encoding</a> <var>encoding override</var>, optionally with a <a for=/>URL</a>
@@ -1639,7 +1643,7 @@ string <var>input</var>, optionally with a <a>base URL</a> <var>base</var>, opti
<li><p>If <var>buffer</var> is the empty string, <a>syntax violation</a>, return failure.
<!-- No URLs with port, but without host. -->

-<li><p>Let <var>host</var> be the result of <a lt="URL-host parser">URL-host parsing</a>
+<li><p>Let <var>host</var> be the result of <a lt="host parser">host parsing</a>
<var>buffer</var> with <var>url</var> <a>is special</a>.

<li><p>If <var>host</var> is failure, then return failure.
@@ -1669,7 +1673,7 @@ string <var>input</var>, optionally with a <a>base URL</a> <var>base</var>, opti
<!-- http://? -> failure
test://? -> test://? -->

-<li><p>Let <var>host</var> be the result of <a lt="URL-host parser">URL-host parsing</a>
+<li><p>Let <var>host</var> be the result of <a lt="host parser">host parsing</a>
<var>buffer</var> with <var>url</var> <a>is special</a>.

<li><p>If <var>host</var> is failure, then return failure.
@@ -1863,9 +1867,8 @@ string <var>input</var>, optionally with a <a>base URL</a> <var>base</var>, opti
<p>Otherwise, run these steps:

<ol>
-<li><p>Let <var>host</var> be the result of
-<a lt='host parser'>host parsing</a>
-<var>buffer</var>.
+<li><p>Let <var>host</var> be the result of <a lt="host parser">host parsing</a>
+<var>buffer</var> with <var>url</var> <a>is special</a>.

<li><p>If <var>host</var> is failure, return failure.
