GH-125866: RFC8089 file URIs in urllib.request #126148

Closed
barneygale wants to merge 12 commits

@barneygale (Contributor) commented Oct 29, 2024

Adjust urllib.request.pathname2url() and url2pathname() to generate and accept file URIs as described in RFC8089.

pathname2url() gains a new include_scheme argument, which defaults to false. When set to true, the returned URL includes a file: prefix. url2pathname() now automatically removes a file: prefix if present.

On Windows, pathname2url() now generates URIs that begin with two slashes rather than four when given a UNC path.

On other platforms, pathname2url() now generates URIs that begin with three slashes rather than one when given an absolute path. url2pathname() now performs the opposite transformation, so file:///etc/hosts becomes /etc/hosts. Furthermore, url2pathname() now ignores local hosts (like "localhost" or any alias) and raises URLError for non-local hosts.

pathname2url() examples (star marks differences):

| Call | Previously | Now |
|---|---|---|
| `pathname2url('foo/bar')` | `foo/bar` | `foo/bar` |
| `pathname2url('/foo/bar')` | `/foo/bar` | `///foo/bar` * |
| `pathname2url('//foo/bar')` (POSIX) | `//foo/bar` | `////foo/bar` * |
| `pathname2url('//foo/bar')` (Windows) | `////foo/bar` | `//foo/bar` * |
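
A minimal sketch of the POSIX behaviour in the examples above, written as a standalone helper rather than the patched function. The name `sketch_pathname2url` is hypothetical, and the body assumes only what the description states: rooted paths gain an empty authority (`//`), and *include_scheme* prepends `file:`.

```python
from urllib.parse import quote


def sketch_pathname2url(path, include_scheme=False):
    """Hypothetical illustration of the proposed POSIX behaviour.

    This is not the code from the PR; it only mirrors the "now"
    column of the examples for POSIX paths.
    """
    # Percent-encode reserved characters; '/' is left alone by default.
    url = quote(path)
    if path.startswith('/'):
        # A rooted path gets an empty authority section, so
        # '/etc/hosts' becomes '///etc/hosts'.
        url = '//' + url
    if include_scheme:
        url = 'file:' + url
    return url


print(sketch_pathname2url('foo/bar'))
print(sketch_pathname2url('/foo/bar'))
print(sketch_pathname2url('/etc/hosts', include_scheme=True))
```

Relative paths pass through without an authority, which is why `foo/bar` is unchanged in both columns.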

url2pathname() examples (star marks differences):

| Call | Previously | Now |
|---|---|---|
| `url2pathname('foo/bar')` | `foo/bar` | `foo/bar` |
| `url2pathname('/foo/bar')` | `/foo/bar` | `/foo/bar` |
| `url2pathname('//foo/bar')` (POSIX) | `//foo/bar` | raises `URLError` * |
| `url2pathname('//foo/bar')` (Windows) | `//foo/bar` | `//foo/bar` |
| `url2pathname('///foo/bar')` | `///foo/bar` | `/foo/bar` * |
| `url2pathname('////foo/bar')` (POSIX) | `////foo/bar` | `//foo/bar` * |
| `url2pathname('////foo/bar')` (Windows) | `//foo/bar` | `//foo/bar` |
| `url2pathname('/////foo/bar')` (Windows) | `///foo/bar` | `//foo/bar` * |
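
The POSIX side of the reverse transformation can be sketched the same way. `sketch_url2pathname` and `LOCAL_HOSTS` are hypothetical names, and the fixed host set is a simplification: the description says any alias of the local host is ignored, which presumably needs a hostname lookup that this sketch leaves out.

```python
from urllib.error import URLError
from urllib.parse import unquote

# Assumption: only an empty authority and the literal "localhost"
# count as local here. Real alias detection is not attempted.
LOCAL_HOSTS = {'', 'localhost'}


def sketch_url2pathname(url):
    """Hypothetical illustration of the proposed POSIX behaviour."""
    # Drop the optional scheme prefix.
    if url.startswith('file:'):
        url = url[len('file:'):]
    if url.startswith('//'):
        # Everything between '//' and the next '/' is the authority.
        host, slash, rest = url[2:].partition('/')
        if host.lower() not in LOCAL_HOSTS:
            raise URLError(f'file URI refers to non-local host: {host!r}')
        url = slash + rest
    return unquote(url)


print(sketch_url2pathname('file:///etc/hosts'))
print(sketch_url2pathname('//localhost/etc/hosts'))
try:
    sketch_url2pathname('//foo/bar')
except URLError as exc:
    print('rejected:', exc.reason)
```

Under these assumptions `//foo/bar` is rejected because `foo` is read as a remote host, while `///foo/bar` has an empty authority and maps to `/foo/bar`.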

📚 Documentation preview 📚: https://cpython-previews--126148.org.readthedocs.build/

Adjust `urllib.request.pathname2url()` and `url2pathname()` to generate and
accept file URIs as described in RFC8089.

`pathname2url()` gains a new *include_scheme* argument, which defaults to
false. When set to true, the returned URL includes a `file:` prefix.

`url2pathname()` now automatically removes a `file:` prefix if present.

On Windows, `pathname2url()` now generates URIs that begin with two slashes
rather than four when given a UNC path.

On other platforms, `pathname2url()` now generates URIs that begin with
three slashes rather than one when given an absolute path. `url2pathname()`
now performs the opposite transformation, so `file:///etc/hosts` becomes
`/etc/hosts`. Furthermore, `url2pathname()` now ignores local hosts (like
"localhost" or any alias) and raises `URLError` for non-local hosts.

@barneygale (Contributor, Author) commented:

Test failures will go away when #125739 lands

barneygale added a commit to barneygale/cpython that referenced this pull request Nov 23, 2024
… POSIX paths

When handed an absolute Windows path such as `C:\foo` or `//server/share`,
the `urllib.request.pathname2url()` function returns a URL with an
authority section, such as `///C:/foo` or `//server/share` (or before
pythonGH-126205, `////server/share`). Only the `file:` prefix is omitted.

But when handed an absolute POSIX path such as `/etc/hosts`, or a Windows
path of the same form (rooted but lacking a drive), the function returns a
URL without an authority section, such as `/etc/hosts`.

This patch corrects the discrepancy by adding a `//` prefix before
drive-less, rooted paths when generating URLs.
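
The three cases in this commit message (drive path, UNC path, drive-less rooted path) can be sketched for Windows-style input as follows. `sketch_windows_pathname2url` is a hypothetical helper, not the patched function; it relies on `ntpath.splitdrive()` to tell the cases apart and does not attempt to handle drive-relative paths such as `C:foo`.

```python
import ntpath
from urllib.parse import quote


def sketch_windows_pathname2url(path):
    """Hypothetical illustration of where the '//' prefix is added."""
    path = path.replace('\\', '/')
    drive, tail = ntpath.splitdrive(path)
    if drive.startswith('//'):
        # UNC path: the server name already acts as the authority.
        return quote(drive + tail)
    if drive:
        # Drive letter: empty authority, then '/C:/...'. The drive is
        # not quoted so that ':' is preserved.
        return '///' + drive + quote(tail)
    if tail.startswith('/'):
        # Drive-less but rooted: add the empty authority, which is
        # the change described in the commit message.
        return '//' + quote(tail)
    # Relative path: no authority section.
    return quote(tail)


print(sketch_windows_pathname2url('C:\\foo'))
print(sketch_windows_pathname2url('//server/share'))
print(sketch_windows_pathname2url('/etc/hosts'))
```

With the last branch in place, every absolute input yields a URL that starts with `//`, so only the `file:` prefix is missing in each case.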
barneygale added a commit to barneygale/cpython that referenced this pull request Nov 24, 2024
@barneygale barneygale closed this Nov 24, 2024