pythonGH-126148: pathname2url(): add authority section for absolute POSIX paths

When handed an absolute Windows path such as `C:\foo` or `//server/share`,
the `urllib.request.pathname2url()` function returns a URL with an
authority section, such as `///C:/foo` or `//server/share` (or before
pythonGH-126205, `////server/share`). Only the `file:` prefix is omitted.

But when handed an absolute POSIX path such as `/etc/hosts`, or a Windows
path of the same form (rooted but lacking a drive), the function returns a
URL without an authority section, such as `/etc/hosts`.

This patch corrects the discrepancy by adding a `//` prefix before
drive-less, rooted paths when generating URLs.
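
The POSIX side of the change can be illustrated with a minimal sketch. The function name `posix_pathname2url_sketch` is hypothetical, and the sketch calls `quote()` with its default UTF-8 encoding rather than the filesystem encoding the real function passes, so it approximates the patched behaviour rather than reproducing the library code:

```python
from urllib.parse import quote


def posix_pathname2url_sketch(pathname: str) -> str:
    """Hypothetical stand-in mirroring the patched POSIX branch."""
    if pathname[:1] == '/':
        # Absolute path: prepend '//' so the URL carries an authority
        # section with a zero-length authority.
        pathname = f'//{pathname}'
    # Simplification: default UTF-8 quoting instead of the filesystem
    # encoding and error handler used by urllib.request.
    return quote(pathname)


for path in ('/etc/hosts', '/', '/a/b%#c', 'etc/hosts'):
    print(f'{path!r} -> {posix_pathname2url_sketch(path)!r}')
```

Under these assumptions, absolute paths gain the `//` prefix (`/etc/hosts` becomes `///etc/hosts`), while the relative path `etc/hosts` is left without an authority section.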
barneygale committed Nov 23, 2024
1 parent cc813e1 commit 1b2d2e0
Showing 5 changed files with 32 additions and 17 deletions.
10 changes: 6 additions & 4 deletions Doc/library/urllib.request.rst
@@ -153,12 +153,14 @@ The :mod:`urllib.request` module defines the following functions:
    value will already be quoted using the :func:`~urllib.parse.quote` function.
 
    .. versionchanged:: 3.14
-      Windows drive letters are no longer converted to uppercase.
+      Paths beginning with a slash are converted to URLs with authority
+      sections. For example, the path ``/etc/hosts`` is converted to
+      the URL ``///etc/hosts``.
 
    .. versionchanged:: 3.14
-      On Windows, ``:`` characters not following a drive letter are quoted. In
-      previous versions, :exc:`OSError` was raised if a colon character was
-      found in any position other than the second character.
+      Windows drive letters are no longer converted to uppercase, and ``:``
+      characters not following a drive letter no longer cause an
+      :exc:`OSError` exception to be raised on Windows.
 
 
 .. function:: url2pathname(path)
20 changes: 12 additions & 8 deletions Lib/nturl2path.py
@@ -55,13 +55,17 @@ def pathname2url(p):
         p = p[4:]
         if p[:4].upper() == 'UNC/':
             p = '//' + p[4:]
-    drive, tail = ntpath.splitdrive(p)
-    if drive[1:] == ':':
-        # DOS drive specified. Add three slashes to the start, producing
-        # an authority section with a zero-length authority, and a path
-        # section starting with a single slash.
-        drive = f'///{drive}'
+    drive, root, tail = ntpath.splitroot(p)
+    if drive:
+        if drive[1:] == ':':
+            # DOS drive specified. Add three slashes to the start, producing
+            # an authority section with a zero-length authority, and a path
+            # section starting with a single slash.
+            drive = f'///{drive}'
+        drive = urllib.parse.quote(drive, safe='/:')
+    elif root:
+        # Path has a root but no drive. Add an authority section.
+        root = f'//{root}'
 
-    drive = urllib.parse.quote(drive, safe='/:')
     tail = urllib.parse.quote(tail)
-    return drive + tail
+    return drive + root + tail
10 changes: 5 additions & 5 deletions Lib/test/test_urllib.py
@@ -1434,7 +1434,7 @@ def test_pathname2url_win(self):
         self.assertEqual(fn('C:\\foo:bar'), '///C:/foo%3Abar')
         self.assertEqual(fn('foo:bar'), 'foo%3Abar')
         # No drive letter
-        self.assertEqual(fn("\\folder\\test\\"), '/folder/test/')
+        self.assertEqual(fn("\\folder\\test\\"), '///folder/test/')
         self.assertEqual(fn("\\\\folder\\test\\"), '//folder/test/')
         self.assertEqual(fn("\\\\\\folder\\test\\"), '///folder/test/')
         self.assertEqual(fn('\\\\some\\share\\'), '//some/share/')
@@ -1447,7 +1447,7 @@ def test_pathname2url_win(self):
         self.assertEqual(fn('//?/unc/server/share/dir'), '//server/share/dir')
         # Round-tripping
         urls = ['///C:',
-                '/folder/test/',
+                '///folder/test/',
                 '///C:/foo/bar/spam.foo']
         for url in urls:
             self.assertEqual(fn(urllib.request.url2pathname(url)), url)
@@ -1456,9 +1456,9 @@ def test_pathname2url_win(self):
                      'test specific to POSIX pathnames')
     def test_pathname2url_posix(self):
         fn = urllib.request.pathname2url
-        self.assertEqual(fn('/'), '/')
-        self.assertEqual(fn('/a/b.c'), '/a/b.c')
-        self.assertEqual(fn('/a/b%#c'), '/a/b%25%23c')
+        self.assertEqual(fn('/'), '///')
+        self.assertEqual(fn('/a/b.c'), '///a/b.c')
+        self.assertEqual(fn('/a/b%#c'), '///a/b%25%23c')
 
     @unittest.skipUnless(os_helper.FS_NONASCII, 'need os_helper.FS_NONASCII')
     def test_pathname2url_nonascii(self):
4 changes: 4 additions & 0 deletions Lib/urllib/request.py
@@ -1667,6 +1667,10 @@ def url2pathname(pathname):
 def pathname2url(pathname):
     """OS-specific conversion from a file system path to a relative URL
     of the 'file' scheme; not recommended for general use."""
+    if pathname[:1] == '/':
+        # Absolute path supplied. Add an authority section with a
+        # zero-length authority.
+        pathname = f'//{pathname}'
     encoding = sys.getfilesystemencoding()
     errors = sys.getfilesystemencodeerrors()
     return quote(pathname, encoding=encoding, errors=errors)
@@ -0,0 +1,5 @@
:func:`urllib.request.pathname2url` now adds an empty authority when
generating a URL for an absolute POSIX path. For example, the path
``/etc/hosts`` is converted to the scheme-less URL ``///etc/hosts``. As a
result of this change, URLs without authorities are only generated for
relative paths.
