Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

basic-auth user and password are converted to lowercase when passed as part of "host" to URL.build #880

Closed
1 task done
starflows opened this issue Jun 5, 2023 · 11 comments · Fixed by #954
Closed
1 task done
Labels

Comments

@starflows
Copy link

Describe the bug

user and password casing is preserved when using the constructor:

>>> yarl.URL('httPS://usER:passWORD@hostNAME/paTH')
URL('https://usER:passWORD@hostname/paTH')
>>> yarl.URL('httPS://usER:passWORD@hostNAME/paTH').user
'usER'

user and password casing is lost when using URL.build:

>>> yarl.URL.build(scheme='httPS', host='usER:passWORD@hostNAME', path='/paTH')
URL('httPS://user:password@hostname/paTH')
>>> yarl.URL.build(scheme='httPS', host='usER:passWORD@hostNAME', path='/paTH').user
'user'

Note that the user & password are extracted from the string passed as host.

passing user and password explicitly also preserves the casing:

>>> yarl.URL.build(scheme='httPS', user='usER', password='passWORD', host='hostNAME', path='/paTH')
URL('httPS://usER:passWORD@hostname/paTH')
>>> yarl.URL.build(scheme='httPS', user='usER', password='passWORD', host='hostNAME', path='/paTH').user
'usER'

I'd expect either user and password to preserve casing when passed into host (preferably) or them not being extracted into the user and password attributes and instead remain in the lowercase and encoded host attribute.

maybe caused by #386

converting the host to lowercase should be done after extracting username / password.

To Reproduce

>>> yarl.URL.build(scheme='httPS', host='usER:passWORD@hostNAME', path='/paTH')
URL('httPS://user:password@hostname/paTH')
>>> yarl.URL.build(scheme='httPS', host='usER:passWORD@hostNAME', path='/paTH').user
'user'

Expected behavior

>>> yarl.URL.build(scheme='httPS', host='usER:passWORD@hostNAME', path='/paTH')
URL('httPS://user:password@hostname/paTH')
>>> yarl.URL.build(scheme='httPS', host='usER:passWORD@hostNAME', path='/paTH').user
'usER'

Logs/tracebacks

N/A

Python Version

$ python --version
Python 3.10.10

multidict Version

$ python -m pip show multidict
Name: multidict
Version: 6.0.4
Summary: multidict implementation
Home-page: https://github.com/aio-libs/multidict
Author: Andrew Svetlov
Author-email: andrew.svetlov@gmail.com
License: Apache 2
Location: /usr/lib/python3.10/site-packages
Requires: 
Required-by: aiohttp, httpie, yarl

yarl Version

$ python -m pip show yarl
Name: yarl
Version: 1.8.2
Summary: Yet another URL library
Home-page: https://github.com/aio-libs/yarl/
Author: Andrew Svetlov
Author-email: andrew.svetlov@gmail.com
License: Apache 2
Location: /home/stefan/.local/lib/python3.10/site-packages
Requires: idna, multidict
Required-by: aiohttp

OS

Arch Linux

Additional context

No response

Code of Conduct

  • I agree to follow the aio-libs Code of Conduct
@webknjaz
Copy link
Member

user and password casing is lost when using URL.build:

>>> yarl.URL.build(scheme='httPS', host='usER:passWORD@hostNAME', path='/paTH')
URL('httPS://user:password@hostname/paTH')
>>> yarl.URL.build(scheme='httPS', host='usER:passWORD@hostNAME', path='/paTH').user
'user'

Note that the user & password are extracted from the string passed as host.

passing user and password explicitly also preserves the casing:

>>> yarl.URL.build(scheme='httPS', user='usER', password='passWORD', host='hostNAME', path='/paTH')
URL('httPS://usER:passWORD@hostname/paTH')
>>> yarl.URL.build(scheme='httPS', user='usER', password='passWORD', host='hostNAME', path='/paTH').user
'usER'

I'd expect either user and password to preserve casing when passed into host (preferably) or them not being extracted into the user and password attributes and instead remain in the lowercase and encoded host attribute.

I don't think it's right to use the URL.build() constructor like that... Would would anyone pass username/password into the host?

@starflows
Copy link
Author

I agree @webknjaz . Currently it does not raise an error though. The values for basicauth user and password are lowercased and used. Could be something like that:

yarl.URL.build(host='usER:passWORD@hostNAME')
ValueError: host may contain only the ASCII letters 'a' through 'z' (case-insensitive), the digits '0' through '9', and the hyphen. Hostname labels cannot begin or end with a hyphen. No other symbols, punctuation characters, or blank spaces are permitted.

On the other hand it successfully extracts the basicauth user and password. It could do that before modifying the host to lowercase.

@webknjaz
Copy link
Member

I think it shouldn't be attempting to extract those fields. The argument is called host, after all. Not host_and_maybe_username_and_password.

cc @mjpieters @Dreamsorcerer WDYT?

@webknjaz
Copy link
Member

I checked that the docs don't advertise the ability to abuse the host argument with other stuff. So I'd rather deprecate it and stick an exception later on.

@Dreamsorcerer
Copy link
Member

Yes, that shouldn't be allowed in host. We have authority if you want to pass all of that in together (the equivalent of netloc in urllib.parse).

@Dreamsorcerer
Copy link
Member

>>> yarl.URL.build(scheme='httPS', authority='usER:passWORD@hostNAME', path='/paTH')
URL('httPS://usER:passWORD@hostname/paTH')

@mjpieters
Copy link
Contributor

Agreed. Let's explicitly deprecate parsing the host value as an authority string.

@mjpieters
Copy link
Contributor

Actually, there are 3 options:

  1. in cls._encode_host(), if the host value is not an IP (v4 or v6) address, but contains any of the gen-delims reserved characters (:/?#[]@) issue a deprecation warning (which should mention that authority exists), and in a future version, raise a ValueError.
  2. skip the deprecation warning phase and go straight to raising a ValueError.
  3. in URL.build, if hostname contains a @ character, and if authority, username andpassword are not given, assume that the user wanted authority instead and treat it accordingly. Raise a deprecation warning to state that the user should really use authority instead and in a future version drop this path and raise a ValueError in cls._encode_host() as in options 1 and 2.

I'm actually leaning towards option 2 here because I don't think using URL.build(..., host=...) with an authority string was ever documented as working. In other words: the issue here is that passing in a authority value to host should have triggered a validation issue instead of magically working.

@Dreamsorcerer
Copy link
Member

Given the existence of authority, and the naming and documentation being fairly clear, I would say option 2 would be fine. Few users should have made this mistake.

I imagine it could also indicate some possibility of a security issue or similar if the host value comes from user input. e.g. It screws things up if you were adding a user explicitly:

>>> yarl.URL.build(scheme='httPS', user='foo', host='usER:passWORD@hostNAME', >
URL('httPS://foo@user:password@hostname/paTH')

Compare to authority, where this is already correctly validated:

>>> yarl.URL.build(scheme='httPS', user='foo', authority='usER:passWORD@hostNA>
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/ubuntu/.local/lib/python3.10/site-packages/yarl/_url.py", line 233, in build
    raise ValueError(
ValueError: Can't mix "authority" with "user", "password", "host" or "port".

@Dreamsorcerer
Copy link
Member

Although I'd note that authority appears to have been added in v1.5 around 3 years ago, so maybe some risk with code that predates that.

@webknjaz
Copy link
Member

Fair. Option 2 seems the best way to go. Option 1 could help with the Hyrum's Law, though. But i don't care enough to insist. Either one would do.

By the way, I made the changelog fragment types more accurate so it can now accommodate for deprecations if we end up doing that.

I expect the following in any case:

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants