-
Notifications
You must be signed in to change notification settings - Fork 3.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Upgrade vendored requests to 2.26.0 #10174
Conversation
https://github.com/psf/requests/blob/master/HISTORY.md#2260-2021-07-13 psf/requests@v2.25.1...v2.26.0 requests 2.26.0 allows using charset_normalizer instead of chardet, but I did not vendor the former because chardet is used by other parts of pip.
I just did a grep and I don't see where else pip uses chardet (except in the vendored html5lib, where it seems to be optional). Can you clarify where you think it's used? |
I meant the use in html5lib |
How hard would it be to patch html5lib to use charset_normalizer instead? I took a quick look and it seems that html5lib is using an interface not present in charset_normalizer, but I’m not faimilar with the internals of either to understand whether the patch would be difficult. |
I don't want to have this flare up into another licensing discussion, so I won't link to the original thread here, but upgrading requests to 2.26.0 is fine with me as a "normal" upgrade of a vendored dependency. If html5lib can't use charset_normalizer, then IMO it's up to the people for whom our vendoring of chardet is an issue to raise that with html5lib. I'm -1 on carrying patches to html5lib to do that, as it sucks us into that whole licensing argument (which is IMO rather toxic, if the requests and pip threads I've read are anything to go by). If simply dropping chardet as a dependency, and having html5lib fall back to whatever it does when chardet isn't available, is acceptable, then I'm fine with doing that as well (but it should probably be in a separate PR, as it introduces a potential behaviour change). |
It seems like this PR is not updating the vendoring metadata correctly. Could you take a look at the CI error and the documentaion on Vendoring Policy and see if you could fix the issue? |
Sorry, I fixed the problem |
src/pip/_vendor/requests/compat.py
Outdated
from pip._vendor import chardet | ||
try: | ||
from pip._vendor import chardet | ||
except ImportError: | ||
import charset_normalizer as chardet |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we patch this to remove the try-except block? I feel it's more predictable if we always import the vendored chardet.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've just patched it.
I thought about an alternative: substituting all charset_normalizer
imports with pip._vendor.charset_normalizer
using vendoring capabilities. This way we won't have to maintain the additional patches, and switching to charset_normalizer
will be smoother.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That sounds like a good idea. Would it be easier if I merge this first or can you do that in this PR?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On second thought, let’s have this in main so we can more easily evaluate this alternative solution.
https://github.com/psf/requests/blob/master/HISTORY.md#2260-2021-07-13
psf/requests@v2.25.1...v2.26.0
requests 2.26.0 allows using charset_normalizer instead of chardet, but I did not vendor the former because chardet is used by other parts of pip.