-
Notifications
You must be signed in to change notification settings - Fork 311
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix SSL Error #4825
Fix SSL Error #4825
Conversation
Auto-sync is disabled for draft pull requests in this repository. Workflows must be run manually. Contributors can view more details about this message here. |
/ok to test |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm supportive of using certifi
certs to support the dataset-downloading you need to do on Rocky Linux 8, but don't think we should be adding a new runtime dependency to the cugraph
Python package.
@@ -549,3 +553,13 @@ def create_directory_with_overwrite(directory): | |||
if os.path.exists(directory): | |||
shutil.rmtree(directory) | |||
os.makedirs(directory) | |||
|
|||
|
|||
def install_ssl_cert(): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Putting this here in cugraph.utilities
means that now certifi
needs to be introduced as a runtime dependency of cugraph
.
ModuleNotFoundError: No module named 'certifi'
I'm really nervous about the idea of introducing a new runtime dependency this long after code freeze and this close to the release.
And even just in general... does this really need to be in the cugraph
package? Are those code paths in datasets.py
intended to be used by downstream users, or are they just in the cugraph
package for convenience in its own testing (and testing for cugraph-gnn
libraries)?
Here in 24.12, to minimize risk, I think we should only run this at test time in CI here.
That'd probably mean:
- make
certifi
a test-only dependency - put this bit of Python code in a script like
ci/install-certifi-certs.py
- run that in each
ci/test_*
script, before any tests
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree that it is not a good time to add an additional runtime dependency..
The codepaths in datasets.py
is used both by end users and for our own testing purposes. You can think of them as analogous to NetworkX's built-in graphs
I don't think that the SSL issue would affect our users. I think this is just affecting CI because of how our images are setup. It would probably be fine to go with the solution you suggested and install it for our tests. I'll go ahead and add that
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok got it, thanks for the explanation! Then yeah, I think doing this only as a testing dependency is a good solution.
I tested whether this could just be run outside of of pytest
, like I suggested... and you're absolutely right, it can't. I guess something in this solution must also modify os.environ
or similar.
code I used to confirm that (click me)
docker run \
--rm \
-it rapidsai/citestwheel:cuda11.8.0-rockylinux8-py3.10 \
bash
python -c 'import urllib.request; urllib.request.urlretrieve("https://data.rapids.ai/cugraph/results/resultsets.tar.gz", "foo.tar.gz")'
# urllib.error.URLError: <urlopen error [SSL: CERTIFICATE_VERIFY_FAILED
cat > ./install-ca-certs.py <<EOF
import certifi
from ssl import create_default_context
from urllib.request import build_opener, HTTPSHandler, install_opener
ssl_context = create_default_context(cafile=certifi.where())
https_handler = HTTPSHandler(context=ssl_context)
install_opener(build_opener(https_handler))
EOF
python -m pip install certifi
python ./install-ca-certs.py
python -c 'import urllib.request; urllib.request.urlretrieve("https://data.rapids.ai/cugraph/results/resultsets.tar.gz", "foo.tar.gz")'
# urllib.error.URLError: <urlopen error [SSL: CERTIFICATE_VERIFY_FAILED
Given that, I think the way you've set this up as a testing-only dependency is a good solution. Thanks for working through it with me.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for helping track this down and brainstorming a solution with me.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I support this change, especially since you saw it fix the error locally. Thanks for finding a minimal reproducible example and then putting up a fix so quickly!
@@ -458,7 +459,7 @@ def download_all(force=False): | |||
filename = meta["name"] + meta["file_type"] | |||
save_to = default_download_dir.path / filename | |||
if not save_to.is_file() or force: | |||
urllib.request.urlretrieve(meta["url"], str(save_to)) | |||
urlretrieve(meta["url"], str(save_to)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you end up needing to push another change, I recommend reverting these import changes, which just look like stylistic changes to me (please correct me).
That'd make this whole file drop out of the diff, and make us even more confident that this is safe to merge this late in the release cycle.
But the changes look fine to me so don't push another commit and go through another round of CI just to revert this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, I don't believe I have any more changes to push as of now. The PR is just about to pass CI so let me know if you think it'll be better to still revert the changes to have as minimal of a diff has possible. Thanks!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For those finding this from search... we talked offline and agreed that given the time-sensitivity of this, it wasn't worth another round of CI to revert this change.
@@ -470,6 +470,7 @@ dependencies: | |||
common: | |||
- output_types: [conda, requirements] | |||
packages: | |||
- certifi |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@nv-rliu Can I ask why this was added to notebook dependencies? It looks like the package is only imported in tests, not notebooks.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, my apologies. I thought that we needed this package in the notebook tests because we also use the datasets API in the notebooks a lot. It may not be necessary after all..
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't know if "having certifi installed" is sufficient to avoid this issue or not. Were you seeing certificate failures on notebook jobs before? Are they resolved now?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You're right. Just having it installed doesn't actually do anything for the notebooks.
I am not seeing any SSL failures for the notebooks so we can remove this dependency. Should I go ahead and do that in 24.12 or should it be in 25.02?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
25.02 please!
Last night, [wheel-tests-cuml / 11.8.0, 3.10, arm64, rockylinux8, a100, latest-driver, oldest-deps](https://github.com/rapidsai/cuml/actions/runs/12273029767/job/34243006376#logs) failed. The failure occurred in `testing/dask/utils.py`'s `load_text_corpus`: https://github.com/rapidsai/cuml/blob/052cddef9648c3266974a0b43970afb347ba9d01/python/cuml/cuml/testing/dask/utils.py#L12 The errors looked like: ``` FAILED test_dask_naive_bayes.py::test_basic_fit_predict - urllib.error.URLError: <urlopen error [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: unable to get local issuer certificate (_ssl.c:1007)> FAILED test_dask_naive_bayes.py::test_single_distributed_exact_results - urllib.error.URLError: <urlopen error [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: unable to get local issuer certificate (_ssl.c:1007)> FAILED test_dask_naive_bayes.py::test_score - urllib.error.URLError: <urlopen error [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: unable to get local issuer certificate (_ssl.c:1007)> ``` This adopts some fixes from rapidsai/cugraph#4825 that will hopefully help with SSL certificate failures.
Addresses errors in `wheel-tests` seen [here](rapidsai#4818) Installs SSL certificates with `certifi` as outlined in this issue [here](rapidsai/build-infra#56)
Addresses errors in
wheel-tests
seen hereInstalls SSL certificates with
certifi
as outlined in this issue here