Fix SSL Error #4825

Merged Dec 11, 2024 (3 commits)

nv-rliu (Contributor) commented Dec 10, 2024

Addresses errors in wheel-tests seen here (rapidsai/cugraph#4818)

Installs SSL certificates with certifi as outlined in this issue here (rapidsai/build-infra#56)

copy-pr-bot bot commented Dec 10, 2024

Auto-sync is disabled for draft pull requests in this repository. Workflows must be run manually.

Contributors can view more details about this message here.

nv-rliu added this to the 24.12 milestone Dec 10, 2024
nv-rliu added the bug (Something isn't working) and breaking (Breaking change) labels Dec 10, 2024
nv-rliu changed the base branch from branch-24.12 to branch-25.02 December 10, 2024 21:30

nv-rliu (Contributor Author) commented Dec 10, 2024

/ok to test

nv-rliu requested a review from jameslamb December 10, 2024 21:36
nv-rliu changed the base branch from branch-25.02 to branch-24.12 December 10, 2024 21:47
nv-rliu marked this pull request as ready for review December 10, 2024 21:48
nv-rliu requested a review from a team as a code owner December 10, 2024 21:48

jameslamb (Member) left a comment

I'm supportive of using certifi certs to support the dataset-downloading you need to do on Rocky Linux 8, but don't think we should be adding a new runtime dependency to the cugraph Python package.

@@ -549,3 +553,13 @@ def create_directory_with_overwrite(directory):
     if os.path.exists(directory):
         shutil.rmtree(directory)
     os.makedirs(directory)
+
+
+def install_ssl_cert():

Member

Putting this here in cugraph.utilities means that now certifi needs to be introduced as a runtime dependency of cugraph.

ModuleNotFoundError: No module named 'certifi'

(build link)

I'm really nervous about the idea of introducing a new runtime dependency this long after code freeze and this close to the release.

And even just in general... does this really need to be in the cugraph package? Are those code paths in datasets.py intended to be used by downstream users, or are they just in the cugraph package for convenience in its own testing (and testing for cugraph-gnn libraries)?

Here in 24.12, to minimize risk, I think we should only run this at test time in CI here.

That'd probably mean:

  • make certifi a test-only dependency
  • put this bit of Python code in a script like ci/install-certifi-certs.py
  • run that in each ci/test_* script, before any tests

Contributor Author

I agree that it is not a good time to add an additional runtime dependency.

The code paths in datasets.py are used both by end users and for our own testing purposes. You can think of them as analogous to NetworkX's built-in graphs.

I don't think that the SSL issue would affect our users. I think this is just affecting CI because of how our images are set up. It would probably be fine to go with the solution you suggested and install it for our tests. I'll go ahead and add that.
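
For readers following along, here is a minimal sketch of how certifi can stay a test-only dependency: import it lazily, so that code without certifi installed is unaffected. The function name `install_certifi_opener` is hypothetical and is not the PR's `install_ssl_cert`; calling it from a test `conftest.py` is an assumption about where such a hook might live.

```python
import ssl
from typing import Optional
from urllib.request import HTTPSHandler, OpenerDirector, build_opener, install_opener


def install_certifi_opener() -> Optional[OpenerDirector]:
    """Install a process-wide urllib opener backed by certifi's CA bundle.

    Hypothetical helper: returns the installed opener, or None when certifi
    is not available (for example, outside the test environment).
    """
    try:
        # Imported lazily so certifi is only needed where this is called.
        import certifi
    except ImportError:
        return None

    # Trust the CA bundle shipped with certifi instead of the system store.
    context = ssl.create_default_context(cafile=certifi.where())
    opener = build_opener(HTTPSHandler(context=context))

    # Affects urllib.request.urlopen / urlretrieve in THIS process only.
    install_opener(opener)
    return opener


if __name__ == "__main__":
    result = install_certifi_opener()
    print("certifi opener installed" if result else "certifi not available, nothing changed")
```

Because `install_opener` only changes the current interpreter, a hook like this has to run inside the same process as the tests that download data.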

Member

Ok got it, thanks for the explanation! Then yeah, I think doing this only as a testing dependency is a good solution.

I tested whether this could just be run outside of pytest, like I suggested... and you're absolutely right, it can't. I guess something in this solution must also modify os.environ or similar.

Code I used to confirm that:
docker run \
    --rm \
    -it rapidsai/citestwheel:cuda11.8.0-rockylinux8-py3.10 \
    bash

python -c 'import urllib.request; urllib.request.urlretrieve("https://data.rapids.ai/cugraph/results/resultsets.tar.gz", "foo.tar.gz")'
# urllib.error.URLError: <urlopen error [SSL: CERTIFICATE_VERIFY_FAILED

cat > ./install-ca-certs.py <<EOF
import certifi
from ssl import create_default_context
from urllib.request import build_opener, HTTPSHandler, install_opener

ssl_context = create_default_context(cafile=certifi.where())
https_handler = HTTPSHandler(context=ssl_context)
install_opener(build_opener(https_handler))
EOF

python -m pip install certifi
python ./install-ca-certs.py

python -c 'import urllib.request; urllib.request.urlretrieve("https://data.rapids.ai/cugraph/results/resultsets.tar.gz", "foo.tar.gz")'
# urllib.error.URLError: <urlopen error [SSL: CERTIFICATE_VERIFY_FAILED

Given that, I think the way you've set this up as a testing-only dependency is a good solution. Thanks for working through it with me.
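
A likely explanation for the result above, offered as a sketch rather than a confirmed diagnosis: `urllib.request.install_opener` is documented as installing the default global opener, and that global lives only in the calling interpreter. A script that installs an opener and exits leaves nothing behind for the next `python` process. The check below uses the private module attribute `urllib.request._opener`, which is a CPython implementation detail and is used here for illustration only.

```python
import subprocess
import sys

# Child 1 mimics `python ./install-ca-certs.py`: it installs an opener and exits.
INSTALL_SNIPPET = (
    "import urllib.request as r\n"
    "r.install_opener(r.build_opener())\n"
    "print(r._opener is not None)\n"
)
# Child 2 mimics the later `python -c ...` download call: a fresh interpreter.
INSPECT_SNIPPET = (
    "import urllib.request as r\n"
    "print(r._opener is not None)\n"
)


def run_python(snippet: str) -> str:
    """Run a snippet in a new interpreter and return its stripped stdout."""
    result = subprocess.run(
        [sys.executable, "-c", snippet],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


installed_in_first = run_python(INSTALL_SNIPPET)
visible_in_second = run_python(INSPECT_SNIPPET)
print(installed_in_first, visible_in_second)
```

The first process reports an installed opener and the second does not, which is consistent with the certificate setup only taking effect when it runs in the same process as the tests.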

Contributor Author

Thank you for helping track this down and brainstorming a solution with me.

nv-rliu requested a review from a team as a code owner December 11, 2024 03:07
nv-rliu requested review from jameslamb and removed request for a team December 11, 2024 03:07
github-actions bot added the conda label Dec 11, 2024

jameslamb (Member) left a comment

I support this change, especially since you saw it fix the error locally. Thanks for finding a minimal reproducible example and then putting up a fix so quickly!

@@ -458,7 +459,7 @@ def download_all(force=False):
                 filename = meta["name"] + meta["file_type"]
                 save_to = default_download_dir.path / filename
                 if not save_to.is_file() or force:
-                    urllib.request.urlretrieve(meta["url"], str(save_to))
+                    urlretrieve(meta["url"], str(save_to))

Member

If you end up needing to push another change, I recommend reverting these import changes, which just look like stylistic changes to me (please correct me).

That'd make this whole file drop out of the diff, and make us even more confident that this is safe to merge this late in the release cycle.

But the changes look fine to me so don't push another commit and go through another round of CI just to revert this.
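
As a quick check of the "stylistic only" reading, the two spellings in the diff refer to the same function object, so either call goes through whatever opener is installed in the process. This is a minimal sketch using only the standard library.

```python
import urllib.request
from urllib.request import urlretrieve

# Both import styles bind the same function object; only the spelling at the
# call site differs.
same_function = urlretrieve is urllib.request.urlretrieve
print(same_function)
```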

Contributor Author

Hmm, I don't believe I have any more changes to push as of now. The PR is just about to pass CI, so let me know if you think it would still be better to revert the changes to keep the diff as minimal as possible. Thanks!

Member

For those finding this from search... we talked offline and agreed that given the time-sensitivity of this, it wasn't worth another round of CI to revert this change.

raydouglass merged commit 2d32c86 into rapidsai:branch-24.12 Dec 11, 2024
71 of 73 checks passed
@@ -470,6 +470,7 @@ dependencies:
     common:
       - output_types: [conda, requirements]
         packages:
+          - certifi

Contributor

@nv-rliu Can I ask why this was added to notebook dependencies? It looks like the package is only imported in tests, not notebooks.

Contributor Author

Ah, my apologies. I thought that we needed this package in the notebook tests because we also use the datasets API in the notebooks a lot. It may not be necessary after all.

Contributor

I don't know if "having certifi installed" is sufficient to avoid this issue or not. Were you seeing certificate failures on notebook jobs before? Are they resolved now?

Contributor Author

You're right. Just having it installed doesn't actually do anything for the notebooks.

I am not seeing any SSL failures for the notebooks so we can remove this dependency. Should I go ahead and do that in 24.12 or should it be in 25.02?

Contributor

25.02 please!

nv-rliu deleted the b24.12-fix-ssl-error branch December 11, 2024 14:55
raydouglass pushed a commit to rapidsai/cuml that referenced this pull request Dec 11, 2024
Last night, [wheel-tests-cuml / 11.8.0, 3.10, arm64, rockylinux8, a100,
latest-driver,
oldest-deps](https://github.com/rapidsai/cuml/actions/runs/12273029767/job/34243006376#logs)
failed.

The failure occurred in `testing/dask/utils.py`'s `load_text_corpus`:


https://github.com/rapidsai/cuml/blob/052cddef9648c3266974a0b43970afb347ba9d01/python/cuml/cuml/testing/dask/utils.py#L12

The errors looked like:

```
FAILED test_dask_naive_bayes.py::test_basic_fit_predict - urllib.error.URLError: <urlopen error [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: unable to get local issuer certificate (_ssl.c:1007)>
FAILED test_dask_naive_bayes.py::test_single_distributed_exact_results - urllib.error.URLError: <urlopen error [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: unable to get local issuer certificate (_ssl.c:1007)>
FAILED test_dask_naive_bayes.py::test_score - urllib.error.URLError: <urlopen error [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: unable to get local issuer certificate (_ssl.c:1007)>
```

This adopts some fixes from
rapidsai/cugraph#4825 that will hopefully help
with SSL certificate failures.
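
The `unable to get local issuer certificate` message quoted above generally means the interpreter could not find a CA certificate that chains to the server's certificate. One way to see where Python's `ssl` module looks for CA certificates on a given image is sketched below; it uses only the standard library, and the `report` dictionary and its keys are illustrative names, not part of either repository.

```python
import os
import ssl

# Locations OpenSSL was built to search, plus what actually exists on disk.
paths = ssl.get_default_verify_paths()

report = {
    # None when the default file or directory does not exist on this machine.
    "cafile_in_use": paths.cafile,
    "capath_in_use": paths.capath,
    # Compiled-in defaults, reported whether or not they exist.
    "openssl_cafile": paths.openssl_cafile,
    "openssl_capath": paths.openssl_capath,
    # Environment variables OpenSSL consults to override the defaults.
    "env_cafile": os.environ.get(paths.openssl_cafile_env),
    "env_capath": os.environ.get(paths.openssl_capath_env),
}

# Count CA certificates already loaded into a default context. Certificates in
# a capath directory are loaded lazily, so 0 here does not prove that
# verification will fail.
context = ssl.create_default_context()
report["preloaded_ca_certs"] = context.cert_store_stats()["x509_ca"]

for key, value in report.items():
    print(f"{key}: {value}")
```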
jameslamb pushed a commit to alexbarghi-nv/cugraph that referenced this pull request Dec 12, 2024
Addresses errors in `wheel-tests` seen
[here](rapidsai#4818)

Installs SSL certificates with `certifi` as outlined in this issue
[here](rapidsai/build-infra#56)
Labels: breaking (Breaking change), bug (Something isn't working), conda, python

4 participants