Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

pylibcugraph: declare cupy and numpy hard dependencies #4854

Open
wants to merge 8 commits into
base: branch-25.02
Choose a base branch
from

Conversation

jameslamb
Copy link
Member

@jameslamb jameslamb commented Jan 8, 2025

While testing stuff for #4804, I found that pylibcugraph has a hard runtime dependency on cupy and numpy, but isn't declaring them

docker run \
    --rm \
    --gpus 0 \
    -it rapidsai/ci-wheel:latest \
    bash

python -m pip install 'pylibcugraph-cu12==25.2.*,>=0.0.0a0'
python -c "import pylibcugraph"
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/pyenv/versions/3.12.7/lib/python3.12/site-packages/pylibcugraph/__init__.py", line 18, in <module>
    from pylibcugraph.graphs import SGGraph, MGGraph
  File "graphs.pyx", line 1, in init pylibcugraph.graphs
  File "utils.pyx", line 20, in init pylibcugraph.utils
ModuleNotFoundError: No module named 'cupy'

This declares those dependencies.

It also promotes cugraph-service-server's numpy dependency to a hard runtime dependency.

Notes for Reviewers

Evidence that pylibcugraph already has a hard dependency on these libraries

They're used unconditionally here:

import numpy
import cupy

But have import guards in other places:

# FIXME: import these modules here for now until a better pattern can be
# used for optional imports (perhaps 'import_optional()' from cugraph), or
# these are made hard dependencies.
try:
import cupy
except ModuleNotFoundError:
raise RuntimeError("sssp requires the cupy package, which could not "
"be imported")
try:
import numpy
except ModuleNotFoundError:
raise RuntimeError("sssp requires the numpy package, which could not "
"be imported")

So this PR doesn't introduce new hard dependencies... it just makes them explicit, to make it easier to install and run pylibcugraph.

How was this not caught in CI?

Import tests aren't run here for conda packages, because conda builds happen on CPU-only nodes.

rapids-conda-retry mambabuild \
--no-test \
--channel "${CPP_CHANNEL}" \
conda/recipes/pylibcugraph

And numpy and cupy are probably getting pulled in by some of the wheels' test dependencies, like cudf, here:

python -m pip install $(ls ./dist/${python_package_name}*.whl)[test]

Should we just make the other unconditional cases conditional with try-catching?

No. Talked with @rlratzel, @ChuckHastings, and @eriknw offline, and we agreed to declare these as hard runtime dependencies (and remove the try-catching in places that had it).

@jameslamb jameslamb added bug Something isn't working non-breaking Non-breaking change labels Jan 8, 2025

This comment was marked as resolved.

@jameslamb jameslamb changed the title WIP: pylibcugraph: declare cupy dependency WIP: pylibcugraph: declare cupy and numpy hard dependencies Jan 9, 2025
@jameslamb jameslamb changed the title WIP: pylibcugraph: declare cupy and numpy hard dependencies pylibcugraph: declare cupy and numpy hard dependencies Jan 10, 2025
@jameslamb jameslamb marked this pull request as ready for review January 10, 2025 14:57
@jameslamb jameslamb requested review from a team as code owners January 10, 2025 14:57
@jameslamb jameslamb requested a review from AyodeAwe January 10, 2025 14:57
@@ -3,6 +3,7 @@
[build-system]

requires = [
"numpy>=1.23,<3.0a0",
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cugraph-service-server should have a hard runtime dependency on numpy, because of this:

its conda recipe does:

- numpy >=1.23,<3.0a0

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

agreed, thanks for being so thorough!

Copy link
Contributor

@rlratzel rlratzel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks!

@@ -3,6 +3,7 @@
[build-system]

requires = [
"numpy>=1.23,<3.0a0",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

agreed, thanks for being so thorough!

@@ -174,6 +178,7 @@ files:
extras:
table: build-system
includes:
- depends_on_numpy
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since numpy doesn’t have any matrices for CUDA version or any extra index URLs, I would treat it as a normal dependency and add it to the corresponding lists. We don’t use depends_on_numpy in other repos because it doesn’t reduce complexity. You could use a YAML anchor for deduplication of the pinning declaration if desired.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok sure.

That is not how I've been thinking about when to do these depends_on_* lists. At https://github.com/rapidsai/build-planning/blob/d9e3c606d95c835ee384ac6480a4af0ac6cb024a/docs/docs/packaging.md#L181 we have this:

Dependencies appearing in several lists should be in their own standalone depends_on_{project} lists.

I can update those docs with this added nuance, I don't disagree with you.

Pushed 6a83f20 here removing depends_on_numpy in favor of anchors. That revealed that there were some other issues in cugrpah-service-server's [test] extra... it was pulling through all of cugraph's testing dependencies, which it doesn't need based on my read of https://github.com/rapidsai/cugraph/tree/branch-25.02/python/cugraph-service/server/cugraph_service_server/testing

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I think those docs should be updated. I think it should be something like:

Dependencies appearing in several lists can be handled in two ways. Dependencies with complex requirements such as extra index URLs or package names that vary by CUDA version (like -cuXX suffixes) or format (different conda and PyPI names) should be in their own standalone depends_on_{project} lists. Simpler dependencies can use YAML anchors to avoid duplication of the pinning specs.

I'll re-review and approve based on this. Feel free to update those docs as you see fit.

@jameslamb jameslamb requested a review from bdice January 10, 2025 21:14
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working conda non-breaking Non-breaking change python
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants