Pin cuda-python=12.0.* for CUDA 12 builds. #697

Closed

@bdice bdice commented Jan 31, 2024

#695 did not solve the problem. It appears that cuda-python 12.3 is being pulled, which is not what I hoped the solver would do. This PR changes the approach by pinning cuda-python=12.0.*.

For posterity, I saw the following results:

This pulls cuda-cudart from the nvidia channel:

mamba create -n test --dry-run -c rapidsai-nightly -c dask/label/dev -c pytorch -c conda-forge -c nvidia rapids=24.02 dask-sql=2023.11 python=3.10 cuda-version=12.0 ipython

Replacing the rapids metapackage with the packages it depends on (including the cuda-python pin from before #695) still reproduces the problem:

mamba create -n test --dry-run -c rapidsai-nightly -c dask/label/dev -c pytorch -c conda-forge -c nvidia "cucim=24.02.*" "cuda-python >=12.0.0,<13.0a" "cuda-version>=12,<13.0a0" "cudf=24.02.*" "cugraph=24.02.*" "cuml=24.02.*" "cuproj=24.02.*" "cupy>=12.0.0" "cuspatial=24.02.*" "custreamz=24.02.*" "cuxfilter=24.02.*" "dask-cuda=24.02.*" "libcugraph_etl=24.02.*" "nccl>=2.9.9,<3.0a0" "networkx>=2.5.1" "numba>=0.57" "numpy>=1.21" "nvtx>=0.2.1,<0.3" "pylibcugraph=24.02.*" "rapids-xgboost=24.02.*" "rmm=24.02.*" "ucx>=1.14.1" dask-sql=2023.11 python=3.10 cuda-version=12.0 ipython

Removing cuda-python from that list made the problem disappear for me (no nvidia channel packages):

mamba create -n test --dry-run -c rapidsai-nightly -c dask/label/dev -c pytorch -c conda-forge -c nvidia "cucim=24.02.*" "cuda-version>=12,<13.0a0" "cudf=24.02.*" "cugraph=24.02.*" "cuml=24.02.*" "cuproj=24.02.*" "cupy>=12.0.0" "cuspatial=24.02.*" "custreamz=24.02.*" "cuxfilter=24.02.*" "dask-cuda=24.02.*" "libcugraph_etl=24.02.*" "nccl>=2.9.9,<3.0a0" "networkx>=2.5.1" "numba>=0.57" "numpy>=1.21" "nvtx>=0.2.1,<0.3" "pylibcugraph=24.02.*" "rapids-xgboost=24.02.*" "rmm=24.02.*" "ucx>=1.14.1" dask-sql=2023.11 python=3.10 cuda-version=12.0 ipython

Changing the cuda-python pinning to cuda-python<12.3.0a0 also reproduced the problem (with conda-forge providing cuda-python 12.2 and nvidia providing cuda-cudart).

I concluded the only solution is to pin cuda-python=12.0.* so that it aligns with cuda-version. However, that breaks minor version compatibility for users who want cuda-version=12.2, for example.

I think the best path forward is for us to resolve conda-forge/cuda-python-feedstock#66.

The problem statement is "we need to be able to install cuda-python=12.x alongside cuda-version=12.y for some x, y (especially x > y)". cuda-python's packaging is explicitly designed to allow CUDA Enhanced Compatibility (CEC), so its version doesn't have to exactly match the CUDA Toolkit packages that are constrained by cuda-version.
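
To make the constraint in the problem statement concrete, here is a minimal sketch of the relationship we want the solver to accept: the two versions share a major version, and the minor versions are free to differ. `same_cuda_major` is a hypothetical helper written for illustration; it is not part of conda, mamba, or any RAPIDS tooling, and it only approximates the compatibility rule.

```shell
# Hypothetical helper: succeed when two CUDA-style version strings
# share the same major version (the part before the first dot).
same_cuda_major() {
    [ "${1%%.*}" = "${2%%.*}" ]
}

# cuda-python 12.3 alongside cuda-version 12.0 is the case we want to allow.
if same_cuda_major "12.3" "12.0"; then
    echo "compatible majors"
fi
```

A pin such as cuda-python=12.0.* is stricter than this check, which is why it rules out the x > y case.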

@bdice bdice added the 5 - DO NOT MERGE Hold off on merging; see PR for details label Jan 31, 2024
rapids-bot bot pushed a commit to rapidsai/docker that referenced this pull request Feb 1, 2024
Fixes attempted in rapidsai/integration#695 & rapidsai/integration#697 don't play well with CEC at large, but this repo can pin `cuda-python` more restrictively.

This PR pins `cuda-python` to the CUDA `major.minor.*` version.
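
A `major.minor.*` pin of this kind could be derived from a CUDA version string roughly as follows. This is only a sketch: `cuda_python_pin` is a hypothetical helper, not code taken from rapidsai/docker, and it assumes the input contains at least a major and a minor component.

```shell
# Hypothetical helper: turn a CUDA version such as 12.0.1 into a conda
# match spec that pins cuda-python to the same major.minor series.
cuda_python_pin() {
    cuda_ver="$1"
    case "$cuda_ver" in
        *.*) ;;  # has at least major.minor
        *) echo "expected a version like 12.0 or 12.0.1" >&2; return 1 ;;
    esac
    major="${cuda_ver%%.*}"   # text before the first dot
    rest="${cuda_ver#*.}"     # text after the first dot
    minor="${rest%%.*}"       # text before the next dot, if any
    printf 'cuda-python=%s.%s.*\n' "$major" "$minor"
}

cuda_python_pin "12.0.1"   # prints cuda-python=12.0.*
```

The resulting spec can be passed to `mamba create` or `mamba install` next to the other package specs.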

See also conda-forge/cuda-python-feedstock#66

Authors:
  - Ray Douglass (https://github.com/raydouglass)

Approvers:
  - Bradley Dice (https://github.com/bdice)
  - Jake Awe (https://github.com/AyodeAwe)

URL: #624

bdice commented Feb 5, 2024

Now that conda-forge/cuda-python-feedstock#66 is solved, we can close this.

@bdice bdice closed this Feb 5, 2024