Release a CPU only wheel on PyPI #10596

Closed
rbeilvert opened this issue Jul 17, 2024 · 14 comments · Fixed by #10603

@rbeilvert

I am using XGBoost on CPU only and do not require the NCCL library.

Before XGBoost v2.1.0, I managed to install the library with pip and the --no-binary option, compiling the package without NCCL.

Since v2.1.0, nvidia-nccl-cu12 is installed from PyPI, and its 190 MB size means my app no longer fits on my Heroku server.

Is there a way to avoid installing this dependency other than editing XGBoost's pyproject.toml and installing from the local copy?

@trivialfis
Member

Perhaps pip install . --no-deps?

@trivialfis
Member

Not sure if pyproject.toml allows a different dependency profile that overrides the default one. So far, the optional-dependency configuration only allows additional dependencies.

@trivialfis
Member

Alternatively, you can consider conda, which offers greater flexibility, although its version updates lag a bit behind.

@rbeilvert
Author

> Perhaps pip install . --no-deps?

Not ideal, as I have other packages in my requirements file whose dependencies I want to keep, unless I create a dedicated Heroku buildpack for the XGBoost install, which sounds like overkill.
I'd also have to check XGBoost's pyproject.toml for any shenanigans every time I update its version.

Why not move "nvidia-nccl-cu12 ; platform_system == 'Linux' and platform_machine != 'aarch64'" down into [project.optional-dependencies]?

Having NCCL as a required dependency really penalizes CPU users
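
The environment marker quoted above decides on which platforms pip pulls in NCCL. A minimal sketch of that condition in plain Python follows. The helper name `nccl_marker_applies` is hypothetical, and the function only mirrors the marker text from this thread; it is not pip's own marker evaluation.

```python
import platform


def nccl_marker_applies(system: str, machine: str) -> bool:
    """Mirror the marker: platform_system == 'Linux' and platform_machine != 'aarch64'.

    Hypothetical helper for illustration only.
    """
    return system == "Linux" and machine != "aarch64"


if __name__ == "__main__":
    # The marker variables platform_system and platform_machine are defined
    # as platform.system() and platform.machine() respectively.
    applies = nccl_marker_applies(platform.system(), platform.machine())
    print(f"NCCL dependency would be selected on this machine: {applies}")
```

Under this reading, the dependency is selected on x86 Linux and skipped on Windows, macOS, and aarch64 Linux.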

@hcho3
Collaborator

hcho3 commented Jul 17, 2024

It might be better to release a CPU-only package to PyPI, e.g. xgboost-cpu, for space-constrained environments.

@trivialfis
Member

trivialfis commented Jul 17, 2024

I will leave it to @hcho3 to decide whether it's needed to maintain a fleet of packages (arm linux, x86 linux, windows, macos, arm macos, sdist) for a new pypi project.

It's a bit odd that pep517 has a way to turn on features in a package by having optional dependencies, but there's no way to turn off a feature. Maybe I'm missing something there.

@hcho3
Collaborator

hcho3 commented Jul 17, 2024

@trivialfis For xgboost-cpu, let's just maintain x86 linux and Windows (since we build GPU code only for these targets). No need to complicate things.

hcho3 changed the title from "Set NCCL dependency as optional" to "Release a CPU only wheel on PyPI" on Jul 17, 2024
hcho3 self-assigned this on Jul 17, 2024

@trivialfis
Member

Thank you for looking into this @hcho3 . Please note that the feature has been requested for Python https://discuss.python.org/t/help-packaging-optional-application-features-using-extras/14074/18 before and there are lots of discussions. It's not surprising since packages all want to provide as many features as possible in the default build. If we were to make a new project, we should prepare for deprecating/archiving that project as well if pep517 were to support these types of optional dependencies in the future.

@hcho3
Collaborator

hcho3 commented Jul 17, 2024

> If we were to make a new project, we should prepare for deprecating/archiving that project as well if pep517 were to support these types of optional dependencies in the future.

I agree. We can archive xgboost-cpu once the Python ecosystem provides a way to opt out of optional dependencies.

@hcho3
Collaborator

hcho3 commented Jul 17, 2024

@rbeilvert As a workaround, you can add the following URL to your requirements.txt:

https://files.pythonhosted.org/packages/5e/30/b4b1c071964acefeda3faade5b86e1bf2f428c5713404b212941942e835d/xgboost-2.1.0-py3-none-manylinux2014_x86_64.whl

This will install a variant of XGBoost 2.1.0 that doesn't require NCCL. Our plan is to upload this package under the name xgboost-cpu so that it's easy to locate and install.
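
One way to confirm that an installed wheel really omits NCCL is to inspect the requirements it declares. The sketch below uses the standard-library `importlib.metadata` module. The helper names are hypothetical, the distribution name `xgboost` is taken from the wheel filename above, and the requirement lists in the usage example are illustrative rather than copied from any release.

```python
import re
from importlib import metadata


def requirement_name(requirement: str) -> str:
    """Return the normalized distribution name at the start of a requirement string."""
    match = re.match(r"\s*([A-Za-z0-9][A-Za-z0-9._-]*)", requirement)
    if match is None:
        raise ValueError(f"cannot parse requirement: {requirement!r}")
    # Collapse runs of '-', '_' and '.' and lowercase, as in name normalization.
    return re.sub(r"[-_.]+", "-", match.group(1)).lower()


def declares_nccl(requirements) -> bool:
    """True if any requirement string names an nvidia-nccl distribution."""
    return any(requirement_name(r).startswith("nvidia-nccl") for r in requirements)


def installed_dist_declares_nccl(dist_name: str = "xgboost"):
    """Check the installed distribution's metadata; None if it is not installed."""
    try:
        requirements = metadata.requires(dist_name)
    except metadata.PackageNotFoundError:
        return None
    # metadata.requires() returns None when no requirements are declared.
    return declares_nccl(requirements or [])


if __name__ == "__main__":
    print(installed_dist_declares_nccl("xgboost"))
```

For example, `declares_nccl(["numpy", "nvidia-nccl-cu12 ; platform_system == 'Linux'"])` is true, while `declares_nccl(["numpy", "scipy"])` is false.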

@hcho3
Collaborator

hcho3 commented Jul 17, 2024

@trivialfis It turns out that I already did the necessary work to repackage XGBoost without GPU code in #10478 and #10483. So all it takes is the renaming of the package to xgboost-cpu.

@trivialfis
Member

It still needs to remove NCCL from pyproject right?

@hcho3
Collaborator

hcho3 commented Jul 17, 2024

> It still needs to remove NCCL from pyproject right?

This line removes NCCL from pyproject:

patch -p0 < tests/buildkite/manylinux2014_warning.patch

I will adapt the patch so that it doesn't show the warning about old glibc.

@hcho3
Collaborator

hcho3 commented Jul 31, 2024

The xgboost-cpu package is now available on PyPI: https://pypi.org/project/xgboost-cpu/
You can install it with pip install xgboost-cpu.
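
After installing, a quick sanity check is to list which of the relevant distributions are present in the environment. The following is a minimal sketch using `importlib.metadata`; `installed_versions` is a hypothetical helper, and the three distribution names are the ones mentioned in this thread.

```python
from importlib import metadata


def installed_versions(names):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    report = installed_versions(["xgboost", "xgboost-cpu", "nvidia-nccl-cu12"])
    for name, version in report.items():
        print(f"{name}: {version or 'not installed'}")
```

In a CPU-only environment, the expectation would be that `xgboost-cpu` reports a version and `nvidia-nccl-cu12` reports as not installed, unless another package pulls NCCL in.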

