CUDA errors on pip install torch sort #87

Open
rjs0 opened this issue Feb 13, 2025 · 0 comments

rjs0 commented Feb 13, 2025

I am trying to install torchsort on a Linux machine, inside a conda environment. These are the commands I am running:
conda create -n ts3 python=3.11
conda install pytorch==2.1.0 torchvision==0.16.0 torchaudio==2.1.0 pytorch-cuda=11.8 -c pytorch -c nvidia
conda install -c conda-forge numpy=1.26.0
conda install nvidia/label/cuda-11.8.0::cuda-toolkit
pip install torchsort
conda install -c conda-forge gxx_linux-64=9.4.0
export CXX=/nfs/turbo/coe-rbg/rjsingh/miniconda/envs/ts/bin/x86_64-conda_cos6-linux-gnu-g++
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/nfs/turbo/coe-rbg/rjsingh/miniconda/lib
pip install --force-reinstall --no-cache-dir --no-deps torchsort
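
To show which toolchain these commands leave visible to the build, the environment can be inspected before running pip. This is a minimal sketch using only the Python standard library; the function name build_env_report is hypothetical, and it only reports what is set or found on PATH, it does not change anything.

```python
import os
import shutil


def build_env_report(env=None):
    """Collect the settings a CUDA extension build is likely to consult."""
    env = os.environ if env is None else env
    search_path = env.get("PATH")
    return {
        # Compiler override exported in the commands above.
        "CXX": env.get("CXX"),
        # Toolkit root; the log shows PyTorch falling back to the env prefix.
        "CUDA_HOME": env.get("CUDA_HOME"),
        # Explicit list of target GPU architectures, if any.
        "TORCH_CUDA_ARCH_LIST": env.get("TORCH_CUDA_ARCH_LIST"),
        # Executables looked up on PATH; None means not found.
        "nvcc": shutil.which("nvcc", path=search_path),
        "ninja": shutil.which("ninja", path=search_path),
    }


if __name__ == "__main__":
    for name, value in build_env_report().items():
        print(f"{name}: {value if value else '(not set / not found)'}")
```

Running it inside the activated environment prints one line per setting, which makes it easier to compare the different install orders.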

When I run the command pip install torchsort, I get the error below. I have also tried installing gxx_linux-64 first and then running pip install torchsort, which gives the same error. I have also tried skipping the plain pip install torchsort and running only pip install --force-reinstall --no-cache-dir --no-deps torchsort after the previous steps.

Building wheels for collected packages: torchsort
Building wheel for torchsort (setup.py) ... error
error: subprocess-exited-with-error

× python setup.py bdist_wheel did not run successfully.
│ exit code: 1
╰─> [91 lines of output]
No CUDA runtime is found, using CUDA_HOME='/nfs/turbo/coe-rbg/rjsingh/miniconda/envs/ts'
running bdist_wheel
/nfs/turbo/coe-rbg/rjsingh/miniconda/envs/ts/lib/python3.11/site-packages/torch/utils/cpp_extension.py:502: UserWarning: Attempted to use ninja as the BuildExtension backend but we could not find ninja.. Falling back to using the slow distutils backend.
warnings.warn(msg.format('we could not find ninja.'))
running build
running build_py
creating build/lib.linux-x86_64-cpython-311/torchsort
copying torchsort/__init__.py -> build/lib.linux-x86_64-cpython-311/torchsort
copying torchsort/ops.py -> build/lib.linux-x86_64-cpython-311/torchsort
running egg_info
writing torchsort.egg-info/PKG-INFO
writing dependency_links to torchsort.egg-info/dependency_links.txt
writing requirements to torchsort.egg-info/requires.txt
writing top-level names to torchsort.egg-info/top_level.txt
reading manifest file 'torchsort.egg-info/SOURCES.txt'
reading manifest template 'MANIFEST.in'
writing manifest file 'torchsort.egg-info/SOURCES.txt'
copying torchsort/isotonic_cpu.cpp -> build/lib.linux-x86_64-cpython-311/torchsort
copying torchsort/isotonic_cuda.cu -> build/lib.linux-x86_64-cpython-311/torchsort
running build_ext
/nfs/turbo/coe-rbg/rjsingh/miniconda/envs/ts/lib/python3.11/site-packages/torch/utils/cpp_extension.py:424: UserWarning: There are no /nfs/turbo/coe-rbg/rjsingh/miniconda/envs/ts/bin/x86_64-conda_cos6-linux-gnu-g++ version bounds defined for CUDA version 11.8
warnings.warn(f'There are no {compiler_name} version bounds defined for CUDA version {cuda_str_version}')
building 'torchsort.isotonic_cpu' extension
creating build/temp.linux-x86_64-cpython-311/torchsort
/nfs/turbo/coe-rbg/rjsingh/miniconda/envs/ts/bin/x86_64-conda_cos6-linux-gnu-g++ -fvisibility-inlines-hidden -std=c++17 -fmessage-length=0 -march=nocona -mtune=haswell -ftree-vectorize -fPIC -fstack-protector-strong -fno-plt -O2 -ffunction-sections -pipe -isystem /nfs/turbo/coe-rbg/rjsingh/miniconda/envs/ts/include -DNDEBUG -D_FORTIFY_SOURCE=2 -O2 -isystem /nfs/turbo/coe-rbg/rjsingh/miniconda/envs/ts/include -fPIC -I/nfs/turbo/coe-rbg/rjsingh/miniconda/envs/ts/lib/python3.11/site-packages/torch/include -I/nfs/turbo/coe-rbg/rjsingh/miniconda/envs/ts/lib/python3.11/site-packages/torch/include/torch/csrc/api/include -I/nfs/turbo/coe-rbg/rjsingh/miniconda/envs/ts/lib/python3.11/site-packages/torch/include/TH -I/nfs/turbo/coe-rbg/rjsingh/miniconda/envs/ts/lib/python3.11/site-packages/torch/include/THC -I/nfs/turbo/coe-rbg/rjsingh/miniconda/envs/ts/include/python3.11 -c torchsort/isotonic_cpu.cpp -o build/temp.linux-x86_64-cpython-311/torchsort/isotonic_cpu.o -fopenmp -ffast-math -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE="_gcc" -DPYBIND11_STDLIB="_libstdcpp" -DPYBIND11_BUILD_ABI="_cxxabi1011" -DTORCH_EXTENSION_NAME=isotonic_cpu -D_GLIBCXX_USE_CXX11_ABI=0 -std=c++17
/nfs/turbo/coe-rbg/rjsingh/miniconda/envs/ts/bin/x86_64-conda_cos6-linux-gnu-g++ -fvisibility-inlines-hidden -std=c++17 -fmessage-length=0 -march=nocona -mtune=haswell -ftree-vectorize -fPIC -fstack-protector-strong -fno-plt -O2 -ffunction-sections -pipe -isystem /nfs/turbo/coe-rbg/rjsingh/miniconda/envs/ts/include -DNDEBUG -D_FORTIFY_SOURCE=2 -O2 -isystem /nfs/turbo/coe-rbg/rjsingh/miniconda/envs/ts/include -pthread -B /nfs/turbo/coe-rbg/rjsingh/miniconda/envs/ts/compiler_compat -shared -Wl,-O2 -Wl,--sort-common -Wl,--as-needed -Wl,-z,relro -Wl,-z,now -Wl,--disable-new-dtags -Wl,--gc-sections -Wl,--allow-shlib-undefined -Wl,-rpath,/nfs/turbo/coe-rbg/rjsingh/miniconda/envs/ts/lib -Wl,-rpath-link,/nfs/turbo/coe-rbg/rjsingh/miniconda/envs/ts/lib -L/nfs/turbo/coe-rbg/rjsingh/miniconda/envs/ts/lib -fvisibility-inlines-hidden -std=c++17 -fmessage-length=0 -march=nocona -mtune=haswell -ftree-vectorize -fPIC -fstack-protector-strong -fno-plt -O2 -ffunction-sections -pipe -isystem /nfs/turbo/coe-rbg/rjsingh/miniconda/envs/ts/include -DNDEBUG -D_FORTIFY_SOURCE=2 -O2 -isystem /nfs/turbo/coe-rbg/rjsingh/miniconda/envs/ts/include build/temp.linux-x86_64-cpython-311/torchsort/isotonic_cpu.o -L/nfs/turbo/coe-rbg/rjsingh/miniconda/envs/ts/lib/python3.11/site-packages/torch/lib -lc10 -ltorch -ltorch_cpu -ltorch_python -o build/lib.linux-x86_64-cpython-311/torchsort/isotonic_cpu.cpython-311-x86_64-linux-gnu.so
building 'torchsort.isotonic_cuda' extension
/nfs/turbo/coe-rbg/rjsingh/miniconda/envs/ts/lib/python3.11/site-packages/torch/cuda/__init__.py:611: UserWarning: Can't initialize NVML
warnings.warn("Can't initialize NVML")
Traceback (most recent call last):
File "<string>", line 2, in <module>
File "<pip-setuptools-caller>", line 34, in <module>
File "/tmp/pip-install-ekjemnvq/torchsort_3f22a4d91dfc49f58a97f92682a3dbbb/setup.py", line 52, in <module>
setup(
File "/nfs/turbo/coe-rbg/rjsingh/miniconda/envs/ts/lib/python3.11/site-packages/setuptools/__init__.py", line 117, in setup
return distutils.core.setup(**attrs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/nfs/turbo/coe-rbg/rjsingh/miniconda/envs/ts/lib/python3.11/site-packages/setuptools/_distutils/core.py", line 186, in setup
return run_commands(dist)
^^^^^^^^^^^^^^^^^^
File "/nfs/turbo/coe-rbg/rjsingh/miniconda/envs/ts/lib/python3.11/site-packages/setuptools/_distutils/core.py", line 202, in run_commands
dist.run_commands()
File "/nfs/turbo/coe-rbg/rjsingh/miniconda/envs/ts/lib/python3.11/site-packages/setuptools/_distutils/dist.py", line 983, in run_commands
self.run_command(cmd)
File "/nfs/turbo/coe-rbg/rjsingh/miniconda/envs/ts/lib/python3.11/site-packages/setuptools/dist.py", line 999, in run_command
super().run_command(command)
File "/nfs/turbo/coe-rbg/rjsingh/miniconda/envs/ts/lib/python3.11/site-packages/setuptools/_distutils/dist.py", line 1002, in run_command
cmd_obj.run()
File "/nfs/turbo/coe-rbg/rjsingh/miniconda/envs/ts/lib/python3.11/site-packages/setuptools/command/bdist_wheel.py", line 379, in run
self.run_command("build")
File "/nfs/turbo/coe-rbg/rjsingh/miniconda/envs/ts/lib/python3.11/site-packages/setuptools/_distutils/cmd.py", line 339, in run_command
self.distribution.run_command(command)
File "/nfs/turbo/coe-rbg/rjsingh/miniconda/envs/ts/lib/python3.11/site-packages/setuptools/dist.py", line 999, in run_command
super().run_command(command)
File "/nfs/turbo/coe-rbg/rjsingh/miniconda/envs/ts/lib/python3.11/site-packages/setuptools/_distutils/dist.py", line 1002, in run_command
cmd_obj.run()
File "/nfs/turbo/coe-rbg/rjsingh/miniconda/envs/ts/lib/python3.11/site-packages/setuptools/_distutils/command/build.py", line 136, in run
self.run_command(cmd_name)
File "/nfs/turbo/coe-rbg/rjsingh/miniconda/envs/ts/lib/python3.11/site-packages/setuptools/_distutils/cmd.py", line 339, in run_command
self.distribution.run_command(command)
File "/nfs/turbo/coe-rbg/rjsingh/miniconda/envs/ts/lib/python3.11/site-packages/setuptools/dist.py", line 999, in run_command
super().run_command(command)
File "/nfs/turbo/coe-rbg/rjsingh/miniconda/envs/ts/lib/python3.11/site-packages/setuptools/_distutils/dist.py", line 1002, in run_command
cmd_obj.run()
File "/nfs/turbo/coe-rbg/rjsingh/miniconda/envs/ts/lib/python3.11/site-packages/setuptools/command/build_ext.py", line 99, in run
_build_ext.run(self)
File "/nfs/turbo/coe-rbg/rjsingh/miniconda/envs/ts/lib/python3.11/site-packages/setuptools/_distutils/command/build_ext.py", line 365, in run
self.build_extensions()
File "/nfs/turbo/coe-rbg/rjsingh/miniconda/envs/ts/lib/python3.11/site-packages/torch/utils/cpp_extension.py", line 873, in build_extensions
build_ext.build_extensions(self)
File "/nfs/turbo/coe-rbg/rjsingh/miniconda/envs/ts/lib/python3.11/site-packages/setuptools/_distutils/command/build_ext.py", line 481, in build_extensions
self._build_extensions_serial()
File "/nfs/turbo/coe-rbg/rjsingh/miniconda/envs/ts/lib/python3.11/site-packages/setuptools/_distutils/command/build_ext.py", line 507, in _build_extensions_serial
self.build_extension(ext)
File "/nfs/turbo/coe-rbg/rjsingh/miniconda/envs/ts/lib/python3.11/site-packages/setuptools/command/build_ext.py", line 264, in build_extension
_build_ext.build_extension(self, ext)
File "/nfs/turbo/coe-rbg/rjsingh/miniconda/envs/ts/lib/python3.11/site-packages/setuptools/_distutils/command/build_ext.py", line 562, in build_extension
objects = self.compiler.compile(
^^^^^^^^^^^^^^^^^^^^^^
File "/nfs/turbo/coe-rbg/rjsingh/miniconda/envs/ts/lib/python3.11/site-packages/setuptools/_distutils/ccompiler.py", line 607, in compile
self._compile(obj, src, ext, cc_args, extra_postargs, pp_opts)
File "/nfs/turbo/coe-rbg/rjsingh/miniconda/envs/ts/lib/python3.11/site-packages/torch/utils/cpp_extension.py", line 609, in unix_wrap_single_compile
cflags = unix_cuda_flags(cflags)
^^^^^^^^^^^^^^^^^^^^^^^
File "/nfs/turbo/coe-rbg/rjsingh/miniconda/envs/ts/lib/python3.11/site-packages/torch/utils/cpp_extension.py", line 576, in unix_cuda_flags
cflags + _get_cuda_arch_flags(cflags))
^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/nfs/turbo/coe-rbg/rjsingh/miniconda/envs/ts/lib/python3.11/site-packages/torch/utils/cpp_extension.py", line 1980, in _get_cuda_arch_flags
arch_list[-1] += '+PTX'
~~~~~~~~~^^^^
IndexError: list index out of range
[end of output]

note: This error originates from a subprocess, and is likely not a problem with pip.
ERROR: Failed building wheel for torchsort
Running setup.py clean for torchsort
Failed to build torchsort
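
The last frame of the traceback can be reproduced without PyTorch. The log reports "No CUDA runtime is found" and "Can't initialize NVML", which suggests that no GPU was visible when the extension was built. The sketch below is a simplified model of what _get_cuda_arch_flags appears to do in that case, not PyTorch's actual code; the function name cuda_arch_flags_sketch and its parameters are hypothetical, and the exact parsing of TORCH_CUDA_ARCH_LIST is an assumption.

```python
import os


def cuda_arch_flags_sketch(visible_capabilities, env=None):
    """Simplified model (assumption) of how the CUDA arch list is derived.

    visible_capabilities: list of (major, minor) tuples, one per visible GPU.
    """
    env = os.environ if env is None else env
    requested = env.get("TORCH_CUDA_ARCH_LIST")
    if requested:
        # An explicit list skips device detection entirely.
        return requested.replace(" ", ";").split(";")

    # Otherwise build one entry per visible GPU, e.g. (8, 0) -> "8.0".
    arch_list = sorted({f"{major}.{minor}" for major, minor in visible_capabilities})
    # With no visible GPU the list is empty, so indexing [-1] raises IndexError.
    arch_list[-1] += "+PTX"
    return arch_list


if __name__ == "__main__":
    print(cuda_arch_flags_sketch([(8, 0)], env={}))
    try:
        cuda_arch_flags_sketch([], env={})
    except IndexError as exc:
        print("IndexError:", exc)
```

If this model is right, setting TORCH_CUDA_ARCH_LIST to the compute capability of the target GPU before running pip is one thing to try, since it would keep the list from being empty. This has not been confirmed on the machine in this report.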
