Permission denied error in cuda_setup when any env var contains /root path #675

Closed
osma opened this issue Aug 4, 2023 · 4 comments · Fixed by #677

Comments


osma commented Aug 4, 2023

My setup

Linux, Python 3.10.8, CUDA 12, pip install bitsandbytes==0.41.0

This is a Slurm cluster and there is an environment variable like this:

SLURM_SUBMIT_DIR=/root

The problem

Running python -m bitsandbytes fails with a permission error. It appears that bitsandbytes checks whether the path /root contains libcudart.so because this path appears in an environment variable.

Note that my LD_LIBRARY_PATH contains the real path to the CUDA libraries. cuda_setup does find them, but it doesn't stop there. It goes on to check other, unrelated environment variables (including SLURM_SUBMIT_DIR in my situation) in case they also happen to contain CUDA libraries, and then fails with the permission error when it can't read files under /root. FWIW, I think this is also a logic error (it should stop when it finds the CUDA libraries using LD_LIBRARY_PATH and look no further), but that's beyond the scope of this report.

$ python -m bitsandbytes
Traceback (most recent call last):
  File "/appl/easybuild/opt/Python/3.10.8-GCCcore-12.2.0/lib/python3.10/runpy.py", line 187, in _run_module_as_main
    mod_name, mod_spec, code = _get_module_details(mod_name, _Error)
  File "/appl/easybuild/opt/Python/3.10.8-GCCcore-12.2.0/lib/python3.10/runpy.py", line 146, in _get_module_details
    return _get_module_details(pkg_main_name, error)
  File "/appl/easybuild/opt/Python/3.10.8-GCCcore-12.2.0/lib/python3.10/runpy.py", line 110, in _get_module_details
    __import__(pkg_name)
  File "/wrk-vakka/users/xxx/llama2-finetune/venv/lib/python3.10/site-packages/bitsandbytes/__init__.py", line 6, in <module>
    from . import cuda_setup, utils, research
  File "/wrk-vakka/users/xxx/llama2-finetune/venv/lib/python3.10/site-packages/bitsandbytes/research/__init__.py", line 1, in <module>
    from . import nn
  File "/wrk-vakka/users/xxx/llama2-finetune/venv/lib/python3.10/site-packages/bitsandbytes/research/nn/__init__.py", line 1, in <module>
    from .modules import LinearFP8Mixed, LinearFP8Global
  File "/wrk-vakka/users/xxx/llama2-finetune/venv/lib/python3.10/site-packages/bitsandbytes/research/nn/modules.py", line 8, in <module>
    from bitsandbytes.optim import GlobalOptimManager
  File "/wrk-vakka/users/xxx/llama2-finetune/venv/lib/python3.10/site-packages/bitsandbytes/optim/__init__.py", line 6, in <module>
    from bitsandbytes.cextension import COMPILED_WITH_CUDA
  File "/wrk-vakka/users/xxx/llama2-finetune/venv/lib/python3.10/site-packages/bitsandbytes/cextension.py", line 13, in <module>
    setup.run_cuda_setup()
  File "/wrk-vakka/users/xxx/llama2-finetune/venv/lib/python3.10/site-packages/bitsandbytes/cuda_setup/main.py", line 120, in run_cuda_setup
    binary_name, cudart_path, cc, cuda_version_string = evaluate_cuda_setup()
  File "/wrk-vakka/users/xxx/llama2-finetune/venv/lib/python3.10/site-packages/bitsandbytes/cuda_setup/main.py", line 337, in evaluate_cuda_setup
    cudart_path = determine_cuda_runtime_lib_path()
  File "/wrk-vakka/users/xxx/llama2-finetune/venv/lib/python3.10/site-packages/bitsandbytes/cuda_setup/main.py", line 295, in determine_cuda_runtime_lib_path
    cuda_runtime_libs.update(find_cuda_lib_in(value))
  File "/wrk-vakka/users/xxx/llama2-finetune/venv/lib/python3.10/site-packages/bitsandbytes/cuda_setup/main.py", line 231, in find_cuda_lib_in
    return get_cuda_runtime_lib_paths(
  File "/wrk-vakka/users/xxx/llama2-finetune/venv/lib/python3.10/site-packages/bitsandbytes/cuda_setup/main.py", line 217, in get_cuda_runtime_lib_paths
    if (path / libname).is_file():
  File "/appl/easybuild/opt/Python/3.10.8-GCCcore-12.2.0/lib/python3.10/pathlib.py", line 1322, in is_file
    return S_ISREG(self.stat().st_mode)
  File "/appl/easybuild/opt/Python/3.10.8-GCCcore-12.2.0/lib/python3.10/pathlib.py", line 1097, in stat
    return self._accessor.stat(self, follow_symlinks=follow_symlinks)
PermissionError: [Errno 13] Permission denied: '/root/libcudart.so'
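
The early-exit behaviour suggested above (stop once LD_LIBRARY_PATH yields the CUDA runtime) could look roughly like the sketch below. This is an illustration only, not the bitsandbytes implementation: `find_cudart_dirs`, its `env` parameter, and the rule of treating any value that contains a path separator as a candidate directory are assumptions made for the example.

```python
import os
from pathlib import Path
from typing import List, Mapping

LIBNAME = "libcudart.so"  # the library the search is looking for


def split_dirs(value: str) -> List[Path]:
    """Split a PATH-style value into its non-empty directory entries."""
    return [Path(part) for part in value.split(os.pathsep) if part]


def dirs_with_lib(dirs: List[Path]) -> List[Path]:
    """Return the directories that directly contain LIBNAME.

    No error handling here on purpose: the point of this sketch is the
    search order, not how an unreadable directory is handled.
    """
    return [d for d in dirs if (d / LIBNAME).is_file()]


def find_cudart_dirs(env: Mapping[str, str]) -> List[Path]:
    """Hypothetical helper: prefer LD_LIBRARY_PATH, fall back to other variables."""
    hits = dirs_with_lib(split_dirs(env.get("LD_LIBRARY_PATH", "")))
    if hits:
        # Found the runtime where it is expected: stop here and never
        # probe unrelated variables such as SLURM_SUBMIT_DIR.
        return hits

    # Fallback (assumption): any other value that looks like a path.
    fallback: List[Path] = []
    for name, value in env.items():
        if name != "LD_LIBRARY_PATH" and os.sep in value:
            fallback.extend(dirs_with_lib(split_dirs(value)))
    return fallback
```

Called as `find_cudart_dirs(os.environ)`, this never touches /root when LD_LIBRARY_PATH already points at the CUDA libraries, so the unreadable directory is not probed at all.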

How to replicate

  1. have CUDA installed properly so it can be found via LD_LIBRARY_PATH
  2. pip install bitsandbytes==0.41.0
  3. export MY_RANDOM_ENV_VAR=/root
  4. python -m bitsandbytes

Related issues/PRs

How to fix

This is fairly simple to fix: the is_file check just needs to be wrapped in a try ... except block that catches PermissionError, here: https://github.com/TimDettmers/bitsandbytes/blob/18e827d666fa2b70a12d539ccedc17aa51b2c97c/bitsandbytes/cuda_setup/main.py#L217

I will open a PR with the suggested fix shortly.
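
A minimal sketch of the guard described above, assuming a loop over candidate directories and library names similar to the one at the linked line. `find_runtime_libs` and `LIBNAMES` are hypothetical names chosen for this example; the actual change is the one in PR #677.

```python
from pathlib import Path
from typing import Iterable, Set

# Illustrative library names; not necessarily the list bitsandbytes uses.
LIBNAMES = ("libcudart.so",)


def find_runtime_libs(
    candidate_dirs: Iterable[Path],
    libnames: Iterable[str] = LIBNAMES,
) -> Set[Path]:
    """Collect existing library files, skipping directories we may not read."""
    libnames = tuple(libnames)
    found: Set[Path] = set()
    for directory in candidate_dirs:
        for libname in libnames:
            candidate = directory / libname
            try:
                if candidate.is_file():
                    found.add(candidate)
            except PermissionError:
                # stat() was denied (e.g. for /root/libcudart.so).
                # Treat the candidate as absent and keep searching.
                continue
    return found
```

Catching only PermissionError keeps other unexpected failures visible while letting the search continue past directories such as /root.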


ignasgr commented Aug 6, 2023

Hi @osma, do you think #620 is a related issue?


osma commented Aug 7, 2023

@ignasgr #620 seems to be a similar issue, in that bitsandbytes cuda_setup checks for the existence of a file in a careless way that can trigger PermissionError exceptions. However, it happens in a different place in the code, so my suggested fix in PR #677 will most likely not fix that problem. I think applying PR #622 would fix it.


This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

@TimDettmers
Collaborator

Thanks again for the fix!
