docs: update fedora cuda guide for 12.8 release #11393

Merged on Feb 6, 2025 (2 commits)

@teihome teihome commented Jan 24, 2025

This pull request updates the cuda-fedora.md guide to use the latest CUDA release, 12.8 (previously 12.6), which was uploaded on 2025-01-17.

The new release targets the current Fedora version, 41 (previously 39).

This guide continues to use the Toolbox environment to allow easy installation on Silverblue or Workstation systems alike.

This pull request also updates the CUDA section of the build document to describe more clearly how to compile for explicit compute capability targets.
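As a rough illustration of building for explicit targets, the sketch below converts compute capabilities such as 8.6 into the dot-free, semicolon-separated list that CMake's CMAKE_CUDA_ARCHITECTURES variable takes. The helper name cuda_arch_list and the chosen targets (8.6 and 8.9) are assumptions for the example, not taken from the guide.

```shell
# cuda_arch_list is a hypothetical helper, not part of llama.cpp.
# It strips the dot from each compute capability ("8.6" -> "86") and
# joins the results with semicolons for CMAKE_CUDA_ARCHITECTURES.
cuda_arch_list() {
    out=""
    for cc in "$@"; do
        cc=$(printf '%s' "$cc" | tr -d '.')
        out="${out:+$out;}$cc"
    done
    printf '%s\n' "$out"
}

# Example (assumed targets: compute capability 8.6 and 8.9):
# cmake -B build -DGGML_CUDA=ON \
#     -DCMAKE_CUDA_ARCHITECTURES="$(cuda_arch_list 8.6 8.9)"
# cmake --build build --config Release
```

Listing only the architectures you need usually shortens the build, at the cost of binaries that will not run on GPUs outside that list.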

@github-actions github-actions bot added the documentation Improvements or additions to documentation label Jan 24, 2025
@teihome teihome force-pushed the cuda-fedora-guide-update branch from c0bf064 to ecb81a4 Compare January 24, 2025 16:16
@teihome teihome marked this pull request as draft January 24, 2025 18:26

teihome commented Jan 24, 2025

Converted to a draft, as there are some compatibility issues when the toolbox uses newer NVIDIA drivers than the host.

@teihome teihome force-pushed the cuda-fedora-guide-update branch from c87f807 to e87d080 Compare January 24, 2025 18:35
@teihome teihome force-pushed the cuda-fedora-guide-update branch from e87d080 to 6f9a843 Compare January 24, 2025 19:28

teihome commented Jan 24, 2025

Okay, I have resolved the issue. The problem was that nvidia-driver-cuda-libs was installed in the guest even when /usr/lib64/libcuda.so.1 was supplied by the host.

While the guest's libcuda was older, it was not updated, so it matched the host's version. Now that the guest has newer libraries (570) than the host (565), it breaks.

The fix is to never install nvidia-driver-cuda-libs to the guest filesystem if the host is supplying CUDA.
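A minimal sketch of that decision, assuming the check runs inside the guest before any NVIDIA package has been installed there, so that an existing libcuda.so.1 can only have come from the host. The helper name host_supplies_libcuda is hypothetical; the default path is the one mentioned above.

```shell
# host_supplies_libcuda is a hypothetical helper, not part of the guide.
# It succeeds when the given library path (default: the path named in
# this comment thread) already exists.
host_supplies_libcuda() {
    lib="${1:-/usr/lib64/libcuda.so.1}"
    [ -e "$lib" ]
}

if host_supplies_libcuda; then
    echo "libcuda.so.1 present: do not install nvidia-driver-cuda-libs in the guest"
else
    echo "libcuda.so.1 not found: the guest needs its own driver libraries"
fi
```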

@teihome teihome marked this pull request as ready for review January 24, 2025 19:36

da2ce7 commented Feb 5, 2025

With only minor modifications, it worked for the www.runpod.io config, using the Docker image registry.fedoraproject.org/fedora-toolbox:41:

dnf distro-sync --assumeyes --quiet > /dev/null;
dnf install vim-default-editor -y --allowerasing  --assumeyes --quiet > /dev/null;
dnf install @c-development @development-tools cmake sshd  --assumeyes --quiet > /dev/null;
dnf config-manager addrepo --from-repofile=https://developer.download.nvidia.com/compute/cuda/repos/fedora41/x86_64/cuda-fedora41.repo;
dnf download --destdir=/tmp/nvidia-driver-libs --resolve --arch x86_64 nvidia-driver-cuda nvidia-driver-libs nvidia-driver-cuda-libs nvidia-persistenced --quiet > /dev/null;
rpm --install --verbose --hash --justdb /tmp/nvidia-driver-libs/* --quiet > /dev/null;
rm -rf /tmp/nvidia-driver-libs;
dnf install cuda  --assumeyes --quiet > /dev/null;
echo "export PATH=\$PATH:/usr/local/cuda/bin" >> /etc/profile.d/cuda.sh;
chmod +x /etc/profile.d/cuda.sh;
source /etc/profile.d/cuda.sh;
nvcc --version;
git clone --depth=1 https://github.com/ggerganov/llama.cpp.git /tmp/llama.cpp
cd /tmp/llama.cpp;
cmake -B build -DGGML_CUDA=ON;
cmake --build build --config Release -j 20;
cmake --install build;
cd ~;
rm -rf /tmp/llama.cpp;
echo "/usr/local/lib" | sudo tee /etc/ld.so.conf.d/local-lib.conf;
echo "/usr/local/lib64" | sudo tee /etc/ld.so.conf.d/local-lib64.conf;
ldconfig;
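One possible refinement, if the script may be run more than once in the same container: the echo that appends to /etc/profile.d/cuda.sh adds a duplicate line on every run. The sketch below appends the export only when that exact line is missing. The helper name add_cuda_to_profile is hypothetical.

```shell
# add_cuda_to_profile is a hypothetical helper, not part of the script above.
# It appends the PATH export only if the exact line is not already present.
add_cuda_to_profile() {
    file="$1"
    line='export PATH=$PATH:/usr/local/cuda/bin'
    touch "$file"
    grep -qxF -- "$line" "$file" || printf '%s\n' "$line" >> "$file"
}

# Usage (as root inside the container):
# add_cuda_to_profile /etc/profile.d/cuda.sh
```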

I think that this documentation update can be merged as it is.


da2ce7 commented Feb 5, 2025

I have now made a template based on the guide from this pull request: https://runpod.io/console/deploy?template=mtwj86pqgc&ref=r0lfrx3d


da2ce7 commented Feb 6, 2025

Perhaps @ericcurtin and @ngxson would like to review this documentation update. I feel that it might have been lost in the history.


@ngxson ngxson left a comment


Also cc @ericcurtin if you want to take a look

@ericcurtin

Approved. NVIDIA also provides UBI9-based containers, which are quite useful; we use them in RamaLama.

@ericcurtin ericcurtin merged commit 9ab42dc into ggml-org:master Feb 6, 2025
2 checks passed
tinglou pushed a commit to tinglou/llama.cpp that referenced this pull request Feb 13, 2025
* docs: update fedora cuda guide for 12.8 release

* docs: build cuda update
orca-zhang pushed a commit to orca-zhang/llama.cpp that referenced this pull request Feb 26, 2025
* docs: update fedora cuda guide for 12.8 release

* docs: build cuda update
arthw pushed a commit to arthw/llama.cpp that referenced this pull request Feb 26, 2025
* docs: update fedora cuda guide for 12.8 release

* docs: build cuda update