
vulkan : add backend registry / device interfaces #9721

Merged: 3 commits into master, Oct 17, 2024
Conversation

@slaren (Collaborator) commented Oct 3, 2024

No description provided.

@github-actions bot added the Vulkan (Issues specific to the Vulkan backend) and ggml (changes relating to the ggml tensor library for machine learning) labels on Oct 3, 2024
@slaren slaren marked this pull request as ready for review October 3, 2024 23:27
@slaren (Collaborator, Author) commented Oct 3, 2024

@0cc4m This PR has two additional changes:

  • Translates the device index in ggml_backend_vk_get_device_description (I believe this was a bug)
  • Changes the names of the backends/buffers etc. to Vulkan<idx>. This is the intended use of the name for these objects; a more detailed description can now be obtained through the ggml-backend device interface.

After this change it is possible to use Vulkan and CUDA in the same llama.cpp build (you may have to disable the NVIDIA devices in the Vulkan backend using the GGML_VK_VISIBLE_DEVICES environment variable).
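For example, hiding devices from the Vulkan backend means listing only the indices it should see. A hedged sketch; the binary name, model path, layer count, and device indices below are placeholders:

```shell
# GGML_VK_VISIBLE_DEVICES takes a comma-separated list of Vulkan device
# indices; devices not listed are hidden from the Vulkan backend.
# Hypothetical invocation: paths and indices are placeholders.
export GGML_VK_VISIBLE_DEVICES=1,2
./llama-cli -m ./models/model.gguf -ngl 33
```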

@0cc4m 0cc4m self-requested a review October 4, 2024 05:23
@MaggotHATE (Contributor) commented:

Seems to work fine (Win10), but I'm noticing another increase in layer size. Previously, with Mistral-Nemo-Instruct-2407.q5_k_l, I could offload 5 layers on 3 GB of VRAM; now it's only 3. Is this expected? The total VRAM usage is pretty much the same as before the backend registry updates.

@slaren (Collaborator, Author) commented Oct 8, 2024

I don't think there are any changes here that could increase memory usage; this PR just exposes existing functionality of the Vulkan backend through a different interface.

@0cc4m (Collaborator) commented Oct 16, 2024

@slaren Thank you for implementing this. I can confirm that it builds on Linux and that the code looks good. I can't fully test it at the moment, since my server is still disassembled because I'm in the process of moving between cities. I should be able to reassemble it this weekend, but I'm still very busy. You can decide whether you prefer to wait or whether you think it's ready to merge.

@slaren (Collaborator, Author) commented Oct 16, 2024

Can you check the changes to ggml_backend_vk_get_device_description? Previously, it wouldn't translate the device index to the indices given by GGML_VK_VISIBLE_DEVICES, which I believe was a bug. Other than that, I think there is very little chance that this PR breaks anything.

@0cc4m (Collaborator) commented Oct 16, 2024

That was a bug, yeah.

@slaren slaren merged commit f010b77 into master Oct 17, 2024
54 checks passed
@slaren slaren deleted the sl/vulkan-reg-2 branch October 17, 2024 00:47
drollings pushed a commit to drollings/llama.cpp that referenced this pull request Oct 18, 2024
* vulkan : add backend registry / device interfaces

* llama : print devices used on model load
dsx1986 pushed a commit to dsx1986/llama.cpp that referenced this pull request Oct 29, 2024
arthw pushed a commit to arthw/llama.cpp that referenced this pull request Nov 15, 2024
arthw pushed a commit to arthw/llama.cpp that referenced this pull request Nov 18, 2024
Labels
ggml (changes relating to the ggml tensor library for machine learning), Vulkan (Issues specific to the Vulkan backend)
4 participants