llama : accept a list of devices to use to offload a model #10497

Merged
slaren merged 5 commits into master from sl/llama-dev-selection
Nov 25, 2024

@slaren (Collaborator) commented Nov 25, 2024

Adds the parameters `--device` (`-dev`) and `--device-draft` (`-devd`), which take a comma-separated list of devices to use with the main and draft models, respectively. Use `--list-devices` to see the list of available devices. `-dev none` disables GPU usage completely, including for large batches.
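To illustrate the semantics described above, here is a minimal, self-contained sketch of how a comma-separated device list could be parsed. It is not the code from this PR: the registry contents, `DeviceSelection`, and `parse_device_list` are hypothetical names, and treating an empty argument as "use all devices" and matching `none` case-insensitively are assumptions made for the example.

```cpp
#include <algorithm>
#include <cctype>
#include <string>
#include <vector>

// Hypothetical stand-in for the backend device registry; the names are made up.
static const std::vector<std::string> k_available_devices = {"CUDA0", "CUDA1", "Vulkan0"};

// Return a lowercased copy of s so names can be compared case-insensitively.
static std::string to_lower(std::string s) {
    std::transform(s.begin(), s.end(), s.begin(),
                   [](unsigned char c) { return static_cast<char>(std::tolower(c)); });
    return s;
}

// Hypothetical result type for the sketch.
struct DeviceSelection {
    bool use_all = false;              // no list given: use every available device
    std::vector<std::string> devices;  // explicit selection; stays empty for "none"
};

// Parse a comma-separated device list into `out`.
// Returns false if any name does not match a registered device.
static bool parse_device_list(const std::string & arg, DeviceSelection & out) {
    out = DeviceSelection();
    if (arg.empty()) {            // assumption: empty argument means "use all devices"
        out.use_all = true;
        return true;
    }
    if (to_lower(arg) == "none") {
        return true;              // explicit empty selection: no offloading at all
    }
    std::size_t start = 0;
    while (start <= arg.size()) {
        std::size_t end = arg.find(',', start);
        if (end == std::string::npos) {
            end = arg.size();
        }
        const std::string wanted = to_lower(arg.substr(start, end - start));
        auto it = std::find_if(k_available_devices.begin(), k_available_devices.end(),
                               [&](const std::string & name) { return to_lower(name) == wanted; });
        if (it == k_available_devices.end()) {
            return false;         // unknown (or empty) device name
        }
        out.devices.push_back(*it);   // store the registry's canonical spelling
        start = end + 1;
    }
    return true;
}
```

In this sketch, `parse_device_list("cuda0,Vulkan0", sel)` selects two devices in the order given, while `"none"` yields an explicit empty selection that is distinct from "no list given".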

API changes: adds a `devices` parameter to `llama_model_params`. If it is NULL, all available devices are used (same as the current behavior). `ggml_backend_reg_by_name` and `ggml_backend_dev_by_name` are now case-insensitive.
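The NULL-versus-list behavior of the new parameter can be sketched in isolation. The types below (`Device`, `ModelParams`, `resolve_devices`) are hypothetical stand-ins rather than the real llama.cpp declarations, and the sketch assumes the device list is terminated by a null pointer.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for a backend device handle.
struct Device {
    std::string name;
};

// Hypothetical parameter struct mirroring the described field.
struct ModelParams {
    // Assumed null-terminated list of devices; nullptr means "use all available devices".
    const Device * const * devices = nullptr;
};

// Decide which devices a model would be offloaded to.
static std::vector<const Device *> resolve_devices(const ModelParams & params,
                                                   const std::vector<Device> & available) {
    std::vector<const Device *> result;
    if (params.devices == nullptr) {
        // No list supplied: keep the previous behavior and use everything.
        for (const Device & d : available) {
            result.push_back(&d);
        }
        return result;
    }
    // Walk the caller's list until the terminating null pointer.
    for (const Device * const * p = params.devices; *p != nullptr; ++p) {
        result.push_back(*p);
    }
    return result;
}
```

Under these assumptions, a list that contains only the terminator resolves to no devices at all, which is how a caller would express the equivalent of `none`.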

@slaren slaren added the breaking change label (changes that break ABIs, APIs, file formats, or other forms of backwards compatibility) Nov 25, 2024
@slaren slaren force-pushed the sl/llama-dev-selection branch from 8a793ce to f4457cb Compare November 25, 2024 16:16
@slaren slaren force-pushed the sl/llama-dev-selection branch from a6fc1d9 to d6cf918 Compare November 25, 2024 17:01
@github-actions github-actions bot added the examples, server, and ggml (changes relating to the ggml tensor library for machine learning) labels Nov 25, 2024
@slaren slaren merged commit 10bce04 into master Nov 25, 2024
3 of 55 checks passed
@slaren slaren deleted the sl/llama-dev-selection branch November 25, 2024 18:30
arthw pushed a commit to arthw/llama.cpp that referenced this pull request Dec 20, 2024
…#10497)

* llama : accept a list of devices to use to offload a model

* accept `--dev none` to completely disable offloading

* fix dev list with dl backends

* rename env parameter to LLAMA_ARG_DEVICE for consistency