llama : model-based max number of graph nodes calculation #8970

Merged (2 commits) on Aug 12, 2024

nicoboss (Contributor)

This fixes #8950 and #8615
This PR builds on top of the changes made in #8622

Calculate the max number of graph nodes as max(8192, model.tensors_by_name.size()*5), as recommended by @slaren in #8950 (comment). I set 8192 as the minimum to ensure this change does not break any currently working models.
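
For readers who want to see the formula in isolation, here is a minimal standalone sketch of the calculation described above. The helper name `max_graph_nodes`, the constant name, and the plain `n_tensors` parameter are assumptions made for illustration; in the PR the count comes from `model.tensors_by_name.size()`, and the actual function in src/llama.cpp may be shaped differently.

```cpp
#include <algorithm>
#include <cstddef>

// Floor taken from the PR description: models that already worked
// with a graph of 8192 nodes keep at least that many.
static const std::size_t k_min_graph_nodes = 8192;

// Hypothetical helper (not the actual llama.cpp function): scale the
// graph node limit with the number of tensors in the model, using the
// factor of 5 from the PR description, but never go below the floor.
inline std::size_t max_graph_nodes(std::size_t n_tensors) {
    return std::max<std::size_t>(k_min_graph_nodes, n_tensors * 5);
}
```

With this formula, any model with up to 1638 tensors stays at the 8192 floor (1638 * 5 = 8190), and the limit only grows for larger models, which is why previously working models should be unaffected.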

Thanks to this change, I was able to run inference on BigLlama-3.1-681B-Instruct and Meta-Llama-3-405B-Instruct-Up-Merge, both of which did not load prior to this change.

@mofosyne added the labels "Review Complexity : Low" (trivial changes to code that most beginner devs, or those who want a break, can tackle, e.g. UI fix) and "merge ready" (indicates that this may be ready to merge soon and is just holding out in case of objections) on Aug 10, 2024
@slaren slaren merged commit 0fd93cd into ggerganov:master Aug 12, 2024
50 of 52 checks passed
arthw pushed a commit to arthw/llama.cpp that referenced this pull request Nov 15, 2024
…8970)

* llama : model-based max number of graph nodes calculation

* Update src/llama.cpp

---------

Co-authored-by: slaren <slarengh@gmail.com>
arthw pushed a commit to arthw/llama.cpp that referenced this pull request Nov 18, 2024
Successfully merging this pull request may close these issues.

Bug: BigLlama-3.1-681B-Instruct requires llama_model_max_nodes to return a higher value