[WIP] Add support for Mistral-Nemo by supporting head_dim through config #2254
Conversation
Thank you @shaltielshmid for a nice initiative 🙌 This might also be a good PR to look at from |
@shaltielshmid thanks for the PR. I created another smaller PR (other fixes have happened more generally regarding Happy to take your if you take it out of draft and use the same code (we don't want to pass head_dim, as it can safely be inferred from the config, and we have to handle the case where it doesn't exist because old configs do not define |
Done. Took it out of draft - feel free to take over the PR.
Your solution would be better if we were still using |
key in mistralConfig (as defined in transformers).
…fig (#2254)
* Support passing head_dim through config
* Using `head_dim` as a fallback is necessary since it's a non standard key in mistralConfig (as defined in transformers).
* Shorter diff.
Co-authored-by: Nicolas Patry <patry.nicolas@protonmail.com>
What does this PR do?
This PR adds support for Mistral-Nemo by allowing the head dimension (`head_dim`) to be specified in the model configuration file. The changes mirror what Mistral did in the `transformers` package.

As discussed in the issue (#2252) with @ErikKaum, this PR allows the model to be loaded and launched successfully, but sending a request returns gibberish output. I ran the same prompt with `transformers` installed from source, and the results were great (even with the default temperature). I'm not familiar with how TGI works internally, so I would appreciate any advice.
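For reviewers, the lookup this PR is after can be sketched roughly as follows. This is a minimal illustration, not the actual TGI code: `resolve_head_dim` is a hypothetical helper name, and the config values are illustrative, chosen so that an explicit `head_dim` differs from `hidden_size // num_attention_heads`.

```python
from types import SimpleNamespace


def resolve_head_dim(config):
    """Return the attention head size for a model config.

    Hypothetical helper: prefer an explicit `head_dim` key when the
    config defines one, otherwise infer it the way older configs
    implicitly do.
    """
    head_dim = getattr(config, "head_dim", None)
    if head_dim is None:
        # Older configs do not define head_dim, so derive it.
        head_dim = config.hidden_size // config.num_attention_heads
    return head_dim


# Illustrative configs (values are assumptions for the example).
old_style = SimpleNamespace(hidden_size=4096, num_attention_heads=32)
explicit = SimpleNamespace(hidden_size=5120, num_attention_heads=32, head_dim=128)

print(resolve_head_dim(old_style))  # inferred: 4096 // 32
print(resolve_head_dim(explicit))   # explicit value, not 5120 // 32
```

The point of the sketch is that every place that sizes the attention projections needs to use the resolved value; any code path that still assumes `hidden_size // num_attention_heads` would compute a different head size for a config like the second one.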
How to replicate:
Launch the TGI:
And then pinging it:
The output I'm getting is:
Fixes #2252: Add support for Mistral-Nemo
Who can review?
Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.
@ErikKaum