
[WIP] Add support for Mistral-Nemo by supporting head_dim through config #2254

Merged: 4 commits into huggingface:main on Jul 23, 2024

Conversation

shaltielshmid (Contributor)

What does this PR do?

Added support for Mistral-Nemo by allowing the head dimension to be specified in the configuration file. The changes included here mimic the behavior of the Mistral implementation in the transformers package: when the config defines head_dim, use it; otherwise derive it from hidden_size and num_attention_heads.
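For reference, a minimal sketch of that fallback (names follow transformers' MistralConfig; this is an illustration, not TGI's actual code):

import math

class MistralConfigSketch:
    def __init__(self, hidden_size=4096, num_attention_heads=32, head_dim=None, **kwargs):
        self.hidden_size = hidden_size
        self.num_attention_heads = num_attention_heads
        # Mistral-Nemo sets head_dim explicitly in its config; older Mistral
        # configs omit it, so fall back to the historical derivation.
        self.head_dim = head_dim if head_dim is not None else hidden_size // num_attention_heads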

As discussed with @ErikKaum in issue #2252 - this PR allows the model to be loaded and launched successfully, but when sending a request I get gibberish output. I tested the same prompt after installing transformers from source, and the results were great (even with the default temperature). I'm not familiar with how TGI works internally, so I would appreciate any advice.

How to replicate:

Launch TGI:

git checkout add-mistral-nemo
docker build . -t tgi
docker run --gpus all --shm-size 1g -e HF_TOKEN=<hf_token> -p 8080:80 -v $(pwd)/data:/data tgi --model-id mistralai/Mistral-Nemo-Base-2407

Then send it a request:

curl 127.0.0.1:8080/generate \
    -X POST \
    -d '{"inputs":"What is Deep Learning?","parameters":{"max_new_tokens":20}}' \
    -H 'Content-Type: application/json'
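Or, equivalently, from Python (a minimal sketch using the requests library, assuming the container above is listening on port 8080):

import requests

# Send the same /generate request as the curl command above.
resp = requests.post(
    "http://127.0.0.1:8080/generate",
    json={"inputs": "What is Deep Learning?", "parameters": {"max_new_tokens": 20}},
)
print(resp.json()["generated_text"])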

The output I'm getting is:

{"generated_text":" Deep zweigeschossy Bildholespy skulpt house Schriftstellerin\n\n في Bildholespy skulpt house Schriftstellerin\n\n في Bildh"}

Fixes #2252: Add support for Mistral-Nemo

Before submitting

  • This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
  • Did you read the contributor guideline,
    Pull Request section?
  • Was this discussed/approved via a GitHub issue or the forum? Please add a link
    to it if that's the case.
  • Did you make sure to update the documentation with your changes? Here are the
    documentation guidelines, and
    here are tips on formatting docstrings.
  • Did you write any new necessary tests?

Who can review?

Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.

@ErikKaum

ErikKaum (Member) commented Jul 19, 2024

Thank you @shaltielshmid for a nice initiative 🙌

This PR from transformers might also be a good one to look at.

shaltielshmid (Contributor, Author)

Thanks @ErikKaum - I see that the changes from the PR you mentioned are already implemented in TGI.

Narsil (Collaborator) commented Jul 23, 2024

@shaltielshmid thanks for the PR.

I created another, smaller PR (other fixes have happened more generally regarding head_dim in other places): #2283

Happy to take yours if you take it out of draft and use the same code (we don't want to pass head_dim explicitly, as it can safely be inferred from the config, and we have to handle the case where it doesn't exist, because old configs do not define head_dim).
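The approach described here amounts to reading head_dim when the config defines it and deriving it otherwise (an illustrative sketch, not the literal code from #2283):

def get_head_dim(config):
    # Old Mistral configs do not define head_dim at all, so guard with getattr.
    head_dim = getattr(config, "head_dim", None)
    if head_dim is None:
        head_dim = config.hidden_size // config.num_attention_heads
    return head_dim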

@shaltielshmid shaltielshmid marked this pull request as ready for review July 23, 2024 09:43
shaltielshmid (Contributor, Author) commented Jul 23, 2024

Done - I took it out of draft; feel free to take over the PR.
I will note one change that I kept: in a recent commit you added a hasattr(config, 'head_dim') check, which I removed, since I define the attribute in the config constructor, so it will always exist.
But feel free to undo that, or just take your PR - either way. Do you know if TGI now works for Mistral-Nemo?
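For illustration, the guard in question looks roughly like this; once the constructor always assigns head_dim, the else branch becomes unreachable:

# Guard only needed if head_dim may be missing from the config object:
if hasattr(config, "head_dim"):
    head_dim = config.head_dim
else:
    head_dim = config.hidden_size // config.num_attention_heads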

Narsil (Collaborator) commented Jul 23, 2024

Your solution would be better if we were still using MistralConfig, but we are not. We removed a lot of complexity by relying more on the transformers config directly, as that simplifies the code here quite a bit (at some cost for legacy handling, like the definition of head_dim, which is non-standard).
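As a quick check of why the explicit value matters (a sketch; the numbers reflect the values published in Mistral-Nemo's config and are worth verifying):

from transformers import AutoConfig

config = AutoConfig.from_pretrained("mistralai/Mistral-Nemo-Base-2407")
# Nemo defines head_dim explicitly, and it differs from the classic
# derivation - which is why ignoring it produced gibberish output.
print(config.head_dim)                                   # 128
print(config.hidden_size // config.num_attention_heads)  # 160 (5120 // 32)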

@Narsil Narsil merged commit 3961e32 into huggingface:main Jul 23, 2024
1 of 5 checks passed
ErikKaum pushed a commit that referenced this pull request Jul 25, 2024
…fig (#2254)

* Support passing head_dim through config

* Using `head_dim` as a fallback is necessary since it's a non standard
key in mistralConfig (as defined in transformers).

* Shorter diff.

---------

Co-authored-by: Nicolas Patry <patry.nicolas@protonmail.com>
ErikKaum mentioned this pull request Jul 29, 2024
yuanwu2017 pushed a commit to yuanwu2017/tgi-gaudi that referenced this pull request Sep 26, 2024