Vocab size mismatch #3900
Comments
This is an easy fix. There should be a .json file (probably params.json) inside the llama-2-7b-chat folder; set its vocab_size to 32000. Here is my params.json file: {"dim": 4096, "multiple_of": 256, "n_heads": 32, "n_layers": 32, "norm_eps": 1e-06, "vocab_size": 32000}
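For anyone who prefers not to edit the file by hand, a minimal sketch of the same fix as a script (the path is an assumption; point it at your own model folder):

```python
# Patch a params.json whose vocab_size is the -1 placeholder.
import json
from pathlib import Path

params_path = Path("llama-2-7b-chat/params.json")  # adjust to your model folder
params = json.loads(params_path.read_text())
if params.get("vocab_size", -1) == -1:
    params["vocab_size"] = 32000  # Llama 2's tokenizer.model has 32000 tokens
    params_path.write_text(json.dumps(params))
```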
Thanks for the fix. I had downloaded an older version of the repo a few days back and did not face this issue there. :)
Same issue here. Why did Meta write -1?
I also hit this, and I think the correct way to fix it is in the convert script: simply remove "vocab_size" when it equals -1, which results in it being taken from tok_embeddings.weight.
When vocab_size is detected to be -1, simply remove its value from the parsed params.json and fall back to using tok_embeddings.weight. Fixes ggerganov#3900
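A minimal sketch of that fallback logic, under the assumption that the converter already has the embedding tensor's row count available at this point (the function and variable names here are illustrative, not the actual code in convert.py):

```python
import json
from pathlib import Path


def resolve_vocab_size(model_dir: Path, tok_embeddings_rows: int) -> int:
    """Read vocab_size from params.json, ignoring the -1 placeholder and
    falling back to the row count of tok_embeddings.weight instead."""
    params = json.loads((model_dir / "params.json").read_text())
    vocab_size = params.get("vocab_size", -1)
    if vocab_size == -1:
        # Newer Meta exports ship vocab_size = -1; the real value is the
        # number of rows in tok_embeddings.weight (32000 for Llama 2).
        vocab_size = tok_embeddings_rows
    return vocab_size
```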
+1, I ran into this as well.
Had the same issue with llama-2-7b.
Confirming this fixed the issue with the most recent Llama download.
Ran into a similar issue today, using the current tip of master bcc0eb4.
I confirm this fixed the issue with Llama 2 7B models.
This issue was closed because it has been inactive for 14 days since being marked as stale.
Original report: Llama_2_7B-chat
vocab size mismatch (model has -1 but tokenizer.model has 32000)