Fix rope theta for OpenLlama #29893

jla524 · 2024-03-27T05:55:05Z

What does this PR do?

Fixes #29506 (issue)

rope_theta has been added to the OpenLlama config, with the same default value as Llama.
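For context, here is a minimal sketch of where rope_theta enters rotary position embeddings, assuming the standard RoPE formulation in which the inverse frequencies are rope_theta ** (-i / head_dim) for even i. This is not the code changed in this PR: the function name `rope_inv_freq` and the head dimension of 128 are illustrative assumptions, and 10000.0 is Llama's default.

```python
# Illustrative sketch only; not the code touched by this PR.
# Assumes the standard RoPE formulation: inv_freq[k] = theta ** (-2k / head_dim).

ROPE_THETA_DEFAULT = 10000.0  # Llama's default base for rotary embeddings


def rope_inv_freq(head_dim, rope_theta=ROPE_THETA_DEFAULT):
    """Return the head_dim // 2 inverse frequencies used by rotary embeddings.

    `rope_inv_freq` is a hypothetical helper name chosen for this example.
    """
    if head_dim % 2 != 0:
        raise ValueError("head_dim must be even")
    # One frequency per pair of dimensions (i = 0, 2, 4, ...).
    return [1.0 / (rope_theta ** (i / head_dim)) for i in range(0, head_dim, 2)]


# Example with an assumed head dimension of 128.
inv_freq = rope_inv_freq(128)
print(len(inv_freq), inv_freq[0])
```

A larger rope_theta lowers every frequency except the first, which is why the value needs to be configurable rather than hard-coded.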

Results:

% python3
Python 3.12.2 (main, Feb  6 2024, 20:19:44) [Clang 15.0.0 (clang-1500.1.0.2.5)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> from transformers import OpenLlamaForCausalLM
>>> model = OpenLlamaForCausalLM.from_pretrained("openlm-research/open_llama_7b")
You are using a model of type llama to instantiate a model of type open-llama. This is not supported for all configurations of models and can yield errors.
Loading checkpoint shards: 100%|██████████| 2/2 [00:22<00:00, 11.41s/it]
Some weights of OpenLlamaForCausalLM were not initialized from the model checkpoint at openlm-research/open_llama_7b and are newly initialized: ['model.embed_layer_norm.bias', 'model.embed_layer_norm.weight']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.

Who can review?

@ArthurZucker and @kevin-guimard-ext

ArthurZucker

Thanks!

HuggingFaceDocBuilderDev · 2024-03-30T15:49:00Z

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

fix: rope_theta for open llama

4168ab9

ArthurZucker approved these changes Mar 30, 2024

ArthurZucker merged commit 6fd93fe into huggingface:main Mar 30, 2024
8 checks passed

jla524 deleted the fix_open_llama branch April 12, 2024 05:26
