
Fix: Falcon tie_word_embeddings in GGUF #35715

Merged 4 commits into main from fix_ggml_falcon_tiewordembeddings on Jan 16, 2025
Conversation

@MekkCyber (Contributor) commented on Jan 15, 2025

What does this PR do?

In modeling_gguf_pytorch_utils.py, the value of tie_word_embeddings is inferred from the presence (or absence) of the output.weight tensor in the GGUF file. While this is generally a good indicator, the Falcon architecture is an exception to the rule, as you can see here: https://huggingface.co/tiiuae/falcon-7b/tree/main?show_file_info=model-00002-of-00002.safetensors (HF format) and here: https://huggingface.co/tensorblock/falcon-7b-GGUF/tree/main?show_file_info=falcon-7b-Q2_K.gguf (GGUF format).

In Falcon, the word_embeddings are tied to the lm_head weights. Despite this, output.weight is still present in the GGUF file, and lm_head is included in the Hugging Face checkpoint. To handle this edge case, I added an exception list for such architectures.
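For illustration, a minimal sketch of the resulting logic; the names `infer_tie_word_embeddings` and `EXCEPTION_ARCHITECTURES` are hypothetical and not the exact code in modeling_gguf_pytorch_utils.py:

```python
# Hypothetical sketch of the fix, not the actual code in
# modeling_gguf_pytorch_utils.py.

# Architectures whose GGUF files contain output.weight even though the
# checkpoint ties lm_head to the input embeddings.
EXCEPTION_ARCHITECTURES = ["falcon"]

def infer_tie_word_embeddings(architecture: str, tensor_names: list[str]) -> bool:
    """Decide the tie_word_embeddings config value for a GGUF checkpoint."""
    if architecture in EXCEPTION_ARCHITECTURES:
        # Falcon ships output.weight in the GGUF file, but the HF model
        # still ties lm_head to word_embeddings.
        return True
    # Default heuristic: no separate output head means the weights are tied.
    return "output.weight" not in tensor_names
```

Keeping the exceptions in a list makes the check easy to extend if other architectures turn out to ship a misleading output.weight tensor.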

This issue was causing a subtle error related to parameters not being on the same device, which was only discoverable in multi-GPU settings.

Who can review?

@SunMarc @Isotr0py

@MekkCyber requested review from SunMarc and removed the request for Rocketknight1 and ArthurZucker on January 15, 2025 at 15:57
@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@Isotr0py (Collaborator) commented on Jan 15, 2025

But as I recall, tie_word_embeddings is not used by modeling_falcon.py?

BTW, if the tie_word_embeddings issue is the cause, can we also update the Falcon conversion test to cover this edge case?
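For reference, a sketch of what such a test could assert. The repo id and filename come from the links in the PR description; the test name and body are illustrative, not the actual conversion test:

```python
from transformers import AutoModelForCausalLM

def test_falcon_gguf_tie_word_embeddings():
    # Repo and file are the ones linked in the PR description; Q2_K keeps
    # the download small, but loading a 7B model still needs several GB.
    model = AutoModelForCausalLM.from_pretrained(
        "tensorblock/falcon-7b-GGUF", gguf_file="falcon-7b-Q2_K.gguf"
    )
    # With the fix, the GGUF loader should report tied embeddings...
    assert model.config.tie_word_embeddings
    # ...and lm_head should share storage with the input embeddings.
    assert (
        model.lm_head.weight.data_ptr()
        == model.get_input_embeddings().weight.data_ptr()
    )
```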

@MekkCyber (Contributor, Author) replied

tie_word_embeddings is not used directly in the model's modeling file; instead, it is used in modeling_utils.py: https://github.com/huggingface/transformers/blob/main/src/transformers/modeling_utils.py#L1858
It was the failure of the Falcon conversion test that made me notice the error, and the test now passes.
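For context, the relevant logic in PreTrainedModel.tie_weights looks roughly like this (an abridged paraphrase of the linked code, which may differ between versions):

```python
from torch import nn

class _TieWeightsSketch(nn.Module):
    """Abridged paraphrase of PreTrainedModel.tie_weights in
    modeling_utils.py; the real method also covers encoder-decoder tying
    and other bookkeeping."""

    def tie_weights(self):
        # The GGUF loader writes tie_word_embeddings into the config, so a
        # wrong value here silently skips (or forces) the tying below.
        if getattr(self.config, "tie_word_embeddings", True):
            output_embeddings = self.get_output_embeddings()
            if output_embeddings is not None:
                self._tie_or_clone_weights(
                    output_embeddings, self.get_input_embeddings()
                )
```

Since the tying points lm_head at the embedding Parameter, a skipped tie leaves two independent tensors, which is how the two could end up on different devices under multi-GPU dispatch.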

@Isotr0py (Collaborator) left a comment:

LGTM! Thanks for fixing!

@SunMarc (Member) left a comment:

LGTM! Let's do that for Falcon, but if more issues start to appear, let's modify the tests slightly so they don't take the device of the tensors into account.

@SunMarc merged commit fd4f14c into main on Jan 16, 2025
26 checks passed
@SunMarc deleted the fix_ggml_falcon_tiewordembeddings branch on January 16, 2025 at 12:18
bursteratom pushed a commit to bursteratom/transformers that referenced this pull request Jan 31, 2025
* fix falcon tie_word_embeddings

* fix style

elvircrn pushed a commit to elvircrn/transformers that referenced this pull request Feb 13, 2025
* fix falcon tie_word_embeddings

* fix style