fix training issues #36158

Merged
merged 3 commits into from
Feb 13, 2025

ArthurZucker
Collaborator

What does this PR do?

Fixes #35990! PaliGemma did not have past_key_values.

@ArthurZucker
Collaborator Author

cc @muellerzr and @SunMarc, WDYT about just setting this by default for all PreTrainedModels? (That way, if we forget about it, people are not bothered.)

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@SunMarc
Member

SunMarc commented Feb 13, 2025

Yeah, sounds good to me @ArthurZucker. Maybe we can put it here, since we only use it in the Trainer. This way, we don't add it to the config of models that potentially don't need it.

        if ignore_keys is None:
            if hasattr(self.model, "config"):
                ignore_keys = getattr(self.model.config, "keys_to_ignore_at_inference", ["past_key_values"])
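
A minimal, self-contained sketch of how the fallback suggested above behaves, using stand-in objects rather than real models. The helper names `resolve_ignore_keys` and `drop_ignored` are hypothetical and not part of transformers, and the `else` branch for objects without a `config` is an assumption; only the `getattr` default of `["past_key_values"]` comes from the suggestion itself.

```python
from types import SimpleNamespace


def resolve_ignore_keys(model, ignore_keys=None):
    """Hypothetical helper mirroring the suggested fallback logic."""
    if ignore_keys is None:
        if hasattr(model, "config"):
            # Use the config's own list if it defines one; otherwise
            # default to ignoring the cache.
            ignore_keys = getattr(
                model.config, "keys_to_ignore_at_inference", ["past_key_values"]
            )
        else:
            # Assumption: with no config at all, ignore nothing.
            ignore_keys = []
    return ignore_keys


def drop_ignored(outputs, ignore_keys):
    """Hypothetical helper: remove ignored entries from a dict of model outputs."""
    return {k: v for k, v in outputs.items() if k not in ignore_keys}


# Stand-ins for models: one whose config sets the attribute, one whose
# config omits it, and one with no config.
with_attr = SimpleNamespace(config=SimpleNamespace(keys_to_ignore_at_inference=["mems"]))
without_attr = SimpleNamespace(config=SimpleNamespace())
no_config = SimpleNamespace()

print(resolve_ignore_keys(with_attr))
print(resolve_ignore_keys(without_attr))
print(resolve_ignore_keys(no_config))
print(drop_ignored({"logits": 1, "past_key_values": 2}, resolve_ignore_keys(without_attr)))
```

With this default, a model whose config forgets to declare the attribute still has its cache dropped from the evaluation outputs, while a config that declares its own list keeps full control.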

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
@ArthurZucker ArthurZucker marked this pull request as ready for review February 13, 2025 13:27
Member

@SunMarc SunMarc left a comment

Thanks a lot!

@ArthurZucker ArthurZucker merged commit 0ca7259 into main Feb 13, 2025
26 checks passed
@ArthurZucker ArthurZucker deleted the fix-palig branch February 13, 2025 15:24
Successfully merging this pull request may close these issues.

Transformers PaliGemma evaluate and compute_loss fail with tensors/device errors