Idefics2 raises shape error in the backward pass with gradient_checkpointing #30301

Closed
chenzizhao opened this issue Apr 17, 2024 · 3 comments · Fixed by #30320

@chenzizhao
Contributor

I get a shape error in the backward pass when I fine-tune the latest Idefics2 model (at HEAD) with PEFT and gradient checkpointing enabled. The symptom is identical to this. It seems the following block isn't compatible with gradient checkpointing:

    if use_cache:
        if not isinstance(past_key_values, Cache):
            past_key_values = DynamicCache.from_legacy_cache(past_key_values)
        past_seen_tokens = past_key_values.get_usable_length(seq_length)

In my local workspace, a fix similar to 91f4b7e removed the shape error, but a quick review of the logic from a maintainer would be lovely. Happy to submit a PR afterwards. Thanks!
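
For readers following along, here is a minimal sketch of the kind of guard such a fix might add. It assumes the fix amounts to forcing `use_cache` off whenever gradient checkpointing is active during training, so that the recomputed forward pass in the backward step never sees a populated cache. The helper name `resolve_use_cache` is hypothetical and not part of transformers, and the actual contents of 91f4b7e or the patch may differ.

```python
import logging

logger = logging.getLogger(__name__)


def resolve_use_cache(use_cache: bool, gradient_checkpointing: bool, training: bool) -> bool:
    """Hypothetical helper: decide whether the KV cache may be used for this forward pass.

    Assumption: caching and gradient checkpointing do not mix during training,
    so the cache is disabled in that case and the caller's choice is kept otherwise.
    """
    if gradient_checkpointing and training and use_cache:
        logger.warning(
            "use_cache=True is incompatible with gradient checkpointing; "
            "setting use_cache=False."
        )
        return False
    return use_cache


# Training with checkpointing: the cache is switched off, so the
# `if use_cache:` block quoted above would be skipped entirely.
print(resolve_use_cache(use_cache=True, gradient_checkpointing=True, training=True))

# Inference (not training): the caller's choice is respected.
print(resolve_use_cache(use_cache=True, gradient_checkpointing=True, training=False))
```

In a model's forward pass, a guard like this would run before the `if use_cache:` block, with the result assigned back to `use_cache`.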

@chenzizhao
Contributor Author

Patch: a317323

@amyeroberts
Collaborator

Hi @chenzizhao, thanks for raising this and providing links to the patch and relevant commits!

Would you like to open a PR with these changes? That way you get the GitHub contribution credit for the solution.

cc @younesbelkada @gante for reference

@younesbelkada
Contributor

Nice catch, @chenzizhao!
Sadly, this is not caught by our CI; we need to work on that! Adding it to my TODOs!
