Add on_optimizer_step to callback options #31095
This looks great to me, thanks!
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.
Makes sense to me as well! Nice job. cc @amyeroberts for final review
Awesome, LGTM! Thanks a lot 🤗
Hello, thank you for your contribution. I used on_optimizer_step to print the gradients, and every printed value was None, yet the final grad_norm had a value. Why is that?
Same question as @PangziZhang523. Do you know how to fix this problem, please?
It is working fine for me.
Hi @colmon46, if we use DeepSpeed or FSDP, we should use self.model_wrapped to get the gradients of every layer, but do you know how we can get self.model_wrapped in callbacks? Thank you.
Same question. Is there a solution already? Thanks.
Add on_optimizer_step callback option in TrainerCallbacks
Aside: This is my first open source pull request, so any feedback would be much appreciated!
The test tests/trainer/test_trainer_callback.py has been modified appropriately to invoke the new callback method.
Fixes #31033 (issue)
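For anyone trying the hook out, here is a minimal sketch of a gradient-monitoring callback built around on_optimizer_step. It is not the code from this PR. It assumes the hook receives the usual (args, state, control, **kwargs) arguments with the model forwarded as a keyword argument, as for other callback events. FakeParam and FakeModel are hypothetical stand-ins (gradients held as plain lists of floats) so the snippet runs without transformers or torch; in real use the class would subclass transformers.TrainerCallback and read param.grad tensors. Parameters whose grad is None are counted rather than skipped silently, which may help when debugging the all-None case discussed in the comments.

```python
import math


class GradNormLogger:
    """Records a global L2 gradient norm each time on_optimizer_step fires.

    Sketch only: in real use this would subclass transformers.TrainerCallback
    and be passed to the trainer via its callbacks list.
    """

    def __init__(self):
        self.history = []

    def on_optimizer_step(self, args, state, control, model=None, **kwargs):
        squared = 0.0
        missing = 0  # parameters whose .grad is None at this point
        for _name, param in model.named_parameters():
            if param.grad is None:
                missing += 1
                continue
            # Stand-in gradients are lists of floats; with torch tensors one
            # would use something like param.grad.detach().norm(2) ** 2.
            squared += sum(g * g for g in param.grad)
        self.history.append({"norm": math.sqrt(squared), "missing": missing})
        return control


# Hypothetical stand-ins, not part of transformers or torch.
class FakeParam:
    def __init__(self, grad):
        self.grad = grad


class FakeModel:
    def __init__(self, params):
        self._params = params

    def named_parameters(self):
        return list(self._params.items())


model = FakeModel({"w": FakeParam([3.0, 4.0]), "b": FakeParam(None)})
logger = GradNormLogger()
logger.on_optimizer_step(args=None, state=None, control=None, model=model)
print(logger.history)
```

If the "missing" count equals the number of parameters, the model object handed to the callback is probably not the one holding the gradients, which matches the DeepSpeed/FSDP observation in the thread.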
Reviewers
As tagged in initial issue - @muellerzr @younesbelkada