
Add on_optimizer_step to callback options #31095

Merged: 4 commits merged into huggingface:main on May 29, 2024

Conversation

dhruvbpai (Contributor) commented:

Add an on_optimizer_step callback option to TrainerCallback

Aside: This is my first open source pull request, so any feedback would be much appreciated!

The test tests/trainer/test_trainer_callback.py has been modified appropriately to invoke the new callback method.

Fixes #31033 (issue)
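For context, here is a minimal sketch (not part of this PR's diff) of how the new hook might be used: a custom TrainerCallback that overrides on_optimizer_step to log per-parameter gradient norms just before the optimizer step is applied. The assumption that the model is forwarded in **kwargs follows the pattern of the other callback events.

```python
# Hedged sketch (not from this PR): log per-parameter gradient norms from the new
# on_optimizer_step event. Assumes the Trainer forwards the model in **kwargs,
# as it does for other callback events.
from transformers import TrainerCallback


class GradientNormCallback(TrainerCallback):
    def on_optimizer_step(self, args, state, control, **kwargs):
        model = kwargs.get("model")
        if model is None:
            return
        for name, param in model.named_parameters():
            if param.grad is not None:
                print(f"step {state.global_step} | {name}: {param.grad.norm().item():.3e}")
```

It would be registered like any other callback, e.g. Trainer(..., callbacks=[GradientNormCallback()]).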

Reviewers

As tagged in the initial issue: @muellerzr @younesbelkada

@younesbelkada (Contributor) left a comment:

This looks great to me, thanks!

@younesbelkada requested a review from @muellerzr on May 29, 2024 at 08:45.
@HuggingFaceDocBuilderDev commented:

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@muellerzr (Contributor) left a comment:

Makes sense to me as well! Nice job. cc @amyeroberts for final review

@muellerzr requested a review from @amyeroberts on May 29, 2024 at 14:15.
@LysandreJik (Member) left a comment:

Awesome, LGTM! Thanks a lot 🤗

@LysandreJik merged commit 5c88253 into huggingface:main on May 29, 2024; 21 checks passed.
@dhruvbpai deleted the before_optimizer_step branch on May 29, 2024 at 19:20.
@PangziZhang523 commented:

Hello, thank you for your contribution. I used on_optimizer_step to print the gradients, and all the printed values were None, but the final grad_norm had a value. Why is that?

@colmon46 commented:

Same question as @PangziZhang523. Do you know how to fix this problem, please?

@Arunprakash-A (Contributor) commented:

It is working fine for me.

@YeLuoSuiYou commented:

Hi @colmon46, if we use DeepSpeed or FSDP, we need to use self.model_wrapped to get the gradients of each layer, but do you know how we can get self.model_wrapped inside callbacks? Thank you.
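One hedged workaround for the DeepSpeed/FSDP case raised above (an illustration, not an API confirmed in this thread): keep a reference to the Trainer on the callback so that on_optimizer_step can walk trainer.model_wrapped instead of the unwrapped model. The manual callback.trainer wiring is an assumption of this sketch; the Trainer does not pass itself to callbacks.

```python
# Hedged sketch: read gradients from the engine-wrapped model (trainer.model_wrapped)
# under DeepSpeed/FSDP. The trainer reference is attached manually after construction;
# this is an illustrative pattern, not something the Trainer does for you.
from transformers import TrainerCallback


class WrappedGradientCallback(TrainerCallback):
    def __init__(self):
        self.trainer = None  # assumption: set by the user after building the Trainer

    def on_optimizer_step(self, args, state, control, **kwargs):
        if self.trainer is None:
            return
        wrapped = self.trainer.model_wrapped  # DeepSpeed engine / FSDP-wrapped module
        for name, param in wrapped.named_parameters():
            # Note: with ZeRO-sharded optimizers, .grad may still be None and
            # engine-specific accessors may be needed.
            if param.grad is not None:
                print(f"{name}: {param.grad.norm().item():.3e}")


# Usage (illustrative):
# callback = WrappedGradientCallback()
# trainer = Trainer(model=model, args=training_args, callbacks=[callback])
# callback.trainer = trainer
```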

@Gaoyg commented on Nov 7, 2024:

Same question. Is there a solution already? Thanks!

Successfully merging this pull request may close the following issue: Add per-parameter gradient logging (and before optimizer step callback) (#31033).