Add warning and info message for beta and gamma parameters #33192
base: main
Conversation
cc @amyeroberts
Thanks for adding this @zly-idleness!
At the moment, this involves iterating over the loaded keys twice - once on L4116 and again on L4146. It also involves a redefinition of `original_loaded_keys`. We should rework the code to remove the duplicated logic.
Thank you for pointing that out. I'll refactor the code to eliminate the redundancy.
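For reference, a single-pass version could look something like this (a sketch with assumed variable names; the actual replacement rules are those in `_fix_key`):

# Sketch: collect the renames and build the fixed key list in one pass,
# leaving the original list object untouched.
renamed_keys = {}
fixed_keys = []
for key in loaded_keys:
    new_key = _fix_key(key)  # e.g. "gamma" -> "weight", "beta" -> "bias"
    if new_key != key:
        renamed_keys[key] = new_key
    fixed_keys.append(new_key)
loaded_keys = fixed_keys  # rebind rather than mutate, so original_loaded_keys is preserved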
src/transformers/modeling_utils.py
Outdated
old_keys.append(key)
new_keys.append(new_key)
renamed_keys[key] = new_key
loaded_keys[i] = new_key
This doesn't work - this will modify both `loaded_keys` and `original_loaded_keys`:
In [1]: li_0 = list(range(10))
In [2]: li_1 = li_0
In [3]: for i in range(10):
   ...:     if i % 2:
   ...:         li_0[i] = -1
   ...:
In [4]: li_0
Out[4]: [0, -1, 2, -1, 4, -1, 6, -1, 8, -1]
In [5]: li_1
Out[5]: [0, -1, 2, -1, 4, -1, 6, -1, 8, -1]
Thank you, sir! You are totally right, I forgot to add a `copy()` to ensure that `original_loaded_keys` remains unaltered.
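For completeness, the same session with a `copy()` no longer aliases the original list:

In [6]: li_0 = list(range(10))
In [7]: li_1 = li_0.copy()
In [8]: for i in range(10):
   ...:     if i % 2:
   ...:         li_0[i] = -1
   ...:
In [9]: li_0
Out[9]: [0, -1, 2, -1, 4, -1, 6, -1, 8, -1]
In [10]: li_1
Out[10]: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]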
Thanks for iterating!
At the moment, the logic is being forced around the old renaming code. There are also changes to the olmoe model in this PR which should be removed.
There shouldn't be any changes to this file in the PR
src/transformers/modeling_utils.py
Outdated
warning_msg += 'contains parameters that have been renamed internally ("gamma" and "beta" in parameters) (a few are listed below but more are present in the model):\n'
logger.warning(warning_msg)
for old_key, new_key in renamed_keys.items():
    warning_msg += f"* `{old_key}` -> `{new_key}`\n"
warning_msg += "If you are using a model from the Hub, consider submitting a PR to adjust these weights and help future users."
logger.info(warning_msg)
This message isn't consistent - at the moment all of the renamed keys will be listed. Like in the other places where this logic is added, let's just take the first renames
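A sketch of a consistent version (assumed names; this supposes `renamed_keys` maps "gamma"/"beta" to the first `(old_key, new_key)` pair of each kind, and the message is logged once, after it is fully built):

if renamed_keys:
    warning_msg = "A pretrained model you are loading contains parameters that have been renamed internally (a few are listed below but more are present in the model):\n"
    for old_key, new_key in renamed_keys.values():
        warning_msg += f"* `{old_key}` -> `{new_key}`\n"
    warning_msg += "If you are using a model from the Hub, consider submitting a PR to adjust these weights and help future users."
    logger.info(warning_msg)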
src/transformers/modeling_utils.py
Outdated
return None

for i, key in enumerate(loaded_keys):
    new_key = _fix_key(key)
Rather than trying to reuse the existing `_fix_key` logic, it would be better to rework this to:
- not use `copy`
- only add the first renaming case for gamma and beta respectively
Ok, I will do it.
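A sketch of that rework (a hypothetical helper, not the final PR code): a single pass that builds a new list, so no `copy()` is needed and `original_loaded_keys` keeps pointing at the untouched list, while only the first gamma and first beta rename are recorded for the message:

renamed_keys = {}

def _fix_and_record(key):
    new_key = key
    if "gamma" in key:
        new_key = new_key.replace("gamma", "weight")
        renamed_keys.setdefault("gamma", (key, new_key))  # keep only the first
    if "beta" in key:
        new_key = new_key.replace("beta", "bias")
        renamed_keys.setdefault("beta", (key, new_key))  # keep only the first
    return new_key

loaded_keys = [_fix_and_record(key) for key in loaded_keys]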
What does this PR do?
This adds a warning message to notify about the renaming of gamma and beta parameters during initialisation and also during loading.
before / after: (screenshots omitted)
Fixes #29554 and #33190 (issue)
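For illustration, the change surfaces when loading such a checkpoint (hypothetical repo name; the exact log text is defined in this PR):

from transformers import AutoModel

# Loading a checkpoint whose state dict still uses "gamma"/"beta" keys
# now logs which parameters were renamed ("gamma" -> "weight",
# "beta" -> "bias") instead of renaming them silently.
model = AutoModel.from_pretrained("some-user/checkpoint-with-gamma-beta")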
Before submitting
- Did you read the contributor guideline, Pull Request section?
- Was this discussed/approved via a GitHub issue or the forum? Please add a link to it if that's the case.
Who can review?
Anyone in the community is free to review the PR once the tests have passed. Feel free to tag members/contributors who may be interested in your PR.