Forbid `PretrainedConfig` from saving `generate` parameters; Update deprecations in `generate`-related code 🧹 #32659

Conversation
@@ -12,45 +12,13 @@
# See the License for the specific language governing permissions and
# limitations under the License.

import inspect
TL;DR disabled TF conversion (the warning said it would be removed in v4.43), but kept the error message which redirects to the safetensor conversion guide :)
Added a note to myself to remove this file in the future (v4.47).
@@ -693,7 +693,7 @@ def forward(
        past_key_values = DynamicCache.from_legacy_cache(past_key_values)
        logger.warning_once(
            "Using `past_key_values` as a tuple is deprecated and will be removed in v4.45. "
-           "Please use an appropriate `Cache` class (https://huggingface.co/docs/transformers/v4.41.3/en/internal/generation_utils#transformers.Cache)"
+           "Please use an appropriate `Cache` class (https://huggingface.co/docs/transformers/internal/generation_utils#transformers.Cache)"
found in my quick scan for deprecations (CMD+F on "v4.41") -- let's use our latest docs to guide our users
Force-pushed from 1303438 to f14a71a
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.
Killing old code 🤩
"We detected that you are passing `past_key_values` as a tuple and this is deprecated and will be removed in v4.43. " | ||
"Please use an appropriate `Cache` class (https://huggingface.co/docs/transformers/v4.41.3/en/internal/generation_utils#transformers.Cache)" | ||
"Please use an appropriate `Cache` class (https://huggingface.co/docs/transformers/internal/generation_utils#transformers.Cache)" | ||
) |
Why not trigger this warning within the `DynamicCache.from_legacy_cache` call? It avoids repeating it in all the modeling code and avoids forgetting to add it in new modeling code.
We may want to use `DynamicCache.from_legacy_cache` in non-deprecated ways :p For instance, some optimum integrations use it.
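A minimal sketch of such a use, assuming the usual legacy layout of one `(key, value)` pair per layer with shape `(batch, num_heads, seq_len, head_dim)`:

```python
import torch
from transformers import DynamicCache

# Legacy cache format: a tuple with one (key, value) tensor pair per layer
legacy_cache = tuple(
    (torch.zeros(1, 4, 10, 64), torch.zeros(1, 4, 10, 64)) for _ in range(2)
)

# Convert to a Cache object directly, without going through deprecated modeling paths
cache = DynamicCache.from_legacy_cache(legacy_cache)
print(cache.get_seq_length())  # 10
```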
# 1. The parameter is set to a default generation value (from `generate`'s perspective, it's the same
#    as if nothing is set)
is_default_generation_value = getattr(self, parameter_name) == default_value
# 2. The parameter is set as default in the model config (BC support for models like BART)
This situation was not encoded before this PR (and caused tests to fail, god bless tests 🙏 )
@@ -2549,26 +2549,20 @@ def save_pretrained(
        # Save the config
        if is_main_process:
            if not _hf_peft_config_loaded:
                # If the model config has set attributes that should be in the generation config, move them there.
Instead of warning (before this PR) / raising an exception (this PR, if the lines below were not added) when `model.config` holds things that should be in `model.generation_config` AND were explicitly set by the user (e.g. the user runs `model.config.top_k = 5` before training), we now automagically move misplaced items ✨

Automagically moving things is not usually recommended, but this is a special case: `model.save_pretrained` is often run at the end of training jobs, and crashing here means that $$$ can be lost.
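A minimal sketch of the kind of move described above (the helper name, the parameter list, and the `None` handling are assumptions for illustration, not the actual PR code):

```python
# Hypothetical helper: move generation parameters that were explicitly set on the
# model config into the generation config, instead of crashing at save time.
GENERATION_PARAMS = ("top_k", "top_p", "temperature", "repetition_penalty")

def move_misplaced_generation_params(config, generation_config):
    moved = {}
    for name in GENERATION_PARAMS:
        value = getattr(config, name, None)
        if value is not None:
            setattr(generation_config, name, value)  # move to the generation config
            setattr(config, name, None)              # clear the model config slot
            moved[name] = value
    return moved  # handy for building the warning message
```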
(@amyeroberts don't re-review yet, I'm going through failed tests -- mostly cases where we are explicitly setting
@gante I'm going to unsubscribe to avoid notifications for each push. Just ping / @ me to let me know when it's ready for review again!
<Tip warning={true}>

Setting parameters for sequence generation in the model config is deprecated. For backward compatibility, loading
some of them will still be possible, but attempting to overwrite them will throw an exception -- you should set
them in a [~transformers.GenerationConfig]. Check the documentation of [~transformers.GenerationConfig] for more
information about the individual parameters.

</Tip>
Instead of documenting `generate` parameters in `PretrainedConfig`, which is deprecated, show a warning in the docs pointing towards `GenerationConfig`.
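For reference, a short example of the pattern the warning points users towards (standard `transformers` API; the checkpoint and parameter values are just illustrative):

```python
from transformers import AutoModelForCausalLM, GenerationConfig

model = AutoModelForCausalLM.from_pretrained("gpt2")

# Set generation parameters on a GenerationConfig, not on model.config
model.generation_config = GenerationConfig(max_new_tokens=50, do_sample=True, top_k=5)

# generation_config.json is saved alongside config.json
model.save_pretrained("my-model")
```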
suppressed_indices = []
for token in self.begin_suppress_tokens:
    if token < scores.shape[-1]:  # to ensure we don't go beyond the vocab size
        suppressed_indices.extend([[i, token] for i in range(scores.shape[0])])
We disallowed storing custom generate flags in the model config -> we can no longer set custom generate parameterization in the model tester [which is not necessarily bad, all models should be able to generate with their default parameters!] -> we can't hardcode our way out of edge cases
This change fixes the edge case where we set a token that's not in the vocab to be suppressed. The alternative would be to reconfigure the tester to have a vocab size larger than the default value for this parameter, >50k tokens. Given that it is pretty harmless, I went with this :)
Confirmed that there were no slow TF Whisper test regressions.
Just so I understand, this is to handle cases coming from our testing suite? I don't think we want to handle this in core modeling code: I'd argue things should fail if tokens are selected outside of the vocab. Is there anything we can modify in the tester instead?
@amyeroberts We can increase the vocabulary size of the tester to the original vocabulary size of 50+k tokens (instead of the dummy vocab size of 100).
Happy to do that instead :)
[Note: given TF's indexing, where GPU+no XLA gladly accepts out-of-bounds indexes, this logic actually streamlines the behavior across devices and/or XLA.]
# 1) the generation config must have been created from the model config (`_from_model_config` field);
# 2) the generation config must have seen no modification since its creation (the hash is the same);
# 3) the user must have set generation parameters in the model config.
# NOTE: `torch.compile` can't compile `hash`, this legacy support is disabled with compilation.
if (
    not is_torchdynamo_compiling()
    and self.generation_config._from_model_config
    and self.generation_config._original_object_hash == hash(self.generation_config)
    and self.config._has_non_default_generation_parameters()
this was actually incorrect: we want to detect changes against the default model config, not against the default generation parameters
(To recap: if a user updates the model config and the generation config was automatically generated from it, regenerate it and throw a warning)
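To make the legacy path concrete, a hedged sketch of the user flow being recapped (the exact warning wording is assumed, not quoted from the library):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained("gpt2")
tokenizer = AutoTokenizer.from_pretrained("gpt2")

# Legacy pattern: the generation config was auto-created from the model config
# (`_from_model_config`) and never modified; the user then tweaks the model config.
model.config.do_sample = True
model.config.temperature = 0.7

inputs = tokenizer("Hello", return_tensors="pt")
# generate() notices the untouched, config-derived generation config plus the modified
# model config, regenerates the generation config from it, and emits a deprecation warning.
outputs = model.generate(**inputs)
```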
"encoder" in kwargs and "decoder" in kwargs | ||
), "Config has to be initialized with encoder and decoder config" | ||
if "encoder" not in kwargs or "decoder" not in kwargs: | ||
raise ValueError( |
I added a try/except for composite models, catching `ValueError`, which led to this discovery :) You'll see this pattern fixed in other composite models.
# if however the token to be generated is already at -inf then it can lead token
# `nan` values and thus break generation
self.forced_bos_token_id = None
self.forced_eos_token_id = None
If we go check the PR that added these, we can read that this change was added to maybe fix a bug that happens sporadically 👀

I don't agree with that description of the problem given our current code base; I believe the root cause has been addressed. Furthermore, it's good to remove fudge factors 😉

[Note: BART has a non-None `forced_eos_token_id`, for instance. This means our tests now run against the expected parameterization.]
@@ -368,7 +360,6 @@ def __init__(
        decoder_attention_heads=4,
        max_position_embeddings=30,
        is_encoder_decoder=False,
-       encoder_no_repeat_ngram_size=0,
I think this was meant to make the prior version of `test_generate_continue_from_past_key_values` (touched in this PR) work.
@@ -41,7 +41,6 @@ def __init__(
        act_dim=6,
        state_dim=17,
        hidden_size=23,
-       max_length=11,
No idea why this was here :)
@unittest.skip(reason="To fix, Mamba 2 cache slicing test case is an edge case") | ||
def test_inputs_embeds_matches_input_ids_with_generate(self): | ||
pass |
this test is also failing on main -- perhaps because of concurrent merges?
@unittest.skip(reason="TODO @gante not super important and failing.") | ||
def test_inputs_embeds_matches_input_ids_with_generate(self): | ||
pass |
Failing on main too. I suspect this didn't get detected in the (recent) PR that added it because of our test fetcher, which only checks the main models on these wide changes (?)
Could you open an issue to track?
(The issue catches a wider net than skipping `test_inputs_embeds_matches_input_ids_with_generate`, as this skip is merely a symptom of the root problem :) )
@amyeroberts Ready 🙌 Notes:
Thanks for handling all of these updates!
A few questions / comments but overall looks good 🤗
tests/utils/test_modeling_utils.py (Outdated)
self.assertEqual(len(logs.output), 1)
self.assertIn("Your generation config was originally created from the model config", logs.output[0])
model.save_pretrained(tmp_dir)
self.assertTrue(model.config.repetition_penalty != 3.0)
What does it equal in this case? Is it unset or its old value?
If it's its old value, I'm worried this could cause confusion (do we raise an error if the value is in the model config?)
@amyeroberts that is a great question -- what gets written in `model.config` in this case. I agree with your comment ("If it's its old value, I'm worried this could cause confusion"), which was the actual behavior in the commit you reviewed, so I've made the following adjustments:

- Moving a parameter from `model.config` to `model.generation_config` sets that parameter to `None` in `model.config`
- Storing a generation parameter as `None` in `model.config` is now accepted (even if it is not the default value)
- Updated the warning message, explaining to the user why they are seeing it
- The test now also checks that the warning is thrown. It is also commented, so it should be simple to understand what should be happening.

🤗
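A rough illustration of the adjusted behaviour (the checkpoint and values are just for illustration; the exact warning text is not shown):

```python
import tempfile
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("gpt2")
model.config.repetition_penalty = 3.0  # generation parameter set on the model config

with tempfile.TemporaryDirectory() as tmp_dir:
    model.save_pretrained(tmp_dir)  # warns and moves the value instead of crashing

# The value now lives in the generation config; the model config slot is cleared to None
assert model.generation_config.repetition_penalty == 3.0
assert model.config.repetition_penalty is None
```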
@unittest.skip(reason="TODO @gante not super important and failing.") | ||
def test_inputs_embeds_matches_input_ids_with_generate(self): | ||
pass |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you open an issue to track?
src/transformers/models/encoder_decoder/configuration_encoder_decoder.py (Outdated; resolved)
# Overwritten from the parent class: AST is not compatible with `generate`, but has a config parameter sharing the
# same name (`max_length`). Sharing the same name triggers checks regarding the config -> generation_config
# generative parameters deprecation cycle, overwriting this function prevents this from happening.
def _get_non_default_generation_parameters(self) -> Dict[str, Any]:
    return {}
I'm not sure I understand this - why would there be checks run on a config -> generation_config if the model isn't compatible with generate?
It seems like a bad pattern to need a generate-specific method added to the config if the model can't be used for generation.
I understand and share your pain 😬 Allow me to elaborate.

The root issue is in the model config. A model config is agnostic to whether the model can or cannot generate, unless we hardcode it in the config class or access external objects. When we call `config.save_pretrained`, we don't know whether we should or should not check generation parameters. Since we want to move all generation-related logic to `GenerationConfig`, one of whose goals is to avoid clashes like this and allow proper validation, we have to assume all configs may be holding generate parameters. The current logic enforces that no custom generate parameters are held in the model config, so no new models parameterize `generate` through the model config.

AST is the exception. It doesn't support `generate`, but it has a model parameter whose name is also used in `generate` (`max_length`). If a user creates a model with a custom `max_length`, which should be allowed, the current `save_pretrained` would crash because it thinks it has a custom `generate` parameter. This workaround effectively skips the `generate` checks, since we know AST can't generate (see the short sketch after the list of considered fixes below).

Between the three alternatives I could see, this seemed the best solution -- especially since it is a temporary fix, to be removed after we untangle the two configs (v5.0?)
Considered fixes:

- skip generate-related checks on this specific class (implemented)
- add a parameter to ALL model config classes to store whether the model can generate
- add an argument to `config.save_pretrained` controlling whether to check `generate` parameters, receiving `model.can_generate()` when called from `model.save_pretrained`
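A hedged sketch of the clash described above (the parameter value is illustrative; whether this exact call would have crashed depends on the pre-workaround checks):

```python
from transformers import ASTConfig

# For AST, `max_length` is a spectrogram length, not a generation parameter,
# but it shares its name with `generate`'s `max_length`.
config = ASTConfig(max_length=512)

# Without the override, saving could flag this as a custom `generate` parameter,
# even though AST models cannot generate.
config.save_pretrained("ast-config")
```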
Makes sense - thanks for taking the time to explain!
Force-pushed from a806b06 to 9f91b5d
CI was failing, rebased with main to check if that was the issue. EDIT: root cause identified, it's unrelated to this PR and being fixed (the CI image needs an update)
Thanks for iterating and taking the time to answer all my qs.
As this touches so many models and core logic, we should ideally do a [run_all] run of slow tests to make sure everything is tip-top before merge.
@@ -1599,14 +1600,30 @@ def test_safetensors_torch_from_torch_sharded(self):
            for p1, p2 in zip(model.parameters(), new_model.parameters()):
                self.assertTrue(torch.equal(p1, p2))

-    def test_modifying_model_config_causes_warning_saving_generation_config(self):
+    def test_modifying_model_config_gets_moved_to_generation_config(self):
Nice :)
What does this PR do?
See title :) I did a quick scan and updated things that fulfilled the following criteria:
Key changes: