
Forbid PretrainedConfig from saving generate parameters; Update deprecations in generate-related code 🧹 #32659

Merged

merged 36 commits into huggingface:main from gante:deprecations_aug on Aug 23, 2024

Conversation

gante
Member

@gante gante commented Aug 13, 2024

What does this PR do?

See title :) I did a quick scan and updated the things that fulfilled the following criteria:

  • had deprecation messages
  • were related to what I've been working on

Key changes:

  1. Saving a model config with parameters meant to be in the generation config is no longer allowed
  2. However, at model saving time (which also saves the related configs), misplaced parameters are moved to the generation config before any exception is raised. This prevents fine-tuning jobs from failing after significant compute $ has been spent.
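A minimal sketch of change 1 from the user's perspective (checkpoint name and parameter value are illustrative):

```python
import tempfile

from transformers import AutoConfig

config = AutoConfig.from_pretrained("openai-community/gpt2")
config.temperature = 0.5  # a `generate` parameter; it belongs in the generation config

with tempfile.TemporaryDirectory() as tmp_dir:
    config.save_pretrained(tmp_dir)  # after this PR: raises instead of silently saving
```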

@@ -12,45 +12,13 @@
# See the License for the specific language governing permissions and
# limitations under the License.

import inspect
Member Author

@gante gante Aug 13, 2024

TL;DR disabled TF conversion (the warning said it would be removed in v4.43), but kept the error message which redirects to the safetensors conversion guide :)

Added a note to myself to remove this file in the future (v4.47).

@@ -693,7 +693,7 @@ def forward(
past_key_values = DynamicCache.from_legacy_cache(past_key_values)
logger.warning_once(
"Using `past_key_values` as a tuple is deprecated and will be removed in v4.45. "
"Please use an appropriate `Cache` class (https://huggingface.co/docs/transformers/v4.41.3/en/internal/generation_utils#transformers.Cache)"
"Please use an appropriate `Cache` class (https://huggingface.co/docs/transformers/internal/generation_utils#transformers.Cache)"
Member Author

found in my quick scan for deprecations (CMD+F on "v4.41") -- let's use our latest docs to guide our users

@gante gante requested a review from amyeroberts August 13, 2024 15:44
@gante gante force-pushed the deprecations_aug branch from 1303438 to f14a71a on August 13, 2024 15:45
@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

Collaborator

@amyeroberts amyeroberts left a comment

Killing old code 🤩

Comment on lines 959 to 954
"We detected that you are passing `past_key_values` as a tuple and this is deprecated and will be removed in v4.43. "
"Please use an appropriate `Cache` class (https://huggingface.co/docs/transformers/v4.41.3/en/internal/generation_utils#transformers.Cache)"
"Please use an appropriate `Cache` class (https://huggingface.co/docs/transformers/internal/generation_utils#transformers.Cache)"
)
Collaborator

Why not trigger this warning within the `DynamicCache.from_legacy_cache` call? That avoids repeating it in all the modeling code, and avoids forgetting to add it in new modeling code.

Member Author

We may want to use DynamicCache.from_legacy_cache in non-deprecated ways :p for instance, some optimum integrations use it
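For reference, a minimal sketch of the conversion under discussion (shapes are illustrative):

```python
import torch

from transformers import DynamicCache

# Legacy format: a tuple of per-layer (key, value) tensor pairs.
# Dummy shapes: 2 layers, batch 1, 4 heads, sequence length 3, head dim 8.
legacy_past = tuple(
    (torch.zeros(1, 4, 3, 8), torch.zeros(1, 4, 3, 8)) for _ in range(2)
)

# The non-deprecated use mentioned above: converting into a `Cache` object.
cache = DynamicCache.from_legacy_cache(legacy_past)
print(cache.get_seq_length())  # 3
```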

# 1. The parameter is set to a default generation value (from `generate`'s perspective, it's the same
# as if nothing is set)
is_default_generation_value = getattr(self, parameter_name) == default_value
# 2. The parameter is set as default in the model config (BC support for models like BART)
Member Author

This situation was not encoded before this PR (and caused tests to fail, god bless tests 🙏 )
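For illustration, BART is one such model: its config class ships a generation-related default that differs from `generate`'s own default, and the new check above ensures it is still treated as "nothing was set":

```python
from transformers import BartConfig

config = BartConfig()
# 2 by default in the config class, while `generate`'s default is None
print(config.forced_eos_token_id)
```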

@@ -2549,26 +2549,20 @@ def save_pretrained(
# Save the config
if is_main_process:
if not _hf_peft_config_loaded:
# If the model config has set attributes that should be in the generation config, move them there.
Member Author

@gante gante Aug 13, 2024

Instead of warning (before this PR) / raising an exception (this PR, if the lines below were not added) when model.config holds things that should be in model.generation_config AND were explicitly set by the user (e.g. the user runs model.config.top_k = 5 before training), we now automagically move misplaced items ✨

Automagically moving things is not usually recommended, but this is a special case: model.save_pretrained is often run at the end of training jobs, and crashing here means that $$$ can be lost

@amyeroberts amyeroberts self-requested a review August 14, 2024 11:51
@gante
Member Author

gante commented Aug 14, 2024

(@amyeroberts don't re-review yet, I'm going through failed tests -- mostly cases where we are explicitly setting generate arguments in model.config, precisely what we don't want users to do)

@amyeroberts
Collaborator

@gante I'm going to unsubscribe to avoid notifications for each push. Just ping / @ me to let me know when it's ready for review again!

Comment on lines +84 to +91
<Tip warning={true}>

Setting parameters for sequence generation in the model config is deprecated. For backward compatibility, loading
some of them will still be possible, but attempting to overwrite them will throw an exception -- you should set
them in a [~transformers.GenerationConfig]. Check the documentation of [~transformers.GenerationConfig] for more
information about the individual parameters.

</Tip>
Member Author

Instead of documenting generate parameters in PretrainedConfig, which is deprecated, show a warning in the docs pointing towards GenerationConfig
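In practice, the pattern the docs now point to looks roughly like this (values and directory are illustrative):

```python
from transformers import GenerationConfig

generation_config = GenerationConfig(max_new_tokens=50, do_sample=True, top_k=10)
generation_config.save_pretrained("my_model_dir")  # writes generation_config.json
```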

Comment on lines +523 to +526
suppressed_indices = []
for token in self.begin_suppress_tokens:
if token < scores.shape[-1]: # to ensure we don't go beyond the vocab size
suppressed_indices.extend([[i, token] for i in range(scores.shape[0])])
Member Author

@gante gante Aug 15, 2024

We disallowed storing custom generate flags in the model config -> we can no longer set custom generate parameterization in the model tester [which is not necessarily bad, all models should be able to generate with their default parameters!] -> we can't hardcode our way out of edge cases

This change fixes the edge case where we set a token that's not in the vocab to be suppressed. The alternative would be to reconfigure the tester to have a vocab size larger than the default value for this parameter, >50k tokens. Given that it is pretty harmless, I went with this :)

Confirmed that there were no slow TF Whisper test regressions.

Collaborator

Just so I understand, this is to handle cases coming from our testing suite? I don't think we want to handle this in core modeling code: I'd argue things should fail if tokens are selected outside of the vocab. Is there anything we can modify in the tester instead?

Member Author

@amyeroberts We can increase the vocabulary size of the tester to the original vocabulary size of 50+k tokens (instead of the dummy vocab size of 100).

Happy to do that instead :)

[Note: given TF's indexing, where GPU+no XLA gladly accepts out-of-bounds indexes, this logic actually streamlines the behavior across devices and/or XLA.]
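A minimal numpy stand-in for the guarded suppression, showing how an out-of-vocab token is simply skipped (shapes and values are illustrative):

```python
import numpy as np

scores = np.zeros((2, 100))         # (batch_size, vocab_size): dummy 100-token vocab
begin_suppress_tokens = [7, 50256]  # 50256 exceeds the dummy vocab size

for token in begin_suppress_tokens:
    if token < scores.shape[-1]:    # the bounds check added in this PR
        scores[:, token] = -np.inf  # suppress the token for every batch entry
```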

# 1) the generation config must have been created from the model config (`_from_model_config` field);
# 2) the generation config must have seen no modification since its creation (the hash is the same);
# 3) the user must have set generation parameters in the model config.
# NOTE: `torch.compile` can't compile `hash`, this legacy support is disabled with compilation.
if (
not is_torchdynamo_compiling()
and self.generation_config._from_model_config
and self.generation_config._original_object_hash == hash(self.generation_config)
and self.config._has_non_default_generation_parameters()
Member Author

@gante gante Aug 15, 2024

this was actually incorrect: we want to detect changes against the default model config, not against the default generation parameters

(To recap: if a user updates the model config and the generation config was automatically generated from it, regenerate it and throw a warning)
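A sketch of the path being guarded, assuming a model built from a bare config (so its generation config is auto-created from the model config):

```python
import torch

from transformers import AutoConfig, AutoModelForCausalLM

config = AutoConfig.from_pretrained("openai-community/gpt2")
model = AutoModelForCausalLM.from_config(config)
assert model.generation_config._from_model_config  # created from the model config

# The user then sets a generate parameter on the *model config*...
model.config.num_beams = 2

# ...so `generate` regenerates the generation config from it and warns once.
out = model.generate(torch.tensor([[1, 2, 3]]), max_new_tokens=2)
```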

"encoder" in kwargs and "decoder" in kwargs
), "Config has to be initialized with encoder and decoder config"
if "encoder" not in kwargs or "decoder" not in kwargs:
raise ValueError(
Member Author

@gante gante Aug 15, 2024

I added a try/except for composite models, catching ValueError, which led to this discovery :) You'll see this pattern fixed in other composite models.
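For context, a sketch of the caller-side pattern that motivated the assert -> raise change, assuming this hunk belongs to EncoderDecoderConfig or a similar composite config (an `assert` raises AssertionError, which an `except ValueError` clause cannot catch, and it vanishes entirely under `python -O`):

```python
from transformers import EncoderDecoderConfig

try:
    config = EncoderDecoderConfig()  # missing the `encoder`/`decoder` kwargs
except ValueError as exc:
    print(f"caught: {exc}")  # with the old assert, this clause never ran
```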

# if however the token to be generated is already at -inf then it can lead token
# `nan` values and thus break generation
self.forced_bos_token_id = None
self.forced_eos_token_id = None
Member Author

@gante gante Aug 15, 2024

If we go check the PR that added these, we can read that this change was added to maybe fix a bug that happens sporadically 👀

I don't agree with the description of the problem given our current code base; I believe the root cause has been addressed. Furthermore, it's good to remove fudge factors 😉

[note: BART has a non-None forced_eos_token_id, for instance. This means our tests now run against the expected parameterization]

@@ -368,7 +360,6 @@ def __init__(
decoder_attention_heads=4,
max_position_embeddings=30,
is_encoder_decoder=False,
encoder_no_repeat_ngram_size=0,
Member Author

@gante gante Aug 15, 2024

I think this was meant to make the prior version of test_generate_continue_from_past_key_values (touched in this PR) work.

@@ -41,7 +41,6 @@ def __init__(
act_dim=6,
state_dim=17,
hidden_size=23,
max_length=11,
Member Author

No idea why this was here :)

Comment on lines 230 to 232
@unittest.skip(reason="To fix, Mamba 2 cache slicing test case is an edge case")
def test_inputs_embeds_matches_input_ids_with_generate(self):
pass
Member Author

@gante gante Aug 15, 2024

this test is also failing on main -- perhaps because of concurrent merges?

Comment on lines 416 to 418
@unittest.skip(reason="TODO @gante not super important and failing.")
def test_inputs_embeds_matches_input_ids_with_generate(self):
pass
Member Author

Failing on main too.

I suspect this didn't get detected in the (recent) PR that added it because of our test fetcher, which only checks the main models on these wide changes (?)

Collaborator

Could you open an issue to track?

Member Author

#32913

(The issue catches a wider net than skipping test_inputs_embeds_matches_input_ids_with_generate, as this skip is merely a symptom of the root problem :) )

@gante
Member Author

gante commented Aug 15, 2024

@amyeroberts Ready 🙌

Notes:

  1. saving a config with custom generate parameters now raises an exception -> all testers that relied on setting custom generation parameters to work around edge cases failed [because we pass the tester args to the model config] -> this PR fixes edge cases/updates tests 💪
  2. touching many files triggers a lot of test runs. A few unrelated broken tests were found (and fixed) in the process
  3. I've added GH comments on details that are both not useful for the future codebase AND not obvious as to why the change was made
  4. an unrelated pipeline test is failing due to a timeout, and triggering a new run is not making CI green. I'll need you to merge when you're happy with the PR :D

@gante gante changed the title Remove/update deprecated generate-related code 🧹 Forbid PretrainedConfig from saving generate parameters; Update deprecations in generate-related code 🧹 Aug 15, 2024
Collaborator

@amyeroberts amyeroberts left a comment

Thanks for handling all of these updates!

A few questions / comments but overall looks good 🤗

self.assertEqual(len(logs.output), 1)
self.assertIn("Your generation config was originally created from the model config", logs.output[0])
model.save_pretrained(tmp_dir)
self.assertTrue(model.config.repetition_penalty != 3.0)
Collaborator

What does it equal in this case? Is it unset or its old value?

If it's its old value, I'm worried this could cause confusion (do we raise an error if the value is in the model config?)

Member Author

@gante gante Aug 21, 2024

@amyeroberts that is a great question -- what gets written in model.config in this case. I agree with your comment (If it's its old value, I'm worried this could cause confusion), which was the actual behavior in the commit you reviewed, so I've made the following adjustments (sketched below):

  1. Moving a parameter from model.config to model.generation_config sets that parameter to None in model.config
  2. Storing a generation parameter as None in model.config is now accepted (even if it is not the default value)
  3. Updated the warning message, explaining to the user why they are seeing it
  4. The test now also checks that the warning is thrown. It is also commented, so it should be simple to understand what should be happening 🤗
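A minimal sketch of the adjusted behavior, assuming a small public test checkpoint:

```python
import tempfile

from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("hf-internal-testing/tiny-random-gpt2")
model.config.repetition_penalty = 3.0  # misplaced generate parameter

with tempfile.TemporaryDirectory() as tmp_dir:
    model.save_pretrained(tmp_dir)  # warns, then moves the parameter

assert model.generation_config.repetition_penalty == 3.0  # moved here
assert model.config.repetition_penalty is None            # unset in the model config
```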


Comment on lines +124 to +128
# Overwritten from the parent class: AST is not compatible with `generate`, but has a config parameter sharing the
# same name (`max_length`). Sharing the same name triggers checks regarding the config -> generation_config
# generative parameters deprecation cycle, overwriting this function prevents this from happening.
def _get_non_default_generation_parameters(self) -> Dict[str, Any]:
return {}
Collaborator

I'm not sure I understand this - why would there be checks run on a config -> generation_config if the model isn't compatible with generate?

It seems like a bad pattern to need a generate-specific method added to the config if the model can't be used for generation

Member Author

I understand and share your pain 😬 Allow me to elaborate.

The root issue is in the model config. A model config is agnostic to whether the model can or cannot generate, unless we hardcode it in the config class or access external objects.

When we call config.save_pretrained, we don't know whether we should or should not check generation parameters. Since we want to move all generation-related logic to GenerationConfig, one of whose goals is to avoid clashes like this and allow proper validation, we have to assume all configs may be holding generate parameters. The current logic enforces that no custom generate parameters are held in the model config, so no new models parameterize generate through the model config.

AST is the exception. It doesn't support generate, but it has a model parameter whose name is also used in generate (max_length). If a user creates a model with a custom max_length, which should be allowed, the current save_pretrained would crash because it thinks it has a custom generate parameter. This workaround effectively skips the generate checks, since we know it can't generate.

Of the three alternatives I could see, this seemed the best solution -- especially since it is a temporary fix, to be removed after we untangle the two configs (v5.0?)

Considered fixes:

  • skip generate-related checks on this specific class (implemented)
  • add a parameter to ALL model config classes to store whether the model can generate
  • add an argument to config.save_pretrained controlling whether to check generate parameters, receiving model.can_generate() when called from model.save_pretrained

Collaborator

Makes sense - thanks for taking the time to explain!


@gante gante force-pushed the deprecations_aug branch from a806b06 to 9f91b5d on August 21, 2024 10:45
@gante
Member Author

gante commented Aug 21, 2024

CI was failing, rebased with main to check if that was the issue

EDIT: root cause identified, it's unrelated to this PR and being fixed (the CI image needs an update)

Collaborator

@amyeroberts amyeroberts left a comment

Thanks for iterating and taking the time to answer all my qs.

As this touches so many models and core logic, we should ideally do a [run_all] run of slow tests to make sure everything is tip-top before merge

@@ -1599,14 +1600,30 @@ def test_safetensors_torch_from_torch_sharded(self):
for p1, p2 in zip(model.parameters(), new_model.parameters()):
self.assertTrue(torch.equal(p1, p2))

def test_modifying_model_config_causes_warning_saving_generation_config(self):
def test_modifying_model_config_gets_moved_to_generation_config(self):
Collaborator

Nice :)


@gante gante merged commit 970a16e into huggingface:main Aug 23, 2024
23 checks passed
@gante gante deleted the deprecations_aug branch August 23, 2024 10:12
BernardZach pushed a commit to BernardZach/transformers that referenced this pull request Dec 5, 2024
Forbid `PretrainedConfig` from saving `generate` parameters; Update deprecations in `generate`-related code 🧹 (huggingface#32659)

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>