Forbid `PretrainedConfig` from saving `generate` parameters; Update deprecations in `generate`-related code 🧹 #32659

Conversation
@@ -12,45 +12,13 @@
# See the License for the specific language governing permissions and
# limitations under the License.

import inspect
TL;DR disabled TF conversion (the warning said it would be removed in v4.43), but kept the error message which redirects to the safetensor conversion guide :)
Added a note to myself to remove this file in the future (v4.47).
@@ -693,7 +693,7 @@ def forward(
        past_key_values = DynamicCache.from_legacy_cache(past_key_values)
        logger.warning_once(
            "Using `past_key_values` as a tuple is deprecated and will be removed in v4.45. "
-           "Please use an appropriate `Cache` class (https://huggingface.co/docs/transformers/v4.41.3/en/internal/generation_utils#transformers.Cache)"
+           "Please use an appropriate `Cache` class (https://huggingface.co/docs/transformers/internal/generation_utils#transformers.Cache)"
found in my quick scan for deprecations (CMD+F on "v4.41") -- let's use our latest docs to guide our users
Force-pushed from 1303438 to f14a71a
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.
Killing old code 🤩
"We detected that you are passing `past_key_values` as a tuple and this is deprecated and will be removed in v4.43. " | ||
"Please use an appropriate `Cache` class (https://huggingface.co/docs/transformers/v4.41.3/en/internal/generation_utils#transformers.Cache)" | ||
"Please use an appropriate `Cache` class (https://huggingface.co/docs/transformers/internal/generation_utils#transformers.Cache)" | ||
) |
Why not trigger this warning within the `DynamicCache.from_legacy_cache` call? It avoids repeating it in all the modeling code and avoids forgetting to add it in new modeling code.
We may want to use `DynamicCache.from_legacy_cache` in non-deprecated ways :p For instance, some optimum integrations use it.
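A minimal sketch of such a use, assuming the usual legacy layout of one `(key, value)` pair per layer with shape `(batch, num_heads, seq_len, head_dim)`:

```python
import torch
from transformers import DynamicCache

# Legacy cache format: a tuple with one (key, value) tensor pair per layer
legacy_cache = tuple(
    (torch.zeros(1, 4, 10, 64), torch.zeros(1, 4, 10, 64)) for _ in range(2)
)

# Convert to a Cache object directly, without going through deprecated modeling paths
cache = DynamicCache.from_legacy_cache(legacy_cache)
print(cache.get_seq_length())  # 10
```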
# 1. The parameter is set to a default generation value (from `generate`'s perspective, it's the same
#    as if nothing is set)
is_default_generation_value = getattr(self, parameter_name) == default_value
# 2. The parameter is set as default in the model config (BC support for models like BART)
This situation was not encoded before this PR (and caused tests to fail, god bless tests 🙏 )
@@ -2549,26 +2549,20 @@ def save_pretrained(
        # Save the config
        if is_main_process:
            if not _hf_peft_config_loaded:
                # If the model config has set attributes that should be in the generation config, move them there.
Instead of warning (before this PR) / raising an exception (this PR, if the lines below were not added) when `model.config` holds things that should be in `model.generation_config` AND were explicitly set by the user (e.g. the user runs `model.config.top_k = 5` before training), we now automagically move misplaced items ✨

Automagically moving things is not usually recommended, but this is a special case: `model.save_pretrained` is often run at the end of training jobs, and crashing here means that $$$ can be lost.
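A minimal sketch of the kind of move described above (the helper name, the parameter list, and the `None` handling are assumptions for illustration, not the actual PR code):

```python
# Hypothetical helper: move generation parameters that were explicitly set on the
# model config into the generation config, instead of crashing at save time.
GENERATION_PARAMS = ("top_k", "top_p", "temperature", "repetition_penalty")

def move_misplaced_generation_params(config, generation_config):
    moved = {}
    for name in GENERATION_PARAMS:
        value = getattr(config, name, None)
        if value is not None:
            setattr(generation_config, name, value)  # move to the generation config
            setattr(config, name, None)              # clear the model config slot
            moved[name] = value
    return moved  # handy for building the warning message
```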
(@amyeroberts don't re-review yet, I'm going through failed tests -- mostly cases where we are explicitly setting
@gante I'm going to unsubscribe to avoid notifications for each push. Just ping / @ me to let me know when it's ready for review again!
<Tip warning={true}>

Setting parameters for sequence generation in the model config is deprecated. For backward compatibility, loading
some of them will still be possible, but attempting to overwrite them will throw an exception -- you should set
them in a [~transformers.GenerationConfig]. Check the documentation of [~transformers.GenerationConfig] for more
information about the individual parameters.

</Tip>
Instead of documenting `generate` parameters in `PretrainedConfig`, which is deprecated, show a warning in the docs pointing towards `GenerationConfig`.
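For reference, a short example of the pattern the warning points users towards (standard `transformers` API; the checkpoint and parameter values are just illustrative):

```python
from transformers import AutoModelForCausalLM, GenerationConfig

model = AutoModelForCausalLM.from_pretrained("gpt2")

# Set generation parameters on a GenerationConfig, not on model.config
model.generation_config = GenerationConfig(max_new_tokens=50, do_sample=True, top_k=5)

# generation_config.json is saved alongside config.json
model.save_pretrained("my-model")
```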
suppressed_indices = []
for token in self.begin_suppress_tokens:
    if token < scores.shape[-1]:  # to ensure we don't go beyond the vocab size
        suppressed_indices.extend([[i, token] for i in range(scores.shape[0])])
We disallowed storing custom generate flags in the model config -> we can no longer set custom generate parameterization in the model tester [which is not necessarily bad, all models should be able to generate with their default parameters!] -> we can't hardcode our way out of edge cases
This change fixes the edge case where we set a token that's not in the vocab to be suppressed. The alternative would be to reconfigure the tester to have a vocab size larger than the default value for this parameter, >50k tokens. Given that it is pretty harmless, I went with this :)
Confirmed that there were no slow TF Whisper test regressions.
Just so I understand, this is to handle cases coming from our testing suite? I don't think we want to handle this in core modeling code: I'd argue things should fail if tokens are selected outside of the vocab. Is there anything we can modify in the tester instead?
@amyeroberts We can increase the vocabulary size of the tester to the original vocabulary size of 50+k tokens (instead of the dummy vocab size of 100).
Happy to do that instead :)
[Note: given TF's indexing, where GPU+no XLA gladly accepts out-of-bounds indexes, this logic actually streamlines the behavior across devices and/or XLA.]
# 1) the generation config must have been created from the model config (`_from_model_config` field);
# 2) the generation config must have seen no modification since its creation (the hash is the same);
# 3) the user must have set generation parameters in the model config.
# NOTE: `torch.compile` can't compile `hash`, this legacy support is disabled with compilation.
if (
    not is_torchdynamo_compiling()
    and self.generation_config._from_model_config
    and self.generation_config._original_object_hash == hash(self.generation_config)
    and self.config._has_non_default_generation_parameters()
this was actually incorrect: we want to detect changes against the default model config, not against the default generation parameters
(To recap: if a user updates the model config and the generation config was automatically generated from it, regenerate it and throw a warning)
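To make the legacy path concrete, a hedged sketch of the user flow being recapped (the exact warning wording is assumed, not quoted from the library):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained("gpt2")
tokenizer = AutoTokenizer.from_pretrained("gpt2")

# Legacy pattern: the generation config was auto-created from the model config
# (`_from_model_config`) and never modified; the user then tweaks the model config.
model.config.do_sample = True
model.config.temperature = 0.7

inputs = tokenizer("Hello", return_tensors="pt")
# generate() notices the untouched, config-derived generation config plus the modified
# model config, regenerates the generation config from it, and emits a deprecation warning.
outputs = model.generate(**inputs)
```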
"encoder" in kwargs and "decoder" in kwargs | ||
), "Config has to be initialized with encoder and decoder config" | ||
if "encoder" not in kwargs or "decoder" not in kwargs: | ||
raise ValueError( |
I added a try/except for composite models, catching `ValueError`, which led to this discovery :) You'll see this pattern fixed in other composite models.
# if however the token to be generated is already at -inf then it can lead token
# `nan` values and thus break generation
self.forced_bos_token_id = None
self.forced_eos_token_id = None
If we go check the PR that added these, we can read that this change was added to maybe fix a bug that happens sporadically 👀

I don't agree with that description of the problem given our current code base; I believe the root cause has been addressed. Furthermore, it's good to remove fudge factors 😉

[Note: BART has a non-None `forced_eos_token_id`, for instance. This means our tests now run against the expected parameterization.]
@@ -368,7 +360,6 @@ def __init__(
        decoder_attention_heads=4,
        max_position_embeddings=30,
        is_encoder_decoder=False,
-       encoder_no_repeat_ngram_size=0,
I think this was meant to make the prior version of `test_generate_continue_from_past_key_values` (touched in this PR) work.
@@ -41,7 +41,6 @@ def __init__(
        act_dim=6,
        state_dim=17,
        hidden_size=23,
-       max_length=11,
No idea why this was here :)
@unittest.skip(reason="To fix, Mamba 2 cache slicing test case is an edge case") | ||
def test_inputs_embeds_matches_input_ids_with_generate(self): | ||
pass |
this test is also failing on main -- perhaps because of concurrent merges?
@unittest.skip(reason="TODO @gante not super important and failing.") | ||
def test_inputs_embeds_matches_input_ids_with_generate(self): | ||
pass |
Failing on main too. I suspect this didn't get detected in the (recent) PR that added it because of our test fetcher, which only checks the main models on these wide changes (?)
Could you open an issue to track?
(The issue catches a wider net than skipping `test_inputs_embeds_matches_input_ids_with_generate`, as this skip is merely a symptom of the root problem :) )
@amyeroberts Ready 🙌 Notes:
Thanks for handling all of these updates!
A few questions / comments but overall looks good 🤗
tests/utils/test_modeling_utils.py (Outdated)
self.assertEqual(len(logs.output), 1)
self.assertIn("Your generation config was originally created from the model config", logs.output[0])
model.save_pretrained(tmp_dir)
self.assertTrue(model.config.repetition_penalty != 3.0)
What does it equal in this case? Is it unset or its old value?
If it's its old value, I'm worried this could cause confusion (do we raise an error if the value is in the model config?)
@amyeroberts that is a great question -- what gets written in `model.config` in this case. I agree with your comment ("If it's its old value, I'm worried this could cause confusion"), which was the actual behavior in the commit you reviewed, so I've made the following adjustments:

- Moving a parameter from `model.config` to `model.generation_config` sets that parameter to `None` in `model.config`
- Storing a generation parameter as `None` in `model.config` is now accepted (even if it is not the default value)
- Updated the warning message, explaining to the user why they are seeing it
- The test now also checks that the warning is thrown. It is also commented, so it should be simple to understand what should be happening.

🤗
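A rough illustration of the adjusted behaviour (the checkpoint and values are just for illustration; the exact warning text is not shown):

```python
import tempfile
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("gpt2")
model.config.repetition_penalty = 3.0  # generation parameter set on the model config

with tempfile.TemporaryDirectory() as tmp_dir:
    model.save_pretrained(tmp_dir)  # warns and moves the value instead of crashing

# The value now lives in the generation config; the model config slot is cleared to None
assert model.generation_config.repetition_penalty == 3.0
assert model.config.repetition_penalty is None
```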
@unittest.skip(reason="TODO @gante not super important and failing.") | ||
def test_inputs_embeds_matches_input_ids_with_generate(self): | ||
pass |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you open an issue to track?
src/transformers/models/encoder_decoder/configuration_encoder_decoder.py (Outdated; resolved)
# Overwritten from the parent class: AST is not compatible with `generate`, but has a config parameter sharing the
# same name (`max_length`). Sharing the same name triggers checks regarding the config -> generation_config
# generative parameters deprecation cycle, overwriting this function prevents this from happening.
def _get_non_default_generation_parameters(self) -> Dict[str, Any]:
    return {}
I'm not sure I understand this - why would there be checks run on a config -> generation_config if the model isn't compatible with generate?
It seems like a bad pattern to need a generate-specific method added to the config if the model can't be used for generation.
I understand and share your pain 😬 Allow me to elaborate.

The root issue is in the model config. A model config is agnostic to whether the model can or cannot generate, unless we hardcode it in the config class or access external objects. When we call `config.save_pretrained`, we don't know whether we should or should not check generation parameters. Since we want to move all generation-related logic to `GenerationConfig`, one of whose goals is to avoid clashes like this and allow proper validation, we have to assume all configs may be holding generate parameters. The current logic enforces that no custom generate parameters are held in the model config, so no new models parameterize `generate` through the model config.

AST is the exception. It doesn't support `generate`, but it has a model parameter whose name is also used in `generate` (`max_length`). If a user creates a model with a custom `max_length`, which should be allowed, the current `save_pretrained` would crash because it thinks it has a custom `generate` parameter. This workaround effectively skips the `generate` checks, since we know AST can't generate (see the short sketch after the list of considered fixes below).

Between the three alternatives I could see, this seemed the best solution -- especially since it is a temporary fix, to be removed after we untangle the two configs (v5.0?)
Considered fixes:

- skip generate-related checks on this specific class (implemented)
- add a parameter to ALL model config classes to store whether the model can generate
- add an argument to `config.save_pretrained` controlling whether to check `generate` parameters, receiving `model.can_generate()` when called from `model.save_pretrained`
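A hedged sketch of the clash described above (the parameter value is illustrative; whether this exact call would have crashed depends on the pre-workaround checks):

```python
from transformers import ASTConfig

# For AST, `max_length` is a spectrogram length, not a generation parameter,
# but it shares its name with `generate`'s `max_length`.
config = ASTConfig(max_length=512)

# Without the override, saving could flag this as a custom `generate` parameter,
# even though AST models cannot generate.
config.save_pretrained("ast-config")
```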
Makes sense - thanks for taking the time to explain!
Force-pushed from a806b06 to 9f91b5d
CI was failing, rebased with main to check if that was the issue. EDIT: root cause identified, it's unrelated to this PR and being fixed (the CI image needs an update)
Thanks for iterating and taking the time to answer all my qs.
As this touches so many models and core logic, we should ideally do a [run_all] run of slow tests to make sure everything is tip-top before merge.
@@ -1599,14 +1600,30 @@ def test_safetensors_torch_from_torch_sharded(self):
            for p1, p2 in zip(model.parameters(), new_model.parameters()):
                self.assertTrue(torch.equal(p1, p2))

-    def test_modifying_model_config_causes_warning_saving_generation_config(self):
+    def test_modifying_model_config_gets_moved_to_generation_config(self):
Nice :)
What does this PR do?
See title :) I did a quick scan and updated things that fulfilled the following criteria:
Key changes: