Remove deprecated logic and warnings #30743
Conversation
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.
Thanks for the cleanup! A few nits here and there and we'll be good to go
@@ -395,11 +394,6 @@ def forward(
        output_attentions: bool = False,
        **kwargs,
@@ -551,14 +545,6 @@ def forward(
        output_attentions: bool = False,
        **kwargs,
@@ -789,11 +775,6 @@ def forward(
        output_attentions: bool = False,
        **kwargs,
Same here, `kwargs` needs to be removed.
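For readers following along, here is a minimal sketch of the change being requested in the hunks quoted above; the `hidden_states` parameter is an assumption added for illustration, and only `output_attentions` and `**kwargs` come from the quoted diff context:

```python
# Sketch only, not the actual transformers signature.
# Before: a catch-all **kwargs silently absorbed deprecated arguments.
def forward(
    self,
    hidden_states,                    # assumed parameter, for illustration
    output_attentions: bool = False,
    **kwargs,                         # <- the reviewer asks for this to go
):
    ...


# After: no catch-all, so unexpected arguments raise a TypeError at call time.
def forward(
    self,
    hidden_states,
    output_attentions: bool = False,
):
    ...
```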
@property
def sin_cached(self):
    logger.warning_once(
        "The sin_cached attribute will be removed in 4.39. Bear in mind that its contents changed in v4.38. Use "
        "the forward method of RoPE from now on instead. It is not used in the `LlamaAttention` class"
    )
    return self._sin_cached

@property
def cos_cached(self):
    logger.warning_once(
        "The cos_cached attribute will be removed in 4.39. Bear in mind that its contents changed in v4.38. Use "
        "the forward method of RoPE from now on instead. It is not used in the `LlamaAttention` class"
    )
    return self._cos_cached
The buffers that are registered need to be deleted as well.
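To make the comment concrete, below is a hedged sketch (the class name, shapes, and defaults are assumptions, not the actual transformers code) of a rotary-embedding module that keeps only the `inv_freq` buffer and computes cos/sin in `forward`, so there are no `_sin_cached` / `_cos_cached` buffers left for the deprecated properties above to return:

```python
# A minimal sketch, assuming the usual RoPE formulation; not the real LlamaRotaryEmbedding.
import torch
import torch.nn as nn


class RotaryEmbeddingSketch(nn.Module):
    def __init__(self, dim: int, base: float = 10000.0):
        super().__init__()
        inv_freq = 1.0 / (base ** (torch.arange(0, dim, 2).float() / dim))
        # Only inv_freq stays registered; no sin/cos caches are kept as buffers.
        self.register_buffer("inv_freq", inv_freq, persistent=False)

    @torch.no_grad()
    def forward(self, x: torch.Tensor, position_ids: torch.Tensor):
        # x is used only for its dtype; position_ids has shape (batch, seq_len).
        freqs = position_ids[:, :, None].float() * self.inv_freq[None, None, :]
        emb = torch.cat((freqs, freqs), dim=-1)
        return emb.cos().to(x.dtype), emb.sin().to(x.dtype)
```

With cos/sin produced on the fly like this, the registered caches need to be removed together with the deprecated properties that returned them.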
same comments as for llama!
same comment as llama
@ArthurZucker Thanks for the review! I've removed all of the kwargs being passed now, and the
fe89540 to 76cfebe
Thanks 🤗
- Comply with the tensor building logic introduced in huggingface#30743
- Add referencing to the optimized Attention Factor equation
- Remove Dynamic YaRN for a more agile deployment

Co-authored-by: mig-mfreitas <mig-mfreitas@users.noreply.github.com>
* Add YaRN and Dynamic-YaRN RoPE Scaling Methods

YaRN (Yet another RoPE extension method) combines the NTK-By-Parts Interpolation and Attention Scaling methods, improving upon existing RoPE interpolation methods for longer context window sizes. Fine-tuned models maintain their original performance across benchmarks while enabling efficient extrapolation and transfer learning for quicker convergence, especially in compute-limited environments.

We implement YaRN and Dynamic-YaRN for the following list of models:
- LLaMA
- Falcon
- GPT-NeoX
- Olmo
- Persimmon
- Phi
- StableLM
- OpenLLaMA

New unit tests are added to assert YaRN's correct behavior on both short and long sequence inputs. For more details, please refer to https://arxiv.org/abs/2309.00071.

Co-authored-by: Miguel Almeida <miguel.pessanha.almeida@tecnico.ulisboa.pt>

* Refactor YaRN implementation for LLaMA

Iterate on YaRN implementation for LLaMA and remove diff from remaining models for increased PR modularity. This commit includes the following changes:
- Merge 'yarn_rope_scaling' and 'rope_scaling' dictionaries
- Remove unnecessary attributes ('extrapolation_factor' and 'finetuned') from YaRN classes
- Inherit 'forward' method in YaRN classes from superclass
- Rename 'yarn' method to 'compute_yarn_scaling'
- Extend YaRN tests with further assertions
- Fix style inconsistencies

Co-authored-by: Miguel Monte e Freitas <miguelmontefreitas@tecnico.ulisboa.pt>

* Refactor Tensor Building Logic for YaRN

- Comply with the tensor building logic introduced in #30743
- Add referencing to the optimized Attention Factor equation
- Remove Dynamic YaRN for a more agile deployment

Co-authored-by: mig-mfreitas <mig-mfreitas@users.noreply.github.com>

* remove unwanted file

---------

Co-authored-by: Miguel Almeida <miguel.pessanha.almeida@tecnico.ulisboa.pt>
Co-authored-by: mig-mfreitas <mig-mfreitas@users.noreply.github.com>
Co-authored-by: Joao Gante <joao@huggingface.co>
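Since the squashed commit message above describes how YaRN works (NTK-by-parts interpolation of the inverse frequencies combined with an attention-scaling factor), here is a hedged sketch of that computation based on the linked paper (https://arxiv.org/abs/2309.00071); the function name, parameters, and defaults are illustrative assumptions rather than the transformers API:

```python
# A minimal sketch of YaRN frequency and attention scaling, assuming the
# formulation from https://arxiv.org/abs/2309.00071; names are illustrative.
import math
import torch


def compute_yarn_inv_freq(
    dim: int,
    base: float = 10000.0,
    factor: float = 8.0,                 # context-length scaling factor (assumed default)
    original_max_position: int = 4096,   # assumed pre-training context length
    beta_fast: float = 32.0,
    beta_slow: float = 1.0,
):
    # Plain RoPE frequencies and their position-interpolated counterparts.
    pos_freqs = base ** (torch.arange(0, dim, 2).float() / dim)
    inv_freq_extrapolation = 1.0 / pos_freqs
    inv_freq_interpolation = 1.0 / (factor * pos_freqs)

    # NTK-by-parts: locate the dimension range whose wavelengths need blending.
    def dim_for_rotations(num_rotations: float) -> float:
        return (dim * math.log(original_max_position / (num_rotations * 2 * math.pi))) / (
            2 * math.log(base)
        )

    low = max(math.floor(dim_for_rotations(beta_fast)), 0)
    high = min(math.ceil(dim_for_rotations(beta_slow)), dim // 2 - 1)
    ramp = torch.clamp(
        (torch.arange(dim // 2).float() - low) / max(high - low, 1e-3), 0.0, 1.0
    )
    extrapolation_mask = 1.0 - ramp

    # Blend interpolated and extrapolated frequencies per dimension.
    inv_freq = (
        inv_freq_interpolation * (1.0 - extrapolation_mask)
        + inv_freq_extrapolation * extrapolation_mask
    )

    # Attention scaling ("temperature") applied to cos/sin, per the paper.
    attention_factor = 0.1 * math.log(factor) + 1.0
    return inv_freq, attention_factor
```

The returned `inv_freq` plays the role of the plain RoPE buffer, and `attention_factor` scales the cos/sin tensors built in the forward pass, which is where this commit has to comply with the tensor-building logic introduced in this PR.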
What does this PR do?
Kills a bunch of deprecated code