
Fix max size deprecated warning #34998

Open
wants to merge 18 commits into
base: main

Conversation


@HichTala HichTala commented Nov 28, 2024

This pull request focuses on removing the deprecated max_size argument from the preprocess method across multiple image processing modules in the transformers library. This change simplifies the code and aligns with the planned deprecation of the max_size argument.

Key changes include:

  • Removal of the deprecated max_size argument from the preprocess methods.

Fixes #34977
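
For illustration, the caller-facing change looks roughly like this (a hedged sketch assuming a DETR-style image processor; exact default sizes vary per model):

from transformers import DetrImageProcessor

# Before (deprecated): passing `max_size` triggers the warning this PR addresses
processor = DetrImageProcessor(size=800, max_size=1333)

# After: both bounds are expressed through the `size` dict
processor = DetrImageProcessor(size={"shortest_edge": 800, "longest_edge": 1333})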

Before submitting

  • This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
  • Did you read the contributor guideline,
    Pull Request section?
  • Was this discussed/approved via a Github issue or the forum? Please add a link
    to it if that's the case.
  • Did you make sure to update the documentation with your changes? Here are the
    documentation guidelines, and
    here are tips on formatting docstrings.
  • Did you write any new necessary tests?

Who can review?

Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.

@qubvel here is my pull request.

…` and triggered unnecessary deprecated warning
@qubvel qubvel self-requested a review November 28, 2024 14:32
@HichTala
Author

While working on this PR, I've noticed some other possible improvements to advance the expected deprecation of the max_size argument.

Here I'm not sure what type of size we want to handle:

if "max_size" in kwargs:
logger.warning_once(
"The `max_size` parameter is deprecated and will be removed in v4.26. "
"Please specify in `size['longest_edge'] instead`.",
)
max_size = kwargs.pop("max_size")
else:
max_size = None
size = get_size_dict(size, max_size=max_size, default_to_square=False)

If it is only a Dict type, as the typing of the function specifies, then the max_size variable can be removed; we could imagine something like this:

size = size if size is not None else {"shortest_edge": 800, "longest_edge": 1333}
if "max_size" in kwargs:
    logger.warning_once(
        "The `max_size` parameter is deprecated and will be removed in v4.26. "
        "Please specify in `size['longest_edge'] instead`.",
    )
    size['longest_edge'] = kwargs.pop("max_size")

size = get_size_dict(size, default_to_square=False)

I've also noticed a weird use of size in SamImageProcessor:

size = size if size is not None else {"longest_edge": 1024}
size = get_size_dict(max_size=size, default_to_square=False) if not isinstance(size, dict) else size

I think we should normalize the use of Dict across the code, for example here by passing something like {'longest_edge': size} as the argument?
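
As a sketch of what I mean (normalize_size here is a hypothetical helper, not an existing transformers function):

from typing import Dict, Union

def normalize_size(size: Union[int, Dict[str, int]]) -> Dict[str, int]:
    """Hypothetical helper: map a bare int to the dict form used everywhere else."""
    # For SAM, a bare int historically meant the longest edge ({"longest_edge": 1024} by default)
    if isinstance(size, int):
        return {"longest_edge": size}
    return dict(size)

print(normalize_size(1024))                    # {'longest_edge': 1024}
print(normalize_size({"longest_edge": 512}))   # {'longest_edge': 512}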

And here too, the typing of the function specifies the Dict type for size, but we are also handling other types; is this OK?

If I get more details, I'd be happy to contribute and maybe work on removing the max_size parameter across the code.

Member

@qubvel qubvel left a comment


@HichTala Thanks for cleaning this up!

@HichTala
Author

No worries, if you'd like me to clean up a bit more as I explained in my previous comment, don't hesitate to let me know!

@qubvel
Member

qubvel commented Nov 28, 2024

I suppose max_size can be entirely removed here, because we are at version 4.47 right now

if "max_size" in kwargs: 
     logger.warning_once( 
         "The `max_size` parameter is deprecated and will be removed in v4.26. " 
         "Please specify in `size['longest_edge'] instead`.", 
     ) 
     max_size = kwargs.pop("max_size") 
 else: 
     max_size = None 
 size = get_size_dict(size, max_size=max_size, default_to_square=False)
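
A sketch of what that block could reduce to once the max_size branch is dropped entirely (reusing the dict default already shown above):

from transformers.image_processing_utils import get_size_dict

# With the deprecation branch gone, only the dict-based `size` remains
size = {"shortest_edge": 800, "longest_edge": 1333}
size = get_size_dict(size, default_to_square=False)
print(size)  # {'shortest_edge': 800, 'longest_edge': 1333}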

@qubvel
Member

qubvel commented Nov 28, 2024

And here too, the typing of the function specifies the Dict type for size, but we are also handling other types; is this OK?

There might be configs on the Hub that still have an integer as a size parameter. That's why we keep it here for backward compatibility.

@HichTala
Author

OK then, I'll try to clean up the DETR variants a bit more and try not to break everything. 🫡

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

Add a test to ensure test can pass successfully and backward compatibility
@HichTala
Author

The test pipelines still use max_size:

image_processor = image_processing_class.from_dict(
    self.image_processor_dict, size=42, max_size=84, pad_and_return_pixel_mask=False
)

I don't know if I'm allowed to modify those?
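
The replacement would presumably look like this (a sketch keeping the same test values, with both bounds folded into `size`):

image_processor = image_processing_class.from_dict(
    self.image_processor_dict,
    size={"shortest_edge": 42, "longest_edge": 84},
    pad_and_return_pixel_mask=False,
)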

Remove `max_size` from test pipelines and replace it with a `size` `Dict` using `'shortest_edge'` and `'longest_edge'` as keys
@HichTala HichTala requested a review from qubvel November 28, 2024 17:47
@qubvel
Member

qubvel commented Nov 28, 2024

Thanks for iterating! It seems it's a breaking change; you can check with the https://huggingface.co/microsoft/table-transformer-detection/ model on your branch and on main:

from transformers import AutoProcessor

image_processor = AutoProcessor.from_pretrained("microsoft/table-transformer-detection")
print(image_processor.size)

Comment on lines 880 to 883
size = (
    {"shortest_edge": size, "longest_edge": 1333} if isinstance(size, int) else size
)  # Backwards compatibility
size = get_size_dict(size, default_to_square=False)
Member


The comment should be placed above to avoid this line break.

@HichTala
Author

HichTala commented Nov 28, 2024

The model you sent uses max_size:
https://huggingface.co/microsoft/table-transformer-detection/blob/main/preprocessor_config.json
(L16)

Do we raise an error saying it's deprecated, or do we put back the deprecation warnings?

@qubvel
Member

qubvel commented Nov 29, 2024

Ideally, it would be great to make PRs with an updated size for all detection models with over 100 downloads on the Hub, merge them, and then eliminate max_size in the codebase. It might be a bit time-consuming, so it's up to you! If you choose this route, I can share a draft script on how to fetch the required configs and make PRs. Otherwise, let's revert to backward compatibility with a warning and clean up just where it's possible.

I'm happy with whichever decision you make!

@HichTala
Author

I'm OK to work on this. I think I'll roll back this PR to the point where backward compatibility is ensured with a warning, and I'll update "version 4.26" to "a future version", which may be more meaningful. That way, it can be merged and my code will finally be able to run without those flooding warning messages. I'll then open some new PRs to start eliminating max_size.

It would be great if you could share the draft script with me.

@qubvel
Member

qubvel commented Nov 29, 2024

Here is a draft I used to fetch object detection models' image processor configs:

from transformers import AutoProcessor
from huggingface_hub import list_models, hf_hub_download
import json

# Filter models with the tag 'pipeline_tag=object-detection'
models = list_models(tags="object-detection", gated=False, sort="downloads", limit=1000)

count = 0
for model in models:
    try:
        # Download preprocessor_config.json
        config_path = hf_hub_download(repo_id=model.id, filename="preprocessor_config.json")
        with open(config_path, "r") as file:
            config = json.load(file)

        # Check if 'size' is a dict
        if not isinstance(config.get("size"), dict):
            print(f"#{count} Model: {model.id}")
            print(f"Created: {model.created_at}")
            print(f"Downloads: {model.downloads}\n")
            count += 1

            # Load the processor (from_pretrained normalizes the legacy size/max_size values)
            image_processor = AutoProcessor.from_pretrained(model.id)
            print(image_processor.size)

    except Exception as e:
        print(f"Error processing model {model.id}: {e}")

The idea is to run this on main to load configs and then push them

You first load it

image_processor = AutoProcessor.from_pretrained(model.id)

And then push back + open a PR

image_processor.save_pretrained(model.id, create_pr=True)   # or something similar

I will also need links to all the PRs, to share them with someone who can merge them.

Please be careful not to spam all models; let's do it batch-wise, and for the first iteration we can do it for the first five models.

@HichTala
Author

Thanks for sharing! I’ll probably work on this at the start of next week. Could you let me know how I can provide links to the PRs once they’re created?

@qubvel
Member

qubvel commented Nov 29, 2024

Probably, save_pretrained returns something, but I'm not sure! It's something that has to be explored. Alternatively, you can use the huggingface_hub library to upload a JSON file and similarly open a PR; this way, it will return a link (or an object containing a link).
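
A sketch of that huggingface_hub route, assuming upload_file with create_pr=True returns a CommitInfo carrying a pr_url (worth double-checking against the installed version):

from huggingface_hub import upload_file

# Push the updated config to the model repo as a PR instead of a direct commit
commit_info = upload_file(
    path_or_fileobj="preprocessor_config.json",  # locally re-saved config with dict-based `size`
    path_in_repo="preprocessor_config.json",
    repo_id=model.id,  # from the loop in the draft script above
    create_pr=True,
    commit_message="Use dict `size` and drop deprecated `max_size`",
)
print(commit_info.pr_url)  # link to collect and share for merging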

@qubvel
Copy link
Member

qubvel commented Nov 29, 2024

I’ll probably work on this at the start of next week.

No worries at all! Manage your time according to your preferences, and we will always be happy with your contributions. 🤗

@qubvel qubvel self-requested a review December 3, 2024 10:46
Member

@qubvel qubvel left a comment


Thanks! We can merge it in this state and continue in a follow-up PR.

@qubvel qubvel requested a review from ArthurZucker December 3, 2024 11:07
@qubvel
Member

qubvel commented Dec 3, 2024

@ArthurZucker, a minor cleanup for max_size variables and warnings in image processors.

Collaborator

@ArthurZucker ArthurZucker left a comment


Hey! I scanned it a bit; I am guessing this was breaking for a few models, so we are not removing support?
IMO we should still break, since we had a deprecation cycle, but try to help the models on the Hub; unless our deprecation cycle did not cover a certain case, in which case we do another deprecation cycle for the missing cases!

@qubvel
Member

qubvel commented Dec 20, 2024

@ArthurZucker this one should not break anything, just a minor clean-up. The second one (linked) is breaking; we are working on updating the Hub configs first.

Collaborator

@ArthurZucker ArthurZucker left a comment


Sure! Let's just remove the "future version"; it's vague and unhelpful. We can leave it as is, people know it's already deprecated!

"The `max_size` parameter is deprecated and will be removed in v4.26. "
"The `max_size` parameter is deprecated and will be removed in a future version. "
Collaborator


to revert!

Author


Done 👍

@@ -1347,7 +1346,7 @@ def preprocess(
     do_resize = self.do_resize if do_resize is None else do_resize
     size = self.size if size is None else size
-    size = get_size_dict(size=size, max_size=max_size, default_to_square=False)
+    size = get_size_dict(size=size, default_to_square=False)
Collaborator


cool! thanks

Development

Successfully merging this pull request may close these issues.

Deprecation Warning for max_size in DetrImageProcessor.preprocess