Add ViTImageProcessorFast to tests #31424

Merged

Conversation

amyeroberts
Collaborator

@amyeroberts amyeroberts commented Jun 14, 2024

What does this PR do?

ViTImageProcessorFast was accidentally not added to the test suite when the class was added to the library in #28847, as fast_image_processor_class wasn't set in the tester.

This PR adds it to the tester, fixes anything needed for the tests to pass, and adds two tests to make sure cross-loading between the fast and slow image processors works as expected.

@@ -151,6 +151,11 @@ def center_crop(
**kwargs,
)

def to_dict(self):
encoder_dict = super().to_dict()
encoder_dict.pop("_valid_processor_keys", None)
Collaborator Author

@molbap Would be good to have your confirmation here that we can remove this. I don't think we want to save it out in the config: it's private and wouldn't be used when creating a new config class.

Contributor

Yes, I agree! There's no use having it written to the config.
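The change above, fleshed out as a minimal runnable sketch. The class names and attributes here are simplified stand-ins, not the actual transformers classes; only the `to_dict` / `pop` pattern comes from the diff.

```python
# Simplified stand-ins to show why popping the private key matters:
# without it, _valid_processor_keys would be serialized into the saved
# preprocessor config.
class BaseImageProcessor:
    def __init__(self, size=224):
        self.size = size
        # Private bookkeeping attribute; an implementation detail.
        self._valid_processor_keys = ["size", "resample"]

    def to_dict(self):
        return dict(self.__dict__)


class FastImageProcessorSketch(BaseImageProcessor):
    def to_dict(self):
        encoder_dict = super().to_dict()
        # Drop the private key: it shouldn't be saved out in the config
        # and wouldn't be used when creating a new config class.
        encoder_dict.pop("_valid_processor_keys", None)
        return encoder_dict


config = FastImageProcessorSketch().to_dict()
# config keeps "size" but no longer carries "_valid_processor_keys"
```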

Collaborator

@molbap

Just wondering:

  • why is _valid_processor_keys an instance attribute? (could it be a class attribute?)

image_processor_class += "Fast"
elif not use_fast and image_processor_class.endswith("Fast"):
image_processor_class = image_processor_class[:-4]
if use_fast is not None:
Collaborator Author

We only update the image_processor_class if use_fast is explicitly set. This makes sure that if a fast image processor is saved out, it will be loaded as a fast image processor by AutoImageProcessor by default.

Collaborator

I got the idea. But it looks to me like here you want to make sure that if a slow (rather than fast, as you mentioned above) image processor is saved and use_fast is not specified, we load with the fast image processor.

Collaborator Author

I'm not sure I completely understood your comment, but let me try and explain the desired behaviour, and you can let me know if I've mistaken something :)

At the moment, we don't want use_fast to default to either False or True, as this will affect the image processor loaded in regardless of the type of image processor saved out.

Desired behaviours

  • Saved slow image processor, use_fast unset -> loads slow image processor by default
  • Saved fast image processor, use_fast unset -> loads fast image processor by default
  • Saved slow image processor, use_fast=False -> loads slow image processor
  • Saved slow image processor, use_fast=True -> loads fast image processor
  • Saved fast image processor, use_fast=False -> loads slow image processor
  • Saved fast image processor, use_fast=True -> loads fast image processor

i.e. if a fast image processor is available but a slow image processor is saved out, we still want to default to the slow image processor for now, as the two won't produce exactly the same outputs, and instead inform the user that a fast version is available. This avoids unexpected behaviour.
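The behaviours above can be sketched as one dispatch function. The function name and shape are illustrative, not the exact AutoImageProcessor code; only the "Fast" suffix convention is taken from the diff in this thread.

```python
def resolve_image_processor_class(saved_class_name: str, use_fast=None) -> str:
    """Map a saved image processor class name plus the user's use_fast
    flag to the name of the class that should be loaded."""
    image_processor_class = saved_class_name
    if use_fast is not None:
        # Only rewrite the class name when use_fast is explicitly set...
        if use_fast and not image_processor_class.endswith("Fast"):
            image_processor_class += "Fast"
        elif not use_fast and image_processor_class.endswith("Fast"):
            image_processor_class = image_processor_class[: -len("Fast")]
    # ...otherwise keep whatever type was saved out: a saved fast
    # processor loads as fast, a saved slow one loads as slow.
    return image_processor_class
```

With use_fast left as None, the saved type wins; with it set, the suffix is added or stripped, covering all six desired behaviours.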

Collaborator

@ydshieh ydshieh Jun 24, 2024

OK, I got it now.

that if a fast image processor is saved out, it will be loaded in as a fast image processor if use_fast is None

which is not the behaviour on current main, but is addressed in this PR!

sorry (not easy to reason about all these cases and the motivation)

Collaborator Author

which is not the behaviour on current main, but is addressed in this PR!

Yep! It was a bug that I hadn't noticed before 😬

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@amyeroberts amyeroberts force-pushed the test-fast-vit-image-processor branch 6 times, most recently from e888ec4 to cf622c7 Compare June 19, 2024 18:35
@@ -772,7 +772,7 @@ def preprocess(
ignore_index,
do_reduce_labels,
return_tensors,
input_data_format=input_data_format,
input_data_format=data_format,
Collaborator Author

For this model (and the ones below with this change), input_data_format is now data_format because the images were converted to data_format in the _preprocess_image step.

This wasn't previously caught because the test images and the defaults were always in channels_last format.

Now, we're feeding in channels_first numpy images for tests, as this is the standard format for numpy images.

_ = image_processor(image, return_tensors="pt")
return time.time() - start

dummy_images = torch.randint(0, 255, (4, 3, 224, 224), dtype=torch.uint8)
Collaborator Author

Testing with a PIL image was a bit flaky in terms of time comparisons...

@@ -238,6 +239,52 @@ def test_image_processor_save_load_with_autoimageprocessor(self):

self.assertEqual(image_processor_second.to_dict(), image_processor_first.to_dict())

def test_save_load_fast_slow(self):
Collaborator Author

Added tests to make sure we can correctly cross-load fast and slow image processors, both directly with their classes and with the auto class.

@amyeroberts amyeroberts requested a review from ydshieh June 20, 2024 14:11
@ydshieh
Collaborator

ydshieh commented Jun 20, 2024

Hi @amyeroberts . Could you help me to get some more context regarding the changes in this PR 🙏 ?

From the title and the description, I understand:

  • add ViTImageProcessorFast to the list so we can test it
  • fixes anything needed for tests to pass as well as two additional tests

However, I don't understand why we have the changes below:

  • In several files:
            elif isinstance(image, np.ndarray):
                h, w = image.shape[:2]

Is this because previously we didn't test with np.ndarray, but now we do in this PR?

  • The following two comments seem a bit contradictory to me, but I'm sure I'm missing some details.

    • we're feeding in channels_first numpy images for tests, as this is the standard format for numpy images.
    • # Numpy images are typically in channels last format
  • why we have special changes in tests/models/idefics2/test_image_processing_idefics2.py?

  • In tests/test_image_processing_common.py , why channels_first is changed to channels_last ?

Hope I don't ask too much 😅

@amyeroberts
Collaborator Author

@ydshieh Of course! :) Sorry that I didn't make it clear in the PR description.

Is this because previously we don't test with np.ndarray but now we test in this PR?

We previously tested with numpy images. However, the numpy images created were in channels first format i.e. (C, H, W). They were essentially numpy versions of images in torch.

When we added the fast image processor, we had to include logic to convert both PIL.Image.Image images and numpy arrays to torch tensors. torchvision has:

  • ToTensor method which takes numpy and PIL.Image.Image, converts to a torch tensor, then normalizes
  • PILToTensor method, which takes a PIL.Image.Image and converts to a torch tensor but importantly doesn't normalize

We wanted an operation equivalent to PILToTensor for numpy arrays (converts to a torch tensor but doesn't normalize), as we want to control the normalization factor and whether normalization happens (with do_normalize), so we added NumpyToTensor.

Within PyTorch transforms, there is the assumption that incoming numpy arrays are in channels last format. This is reasonable, as this is normally true.

In order for this transform to now work for the fast image processors, the numpy arrays made for testing needed to be made in channels last, (H, W, C) format. Previously, as the tests had all arrays in channels_first format, the default if/else logic was fine, but now we need to condition on whether the input is a numpy array to get the correct size.
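A torch-free sketch of the two pieces described above. NumpyToTensor is approximated here with a plain numpy transpose, and the helper names are illustrative; only the `h, w = image.shape[:2]` branch comes from the actual diff.

```python
import numpy as np


def get_image_size(image: np.ndarray) -> tuple:
    """Return (height, width) for a channels_last (H, W, C) numpy image,
    mirroring the `h, w = image.shape[:2]` branch added in the diff."""
    if isinstance(image, np.ndarray):
        h, w = image.shape[:2]
        return h, w
    raise TypeError(f"Expected np.ndarray, got {type(image)}")


def numpy_to_channels_first(image: np.ndarray) -> np.ndarray:
    """Stand-in for what NumpyToTensor does layout-wise: reorder
    (H, W, C) -> (C, H, W) without rescaling pixel values (unlike
    torchvision's ToTensor, which also divides by 255)."""
    return np.transpose(image, (2, 0, 1))
```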

The following two comments seem a bit contradictory to me, but I'm sure I'm missing some details.

They completely are, mistake on my part, I'll update!

why we have special changes in tests/models/idefics2/test_image_processing_idefics2.py?

  • The numpify changes and the added numpy_4_channels test method are because idefics2 has a particular input format.
  • The for image_processing_class in self.image_processor_list: addition is because for idefics2 I had to update the numpy_4_channels test method. It's not technically necessary in this PR (happy to remove), and is a change to be made to all image processor tests that override the mixin's original implementation; otherwise only the slow image processor is ever tested.

In tests/test_image_processing_common.py , why channels_first is changed to channels_last ?

For the numpy tests, the inputs are now in channels_last format. I could also remove input_format completely from the call and allow it to be inferred, but decided to just update the current values for now.
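For reference, inferring the format from the array shape could look something like this. This is a hypothetical helper, not the transformers implementation, and it shows why explicit formats are safer in tests: some shapes are inherently ambiguous.

```python
import numpy as np


def infer_channel_format(image: np.ndarray) -> str:
    """Guess the channel layout from the shape: an axis of size 1, 3 or 4
    is taken as the channel axis. Ambiguous shapes (e.g. a 3x3x3 image)
    cannot be resolved this way, which is one reason to keep passing the
    format explicitly rather than always inferring it."""
    num_channels = (1, 3, 4)
    first, last = image.shape[0], image.shape[-1]
    if first in num_channels and last not in num_channels:
        return "channels_first"
    if last in num_channels and first not in num_channels:
        return "channels_last"
    raise ValueError(f"Ambiguous channel dimension for shape {image.shape}")
```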

Collaborator

@ydshieh ydshieh left a comment

Thank you for explaining in detail. LGTM, with some comments/nits if you want to update directly in this PR.

@amyeroberts amyeroberts force-pushed the test-fast-vit-image-processor branch from d97c848 to 90c9584 Compare June 24, 2024 17:06

@amyeroberts amyeroberts merged commit 0f67ba1 into huggingface:main Jun 25, 2024
23 checks passed
@amyeroberts amyeroberts deleted the test-fast-vit-image-processor branch June 25, 2024 12:37