🔴 🔴 🔴 Added `segmentation maps` support for DPT image processor #34345
cc @molbap as well in case bandwidth permits
LGTM - just a small refactor of the method to be more aligned with existing models!
def test_call_segmentation_maps(self):
    # Initialize image_processing
    image_processing = self.image_processing_class(**self.image_processor_dict)
nit, `image_processor` would be better
Renamed `image_processing` to `image_processor`. Should I also rename it in the other tests?
if segmentation_maps is not None:
    segmentation_maps = [to_numpy_array(segmentation_map) for segmentation_map in segmentation_maps]

    # Add channel dimension if missing - needed for certain transformations
    if segmentation_maps[0].ndim == 2:
        added_channel_dim = True
        segmentation_maps = [segmentation_map[None, ...] for segmentation_map in segmentation_maps]
        input_data_format = ChannelDimension.FIRST
    else:
        added_channel_dim = False
        if input_data_format is None:
            input_data_format = infer_channel_dimension_format(segmentation_maps[0], num_channels=1)

    if do_reduce_labels:
        segmentation_maps = [self.reduce_label(segmentation_map) for segmentation_map in segmentation_maps]

    if do_resize:
        segmentation_maps = [
            self.resize(
                image=segmentation_map,
                size=size,
                resample=resample,
                keep_aspect_ratio=keep_aspect_ratio,
                ensure_multiple_of=ensure_multiple_of,
                input_data_format=input_data_format,
            )
            for segmentation_map in segmentation_maps
        ]

    if do_pad:
        segmentation_maps = [
            self.pad_image(
                image=segmentation_map, size_divisor=size_divisor, input_data_format=input_data_format
            )
            for segmentation_map in segmentation_maps
        ]

    # Remove extra channel dimension if added for processing
    if added_channel_dim:
        segmentation_maps = [segmentation_map.squeeze(0) for segmentation_map in segmentation_maps]
    segmentation_maps = [segmentation_map.astype(np.int64) for segmentation_map in segmentation_maps]

    data["labels"] = segmentation_maps
Perfect - if there isn't any difference with Beit, can this be wrapped in a `_preprocess_segmentation_map()` method in a loop, that can be flagged as `# Copied from ...` the beit image processor?
Wrapped the segmentation map preprocessing code in `_preprocess_segmentation_map()`, and also moved image preprocessing to a separate `_preprocess_image()` function and the general preprocessing functionality to a `_preprocess()` function.
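As an illustration of the channel-dimension handling in the snippet under review, here is a minimal NumPy sketch. The helper `with_temporary_channel_dim` and the stand-in transform `pad_to_multiple` are hypothetical names made up for this example; they are not the PR's `_preprocess_segmentation_map()` and do not reproduce the library's `resize` or `pad_image` behavior.

```python
import numpy as np


def with_temporary_channel_dim(segmentation_map, transform):
    """Apply a channels-first `transform` to a segmentation map.

    Hypothetical helper for illustration only; not part of the PR.
    """
    segmentation_map = np.asarray(segmentation_map)
    # 2D maps (H, W) get a leading channel axis so that transforms
    # expecting (C, H, W) input can be reused unchanged.
    added_channel_dim = segmentation_map.ndim == 2
    if added_channel_dim:
        segmentation_map = segmentation_map[None, ...]
    segmentation_map = transform(segmentation_map)
    # Drop the axis again only if it was added here.
    if added_channel_dim:
        segmentation_map = segmentation_map.squeeze(0)
    return segmentation_map.astype(np.int64)


def pad_to_multiple(array, size_divisor=4):
    # Stand-in transform (assumption): zero-pad bottom/right so that
    # height and width become multiples of `size_divisor`.
    _, height, width = array.shape
    pad_h = (-height) % size_divisor
    pad_w = (-width) % size_divisor
    return np.pad(array, ((0, 0), (0, pad_h), (0, pad_w)))


seg = np.arange(6, dtype=np.uint8).reshape(2, 3)
out = with_temporary_channel_dim(seg, pad_to_multiple)
print(out.shape, out.dtype)
```

The point of the flag is that maps which already carry a channel axis pass through without being squeezed, while 2D maps come back 2D.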
Could you please re-review the pull request? In the last commit I made all the changes you asked for: wrapped the segmentation map preprocessing code in separate functions, added comments, and renamed a variable in the tests. Do I need to make any other changes to the code?
hey @simonreise, will review in a moment, we were all at a team gathering last week hence the inactivity. On my radar!
@molbap you are forbidden to work for this week 🤣 go and rest, @yonigozlan will have a look! 🤗
friendly ping @yonigozlan
Thanks for working on this!
Looks good to me, and something that would be useful to have.
Just left some comments about adding `# Copied from` statements where needed, and about breaking backward compatibility (mainly intended for a core maintainer).
def reduce_label(self, label: ImageInput) -> np.ndarray:
    label = to_numpy_array(label)
    # Avoid using underflow conversion
    label[label == 0] = 255
    label = label - 1
    label[label == 254] = 255
    return label
This seems to be fully copied from beit image processor, you should add a `# Copied from` statement above if that's the case :)
Done
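For reference, the effect of the `reduce_label` logic discussed in this thread can be checked on a small array. This is a standalone sketch: it uses `np.array` in place of `to_numpy_array` and assumes `uint8` labels with 0 as background and 255 as the ignore value.

```python
import numpy as np


def reduce_label(label):
    # Standalone copy of the logic under review; `np.array` replaces
    # `to_numpy_array`, and uint8 input is an assumption of this sketch.
    label = np.array(label, dtype=np.uint8)  # copy, so the caller's array is untouched
    # Map background (0) to 255 first, so the subtraction below never
    # wraps around on unsigned integers.
    label[label == 0] = 255
    label = label - 1
    # Former 255 values became 254; restore them to 255.
    label[label == 254] = 255
    return label


labels = np.array([[0, 1], [2, 255]], dtype=np.uint8)
reduced = reduce_label(labels)
print(reduced.tolist())  # [[255, 0], [1, 255]]
```

Background pixels end up at 255, every other class index shifts down by one, and pixels that were already 255 stay 255.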
def __call__(self, images, segmentation_maps=None, **kwargs):
    # Overrides the `__call__` method of the `Preprocessor` class such that the images and segmentation maps can both
    # be passed in as positional arguments.
    return super().__call__(images, segmentation_maps=segmentation_maps, **kwargs)
Same here for adding a `# Copied from`, and same for all the other methods copied from beit as well.
@filter_out_non_signature_kwargs()
def preprocess(
    self,
    images: ImageInput,
    segmentation_maps: Optional[ImageInput] = None,
This is a bit tricky as it could be a breaking change, if some users use `do_resize` etc. as args and not kwargs. However this would not be good practice, and I don't see any way of adding `segmentation_maps` processing without breaking BC. I'll let a core maintainer give the green light on this or not.
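To make the backward-compatibility concern concrete, here is a small sketch with deliberately simplified, hypothetical signatures (`preprocess_old` and `preprocess_new` are not the real methods). It only shows how inserting a parameter in second position changes what a positional argument binds to.

```python
def preprocess_old(images, do_resize=None, size=None):
    # Hypothetical "before" signature: do_resize is the second parameter.
    return {"images": images, "segmentation_maps": None, "do_resize": do_resize}


def preprocess_new(images, segmentation_maps=None, do_resize=None, size=None):
    # Hypothetical "after" signature: segmentation_maps is inserted second.
    return {"images": images, "segmentation_maps": segmentation_maps, "do_resize": do_resize}


# A caller passing do_resize positionally worked before the change...
positional_old = preprocess_old("img", False)
# ...but the same call now silently binds False to segmentation_maps.
positional_new = preprocess_new("img", False)
# Keyword callers are unaffected by the reordering.
keyword_new = preprocess_new("img", do_resize=False)

print(positional_old["do_resize"], positional_new["do_resize"], keyword_new["do_resize"])
```

Callers that already pass these options by keyword see no difference, which is why the change only affects the positional style described above.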
def test_call_segmentation_maps(self):
Same as before about the # Copied from, if this and the following tests are indeed fully copied from beit
Thanks, but the After making the required changes, you can ensure everything is in order by running the
Thanks for iterating! you just have to rebase on main and check that the tests are still passing, then LGTM!
…ithub.com/simonreise/transformers into segmentation-maps-for-dpt-image-processor
Everything looks good to me, pinging @ArthurZucker for a final review.
There was a merge conflict that appeared after #35439 was merged into main. So I also changed the order of
Don't mind breaking this, but we need to add a 🔴 on the PR name to sort it out when releasing! 🤗
…ingface#34345)
* Added `segmentation_maps` support for DPT image processor
* Added tests for dpt image processor
* Moved preprocessing into separate functions
* Added # Copied from statements
* Fixed # Copied from statements
Most image processors for vision models that support the semantic segmentation task accept `images` and `segmentation_maps` as inputs, but for some reason the DPT image processor does not process segmentation maps, only images. This PR makes code used for training or evaluating semantic segmentation models more reusable, since the DPT image processor can now process segmentation maps as most other image processors do.

I also added a `do_reduce_labels` arg because other image processors that support segmentation masks use it.

I added two new tests: one that tests segmentation maps support and one that tests whether `do_reduce_labels` works as expected.

Most of the code is adapted from the BEIT image processor.
Who can review?
@amyeroberts, @qubvel