
Add rtdetr-v2 version of code #33244

Closed · wants to merge 29 commits

Conversation

SangbumChoi
Contributor

What does this PR do?

This PR adds code compatible with rtdetr-v2: https://github.com/lyuwenyu/RT-DETR/blob/main/rtdetrv2_pytorch/configs/rtdetrv2/rtdetrv2_r18vd_120e_coco.yml

For now I have only uploaded rtdetrv2_r18vd for testing, but during review I will also upload the other model weights: https://huggingface.co/danelcsb/rtdetr_v2_r18vd/tree/main

@qubvel @amyeroberts
(Screenshot: 2024-09-02 at 2:56 PM)

Before submitting

  • This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
  • Did you read the contributor guideline,
    Pull Request section?
  • Was this discussed/approved via a GitHub issue or the forum? Please add a link
    to it if that's the case.
  • Did you make sure to update the documentation with your changes? Here are the
    documentation guidelines, and
    here are tips on formatting docstrings.
  • Did you write any new necessary tests?

@SangbumChoi
Contributor Author

The CI error seems unrelated (jax and albumentations not-installed errors).

@amyeroberts
Collaborator

Thanks for adding @SangbumChoi!

As v2 is released with a new paper, it should be added as its own, separate model in the repo.

@SangbumChoi
Contributor Author

@amyeroberts There is no problem with making v2 an independent model, but should I make it possible to load a v1 configuration in v2?

@amyeroberts
Collaborator

@SangbumChoi As there appears to be v2 specific checkpoints, I'd say no.

@SangbumChoi marked this pull request as draft September 2, 2024 12:59
@SangbumChoi
Contributor Author

Files to be transferred https://huggingface.co/danelcsb

>>> import requests
>>> from PIL import Image

>>> url = 'http://images.cocodataset.org/val2017/000000039769.jpg'
>>> image = Image.open(requests.get(url, stream=True).raw)

>>> image_processor = RTDetrImageProcessor.from_pretrained("danelcsb/rtdetr_v2_r50vd")
Contributor Author

Need to be changed after approval

>>> image = Image.open(requests.get(url, stream=True).raw)

>>> image_processor = RTDetrImageProcessor.from_pretrained("danelcsb/rtdetr_v2_r50vd")
>>> model = RTDetrV2ForObjectDetection.from_pretrained("danelcsb/rtdetr_v2_r50vd")
Contributor Author

Need to be changed after approval


_CONFIG_FOR_DOC = "RTDetrV2Config"
# TODO: Replace all occurrences of the checkpoint with the final one
_CHECKPOINT_FOR_DOC = ""
Contributor Author

Need to be changed after approval

from PIL import Image


CHECKPOINT = "danelcsb/rtdetr_v2_r50vd" # TODO: replace
Contributor Author

Need to be changed after approval

@SangbumChoi marked this pull request as ready for review September 10, 2024 05:32
@SangbumChoi
Contributor Author

SangbumChoi commented Sep 10, 2024

@amyeroberts RTDetrV2 is ready 👍🏼

Collaborator

@amyeroberts left a comment

Looks great - thanks for adding!

Just a few small comments. Final step after addressing these is running the slow tests for the model before merge. Could you push an empty commit with the message [run_slow] rt_detr_v2?
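The CI-trigger step above can be sketched as a couple of git commands. This is a hedged illustration, demonstrated in a throwaway repository so it is self-contained; on the real PR branch you would only run the commit and then push to your remote:

```shell
# An empty commit whose message is "[run_slow] rt_detr_v2" is the
# convention used here to ask CI to run the slow tests for the model.
set -e
tmp=$(mktemp -d)
cd "$tmp"
git init -q
git -c user.email=dev@example.com -c user.name=dev \
    commit --allow-empty -m "[run_slow] rt_detr_v2" -q
git log -1 --pretty=%s   # prints the trigger message of the new commit
```

On the actual branch the follow-up is simply `git push` to the PR's remote branch.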



@require_torch
class RTDetrV2ModelTest(ModelTesterMixin, PipelineTesterMixin, unittest.TestCase):
Collaborator

We should use # Copied from for the tests too

Contributor Author

@amyeroberts Well, actually the configurations of RTDetr and RTDetrV2 are different, so # Copied from cannot be used everywhere. However, I will add it to the parts where I can, e.g. RTDetrV2ResNetModelTester.

Contributor Author

@SangbumChoi commented Sep 25, 2024

Wait, how can we use # Copied from, since it starts from transformers?

# Copied from transformers.models.rt_detr.modeling_rt_detr.RTDetrPreTrainedModel with RTDetr->RTDetrV2,rt_detr->rt_detr_v2

Comment on lines 1677 to 1694
num_backbone_outs = len(config.decoder_in_channels)
decoder_input_proj_list = []
for _ in range(num_backbone_outs):
    in_channels = config.decoder_in_channels[_]
    decoder_input_proj_list.append(
        nn.Sequential(
            nn.Conv2d(in_channels, config.d_model, kernel_size=1, bias=False),
            nn.BatchNorm2d(config.d_model, config.batch_norm_eps),
        )
    )
for _ in range(config.num_feature_levels - num_backbone_outs):
    decoder_input_proj_list.append(
        nn.Sequential(
            nn.Conv2d(in_channels, config.d_model, kernel_size=3, stride=2, padding=1, bias=False),
            nn.BatchNorm2d(config.d_model, config.batch_norm_eps),
        )
    )
    in_channels = config.d_model
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As above - this makes it more explicit which dimensions are being used with respect to the scope

Suggested change

Replace:

    num_backbone_outs = len(config.decoder_in_channels)
    decoder_input_proj_list = []
    for _ in range(num_backbone_outs):
        in_channels = config.decoder_in_channels[_]
        decoder_input_proj_list.append(
            nn.Sequential(
                nn.Conv2d(in_channels, config.d_model, kernel_size=1, bias=False),
                nn.BatchNorm2d(config.d_model, config.batch_norm_eps),
            )
        )
    for _ in range(config.num_feature_levels - num_backbone_outs):
        decoder_input_proj_list.append(
            nn.Sequential(
                nn.Conv2d(in_channels, config.d_model, kernel_size=3, stride=2, padding=1, bias=False),
                nn.BatchNorm2d(config.d_model, config.batch_norm_eps),
            )
        )
        in_channels = config.d_model

With:

    decoder_input_proj_list = []
    for in_channels in config.decoder_in_channels:
        decoder_input_proj_list.append(
            nn.Sequential(
                nn.Conv2d(in_channels, config.d_model, kernel_size=1, bias=False),
                nn.BatchNorm2d(config.d_model, config.batch_norm_eps),
            )
        )
    decoder_input_proj_list.append(
        nn.Sequential(
            nn.Conv2d(config.decoder_in_channels[-1], config.d_model, kernel_size=3, stride=2, padding=1, bias=False),
            nn.BatchNorm2d(config.d_model, config.batch_norm_eps),
        )
    )
    for _ in range(config.num_feature_levels - num_backbone_outs - 1):
        decoder_input_proj_list.append(
            nn.Sequential(
                nn.Conv2d(config.d_model, config.d_model, kernel_size=3, stride=2, padding=1, bias=False),
                nn.BatchNorm2d(config.d_model, config.batch_norm_eps),
            )
        )

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@amyeroberts Unlike the case above, I think this is not always true when config.num_feature_levels == num_backbone_outs, since the suggestion unconditionally appends an extra stride-2 projection. Let me think about it and try to fix it.
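To make the edge case concrete, the intended layer plan can be sketched in plain Python. build_proj_plan is a hypothetical helper written for illustration only, not code from the PR; it returns (in_channels, kernel_size) pairs instead of building nn.Module objects:

```python
def build_proj_plan(decoder_in_channels, d_model, num_feature_levels):
    """Plan the decoder input projections as (in_channels, kernel_size) pairs.

    One 1x1 projection per backbone output, then stride-2 3x3 projections
    for any extra feature levels beyond the backbone outputs.
    """
    plan = [(c, 1) for c in decoder_in_channels]  # 1x1 conv per backbone output
    num_extra = num_feature_levels - len(decoder_in_channels)
    for i in range(num_extra):
        # The first extra level downsamples the last backbone output; any
        # further extra level downsamples the previous one (d_model channels).
        in_ch = decoder_in_channels[-1] if i == 0 else d_model
        plan.append((in_ch, 3))
    return plan

# Edge case from the review: num_feature_levels == num_backbone_outs, so no
# extra 3x3 projection should be created at all.
print(build_proj_plan([512, 1024, 2048], 256, 3))
# With one extra level, the single 3x3 projection takes the last backbone channels.
print(build_proj_plan([512, 1024, 2048], 256, 4))
```

This mirrors the behavior of the original loop while making the "zero extra levels" case explicit, which is what the unconditional append in the suggestion would get wrong.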

@yonigozlan
Member

Hi @SangbumChoi! Excited to see RT-DETR-V2 in Transformers, thanks for working on this!
As the implementation is so similar to RT-DETR and contains a lot of # Copied from, I think it could really benefit from the new Modular system: more info and examples here and here.

As most of the work is already done in this PR, using Modular should be straightforward: you could put every module that does not include a # Copied from inside the modular file and discard the ones that do, and you should also be able to simplify the parts that don't use # Copied from, if they are similar to or only add logic on top of RT-DETR, using inheritance.

Happy to help if you have any questions!
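The inheritance idea behind Modular can be illustrated with a minimal, self-contained sketch. The class names mirror the real ones, but the bodies are toy stand-ins rather than the transformers implementation, and decoder_method="discrete" is an illustrative placeholder:

```python
# Toy stand-ins for the v1 classes; in transformers these would be the real
# RTDetrConfig / RTDetrModel imported from models.rt_detr.
class RTDetrConfig:
    def __init__(self, d_model=256, decoder_method="default"):
        self.d_model = d_model
        self.decoder_method = decoder_method

class RTDetrModel:
    def __init__(self, config):
        self.config = config

    def forward(self, x):
        # Stand-in for the real decoding logic.
        return f"decode({x}) with {self.config.decoder_method}"

# A modular file only spells out what v2 changes; everything else is inherited.
class RTDetrV2Config(RTDetrConfig):
    def __init__(self, decoder_method="discrete", **kwargs):
        super().__init__(decoder_method=decoder_method, **kwargs)

class RTDetrV2Model(RTDetrModel):
    pass  # forward is reused as-is; only the config differs

model = RTDetrV2Model(RTDetrV2Config())
print(model.forward("img"))  # the inherited v1 forward runs with the v2 config
```

In the real Modular system, a modular_rt_detr_v2.py written in this style is expanded by a converter into a full standalone modeling file, replacing the hand-maintained # Copied from comments.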

@SangbumChoi mentioned this pull request Nov 18, 2024
@SangbumChoi mentioned this pull request Dec 16, 2024
@SangbumChoi
Contributor Author

Closing, since there is another PR that adds this model using the modular system.
