Add rtdetr-v2 version of code #33244
Conversation
CI error seems unrelated (jax, albumentations not-installed errors).

Thanks for adding @SangbumChoi! As v2 is released with a new paper, it should be added as its own, separate model in the repo.

@amyeroberts There is no problem with making v2 an independent model, but should I make it compatible with importing the v1 configuration into v2?

@SangbumChoi As there appear to be v2-specific checkpoints, I'd say no.

Files to be transferred: https://huggingface.co/danelcsb
```python
>>> url = 'http://images.cocodataset.org/val2017/000000039769.jpg'
>>> image = Image.open(requests.get(url, stream=True).raw)

>>> image_processor = RTDetrImageProcessor.from_pretrained("danelcsb/rtdetr_v2_r50vd")
```
Need to be changed after approval
```python
>>> image = Image.open(requests.get(url, stream=True).raw)

>>> image_processor = RTDetrImageProcessor.from_pretrained("danelcsb/rtdetr_v2_r50vd")
>>> model = RTDetrV2ForObjectDetection.from_pretrained("danelcsb/rtdetr_v2_r50vd")
```
Need to be changed after approval
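For context, here is a minimal end-to-end sketch assembling the pieces quoted above, assuming the placeholder checkpoint `danelcsb/rtdetr_v2_r50vd` from this PR (the final repo name was still pending approval at review time):

```python
import requests
import torch
from PIL import Image

from transformers import RTDetrImageProcessor, RTDetrV2ForObjectDetection

url = "http://images.cocodataset.org/val2017/000000039769.jpg"
image = Image.open(requests.get(url, stream=True).raw)

# Placeholder checkpoint from this PR; to be swapped for the final repo name.
image_processor = RTDetrImageProcessor.from_pretrained("danelcsb/rtdetr_v2_r50vd")
model = RTDetrV2ForObjectDetection.from_pretrained("danelcsb/rtdetr_v2_r50vd")

inputs = image_processor(images=image, return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

# Map raw logits/boxes back to the original image size and keep confident hits.
results = image_processor.post_process_object_detection(
    outputs, target_sizes=torch.tensor([image.size[::-1]]), threshold=0.5
)
```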
```python
_CONFIG_FOR_DOC = "RTDetrV2Config"
# TODO: Replace all occurrences of the checkpoint with the final one
_CHECKPOINT_FOR_DOC = ""
```
Need to be changed after approval
```python
from PIL import Image

CHECKPOINT = "danelcsb/rtdetr_v2_r50vd"  # TODO: replace
```
Need to be changed after approval
@amyeroberts RTDetrV2 is ready 👍🏼
Looks great - thanks for adding!
Just a few small comments. The final step after addressing these is running the slow tests for the model before merge. Could you push an empty commit with the message `[run_slow] rt_detr_v2`?
Review thread on src/transformers/models/rt_detr_v2/modeling_rt_detr_v2_resnet.py (outdated; resolved).
```python
@require_torch
class RTDetrV2ModelTest(ModelTesterMixin, PipelineTesterMixin, unittest.TestCase):
```
We should use `# Copied from` for the tests too.
@amyeroberts Well, actually, the configurations of `RTDetr` and `RTDetrV2` are different, so it can't be used everywhere. However, I will add it for the parts where I can, e.g. `RTDetrV2ResNetModelTester`.
 
Wait, how can we use `# Copied from` since it starts from `transformers`?

```python
# Copied from transformers.models.rt_detr.modeling_rt_detr.RTDetrPreTrainedModel with RTDetr->RTDetrV2,rt_detr->rt_detr_v2
```
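For reference, a minimal sketch of how the `# Copied from` machinery works in Transformers: the comment names a source object plus rename rules, and `make fix-copies` (backed by `utils/check_copies.py`) keeps the annotated definition textually in sync with the source. The class body below is abbreviated, and the attribute values are assumptions derived from the v1 class with the renames applied:

```python
from transformers import PreTrainedModel, RTDetrV2Config  # RTDetrV2Config is added by this PR

# Copied from transformers.models.rt_detr.modeling_rt_detr.RTDetrPreTrainedModel with RTDetr->RTDetrV2,rt_detr->rt_detr_v2
class RTDetrV2PreTrainedModel(PreTrainedModel):
    # Assumed values: the v1 attributes with RTDetr->RTDetrV2 and
    # rt_detr->rt_detr_v2 applied, as the rename rules above dictate.
    config_class = RTDetrV2Config
    base_model_prefix = "rt_detr_v2"
    main_input_name = "pixel_values"
    # ... the rest of the body must match the v1 class after renaming;
    # the checker rewrites or flags any drift.
```

So the `transformers.` prefix in the comment is simply the import path of the source object; it works for any module inside the library, including new models added in the same PR.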
```python
num_backbone_outs = len(config.decoder_in_channels)
decoder_input_proj_list = []
for _ in range(num_backbone_outs):
    in_channels = config.decoder_in_channels[_]
    decoder_input_proj_list.append(
        nn.Sequential(
            nn.Conv2d(in_channels, config.d_model, kernel_size=1, bias=False),
            nn.BatchNorm2d(config.d_model, config.batch_norm_eps),
        )
    )
for _ in range(config.num_feature_levels - num_backbone_outs):
    decoder_input_proj_list.append(
        nn.Sequential(
            nn.Conv2d(in_channels, config.d_model, kernel_size=3, stride=2, padding=1, bias=False),
            nn.BatchNorm2d(config.d_model, config.batch_norm_eps),
        )
    )
    in_channels = config.d_model
```
As above - this makes it more explicit which dimensions are being used wrt the scope
Suggested change (replacing the block above):

```python
decoder_input_proj_list = []
for in_channels in config.decoder_in_channels:
    decoder_input_proj_list.append(
        nn.Sequential(
            nn.Conv2d(in_channels, config.d_model, kernel_size=1, bias=False),
            nn.BatchNorm2d(config.d_model, config.batch_norm_eps),
        )
    )
decoder_input_proj_list.append(
    nn.Sequential(
        nn.Conv2d(config.decoder_in_channels[-1], config.d_model, kernel_size=3, stride=2, padding=1, bias=False),
        nn.BatchNorm2d(config.d_model, config.batch_norm_eps),
    )
)
for _ in range(config.num_feature_levels - num_backbone_outs - 1):
    decoder_input_proj_list.append(
        nn.Sequential(
            nn.Conv2d(config.d_model, config.d_model, kernel_size=3, stride=2, padding=1, bias=False),
            nn.BatchNorm2d(config.d_model, config.batch_norm_eps),
        )
    )
```
@amyeroberts Unlike the above case, I think this is not always true when `config.num_feature_levels == num_backbone_outs`. Let me think about it and try to fix it.
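To make the edge case concrete: the suggestion above unconditionally appends one stride-2 projection after the 1x1 projections, so when `config.num_feature_levels == num_backbone_outs` it builds one layer too many. A possible fix (a sketch under the same `config`/`nn` context as the snippets above, not necessarily what the PR finally adopted) is to guard the extra levels:

```python
num_backbone_outs = len(config.decoder_in_channels)
decoder_input_proj_list = []
# One 1x1 projection per backbone feature map.
for in_channels in config.decoder_in_channels:
    decoder_input_proj_list.append(
        nn.Sequential(
            nn.Conv2d(in_channels, config.d_model, kernel_size=1, bias=False),
            nn.BatchNorm2d(config.d_model, config.batch_norm_eps),
        )
    )
# Build extra downsampling levels only when more feature levels are requested
# than the backbone provides; without this guard the list would hold
# num_backbone_outs + 1 layers even when num_feature_levels == num_backbone_outs.
if config.num_feature_levels > num_backbone_outs:
    decoder_input_proj_list.append(
        nn.Sequential(
            nn.Conv2d(config.decoder_in_channels[-1], config.d_model, kernel_size=3, stride=2, padding=1, bias=False),
            nn.BatchNorm2d(config.d_model, config.batch_norm_eps),
        )
    )
    for _ in range(config.num_feature_levels - num_backbone_outs - 1):
        decoder_input_proj_list.append(
            nn.Sequential(
                nn.Conv2d(config.d_model, config.d_model, kernel_size=3, stride=2, padding=1, bias=False),
                nn.BatchNorm2d(config.d_model, config.batch_norm_eps),
            )
        )
```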
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
Hi @SangbumChoi! Excited to see RT-DETR-V2 in Transformers, thanks for working on this! As most of the work is already done in this PR, using […]. Happy to help if you have any questions!
Closed, since there is another PR using the modular approach.
What does this PR do?
This adds code compatible with RT-DETR-v2: https://github.com/lyuwenyu/RT-DETR/blob/main/rtdetrv2_pytorch/configs/rtdetrv2/rtdetrv2_r18vd_120e_coco.yml
At the moment I have only uploaded rtdetrv2_r18vd for testing, but while the code is in review I will upload the other model weights as well. https://huggingface.co/danelcsb/rtdetr_v2_r18vd/tree/main
@qubvel @amyeroberts

Before submitting
- Did you read the contributor guideline, Pull Request section?
- Was this discussed/approved via a GitHub issue or the forum? Please add a link to it if that's the case.
- Did you make sure to update the documentation with your changes? Here are the documentation guidelines, and here are tips on formatting docstrings.