
Add rtdetr-v2 version of code #33244

Closed · wants to merge 29 commits

Conversation

SangbumChoi
Contributor

What does this PR do?

This PR adds code compatible with rtdetr-v2: https://github.com/lyuwenyu/RT-DETR/blob/main/rtdetrv2_pytorch/configs/rtdetrv2/rtdetrv2_r18vd_120e_coco.yml

For now I have only uploaded rtdetrv2_r18vd for testing, but during review I will also upload the other model weights: https://huggingface.co/danelcsb/rtdetr_v2_r18vd/tree/main

@qubvel @amyeroberts
(Screenshot: 2024-09-02 at 2:56 PM)

Before submitting

  • This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
  • Did you read the contributor guideline,
    Pull Request section?
  • Was this discussed/approved via a GitHub issue or the forum? Please add a link
    to it if that's the case.
  • Did you make sure to update the documentation with your changes? Here are the
    documentation guidelines, and
    here are tips on formatting docstrings.
  • Did you write any new necessary tests?

@SangbumChoi
Contributor Author

The CI error seems unrelated (jax and albumentations not-installed errors).

@amyeroberts
Collaborator

Thanks for adding @SangbumChoi!

As v2 is released with a new paper, it should be added as its own, separate model in the repo.

@SangbumChoi
Contributor Author

@amyeroberts There is no problem with making v2 an independent model, but should I make it possible to load a v1 configuration in v2?

@amyeroberts
Collaborator

@SangbumChoi As there appears to be v2 specific checkpoints, I'd say no.

@SangbumChoi marked this pull request as draft September 2, 2024 12:59
@SangbumChoi
Contributor Author

Files to be transferred https://huggingface.co/danelcsb

>>> import requests
>>> from PIL import Image

>>> url = 'http://images.cocodataset.org/val2017/000000039769.jpg'
>>> image = Image.open(requests.get(url, stream=True).raw)

>>> image_processor = RTDetrImageProcessor.from_pretrained("danelcsb/rtdetr_v2_r50vd")
Contributor Author

Need to be changed after approval

>>> image = Image.open(requests.get(url, stream=True).raw)

>>> image_processor = RTDetrImageProcessor.from_pretrained("danelcsb/rtdetr_v2_r50vd")
>>> model = RTDetrV2ForObjectDetection.from_pretrained("danelcsb/rtdetr_v2_r50vd")
Contributor Author

Need to be changed after approval


_CONFIG_FOR_DOC = "RTDetrV2Config"
# TODO: Replace all occurrences of the checkpoint with the final one
_CHECKPOINT_FOR_DOC = ""
Contributor Author

Need to be changed after approval

from PIL import Image


CHECKPOINT = "danelcsb/rtdetr_v2_r50vd" # TODO: replace
Contributor Author

Need to be changed after approval

@SangbumChoi marked this pull request as ready for review September 10, 2024 05:32
@SangbumChoi
Contributor Author

SangbumChoi commented Sep 10, 2024

@amyeroberts RTDetrV2 is ready 👍🏼

Collaborator

@amyeroberts left a comment

Looks great - thanks for adding!

Just a few small comments. Final step after addressing these is running the slow tests for the model before merge. Could you push an empty commit with the message [run_slow] rt_detr_v2?
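The CI-trigger step above can be sketched as a couple of git commands. This is a hedged illustration, demonstrated in a throwaway repository so it is self-contained; on the real PR branch you would only run the commit and then push to your remote:

```shell
# An empty commit whose message is "[run_slow] rt_detr_v2" is the
# convention used here to ask CI to run the slow tests for the model.
set -e
tmp=$(mktemp -d)
cd "$tmp"
git init -q
git -c user.email=dev@example.com -c user.name=dev \
    commit --allow-empty -m "[run_slow] rt_detr_v2" -q
git log -1 --pretty=%s   # prints the trigger message of the new commit
```

On the actual branch the follow-up is simply `git push` to the PR's remote branch.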



@require_torch
class RTDetrV2ModelTest(ModelTesterMixin, PipelineTesterMixin, unittest.TestCase):
Collaborator

We should use # Copied from for the tests too

Contributor Author

@amyeroberts Well, actually the configurations of RTDetr and RTDetrV2 are different, so # Copied from cannot be used everywhere. However, I will add it to the parts where I can, e.g. RTDetrV2ResNetModelTester.

Contributor Author

@SangbumChoi commented Sep 25, 2024

Wait, how can we use # Copied from, since it starts from transformers?

# Copied from transformers.models.rt_detr.modeling_rt_detr.RTDetrPreTrainedModel with RTDetr->RTDetrV2,rt_detr->rt_detr_v2

Comment on lines 1677 to 1694
num_backbone_outs = len(config.decoder_in_channels)
decoder_input_proj_list = []
for _ in range(num_backbone_outs):
    in_channels = config.decoder_in_channels[_]
    decoder_input_proj_list.append(
        nn.Sequential(
            nn.Conv2d(in_channels, config.d_model, kernel_size=1, bias=False),
            nn.BatchNorm2d(config.d_model, config.batch_norm_eps),
        )
    )
for _ in range(config.num_feature_levels - num_backbone_outs):
    decoder_input_proj_list.append(
        nn.Sequential(
            nn.Conv2d(in_channels, config.d_model, kernel_size=3, stride=2, padding=1, bias=False),
            nn.BatchNorm2d(config.d_model, config.batch_norm_eps),
        )
    )
    in_channels = config.d_model
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As above - this makes it more explicit which dimensions are being used with respect to the scope

Suggested change

Replace:

    num_backbone_outs = len(config.decoder_in_channels)
    decoder_input_proj_list = []
    for _ in range(num_backbone_outs):
        in_channels = config.decoder_in_channels[_]
        decoder_input_proj_list.append(
            nn.Sequential(
                nn.Conv2d(in_channels, config.d_model, kernel_size=1, bias=False),
                nn.BatchNorm2d(config.d_model, config.batch_norm_eps),
            )
        )
    for _ in range(config.num_feature_levels - num_backbone_outs):
        decoder_input_proj_list.append(
            nn.Sequential(
                nn.Conv2d(in_channels, config.d_model, kernel_size=3, stride=2, padding=1, bias=False),
                nn.BatchNorm2d(config.d_model, config.batch_norm_eps),
            )
        )
        in_channels = config.d_model

With:

    decoder_input_proj_list = []
    for in_channels in config.decoder_in_channels:
        decoder_input_proj_list.append(
            nn.Sequential(
                nn.Conv2d(in_channels, config.d_model, kernel_size=1, bias=False),
                nn.BatchNorm2d(config.d_model, config.batch_norm_eps),
            )
        )
    decoder_input_proj_list.append(
        nn.Sequential(
            nn.Conv2d(config.decoder_in_channels[-1], config.d_model, kernel_size=3, stride=2, padding=1, bias=False),
            nn.BatchNorm2d(config.d_model, config.batch_norm_eps),
        )
    )
    for _ in range(config.num_feature_levels - num_backbone_outs - 1):
        decoder_input_proj_list.append(
            nn.Sequential(
                nn.Conv2d(config.d_model, config.d_model, kernel_size=3, stride=2, padding=1, bias=False),
                nn.BatchNorm2d(config.d_model, config.batch_norm_eps),
            )
        )

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@amyeroberts Unlike the case above, I think this is not always true when config.num_feature_levels == num_backbone_outs, since the suggestion unconditionally appends an extra stride-2 projection. Let me think about it and try to fix it.
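To make the edge case concrete, the intended layer plan can be sketched in plain Python. build_proj_plan is a hypothetical helper written for illustration only, not code from the PR; it returns (in_channels, kernel_size) pairs instead of building nn.Module objects:

```python
def build_proj_plan(decoder_in_channels, d_model, num_feature_levels):
    """Plan the decoder input projections as (in_channels, kernel_size) pairs.

    One 1x1 projection per backbone output, then stride-2 3x3 projections
    for any extra feature levels beyond the backbone outputs.
    """
    plan = [(c, 1) for c in decoder_in_channels]  # 1x1 conv per backbone output
    num_extra = num_feature_levels - len(decoder_in_channels)
    for i in range(num_extra):
        # The first extra level downsamples the last backbone output; any
        # further extra level downsamples the previous one (d_model channels).
        in_ch = decoder_in_channels[-1] if i == 0 else d_model
        plan.append((in_ch, 3))
    return plan

# Edge case from the review: num_feature_levels == num_backbone_outs, so no
# extra 3x3 projection should be created at all.
print(build_proj_plan([512, 1024, 2048], 256, 3))
# With one extra level, the single 3x3 projection takes the last backbone channels.
print(build_proj_plan([512, 1024, 2048], 256, 4))
```

This mirrors the behavior of the original loop while making the "zero extra levels" case explicit, which is what the unconditional append in the suggestion would get wrong.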

@yonigozlan
Member

Hi @SangbumChoi! Excited to see RT-DETR-V2 in Transformers, thanks for working on this!
As the implementation is so similar to RT-DETR and contains a lot of # Copied from, I think it could really benefit from the new Modular system: more info and examples here and here.

As most of the work is already done in this PR, using Modular should be straightforward: you could put every module that does not include a # Copied from inside the modular file and discard the ones that do, and you should also be able to simplify the parts that don't use # Copied from, if they are similar to or only add logic on top of RT-DETR, using inheritance.

Happy to help if you have any questions!
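The inheritance idea behind Modular can be illustrated with a minimal, self-contained sketch. The class names mirror the real ones, but the bodies are toy stand-ins rather than the transformers implementation, and decoder_method="discrete" is an illustrative placeholder:

```python
# Toy stand-ins for the v1 classes; in transformers these would be the real
# RTDetrConfig / RTDetrModel imported from models.rt_detr.
class RTDetrConfig:
    def __init__(self, d_model=256, decoder_method="default"):
        self.d_model = d_model
        self.decoder_method = decoder_method

class RTDetrModel:
    def __init__(self, config):
        self.config = config

    def forward(self, x):
        # Stand-in for the real decoding logic.
        return f"decode({x}) with {self.config.decoder_method}"

# A modular file only spells out what v2 changes; everything else is inherited.
class RTDetrV2Config(RTDetrConfig):
    def __init__(self, decoder_method="discrete", **kwargs):
        super().__init__(decoder_method=decoder_method, **kwargs)

class RTDetrV2Model(RTDetrModel):
    pass  # forward is reused as-is; only the config differs

model = RTDetrV2Model(RTDetrV2Config())
print(model.forward("img"))  # the inherited v1 forward runs with the v2 config
```

In the real Modular system, a modular_rt_detr_v2.py written in this style is expanded by a converter into a full standalone modeling file, replacing the hand-maintained # Copied from comments.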

@SangbumChoi mentioned this pull request Nov 18, 2024
@SangbumChoi mentioned this pull request Dec 16, 2024
@SangbumChoi
Contributor Author

Closing, since there is another PR that adds this model using the modular system.
