
Improve the accuracy of Detection & Segmentation models by using SOTA recipes and primitives #5307

Closed
datumbox opened this issue Jan 28, 2022 · 3 comments · Fixed by #5715, #5756, #5763 or #5773

Comments

@datumbox
Contributor

datumbox commented Jan 28, 2022

🚀 The feature

Similar to #3995, but focused on Object Detection and Segmentation.

Kick off a Batteries Included phase 2 project that will focus on improving object detection and segmentation. After adding the necessary primitives, create a new recipe that improves the accuracy of existing models, and retrain them to offer better weights to the community.

Results

Best currently available models achieved:

The above results were achieved by building on top of work done by @rbgirshick, @pdollar, @vaibhava0, @fmassa and @xiaohu2015.

@RangiLyu

We recently ran some experiments on the pre-trained backbone and found that pre-training with TIMM's ResNet training method can boost Faster R-CNN from 37.4 to 40.8 mAP (+3.4 mAP).

 Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.408
 Average Precision  (AP) @[ IoU=0.50      | area=   all | maxDets=1000 ] = 0.625
 Average Precision  (AP) @[ IoU=0.75      | area=   all | maxDets=1000 ] = 0.446
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= small | maxDets=1000 ] = 0.255
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=1000 ] = 0.449
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= large | maxDets=1000 ] = 0.532
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.542
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=300 ] = 0.542
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=1000 ] = 0.542
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= small | maxDets=1000 ] = 0.367
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=1000 ] = 0.580
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= large | maxDets=1000 ] = 0.682

More details can be found in open-mmlab/mmdetection#7001
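
For anyone who wants to compare runs like the one above programmatically, here is a minimal sketch that parses this kind of COCO-style summary text into records. The function name `parse_coco_summary` and the record layout are my own assumptions for illustration, not part of torchvision, mmdetection, or pycocotools; the regular expression only targets the line format shown in this thread.

```python
import re

# Matches summary lines of the form shown in this thread, e.g.
# " Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.408"
_SUMMARY_LINE = re.compile(
    r"Average (?:Precision|Recall)\s+\((AP|AR)\)\s+@\[\s*IoU=([\d.:]+)\s*\|"
    r"\s*area=\s*(\w+)\s*\|\s*maxDets=\s*(\d+)\s*\]\s*=\s*(-?[\d.]+)"
)


def parse_coco_summary(text):
    """Return one dict per recognised summary line (hypothetical helper)."""
    rows = []
    for line in text.splitlines():
        match = _SUMMARY_LINE.search(line)
        if match is None:
            continue  # skip anything that is not a summary line
        metric, iou, area, max_dets, value = match.groups()
        rows.append(
            {
                "metric": metric,          # "AP" or "AR"
                "iou": iou,                # e.g. "0.50:0.95" or "0.50"
                "area": area,              # "all", "small", "medium", "large"
                "max_dets": int(max_dets),
                "value": float(value),
            }
        )
    return rows


# Two lines copied from the summary above.
sample = """
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.408
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= large | maxDets=1000 ] = 0.682
"""
rows = parse_coco_summary(sample)
```

The headline mAP quoted in the comment is the first record (`AP`, `IoU=0.50:0.95`, `area=all`).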

I also learned that torchvision recently added a new ResNet pre-training recipe in #5201, which yields a SOTA ResNet50. Have you run any experiments on Faster R-CNN using this pretrained model? I'm wondering how much improvement it can bring.

@datumbox
Contributor Author

@RangiLyu I don't have these numbers yet, but we plan to run such experiments soon after we add some new primitives for detection. I'm currently scoping which techniques should be added (see here for some early work). The metrics that appear on this issue were moved from #3995 and were written prior to doing any work on ResNet50. BTW, I wouldn't be surprised if we end up training the detection models from scratch using longer cycles, as this has been the trend for strong recipes over the last few years.

@xiaohu2015
Contributor

xiaohu2015 commented Feb 25, 2022


Hi, I ran the RetinaNet experiment with the new ResNet50 on detectron2. With the new weights, we get 41.9 mAP (about +3.6 compared to 38.3) (with GN + GIoU + multi-scale training tricks).

code: https://github.com/xiaohu2015/nndet2
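
For readers unfamiliar with the GIoU term mentioned above, here is a minimal pure-Python sketch of the commonly used definition for two axis-aligned boxes: GIoU = IoU - (|C| - |A ∪ B|) / |C|, where C is the smallest box enclosing both. This is only an illustration under the assumption that boxes are given as `(x1, y1, x2, y2)` with positive area; it is not taken from the linked repository, and the function name `giou` is hypothetical.

```python
def giou(box_a, box_b):
    """Generalized IoU of two boxes given as (x1, y1, x2, y2).

    Assumes x1 < x2 and y1 < y2 for both boxes (positive area).
    """
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b

    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)

    # Intersection: clamp to zero when the boxes do not overlap.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h

    union = area_a + area_b - inter
    iou = inter / union

    # Smallest axis-aligned box enclosing both inputs.
    enclose_w = max(ax2, bx2) - min(ax1, bx1)
    enclose_h = max(ay2, by2) - min(ay1, by1)
    enclose = enclose_w * enclose_h

    # Penalise the empty space in the enclosing box that the union leaves uncovered.
    return iou - (enclose - union) / enclose


identical = giou((0.0, 0.0, 1.0, 1.0), (0.0, 0.0, 1.0, 1.0))
disjoint = giou((0.0, 0.0, 1.0, 1.0), (2.0, 0.0, 3.0, 1.0))
```

Unlike plain IoU, the value stays informative for non-overlapping boxes (it goes negative as they move apart), which is why `1 - GIoU` is often used as a box regression loss.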

@datumbox changed the title from "Improve the accuracy of Detection models by using SOTA recipes and primitives" to "Improve the accuracy of Detection & Segmentation models by using SOTA recipes and primitives" on Apr 6, 2022