add augmentation detail page to docs #3533

Merged
merged 1 commit into from
May 23, 2024
@@ -1,5 +1,5 @@
Adaptive Training
==================
=================

Adaptive training adjusts the number of iterations or the validation interval to speed up training.
In the small data regime, we don't need to validate the model at every epoch, since each epoch contains only a few iterations.
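
As a rough illustration of the idea above, the sketch below derives a validation interval from the number of iterations per epoch. The function name `adaptive_val_interval`, the `target_iters` threshold, and the rounding rule are assumptions made for this example; they are not taken from the actual implementation.

```python
import math


def adaptive_val_interval(iters_per_epoch: int, target_iters: int = 100) -> int:
    """Return how many epochs to wait between validations (hypothetical rule).

    The aim is to validate roughly once every `target_iters` training
    iterations, but never more often than once per epoch.
    """
    if iters_per_epoch <= 0:
        raise ValueError("iters_per_epoch must be positive")
    return max(1, math.ceil(target_iters / iters_per_epoch))


# A small dataset with 10 iterations per epoch validates every 10 epochs,
# while a large one with 500 iterations per epoch still validates every epoch.
small_interval = adaptive_val_interval(10)
large_interval = adaptive_val_interval(500)
```

With this rule, short epochs are grouped together before each validation pass, so the validation cost stays roughly proportional to the amount of training done.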
@@ -0,0 +1,70 @@
Augmentations per model
=======================

The following table lists the augmentations used for each model.

+-----------------------------+---------------------------+-------------------------------------------------------------------------------------+---------------------------------------------+---------------------------------------------+
| Task | Model | Train | Val | Test |
+=============================+===========================+=====================================================================================+=============================================+=============================================+
|| Multi Class Classification || Efficientnet-B0 || - RandomResizedCrop (size=224) || - Resize (size=224) || - Resize (size=224) |
|| Multi Label Classification || Efficientnet-V2-S || - RandomFlip (flip_prob=0.5, direction="horizontal") || - Normalize || - Normalize |
|| H-Label Classification || MV3-Large || - Normalize || || |
|| || DeiT || || || |
+-----------------------------+---------------------------+-------------------------------------------------------------------------------------+---------------------------------------------+---------------------------------------------+
|| Detection || Yolox_l || - Mosaic (img_scale=640, pad_val=114.0) || - MultiScaleFlipAug (img_scale=(640, 640)) || - MultiScaleFlipAug (img_scale=(640, 640)) |
|| || Yolox_s || - RandomAffine || - Resize || - Resize |
|| || || - MixUp (img_scale=640, ratio_range=(0.8, 1.6), pad_val=114.0) || - RandomFlip (flip_prob=0.5) || - RandomFlip (flip_prob=0.5) |
+-----------------------------+---------------------------+-------------------------------------------------------------------------------------+---------------------------------------------+---------------------------------------------+
|| || Yolox_x || - YOLOXHSVRandomAug || - Pad (size_divisor=32) || - Pad (size_divisor=32) |
|| || || - RandomFlip (flip_prob=0.5) || - Normalize || - Normalize |
|| || || - Resize (img_scale=640) || || |
|| || || - Pad || || |
|| || || - Normalize || || |
+-----------------------------+---------------------------+-------------------------------------------------------------------------------------+---------------------------------------------+---------------------------------------------+
|| || Yolox_tiny || - Mosaic (img_scale=640, pad_val=114.0) || - Resize (img_scale=(416, 416)) || - MultiScaleFlipAug (img_scale=(416, 416)) |
|| || || - RandomAffine || - MultiScaleFlipAug (img_scale=(416, 416)) || - Resize |
|| || || - PhotoMetricDistortion || - RandomFlip || - RandomFlip |
|| || || - RandomFlip (flip_prob=0.5) || - Pad || - Pad |
|| || || - Resize (img_scale=640) || - Normalize || - Normalize |
|| || || - Pad || || |
|| || || - Normalize || || |
+-----------------------------+---------------------------+-------------------------------------------------------------------------------------+---------------------------------------------+---------------------------------------------+
|| || Mobilenetv2_atss || - MinIoURandomCrop || - Resize (img_scale=(992, 736)) || - Resize (img_scale=(992, 736)) |
|| || Resnext101_atss || - Resize (img_scale=[(992, 736), (896, 736), (1088, 736), (992, 672), (992, 800)]) || - MultiScaleFlipAug (img_scale=(992, 736)) || - MultiScaleFlipAug (img_scale=(992, 736)) |
|| || || - RandomFlip (flip_prob=0.5) || - RandomFlip || - RandomFlip |
|| || || - Normalize || - Normalize || - Normalize |
+-----------------------------+---------------------------+-------------------------------------------------------------------------------------+---------------------------------------------+---------------------------------------------+
|| || Mobilenetv2_ssd || - PhotoMetricDistortion || - Resize (img_scale=(864, 864)) || - MultiScaleFlipAug (img_scale=(864, 864)) |
|| || || - MinIoURandomCrop || - MultiScaleFlipAug (img_scale=(864, 864)) || - Resize |
|| || || - Resize (img_scale=(864, 864)) || - Normalize || - Normalize |
|| || || - Normalize || || |
|| || || - RandomFlip (flip_prob=0.5) || || |
+-----------------------------+---------------------------+-------------------------------------------------------------------------------------+---------------------------------------------+---------------------------------------------+
|| || Resnet50_Detr || - RandomFlip (flip_prob=0.5) || - MultiScaleFlipAug (img_scale=(1333, 800)) || - MultiScaleFlipAug (img_scale=(1333, 800)) |
|| || Resnet50_dino || - AutoAugment || - Resize || - Resize |
|| || || - Resize || - RandomFlip || - RandomFlip |
|| || || - RandomCrop || - Normalize || - Normalize |
|| || || - Resize || - Pad (size_divisor=32) || - Pad (size_divisor=32) |
|| || || - Normalize || || |
|| || || - Pad (size_divisor=1) || || |
+-----------------------------+---------------------------+-------------------------------------------------------------------------------------+---------------------------------------------+---------------------------------------------+
|| Instance-segmentation || Convnext_maskrcnn || - Resize (img_scale=1024) || - Resize (img_scale=1024) || - MultiScaleFlipAug (img_scale=1024) |
|| || Efficientnetb2b_maskrcnn || - RandomFlip (flip_prob=0.5) || - MultiScaleFlipAug || - Resize |
|| || Resnet50_maskrcnn || - Normalize || - RandomFlip (flip_prob=0.5) || - RandomFlip (flip_prob=0.5) |
|| || || - Pad (size_divisor=32) || - Normalize || - Normalize |
|| || || || - Pad (size_divisor=32) || - Pad (size_divisor=32) |
+-----------------------------+---------------------------+-------------------------------------------------------------------------------------+---------------------------------------------+---------------------------------------------+
|| || Maskrcnn_swin_t || - Resize (img_scale=1344) || - Resize (img_scale=1344) || - Resize (img_scale=1344) |
|| || || - RandomFlip (flip_prob=0.5) || - MultiScaleFlipAug || - MultiScaleFlipAug |
|| || || - Normalize || - RandomFlip (flip_prob=0.5) || - RandomFlip (flip_prob=0.5) |
|| || || - Pad (size_divisor=32) || - Normalize || - Normalize |
|| || || || - Pad (size_divisor=32) || - Pad (size_divisor=32) |
+-----------------------------+---------------------------+-------------------------------------------------------------------------------------+---------------------------------------------+---------------------------------------------+
|| Segmentation || Segnext_b || - Resize (img_scale=544) || - Resize (img_scale=544) || - Resize (img_scale=544) |
|| || Segnext_s || - RandomCrop (crop_size=512, cat_max_ratio=0.75) || - MultiScaleFlipAug || - MultiScaleFlipAug |
|| || Segnext_t || - RandomFlip (flip_prob=0.5, direction="horizontal") || - RandomFlip || - RandomFlip |
|| || Lite_hrnet_18 || - Normalize || - Normalize || - Normalize |
|| || Lite_hrnet_18_mod2 || - Pad (size=512, pad_val=0, seg_pad_val=255) || || |
|| || Lite_hrnet_s_mod2 || || || |
|| || Lite_hrnet_x_mod3 || || || |
+-----------------------------+---------------------------+-------------------------------------------------------------------------------------+---------------------------------------------+---------------------------------------------+
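
To make the table easier to read, here is a minimal sketch of how a train pipeline such as "RandomFlip, Normalize, Pad (size_divisor=32)" chains its steps. The functions `random_flip`, `normalize`, `pad_to_divisor`, and `compose` are simplified stand-ins written for this example and operate on plain nested lists; they are not the API of the underlying training framework, and padding on the bottom and right is an assumption of the sketch.

```python
import random
from functools import partial
from typing import Callable, List, Optional

Image = List[List[float]]  # grayscale image stored as rows of pixel values


def random_flip(img: Image, flip_prob: float = 0.5,
                rng: Optional[random.Random] = None) -> Image:
    """Flip the image horizontally with probability `flip_prob`."""
    source = rng if rng is not None else random
    if source.random() < flip_prob:
        return [row[::-1] for row in img]
    return img


def normalize(img: Image, mean: float, std: float) -> Image:
    """Subtract `mean` and divide by `std` for every pixel."""
    return [[(px - mean) / std for px in row] for row in img]


def pad_to_divisor(img: Image, size_divisor: int = 32,
                   pad_val: float = 0.0) -> Image:
    """Pad bottom and right so both sides become multiples of `size_divisor`."""
    height, width = len(img), len(img[0])
    new_h = -(-height // size_divisor) * size_divisor  # ceiling division
    new_w = -(-width // size_divisor) * size_divisor
    padded = [row + [pad_val] * (new_w - width) for row in img]
    padded += [[pad_val] * new_w for _ in range(new_h - height)]
    return padded


def compose(steps: List[Callable[[Image], Image]]) -> Callable[[Image], Image]:
    """Apply the given steps in order, as a pipeline in the table would."""
    def run(img: Image) -> Image:
        for step in steps:
            img = step(img)
        return img
    return run


# flip_prob=1.0 and size_divisor=4 keep this toy example deterministic and small.
train_pipeline = compose([
    partial(random_flip, flip_prob=1.0),
    partial(normalize, mean=0.0, std=1.0),
    partial(pad_to_divisor, size_divisor=4),
])
result = train_pipeline([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]])
```

The order of the steps matters: in this sketch normalization runs before padding, so the padded border keeps the raw `pad_val` rather than a normalized value.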
@@ -15,3 +15,4 @@ Additional Features
fast_data_loading
tiling
config_input_size
augmentations_per_model
@@ -1,5 +1,5 @@
Visual Prompting (Fine-tuning)
=================
==============================

Visual prompting is a computer vision task that combines an image with prompts, such as text, bounding boxes, or points, to solve a given problem.
The main purpose of this task is to use these prompts to obtain labels for unlabeled datasets, and then either to apply the generated labels in particular domains or to develop a new model with them.
@@ -1,5 +1,5 @@
Visual Prompting
============
================

.. toctree::
:maxdepth: 1
@@ -1,5 +1,5 @@
Visual Prompting (Zero-shot learning)
=================
=====================================

Visual prompting is a computer vision task that combines an image with prompts, such as text, bounding boxes, or points, to solve a given problem.
The main purpose of this task is to use these prompts to obtain labels for unlabeled datasets, and then either to apply the generated labels in particular domains or to develop a new model with them.