forked from aim-uofa/AdelaiDet
Merge branch 'dev' into fcos_large_models
Showing 48 changed files with 1,977 additions and 54 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,44 @@ | ||
# AdelaiDet Model Zoo and Baselines | ||
|
||
## Introduction | ||
This file documents a collection of models trained with AdelaiDet in November 2019. | ||
|
||
## Models | ||
|
||
Inference time is measured on a single 1080Ti GPU, using the most recent Detectron2 commit at the time of writing ([ffff8ac](https://github.com/facebookresearch/detectron2/commit/ffff8acc35ea88ad1cb1806ab0f00b4c1c5dbfd9)). | ||
|
||
More models will be released soon. Stay tuned. | ||
|
||
### COCO Object Detection Baselines with FCOS | ||
|
||
Name | box AP | download | ||
--- |:---:|:---: | ||
[FCOS_R_50_1x](configs/FCOS-Detection/R_50_1x.yaml) | 38.7 | [model](https://cloudstor.aarnet.edu.au/plus/s/glqFc13cCoEyHYy/download) | ||
|
||
### COCO Instance Segmentation Baselines with [BlendMask](https://arxiv.org/abs/2001.00309) | ||
|
||
Model | Name |inference time (ms/im) | box AP | mask AP | download | ||
--- |:---:|:---:|:---:|:---:|:---: | ||
Mask R-CNN | [550_R_50_3x](configs/RCNN/550_R_50_FPN_3x.yaml) | 63 | 39.1 | 35.3 | | ||
BlendMask | [550_R_50_3x](configs/BlendMask/550_R_50_3x.yaml) | 36 | 38.7 | 34.5 | [model](https://cloudstor.aarnet.edu.au/plus/s/R3Qintf7N8UCiIt/download) | ||
Mask R-CNN | [R_50_1x](https://github.com/facebookresearch/detectron2/blob/master/configs/COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_1x.yaml) | 80 | 38.6 | 35.2 | | ||
BlendMask | [R_50_1x](configs/BlendMask/R_50_1x.yaml) | 73 | 39.9 | 35.8 | [model](https://cloudstor.aarnet.edu.au/plus/s/zoxXPnr6Hw3OJgK/download) | ||
Mask R-CNN | [R_50_3x](https://github.com/facebookresearch/detectron2/blob/master/configs/COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x.yaml) | 80 | 41.0 | 37.2 | | ||
BlendMask | [R_50_3x](configs/BlendMask/R_50_3x.yaml) | 74 | 42.7 | 37.8 | [model](https://cloudstor.aarnet.edu.au/plus/s/ZnaInHFEKst6mvg/download) | ||
Mask R-CNN | [R_101_3x](https://github.com/facebookresearch/detectron2/blob/master/configs/COCO-InstanceSegmentation/mask_rcnn_R_101_FPN_3x.yaml) | 100 | 42.9 | 38.6 | | ||
BlendMask | [R_101_3x](configs/BlendMask/R_101_3x.yaml) | 94 | 44.8 | 39.5 | [model](https://cloudstor.aarnet.edu.au/plus/s/e4fXrliAcMtyEBy/download) | ||
BlendMask | [R_101_dcni3_5x](configs/BlendMask/R_101_dcni3_5x.yaml) | 105 | 46.8 | 41.1 | [model](https://cloudstor.aarnet.edu.au/plus/s/vbnKnQtaGlw8TKv/download) | ||
|
||
### COCO Panoptic Segmentation Baselines with BlendMask | ||
Model | Name | PQ | PQ<sup>Th</sup> | PQ<sup>St</sup> | download | ||
--- |:---:|:---:|:---:|:---:|:---: | ||
Panoptic FPN | [R_50_3x](https://github.com/facebookresearch/detectron2/blob/master/configs/COCO-PanopticSegmentation/panoptic_fpn_R_50_3x.yaml) | 41.5 | 48.3 | 31.2 | | ||
BlendMask | [R_50_3x](configs/BlendMask/Panoptic/R_50_3x.yaml) | 42.5 | 49.5 | 32.0 | [model](https://cloudstor.aarnet.edu.au/plus/s/oDgi0826JOJXCr5/download) | ||
Panoptic FPN | [R_101_3x](https://github.com/facebookresearch/detectron2/blob/master/configs/COCO-PanopticSegmentation/panoptic_fpn_R_101_3x.yaml) | 43.0 | 49.7 | 32.9 | | ||
BlendMask | [R_101_3x](configs/BlendMask/Panoptic/R_101_3x.yaml) | 44.3 | 51.6 | 33.2 | [model](https://cloudstor.aarnet.edu.au/plus/s/u6gZwj06MWDEkYe/download) | ||
BlendMask | [R_101_dcni3_5x](configs/BlendMask/Panoptic/R_101_dcni3_5x.yaml) | 46.0 | 52.9 | 35.5 | [model](https://cloudstor.aarnet.edu.au/plus/s/Jwp41WEzDdrhWsN/download) | ||
|
||
### Person in Context with BlendMask | ||
Model | Name | box AP | mask AP | download | ||
--- |:---:|:---:|:---:|:---: | ||
BlendMask | [R_50_1x](configs/BlendMask/Person/R_50_1x.yaml) | 70.6 | 66.7 | [model](https://cloudstor.aarnet.edu.au/plus/s/nvpcKTFA5fsagc0/download) |
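The instance segmentation table in the model zoo above lists Mask R-CNN and BlendMask in pairs that share a config name. As a reading aid, the following minimal sketch computes the BlendMask minus Mask R-CNN difference for each pair. The numbers are copied from that table; the `ROWS` layout and the `compare` helper are illustrative names and are not part of AdelaiDet.

```python
# Paired rows copied from the instance segmentation table:
# (config name, model, inference time in ms/im, box AP, mask AP).
ROWS = [
    ("550_R_50_3x", "Mask R-CNN", 63, 39.1, 35.3),
    ("550_R_50_3x", "BlendMask", 36, 38.7, 34.5),
    ("R_50_1x", "Mask R-CNN", 80, 38.6, 35.2),
    ("R_50_1x", "BlendMask", 73, 39.9, 35.8),
    ("R_50_3x", "Mask R-CNN", 80, 41.0, 37.2),
    ("R_50_3x", "BlendMask", 74, 42.7, 37.8),
    ("R_101_3x", "Mask R-CNN", 100, 42.9, 38.6),
    ("R_101_3x", "BlendMask", 94, 44.8, 39.5),
]


def compare(rows):
    """Return {config name: (time delta, box AP delta, mask AP delta)}.

    Each delta is BlendMask minus Mask R-CNN, rounded to one decimal place.
    """
    by_name = {}
    for name, model, ms, box_ap, mask_ap in rows:
        by_name.setdefault(name, {})[model] = (ms, box_ap, mask_ap)
    deltas = {}
    for name, models in by_name.items():
        base, ours = models["Mask R-CNN"], models["BlendMask"]
        deltas[name] = tuple(round(o - b, 1) for o, b in zip(ours, base))
    return deltas


for name, (d_ms, d_box, d_mask) in compare(ROWS).items():
    print(f"{name}: {d_ms:+} ms/im, {d_box:+.1f} box AP, {d_mask:+.1f} mask AP")
```

A negative time delta means BlendMask is faster for that config. The `R_101_dcni3_5x` row is left out because the table has no Mask R-CNN counterpart for it.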
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,5 +1,5 @@ | ||
  from . import builtin # ensure the builtin datasets are registered | ||
- # from .dataset_mapper import DatasetMapperWithBasis | ||
+ from .dataset_mapper import DatasetMapperWithBasis | ||
|
||
|
||
- # __all__ = ["DatasetMapperWithBasis"] | ||
+ __all__ = ["DatasetMapperWithBasis"] |
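The comment on `from . import builtin` says the import exists only to make sure the builtin datasets are registered. The sketch below illustrates that general import-time registration pattern using only the standard library. Every name in it (`REGISTRY`, `register`, `load_toy_train`, `get`) is hypothetical; it does not reproduce the actual AdelaiDet or Detectron2 registration code.

```python
# Hypothetical, stdlib-only illustration of import-time registration.
# None of these names come from AdelaiDet or Detectron2.
REGISTRY = {}


def register(name):
    """Decorator that records a loader function under `name` as soon as it is defined."""
    def decorator(func):
        if name in REGISTRY:
            raise KeyError(f"{name!r} is already registered")
        REGISTRY[name] = func
        return func
    return decorator


# In a real package this definition would live in its own module
# (for example a `builtin.py`); importing that module runs the decorator
# and fills the registry, which is why the import is kept even though
# nothing from the module is referenced by name.
@register("toy_train")
def load_toy_train():
    return [{"file_name": "a.jpg", "width": 4, "height": 3}]


def get(name):
    """Look up a registered loader by name and call it."""
    return REGISTRY[name]()
```

Because registration is a side effect of the import, removing or commenting out such an import would leave the registry empty without raising any error until a lookup fails.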
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,141 @@ | ||
import copy | ||
import numpy as np | ||
import torch | ||
from fvcore.common.file_io import PathManager | ||
from PIL import Image | ||
|
||
from detectron2.data.dataset_mapper import DatasetMapper | ||
from detectron2.data.detection_utils import SizeMismatchError | ||
from detectron2.data import detection_utils as utils | ||
from detectron2.data import transforms as T | ||
|
||
""" | ||
This file contains the default mapping that's applied to "dataset dicts". | ||
""" | ||
|
||
__all__ = ["DatasetMapperWithBasis"] | ||
|
||
|
||
class DatasetMapperWithBasis(DatasetMapper): | ||
    """ | ||
    This callable extends the default Detectron2 mapper to also read an additional basis semantic label. | ||
    """ | ||
|
||
    def __init__(self, cfg, is_train=True): | ||
        super().__init__(cfg, is_train) | ||
|
||
        # fmt: off | ||
        self.basis_loss_on = cfg.MODEL.BASIS_MODULE.LOSS_ON | ||
        self.ann_set = cfg.MODEL.BASIS_MODULE.ANN_SET | ||
        # fmt: on | ||
|
||
    def __call__(self, dataset_dict): | ||
        """ | ||
        Args: | ||
            dataset_dict (dict): Metadata of one image, in Detectron2 Dataset format. | ||
        Returns: | ||
            dict: a format that builtin models in detectron2 accept | ||
        """ | ||
        dataset_dict = copy.deepcopy(dataset_dict)  # it will be modified by code below | ||
        # USER: Write your own image loading if it's not from a file | ||
        try: | ||
            image = utils.read_image(dataset_dict["file_name"], format=self.img_format) | ||
        except Exception as e: | ||
            print(dataset_dict["file_name"]) | ||
            print(e) | ||
            raise e | ||
        try: | ||
            utils.check_image_size(dataset_dict, image) | ||
        except SizeMismatchError as e: | ||
            expected_wh = (dataset_dict["width"], dataset_dict["height"]) | ||
            image_wh = (image.shape[1], image.shape[0]) | ||
            if (image_wh[1], image_wh[0]) == expected_wh: | ||
                print("transposing image {}".format(dataset_dict["file_name"])) | ||
                image = image.transpose(1, 0, 2) | ||
            else: | ||
                raise e | ||
|
||
        if "annotations" not in dataset_dict or len(dataset_dict["annotations"]) == 0: | ||
            image, transforms = T.apply_transform_gens( | ||
                ([self.crop_gen] if self.crop_gen else []) + self.tfm_gens, image | ||
            ) | ||
        else: | ||
            # Crop around an instance if there are instances in the image. | ||
            # USER: Remove if you don't use cropping | ||
            if self.crop_gen: | ||
                crop_tfm = utils.gen_crop_transform_with_instance( | ||
                    self.crop_gen.get_crop_size(image.shape[:2]), | ||
                    image.shape[:2], | ||
                    np.random.choice(dataset_dict["annotations"]), | ||
                ) | ||
                image = crop_tfm.apply_image(image) | ||
            image, transforms = T.apply_transform_gens(self.tfm_gens, image) | ||
            if self.crop_gen: | ||
                transforms = crop_tfm + transforms | ||
|
||
        image_shape = image.shape[:2]  # h, w | ||
|
||
        # PyTorch's dataloader is efficient on torch.Tensor due to shared-memory, | ||
        # but not efficient on large generic data structures due to the use of pickle & mp.Queue. | ||
        # Therefore it's important to use torch.Tensor. | ||
        dataset_dict["image"] = torch.as_tensor(image.transpose(2, 0, 1).astype("float32")) | ||
        # Can use uint8 if it turns out to be slow some day | ||
|
||
        # USER: Remove if you don't use pre-computed proposals. | ||
        if self.load_proposals: | ||
            utils.transform_proposals( | ||
                dataset_dict, image_shape, transforms, self.min_box_side_len, self.proposal_topk | ||
            ) | ||
|
||
        if not self.is_train: | ||
            dataset_dict.pop("annotations", None) | ||
            dataset_dict.pop("sem_seg_file_name", None) | ||
            dataset_dict.pop("pano_seg_file_name", None) | ||
            return dataset_dict | ||
|
||
        if "annotations" in dataset_dict: | ||
            # USER: Modify this if you want to keep them for some reason. | ||
            for anno in dataset_dict["annotations"]: | ||
                if not self.mask_on: | ||
                    anno.pop("segmentation", None) | ||
                if not self.keypoint_on: | ||
                    anno.pop("keypoints", None) | ||
|
||
            # USER: Implement additional transformations if you have other types of data | ||
            annos = [ | ||
                utils.transform_instance_annotations( | ||
                    obj, transforms, image_shape, keypoint_hflip_indices=self.keypoint_hflip_indices | ||
                ) | ||
                for obj in dataset_dict.pop("annotations") | ||
                if obj.get("iscrowd", 0) == 0 | ||
            ] | ||
            instances = utils.annotations_to_instances( | ||
                annos, image_shape, mask_format=self.mask_format | ||
            ) | ||
            # Create a tight bounding box from masks, useful when image is cropped | ||
            if self.crop_gen and instances.has("gt_masks"): | ||
                instances.gt_boxes = instances.gt_masks.get_bounding_boxes() | ||
            dataset_dict["instances"] = utils.filter_empty_instances(instances) | ||
|
||
        # USER: Remove if you don't do semantic/panoptic segmentation. | ||
        if "sem_seg_file_name" in dataset_dict: | ||
            with PathManager.open(dataset_dict.pop("sem_seg_file_name"), "rb") as f: | ||
                sem_seg_gt = Image.open(f) | ||
                sem_seg_gt = np.asarray(sem_seg_gt, dtype="uint8") | ||
            sem_seg_gt = transforms.apply_segmentation(sem_seg_gt) | ||
            sem_seg_gt = torch.as_tensor(sem_seg_gt.astype("long")) | ||
            dataset_dict["sem_seg"] = sem_seg_gt | ||
|
||
        if self.basis_loss_on and self.is_train: | ||
            # load basis supervisions | ||
            if self.ann_set == "coco": | ||
                basis_sem_path = dataset_dict["file_name"].replace('train2017', 'thing_train2017').replace('image/train', 'thing_train') | ||
            else: | ||
                basis_sem_path = dataset_dict["file_name"].replace('coco', 'lvis').replace('train2017', 'thing_train').replace('jpg', 'npz') | ||
            basis_sem_path = basis_sem_path.replace('jpg', 'npz') | ||
            basis_sem_gt = np.load(basis_sem_path)["mask"] | ||
            basis_sem_gt = transforms.apply_segmentation(basis_sem_gt) | ||
            basis_sem_gt = torch.as_tensor(basis_sem_gt.astype("long")) | ||
            dataset_dict["basis_sem"] = basis_sem_gt | ||
        return dataset_dict |
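The last block of `__call__` derives the path of the basis annotation from the image path purely through chained `str.replace` calls. The standalone sketch below mirrors those replacements so the mapping can be tried without Detectron2 installed. The function name `basis_sem_path_for` and the example paths are assumptions for illustration; only the replacement strings come from the mapper.

```python
def basis_sem_path_for(file_name, ann_set="coco"):
    """Map an image path to its basis annotation path.

    Mirrors the string replacements used in DatasetMapperWithBasis.
    This function name is hypothetical and not part of AdelaiDet.
    """
    if ann_set == "coco":
        path = file_name.replace("train2017", "thing_train2017")
        path = path.replace("image/train", "thing_train")
    else:
        # Any other annotation set takes the LVIS-style branch.
        path = file_name.replace("coco", "lvis")
        path = path.replace("train2017", "thing_train")
    # Both branches end up pointing at an .npz archive.
    return path.replace("jpg", "npz")


# Hypothetical example paths; real dataset locations may differ.
print(basis_sem_path_for("datasets/coco/train2017/000000000009.jpg"))
print(basis_sem_path_for("datasets/coco/train2017/000000000009.jpg", ann_set="lvis"))
```

One property worth keeping in mind: `str.replace` rewrites every occurrence of the substring, not just the intended path component. A parent directory whose name contains `coco`, `train2017` or `jpg` would therefore be rewritten as well, so the scheme relies on the dataset being stored under a path where those substrings appear only where expected.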
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,3 +1,4 @@ | ||
from .fpn import build_fcos_resnet_fpn_backbone | ||
from .vovnet import build_vovnet_fpn_backbone, build_vovnet_backbone | ||
from .dla import build_fcos_dla_fpn_backbone | ||
from .resnet_lpf import build_resnet_lpf_backbone |