Support for ICDAR dataset format #2866

Merged on Mar 26, 2021 (17 commits)
Changes from 11 commits
1 change: 1 addition & 0 deletions CHANGELOG.md
@@ -22,6 +22,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
- Project task subsets (<https://github.com/openvinotoolkit/cvat/pull/2774>)
- [WiderFace](http://shuoyang1213.me/WIDERFACE/) format support (<https://github.com/openvinotoolkit/cvat/pull/2864>)
- [VGGFace2](https://github.com/ox-vgg/vgg_face2) format support (<https://github.com/openvinotoolkit/cvat/pull/2865>)
- [ICDAR](https://rrc.cvc.uab.es/?ch=2) format support (<https://github.com/openvinotoolkit/cvat/pull/2866>)

### Changed

1 change: 1 addition & 0 deletions README.md
@@ -64,6 +64,7 @@ For more information about supported formats look at the
| [CamVid](http://mi.eng.cam.ac.uk/research/projects/VideoRec/CamVid/) | X | X |
| [WIDER Face](http://shuoyang1213.me/WIDERFACE/) | X | X |
| [VGGFace2](https://github.com/ox-vgg/vgg_face2) | X | X |
| [ICDAR13/15](https://rrc.cvc.uab.es/?ch=2) | X | X |

## Deep learning serverless functions for automatic labeling

92 changes: 82 additions & 10 deletions cvat/apps/dataset_manager/formats/README.md
@@ -22,6 +22,7 @@
- [CamVid](#camvid)
- [WIDER Face](#widerface)
- [VGGFace2](#vggface2)
- [ICDAR13/15](#icdar)

## How to add a new annotation format support<a id="how-to-add"></a>

@@ -816,17 +817,17 @@ Downloaded file: a zip archive of the following structure:
```bash
# if we save images:
taskname.zip/
-── label1/
-    ├── label1_image1.jpg
-    └── label1_image2.jpg
+├── label1/
+|   ├── label1_image1.jpg
+|   └── label1_image2.jpg
└── label2/
    ├── label2_image1.jpg
    ├── label2_image3.jpg
    └── label2_image4.jpg

# if we keep only annotation:
taskname.zip/
-── <any_subset_name>.txt
+├── <any_subset_name>.txt
└── synsets.txt

```
@@ -848,12 +849,12 @@ Downloaded file: a zip archive of the following structure:
```bash
taskname.zip/
├── labelmap.txt # optional, required for non-CamVid labels
-── <any_subset_name>/
-    ├── image1.png
-    └── image2.png
-── <any_subset_name>annot/
-    ├── image1.png
-    └── image2.png
+├── <any_subset_name>/
+|   ├── image1.png
+|   └── image2.png
+├── <any_subset_name>annot/
+|   ├── image1.png
+|   └── image2.png
└── <any_subset_name>.txt

# labelmap.txt
@@ -937,3 +938,74 @@ label1 <class1>
Uploaded file: a zip archive of the structure above

- supported annotations: Rectangles, Points (landmarks - groups of 5 points)

### [ICDAR13/15](https://rrc.cvc.uab.es/?ch=2)<a id="icdar" />

#### ICDAR13/15 Dumper

Downloaded file: a zip archive of the following structure:

```bash
# word recognition task
taskname.zip/
└── word_recognition/
    └── <any_subset_name>/
        ├── images
        |   ├── word1.png
        |   └── word2.png
        └── gt.txt

# text localization task
taskname.zip/
└── text_localization/
    └── <any_subset_name>/
        ├── images
        |   ├── img_1.png
        |   └── img_2.png
        ├── gt_img_1.txt
        └── gt_img_2.txt

# text segmentation task
taskname.zip/
└── text_segmentation/
    └── <any_subset_name>/
        ├── images
        |   ├── 1.png
        |   └── 2.png
        ├── 1_GT.bmp
        ├── 1_GT.txt
        ├── 2_GT.bmp
        └── 2_GT.txt
```

**Word recognition task**:

- supported annotations: Label `icdar` with attribute `caption`

**Text localization task**:

- supported annotations: Rectangles and Polygons with label `icdar`
and attribute `text`

**Text segmentation task**:

- supported annotations: Rectangles and Polygons with label `icdar`
and attributes `index`, `text`, `color`, `center`

#### ICDAR13/15 Loader

Uploaded file: a zip archive of the structure above

**Word recognition task**:

- supported annotations: Label `icdar` with attribute `caption`

**Text localization task**:

- supported annotations: Rectangles and Polygons with label `icdar`
and attribute `text`

**Text segmentation task**:

- supported annotations: Rectangles and Polygons with label `icdar`
and attributes `index`, `text`, `color`, `center`
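
As a rough illustration of how an uploaded archive could be classified by its top-level directory, here is a minimal sketch. The helper name `detect_icdar_tasks` is hypothetical and is not part of CVAT or Datumaro. The directory names are assumptions taken from the layout in this section, with `text_segmentation` assumed for the segmentation task.

```python
import io
import zipfile

# Assumed task directory names, based on the archive layout described above.
TASK_DIRS = ("word_recognition", "text_localization", "text_segmentation")


def detect_icdar_tasks(archive):
    """Return the set of assumed ICDAR task directories found at the zip root.

    `archive` may be a path or a binary file-like object.
    """
    with zipfile.ZipFile(archive) as zf:
        # Keep only the first path component of every archive member.
        top_level = {name.split("/", 1)[0] for name in zf.namelist()}
    return {task for task in TASK_DIRS if task in top_level}


# Usage: build a tiny in-memory archive and inspect it.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    zf.writestr("word_recognition/train/gt.txt", "")
    zf.writestr("word_recognition/train/images/word1.png", b"")
buf.seek(0)
print(detect_icdar_tasks(buf))
```

A check like this only looks at directory names; it does not validate the contents of `gt.txt` or the image files.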
75 changes: 75 additions & 0 deletions cvat/apps/dataset_manager/formats/icdar.py
@@ -0,0 +1,75 @@
# Copyright (C) 2021 Intel Corporation
#
# SPDX-License-Identifier: MIT

import os.path as osp
import zipfile
from tempfile import TemporaryDirectory

from datumaro.components.dataset import Dataset
from datumaro.components.extractor import (AnnotationType, Caption, Label,
    LabelCategories)

from cvat.apps.dataset_manager.bindings import CvatTaskDataExtractor, \
    import_dm_annotations
from cvat.apps.dataset_manager.util import make_zip_archive

from .registry import dm_env, exporter, importer

@exporter(name='ICDAR Recognition', ext='ZIP', version='1.0')
def _export_recognition(dst_file, task_data, save_images=False):
    dataset = Dataset.from_extractors(CvatTaskDataExtractor(
        task_data, include_images=save_images), env=dm_env)
    for item in dataset:
        anns = [p for p in item.annotations
            if 'text' in p.attributes]
        for ann in anns:
            item.annotations.append(Caption(ann.attributes['text']))
    with TemporaryDirectory() as temp_dir:
        dataset.export(temp_dir, 'icdar_word_recognition', save_images=save_images)
        make_zip_archive(temp_dir, dst_file)

@exporter(name='ICDAR Localization', ext='ZIP', version='1.0')
def _export_localization(dst_file, task_data, save_images=False):
    dataset = Dataset.from_extractors(CvatTaskDataExtractor(
        task_data, include_images=save_images), env=dm_env)
    with TemporaryDirectory() as temp_dir:
        dataset.export(temp_dir, 'icdar_text_localization', save_images=save_images)
        make_zip_archive(temp_dir, dst_file)

@exporter(name='ICDAR Segmentation', ext='ZIP', version='1.0')
def _export_segmentation(dst_file, task_data, save_images=False):
    dataset = Dataset.from_extractors(CvatTaskDataExtractor(
        task_data, include_images=save_images), env=dm_env)
    with TemporaryDirectory() as temp_dir:
        dataset.transform('polygons_to_masks')
        dataset.transform('boxes_to_masks')
        dataset.transform('merge_instance_segments')
        dataset.export(temp_dir, 'icdar_text_segmentation', save_images=save_images)
        make_zip_archive(temp_dir, dst_file)

@importer(name='ICDAR', ext='ZIP', version='1.0')
def _import(src_file, task_data):
    with TemporaryDirectory() as tmp_dir:
        zipfile.ZipFile(src_file).extractall(tmp_dir)

        dataset = Dataset.import_from(tmp_dir, 'icdar', env=dm_env)
        if osp.isdir(osp.join(tmp_dir, 'word_recognition')):
            for item in dataset:
                anns = [p for p in item.annotations
                    if p.type == AnnotationType.caption]
                for ann in anns:
                    item.annotations.append(Label(label=0,
                        attributes={'text': ann.caption}))
                    item.annotations.remove(ann)
        else:
            for item in dataset:
                anns = [p for p in item.annotations
                    if p.type in [AnnotationType.bbox, AnnotationType.polygon, AnnotationType.mask]]
                for ann in anns:
                    ann.label = 0
        label_cat = LabelCategories()
        label_cat.add('icdar')
        dataset.categories()[AnnotationType.label] = label_cat
        dataset.transform('masks_to_polygons')
        import_dm_annotations(dataset, task_data)
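
The caption-to-label conversion in the word recognition branch of `_import` can be illustrated without Datumaro. The sketch below is an approximation of the idea, not the PR's code: `Caption` and `Label` are hypothetical stand-in dataclasses rather than the real `datumaro.components.extractor` types, `captions_to_labels` is a made-up helper name, and it builds a new list (keeping each annotation's position) instead of appending and removing in place.

```python
from dataclasses import dataclass, field


@dataclass
class Caption:
    """Stand-in for a caption annotation (hypothetical, not the Datumaro class)."""
    caption: str


@dataclass
class Label:
    """Stand-in for a label annotation (hypothetical, not the Datumaro class)."""
    label: int
    attributes: dict = field(default_factory=dict)


def captions_to_labels(annotations):
    """Replace every Caption with a Label of index 0 carrying the text as an attribute."""
    converted = []
    for ann in annotations:
        if isinstance(ann, Caption):
            # Label index 0 corresponds to the single `icdar` label in the diff above.
            converted.append(Label(label=0, attributes={'text': ann.caption}))
        else:
            converted.append(ann)
    return converted


# Usage: one caption becomes one label; other annotations pass through unchanged.
result = captions_to_labels([Caption("PROPER"), Label(label=1)])
print(result)
```

Building a new list avoids modifying a sequence while it is being processed, which is why the PR's code iterates over a separate `anns` list before calling `remove`.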
1 change: 1 addition & 0 deletions cvat/apps/dataset_manager/formats/registry.py
@@ -97,3 +97,4 @@ def make_exporter(name):
import cvat.apps.dataset_manager.formats.camvid
import cvat.apps.dataset_manager.formats.widerface
import cvat.apps.dataset_manager.formats.vggface2
import cvat.apps.dataset_manager.formats.icdar
11 changes: 9 additions & 2 deletions cvat/apps/dataset_manager/tests/test_formats.py
@@ -284,6 +284,9 @@ def test_export_formats_query(self):
'CamVid 1.0',
'WiderFace 1.0',
'VGGFace2 1.0',
'ICDAR Recognition 1.0',
'ICDAR Localization 1.0',
'ICDAR Segmentation 1.0',
})

def test_import_formats_query(self):
@@ -304,6 +307,7 @@ def test_import_formats_query(self):
'CamVid 1.0',
'WiderFace 1.0',
'VGGFace2 1.0',
'ICDAR 1.0',
})

def test_exports(self):
@@ -316,8 +320,8 @@ def check(file_path):
self.skipTest("Format is disabled")

format_name = f.DISPLAY_NAME
-if format_name == "VGGFace2 1.0":
-    self.skipTest("Format does not support multiple shapes for one item")
+if format_name in ["VGGFace2 1.0", "ICDAR Segmentation 1.0"]:
+    self.skipTest("Format is disabled")

for save_images in { True, False }:
images = self._generate_task_images(3)
@@ -346,6 +350,9 @@ def test_empty_images_are_exported(self):
('CamVid 1.0', 'camvid'),
('WiderFace 1.0', 'wider_face'),
('VGGFace2 1.0', 'vgg_face2'),
('ICDAR Recognition 1.0', 'icdar'),
('ICDAR Localization 1.0', 'icdar'),
# ('ICDAR Segmentation 1.0', 'icdar'), # does not support
]:
with self.subTest(format=format_name):
if not dm.formats.registry.EXPORT_FORMATS[format_name].ENABLED: