cvat-ai · nmanovic · Mar 26, 2021 · Feb 25, 2021 · Mar 1, 2021 · Mar 2, 2021
@@ -22,6 +22,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
 - Project task subsets (<https://github.com/openvinotoolkit/cvat/pull/2774>)
 - [WiderFace](http://shuoyang1213.me/WIDERFACE/) format support (<https://github.com/openvinotoolkit/cvat/pull/2864>)
 - [VGGFace2](https://github.com/ox-vgg/vgg_face2) format support (<https://github.com/openvinotoolkit/cvat/pull/2865>)
+- [ICDAR](https://rrc.cvc.uab.es/?ch=2) format support (<https://github.com/openvinotoolkit/cvat/pull/2866>)
 
 ### Changed
 

@@ -64,6 +64,7 @@ For more information about supported formats look at the
 | [CamVid](http://mi.eng.cam.ac.uk/research/projects/VideoRec/CamVid/)          | X      | X      |
 | [WIDER Face](http://shuoyang1213.me/WIDERFACE/)                               | X      | X      |
 | [VGGFace2](https://github.com/ox-vgg/vgg_face2)                               | X      | X      |
+| [ICDAR13/15](https://rrc.cvc.uab.es/?ch=2)                                    | X      | X      |
 
 ## Deep learning serverless functions for automatic labeling
 

@@ -22,6 +22,7 @@
   - [CamVid](#camvid)
   - [WIDER Face](#widerface)
   - [VGGFace2](#vggface2)
+  - [ICDAR13/15](#icdar)
 
 ## How to add a new annotation format support<a id="how-to-add"></a>
 
@@ -816,17 +817,17 @@ Downloaded file: a zip archive of the following structure:
 ```bash
 # if we save images:
 taskname.zip/
-└── label1/
-    ├── label1_image1.jpg
-    └── label1_image2.jpg
+├── label1/
+|   ├── label1_image1.jpg
+|   └── label1_image2.jpg
 └── label2/
     ├── label2_image1.jpg
     ├── label2_image3.jpg
     └── label2_image4.jpg
 
 # if we keep only annotation:
 taskname.zip/
-└── <any_subset_name>.txt
+├── <any_subset_name>.txt
 └── synsets.txt
 
 ```
@@ -848,12 +849,12 @@ Downloaded file: a zip archive of the following structure:
 ```bash
 taskname.zip/
 ├── labelmap.txt # optional, required for non-CamVid labels
-└── <any_subset_name>/
-    ├── image1.png
-    └── image2.png
-└── <any_subset_name>annot/
-    ├── image1.png
-    └── image2.png
+├── <any_subset_name>/
+|   ├── image1.png
+|   └── image2.png
+├── <any_subset_name>annot/
+|   ├── image1.png
+|   └── image2.png
 └── <any_subset_name>.txt
 
 # labelmap.txt
@@ -937,3 +938,74 @@ label1 <class1>
 Uploaded file: a zip archive of the structure above
 
 - supported annotations: Rectangles, Points (landmarks - groups of 5 points)
+
+### [ICDAR13/15](https://rrc.cvc.uab.es/?ch=2)<a id="icdar" />
+
+#### ICDAR13/15 Dumper
+
+Downloaded file: a zip archive of the following structure:
+
+```bash
+# word recognition task
+taskname.zip/
+└── word_recognition/
+    └── <any_subset_name>/
+        ├── images
+        |   ├── word1.png
+        |   └── word2.png
+        └── gt.txt
+
+# text localization task
+taskname.zip/
+└── text_localization/
+    └── <any_subset_name>/
+        ├── images
+        |   ├── img_1.png
+        |   └── img_2.png
+        ├── gt_img_1.txt
+        └── gt_img_2.txt
+
+# text segmentation task
+taskname.zip/
+└── text_segmentation/
+    └── <any_subset_name>/
+        ├── images
+        |   ├── 1.png
+        |   └── 2.png
+        ├── 1_GT.bmp
+        ├── 1_GT.txt
+        ├── 2_GT.bmp
+        └── 2_GT.txt
+```
+
+**Word recognition task**:
+
+- supported annotations: Label `icdar` with attribute `text`
+
+**Text localization task**:
+
+- supported annotations: Rectangles and Polygons with label `icdar`
+  and attribute `text`
+
+**Text segmentation task**:
+
+- supported annotations: Rectangles and Polygons with label `icdar`
+  and attributes `index`, `text`, `color`, `center`
+
+#### ICDAR13/15 Loader
+
+Uploaded file: a zip archive of the structure above
+
+**Word recognition task**:
+
+- supported annotations: Label `icdar` with attribute `text`
+
+**Text localization task**:
+
+- supported annotations: Rectangles and Polygons with label `icdar`
+  and attribute `text`
+
+**Text segmentation task**:
+
+- supported annotations: Rectangles and Polygons with label `icdar`
+  and attributes `index`, `text`, `color`, `center`
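Note on the layouts documented above: each task uses its own top-level directory, and the importer added in this PR branches on whether `word_recognition/` is present after extraction. The sketch below performs a comparable check directly on a zip archive with the standard library only. The names `detect_icdar_task` and `make_demo_archive`, and the `'shapes'` return value, are hypothetical and exist only for illustration; they are not part of CVAT or Datumaro.

```python
import io
import zipfile


def detect_icdar_task(archive) -> str:
    """Classify an ICDAR archive by its top-level directory.

    Returns 'word_recognition' when that directory is present, otherwise
    'shapes' (a hypothetical label for the localization/segmentation branch).
    """
    with zipfile.ZipFile(archive) as zf:
        # Collect the first path component of every entry in the archive.
        top_level = {
            name.split('/', 1)[0] for name in zf.namelist() if name.strip('/')
        }
    if 'word_recognition' in top_level:
        return 'word_recognition'
    return 'shapes'


def make_demo_archive(task_dir: str) -> io.BytesIO:
    """Build a tiny in-memory archive with one empty gt.txt (demo data only)."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, 'w') as zf:
        zf.writestr(f'{task_dir}/train/gt.txt', '')
    buf.seek(0)
    return buf


if __name__ == '__main__':
    print(detect_icdar_task(make_demo_archive('word_recognition')))
    print(detect_icdar_task(make_demo_archive('text_localization')))
```

Reading the entry names avoids extracting the archive just to find out which task it holds. The importer in this PR extracts first because Datumaro needs the files on disk anyway.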
@@ -0,0 +1,75 @@
+# Copyright (C) 2021 Intel Corporation
+#
+# SPDX-License-Identifier: MIT
+
+import os.path as osp
+import zipfile
+from tempfile import TemporaryDirectory
+
+from datumaro.components.dataset import Dataset
+from datumaro.components.extractor import (AnnotationType, Caption, Label,
+    LabelCategories)
+
+from cvat.apps.dataset_manager.bindings import CvatTaskDataExtractor, \
+    import_dm_annotations
+from cvat.apps.dataset_manager.util import make_zip_archive
+
+from .registry import dm_env, exporter, importer
+
+@exporter(name='ICDAR Recognition', ext='ZIP', version='1.0')
+def _export_recognition(dst_file, task_data, save_images=False):
+    dataset = Dataset.from_extractors(CvatTaskDataExtractor(
+        task_data, include_images=save_images), env=dm_env)
+    for item in dataset:
+        anns = [p for p in item.annotations
+            if 'text' in p.attributes]
+        for ann in anns:
+            item.annotations.append(Caption(ann.attributes['text']))
+    with TemporaryDirectory() as temp_dir:
+        dataset.export(temp_dir, 'icdar_word_recognition', save_images=save_images)
+        make_zip_archive(temp_dir, dst_file)
+
+@exporter(name='ICDAR Localization', ext='ZIP', version='1.0')
+def _export_localization(dst_file, task_data, save_images=False):
+    dataset = Dataset.from_extractors(CvatTaskDataExtractor(
+        task_data, include_images=save_images), env=dm_env)
+    with TemporaryDirectory() as temp_dir:
+        dataset.export(temp_dir, 'icdar_text_localization', save_images=save_images)
+        make_zip_archive(temp_dir, dst_file)
+
+@exporter(name='ICDAR Segmentation', ext='ZIP', version='1.0')
+def _export_segmentation(dst_file, task_data, save_images=False):
+    dataset = Dataset.from_extractors(CvatTaskDataExtractor(
+        task_data, include_images=save_images), env=dm_env)
+    with TemporaryDirectory() as temp_dir:
+        dataset.transform('polygons_to_masks')
+        dataset.transform('boxes_to_masks')
+        dataset.transform('merge_instance_segments')
+        dataset.export(temp_dir, 'icdar_text_segmentation', save_images=save_images)
+        make_zip_archive(temp_dir, dst_file)
+
+@importer(name='ICDAR', ext='ZIP', version='1.0')
+def _import(src_file, task_data):
+    with TemporaryDirectory() as tmp_dir:
+        zipfile.ZipFile(src_file).extractall(tmp_dir)
+
+        dataset = Dataset.import_from(tmp_dir, 'icdar', env=dm_env)
+        if osp.isdir(osp.join(tmp_dir, 'word_recognition')):
+            for item in dataset:
+                anns = [p for p in item.annotations
+                    if p.type == AnnotationType.caption]
+                for ann in anns:
+                    item.annotations.append(Label(label=0,
+                        attributes={'text': ann.caption}))
+                    item.annotations.remove(ann)
+        else:
+            for item in dataset:
+                anns = [p for p in item.annotations
+                    if p.type in [AnnotationType.bbox, AnnotationType.polygon, AnnotationType.mask]]
+                for ann in anns:
+                    ann.label = 0
+        label_cat = LabelCategories()
+        label_cat.add('icdar')
+        dataset.categories()[AnnotationType.label] = label_cat
+        dataset.transform('masks_to_polygons')
+        import_dm_annotations(dataset, task_data)
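Note on the word recognition branch of `_import` above: every caption annotation is replaced by a label with index 0 that carries the caption in a `text` attribute. The sketch below isolates that conversion so it can be run without Datumaro. The `Caption` and `Label` dataclasses are simplified stand-ins for the Datumaro classes of the same name, and `captions_to_labels` is a hypothetical helper, not a function from this PR.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Union


@dataclass
class Caption:
    """Simplified stand-in for datumaro's Caption annotation."""
    caption: str


@dataclass
class Label:
    """Simplified stand-in for datumaro's Label annotation."""
    label: int
    attributes: Dict[str, str] = field(default_factory=dict)


Annotation = Union[Caption, Label]


def captions_to_labels(annotations: List[Annotation],
                       label_id: int = 0) -> List[Annotation]:
    """Return a new list in which each Caption becomes a Label with a 'text' attribute."""
    converted: List[Annotation] = []
    for ann in annotations:
        if isinstance(ann, Caption):
            converted.append(
                Label(label=label_id, attributes={'text': ann.caption}))
        else:
            # Non-caption annotations pass through unchanged.
            converted.append(ann)
    return converted


if __name__ == '__main__':
    print(captions_to_labels([Caption('hello'), Label(3)]))
```

Unlike the loop in `_import`, which appends to and removes from `item.annotations` in place, this sketch builds a new list. That keeps the original order and avoids mutating the input, at the cost of one extra list.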
@@ -97,3 +97,4 @@ def make_exporter(name):
 import cvat.apps.dataset_manager.formats.camvid
 import cvat.apps.dataset_manager.formats.widerface
 import cvat.apps.dataset_manager.formats.vggface2
+import cvat.apps.dataset_manager.formats.icdar
@@ -284,6 +284,9 @@ def test_export_formats_query(self):
             'CamVid 1.0',
             'WiderFace 1.0',
             'VGGFace2 1.0',
+            'ICDAR Recognition 1.0',
+            'ICDAR Localization 1.0',
+            'ICDAR Segmentation 1.0',
         })
 
     def test_import_formats_query(self):
@@ -304,6 +307,7 @@ def test_import_formats_query(self):
             'CamVid 1.0',
             'WiderFace 1.0',
             'VGGFace2 1.0',
+            'ICDAR 1.0',
         })
 
     def test_exports(self):
@@ -316,8 +320,8 @@ def check(file_path):
                 self.skipTest("Format is disabled")
 
             format_name = f.DISPLAY_NAME
-            if format_name == "VGGFace2 1.0":
-                self.skipTest("Format does not support multiple shapes for one item")
+            if format_name in ["VGGFace2 1.0", "ICDAR Segmentation 1.0"]:
+                self.skipTest("Format is disabled")
 
             for save_images in { True, False }:
                 images = self._generate_task_images(3)
@@ -346,6 +350,9 @@ def test_empty_images_are_exported(self):
             ('CamVid 1.0', 'camvid'),
             ('WiderFace 1.0', 'wider_face'),
             ('VGGFace2 1.0', 'vgg_face2'),
+            ('ICDAR Recognition 1.0', 'icdar'),
+            ('ICDAR Localization 1.0', 'icdar'),
+            # ('ICDAR Segmentation 1.0', 'icdar'), # does not support
         ]:
             with self.subTest(format=format_name):
                 if not dm.formats.registry.EXPORT_FORMATS[format_name].ENABLED: