[Doc] add tutorials for level 3 and 4 (#920)

### Summary

Please take a look at http://10.225.20.174:7041/build/html/docs/level-up/basic_skills/03_dataset_import_export.html and http://10.225.20.174:7041/build/html/docs/level-up/basic_skills/04_detect_data_format.html#level-4-detect-data-format-from-an-unknown-dataset for references.

### How to test

### Checklist

- [ ] I have added unit tests to cover my changes.
- [ ] I have added integration tests to cover my changes.
- [ ] I have added the description of my changes into [CHANGELOG](https://github.com/openvinotoolkit/datumaro/blob/develop/CHANGELOG.md).
- [ ] I have updated the [documentation](https://github.com/openvinotoolkit/datumaro/tree/develop/docs) accordingly

### License

- [ ] I submit _my code changes_ under the same [MIT License](https://github.com/openvinotoolkit/datumaro/blob/develop/LICENSE) that covers the project. Feel free to contact the maintainers if that's a concern.
- [ ] I have updated the license header for each file (see an example below).

```python
# Copyright (C) 2023 Intel Corporation
#
# SPDX-License-Identifier: MIT
```
openvinotoolkit · Apr 11, 2023 · 4672276 · 4672276
1 parent 606a0c8
commit 4672276
Showing 5 changed files with 132 additions and 56 deletions.
diff --git a/docs/source/docs/level-up/basic_skills/03_dataset_import_export.md b/docs/source/docs/level-up/basic_skills/03_dataset_import_export.md
diff --git a/docs/source/docs/level-up/basic_skills/03_dataset_import_export.rst b/docs/source/docs/level-up/basic_skills/03_dataset_import_export.rst
@@ -0,0 +1,90 @@
+===============================
+Level 3: Data Import and Export
+===============================
+
+Datumaro is a tool that supports public data formats across a wide range of tasks such as
+classification, detection, segmentation, pose estimation, or visual tracking.
+To facilitate this, Datumaro supports data import and export through both the Python API and the CLI,
+which makes it easier for users to work with various data formats.
+
+Prepare dataset
+===============
+
+For the segmentation task, we introduce the Cityscapes dataset, which collects road scenes from 50
+different cities and contains 5K fine-grained pixel-level annotations and 20K coarse annotations.
+A more detailed description is given :ref:`here <Cityscapes>`.
+The Cityscapes dataset is available for free `download <https://www.cityscapes-dataset.com/downloads/>`_.
+
+Convert data format
+===================
+
+Users sometimes need to compare, merge, or manage various kinds of public datasets in a unified
+system. To achieve this, Datumaro not only has `import` and `export` functionalities, but also
+provides `convert`, which combines the import and export into a single command line.
+We now convert the Cityscapes data into the MS-COCO format, which is described :ref:`here <COCO>`.
+
+
+.. tabbed:: CLI
+
+  Without creating a project, we can do this with the single-line `convert` command in Datumaro:
+
+  .. code-block:: bash
+
+    datum convert -if cityscapes -i <path/to/cityscapes> -f coco_panoptic -o <path/to/output>
+
+.. tabbed:: Python
+
+  With the Python API, we can import the data through `Dataset` as below.
+
+  .. code-block:: python
+
+      from datumaro.components.dataset import Dataset
+
+      data_path = '/path/to/cityscapes'
+      data_format = 'cityscapes'
+
+      dataset = Dataset.import_from(data_path, data_format)
+
+  We then export the imported dataset as follows:
+
+  .. code-block:: python
+
+      output_path = '/path/to/output'
+
+      dataset.export(output_path, format='coco_panoptic')
+
+.. tabbed:: ProjectCLI
+
+  With the project-based CLI, we first need to create a project:
+
+  .. code-block:: bash
+
+    datum create -o <path/to/project>
+
+  We now import Cityscapes data into the project through
+
+  .. code-block:: bash
+
+    datum import --format cityscapes -p <path/to/project> <path/to/cityscapes>
+
+  (Optional) When we import data, the change is automatically committed to the project.
+  This can be shown through `log`:
+
+  .. code-block:: bash
+
+    datum log -p <path/to/project>
+
+  (Optional) We can check information about the imported dataset, such as its subsets, number of
+  items, or categories, through `info`.
+
+  .. code-block:: bash
+
+    datum info -p <path/to/project>
+
+  Finally, we export the data within the project in the MS-COCO format:
+
+  .. code-block:: bash
+
+    datum export --format coco -p <path/to/project> -o <path/to/save> -- --save-media
+
+For data with an unknown format, we can detect the format in the :ref:`next level <Level 4: Detect Data Format from an Unknown Dataset>`!
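The `datum convert` call from the CLI tab above can also be assembled from a script, for example when converting several datasets in a loop. The following is a minimal sketch under that assumption: `build_convert_command` is a hypothetical helper (not part of Datumaro), and the flags are copied as-is from the tutorial's command.

```python
import shlex
from typing import List


def build_convert_command(
    input_path: str, input_format: str, output_path: str, output_format: str
) -> List[str]:
    """Assemble the `datum convert` argument list shown in the CLI tab.

    Hypothetical helper for illustration; it is not part of Datumaro.
    """
    return [
        "datum", "convert",
        "-if", input_format,   # source format, e.g. "cityscapes"
        "-i", input_path,      # source dataset directory
        "-f", output_format,   # target format, e.g. "coco_panoptic"
        "-o", output_path,     # output directory
    ]


cmd = build_convert_command(
    "/path/to/cityscapes", "cityscapes", "/path/to/output", "coco_panoptic"
)
# Print a shell-quoted version for inspection; nothing is executed here.
print(shlex.join(cmd))
```

On a machine where `datum` is installed, the list could be passed to `subprocess.run(cmd, check=True)`. Keeping the arguments as a list avoids shell-quoting problems with paths that contain spaces.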
diff --git a/docs/source/docs/level-up/basic_skills/04_detect_data_format.md b/docs/source/docs/level-up/basic_skills/04_detect_data_format.md
diff --git a/docs/source/docs/level-up/basic_skills/04_detect_data_format.rst b/docs/source/docs/level-up/basic_skills/04_detect_data_format.rst
@@ -0,0 +1,41 @@
+===================================================
+Level 4: Detect Data Format from an Unknown Dataset
+===================================================
+
+Datumaro provides a function to detect the format of a dataset before importing data. This can be
+useful in cases where information about the original format of the data has been lost or is unclear.
+With this function, users can easily identify the format and proceed with appropriate data
+handling processes.
+
+Detect data format
+==================
+
+.. tabbed:: CLI
+
+  .. code-block:: bash
+
+    datum detect-format <path/to/data>
+
+  The printed format can be used as the `format` argument when importing a dataset, as described in the
+  :ref:`previous level <Level 3: Data Import and Export>`.
+
+.. tabbed:: Python
+
+  .. code-block:: python
+
+      from datumaro.components.environment import Environment
+
+      data_path = '/path/to/data'
+
+      env = Environment()
+
+      detected_formats = env.detect_dataset(data_path)
+
+
+  (Optional) With the detected format, we can import the dataset as below.
+
+  .. code-block:: python
+
+      from datumaro.components.dataset import Dataset
+
+      dataset = Dataset.import_from(data_path, detected_formats[0])
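The Python tab above indexes `detected_formats[0]` directly, which raises an `IndexError` when the list is empty and silently picks the first candidate when several are returned. The following is a minimal sketch of a stricter selection step, assuming that `detect_dataset` returns a list of format names as the tutorial's usage suggests; `pick_format` is a hypothetical helper, not part of Datumaro.

```python
from typing import Sequence


def pick_format(detected_formats: Sequence[str]) -> str:
    """Return the single detected format name.

    Hypothetical helper for illustration; it is not part of Datumaro.
    Raises ValueError when detection found nothing or was ambiguous.
    """
    if not detected_formats:
        raise ValueError("no dataset format was detected")
    if len(detected_formats) > 1:
        candidates = ", ".join(detected_formats)
        raise ValueError(f"ambiguous dataset format, candidates: {candidates}")
    return detected_formats[0]


# Example with a hard-coded detection result instead of a real dataset.
print(pick_format(["cityscapes"]))
```

With such a helper, the import call from the tutorial would read `Dataset.import_from(data_path, pick_format(detected_formats))`.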
diff --git a/docs/source/docs/level-up/basic_skills/index.rst b/docs/source/docs/level-up/basic_skills/index.rst
@@ -26,6 +26,7 @@ Basic Skills
       :text: Level 3: Dataset Import & Export
       :classes: btn-outline-primary btn-block
 
+   :badge:`ProjectCLI,badge-primary`
    :badge:`CLI,badge-info`
    :badge:`Python,badge-warning`