Add level-up documentation - level 10 data generation
Showing 2 changed files with 43 additions and 28 deletions.
docs/source/docs/level-up/intermediate_skills/10_data_generation.md: 28 changes (0 additions, 28 deletions). This file was deleted.
docs/source/docs/level-up/intermediate_skills/10_data_generation.rst: 43 changes (43 additions, 0 deletions).
@@ -0,0 +1,43 @@
===========================
Level 10: Data Generation
===========================

Pre-training deep learning models for vision tasks can increase model accuracy.
Training a model on a synthetic dataset is a popular pre-training approach,
since manual annotation is expensive.

Based on `previous research <https://arxiv.org/abs/2103.13023>`_,
Datumaro provides a fractal image dataset (FractalDB) generator that can be used to pre-train vision models.
Learning the visual features of FractalDB is known to improve the performance of Vision Transformer (ViT) models.
Note that the fractal patterns in FractalDB are computed mathematically using an iterated function system (IFS) with random parameters.
We therefore do not need to be concerned about privacy issues.
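
To make the IFS idea concrete, here is a minimal sketch of how a fractal pattern can be rendered from random parameters with the "chaos game". It is an illustration only and is not Datumaro's or FractalDB's actual implementation: the helper names (``random_affine_maps``, ``chaos_game``, ``rasterize``), the coefficient ranges, and the tiny binary grid are all assumptions chosen to keep the example self-contained.

```python
import random


def random_affine_maps(rng, num_maps=4):
    """Draw random affine maps (a, b, c, d, e, f).

    Each map sends (x, y) to (a*x + b*y + e, c*x + d*y + f).
    Coefficients are kept small so every map is a contraction
    (an illustrative choice, not FractalDB's sampling scheme).
    """
    maps = []
    for _ in range(num_maps):
        a, b, c, d = (rng.uniform(-0.45, 0.45) for _ in range(4))
        e, f = rng.uniform(-1.0, 1.0), rng.uniform(-1.0, 1.0)
        maps.append((a, b, c, d, e, f))
    return maps


def chaos_game(maps, num_points, rng, burn_in=20):
    """Iterate randomly chosen maps and collect the visited points."""
    x, y = 0.0, 0.0
    points = []
    for step in range(num_points + burn_in):
        a, b, c, d, e, f = rng.choice(maps)
        x, y = a * x + b * y + e, c * x + d * y + f
        if step >= burn_in:  # skip the first few points, which may lie off the pattern
            points.append((x, y))
    return points


def rasterize(points, width, height):
    """Scale the points into a width x height grid of 0/1 pixels."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    min_x, min_y = min(xs), min(ys)
    span_x = (max(xs) - min_x) or 1.0
    span_y = (max(ys) - min_y) or 1.0
    image = [[0] * width for _ in range(height)]
    for x, y in points:
        col = min(width - 1, int((x - min_x) / span_x * width))
        row = min(height - 1, int((y - min_y) / span_y * height))
        image[row][col] = 1
    return image


rng = random.Random(0)  # fixed seed so the pattern is reproducible
maps = random_affine_maps(rng)
image = rasterize(chaos_game(maps, 5000, rng), width=24, height=18)
for row in image:
    print("".join("#" if pixel else "." for pixel in row))
```

Because every pixel comes from random numbers and arithmetic alone, no real-world image or personal data enters the dataset, which is the reason privacy is not a concern here.
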

.. tab-set::

   .. tab-item:: CLI

      We can generate the synthetic images with the following CLI command:

      .. code-block:: bash

         datum generate -o <path/to/data> --count GEN_IMG_COUNT --shape GEN_IMG_SHAPE

      ``GEN_IMG_COUNT`` is an integer that indicates the number of images to be generated (e.g. ``--count 300``).
      ``GEN_IMG_SHAPE`` is the shape (width height) of the generated images (e.g. ``--shape 240 180``).

   .. tab-item:: Python

      With the Python API, we can generate the synthetic images as below.

      .. code-block:: python

         from datumaro.plugins.synthetic_data import FractalImageGenerator

         FractalImageGenerator(output_dir="<path/to/data>", count=GEN_IMG_COUNT, shape=GEN_IMG_SHAPE).generate_dataset()

      ``GEN_IMG_COUNT`` is an integer that indicates the number of images to be generated (e.g. ``count=300``).
      ``GEN_IMG_SHAPE`` is a tuple representing the shape of the generated images as (width, height) (e.g. ``shape=(240, 180)``).

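
As a worked example with concrete values, the sketch below assembles the ``datum generate`` command line from the example values given above (``--count 300``, ``--shape 240 180``). The helper ``build_generate_command``, the ``RUN_DATUM`` switch, and the output directory name ``generated_fractals`` are hypothetical and exist only for illustration; only the flags ``-o``, ``--count`` and ``--shape`` come from this document.

```python
import shutil
import subprocess


def build_generate_command(output_dir, count, shape):
    """Assemble the `datum generate` argument list (hypothetical helper).

    `shape` is passed as two separate values after `--shape`,
    matching the `--shape 240 180` example in the text.
    """
    first, second = shape
    return [
        "datum", "generate",
        "-o", str(output_dir),
        "--count", str(count),
        "--shape", str(first), str(second),
    ]


# "generated_fractals" is an assumed directory name, not one used by Datumaro.
command = build_generate_command("generated_fractals", count=300, shape=(240, 180))
print(" ".join(command))
# prints: datum generate -o generated_fractals --count 300 --shape 240 180

# Only invoke the tool when explicitly requested and when it is installed.
RUN_DATUM = False
if RUN_DATUM and shutil.which("datum") is not None:
    subprocess.run(command, check=True)
```

Keeping the command as a list of arguments avoids shell quoting problems when the output path contains spaces.
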
Congratulations! You have completed all Datumaro level-up documents for the intermediate skills.