Merge back 1.2.1 RC1 & RC2 #2086

14 changes: 14 additions & 0 deletions CHANGELOG.md
@@ -22,6 +22,20 @@ All notable changes to this project will be documented in this file.
- OpenVINO(==2022.3) IR inference is not working well on 2-stage models (e.g. Mask-RCNN) exported from torch==1.13.1
(working well up to torch==1.12.1) (<https://github.com/openvinotoolkit/training_extensions/issues/1906>)

## \[v1.2.1\]

### Enhancements

- Upgrade mmdeploy==0.14.0 from official PyPI (<https://github.com/openvinotoolkit/training_extensions/pull/2047>)
- Integrate new ignored loss in semantic segmentation (<https://github.com/openvinotoolkit/training_extensions/pull/2065>)
- Optimize YOLOX data pipeline (<https://github.com/openvinotoolkit/training_extensions/pull/2075>)
- Tiling Spatial Concatenation for OpenVINO IR (<https://github.com/openvinotoolkit/training_extensions/pull/2052>)

### Bug fixes

- Fix validation variable value being changed after automatic batch size decrease (<https://github.com/openvinotoolkit/training_extensions/pull/2053>)
- Fix tiling 0 stride issue in parameter adapter (<https://github.com/openvinotoolkit/training_extensions/pull/2078>)

## \[v1.2.0\]

### New features
@@ -0,0 +1,73 @@
Fast Data Loading
=================

OpenVINO™ Training Extensions provides several ways to boost model training speed,
one of which is fast data loading.


===================
Faster Augmentation
===================


******
AugMix
******
AugMix [1]_ is a simple yet powerful augmentation technique
that improves the robustness and uncertainty estimates of image classification models.
OpenVINO™ Training Extensions implements it in `Cython <https://cython.org/>`_ for faster augmentation.
No configuration is needed because the cythonized AugMix is used by default.



=======
Caching
=======


*****************
In-Memory Caching
*****************
OpenVINO™ Training Extensions can cache decoded images in main memory.
If the batch size is large, as is common for classification tasks, or if the dataset contains
high-resolution images, image decoding can account for a non-negligible share
of data pre-processing time.
In those cases, enabling in-memory caching can increase GPU utilization and reduce model
training time.


.. code-block::

    $ otx train --mem-cache-size=8GB ..
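
The idea behind in-memory caching can be illustrated with a minimal sketch. This is not the OpenVINO™ Training Extensions implementation: the ``MemCache`` class, the ``load_image`` helper, and the "stop adding entries once the budget is full" policy are all assumptions made for illustration. Decoded pixels are stored as raw bytes, so a later request for the same image skips decoding.

```python
class MemCache:
    """Hypothetical size-bounded cache for decoded images (illustration only)."""

    def __init__(self, max_bytes):
        self.max_bytes = max_bytes  # memory budget, e.g. 8 * 1024**3 for 8GB
        self.used_bytes = 0
        self._store = {}

    def get(self, key):
        # Returns the cached bytes, or None on a cache miss.
        return self._store.get(key)

    def put(self, key, pixels):
        # Assumed policy: once the budget is exhausted, new entries are
        # rejected instead of evicting old ones.
        if key in self._store:
            return True
        if self.used_bytes + len(pixels) > self.max_bytes:
            return False
        self._store[key] = pixels
        self.used_bytes += len(pixels)
        return True


def load_image(path, cache, decode):
    """Return decoded pixels, calling decode(path) only on a cache miss."""
    pixels = cache.get(path)
    if pixels is None:
        pixels = decode(path)
        cache.put(path, pixels)
    return pixels
```

With this policy, images that fit in the budget are decoded once per training run, while the remaining images are decoded on every access as they would be without a cache.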



***************
Storage Caching
***************

OpenVINO™ Training Extensions uses `Datumaro <https://github.com/openvinotoolkit/datumaro>`_
under the hood for dataset management.
Since Datumaro `supports <https://openvinotoolkit.github.io/datumaro/latest/docs/explanation/formats/arrow.html>`_
`Apache Arrow <https://arrow.apache.org/overview/>`_, OpenVINO™ Training Extensions
can load data quickly from a memory-mapped Arrow file at the expense of storage consumption.


.. code-block::

    $ otx train .. params --algo_backend.storage_cache_scheme JPEG/75


The cache is saved in ``$HOME/.cache/otx`` by default.
To change the location, set the ``OTX_CACHE`` environment variable.


.. code-block::

    $ OTX_CACHE=/path/to/cache otx train .. params --algo_backend.storage_cache_scheme JPEG/75
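
The lookup order described above (``OTX_CACHE`` if set, otherwise ``$HOME/.cache/otx``) can be sketched as follows. The ``resolve_cache_dir`` function is a hypothetical helper written for illustration; the actual resolution code in OpenVINO™ Training Extensions may differ.

```python
import os
from pathlib import Path


def resolve_cache_dir(env=None):
    """Hypothetical helper: pick the storage cache directory."""
    env = os.environ if env is None else env
    override = env.get("OTX_CACHE")
    if override:
        # An explicit OTX_CACHE takes precedence over the default location.
        return Path(override)
    home = env.get("HOME")
    base = Path(home) if home else Path.home()
    return base / ".cache" / "otx"
```

Passing a mapping as ``env`` makes the behavior easy to check without touching the real environment, for example ``resolve_cache_dir({"OTX_CACHE": "/path/to/cache"})``.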


Please refer to the `Datumaro document <https://openvinotoolkit.github.io/datumaro/latest/docs/explanation/formats/arrow.html#export-to-arrow>`_
for the available schemes. We recommend ``JPEG/75`` for fast data loading.

.. [1] Dan Hendrycks, Norman Mu, Ekin D. Cubuk, Barret Zoph, Justin Gilmer, and Balaji Lakshminarayanan. "AugMix: A Simple Data Processing Method to Improve Robustness and Uncertainty" International Conference on Learning Representations. 2020.
@@ -11,3 +11,4 @@ Additional Features
auto_configuration
xai
noisy_label_detection
fast_data_loading
@@ -1,4 +1,4 @@
Noisy label detection
Noisy Label Detection
=====================

OpenVINO™ Training Extensions provide a feature for detecting noisy labels during model training.
@@ -273,6 +273,18 @@ For example, that is how you can change the learning rate and the batch size for
--learning_parameters.batch_size 16 \
--learning_parameters.learning_rate 0.001

You can also enable storage caching to speed up data loading at the expense of storage space:

.. code-block::

    (otx) ...$ otx train SSD --train-data-roots <path/to/train/root> \
                             --val-data-roots <path/to/val/root> \
                             params \
                             --algo_backend.storage_cache_scheme JPEG/75

.. note::
    Not all templates support storage caching. We are working on extending the set of supported templates.


As can be seen from the parameters list, the model can be trained using multiple GPUs. To do so, you simply need to specify a comma-separated list of GPU indices after the ``--gpus`` argument. It will start the distributed data-parallel training with the GPUs you have specified.
