Update PTQ docs #2672

Merged
@@ -1,17 +1,17 @@
Models Optimization
===================

OpenVINO™ Training Extensions provides two types of optimization algorithms: `Post-training Optimization Tool (POT) <https://docs.openvino.ai/latest/pot_introduction.html#doxid-pot-introduction>`_ and `Neural Network Compression Framework (NNCF) <https://github.com/openvinotoolkit/nncf>`_.
OpenVINO™ Training Extensions provides two types of optimization algorithms: `Post-Training Quantization tool (PTQ) <https://github.com/openvinotoolkit/nncf#post-training-quantization>`_ and `Neural Network Compression Framework (NNCF) <https://github.com/openvinotoolkit/nncf>`_.

*******************************
Post-training Optimization Tool
Post-Training Quantization Tool
*******************************

POT is designed to optimize the inference of models by applying post-training methods that do not require model retraining or fine-tuning. If you want to know more details about how POT works and to be more familiar with model optimization methods, please refer to `documentation <https://docs.openvino.ai/latest/pot_introduction.html#doxid-pot-introduction>`_.
PTQ is designed to optimize model inference by applying post-training methods that do not require model retraining or fine-tuning. For more details on how PTQ works and on model optimization methods in general, refer to the `documentation <https://docs.openvino.ai/2023.2/ptq_introduction.html>`_.

To run Post-training optimization it is required to convert the model to OpenVINO™ intermediate representation (IR) first. To perform fast and accurate quantization we use ``DefaultQuantization Algorithm`` for each task. Please, see the `DefaultQuantization Parameters <https://docs.openvino.ai/latest/pot_compression_algorithms_quantization_default_README.html#doxid-pot-compression-algorithms-quantization-default-r-e-a-d-m-e>`_ for further information about configuring the optimization.
To run post-training quantization, the model must first be converted to the OpenVINO™ intermediate representation (IR). To perform fast and accurate quantization, we use the ``DefaultQuantization Algorithm`` for each task. Refer to `Tune quantization Parameters <https://docs.openvino.ai/2023.2/basic_quantization_flow.html#tune-quantization-parameters>`_ for further information about configuring the optimization.
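To make the idea of post-training quantization more concrete, here is a minimal sketch of symmetric per-tensor INT8 quantization in plain Python. It is only an illustration of the general technique and is not the algorithm that NNCF or OpenVINO™ actually runs; the function names ``compute_scale``, ``quantize`` and ``dequantize`` are hypothetical helpers introduced for this example.

```python
def compute_scale(calibration_values, num_bits=8):
    """Derive a symmetric per-tensor scale from calibration data.

    The largest absolute calibration value is mapped to the largest
    positive signed integer (127 for INT8).
    """
    qmax = 2 ** (num_bits - 1) - 1
    max_abs = max(abs(v) for v in calibration_values)
    return max_abs / qmax if max_abs > 0 else 1.0


def quantize(values, scale, num_bits=8):
    """Round values to integers and clamp them to the signed integer range."""
    qmax = 2 ** (num_bits - 1) - 1
    qmin = -qmax - 1
    return [max(qmin, min(qmax, round(v / scale))) for v in values]


def dequantize(q_values, scale):
    """Map integers back to approximate floating-point values."""
    return [q * scale for q in q_values]


# Calibration data plays the role of the dataset passed to the optimizer.
calibration = [-1.0, 0.5, 2.0]
scale = compute_scale(calibration)

weights = [2.0, -2.0, 0.0, 0.3]
q_weights = quantize(weights, scale)
restored = dequantize(q_weights, scale)
```

No retraining happens here: the scale is computed once from calibration data, which is why this family of methods is fast but can lose some accuracy to rounding and clamping.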

POT parameters can be found and configured in ``template.yaml`` and ``configuration.yaml`` for each task. For Anomaly and Semantic Segmentation tasks, we have separate configuration files for POT, that can be found in the same directory with ``template.yaml``, for example for `PaDiM <https://github.com/openvinotoolkit/training_extensions/blob/develop/src/otx/algorithms/anomaly/configs/classification/padim/ptq_optimization_config.py>`_, `OCR-Lite-HRNe-18-mod2 <https://github.com/openvinotoolkit/training_extensions/blob/develop/src/otx/algorithms/segmentation/configs/ocr_lite_hrnet_18_mod2/ptq_optimization_config.py>`_ model.
PTQ parameters can be found and configured in ``template.yaml`` and ``configuration.yaml`` for each task. Anomaly and Semantic Segmentation tasks have separate PTQ configuration files, located in the same directory as ``template.yaml``; see, for example, the files for the `PaDiM <https://github.com/openvinotoolkit/training_extensions/blob/develop/src/otx/algorithms/anomaly/configs/classification/padim/ptq_optimization_config.py>`_ and `OCR-Lite-HRNet-18-mod2 <https://github.com/openvinotoolkit/training_extensions/blob/develop/src/otx/algorithms/segmentation/configs/ocr_lite_hrnet_18_mod2/ptq_optimization_config.py>`_ models.

************************************
Neural Network Compression Framework
@@ -25,9 +25,9 @@ You can refer to configuration files for default templates for each task accordi

NNCF tends to provide better quality in terms of preserving accuracy as it uses training compression approaches.
Compression results achievable with the NNCF can be found `here <https://github.com/openvinotoolkit/nncf#nncf-compressed-model-zoo>`_ .
Meanwhile, the POT is faster but can degrade accuracy more than the training-enabled approach.
Meanwhile, PTQ is faster but can degrade accuracy more than the training-enabled approach.

.. note::
The main recommendation is to start with post-training compression and use NNCF compression during training if you are not satisfied with the results.

Please, refer to our :doc:`dedicated tutorials <../../tutorials/base/how_to_train/index>` on how to optimize your model using POT or NNCF.
Please, refer to our :doc:`dedicated tutorials <../../tutorials/base/how_to_train/index>` on how to optimize your model using PTQ or NNCF.
14 changes: 7 additions & 7 deletions docs/source/guide/get_started/cli_commands.rst
@@ -342,10 +342,10 @@ To use the exported model as an input for ``otx explain``, please dump additiona
Optimization
************

``otx optimize`` optimizes a model using `NNCF <https://github.com/openvinotoolkit/nncf>`_ or `POT <https://docs.openvino.ai/latest/pot_introduction.html>`_ depending on the model format.
``otx optimize`` optimizes a model using `NNCF <https://github.com/openvinotoolkit/nncf>`_ or `PTQ <https://github.com/openvinotoolkit/nncf#post-training-quantization>`_ depending on the model and transforms it to ``INT8`` format.

- NNCF optimization is used for trained snapshots in a framework-specific format, such as a checkpoint (.pth) file from PyTorch
- POT optimization used for models exported in the OpenVINO™ IR format
- PTQ optimization is used for models exported in the OpenVINO™ IR format

With the ``--help`` command, you can list additional information:

@@ -383,16 +383,16 @@ Command example for optimizing a PyTorch model (.pth) with OpenVINO™ NNCF:
--output outputs/nncf


Command example for optimizing OpenVINO™ model (.xml) with OpenVINO™ POT:
Command example for optimizing OpenVINO™ model (.xml) with OpenVINO™ PTQ:

.. code-block::

(otx) ...$ otx optimize SSD --load-weights <path/to/openvino.xml> \
--val-data-roots <path/to/val/root> \
--output outputs/pot
--output outputs/ptq


Thus, to use POT pass the path to exported IR (.xml) model, to use NNCF pass the path to the PyTorch (.pth) weights.
Thus, to use PTQ, pass the path to the exported IR (.xml) model; to use NNCF, pass the path to the PyTorch (.pth) weights.
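The rule above can be sketched as a small Python helper. This is an illustration only and is not part of the OpenVINO™ Training Extensions code base; the function name ``select_optimization_backend`` is hypothetical, and the sketch assumes the choice depends solely on the file extension of the weights path.

```python
from pathlib import Path


def select_optimization_backend(weights_path):
    """Return which optimizer the rule above implies for a weights file.

    Hypothetical helper: ".pth" (PyTorch checkpoint) maps to NNCF and
    ".xml" (exported OpenVINO IR) maps to PTQ.
    """
    suffix = Path(weights_path).suffix.lower()
    if suffix == ".pth":
        return "NNCF"
    if suffix == ".xml":
        return "PTQ"
    raise ValueError(
        f"Unsupported weights file '{weights_path}': expected .pth or .xml"
    )


backend_for_ir = select_optimization_backend("outputs/openvino/openvino.xml")
backend_for_checkpoint = select_optimization_backend("outputs/weights.pth")
```

Raising an error for any other extension keeps the mapping explicit instead of silently picking a default.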


***********
@@ -419,7 +419,7 @@ With the ``--help`` command, you can list additional information, such as its pa
--test-data-roots TEST_DATA_ROOTS
Comma-separated paths to test data folders.
--load-weights LOAD_WEIGHTS
Load model weights from previously saved checkpoint.It could be a trained/optimized model (POT only) or exported model.
Load model weights from previously saved checkpoint. It could be a trained/optimized model (with PTQ only) or exported model.
-o OUTPUT, --output OUTPUT
Location where the intermediate output of the task will be stored.
--workspace WORKSPACE Path to the workspace where the command will run.
@@ -532,7 +532,7 @@ Demonstration
-i INPUT, --input INPUT
Source of input data: images folder, image, webcam and video.
--load-weights LOAD_WEIGHTS
Load model weights from previously saved checkpoint.It could be a trained/optimized model (POT only) or exported model.
Load model weights from previously saved checkpoint. It could be a trained/optimized model (with PTQ only) or exported model.
--fit-to-size FIT_TO_SIZE FIT_TO_SIZE
Width and Height space-separated values. Fits displayed images to window with specified Width and Height. This options applies to result visualisation only.
--loop Enable reading the input in a loop.
2 changes: 1 addition & 1 deletion docs/source/guide/tutorials/advanced/semi_sl.rst
@@ -28,7 +28,7 @@ The process has been tested on the following configuration:

To learn how to export the trained model, refer to `classification export <../base/how_to_train/classification.html#export>`__.

To learn how to optimize the trained model (.xml) with OpenVINO™ POT, refer to `classification optimization <../base/how_to_train/classification.html#optimization>`__.
To learn how to optimize the trained model (.xml) with OpenVINO™ PTQ, refer to `classification optimization <../base/how_to_train/classification.html#optimization>`__.

Currently, OpenVINO™ NNCF optimization doesn't support a full Semi-SL training algorithm. The accuracy-aware optimization will be executed on labeled data only.
So, the performance drop may be more noticeable than after ordinary supervised training.
2 changes: 1 addition & 1 deletion docs/source/guide/tutorials/base/deploy.rst
@@ -52,7 +52,7 @@ using the command below:
2023-01-20 09:30:41,737 | INFO : Deploying the model
2023-01-20 09:30:41,753 | INFO : Deploying completed

You can also deploy the quantized model, that was optimized with NNCF or POT, passing the path to this model in IR format to ``--load-weights`` parameter.
You can also deploy a quantized model that was optimized with NNCF or PTQ by passing the path to this model in IR format to the ``--load-weights`` parameter.

After that, you can use the resulting ``openvino.zip`` archive in other applications.

@@ -212,7 +212,7 @@ Export
*********

1. ``otx export`` exports a trained Pytorch `.pth` model to the OpenVINO™ Intermediate Representation (IR) format.
It allows running the model on the Intel hardware much more efficiently, especially on the CPU. Also, the resulting IR model is required to run POT optimization. IR model consists of two files: ``openvino.xml`` for weights and ``openvino.bin`` for architecture.
It allows running the model on Intel hardware much more efficiently, especially on the CPU. The resulting IR model is also required to run PTQ optimization. An IR model consists of two files: ``openvino.xml`` for the architecture and ``openvino.bin`` for the weights.

2. Run the command line below to export the trained model
and save the exported model to the ``openvino`` folder.
@@ -235,7 +235,7 @@ and save the exported model to the ``openvino`` folder.
2023-02-21 22:54:35,424 - mmaction - INFO - Exporting completed


3. Check the accuracy of the IR model and the consistency between the exported model and the PyTorch model,
3. Check the accuracy of the IR model and the consistency between the exported model and the PyTorch model,
using ``otx eval`` and passing the IR model path to the ``--load-weights`` parameter.

.. code-block::
@@ -254,22 +254,24 @@ Optimization
*************

1. You can further optimize the model with ``otx optimize``.
Currently, quantization jobs that include POT is supported for X3D template. MoViNet will be supported in near future.
Currently, quantization jobs that include PTQ are supported for the X3D template. MoViNet will be supported in the near future.

The optimized model will be quantized to ``INT8`` format.
Refer to :doc:`optimization explanation <../../../explanation/additional_features/models_optimization>` section for more details on model optimization.

2. Example command for optimizing
OpenVINO™ model (.xml) with OpenVINO™ POT.
OpenVINO™ model (.xml) with OpenVINO™ PTQ.

.. code-block::

(otx) ...$ otx optimize --load-weights openvino/openvino.xml \
--output pot_model
--output ptq_model

...

Performance(score: 0.6252587703095486, dashboard: (3 metric groups))

Keep in mind that POT will take some time (generally less than NNCF optimization) without logging to optimize the model.
Keep in mind that PTQ takes some time to optimize the model (generally less than NNCF optimization) and produces no log output while it runs.

3. Now you have a fully trained, optimized, and exported
ready-to-use action classification model in an efficient representation.
@@ -201,6 +201,8 @@ Optimization

1. You can further optimize the model with ``otx optimize``.
Currently, only PTQ is supported for action detection. NNCF will be supported in the near future.

The optimized model will be quantized to ``INT8`` format.
Refer to :doc:`optimization explanation <../../../explanation/additional_features/models_optimization>` section for more details on model optimization.

2. Example command for optimizing
@@ -209,7 +211,7 @@ OpenVINO™ model (.xml) with OpenVINO™ PTQ.
.. code-block::

(otx) ...$ otx optimize --load-weights openvino/openvino.xml \
--save-model-to pot_model
--save-model-to ptq_model

...

@@ -156,7 +156,7 @@ Export
******

1. ``otx export`` exports a trained Pytorch `.pth` model to the OpenVINO™ Intermediate Representation (IR) format.
It allows running the model on the Intel hardware much more efficient, especially on the CPU. Also, the resulting IR model is required to run POT optimization. IR model consists of 2 files: ``openvino.xml`` for weights and ``openvino.bin`` for architecture.
It allows running the model on Intel hardware much more efficiently, especially on the CPU. The resulting IR model is also required to run PTQ optimization. An IR model consists of two files: ``openvino.xml`` for the architecture and ``openvino.bin`` for the weights.

2. We can run the below command line to export the trained model
and save the exported model to the ``openvino`` folder:
@@ -200,18 +200,19 @@ This gives the following results:
Optimization
************

Anomaly tasks can be optimized either in POT or NNCF format. For more information refer to the :doc:`optimization explanation <../../../explanation/additional_features/models_optimization>` section.
Anomaly tasks can be optimized with either PTQ or NNCF. The model will be quantized to ``INT8`` format.
For more information, refer to the :doc:`optimization explanation <../../../explanation/additional_features/models_optimization>` section.


1. Let's start with POT
1. Let's start with PTQ
optimization.

.. code-block::

otx optimize ote_anomaly_detection_padim \
--train-data-roots datasets/MVTec/bottle/train \
--load-weights otx-workspace-ANOMALY_DETECTION/openvino/openvino.xml \
--output otx-workspace-ANOMALY_DETECTION/pot_model
--output otx-workspace-ANOMALY_DETECTION/ptq_model

This command generates the following files that can be used to run :doc:`otx demo <../demo>`:

@@ -233,7 +234,7 @@ weights to the ``optimize`` command:
--load-weights otx-workspace-ANOMALY_DETECTION/models/weights.pth \
--output otx-workspace-ANOMALY_DETECTION/nncf_model

Similar to POT optimization, it generates the following files:
Similar to PTQ optimization, it generates the following files:

- image_threshold
- pixel_threshold
@@ -177,7 +177,7 @@ Export
*********

1. ``otx export`` exports a trained Pytorch `.pth` model to the OpenVINO™ Intermediate Representation (IR) format.
It allows running the model on the Intel hardware much more efficient, especially on the CPU. Also, the resulting IR model is required to run POT optimization. IR model consists of 2 files: ``openvino.xml`` for weights and ``openvino.bin`` for architecture.
It allows running the model on Intel hardware much more efficiently, especially on the CPU. The resulting IR model is also required to run PTQ optimization. An IR model consists of two files: ``openvino.xml`` for the architecture and ``openvino.bin`` for the weights.

2. You can run the below command line to export the trained model
and save the exported model to the ``openvino_model`` folder:
@@ -212,7 +212,7 @@ Optimization
*************

1. You can further optimize the model with ``otx optimize``.
It uses NNCF or POT depending on the model format.
It uses NNCF or PTQ depending on the model and transforms it to ``INT8`` format.

Please, refer to :doc:`optimization explanation <../../../explanation/additional_features/models_optimization>` section for more details on model optimization.

@@ -235,18 +235,18 @@ a PyTorch model (`.pth`) with OpenVINO™ NNCF.
The optimization time relies on the hardware characteristics, for example on 1 NVIDIA GeForce RTX 3090 and Intel(R) Core(TM) i9-10980XE it took about 10 minutes.

3. Command example for optimizing
OpenVINO™ model (.xml) with OpenVINO™ POT.
OpenVINO™ model (.xml) with OpenVINO™ PTQ.

.. code-block::

(otx) ...$ otx optimize --load-weights openvino_model/openvino.xml \
--output pot_model
--output ptq_model

...

Performance(score: 0.9577656675749319, dashboard: (3 metric groups))

Please note, that POT will take some time (generally less than NNCF optimization) without logging to optimize the model.
Please note that PTQ takes some time to optimize the model (generally less than NNCF optimization) and produces no log output while it runs.

4. Now you have a fully trained, optimized, and exported
ready-to-use classification model in an efficient representation.
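Because an exported IR model consists of two files (``openvino.xml`` and ``openvino.bin``), it can be useful to confirm that both are present before passing the ``.xml`` path to ``--load-weights``. The following is a minimal sketch, not part of OpenVINO™ Training Extensions; the helper name ``find_ir_pair`` is hypothetical, and it assumes the ``.bin`` file sits next to the ``.xml`` file and shares its base name.

```python
from pathlib import Path


def find_ir_pair(xml_path):
    """Return the (.xml, .bin) paths of an IR model, or raise if one is missing.

    Assumption: the .bin file is located next to the .xml file and has the
    same base name, as in openvino.xml / openvino.bin.
    """
    xml_file = Path(xml_path)
    if xml_file.suffix.lower() != ".xml":
        raise ValueError(f"Expected a path to an .xml file, got '{xml_file}'")
    bin_file = xml_file.with_suffix(".bin")
    missing = [str(p) for p in (xml_file, bin_file) if not p.is_file()]
    if missing:
        raise FileNotFoundError("Missing IR file(s): " + ", ".join(missing))
    return xml_file, bin_file
```

For example, ``find_ir_pair("openvino_model/openvino.xml")`` returns both paths when the pair is complete and raises ``FileNotFoundError`` otherwise.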