Add explanation for XAI & minor doc fixes #1923

Merged
1 change: 0 additions & 1 deletion .github/workflows/daily.yml
@@ -10,7 +10,6 @@ jobs:
  Daily-Tests:
    runs-on: [self-hosted, linux, x64, dev]
    timeout-minutes: 1440
-    if: github.ref == 'refs/heads/develop'
    steps:
      - name: Checkout repository
        uses: actions/checkout@v3
@@ -9,3 +9,4 @@ Additional Features
   models_optimization
   hpo
   auto_configuration
+   xai
83 changes: 83 additions & 0 deletions docs/source/guide/explanation/additional_features/xai.rst
@@ -0,0 +1,83 @@
Explainable AI (XAI)
====================

**Explainable AI (XAI)** is a field of research that aims to make machine learning models more transparent and interpretable to humans.
The goal is to help users understand how and why AI systems make decisions and to provide insight into their inner workings. It allows us to detect, analyze, and prevent common mistakes, such as a lack of data diversity for certain objects.
XAI can help build trust in AI, ensure that a model is safe to deploy, and increase its adoption in various domains.

Most XAI tools generate a **saliency map** as part of the process. A saliency map is a visual representation, suitable for human comprehension, that highlights the parts of the image the network focused on the most.
It looks like a heatmap, where warm colors indicate the areas of the model's main focus.
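Producing such an overlay is straightforward. Below is a minimal sketch, not the OpenVINO™ Training Extensions implementation, of how a raw saliency map can be rendered as a heatmap with OpenCV; ``image`` and ``saliency`` are hypothetical inputs.

.. code-block:: python

    import cv2
    import numpy as np


    def overlay_saliency(image: np.ndarray, saliency: np.ndarray) -> np.ndarray:
        """Blend a single-channel saliency map over a BGR image as a heatmap."""
        # Resize the map to the image resolution and normalize it to [0, 255].
        saliency = cv2.resize(saliency.astype(np.float32), (image.shape[1], image.shape[0]))
        saliency = 255 * (saliency - saliency.min()) / (saliency.max() - saliency.min() + 1e-12)
        # Warm (red) colors correspond to the areas of highest model attention.
        heatmap = cv2.applyColorMap(saliency.astype(np.uint8), cv2.COLORMAP_JET)
        return cv2.addWeighted(image, 0.5, heatmap, 0.5, 0.0)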


.. image:: ../../../../utils/images/xai_example.jpg
   :width: 600
   :alt: this image shows the result of XAI algorithm


We can generate saliency maps for a model trained in OpenVINO™ Training Extensions using the ``otx explain`` command. Learn more about its usage in the :doc:`../../tutorials/base/explain` tutorial.

*************************
Classification algorithms
*************************

.. image:: ../../../../utils/images/xai_cls.jpg
   :width: 600
   :alt: this image shows the comparison of XAI classification algorithms


For classification networks, the following algorithms are used to generate saliency maps:

- **Activation Map** - the most basic and naive approach. It takes the outputs of the model's feature extractor (backbone) and averages them over the channel dimension. The result depends heavily on the backbone and ignores the neck and head computations, but it is fast and usually reasonably good (see the sketch after this list).

- `Eigen-CAM <https://arxiv.org/abs/2008.00299>`_ uses Principal Component Analysis (PCA). It returns the first principal component of the feature extractor output, which most of the time corresponds to the dominant object. Like the Activation Map, the result depends heavily on the backbone and ignores the neck and head computations.

- `Recipro-CAM <https://arxiv.org/pdf/2209.14074>`_ uses Class Activation Mapping (CAM) to weight the activation map for each class, so it can generate a different saliency map per class. Recipro-CAM is a fast, gradient-free Reciprocal CAM method: it spatially masks the extracted feature maps to exploit the correlation between activation maps and network predictions for the target classes.
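To make the differences concrete, here is a schematic sketch of all three methods. It is not the actual OpenVINO™ Training Extensions implementation: ``features`` (a ``C x H x W`` array taken from the backbone) and ``head`` (a callable that maps a feature map to per-class scores) are hypothetical placeholders.

.. code-block:: python

    import numpy as np


    def activation_map(features: np.ndarray) -> np.ndarray:
        """Activation Map: average the backbone output over the channel dimension."""
        saliency = features.mean(axis=0)  # (H, W)
        return (saliency - saliency.min()) / (saliency.max() - saliency.min() + 1e-12)


    def eigen_cam(features: np.ndarray) -> np.ndarray:
        """Eigen-CAM: project the activations onto their first principal component."""
        c, h, w = features.shape
        flat = features.reshape(c, h * w).T                  # (H*W, C)
        flat = flat - flat.mean(axis=0, keepdims=True)
        _, _, vt = np.linalg.svd(flat, full_matrices=False)  # vt[0] is the 1st component
        saliency = np.maximum(flat @ vt[0], 0.0).reshape(h, w)
        return saliency / (saliency.max() + 1e-12)


    def recipro_cam(features: np.ndarray, head) -> np.ndarray:
        """Recipro-CAM: mask the feature map down to one spatial cell at a time and
        re-infer the head, so the class scores become per-class saliency values."""
        c, h, w = features.shape
        num_classes = head(features).shape[-1]
        saliency = np.zeros((num_classes, h, w))
        for y in range(h):
            for x in range(w):
                masked = np.zeros_like(features)
                masked[:, y, x] = features[:, y, x]          # keep a single location
                saliency[:, y, x] = head(masked)             # H*W head re-inferences
        return saliency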


The table below compares the described algorithms:

+-------------------------------------------+----------------+----------------+-------------------------------------------------------------------------+
| Classification algorithm                  | Activation Map | Eigen-CAM      | Recipro-CAM                                                             |
+===========================================+================+================+=========================================================================+
| Need access to model internal state | Yes | Yes | Yes |
+-------------------------------------------+----------------+----------------+-------------------------------------------------------------------------+
| Gradient-free | Yes | Yes | Yes |
+-------------------------------------------+----------------+----------------+-------------------------------------------------------------------------+
| Single-shot                               | Yes            | Yes            | No (re-infers neck + head H*W times, where H*W is the feature map size) |
+-------------------------------------------+----------------+----------------+-------------------------------------------------------------------------+
| Class discrimination | No | No | Yes |
+-------------------------------------------+----------------+----------------+-------------------------------------------------------------------------+
| Execution speed | Fast | Fast | Medium |
+-------------------------------------------+----------------+----------------+-------------------------------------------------------------------------+


*************************
Detection algorithms
*************************

To generate a saliency map for the detection task, we use the **DetClassProbabilityMap** algorithm.
It is a naive approach for detection: it takes the raw classification head output and uses the per-class probability maps to calculate a region of interest for each class, so it creates a different saliency map for each class.
For now, this algorithm is implemented for single-stage detectors only.
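The following is a hedged sketch of the idea, assuming the raw classification head output is a ``(num_anchors * num_classes, H, W)`` array of logits; the names are illustrative rather than the actual OpenVINO™ Training Extensions API.

.. code-block:: python

    import numpy as np


    def det_class_probability_map(cls_scores: np.ndarray, num_classes: int) -> np.ndarray:
        """Turn raw single-stage-detector classification logits into per-class saliency."""
        ac, h, w = cls_scores.shape
        num_anchors = ac // num_classes
        probs = 1.0 / (1.0 + np.exp(-cls_scores))  # sigmoid over the logits
        probs = probs.reshape(num_anchors, num_classes, h, w)
        # For each spatial cell, keep the most confident anchor per class.
        return probs.max(axis=0)                   # (num_classes, H, W)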

.. image:: ../../../../utils/images/xai_det.jpg
   :width: 600
   :alt: this image shows the detailed description of XAI detection algorithm


The main limitation of this method is that, due to the design of the training loss, activation values drift towards the center of the object, which makes it hard to obtain clear explanations in areas near the image edges.

+-------------------------------------------+-------------------------------------------------------------------------+
| Detection algorithm | DetClassProbabilityMap |
+===========================================+=========================================================================+
| Need access to model internal state | Yes |
+-------------------------------------------+-------------------------------------------------------------------------+
| Gradient-free | Yes |
+-------------------------------------------+-------------------------------------------------------------------------+
| Single-shot | Yes |
+-------------------------------------------+-------------------------------------------------------------------------+
| Class discrimination | No |
+-------------------------------------------+-------------------------------------------------------------------------+
| Box discrimination | No |
+-------------------------------------------+-------------------------------------------------------------------------+
| Execution speed | Fast |
+-------------------------------------------+-------------------------------------------------------------------------+
@@ -399,7 +399,7 @@ The command below will evaluate the trained model on the provided dataset:
Explanation
***********

-``otx explain`` runs the explanation algorithm of a model on the specific dataset. It helps explain the model's decision-making process in a way that is easily understood by humans.
+``otx explain`` runs the explainable AI (XAI) algorithm of a model on the specific dataset. It helps explain the model's decision-making process in a way that is easily understood by humans.

With the ``--help`` command, you can list additional information, such as its parameters common to all model templates:

2 changes: 1 addition & 1 deletion docs/source/guide/tutorials/advanced/self_sl.rst
@@ -21,7 +21,7 @@ The process has been tested on the following configuration:
Setup virtual environment
*************************

-1. You can follow the installation process from a :doc:`quick start guide <../../../get_started/quick_start_guide/installation>`
+1. You can follow the installation process from a :doc:`quick start guide <../../get_started/quick_start_guide/installation>`
to create a universal virtual environment for OpenVINO™ Training Extensions.

2. Activate your virtual
4 changes: 2 additions & 2 deletions docs/source/guide/tutorials/advanced/semi_sl.rst
@@ -44,7 +44,7 @@ This tutorial explains how to train a model in semi-supervised learning mode and
Setup virtual environment
*************************

-1. You can follow the installation process from a :doc:`quick start guide <../../../get_started/quick_start_guide/installation>`
+1. You can follow the installation process from a :doc:`quick start guide <../../get_started/quick_start_guide/installation>`
to create a universal virtual environment for OpenVINO™ Training Extensions.

2. Activate your virtual
@@ -128,7 +128,7 @@ Enable via ``otx train``
***************************

1. To enable semi-supervised learning directly via ``otx train``, we need to add arguments ``--unlabeled-data-roots`` and ``--algo_backend.train_type``
-which is one of template-specific parameters (details are provided in `quick start guide <../../get_started/quick_start_guide/cli_commands.html#training>`__.)
+which is one of template-specific parameters (details are provided in `quick start guide <../../get_started/quick_start_guide/cli_commands.html#training>`__).

.. code-block::

6 changes: 3 additions & 3 deletions docs/source/guide/tutorials/base/demo.rst
@@ -8,7 +8,7 @@ It allows you to apply the model on the custom data or the online footage from a

This tutorial uses an object detection model for example, however for other tasks the functionality remains the same - you just need to replace the input dataset with your own.

-For visualization you use images from WGISD dataset from the :doc: `object detection tutorial <how_to_train/detection>`.
+For visualization you use images from WGISD dataset from the :doc:`object detection tutorial <how_to_train/detection>`.

1. Activate the virtual environment
created in the previous step.
@@ -69,8 +69,8 @@ You can check a list of camera devices by running the command line below on Linux

.. code-block::

-    sudo apt-get install v4l-utils
-    v4l2-ctl --list-devices
+    (demo) ...$ sudo apt-get install v4l-utils
+    (demo) ...$ v4l2-ctl --list-devices

The output will look like this:

20 changes: 18 additions & 2 deletions docs/source/guide/tutorials/base/explain.rst
@@ -26,9 +26,25 @@ at the path specified by ``--save-explanation-to``.

.. code-block::

-    otx explain --explain-data-roots otx-workspace-DETECTION/splitted_dataset/val/ --save-explanation-to outputs/explanation --load-weights outputs/weights.pth
+    otx explain --explain-data-roots otx-workspace-DETECTION/splitted_dataset/val/ \
+                --save-explanation-to outputs/explanation \
+                --load-weights outputs/weights.pth

-3. As a result we will get a folder with a pair of generated
3. To specify the algorithm used to create the saliency maps for classification,
we can set the ``--explain-algorithm`` parameter:

- ``activationmap`` - the Activation Map classification algorithm
- ``eigencam`` - the Eigen-CAM classification algorithm
- ``classwisesaliencymap`` - the Recipro-CAM classification algorithm; this is the default

For the detection task, only ``classwisesaliencymap`` is supported, so there is no need to specify it.
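For instance, assuming a classification workspace (the paths below are illustrative, following the pattern of the earlier example), Eigen-CAM saliency maps could be requested like this:

.. code-block::

    otx explain --explain-data-roots otx-workspace-CLASSIFICATION/splitted_dataset/val/ \
                --save-explanation-to outputs/explanation \
                --load-weights outputs/weights.pth \
                --explain-algorithm eigencam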

.. note::

    Learn more about Explainable AI and its algorithms in the :doc:`XAI explanation section <../../explanation/additional_features/xai>`


4. As a result we will get a folder with a pair of generated
images for each image in ``--explain-data-roots``:

- a saliency map, where red color indicates areas of higher model attention
@@ -56,6 +56,7 @@ with the following command:
    cd ..

|

.. image:: ../../../../../utils/images/flowers_example.jpg
   :width: 600

@@ -120,7 +121,7 @@ Let's prepare an OpenVINO™ Training Extensions classification workspace running

(otx) ...$ cd ./otx-workspace-CLASSIFICATION

-It will create **otx-workspace-CLASSIFICATION** with all necessery configs for MobileNet-V3-large-1x, prepared ``data.yaml`` to simplify CLI commands launch and splitted dataset named ``splitted_dataset``.
+It will create **otx-workspace-CLASSIFICATION** with all necessary configs for MobileNet-V3-large-1x, prepared ``data.yaml`` to simplify CLI commands launch and splitted dataset named ``splitted_dataset``.

3. To start training you need to call ``otx train``
command in our workspace:
1 change: 1 addition & 0 deletions docs/source/guide/tutorials/index.rst
@@ -6,6 +6,7 @@ This section reveals how to use ``CLI``, both base and advanced features.
It provides the end-to-end solution from installation to model deployment and demo visualization on specific example for each of the supported tasks.

.. toctree::
+   :titlesonly:
   :maxdepth: 3

   base/index
Binary file added docs/utils/images/xai_cls.jpg
Binary file added docs/utils/images/xai_det.jpg
Binary file added docs/utils/images/xai_example.jpg
@@ -1,5 +1,5 @@
# Description.
-model_template_id: Custom_Action_Classificaiton_MoViNet
+model_template_id: Custom_Action_Classification_MoViNet
name: MoViNet
task_type: ACTION_CLASSIFICATION
task_family: VISION
@@ -1,5 +1,5 @@
# Description.
-model_template_id: Custom_Action_Classificaiton_X3D
+model_template_id: Custom_Action_Classification_X3D
name: X3D
task_type: ACTION_CLASSIFICATION
task_family: VISION
@@ -172,18 +172,16 @@ def loss_single(
         pos_centerness = centerness[pos_inds]

         centerness_targets = self.centerness_target(pos_anchors, pos_bbox_targets)
-        pos_decode_bbox_pred = self.bbox_coder.decode(pos_anchors, pos_bbox_pred)
-        pos_decode_bbox_targets = self.bbox_coder.decode(pos_anchors, pos_bbox_targets)
+        if self.reg_decoded_bbox:
+            pos_bbox_pred = self.bbox_coder.decode(pos_anchors, pos_bbox_pred)

         if self.use_qfl:
-            quality[pos_inds] = bbox_overlaps(
-                pos_decode_bbox_pred.detach(), pos_decode_bbox_targets, is_aligned=True
-            ).clamp(min=1e-6)
+            quality[pos_inds] = bbox_overlaps(pos_bbox_pred.detach(), pos_bbox_targets, is_aligned=True).clamp(
+                min=1e-6
+            )

         # regression loss
-        loss_bbox = self.loss_bbox(
-            pos_decode_bbox_pred, pos_decode_bbox_targets, weight=centerness_targets, avg_factor=1.0
-        )
+        loss_bbox = self.loss_bbox(pos_bbox_pred, pos_bbox_targets, weight=centerness_targets, avg_factor=1.0)

         # centerness loss
         loss_centerness = self.loss_centerness(pos_centerness, centerness_targets, avg_factor=num_total_samples)
2 changes: 1 addition & 1 deletion otx/cli/manager/config_manager.py
@@ -26,7 +26,7 @@
"INSTANCE_SEGMENTATION": "Custom_Counting_Instance_Segmentation_MaskRCNN_ResNet50",
"ROTATED_DETECTION": "Custom_Rotated_Detection_via_Instance_Segmentation_MaskRCNN_ResNet50",
"SEGMENTATION": "Custom_Semantic_Segmentation_Lite-HRNet-18-mod2_OCR",
"ACTION_CLASSIFICATION": "Custom_Action_Classificaiton_X3D",
"ACTION_CLASSIFICATION": "Custom_Action_Classification_X3D",
"ACTION_DETECTION": "Custom_Action_Detection_X3D_FAST_RCNN",
"ANOMALY_CLASSIFICATION": "ote_anomaly_classification_padim",
"ANOMALY_DETECTION": "ote_anomaly_detection_padim",