
Add --output parameter to demo to save results #2005

Merged 11 commits on Apr 14, 2023
1 change: 1 addition & 0 deletions CHANGELOG.md
@@ -9,6 +9,7 @@ All notable changes to this project will be documented in this file.
- Add generating feature cli_report.log in output for otx training (<https://github.com/openvinotoolkit/training_extensions/pull/1959>)
- Support multiple python versions up to 3.10 (<https://github.com/openvinotoolkit/training_extensions/pull/1978>)
- Support export of onnx models (<https://github.com/openvinotoolkit/training_extensions/pull/1976>)
- Add option to save images after inference in OTX CLI demo together with demo in exportable code (<https://github.com/openvinotoolkit/training_extensions/pull/2005>)

### Enhancements

@@ -474,7 +474,7 @@ Demonstration
.. code-block::

(otx) ...$ otx demo --help
usage: otx demo [-h] -i INPUT --load-weights LOAD_WEIGHTS [--fit-to-size FIT_TO_SIZE FIT_TO_SIZE] [--loop] [--delay DELAY] [--display-perf] [template] {params} ...
usage: otx demo [-h] -i INPUT --load-weights LOAD_WEIGHTS [--fit-to-size FIT_TO_SIZE FIT_TO_SIZE] [--loop] [--delay DELAY] [--display-perf] [--output OUTPUT] [template] {params} ...

positional arguments:
template Enter the path or ID or name of the template file.
@@ -493,7 +493,8 @@ Demonstration
--loop Enable reading the input in a loop.
--delay DELAY Frame visualization time in ms.
--display-perf This option enables writing performance metrics on displayed frame. These metrics take into account not only model inference time, but also frame reading, pre-processing and post-processing.

--output OUTPUT
Output path to save input data with predictions.

Command example of the demonstration:

27 changes: 16 additions & 11 deletions docs/source/guide/tutorials/base/demo.rst
@@ -34,21 +34,32 @@ But if we'll provide a single image the demo processes and renders it quickly, t
(demo) ...$ otx demo --input docs/utils/images/wgisd_dataset_sample.jpg \
--load-weights outputs/weights.pth --loop

In this case, you can stop the demo by killing the process in the terminal (``Ctrl+C`` for Linux).
In this case, you can stop the demo by pressing the ``Q`` button or by killing the process in the terminal (``Ctrl+C`` for Linux).

3. In WGISD dataset we have high-resolution images,
3. If we want to pass a folder of images, it's better to specify the ``--delay`` parameter, which defines how many milliseconds to pause between showing consecutive images.
For example, ``--delay 100`` makes this pause 0.1 s (100 ms).
If you want to skip showing the resulting image and instead see the number of predictions and time spent on each image inference, specify ``--delay 0``.
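For example, to run over a folder of test images with a 0.1 s pause between frames (the folder path is a placeholder):

.. code-block::

(demo) ...$ otx demo --input <path_to_images_folder> \
--load-weights outputs/weights.pth --delay 100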


4. In the WGISD dataset we have high-resolution images,
so the ``--fit-to-size`` parameter would be quite useful. It resizes the resulting image to the specified size:

.. code-block::

(demo) ...$ otx demo --input docs/utils/images/wgisd_dataset_sample.jpg \
--load-weights outputs/weights.pth --loop --fit-to-size 800 600

4. If we want to pass an images folder, it's better to specify the delay parameter, that defines, how much millisecond pause will be held between showing the next image.
For example ``--delay 100`` will make this pause 0.1 ms.

5. To save inference results with predictions drawn on them, we can specify a folder path using ``--output``.
It works for images, videos, image folders and web cameras. To prevent issues, do not specify it together with the ``--loop`` parameter.

5. If we want to show inference speed right on images,
.. code-block::

(demo) ...$ otx demo --input docs/utils/images/wgisd_dataset_sample.jpg \
--load-weights outputs/weights.pth \
--output resulted_images

6. If we want to show inference speed right on images,
we can run the following line:

.. code-block::
@@ -57,12 +68,6 @@
--load-weights outputs/weights.pth --loop \
--fit-to-size 800 600 --display-perf

.. The result will look like this:

.. .. image:: ../../../../utils/images/wgisd_pr_sample.jpg
.. :width: 600
.. :alt: this image shows the inference results with inference time on the WGISD dataset
.. image to be generated and added

7. To run a demo on a web camera, you need to know its ID.
You can check the list of camera devices by running the command below on a Linux system:
15 changes: 12 additions & 3 deletions docs/source/guide/tutorials/base/deploy.rst
@@ -100,11 +100,20 @@ For example, the model inference on image from WGISD dataset, which we used for

If you provide a single image as input, the demo processes and renders it quickly, then exits. To continuously
visualize inference results on the screen, apply the ``loop`` option, which enforces processing a single image in a loop.
In this case, you can stop the demo by killing the process in the terminal (``Ctrl+C`` for Linux).
In this case, you can stop the demo by pressing the ``Q`` button or by killing the process in the terminal (``Ctrl+C`` for Linux).

To learn how to run the demo on Windows and MacOS, please refer to the ``outputs/deploy/python/README.md`` file in exportable code.

4. To run a demo on a web camera, we need to know its ID.
4. To save inference results with predictions drawn on them, we can specify a folder path using ``--output``.
It works for images, videos, image folders and web cameras. To prevent issues, do not specify it together with the ``--loop`` parameter.

.. code-block::

(demo) ...$ python outputs/deploy/python/demo.py --input docs/utils/images/wgisd_dataset_sample.jpg \
--models outputs/deploy/model \
--output resulted_images

5. To run a demo on a web camera, we need to know its ID.
We can check a list of camera devices by running this command line on Linux system:

.. code-block::
@@ -121,7 +130,7 @@ The output will look like this:

After that, we can use this ``/dev/video0`` as a camera ID for ``--input``.
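For example, the deployed demo can then read directly from this camera (the model path follows the earlier deployment step):

.. code-block::

(demo) ...$ python outputs/deploy/python/demo.py --input /dev/video0 \
--models outputs/deploy/model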

5. We can also change ``config.json`` that specifies the confidence threshold and
6. We can also change ``config.json`` that specifies the confidence threshold and
color for each class visualization, but any changes should be made with caution.

For example, in our image of the winery we see, that a lot of objects weren't detected.
107 changes: 69 additions & 38 deletions otx/api/usecases/exportable_code/demo/README.md
@@ -85,44 +85,75 @@ Exportable code is a .zip archive that contains simple demo to get and visualize

## Usecase

Running the `demo.py` application with the `-h` option yields the following usage message:

```bash
usage: demo.py [-h] -i INPUT -m MODELS [MODELS ...] [-it {sync,async}] [-l]
Options:
-h, --help Show this help message and exit.
-i INPUT, --input INPUT
Required. An input to process. The input must be a
single image, a folder of images, video file or camera
id.
-m MODELS [MODELS ...], --models MODELS [MODELS ...]
Required. Path to directory with trained model and
configuration file. If you provide several models you
will start the task chain pipeline with the provided
models in the order in which they were specified
-it {sync,async}, --inference_type {sync,async}
Optional. Type of inference for single model
-l, --loop Optional. Enable reading the input in a loop.
--no_show
Optional. If this flag is specified, the demo
won't show the inference results on UI.
```

As a model, you can use path to model directory from generated zip. So you can use the following command to do inference with a pre-trained model:

```bash
python3 demo.py \
-i <path_to_video>/inputVideo.mp4 \
-m <path_to_model_directory>
```

You can press `Q` to stop inference during demo running.

> **NOTE**: If you provide a single image as an input, the demo processes and renders it quickly, then exits. To continuously
> visualize inference results on the screen, apply the `loop` option, which enforces processing a single image in a loop.
>
> **NOTE**: Default configuration contains info about pre- and post processing for inference and is guaranteed to be correct.
> Also you can change `config.json` that specifies needed parameters, but any changes should be made with caution.
1. Running the `demo.py` application with the `-h` option yields the following usage message:

```bash
usage: demo.py [-h] -i INPUT -m MODELS [MODELS ...] [-it {sync,async}] [-l] [--no_show] [-d {CPU,GPU}] [--output OUTPUT]

Options:
-h, --help Show this help message and exit.
-i INPUT, --input INPUT
Required. An input to process. The input must be a single image, a folder of images, video file or camera id.
-m MODELS [MODELS ...], --models MODELS [MODELS ...]
Required. Path to directory with trained model and configuration file. If you provide several models you will start the task chain pipeline with the provided models in the order in
which they were specified.
-it {sync,async}, --inference_type {sync,async}
Optional. Type of inference for single model.
-l, --loop Optional. Enable reading the input in a loop.
--no_show Optional. Disables showing inference results on UI.
-d {CPU,GPU}, --device {CPU,GPU}
Optional. Device to infer the model.
--output OUTPUT Optional. Output path to save input data with predictions.
```

2. As a `model`, you can use the path to the model directory from the generated zip. As an `input`, you can pass a single image, a folder of images, a video file, or a web camera ID. For example, to run inference with a pre-trained model:

```bash
python3 demo.py \
-i <path_to_video>/inputVideo.mp4 \
-m <path_to_model_directory>
```

You can press `Q` to stop inference while the demo is running.

> **NOTE**: If you provide a single image as input, the demo processes and renders it quickly, then exits. To continuously
> visualize inference results on the screen, apply the `--loop` option, which enforces processing a single image in a loop.
> In this case, you can stop the demo by pressing the `Q` button or by killing the process in the terminal (`Ctrl+C` for Linux).
>
> **NOTE**: The default configuration contains information about pre- and post-processing for inference and is guaranteed to be correct.
> You can also change `config.json`, which specifies the confidence threshold and the visualization color for each class, but any
> changes should be made with caution.


3. To save inference results with predictions drawn on them, you can specify a folder path using `--output`.
It works for images, videos, image folders and web cameras. To prevent issues, do not specify it together with the `--loop` parameter.

```bash
python3 demo.py \
--input <path_to_image>/inputImage.jpg \
--models ../model \
--output resulted_images
```

4. To run a demo on a web camera, you need to know its ID.
You can check the list of camera devices by running this command on a Linux system:

```bash
sudo apt-get install v4l-utils
v4l2-ctl --list-devices
```

The output will look like this:

```bash
Integrated Camera (usb-0000:00:1a.0-1.6):
/dev/video0
```

After that, you can use this `/dev/video0` as a camera ID for `--input`.
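For example, to run the demo on this camera (the model path placeholder is the same as in step 2):

```bash
python3 demo.py \
-i /dev/video0 \
-m <path_to_model_directory>
```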

## Troubleshooting

12 changes: 11 additions & 1 deletion otx/api/usecases/exportable_code/demo/demo.py
@@ -74,6 +74,12 @@ def build_argparser():
default="CPU",
type=str,
)
args.add_argument(
"--output",
default=None,
type=str,
help="Optional. Output path to save input data with predictions.",
)

return parser

@@ -96,6 +102,10 @@ def get_inferencer_class(type_inference, models):
def main():
"""Main function that is used to run demo."""
args = build_argparser().parse_args()

if args.loop and args.output:
raise ValueError("--loop and --output cannot be both specified")

# create models
models = []
for model_dir in args.models:
@@ -105,7 +115,7 @@ def main():
inferencer = get_inferencer_class(args.inference_type, models)

# create visualizer
visualizer = create_visualizer(models[-1].task_type, no_show=args.no_show)
visualizer = create_visualizer(models[-1].task_type, no_show=args.no_show, output=args.output)

if len(models) == 1:
models = models[0]
@@ -16,6 +16,7 @@
)
from otx.api.usecases.exportable_code.streamer import get_streamer
from otx.api.usecases.exportable_code.visualizers import Visualizer
from otx.cli.tools.utils.demo.visualization import dump_frames


class AsyncExecutor:
@@ -38,13 +39,16 @@ def run(self, input_stream: Union[int, str], loop: bool = False) -> None:
next_frame_id = 0
next_frame_id_to_show = 0
stop_visualization = False
saved_frames = []

for frame in streamer:
results = self.async_pipeline.get_result(next_frame_id_to_show)
while results:
output = self.render_result(results)
next_frame_id_to_show += 1
self.visualizer.show(output)
if self.visualizer.output:
saved_frames.append(frame)
if self.visualizer.is_quit():
stop_visualization = True
results = self.async_pipeline.get_result(next_frame_id_to_show)
@@ -57,6 +61,7 @@ def run(self, input_stream: Union[int, str], loop: bool = False) -> None:
results = self.async_pipeline.get_result(next_frame_id_to_show)
output = self.render_result(results)
self.visualizer.show(output)
dump_frames(saved_frames, self.visualizer.output, input_stream, streamer)

def render_result(self, results: Tuple[Any, dict]) -> np.ndarray:
"""Render for results of inference."""
@@ -22,6 +22,7 @@
from otx.api.usecases.exportable_code.streamer import get_streamer
from otx.api.usecases.exportable_code.visualizers import Visualizer
from otx.api.utils.shape_factory import ShapeFactory
from otx.cli.tools.utils.demo.visualization import dump_frames


class ChainExecutor:
@@ -78,11 +79,16 @@ def crop(
def run(self, input_stream: Union[int, str], loop: bool = False) -> None:
"""Run demo using input stream (image, video stream, camera)."""
streamer = get_streamer(input_stream, loop)
saved_frames = []

for frame in streamer:
# getting result for single image
annotation_scene = self.single_run(frame)
output = self.visualizer.draw(frame, annotation_scene, {})
self.visualizer.show(output)
if self.visualizer.output:
saved_frames.append(frame)
if self.visualizer.is_quit():
break

dump_frames(saved_frames, self.visualizer.output, input_stream, streamer)
@@ -13,6 +13,7 @@
)
from otx.api.usecases.exportable_code.streamer import get_streamer
from otx.api.usecases.exportable_code.visualizers import Visualizer
from otx.cli.tools.utils.demo.visualization import dump_frames


class SyncExecutor:
@@ -31,12 +32,17 @@ def __init__(self, model: ModelContainer, visualizer: Visualizer) -> None:
def run(self, input_stream: Union[int, str], loop: bool = False) -> None:
"""Run demo using input stream (image, video stream, camera)."""
streamer = get_streamer(input_stream, loop)
saved_frames = []

for frame in streamer:
# getting result include preprocessing, infer, postprocessing for sync infer
predictions, frame_meta = self.model(frame)
annotation_scene = self.converter.convert_to_annotation(predictions, frame_meta)
output = self.visualizer.draw(frame, annotation_scene, frame_meta)
self.visualizer.show(output)
if self.visualizer.output:
saved_frames.append(frame)
if self.visualizer.is_quit():
break

dump_frames(saved_frames, self.visualizer.output, input_stream, streamer)
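All three executors accumulate the frames they display into `saved_frames` and hand them to `dump_frames` at the end of `run()`. That helper comes from `otx.cli.tools.utils.demo.visualization` and is not shown in this diff; below is a minimal sketch of what it plausibly does, assuming OpenCV and the `MediaType` enum from the streamer module (the function body and the `MediaType` import are assumptions, not the actual implementation):

```python
# Hypothetical sketch of dump_frames(); the real helper lives in
# otx/cli/tools/utils/demo/visualization.py and may differ in detail.
from pathlib import Path
from typing import List, Union

import cv2
import numpy as np

from otx.api.usecases.exportable_code.streamer import MediaType  # assumed to be exported


def dump_frames(
    saved_frames: List[np.ndarray],
    output: str,
    input_stream: Union[int, str],
    streamer,
) -> None:
    """Save collected frames to `output`: one video for video input, images otherwise."""
    if not saved_frames:
        return  # nothing collected, e.g. --output was not given

    output_dir = Path(output)
    output_dir.mkdir(parents=True, exist_ok=True)

    if streamer.get_type() == MediaType.VIDEO:
        # Reuse the source frame rate exposed by the new VideoStreamer.fps() method.
        height, width = saved_frames[0].shape[:2]
        writer = cv2.VideoWriter(
            str(output_dir / Path(str(input_stream)).name),
            cv2.VideoWriter_fourcc(*"mp4v"),
            streamer.fps(),
            (width, height),
        )
        for frame in saved_frames:
            writer.write(frame)
        writer.release()
    else:
        # Images, image folders and camera input are dumped frame by frame.
        for idx, frame in enumerate(saved_frames):
            cv2.imwrite(str(output_dir / f"output_{idx}.jpeg"), frame)
```

Note that the executors collect the raw `frame` rather than the rendered `output`, so the real helper presumably re-draws predictions onto the frames before saving them.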
4 changes: 2 additions & 2 deletions otx/api/usecases/exportable_code/demo/demo_package/utils.py
@@ -47,9 +47,9 @@ def create_output_converter(task_type: TaskType, labels: LabelSchemaEntity):
return create_converter(converter_type, labels)


def create_visualizer(_task_type: TaskType, no_show: bool = False):
def create_visualizer(_task_type: TaskType, no_show: bool = False, output: Optional[str] = None):
"""Create visualizer according to kind of task."""

# TODO: use anomaly-specific visualizer for anomaly tasks

return Visualizer(window_name="Result", no_show=no_show)
return Visualizer(window_name="Result", no_show=no_show, output=output)
2 changes: 1 addition & 1 deletion otx/api/usecases/exportable_code/demo/requirements.txt
@@ -1,4 +1,4 @@
openvino==2022.3.0
openmodelzoo-modelapi==2022.3.0
otx @ git+https://github.com/openvinotoolkit/training_extensions/@dd03235da2319815227f1b75bce298ee6e8b0f31#egg=otx
otx @ git+https://github.com/openvinotoolkit/training_extensions/@8c11c3d42c726e6e0eda7364f00cf8ed4dbdc2e9#egg=otx
numpy>=1.21.0,<=1.23.5 # np.bool was removed in 1.24.0 which was used in openvino runtime
4 changes: 4 additions & 0 deletions otx/api/usecases/exportable_code/streamer/streamer.py
@@ -164,6 +164,10 @@ def __iter__(self) -> Iterator[np.ndarray]:
else:
break

def fps(self):
"""Returns a frequency of getting images from source."""
return self.cap.get(cv2.CAP_PROP_FPS)

def get_type(self) -> MediaType:
"""Returns the type of media."""
return MediaType.VIDEO
2 changes: 2 additions & 0 deletions otx/api/usecases/exportable_code/visualizers/visualizer.py
@@ -66,6 +66,7 @@ def __init__(
is_one_label: bool = False,
no_show: bool = False,
delay: Optional[int] = None,
output: Optional[str] = None,
) -> None:
self.window_name = "Window" if window_name is None else window_name
self.shape_drawer = ShapeDrawer(show_count, is_one_label)
@@ -74,6 +75,7 @@
self.no_show = no_show
if delay is None:
self.delay = 1
self.output = output

def draw(
self,