[DOCS] ARM CPU plugin docs #10885
# Arm® CPU device {#openvino_docs_OV_UG_supported_plugins_ARM_CPU}

## Introducing the Arm® CPU Plugin
The Arm® CPU plugin enables inference of deep neural networks on Arm® CPUs, using [Compute Library](https://github.com/ARM-software/ComputeLibrary) as a backend.

> **NOTE**: This is a community-level add-on to OpenVINO™. Intel® welcomes community participation in the OpenVINO™ ecosystem; technical questions on community forums as well as code contributions are welcome. However, this component has not undergone full release validation or qualification from Intel®, and no official support is offered.

The Arm® CPU plugin is not a part of the Intel® Distribution of OpenVINO™ toolkit and is not distributed in pre-built form. To use the plugin, build it from source code, following the procedure described in [How to build Arm® CPU plugin](https://github.com/openvinotoolkit/openvino_contrib/wiki/How-to-build-ARM-CPU-plugin).

The set of supported layers is defined in the [Operation set specification](https://github.com/openvinotoolkit/openvino_contrib/wiki/ARM-plugin-operation-set-specification).

## Supported inference data types
The Arm® CPU plugin supports the following data types as inference precision of internal primitives:

- Floating-point data types:
  - f32
  - f16
- Quantized data types:
  - i8

> **NOTE**: i8 support is experimental.

The [Hello Query Device C++ Sample](../../../samples/cpp/hello_query_device/README.md) can be used to print out the supported data types for all detected devices.

## Supported features

### Preprocessing acceleration
The Arm® CPU plugin supports the following accelerated preprocessing operations:
- Precision conversion:
  - u8 -> u16, s16, s32
  - u16 -> u8, u32
  - s16 -> u8, s32
  - f16 -> f32
- Transposition of tensors with dims < 5
- Interpolation of 4D tensors with no padding (`pads_begin` and `pads_end` equal 0)

The Arm® CPU plugin also supports the following preprocessing operations, but they are not accelerated:
- Precision conversions not mentioned above
- Color conversion:
  - NV12 to RGB
  - NV12 to BGR
  - I420 to RGB
  - I420 to BGR
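For illustration, the NV12 to RGB conversion above can be sketched in NumPy. The BT.601 video-range coefficients below are an assumption, the widely used default for this conversion; the plugin's exact coefficients and rounding may differ:

```python
import numpy as np

def nv12_to_rgb(y_plane, uv_plane):
    """Convert NV12 (full-resolution Y plane plus half-resolution
    interleaved UV plane) to an RGB image, using BT.601 video-range math."""
    y = y_plane.astype(np.float32)
    # De-interleave U and V, then upsample them to full resolution.
    u = uv_plane[:, 0::2].repeat(2, axis=0).repeat(2, axis=1).astype(np.float32)
    v = uv_plane[:, 1::2].repeat(2, axis=0).repeat(2, axis=1).astype(np.float32)

    c = 1.164 * (y - 16.0)
    r = c + 1.596 * (v - 128.0)
    g = c - 0.813 * (v - 128.0) - 0.391 * (u - 128.0)
    b = c + 2.018 * (u - 128.0)
    return np.clip(np.stack([r, g, b], axis=-1), 0, 255).astype(np.uint8)

# A mid-gray NV12 frame (Y=128, U=V=128) maps to roughly (130, 130, 130).
rgb = nv12_to_rgb(np.full((2, 2), 128, np.uint8), np.full((1, 2), 128, np.uint8))
```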

See the [preprocessing API guide](../preprocessing_overview.md) for more details.

## Supported properties
The plugin supports the properties listed below.

### Read-write properties
All parameters must be set before calling `ov::Core::compile_model()` in order to take effect, or passed as an additional argument to `ov::Core::compile_model()`.

- ov::enable_profiling

### Read-only properties
- ov::supported_properties
- ov::available_devices
- ov::range_for_async_infer_requests
- ov::range_for_streams
- ov::device::full_name
- ov::device::capabilities

## Known Layers Limitation
* `AvgPool` layer is supported via the arm_compute library for 4D input tensors and via a reference implementation in other cases.
* `BatchToSpace` layer supports 4D tensors only, with constant nodes: `block_shape` with `N` = 1 and `C` = 1, and `crops_begin` and `crops_end` with zero values.
* `ConvertLike` layer supports the same configurations as `Convert`.
* `DepthToSpace` layer supports 4D tensors only, and only the `BLOCKS_FIRST` value of the `mode` attribute.
* `Equal` does not support `broadcast` for inputs.
* `Gather` layer supports constant scalar or 1D indices and axes only. The layer is supported via the arm_compute library for non-negative indices and via a reference implementation otherwise.
* `Less` does not support `broadcast` for inputs.
* `LessEqual` does not support `broadcast` for inputs.
* `LRN` layer supports `axes = {1}` or `axes = {2, 3}` only.
* `MaxPool-1` layer is supported via the arm_compute library for 4D input tensors and via a reference implementation in other cases.
* `Mod` layer is supported for f32 only.
* `MVN` layer is supported via the arm_compute library for 2D inputs with `normalize_variance` and `across_channels` both set to `false`; in other cases the layer is implemented via a runtime reference.
* `Normalize` layer is supported via the arm_compute library with the `MAX` value of `eps_mode` and `axes = {2 | 3}`; for the `ADD` value of `eps_mode` the layer uses `DecomposeNormalizeL2Add`; in other cases the layer is implemented via a runtime reference.
* `NotEqual` does not support `broadcast` for inputs.
* `Pad` layer works with `pad_mode = {REFLECT | CONSTANT | SYMMETRIC}` parameters only.
* `Round` layer is supported via the arm_compute library with the `RoundMode::HALF_AWAY_FROM_ZERO` value of `mode`; in other cases the layer is implemented via a runtime reference.
* `SpaceToBatch` layer supports 4D tensors only, with constant nodes: `shapes`, `pads_begin` or `pads_end` with zero paddings for batch or channels, and `shapes` values of one for batch and channels.
* `SpaceToDepth` layer supports 4D tensors only, and only the `BLOCKS_FIRST` value of the `mode` attribute.
* `StridedSlice` layer is supported via the arm_compute library for tensors with dims < 5 and zero values of `ellipsis_mask`, or zero values of `new_axis_mask` and `shrink_axis_mask`; in other cases the layer is implemented via a runtime reference.
* `FakeQuantize` layer is supported via the arm_compute library in Low Precision evaluation mode for suitable models, and via a runtime reference otherwise.

## See Also
* [How to run YOLOv4 model inference using OpenVINO™ and OpenCV on Arm®](https://opencv.org/how-to-run-yolov4-using-openvino-and-opencv-on-arm/)
* [Face recognition on Android™ using OpenVINO™ toolkit with Arm® plugin](https://opencv.org/face-recognition-on-android-using-openvino-toolkit-with-arm-plugin/)