Add int4 MVE support for FC (#134)
Unit tests are updated as well.
mansnils authored May 6, 2024
1 parent 429fb5c commit 01dee38
Showing 73 changed files with 1,368 additions and 396 deletions.
7 changes: 2 additions & 5 deletions Include/arm_nnsupportfunctions.h
@@ -22,7 +22,7 @@
* Description: Public header file of support functions for CMSIS NN Library
*
* $Date: 30 April 2024
-* $Revision: V.21.1.0
+* $Revision: V.22.0.0
*
* Target : Arm(R) M-Profile Architecture
* -------------------------------------------------------------------- */
@@ -596,8 +596,6 @@ arm_cmsis_nn_status arm_nn_mat_mult_nt_t_s8_s32(const int8_t *lhs,
* @param[in] rhs_rows Number of rows in the right-hand side input matrix
* @param[in] activation_min Minimum value to clamp the output to. Range: int8
* @param[in] activation_max Maximum value to clamp the output to. Range: int8
-* @param[in] address_offset Memory position offset for dst. First output is stored at 'dst', the
-* second at 'dst + address_offset' and so on. Default value is typically 1.
*
* @return The function returns <code>ARM_CMSIS_NN_SUCCESS</code>
*
@@ -613,8 +611,7 @@ arm_cmsis_nn_status arm_nn_vec_mat_mult_t_s4(const int8_t *lhs,
const int32_t rhs_cols,
const int32_t rhs_rows,
const int32_t activation_min,
-const int32_t activation_max,
-const int32_t address_offset);
+const int32_t activation_max);

/**
* @brief s8 Vector by Matrix (transposed) multiplication
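The removed `address_offset` parameter described a strided store: the first output went to `dst`, the second to `dst + address_offset`, and so on. The sketch below illustrates that description and the contiguous store that presumably remains once the parameter is gone. `store_strided` and `store_contiguous` are hypothetical helper names used only for illustration; they are not CMSIS-NN functions, and the real kernel's internals are not shown in this diff.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: mimics the documented behaviour of the removed
 * address_offset parameter. Result i is written to dst[i * address_offset]. */
static void store_strided(int8_t *dst, const int8_t *results, int32_t count, int32_t address_offset)
{
    assert(address_offset >= 1);
    for (int32_t i = 0; i < count; i++)
    {
        dst[i * address_offset] = results[i];
    }
}

/* Hypothetical helper: contiguous store, equivalent to address_offset == 1. */
static void store_contiguous(int8_t *dst, const int8_t *results, int32_t count)
{
    for (int32_t i = 0; i < count; i++)
    {
        dst[i] = results[i];
    }
}
```

With `address_offset` equal to 1, which is the value the old call site passed (`1L`), both helpers write the same bytes, so dropping the parameter should not change the fully connected output layout.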
2 changes: 1 addition & 1 deletion README.md
@@ -29,7 +29,7 @@ Examples are Cortex-M55 or Cortex-M85 configured with MVE.
| Conv2D | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes |
| DepthwiseConv2D | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes |
| TransposeConv2D | Yes | No | No | Yes | No | No | Yes | No | No |
-| Fully Connected | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | No |
+| Fully Connected | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes |
| Add | Yes | Yes | N/A | Yes | Yes | N/A | Yes | Yes | N/A |
| Mul | Yes | Yes | N/A | Yes | Yes | N/A | Yes | Yes | N/A |
| MaxPooling | Yes | Yes | N/A | Yes | Yes | N/A | Yes | Yes | N/A |
10 changes: 5 additions & 5 deletions Source/FullyConnectedFunctions/arm_fully_connected_s4.c
@@ -1,5 +1,5 @@
/*
-* SPDX-FileCopyrightText: Copyright 2023 Arm Limited and/or its affiliates <open-source-office@arm.com>
+* SPDX-FileCopyrightText: Copyright 2023-2024 Arm Limited and/or its affiliates <open-source-office@arm.com>
*
* SPDX-License-Identifier: Apache-2.0
*
@@ -21,8 +21,8 @@
* Title: arm_fully_connected_s4
* Description: Fully connected function compatible with TF Lite.
*
-* $Date: 10 October 2023
-* $Revision: V.1.0.0
+* $Date: 22 April 2024
+* $Revision: V.1.1.0
*
* Target : Arm(R) M-Profile Architecture
*
@@ -79,8 +79,8 @@ arm_cmsis_nn_status arm_fully_connected_s4(const cmsis_nn_context *ctx,
filter_dims->n, /* col_dim or accum_depth */
output_dims->c, /* row_dim or output_depth */
fc_params->activation.min,
-fc_params->activation.max,
-1L);
+fc_params->activation.max);

input += filter_dims->n;
output += output_dims->c;
batch_cnt--;
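The hunk above shows the per-batch loop: one vector-by-matrix call, then `input` advances by the accumulation depth and `output` by the output depth. A minimal scalar sketch of that loop follows. It makes several assumptions that are not confirmed by this diff: int4 weights are packed two per byte with the low nibble first, rows are packed back to back, and bias, input/output offsets, and requantization are left out entirely. All function names here are hypothetical and are not the CMSIS-NN API.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: element idx lives in byte idx / 2, low nibble for even idx. */
static int8_t unpack_s4(const int8_t *packed, int32_t idx)
{
    const uint8_t byte = (uint8_t)packed[idx / 2];
    const uint8_t nibble = (idx % 2 == 0) ? (uint8_t)(byte & 0x0F) : (uint8_t)(byte >> 4);
    /* Sign-extend the 4-bit two's complement value. */
    return (int8_t)((nibble ^ 0x08) - 0x08);
}

/* Hypothetical reference kernel: dot product per row, clamp, contiguous store. */
static void vec_mat_mult_s4_ref(const int8_t *lhs, const int8_t *packed_rhs, int8_t *dst,
                                int32_t rhs_cols, int32_t rhs_rows,
                                int32_t activation_min, int32_t activation_max)
{
    assert(rhs_cols > 0 && rhs_rows > 0);
    for (int32_t r = 0; r < rhs_rows; r++)
    {
        int32_t acc = 0;
        for (int32_t c = 0; c < rhs_cols; c++)
        {
            acc += lhs[c] * unpack_s4(packed_rhs, r * rhs_cols + c);
        }
        if (acc < activation_min) { acc = activation_min; }
        if (acc > activation_max) { acc = activation_max; }
        dst[r] = (int8_t)acc;
    }
}

/* Batch loop in the shape of the call site: advance input and output per batch. */
static void fully_connected_s4_ref(const int8_t *input, const int8_t *packed_weights, int8_t *output,
                                   int32_t batches, int32_t accum_depth, int32_t output_depth,
                                   int32_t activation_min, int32_t activation_max)
{
    int32_t batch_cnt = batches;
    while (batch_cnt)
    {
        vec_mat_mult_s4_ref(input, packed_weights, output,
                            accum_depth, output_depth, activation_min, activation_max);
        input += accum_depth;
        output += output_depth;
        batch_cnt--;
    }
}
```

Because each batch writes `output_depth` consecutive bytes, a fixed stride of 1 is all the loop needs, which is consistent with the call site no longer passing `1L`.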