[microNPU] Add support for TFLite PAD #13732
Conversation
A separate nn.pad relay operator is legalized to an Ethos-U depthwise_conv2d operator. For ethosu_depthwise_conv2d the hardware only supports padding up to [31, 31, 32, 32], so a pad is only legalized on the NPU when its size is within these limits.
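A minimal sketch of the limit check this implies (the helper name and the (top, left, bottom, right) ordering of the limits are assumptions for illustration, not the actual TVM pass):

def pad_fits_npu_limits(pad_width):
    """pad_width is NHWC-style: ((0, 0), (top, bottom), (left, right), (0, 0))."""
    top, bottom = pad_width[1]
    left, right = pad_width[2]
    # Assumed hardware bounds: at most 31 before (top/left) and
    # at most 32 after (bottom/right) on each spatial axis.
    return top <= 31 and left <= 31 and bottom <= 32 and right <= 32

For example, pad_fits_npu_limits(((0, 0), (1, 1), (2, 2), (0, 0))) is True and such a pad can be legalized, while a top pad of 40 would keep the operator on the CPU.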
Thanks @alexey-yazev, I think it's good to go, but I'll leave it open for a little longer in case anybody else wants to have a look.
channels_map = {
    "NHWC": 3,
}
IIRC this one-entry channels_map is a historic relic that could go and simplify the code a little bit, but as it is present in other operators, it's probably a clean-up task for some other time.
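A rough before/after of that suggested clean-up (the surrounding code is assumed for illustration, not taken from this PR):

# before: a dict lookup that only ever has one key
channels_map = {"NHWC": 3}
channel_axis = channels_map["NHWC"]

# after: NHWC is the only supported layout here, so the axis is a constant
channel_axis = 3  # channels are the last axis in NHWC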
Thanks @alexey-yazev, LGTM! After looking at how padding is lowered in Vela I think there might be a couple more opportunities for optimization, although it seems out of scope for this PR. Just a couple of things to consider in the future:
- In some cases it's possible to fuse an nn.pad with the following operation. As an example, we currently fuse nn.pad -> qnn.conv2d ([microNPU] Optimize separate padding operation for conv2d #11468); however, it seems a similar approach is also possible for average pooling (see: https://git.mlplatform.org/ml/ethos-u/ethos-u-vela.git/tree/ethosu/vela/tflite_graph_optimiser.py#n1413).
- With the current implementation nn.pad does not get offloaded if the provided padding exceeds [31, 31, 32, 32]. If these dimensions are exceeded, we might be able to use multiple average pooling operations, similar to https://git.mlplatform.org/ml/ethos-u/ethos-u-vela.git/tree/ethosu/vela/tflite_graph_optimiser.py#n1500 (a rough sketch of the splitting idea follows below).
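A hedged sketch of the splitting idea from the second bullet (illustrative only; the helper is hypothetical and not part of this PR): a pad that exceeds a per-side hardware limit could be decomposed into a chain of pads that each fit, each of which then legalizes on its own.

def split_pad(amount, limit):
    """Split one per-side pad amount into chunks no larger than limit."""
    chunks = []
    while amount > 0:
        step = min(amount, limit)
        chunks.append(step)
        amount -= step
    return chunks

# e.g. a top pad of 70 with a per-side limit of 31 becomes three stacked operations:
assert split_pad(70, 31) == [31, 31, 8]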
Thanks @alexey-yazev, @ekalda!
Oops, I forgot to ask, would it be possible to add a legalization test under:
Hello @lhutton1, thanks for the review!
Yes, this was the first option we tried to implement. But in the Vela implementation this is done by "copies IFM to the right place inside the OFM" using the write_offset attribute of the created AvgPool operation. In TVM, the Vela API operations are derived from the NpuOperation class, which does not have a write_offset attribute, so we cannot replicate Vela's convert_pad() function. We also tried to implement PAD legalization using the Concatenate operation but encountered an error; it seems the cascader must be turned off for Concatenate to work. For example, the cascader is disabled in test_tflite_concat() (if the cascader is enabled, we get the same error as with Concatenate). So far, the most feasible option seems to be using several depthwise_conv2d operators if the padding exceeds [31, 31, 32, 32]. But of course, I do not have all the knowledge about this; maybe there are other options?
The test was added in the PR.
Thanks @arina-grovety for the explanation, just following up on some of the questions... I suspect this is a case of needing to expose this functionality from within Vela; I'll see if we can make this happen for a future Vela release. The concatenate error does indeed sound like a separate issue in itself. It might be worth investigating the reason for that at some point.
cc @leandron, @ekalda, @lhutton1