Should dilated pooling be supported #180

Open
fujunwei opened this issue Jun 8, 2021 · 7 comments


fujunwei commented Jun 8, 2021

OpenVINO doesn't support dilated pooling because its pooling ops have no attribute to specify dilations, for example MaxPool.

For DML, only MAX_POOLING2 supports dilated pooling, via an additional constant array Dilations; AVERAGE_POOLING and LP_POOLING still don't support it.

System APIs have only limited capability to support dilated pooling. Should it be defined in the WebNN spec?

@wchao1115 (Collaborator)

This is already supported. Please take a look at pool2d operations in the current spec.

@huningxin (Contributor)

I suppose @fujunwei's question is whether the WebNN spec should drop dilated pooling, given WebNN-native implementation feedback on OpenVINO and DirectML. According to the following table, only DirectML's max pooling supports dilations.

| Native API | Average Pooling | Max Pooling | L2 Pooling |
| --- | --- | --- | --- |
| DirectML | No dilations (DML_AVERAGE_POOLING_OPERATOR_DESC) | Supports dilations (DML_MAX_POOLING2_OPERATOR_DESC) | No dilations (DML_LP_POOLING_OPERATOR_DESC) |
| NNAPI | No dilations (ANEURALNETWORKS_AVERAGE_POOL_2D) | No dilations (ANEURALNETWORKS_MAX_POOL_2D) | No dilations (ANEURALNETWORKS_L2_POOL_2D) |
| OpenVINO | No dilations (AvgPool) | No dilations (MaxPool) | No L2 pooling op |

@huningxin reopened this Jul 12, 2021

wchao1115 (Collaborator) commented Jul 13, 2021

TensorFlow, PyTorch, and ONNX all support dilated pooling, as the feature is used in real models, e.g. the one described in this paper, with TensorFlow supporting it on both AVG and MAX pooling.

https://www.tensorflow.org/api_docs/python/tf/nn/pool
https://github.com/onnx/onnx/blob/master/docs/Operators.md#MaxPool
https://pytorch.org/docs/stable/generated/torch.nn.MaxPool2d.html#torch.nn.MaxPool2d

It makes sense since dilation has been supported in convolution for a long time. A follow-up pooling operation might as well support it too.

A general question here is what to do if the underlying platform doesn't support a particular feature. I think in such a case, the graph builder should probably fail fast at graph construction time to give the user a chance to either fail or recover from it.
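
A minimal sketch of what that fail-fast probe could look like from a framework's side, assuming a backend that rejects unsupported dilations at build time (the operand descriptor fields and the error behavior here are illustrative, not normative):

```js
// Hypothetical probe: attempt to build a dilated maxPool2d and treat a
// construction-time exception as "not supported on this backend".
const context = await navigator.ml.createContext();
const builder = new MLGraphBuilder(context);
const x = builder.input('x', {dataType: 'float32', shape: [1, 1, 8, 8]});

let dilatedMaxPoolSupported = true;
try {
  builder.maxPool2d(x, {windowDimensions: [2, 2], dilations: [2, 2]});
} catch (e) {
  dilatedMaxPoolSupported = false;  // fail fast; the caller can recover or bail out
}
```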

This situation, hopefully, won't be common since WebNN is most likely just a backend of a framework used by the user; if the framework doesn't support this feature to begin with, it isn't likely to cause the WebNN backend to fail on it.

@huningxin (Contributor)

@wchao1115, thanks for the pointers; they are helpful.

> A general question here is what to do if the underlying platform doesn't support a particular feature. I think in such a case, the graph builder should probably fail fast at graph construction time to give the user a chance to either fail or recover from it.

I understand that a WebNN implementation should support what the spec defines; otherwise it would not pass the conformance tests. Given that dilated pooling is not widely available in native platform ML APIs yet, should we exclude it for now and support it in the future?


wchao1115 (Collaborator) commented Jul 14, 2021

My concern is that a WebNN backend may be limited in its support for this feature relative to the frameworks that do support it, like the major frameworks I cited above; i.e., if we remove this feature from the spec, there will be no way to implement TensorFlow, PyTorch, or ONNX dilated pooling, for example, through a WebNN backend.

FWIW, the way we approach conformance in DirectML is that there are two classes of features: required and optional. A required feature must be implemented to be fully conformant, but the implementation need not be native, i.e. it may be emulated. This is the implementation strategy I suggested for batchNormalization in #187, particularly because the operation can in fact be emulated as a composite of other existing operations.
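
As a side note, a rough sketch of that batchNormalization emulation as a composite of existing builder ops (x, mean, variance, scale, and bias are placeholder operands; per-channel broadcasting details are glossed over):

```js
// batchNormalization(x) ~= (x - mean) / sqrt(variance + epsilon) * scale + bias,
// expressed with element-wise ops the spec already defines.
const epsilon = builder.constant(
    {dataType: 'float32', shape: [1]}, new Float32Array([1e-5]));
const normalized = builder.div(
    builder.sub(x, mean),
    builder.sqrt(builder.add(variance, epsilon)));
const output = builder.add(builder.mul(normalized, scale), bias);
```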

An optional feature, however, is something that is either not possible or not easily emulated, and also not required in all scenarios. This type of feature requires a capability flag so that the caller can independently probe whether it's actually supported by the implementation. I think of dilated pooling as being in this category.

We could make dilated pooling an optional feature with a capability flag exposed in the context, e.g. context.isDilatedPoolingSupported. This way, a framework that does support this feature can detect whether it is actually implemented in the backend and fail accordingly if not; for a framework that does not support it, it's a non-issue.
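
A sketch of how a framework might consume such a flag; isDilatedPoolingSupported is the hypothetical name from the example above, not an attribute in the current spec:

```js
const context = await navigator.ml.createContext();
if (context.isDilatedPoolingSupported) {
  // Build pool2d ops with dilations directly on this backend.
} else {
  // Fall back to an emulation path, or report the model as unsupported.
}
```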

@huningxin (Contributor)

According to the current Chromium prototype, dilated pooling is not widely supported by the targeted backends.

Because dilated pooling ops cannot be easily emulated, we should probably consider either removing them from the spec or making them a detectable feature in opSupportLimits.
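
If it were made detectable, the probe might look roughly like this; opSupportLimits() exists on MLContext, but the per-op dilations entry shown here is hypothetical:

```js
const limits = context.opSupportLimits();
// Hypothetical capability entry; not part of the current opSupportLimits() shape.
const dilatedMaxPoolOk = Boolean(limits.maxPool2d && limits.maxPool2d.dilations);
```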

Thoughts?

/cc @fdwr @philloooo @inexorabletash


fdwr (Collaborator) commented Jul 29, 2024

> Because dilated pooling ops cannot be easily emulated.

Convolving, reducing, and pooling a tensor are very similar operations, so much so that if you can rearrange one problem into any of the others, it's likely emulatable.

| pooling | reducing | convolving |
| --- | --- | --- |
| averagePool | reduceAverage | conv2d(...) / windowElementCount |
| l2Pool | reduceL2 | sqrt(conv2d(sqr()...)) |
| maxPool | reduceMax | N/A |
- Pool averages: it's equivalent to calling conv2d with a uniformly weighted filter. So for dilations > 1 and a 2x2 pooling window, use a convolution filter of all 1's [[1, 1], [1, 1]] and divide the convolution output by 4. For a less precise answer but fewer operations, prefold the factor into the filter as [[0.25, 0.25], [0.25, 0.25]] (filter = div(expand(scalarConstantOne, windowSize), windowElementCount)); see the sketch at the end of this comment.
- Another way to pool averages, via reduction (for backends capable of strides), is to temporarily reshape to a higher-dimensional space and use overlapping strides for the windows. For example, for older DirectML (before FL6.2, which takes dilations directly), you can pad (when padding is specified), adjust the tensor description from a 4D tensor to a 6D tensor with overlapped strides for the trailing window dimensions, and call averagePool's sibling reduceAverage. I verified locally that this returns the same result, albeit more slowly than with direct dilations (happy to share the DxDispatch data file). If the API cannot take explicit tensor strides directly, there are other potential options like PyTorch's unfold/as_strided or TF's ExtractImagePatches ¹ to project the input into a temporary form that average reduction can work on. Though, just using conv2d should be faster.
- Pool Lebesgue 2-norms: like above, call sqrt(conv2d(pow(input, 2), filterOfOnes, {dilations...})).
- Pool maximums: this one is tougher. DML < FL6.2 should be able to achieve it with the reshape stride trick above, calling PADDING (when needed) and REDUCE_FUNCTION_MAX. TF could use ExtractImagePatches, assuming it's in the TFLite list ¹, but I'm not seeing a matching counterpart for CoreML ² 🤔. Maybe SlidingWindows followed by ReduceMax? Can CoreML create tensors with explicit strides to then use ReduceMax?

¹ For TF, should I be using this list or this list or another, to know which ops Chromium may call?
² For CoreML, should I be using this list or this list?
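
To make the conv2d route concrete, here is a rough WebNN sketch of the dilated average and L2 pooling emulations described above, assuming a single-channel NCHW input and a 2x2 window; multi-channel inputs would need a depthwise convolution (groups equal to the channel count) with a per-channel filter:

```js
const context = await navigator.ml.createContext();
const builder = new MLGraphBuilder(context);
const input = builder.input('input', {dataType: 'float32', shape: [1, 1, 8, 8]});

// Dilated average pooling: conv2d with a uniform filter, with the
// divide-by-windowElementCount prefolded into the weights (0.25 for a 2x2 window).
const avgFilter = builder.constant(
    {dataType: 'float32', shape: [1, 1, 2, 2]},
    new Float32Array([0.25, 0.25, 0.25, 0.25]));
const dilatedAvgPool = builder.conv2d(input, avgFilter, {dilations: [2, 2]});

// Dilated L2 pooling: sqrt of the dilated sum of squares, i.e.
// sqrt(conv2d(x * x, onesFilter, {dilations})).
const onesFilter = builder.constant(
    {dataType: 'float32', shape: [1, 1, 2, 2]},
    new Float32Array([1, 1, 1, 1]));
const dilatedL2Pool = builder.sqrt(
    builder.conv2d(builder.mul(input, input), onesFilter, {dilations: [2, 2]}));
```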
