Where is scatter and gather op? #467

Open
muazhuda opened this issue Oct 4, 2023 · 8 comments
@muazhuda

muazhuda commented Oct 4, 2023

Is there a reason why it is not in the spec?
tensorflow.js supports gather, even for the WebGPU backend.

@anssiko
Member

anssiko commented Oct 12, 2023

Thanks for your question.

We have published guidelines for proposing new ops: https://github.com/webmachinelearning/webnn/blob/main/CONTRIBUTING.md#proposing-and-adding-a-new-operation

If you can answer those questions in this issue, the WG is able to look at your request sooner.

The gather op has already been proposed in #375, and the WG is actively discussing it.

@inexorabletash
Member

re: Scatter

Open questions for scatter:

  • Behavior if indices contains duplicates? (TF: sums; ONNX: explicit reduction options; PyTorch: undefined, or use scatter_reduce)
  • Treatment of axis? (ONNX: explicit; PyTorch: explicit; TF: ??)
  • Do all support slice updates like TF?
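To make the duplicate-index question concrete, here is a plain-JavaScript sketch (not the WebNN API; `scatter1d` is a hypothetical helper) showing how "update" and "add" reductions diverge on a 1-D tensor when indices repeat:

```javascript
// Hypothetical 1-D scatter: copy input to output, then apply updates
// at the given indices under the chosen reduction mode.
function scatter1d(input, indices, updates, reduction) {
  const out = input.slice();                 // copy input tensor to output
  indices.forEach((idx, k) => {
    if (reduction === "add") out[idx] += updates[k]; // TF/ONNX "add" style
    else out[idx] = updates[k];              // "update": last write wins
  });
  return out;
}

const indices = [1, 1];                      // duplicate index 1
console.log(scatter1d([0, 0, 0, 0], indices, [5, 7], "update")); // [0, 7, 0, 0]
console.log(scatter1d([0, 0, 0, 0], indices, [5, 7], "add"));    // [0, 12, 0, 0]
```

With duplicates, "update" keeps only the last write while "add" accumulates both, which is exactly why the spec needs to pick (or parameterize) a behavior.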

@huningxin
Contributor

re: Scatter

+1 to supporting scatter, in particular scatterND

MLOperand scatterNd(MLOperand input, MLOperand indices, MLOperand updates);

When prototyping Whisper decoder inference with MLBuffer, I found the scatterND operator useful for inserting KV values by position into the pre-allocated static KV cache (WebNN requires static shapes).

builder.scatterNd(past_key, position_ids, present_key);
builder.scatterNd(past_value, position_ids, present_value);

This avoids reading the KV cache tensor back to the CPU and improves performance. The initial prototype is available at: https://github.com/huningxin/onnxruntime-inference-examples/blob/whisper-mlbuffer/js/whisper-demo/whisper.js
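For illustration, the "update" semantics of the proposed scatterNd can be sketched in plain JavaScript on nested arrays (`scatterNdUpdate` is a hypothetical stand-in, not the WebNN API): copy the input tensor, then overwrite the rows selected by indices with the corresponding rows from updates.

```javascript
// Hypothetical reference semantics for scatterNd "update" on 2-D data,
// with indices of shape [N, 1] selecting whole rows (slice updates).
function scatterNdUpdate(input, indices, updates) {
  const out = input.map(row => row.slice());       // copy whole input tensor
  indices.forEach(([row], k) => { out[row] = updates[k].slice(); });
  return out;
}

// KV-cache style use: write new key rows into a pre-allocated static cache.
const cache = [[0, 0], [0, 0], [0, 0], [0, 0]];    // static-shape KV cache
const positions = [[1], [2]];                      // position_ids
const newKeys = [[3, 4], [5, 6]];                  // present_key
console.log(scatterNdUpdate(cache, positions, newKeys));
// [[0, 0], [3, 4], [5, 6], [0, 0]]
```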

Platform support:

  • DirectML DML_SCATTER_ND_OPERATOR_DESC: Copies the whole input tensor to the output, then overwrites selected indices with corresponding values from the updates tensor.
  • CoreML scatter_nd: Scatters updates into data at the locations given by indices. Supports "update" and other modes.
  • TFLite scatter_nd: Scatters sparse updates according to individual values at the specified indices. Only supports "add"/"sum" mode; emulation is needed to support scattering into an input tensor.
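As a sketch of that emulation (an assumption about one possible lowering, not a confirmed TFLite implementation), "update" can be built from an add-only scatter by first subtracting the current values at the target positions, assuming the indices contain no duplicates:

```javascript
// Add-only scatter, TFLite-style, on 1-D data (hypothetical helper).
function scatterAdd(input, indices, updates) {
  const out = input.slice();
  indices.forEach((i, k) => { out[i] += updates[k]; });
  return out;
}

// Emulate "update": zero the positions that will be written (by adding
// their negation), then add the updates. Assumes no duplicate indices.
function emulateUpdate(input, indices, updates) {
  const negatives = indices.map(i => -input[i]);
  const zeroed = scatterAdd(input, indices, negatives);
  return scatterAdd(zeroed, indices, updates);
}

console.log(emulateUpdate([1, 2, 3, 4], [0, 2], [9, 9])); // [9, 2, 9, 4]
```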

@inexorabletash
Member

inexorabletash commented Jul 22, 2024

  • TFLite scatter_nd: Scatters sparse updates according to individual values at the specified indices. Only supports "add"/"sum" mode; emulation is needed to support scattering into an input tensor.

Can't TF's scatter_nd_update be used for the last case (scattering to input tensor)?

The impression I got from a quick survey was that "update" into an input tensor was more common in the APIs vs scattering into a new tensor given the shape, but it sounds like models need both?

i.e. do we need both of these:

MLOperand scatterNd(MLOperand input, MLOperand indices, MLOperand updates);
MLOperand scatterNd(sequence<unsigned long> shape, MLOperand indices, MLOperand updates);

... or can we get away with the former only?

@huningxin
Contributor

huningxin commented Jul 23, 2024

@inexorabletash

Can't TF's scatter_nd_update be used for the last case (scattering to input tensor)?

The impression I got from a quick survey was that "update" into an input tensor was more common in the APIs vs scattering into a new tensor given the shape, but it sounds like models need both?

I agree "update" into an input tensor is more commonly used, including for the static KV cache update use case and some other models, like SAM ViT Base.

TF's scatter_nd_update could be mapped directly, but I am not sure whether it is available in TFLite as a backend.

... or can we get away with the former only?

+1 to the former only. The latter can be emulated by updating into a zero-initialized tensor.
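That emulation can be sketched in plain JavaScript (hypothetical helper names, 1-D data only): scatter the updates into a zero-initialized tensor of the requested shape, so only the input-taking signature is needed.

```javascript
// Hypothetical 1-D scatterNd "update" (stand-in for the proposed op).
function scatterNdUpdate1d(input, indices, updates) {
  const out = input.slice();
  indices.forEach(([i], k) => { out[i] = updates[k]; });
  return out;
}

// Emulate the shape-taking variant: update into a zero-initialized tensor.
function scatterNdFromShape(shape, indices, updates) {
  const zeros = new Array(shape[0]).fill(0);   // zero-initialized tensor
  return scatterNdUpdate1d(zeros, indices, updates);
}

console.log(scatterNdFromShape([4], [[1], [3]], [7, 8])); // [0, 7, 0, 8]
```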

@inexorabletash
Member

TF's scatter_nd_update could be mapped directly, but I am not sure whether it is available in TFLite as a backend.

Ooof, yeah. Not listed in https://www.tensorflow.org/mlir/tfl_ops

@huningxin
Contributor

TF's scatter_nd_update could be mapped directly, but I am not sure whether it is available in TFLite as a backend.

Ooof, yeah. Not listed in https://www.tensorflow.org/mlir/tfl_ops

Although scatter_nd_update is not supported as a TFLite built-in op, I am not sure whether it could be supported by the Select TensorFlow operators feature. It is listed in the Supported Select TensorFlow operators.

/cc @reillyeon

@reillyeon
Contributor

We'll have to see what the binary size impact of adding support for these operators to Chromium's built-in copy of TFLite is, and also what level of support delegates have for them.
