Where is scatter and gather op? #467
Thanks for your question. We have published guidelines for proposing new ops: https://github.com/webmachinelearning/webnn/blob/main/CONTRIBUTING.md#proposing-and-adding-a-new-operation If you can answer those questions in this issue, the WG will be able to look at your request sooner.
re: Scatter
Open questions for scatter:
+1 to support scatter, in particular:

```webidl
MLOperand scatterNd(MLOperand input, MLOperand indices, MLOperand updates);
```

When prototyping the Whisper decoder's inference with MLBuffer, I found that

```js
builder.scatterNd(past_key, position_ids, present_key);
builder.scatterNd(past_value, position_ids, present_value);
```

avoids reading the KV cache tensor back to the CPU and improves performance. The initial prototype is available at: https://github.com/huningxin/onnxruntime-inference-examples/blob/whisper-mlbuffer/js/whisper-demo/whisper.js

Platforms' support:
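To make the proposed semantics concrete, here is a minimal plain-JavaScript sketch of ScatterND-style behavior (following the ONNX ScatterND definition), applied to a toy KV-cache-like update. The function name, flat-array tensor representation, and all values are illustrative assumptions, not the WebNN API.

```javascript
// Sketch of ONNX-style ScatterND semantics on a flat array plus a shape.
// Copies `input`, then for each index tuple in `indices` writes the
// matching slice of `updates` into the copy.
function scatterNd(input, shape, indices, updates) {
  const out = input.slice();
  const indexDepth = indices[0].length;
  // Number of elements in one updated slice (the trailing dims not indexed).
  const sliceSize = shape.slice(indexDepth).reduce((a, b) => a * b, 1);
  // Strides for converting an index tuple to a flat offset.
  const strides = [];
  let s = sliceSize;
  for (let d = indexDepth - 1; d >= 0; d--) {
    strides[d] = s;
    s *= shape[d];
  }
  indices.forEach((idx, n) => {
    const base = idx.reduce((acc, v, d) => acc + v * strides[d], 0);
    for (let k = 0; k < sliceSize; k++) {
      out[base + k] = updates[n * sliceSize + k];
    }
  });
  return out;
}

// Toy "KV cache" update: a [4, 2] cache, writing new rows at positions 1 and 3
// (loosely analogous to scatterNd(past_key, position_ids, present_key)).
const cache = [0, 0, 0, 0, 0, 0, 0, 0]; // shape [4, 2], flattened
const positions = [[1], [3]];           // like position_ids
const present = [10, 11, 30, 31];       // two new [2] rows
const updated = scatterNd(cache, [4, 2], positions, present);
console.log(updated); // rows 1 and 3 replaced, rest untouched
```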
Can't TF's scatter_nd_update be used for the last case (scattering into the input tensor)? The impression I got from a quick survey was that "update" into an input tensor is more common across the APIs than scattering into a new tensor of a given shape, but it sounds like models need both? i.e., do we need both of these:

```webidl
MLOperand scatterNd(MLOperand input, MLOperand indices, MLOperand updates);
MLOperand scatterNd(sequence<unsigned long> shape, MLOperand indices, MLOperand updates);
```

... or can we get away with the former only?
I agree "update" into an input tensor is more commonly used, including for the static KV cache update use case and some other models, like SAM ViT Base. TF's scatter_nd_update could be mapped directly, but I am not sure whether it is available in TFLite as a backend.
+1 to the former only. The latter can be emulated by updating into a zero-initialized tensor.
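The suggested emulation is straightforward: build a zero tensor of the requested shape, then apply the input-taking variant to it. A 1-D toy sketch in plain JavaScript, with illustrative function names (not spec'd WebNN API):

```javascript
// Input-taking variant, 1-D toy version: each index tuple writes one element.
function scatterNdInto(input, indices, updates) {
  const out = input.slice();
  indices.forEach(([i], n) => { out[i] = updates[n]; });
  return out;
}

// Shape-taking variant emulated via a zero-initialized tensor.
function scatterNdFromShape(shape, indices, updates) {
  const zeros = new Array(shape[0]).fill(0);
  return scatterNdInto(zeros, indices, updates);
}

console.log(scatterNdFromShape([5], [[0], [4]], [7, 9])); // [7, 0, 0, 0, 9]
```

Note this emulation assumes "scatter" positions not covered by `indices` should read as zero, which matches the shape-taking variant's semantics of producing a fresh tensor.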
Ooof, yeah. Not listed in https://www.tensorflow.org/mlir/tfl_ops
Although /cc @reillyeon
We'll have to see what the binary size impact of adding support for these operators to Chromium's built-in copy of TFLite is, and also what level of support delegates have for them.
Is there a reason why it is not in the spec?
tensorflow.js supports gather even for the WebGPU backend.