Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

WebGPU JSEP: Make shader code not depend on input broadcasting patterns #22536

Merged

Conversation

jiangzhaoming
Copy link
Contributor

@jiangzhaoming jiangzhaoming commented Oct 22, 2024

This PR make MatMul shaders not depend on inputs broadcasting pattern, but only depend on input ranks and their shape provided in uniform. This change fix the issue that currently shaders code are different for different broadcasting, but have identical cache key and results in wrong cache hit.

This PR adds inputs broadcasting information into the cache key of
MatMul shaders, which currently impacts the shader code. This PR fixes
the results for MatMul nodes with identical input ranks but different
broadcasting patterns.
@jiangzhaoming
Copy link
Contributor Author

@qjia7 @fs-eire Please take a look, thanks

@jiangzhaoming jiangzhaoming changed the title WebGPU JSEP: Add inputs broadcasting into MatMul shader cache key WebGPU JSEP: Make shader code not rely on input broadcasting patterns Oct 23, 2024
@jiangzhaoming jiangzhaoming changed the title WebGPU JSEP: Make shader code not rely on input broadcasting patterns WebGPU JSEP: Make shader code not depend on input broadcasting patterns Oct 23, 2024
@jiangzhaoming
Copy link
Contributor Author

PR description edited, previous:

This PR adds inputs broadcasting information into the cache key of MatMul shaders, which currently impacts the shader code. This PR fixes the results for MatMul nodes with identical input ranks but different broadcasting patterns.

@fs-eire
Copy link
Contributor

fs-eire commented Oct 24, 2024

/azp run Windows ARM64 QNN CI Pipeline,Windows x64 QNN CI Pipeline,Windows CPU CI Pipeline,Windows GPU CUDA CI Pipeline,Windows GPU DML CI Pipeline,Windows GPU Doc Gen CI Pipeline,Windows GPU TensorRT CI Pipeline,ONNX Runtime Web CI Pipeline,Linux CPU CI Pipeline,Linux CPU Minimal Build E2E CI Pipeline

@fs-eire
Copy link
Contributor

fs-eire commented Oct 24, 2024

/azp run Linux GPU CI Pipeline,Linux GPU TensorRT CI Pipeline,Linux OpenVINO CI Pipeline,Linux QNN CI Pipeline,MacOS CI Pipeline,orttraining-linux-ci-pipeline,orttraining-linux-gpu-ci-pipeline,onnxruntime-binary-size-checks-ci-pipeline,Big Models,Linux Android Emulator QNN CI Pipeline

@fs-eire
Copy link
Contributor

fs-eire commented Oct 24, 2024

/azp run Android CI Pipeline,iOS CI Pipeline,ONNX Runtime React Native CI Pipeline,CoreML CI Pipeline,Linux DNNL CI Pipeline,Linux MIGraphX CI Pipeline,Linux ROCm CI Pipeline

Copy link

Azure Pipelines successfully started running 1 pipeline(s).

2 similar comments
Copy link

Azure Pipelines successfully started running 1 pipeline(s).

Copy link

Azure Pipelines successfully started running 1 pipeline(s).

@guschmue guschmue added the ep:WebGPU ort-web webgpu provider label Oct 24, 2024
Copy link
Contributor

@qjia7 qjia7 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please also remove the definition of getBroadcastDims in commen.ts if it's not needed anymore.

@jiangzhaoming
Copy link
Contributor Author

Please also remove the definition of getBroadcastDims in commen.ts if it's not needed anymore.

Done.

@jiangzhaoming jiangzhaoming requested a review from fs-eire November 1, 2024 08:28
@guschmue
Copy link
Contributor

guschmue commented Nov 8, 2024

/azp run ONNX Runtime Web CI Pipeline,Windows GPU CI Pipeline,Linux Android Emulator QNN CI Pipeline

Copy link

Azure Pipelines successfully started running 1 pipeline(s).

@guschmue
Copy link
Contributor

guschmue commented Nov 8, 2024

/azp run Linux CPU CI Pipeline,Linux CPU Minimal Build E2E CI Pipeline,Linux GPU CI Pipeline,Linux GPU TensorRT CI Pipeline,Linux OpenVINO CI Pipeline,Linux QNN CI Pipeline,MacOS CI Pipeline,Windows ARM64 QNN CI Pipeline,Windows CPU CI Pipeline

Copy link

Azure Pipelines could not run because the pipeline triggers exclude this branch/path.

@guschmue
Copy link
Contributor

guschmue commented Nov 8, 2024

/azp run Windows GPU TensorRT CI Pipeline,onnxruntime-binary-size-checks-ci-pipeline,orttraining-linux-ci-pipeline,orttraining-linux-gpu-ci-pipeline,orttraining-ortmodule-distributed,Windows x64 QNN CI Pipeline,Big Models

@guschmue
Copy link
Contributor

guschmue commented Nov 8, 2024

/azp run Windows GPU CUDA CI Pipeline,Windows GPU DML CI Pipeline,Windows GPU Doc Gen CI Pipeline

Copy link

Azure Pipelines successfully started running 1 pipeline(s).

Copy link

Azure Pipelines could not run because the pipeline triggers exclude this branch/path.

guschmue
guschmue previously approved these changes Nov 8, 2024
@guschmue
Copy link
Contributor

guschmue commented Nov 8, 2024

lint not happy:
image

@jiangzhaoming
Copy link
Contributor Author

The issue comes from merging, fixed in the latest commit. Please take a look, thanks!

@guschmue
Copy link
Contributor

guschmue commented Nov 8, 2024

/azp run ONNX Runtime Web CI Pipeline,Windows GPU CI Pipeline,Linux Android Emulator QNN CI Pipeline

@guschmue
Copy link
Contributor

guschmue commented Nov 8, 2024

/azp run Linux CPU CI Pipeline,Linux CPU Minimal Build E2E CI Pipeline,Linux GPU CI Pipeline,Linux GPU TensorRT CI Pipeline,Linux OpenVINO CI Pipeline,Linux QNN CI Pipeline,MacOS CI Pipeline,Windows ARM64 QNN CI Pipeline,Windows CPU CI Pipeline

@guschmue
Copy link
Contributor

guschmue commented Nov 8, 2024

/azp run Windows GPU TensorRT CI Pipeline,onnxruntime-binary-size-checks-ci-pipeline,orttraining-linux-ci-pipeline,orttraining-linux-gpu-ci-pipeline,orttraining-ortmodule-distributed,Windows x64 QNN CI Pipeline,Big Models

Copy link

Azure Pipelines successfully started running 1 pipeline(s).

Copy link

Azure Pipelines could not run because the pipeline triggers exclude this branch/path.

@guschmue
Copy link
Contributor

guschmue commented Nov 8, 2024

/azp run Windows GPU CUDA CI Pipeline,Windows GPU DML CI Pipeline,Windows GPU Doc Gen CI Pipeline

Copy link

Azure Pipelines successfully started running 1 pipeline(s).

Copy link

Azure Pipelines could not run because the pipeline triggers exclude this branch/path.

@guschmue guschmue merged commit d9b9168 into microsoft:main Nov 8, 2024
50 checks passed
ishwar-raut1 pushed a commit to ishwar-raut1/onnxruntime that referenced this pull request Nov 19, 2024
…ns (microsoft#22536)

This PR make MatMul shaders not depend on inputs broadcasting pattern,
but only depend on input ranks and their shape provided in uniform. This
change fix the issue that currently shaders code are different for
different broadcasting, but have identical cache key and results in
wrong cache hit.
guschmue pushed a commit that referenced this pull request Dec 2, 2024
…ns (#22536)

This PR make MatMul shaders not depend on inputs broadcasting pattern,
but only depend on input ranks and their shape provided in uniform. This
change fix the issue that currently shaders code are different for
different broadcasting, but have identical cache key and results in
wrong cache hit.
ankitm3k pushed a commit to intel/onnxruntime that referenced this pull request Dec 11, 2024
…ns (microsoft#22536)

This PR make MatMul shaders not depend on inputs broadcasting pattern,
but only depend on input ranks and their shape provided in uniform. This
change fix the issue that currently shaders code are different for
different broadcasting, but have identical cache key and results in
wrong cache hit.
ankitm3k pushed a commit to intel/onnxruntime that referenced this pull request Dec 11, 2024
…ns (microsoft#22536)

This PR make MatMul shaders not depend on inputs broadcasting pattern,
but only depend on input ranks and their shape provided in uniform. This
change fix the issue that currently shaders code are different for
different broadcasting, but have identical cache key and results in
wrong cache hit.
ankitm3k pushed a commit to intel/onnxruntime that referenced this pull request Dec 11, 2024
…ns (microsoft#22536)

This PR make MatMul shaders not depend on inputs broadcasting pattern,
but only depend on input ranks and their shape provided in uniform. This
change fix the issue that currently shaders code are different for
different broadcasting, but have identical cache key and results in
wrong cache hit.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
ep:WebGPU ort-web webgpu provider
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants