[js/webgpu] Optimize Expand #22752

qjia7 · 2024-11-07T02:20:14Z

Use components = 4 if possible.

llama3.2-1B improves from 18 tokens/s to 20 tokens/s on my iGPUs.
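
As a rough illustration of what "components = 4 if possible" can mean for a broadcasting Expand, here is a minimal CPU-side sketch in TypeScript. It assumes that "components" is the number of elements one shader invocation reads and writes (for example a `vec4` in WGSL) and that `outputShape` is the already-broadcast output shape with rank at least that of the input. The names `pickComponents` and `expandReference` and the selection rule itself are hypothetical and are not taken from the PR's diff.

```typescript
// Hypothetical rule: use groups of 4 when the output's last dimension splits
// into whole groups of 4 and the input's last dimension is either a single
// value (to be repeated) or also splits into whole groups of 4.
function pickComponents(inputShape: number[], outputShape: number[]): 1 | 4 {
  const outLast = outputShape.length > 0 ? outputShape[outputShape.length - 1] : 1;
  const inLast = inputShape.length > 0 ? inputShape[inputShape.length - 1] : 1;
  const inputOk = inLast === 1 || inLast % 4 === 0;
  return outLast % 4 === 0 && inputOk ? 4 : 1;
}

// CPU reference for Expand. Each outer-loop iteration stands in for one
// shader invocation that writes `components` consecutive output elements.
function expandReference(
  input: Float32Array,
  inputShape: number[],
  outputShape: number[],
): Float32Array {
  const components = pickComponents(inputShape, outputShape);
  const rank = outputShape.length;
  const outSize = outputShape.reduce((a, b) => a * b, 1);
  const output = new Float32Array(outSize);

  // Right-align the input shape with the output shape by padding with 1s.
  const padded: number[] = new Array<number>(rank - inputShape.length)
    .fill(1)
    .concat(inputShape);

  // Input strides, with stride 0 on broadcast (size-1) dimensions.
  const inStrides: number[] = new Array<number>(rank).fill(0);
  let stride = 1;
  for (let d = rank - 1; d >= 0; d--) {
    inStrides[d] = padded[d] === 1 ? 0 : stride;
    stride *= padded[d];
  }
  const lastStride = rank > 0 ? inStrides[rank - 1] : 0;

  for (let base = 0; base < outSize; base += components) {
    // Decompose the flat output offset into indices and map to the input.
    let rem = base;
    let inOffset = 0;
    for (let d = rank - 1; d >= 0; d--) {
      const idx = rem % outputShape[d];
      rem = Math.floor(rem / outputShape[d]);
      inOffset += idx * inStrides[d];
    }
    // lastStride is 1 for a contiguous group and 0 for a repeated value.
    for (let c = 0; c < components; c++) {
      output[base + c] = input[inOffset + c * lastStride];
    }
  }
  return output;
}
```

With this rule, a `[2, 1]` input expanded to `[2, 4]` takes the grouped path and repeats one input value across each group, while a `[3]` input expanded to `[2, 3]` falls back to one element per iteration.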

qjia7 · 2024-11-07T02:20:55Z

@guschmue @fs-eire Please take a look, thanks.

guschmue · 2024-11-12T17:33:29Z

/azp run ONNX Runtime Web CI Pipeline,Windows GPU CI Pipeline,Linux Android Emulator QNN CI Pipeline

guschmue · 2024-11-12T17:33:36Z

/azp run Linux CPU CI Pipeline,Linux CPU Minimal Build E2E CI Pipeline,Linux GPU CI Pipeline,Linux GPU TensorRT CI Pipeline,Linux OpenVINO CI Pipeline,Linux QNN CI Pipeline,MacOS CI Pipeline,Windows ARM64 QNN CI Pipeline,Windows CPU CI Pipeline

guschmue · 2024-11-12T17:33:41Z

/azp run Windows GPU TensorRT CI Pipeline,onnxruntime-binary-size-checks-ci-pipeline,orttraining-linux-ci-pipeline,orttraining-linux-gpu-ci-pipeline,orttraining-ortmodule-distributed,Windows x64 QNN CI Pipeline,Big Models

azure-pipelines · 2024-11-12T17:33:42Z

Azure Pipelines successfully started running 1 pipeline(s).

azure-pipelines · 2024-11-12T17:33:47Z

Azure Pipelines could not run because the pipeline triggers exclude this branch/path.

azure-pipelines · 2024-11-12T17:33:56Z

Azure Pipelines successfully started running 1 pipeline(s).

guschmue · 2024-11-12T17:34:06Z

/azp run Windows GPU CUDA CI Pipeline,Windows GPU DML CI Pipeline,Windows GPU Doc Gen CI Pipeline

azure-pipelines · 2024-11-12T17:34:13Z

Azure Pipelines could not run because the pipeline triggers exclude this branch/path.

[js/webgpu] Optimize Expand

717ac31

Use components = 4 if possible. llama3.2-1B becomes 20 tokens/s from 18 tokens/s on my iGPUs.

guschmue approved these changes Nov 12, 2024

guschmue added the ep:WebGPU ort-web webgpu provider label Nov 12, 2024

guschmue merged commit 7e0dd9d into microsoft:main Nov 12, 2024
50 checks passed

ishwar-raut1 pushed a commit to ishwar-raut1/onnxruntime that referenced this pull request Nov 19, 2024

[js/webgpu] Optimize Expand (microsoft#22752)

0015560

Use components = 4 if possible. llama3.2-1B becomes 20 tokens/s from 18 tokens/s on my iGPUs.

guschmue pushed a commit that referenced this pull request Dec 2, 2024

[js/webgpu] Optimize Expand (#22752)

e784ce9

Use components = 4 if possible. llama3.2-1B becomes 20 tokens/s from 18 tokens/s on my iGPUs.

qjia7 deleted the opt_expand branch December 9, 2024 05:00

qjia7 mentioned this pull request Dec 9, 2024

[webgpu] Optimize Expand #23052

Merged

guschmue pushed a commit that referenced this pull request Dec 10, 2024

[webgpu] Optimize Expand (#23052)

defcc4f

### Description  Use components = 4 if possible. This is the webgpu native implementation from #22752

ankitm3k pushed a commit to intel/onnxruntime that referenced this pull request Dec 11, 2024

[js/webgpu] Optimize Expand (microsoft#22752)

de64e53

Use components = 4 if possible. llama3.2-1B becomes 20 tokens/s from 18 tokens/s on my iGPUs.

ankitm3k pushed a commit to intel/onnxruntime that referenced this pull request Dec 11, 2024

[webgpu] Optimize Expand (microsoft#23052)

9920061

### Description  Use components = 4 if possible. This is the webgpu native implementation from microsoft#22752

ankitm3k pushed a commit to intel/onnxruntime that referenced this pull request Dec 11, 2024

[js/webgpu] Optimize Expand (microsoft#22752)

e724f37

Use components = 4 if possible. llama3.2-1B becomes 20 tokens/s from 18 tokens/s on my iGPUs.

ankitm3k pushed a commit to intel/onnxruntime that referenced this pull request Dec 11, 2024

[webgpu] Optimize Expand (microsoft#23052)

0d0c49d

### Description  Use components = 4 if possible. This is the webgpu native implementation from microsoft#22752

ankitm3k pushed a commit to intel/onnxruntime that referenced this pull request Dec 11, 2024

[webgpu] Optimize Expand (microsoft#23052)

7ba6ed6

### Description  Use components = 4 if possible. This is the webgpu native implementation from microsoft#22752

tarekziade pushed a commit to tarekziade/onnxruntime that referenced this pull request Jan 10, 2025

[webgpu] Optimize Expand (microsoft#23052)

219467c

### Description  Use components = 4 if possible. This is the webgpu native implementation from microsoft#22752
