
use '-e webgpu' to generate a model for webgpu #1278

Merged
merged 3 commits into from
Feb 26, 2025
Merged

Conversation

guschmue
Copy link
Contributor

Change the webgpu option from '-e web' to '-e webgpu' for consistency.

We now use GQA instead of MHA, and added an extra_option, "use_webgpu_fp32=1", to enable GPUs that do not support fp16.
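For context, a sketch of how the model builder might be invoked after this change. The model name and output paths are illustrative placeholders, not taken from this PR:

```shell
# Generate a model for the WebGPU EP (previously '-e web'):
python -m onnxruntime_genai.models.builder \
    -m microsoft/Phi-3-mini-4k-instruct \
    -o ./model-webgpu \
    -p fp16 \
    -e webgpu

# On GPUs that do not support fp16, the new extra_option
# introduced by this PR can be passed to keep fp32:
python -m onnxruntime_genai.models.builder \
    -m microsoft/Phi-3-mini-4k-instruct \
    -o ./model-webgpu-fp32 \
    -p fp16 \
    -e webgpu \
    --extra_options use_webgpu_fp32=1
```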

@xenova
Copy link

xenova commented Feb 26, 2025

> We now use GQA instead of MHA, and added an extra_option, "use_webgpu_fp32=1", to enable GPUs that do not support fp16.

Should we wait for microsoft/onnxruntime#22987 before merging this PR?

guschmue and others added 2 commits February 26, 2025 09:19
Co-authored-by: kunal-vaishnavi <115581922+kunal-vaishnavi@users.noreply.github.com>
Co-authored-by: kunal-vaishnavi <115581922+kunal-vaishnavi@users.noreply.github.com>
@guschmue
Copy link
Contributor Author

Should be fine without that PR. microsoft/onnxruntime#22987 fixes the issue with packed QKV and RoPE inside GQA.
But this PR intentionally does not enable those two features, because the FlashAttention-2 code in the new WebGPU EP doesn't implement them yet.
Once that all works and is well tested on both JSEP and the WebGPU EP, we will update the model builder to enable them.

@kunal-vaishnavi kunal-vaishnavi enabled auto-merge (squash) February 26, 2025 17:27
@xenova
Copy link

xenova commented Feb 26, 2025

@guschmue Makes sense! 🚀

@kunal-vaishnavi kunal-vaishnavi merged commit faefce2 into main Feb 26, 2025
14 checks passed
@kunal-vaishnavi kunal-vaishnavi deleted the gs/webgpu-builder branch February 26, 2025 18:26