CUDA: support for mat. mul. with ne03 != ne13 #11656

JohannesGaessler · 2025-02-04T12:42:26Z

This PR adds CUDA support for matrix multiplications with ne03 != ne13. I added support to ggml_cuda_op_mul_mat, which provides the kernels with pointers on which they can perform matrix multiplications as if dimensions 2 and 3 had size 1. However, because this adds a lot of overhead, I also extended mul_mat_vec to support batched matrix-vector multiplication in dimension 3.
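For readers unfamiliar with the indexing, the following is a minimal host-side sketch of the idea of handing a 2D kernel per-slice pointers. It is not the code from this PR. It assumes contiguous float tensors and the ggml broadcasting convention in which src0 slice i03 = i13 / (ne13/ne03) (and likewise for dimension 2); the helper names mul_mat_2d and mul_mat_broadcast are hypothetical.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical helper (not from the PR): plain 2D product for one slice.
// src0 holds ne01 rows of ne00 floats, src1 holds ne11 rows of ne00 floats,
// dst holds ne11 rows of ne01 floats. Rows are contiguous.
static void mul_mat_2d(const float * src0, const float * src1, float * dst,
                       int64_t ne00, int64_t ne01, int64_t ne11) {
    for (int64_t i11 = 0; i11 < ne11; ++i11) {
        for (int64_t i01 = 0; i01 < ne01; ++i01) {
            float sum = 0.0f;
            for (int64_t i00 = 0; i00 < ne00; ++i00) {
                sum += src0[i01*ne00 + i00] * src1[i11*ne00 + i00];
            }
            dst[i11*ne01 + i01] = sum;
        }
    }
}

// Hypothetical driver: loops over dimensions 2 and 3 of src1 and passes
// pointers to single 2D slices, so the inner routine can behave as if
// dimensions 2 and 3 had size 1. src0 is broadcast when ne02 != ne12
// or ne03 != ne13.
static std::vector<float> mul_mat_broadcast(
        const std::vector<float> & src0, const std::vector<float> & src1,
        int64_t ne00, int64_t ne01, int64_t ne02, int64_t ne03,
        int64_t ne11, int64_t ne12, int64_t ne13) {
    // Assumption: src1 batch sizes are integer multiples of src0 batch sizes.
    assert(ne12 % ne02 == 0 && ne13 % ne03 == 0);
    const int64_t r2 = ne12 / ne02; // broadcast ratio in dimension 2
    const int64_t r3 = ne13 / ne03; // broadcast ratio in dimension 3

    std::vector<float> dst(ne01*ne11*ne12*ne13);
    for (int64_t i13 = 0; i13 < ne13; ++i13) {
        for (int64_t i12 = 0; i12 < ne12; ++i12) {
            const int64_t i03 = i13 / r3; // src0 slice reused by r3 src1 slices
            const int64_t i02 = i12 / r2;
            const float * s0 = src0.data() + (i03*ne02 + i02)*ne01*ne00;
            const float * s1 = src1.data() + (i13*ne12 + i12)*ne11*ne00;
            float       * d  = dst.data()  + (i13*ne12 + i12)*ne11*ne01;
            mul_mat_2d(s0, s1, d, ne00, ne01, ne11);
        }
    }
    return dst;
}
```

Each slice costs one call into the inner routine, which hints at why a per-slice approach on the GPU carries overhead and why handling the batch inside the kernel can be preferable for matrix-vector products.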

JohannesGaessler · 2025-02-04T16:09:33Z

I don't understand why the server CI jobs are failing; for some reason the server isn't online after 12 seconds. Can I assume it's unrelated to my changes?

slaren · 2025-02-04T16:11:25Z

It's probably failing to download the test model from HF; that happens occasionally.

slaren approved these changes Feb 4, 2025


github-actions bot added the labels "Nvidia GPU" (Issues specific to Nvidia GPUs) and "ggml" (changes relating to the ggml tensor library for machine learning) Feb 4, 2025

CUDA: support for mat. mul. with ne03 != ne13

9887021

JohannesGaessler force-pushed the cuda-mm-dim3-broadcast branch 2 times, most recently from a33f0f5 to 9887021 Compare February 4, 2025 15:52

JohannesGaessler merged commit fa62da9 into ggml-org:master Feb 5, 2025
91 of 92 checks passed

tinglou pushed a commit to tinglou/llama.cpp that referenced this pull request Feb 13, 2025

CUDA: support for mat. mul. with ne03 != ne13 (ggml-org#11656)

713ad34

orca-zhang pushed a commit to orca-zhang/llama.cpp that referenced this pull request Feb 26, 2025

CUDA: support for mat. mul. with ne03 != ne13 (ggml-org#11656)

e7215ca

arthw pushed a commit to arthw/llama.cpp that referenced this pull request Feb 26, 2025

CUDA: support for mat. mul. with ne03 != ne13 (ggml-org#11656)

a429492
