CUDA: fix MMV kernel being used for FP16 src1 #10357

JohannesGaessler · 2024-11-17T10:12:23Z

The problem is simply that I forgot to add a check for the type of src1. While FP16 src1 is not used for model evaluation, it is used in the test code.

slaren · 2024-11-17T10:20:16Z

Wouldn't it be more reliable to check use_mul_mat_vec, since it has the full test for compatibility already?

JohannesGaessler · 2024-11-17T11:19:27Z

You're right, I forgot to adapt the logic for the first check when I added the variable (already in the previous PR).
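To make the exchange above concrete, here is a minimal sketch of the pattern being discussed: a dispatch check that re-tests only some of the conditions, versus one that reuses a single compatibility predicate. The `Tensor` struct, the `TensorType` enum, and both `dispatch_mmv_*` functions are hypothetical stand-ins rather than the actual ggml code, and the exact conditions inside `use_mul_mat_vec` are assumptions made for illustration.

```cpp
#include <cstdint>

// Hypothetical stand-ins for ggml's tensor type enum and tensor struct.
enum class TensorType { F32, F16 };

struct Tensor {
    TensorType type;
    int64_t    ne[4]; // number of elements per dimension
};

// Single compatibility predicate for the matrix-vector (MMV) path.
// Assumed conditions: FP16 src0, FP32 src1 and dst, an even row length,
// and a single src1 column.
bool use_mul_mat_vec(const Tensor & src0, const Tensor & src1, const Tensor & dst) {
    return src0.type == TensorType::F16
        && src1.type == TensorType::F32
        && dst.type  == TensorType::F32
        && src0.ne[0] % 2 == 0
        && src1.ne[1] == 1;
}

// Buggy pattern: the dispatch check repeats part of the test by hand and
// never looks at src1.type, so an FP16 src1 is routed to the MMV kernel.
bool dispatch_mmv_buggy(const Tensor & src0, const Tensor & src1, const Tensor & dst) {
    return src0.type == TensorType::F16 && src1.ne[1] == 1 && dst.ne[3] == 1;
}

// Fixed pattern: the dispatch check reuses the full predicate, so the two
// cannot drift apart when a condition is added later.
bool dispatch_mmv_fixed(const Tensor & src0, const Tensor & src1, const Tensor & dst) {
    return use_mul_mat_vec(src0, src1, dst) && dst.ne[3] == 1;
}
```

With an FP16 `src0` and an FP16 `src1` of one column, the buggy check selects the MMV path while the fixed check rejects it. Keeping the conditions in one place is the design point slaren raises: the dispatch and the predicate stay consistent by construction.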

github-actions bot added the "Nvidia GPU" label (Issues specific to Nvidia GPUs) Nov 17, 2024

ggerganov approved these changes Nov 17, 2024

CUDA: fix MMV kernel being used for FP16 src1

5c9e20b

JohannesGaessler force-pushed the cuda-mmv-fixup branch from c527e27 to 5c9e20b November 17, 2024 11:14

ggerganov mentioned this pull request Nov 17, 2024

sync : llama.cpp ggerganov/ggml#1020

Merged

ggerganov requested a review from slaren November 17, 2024 17:20

slaren approved these changes Nov 17, 2024

JohannesGaessler merged commit 76e9e58 into ggerganov:master Nov 17, 2024
54 checks passed

arthw pushed a commit to arthw/llama.cpp that referenced this pull request Nov 18, 2024

CUDA: fix MMV kernel being used for FP16 src1 (ggerganov#10357)

1af6853
