CUDA: non-contiguous (RMS) norm support #11659

JohannesGaessler · 2025-02-04T14:25:30Z

This PR adds CUDA support for non-contiguous input tensors for (RMS) norm.
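For illustration only, here is a CPU-side sketch of what a non-contiguous input means for RMS norm: each source row is located through a row stride instead of assuming rows are packed back to back. The name `rms_norm_strided`, its parameters, and the use of element-unit strides are assumptions made for this sketch; the actual kernel in ggml/src/ggml-cuda/norm.cu is not reproduced here.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of ggml): RMS-normalize `nrows` rows of `ncols` floats.
// `src_row_stride` is the distance, in elements, between the starts of consecutive
// source rows. It is larger than `ncols` when `src` is a view into a wider buffer.
// The result is always written densely packed.
std::vector<float> rms_norm_strided(const std::vector<float> & src,
                                    size_t ncols, size_t nrows,
                                    size_t src_row_stride, float eps) {
    std::vector<float> dst(ncols * nrows);
    for (size_t r = 0; r < nrows; ++r) {
        // Locate the row via its stride rather than r * ncols.
        const float * x = src.data() + r * src_row_stride;

        float sum_sq = 0.0f;
        for (size_t c = 0; c < ncols; ++c) {
            sum_sq += x[c] * x[c];
        }

        // scale = 1 / sqrt(mean(x^2) + eps)
        const float scale = 1.0f / std::sqrt(sum_sq / static_cast<float>(ncols) + eps);
        for (size_t c = 0; c < ncols; ++c) {
            dst[r * ncols + c] = x[c] * scale;
        }
    }
    return dst;
}
```

Passing `src_row_stride == ncols` reproduces the contiguous case, so the strided form is a strict generalization.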

ggml/src/ggml-cuda/norm.cu

slaren · 2025-02-04T15:28:24Z

tests/test-backend-ops.cpp

@@ -1674,21 +1674,28 @@ struct test_silu_back : public test_case {
 struct test_norm : public test_case {
    const ggml_type type;
    const std::array<int64_t, 4> ne;
-    float eps;
+    const bool v; // whether a is a non-contiguous view


Just making a note here: we cannot keep adding parameters like this to every op test case, since it makes them much more complex. This will need to be refactored at some point and replaced with a generic way to create non-contiguous views for the op parameters.
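One possible shape for such a generic helper, sketched outside ggml: allocate a parent buffer that is wider than the requested shape and hand the op a view into it, so the view inherits the parent's row stride. `View2D` and `make_noncontiguous_view` are hypothetical names, and the 2-D layout with element-unit strides is a simplifying assumption, not the test harness's API.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical 2-D view descriptor (not a ggml type).
struct View2D {
    size_t ne0;     // elements per row
    size_t ne1;     // number of rows
    size_t nb1;     // stride between rows, in elements
    size_t offset;  // offset of element (0, 0) inside the parent storage

    // Index of element (i0, i1) inside the parent storage.
    size_t index(size_t i0, size_t i1) const { return offset + i1 * nb1 + i0; }
};

// Resize `storage` to hold rows of (ne0 + pad) elements and return a view that
// exposes only the first ne0 elements of each row. With pad > 0 the view is
// non-contiguous, because nb1 != ne0.
View2D make_noncontiguous_view(std::vector<float> & storage,
                               size_t ne0, size_t ne1, size_t pad = 1) {
    const size_t parent_ne0 = ne0 + pad;
    storage.assign(parent_ne0 * ne1, 0.0f);
    return View2D{ne0, ne1, parent_ne0, 0};
}
```

A test case could then call one helper per input tensor instead of carrying a separate flag for each op.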

ggerganov · 2025-02-04T16:03:04Z

@JohannesGaessler Apply this patch to fix the Metal build:

diff --git a/ggml/src/ggml-metal/ggml-metal.m b/ggml/src/ggml-metal/ggml-metal.m
index 3ae4bbdd1..0a264be37 100644
--- a/ggml/src/ggml-metal/ggml-metal.m
+++ b/ggml/src/ggml-metal/ggml-metal.m
@@ -1206,11 +1206,11 @@ static bool ggml_metal_supports_op(const struct ggml_backend_metal_device_contex
         case GGML_OP_GROUP_NORM:
             return has_simdgroup_reduction;
         case GGML_OP_RMS_NORM:
-            return has_simdgroup_reduction && (op->ne[0] % 4 == 0 && ggml_is_contiguous_1(src0);
+            return has_simdgroup_reduction && (op->ne[0] % 4 == 0 && ggml_is_contiguous_1(op->src[0]));
         case GGML_OP_ARGMAX:
             return true;
         case GGML_OP_NORM:
-            return has_simdgroup_reduction && (op->ne[0] % 4 == 0 && ggml_is_contiguous_1(src0);
+            return has_simdgroup_reduction && ggml_is_contiguous(op->src[0]);
         case GGML_OP_ROPE:
             {
                 const int mode = ((const int32_t *) op->op_params)[2];
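The patch gates the two ops on different contiguity checks of `op->src[0]`. As a rough, self-contained model of what such checks look at (not ggml's implementation), the sketch below tests whether the strides match a densely packed layout, optionally allowing dimensions 1..n to carry an arbitrary stride. `Layout` and `is_contiguous_n_sketch` are hypothetical names, strides are in elements rather than bytes, and reading `ggml_is_contiguous` as the n = 0 case and `ggml_is_contiguous_1` as the n = 1 case is an assumption.

```cpp
#include <array>
#include <cstddef>

// Simplified tensor layout: shape and strides (in elements), innermost dimension first.
struct Layout {
    std::array<size_t, 4> ne;  // number of elements per dimension
    std::array<size_t, 4> nb;  // stride per dimension, in elements
};

// Returns true when every dimension above `n` is densely packed on top of the
// dimension below it. Dimensions 1..n may have any stride (for example padded rows).
// Dimension 0 must always be unit-stride unless it holds a single element.
bool is_contiguous_n_sketch(const Layout & t, int n) {
    size_t next = 1;
    if (t.ne[0] != 1 && t.nb[0] != next) {
        return false;
    }
    next *= t.ne[0];
    for (int i = 1; i < 4; ++i) {
        if (t.ne[i] == 1) {
            continue;  // a size-1 dimension places no constraint on the stride
        }
        if (i > n) {
            if (t.nb[i] != next) {
                return false;
            }
            next *= t.ne[i];
        } else {
            // This dimension is allowed to be non-contiguous; continue from its real extent.
            next = t.ne[i] * t.nb[i];
        }
    }
    return true;
}
```

Under this model, a tensor whose rows are padded fails the n = 0 check but passes the n = 1 check, which matches the looser condition the patch uses for `GGML_OP_RMS_NORM` compared with `GGML_OP_NORM`.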

ggerganov reviewed Feb 4, 2025

ggml/src/ggml-cuda/norm.cu (outdated, resolved)

github-actions bot added labels Feb 4, 2025: testing (Everything test related), Nvidia GPU (Issues specific to Nvidia GPUs), ggml (changes relating to the ggml tensor library for machine learning)

slaren approved these changes Feb 4, 2025

JohannesGaessler and others added 3 commits February 4, 2025 16:53

CUDA: non-contiguous (RMS) norm support

386c52c

Update ggml/src/ggml-cuda/norm.cu

af7fead

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>

try CI fix

57d170f

JohannesGaessler force-pushed the cuda-noncont-norm branch from 8252615 to 57d170f on February 4, 2025 15:53

JohannesGaessler added 2 commits February 4, 2025 17:03

try CI fix

a0402ad

try CI fix

8ef9a5a

github-actions bot added labels Feb 4, 2025: Vulkan (Issues specific to the Vulkan backend), Apple Metal (https://en.wikipedia.org/wiki/Metal_(API))

JohannesGaessler merged commit fd08255 into ggerganov:master Feb 4, 2025
46 checks passed

qnixsynapse mentioned this pull request Feb 5, 2025

SYCL: Kernel function refactor #11515

Closed

5 tasks

ggerganov added a commit that referenced this pull request Feb 5, 2025

metal : adjust support conditions for norm operators

b004e0b

cont #11659 ggml-ci

ggerganov mentioned this pull request Feb 5, 2025

metal : adjust support conditions for norm operators #11671

Merged

ggerganov added a commit that referenced this pull request Feb 5, 2025

metal : adjust support conditions for norm operators (#11671)

d774ab3

cont #11659 ggml-ci

qnixsynapse mentioned this pull request Feb 5, 2025

SYCL: Adjust support condition for norm operators #11674

Merged

ikawrakow mentioned this pull request Feb 6, 2025

cuda: non-contiguous rms norm ikawrakow/ik_llama.cpp#190

Merged
