
Skip mm_mul kernel functions additions if on Intel #1294

Conversation

nchudleigh (Contributor)

A temporary fix for the issue detailed in #1292.

jhen0409 (Contributor)

It's also related to ggerganov/llama.cpp#2965; there may be more related issues in llama.cpp.

bobqianic previously approved these changes Sep 18, 2023

bobqianic (Collaborator)

It looks good, but this is making encoding really slow.

nchudleigh (Contributor, Author)

Otherwise, it doesn't seem to work on Intel at all. I wonder if it's possible to write patches for the kernels.

nchudleigh (Contributor, Author)

@ggerganov @bobqianic I don't feel confident in this merge; it feels like the fix should be more elegant and actually fix the issue. What do you think?

bobqianic (Collaborator)

> I don't feel confident in this merge; it feels like the fix should be more elegant and actually fix the issue.

Agreed.

bobqianic dismissed their stale review September 27, 2023 12:52

> the fix should be more elegant / actually fix the issue

nchudleigh (Contributor, Author)

Going to close given the above conversation.

nchudleigh closed this Oct 2, 2023
ggerganov (Owner)

I'm not sure what the best approach is here.

@nchudleigh Does using Metal provide a significant speed-up compared to running on the CPU on Intel Macs?
I am considering disabling Metal altogether for Intel Macs if it does not offer a significant benefit.
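
For reference, a minimal sketch of what disabling Metal on Intel Macs could look like, assuming the check lives at Metal context initialization (e.g. in ggml_metal_init); the early NULL return and the log message are illustrative assumptions, not code from this PR:

    #import <Metal/Metal.h>
    #include <stdio.h>

    // inside the Metal context init: bail out on GPUs that lack the Apple7
    // feature set (i.e. the Intel/AMD GPUs in Intel Macs), so the caller
    // falls back to the CPU path instead
    id<MTLDevice> device = MTLCreateSystemDefaultDevice();
    if (![device supportsFamily:MTLGPUFamilyApple7]) {
        fprintf(stderr, "%s: GPU does not support MTLGPUFamilyApple7, falling back to CPU\n", __func__);
        return NULL;
    }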

nchudleigh (Contributor, Author)

@ggerganov I missed this entirely!

Hard to say whether it does or not... because the functions are missing, I can't tell whether the speed improves. It probably depends on the GPU that is available.

bobqianic reopened this Oct 26, 2023
bobqianic (Collaborator)

I've reopened this PR after noticing that @ggerganov addressed a similar issue in #3524. Instead of introducing a new kernel, he chose to disable the buggy kernels on older devices as the fix. We have two choices moving forward:

  1. Wait for the next ggml sync.
  2. Merge this PR into master directly, applying the same guard (shown below).

        // the mul_mm kernels use simdgroup matrix operations, which are only
        // available on the Apple7 GPU family (Apple silicon), so skip them
        // on the Intel/AMD GPUs found in Intel Macs
        if ([ctx->device supportsFamily:MTLGPUFamilyApple7]) {
            GGML_METAL_ADD_KERNEL(mul_mm_f32_f32);
            GGML_METAL_ADD_KERNEL(mul_mm_f16_f32);
            GGML_METAL_ADD_KERNEL(mul_mm_q4_0_f32);
            GGML_METAL_ADD_KERNEL(mul_mm_q8_0_f32);
            GGML_METAL_ADD_KERNEL(mul_mm_q4_1_f32);
            GGML_METAL_ADD_KERNEL(mul_mm_q2_K_f32);
            GGML_METAL_ADD_KERNEL(mul_mm_q3_K_f32);
            GGML_METAL_ADD_KERNEL(mul_mm_q4_K_f32);
            GGML_METAL_ADD_KERNEL(mul_mm_q5_K_f32);
            GGML_METAL_ADD_KERNEL(mul_mm_q6_K_f32);
        }
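
Note that skipping the kernel additions is only half the picture: the graph-compute dispatch needs a matching guard so that matrix multiplications fall back to the matrix-vector kernels when mul_mm is unavailable. A rough sketch of the shape of that check (the additional tensor-shape conditions used upstream are omitted here, and the exact form in #3524 may differ):

        // matching guard on the dispatch side: non-Apple7 GPUs must take the
        // matrix-vector path instead of the skipped mul_mm kernels
        if ([ctx->device supportsFamily:MTLGPUFamilyApple7] /* && shape checks */) {
            // encode one of the mul_mm_* kernels registered above
        } else {
            // encode the corresponding matrix-vector kernel as the fallback
        }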

z11h commented Nov 2, 2023

@ggerganov any idea when the next GGML sync will be? thank you :)

ggerganov (Owner)

> @ggerganov any idea when the next GGML sync will be? thank you :)

Hopefully today or tomorrow.

ggerganov (Owner)

Yup, let's try to fix this as part of #1422

ggerganov closed this Nov 3, 2023