metal : fix build errors and rope kernel sig after #2268 (#3898)

Merged
merged 1 commit into from
Nov 2, 2023


ggerganov
Member

@ggerganov ggerganov commented Nov 2, 2023

Not sure how this even compiled for other people. On M2 Ultra there were quite a few errors in the MSL code:

ggml_metal_init: allocating
ggml_metal_init: found device: Apple M2 Ultra
ggml_metal_init: picking default device: Apple M2 Ultra
ggml_metal_init: default.metallib not found, loading from source
ggml_metal_init: loading '/Users/ggerganov/development/github/llama.cpp/ggml-metal.metal'
ggml_metal_init: error: Error Domain=MTLLibraryErrorDomain Code=3 "program_source:1073:11: error: pointer type must have explicit address space qualifier
    float * cos_theta, float * sin_theta
          ^
program_source:1073:30: error: pointer type must have explicit address space qualifier
    float * cos_theta, float * sin_theta
                             ^
program_source:1079:9: error: use of undeclared identifier 'ramp_mix'
        ramp_mix = rope_yarn_ramp(corr_dims[0], corr_dims[1], i0) * ext_factor;
        ^
program_source:1080:37: error: use of undeclared identifier 'ramp_mix'
        theta = theta_interp * (1 - ramp_mix) + theta_extrap * ramp_mix;
                                    ^
program_source:1080:64: error: use of undeclared identifier 'ramp_mix'
        theta = theta_interp * (1 - ramp_mix) + theta_extrap * ramp_mix;
                                                               ^
program_source:1083:33: error: use of undeclared identifier 'logf'
        mscale *= 1.0f + 0.1f * logf(1.0f / freq_scale);
                                ^
program_source:1085:18: error: use of undeclared identifier 'cosf'
    *cos_theta = cosf(theta) * mscale;
                 ^
program_source:1086:18: error: use of undeclared identifier 'sinf'
    *sin_theta = sinf(theta) * mscale;
                 ^
program_source:1172:33: error: use of undeclared identifier 'n_orig_ctx'
    rope_yarn_corr_dims(n_dims, n_orig_ctx, freq_base, beta_fast, beta_slow, corr_dims);
                                ^
program_source:1223:57: error: explicit instantiation of 'kernel_rope' does not refer to a function template, variable template, member function, member class, or static data member
template [[host_name("kernel_rope_f32")]] kernel rope_t kernel_rope<float>;
                                                        ^
program_source:1133:13: note: candidate template ignored: could not match 'void (const device void *, const device int32_t *, device float *, const constant int64_t &, const constant int64_t &, const constant int64_t &, const constant int64_t &, const constant uint64_t &, const constant uint64_t &, const constant uint64_t &, const constant uint64_t &, const constant int64_t &, const constant int64_t &, const constant int64_t &, const constant int64_t &, const constant uint64_t &, const constant uint64_t &, const constant uint64_t &, const constant uint64_t &, const constant int &, const constant int &, const constant int &, const constant float &, const constant float &, const constant float &, const constant float &, const constant float &, const constant float &, uint, uint3, uint3)' (aka 'void (const device void *, const device int *, device float *, const constant long &, const constant long &, const constant long &, const constant long &, const constant unsigned long &, const constant unsigned long &, const constant unsigned long &, const constant unsigned long &, const constant long &, const constant long &, const constant long &, const constant long &, const constant unsigned long &, const constant unsigned long &, const constant unsigned long &, const constant unsigned long &, const constant int &, const constant int &, const constant int &, const constant float &, const constant float &, const constant float &, const constant float &, const constant float &, const constant float &, unsigned int, uint3, uint3)') against 'void (const device void *, const device int32_t *, device float *, const constant int64_t &, const constant int64_t &, const constant int64_t &, const constant int64_t &, const constant uint64_t &, const constant uint64_t &, const constant uint64_t &, const constant uint64_t &, const constant int64_t &, const constant int64_t &, const constant int64_t &, const constant int64_t &, const constant uint64_t &, const constant uint64_t &, const 
constant uint64_t &, const constant uint64_t &, const constant int &, const constant int &, const constant int &, const constant float &, const constant float &, uint, uint3, uint3)' (aka 'void (const device void *, const device int *, device float *, const constant long &, const constant long &, const constant long &, const constant long &, const constant unsigned long &, const constant unsigned long &, const constant unsigned long &, const constant unsigned long &, const constant long &, const constant long &, const constant long &, const constant long &, const constant unsigned long &, const constant unsigned long &, const constant unsigned long &, const constant unsigned long &, const constant int &, const constant int &, const constant int &, const constant float &, const constant float &, unsigned int, uint3, uint3)')
kernel void kernel_rope(
            ^
program_source:1224:57: error: explicit instantiation of 'kernel_rope' does not refer to a function template, variable template, member function, member class, or static data member
template [[host_name("kernel_rope_f16")]] kernel rope_t kernel_rope<half>;
                                                        ^
program_source:1133:13: note: candidate template ignored: could not match 'void (const device void *, const device int32_t *, device float *, const constant int64_t &, const constant int64_t &, const constant int64_t &, const constant int64_t &, const constant uint64_t &, const constant uint64_t &, const constant uint64_t &, const constant uint64_t &, const constant int64_t &, const constant int64_t &, const constant int64_t &, const constant int64_t &, const constant uint64_t &, const constant uint64_t &, const constant uint64_t &, const constant uint64_t &, const constant int &, const constant int &, const constant int &, const constant float &, const constant float &, const constant float &, const constant float &, const constant float &, const constant float &, uint, uint3, uint3)' (aka 'void (const device void *, const device int *, device float *, const constant long &, const constant long &, const constant long &, const constant long &, const constant unsigned long &, const constant unsigned long &, const constant unsigned long &, const constant unsigned long &, const constant long &, const constant long &, const constant long &, const constant long &, const constant unsigned long &, const constant unsigned long &, const constant unsigned long &, const constant unsigned long &, const constant int &, const constant int &, const constant int &, const constant float &, const constant float &, const constant float &, const constant float &, const constant float &, const constant float &, unsigned int, uint3, uint3)') against 'void (const device void *, const device int32_t *, device float *, const constant int64_t &, const constant int64_t &, const constant int64_t &, const constant int64_t &, const constant uint64_t &, const constant uint64_t &, const constant uint64_t &, const constant uint64_t &, const constant int64_t &, const constant int64_t &, const constant int64_t &, const constant int64_t &, const constant uint64_t &, const constant uint64_t &, const 
constant uint64_t &, const constant uint64_t &, const constant int &, const constant int &, const constant int &, const constant float &, const constant float &, uint, uint3, uint3)' (aka 'void (const device void *, const device int *, device float *, const constant long &, const constant long &, const constant long &, const constant long &, const constant unsigned long &, const constant unsigned long &, const constant unsigned long &, const constant unsigned long &, const constant long &, const constant long &, const constant long &, const constant long &, const constant unsigned long &, const constant unsigned long &, const constant unsigned long &, const constant unsigned long &, const constant int &, const constant int &, const constant int &, const constant float &, const constant float &, unsigned int, uint3, uint3)')
kernel void kernel_rope(
            ^
" UserInfo={NSLocalizedDescription=program_source:1073:11: error: pointer type must have explicit address space qualifier
[... remaining diagnostics identical to the first copy of the message above ...]
}
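
As a reading aid for the diagnostics above, here is a minimal host-side C++ sketch of the three kinds of problem the log reports. It is not the Metal source and not the actual patch. The statements quoted in the log are reproduced; everything else (the parameter order, the `if` guard, the body of `rope_yarn_ramp`, and the `kernel_rope_like` stand-in with its two typedefs) is an assumption made for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstdint>
#include <type_traits>

// Assumed helper: the log only shows the call site, so this body is a guess
// at a linear ramp clamped to [0, 1].
inline float rope_yarn_ramp(float low, float high, int64_t i0) {
    const float y = (float(i0 / 2) - low) / std::max(0.001f, high - low);
    return 1.0f - std::min(1.0f, std::max(0.0f, y));
}

// Host-side analogue of the helper reported at program_source:1073-1086.
// In MSL the two output pointers would also need an explicit address space
// (for example `thread float *`), which plain C++ has no equivalent for.
inline void rope_yarn(float theta_extrap, float freq_scale, const float corr_dims[2],
                      int64_t i0, float ext_factor, float mscale,
                      float * cos_theta, float * sin_theta) {
    assert(freq_scale > 0.0f);
    const float theta_interp = freq_scale * theta_extrap;  // assumed definition
    float theta = theta_interp;
    if (ext_factor != 0.0f) {
        // `ramp_mix` must be declared before use; the log flags it as undeclared.
        const float ramp_mix = rope_yarn_ramp(corr_dims[0], corr_dims[1], i0) * ext_factor;
        theta = theta_interp * (1 - ramp_mix) + theta_extrap * ramp_mix;
        // Overloaded log/cos/sin instead of the C-style logf/cosf/sinf,
        // which the log reports as undeclared in the Metal compile.
        mscale *= 1.0f + 0.1f * std::log(1.0f / freq_scale);
    }
    *cos_theta = std::cos(theta) * mscale;
    *sin_theta = std::sin(theta) * mscale;
}

// Hypothetical, heavily shortened stand-in for the kernel signature.
template <typename T>
void kernel_rope_like(const void * src, float * dst,
                      const float & freq_scale, const float & ext_factor);

// A typedef used to name the instantiation has to match the template's
// parameter list exactly. In the log, one signature lists six float
// references and the other only two.
typedef void rope_t(const void *, float *, const float &, const float &);
typedef void rope_stale_t(const void *, float *, const float &);

static_assert(std::is_same<decltype(&kernel_rope_like<float>), rope_t *>::value,
              "typedef matches the template signature");
static_assert(!std::is_same<decltype(&kernel_rope_like<float>), rope_stale_t *>::value,
              "a typedef with fewer parameters does not match");
```

With `ext_factor == 0` the function reduces to plain interpolation, so `rope_yarn(0.5f, 0.5f, dims, 0, 0.0f, 2.0f, &c, &s)` yields `c = 2 * cos(0.25)` and `s = 2 * sin(0.25)`. The `static_assert` pair is a compile-time way to notice the kind of typedef/template drift that the "candidate template ignored" notes describe.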

@ggerganov ggerganov merged commit 183b3fa into master Nov 2, 2023
@ggerganov ggerganov deleted the fix-metal-after-yarn branch November 2, 2023 06:33
@TortoiseHam
Contributor

I'm also getting an error when I try to quantize the llama2 model now, although it was working with an older version of the code base:

[ 1/ 723] token_embd.weight - [ 8192, 32000, 1, 1], type = f16, quantizing to q4_K .. zsh: illegal hardware instruction

@ggerganov
Member Author

ggerganov commented Nov 2, 2023

Have you recently upgraded to Sonoma?

Ever since I upgraded, K-quants are broken for me like this. This crash only occurs in Release (-O3) builds. Debug and -O2 work fine. Adding a print to debug this makes the issue disappear. So I have no idea how to fix it atm

My theory is something is wrong with the compiler.
If you can show me a commit where it works, I'll take a look. But atm I don't think this is a llama.cpp-related problem.
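
One generic way to narrow down a crash that shows up only at `-O3` is to switch optimization off for one suspect function at a time and see whether the crash follows. The sketch below assumes Clang's `optnone` and GCC's `optimize("O0")` function attributes; `NO_OPT` and `toy_quantize_row` are hypothetical names, and the routine is a toy stand-in, not the K-quants code. Nothing in this thread establishes that this would isolate the Sonoma crash.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>

// Per-function opt-out of optimization. Clang is checked first because it
// also defines __GNUC__.
#if defined(__clang__)
#  define NO_OPT __attribute__((optnone))
#elif defined(__GNUC__)
#  define NO_OPT __attribute__((optimize("O0")))
#else
#  define NO_OPT
#endif

// Hypothetical stand-in for a hot quantization routine: scales a row of
// floats into int8 using the largest absolute value and returns the scale.
NO_OPT float toy_quantize_row(const float * x, int8_t * q, size_t n) {
    assert(x != nullptr && q != nullptr);
    float amax = 0.0f;
    for (size_t i = 0; i < n; ++i) {
        amax = std::fmax(amax, std::fabs(x[i]));
    }
    const float scale = amax > 0.0f ? amax / 127.0f : 1.0f;
    for (size_t i = 0; i < n; ++i) {
        q[i] = (int8_t) std::lround(x[i] / scale);
    }
    return scale;
}
```

If the crash disappears once a particular function carries `NO_OPT` in an otherwise `-O3` build, that function is a reasonable place to start looking, whether the cause turns out to be a compiler issue or undefined behavior in the code.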

@pgeiss

pgeiss commented Nov 2, 2023

Thanks for this PR. I cloned this project for the first time recently and ran into this issue. I thought I must have done something wrong when building! I pulled again, ran `make clean; make`, and it works perfectly now.

@ggerganov
Member Author

Yup, that's how we do it here - we test in production 😆

@Synchro

Synchro commented Nov 2, 2023

I ran into the same thing yesterday on M1 Max running macOS 13.6, and can confirm that this is fixed here too.

@cebtenzzre
Collaborator

When I originally wrote this code, I had to ask a friend of a friend for remote access to his (Intel) Mac so I could verify that I even got the syntax correct.

My company got me an M2 Macbook, so I should be able to write better Metal code in the future.

Sorry for all the breakage 😅

@ggerganov
Member Author

> My company got me an M2 Macbook

Eh, should've gone with the new M3 MacBook :)

@TortoiseHam
Contributor

> Have you recently upgraded to Sonoma?
>
> Ever since I upgraded, K-quants are broken for me like this. This crash only occurs in Release (-O3) builds. Debug and -O2 work fine. Adding a print to debug this makes the issue disappear. So I have no idea how to fix it atm
>
> My theory is something is wrong with the compiler.
>
> If you can show me a commit where it works, I'll take a look. But atm I don't think this is a llama.cpp-related problem.

Ah, yeah, I did just upgrade recently. If it works with a print message put in then maybe that is the solution... 😜

olexiyb pushed a commit to Sanctum-AI/llama.cpp that referenced this pull request Nov 23, 2023