whisper : add context param for disable gpu #1293
It seems that this modification could introduce some backward compatibility issues, which would necessitate refactoring many of the APIs. I recall the last time I attempted to introduce a
We could add new methods and mark the old ones as deprecated, which would retain some flexibility for users. (Commenting from my phone misbehaved and I had to delete my previous comment, sorry for the double notifications.)
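To make the "new methods plus deprecated old methods" idea concrete, here is a minimal sketch in plain C. Every name in it (DEMO_DEPRECATED, demo_params, demo_context, demo_init_from_file_with_params, demo_init_from_file) is hypothetical and only illustrates the pattern; it is not the code from this PR.

```c
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>

// Hypothetical deprecation macro: callers of the old function get a
// compiler warning with a hint, but their code keeps building.
#if defined(__GNUC__)
#    define DEMO_DEPRECATED(func, hint) func __attribute__((deprecated(hint)))
#elif defined(_MSC_VER)
#    define DEMO_DEPRECATED(func, hint) __declspec(deprecated(hint)) func
#else
#    define DEMO_DEPRECATED(func, hint) func
#endif

// Hypothetical parameter struct; new options can be added here later
// without changing any function signature again.
struct demo_params {
    bool use_gpu;
};

// Hypothetical context that just records what it was created with.
struct demo_context {
    bool use_gpu;
    char path[256];
};

struct demo_params demo_default_params(void);
struct demo_context * demo_init_from_file_with_params(const char * path, struct demo_params params);
DEMO_DEPRECATED(
    struct demo_context * demo_init_from_file(const char * path),
    "use demo_init_from_file_with_params instead");
void demo_free(struct demo_context * ctx);

struct demo_params demo_default_params(void) {
    struct demo_params p = { .use_gpu = true };
    return p;
}

// New entry point: all options travel in the params struct.
struct demo_context * demo_init_from_file_with_params(const char * path, struct demo_params params) {
    if (path == NULL) {
        return NULL;
    }
    struct demo_context * ctx = calloc(1, sizeof(*ctx));
    if (ctx == NULL) {
        return NULL;
    }
    ctx->use_gpu = params.use_gpu;
    snprintf(ctx->path, sizeof(ctx->path), "%s", path);
    return ctx;
}

// Old entry point: kept for compatibility, forwards to the new one
// with the default parameters.
struct demo_context * demo_init_from_file(const char * path) {
    return demo_init_from_file_with_params(path, demo_default_params());
}

void demo_free(struct demo_context * ctx) {
    free(ctx);
}
```

With this shape, existing callers keep compiling and only see a deprecation warning, while new callers fill in a demo_params value and pass it to the _with_params variant.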
@ggerganov ping
Will take a look this week - been travelling for a few days
@jhen0409 Yes, this is a good idea
I'm looking for a way to disable the CUDA backend (and OpenCL) via the param. Should we just check I found all tensors are not setup backend as
ggml-metal.m
Outdated
#ifdef GGML_SWIFT
    bundle = SWIFTPM_MODULE_BUNDLE;
#else
    UNUSED(msl_library_source);
    bundle = [NSBundle bundleForClass:[GGMLMetalClass class]];
#endif

    // read the source from "ggml-metal.metal" into a string and use newLibraryWithSource
    {
        NSError * error = nil;
        NSString * libPath = [bundle pathForResource:@"default" ofType:@"metallib"];
        if (libPath != nil) {
            NSURL * libURL = [NSURL fileURLWithPath:libPath];
            metal_printf("%s: loading '%s'\n", __func__, [libPath UTF8String]);
            ctx->library = [ctx->device newLibraryWithURL:libURL error:&error];
        } else {
            metal_printf("%s: default.metallib not found, loading from source\n", __func__);

            NSString * sourcePath = [bundle pathForResource:@"ggml-metal" ofType:@"metal"];
            metal_printf("%s: loading '%s'\n", __func__, [sourcePath UTF8String]);
            NSString * src = [NSString stringWithContentsOfFile:sourcePath encoding:NSUTF8StringEncoding error:&error];
            if (error) {
                metal_printf("%s: error: %s\n", __func__, [[error description] UTF8String]);
                return NULL;
            }
I enabled Metal to test whisper.swiftui, and the project needs to load the compiled default.metallib, so I made this change. @bachittle I think this should also help with ggerganov/llama.cpp#3284.
For GGML_SWIFT I use SWIFTPM_MODULE_BUNDLE instead and reuse the default.metallib loading code. I think it should also work in the llama.cpp Swift package (needs some tests later).
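The lookup order described here (prefer the precompiled default.metallib, otherwise fall back to the ggml-metal.metal source) can be sketched in plain C as follows. This is only an illustration of the fallback logic under the assumption that both files would sit in one directory; demo_lib_kind, demo_file_exists and demo_resolve_library are hypothetical names, and the real code goes through NSBundle as shown in the hunk above.

```c
#include <stddef.h>
#include <stdio.h>

// Hypothetical result type: which kind of library file was found.
typedef enum {
    DEMO_LIB_PRECOMPILED, // default.metallib exists, load it directly
    DEMO_LIB_FROM_SOURCE, // only ggml-metal.metal exists, compile at runtime
    DEMO_LIB_NOT_FOUND    // neither file is present
} demo_lib_kind;

// Returns 1 if the file can be opened for reading, 0 otherwise.
static int demo_file_exists(const char * path) {
    FILE * f = fopen(path, "rb");
    if (f == NULL) {
        return 0;
    }
    fclose(f);
    return 1;
}

// Writes the chosen path into out and reports which branch was taken.
// The precompiled library is always tried first.
demo_lib_kind demo_resolve_library(const char * dir, char * out, size_t out_size) {
    snprintf(out, out_size, "%s/default.metallib", dir);
    if (demo_file_exists(out)) {
        return DEMO_LIB_PRECOMPILED;
    }
    snprintf(out, out_size, "%s/ggml-metal.metal", dir);
    if (demo_file_exists(out)) {
        return DEMO_LIB_FROM_SOURCE;
    }
    if (out_size > 0) {
        out[0] = '\0';
    }
    return DEMO_LIB_NOT_FOUND;
}
```

A caller would pass the resource directory and a buffer, then branch on the returned kind to either load the binary library or compile the source.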
Aside from the GPU backend questions, I think everything else is ready for review.
Does this solve #1386?
I'm in a similar situation. I'm completely lost in the llama.cpp code and can't figure out how to offload the other operations to the GPU.
I will implement full GPU offloading in the following days.
@jhen0409 This should be good to merge, correct?
Yes, it should be ready.
* whisper : check state->ctx_metal not null
* whisper : add whisper_context_params { use_gpu }
* whisper : new API with params & deprecate old API
* examples : use no-gpu param && whisper_init_from_file_with_params
* whisper.objc : enable metal & disable on simulator
* whisper.swiftui, metal : enable metal & support load default.metallib
* whisper.android : use new API
* bindings : use new API
* addon.node : fix build & test
* bindings : updata java binding
* bindings : add missing whisper_context_default_params_by_ref WHISPER_API for java
* metal : use SWIFTPM_MODULE_BUNDLE for GGML_SWIFT and reuse library load
* metal : move bundle var into block
* metal : use SWIFT_PACKAGE instead of GGML_SWIFT
* style : minor updates
---------
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
Currently the Metal backend uses some SIMD operations, which are only supported on devices in the Apple7 GPU family and later (ref: Metal-Feature-Set-Tables.pdf). To allow older devices to run whisper.cpp normally, we can provide a param like use_gpu. I think it will be helpful for whisper.cpp bindings that enable Metal at build time but still need to support old devices, or that simply don't want to use GPU resources in some cases.
In this PR, I've added a new struct whisper_context_params { use_gpu = true } for that. I think it could also be used for params like use_mmap in the future.
@ggerganov please let me know whether you think this is a good idea. If so, I will update the other backends and all bindings.
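As a rough illustration of how a use_gpu flag can gate the backend at runtime, here is a small self-contained C sketch. The names demo_context_params, demo_context_default_params and demo_select_backend are hypothetical stand-ins, and the gpu_supported argument is an assumed placeholder for whatever device capability check a backend performs; none of this is the actual whisper.cpp implementation.

```c
#include <stdbool.h>

// Hypothetical stand-in for the context parameters described above.
struct demo_context_params {
    bool use_gpu; // when false, the GPU backend is never initialized
};

enum demo_backend {
    DEMO_BACKEND_CPU,
    DEMO_BACKEND_GPU
};

// Defaults mirror the description: the GPU is used unless the caller opts out.
struct demo_context_params demo_context_default_params(void) {
    struct demo_context_params p = { .use_gpu = true };
    return p;
}

// gpu_supported models a runtime capability check (for example, whether the
// device belongs to a recent enough GPU family); it is an assumption here.
enum demo_backend demo_select_backend(struct demo_context_params params, bool gpu_supported) {
    if (params.use_gpu && gpu_supported) {
        return DEMO_BACKEND_GPU;
    }
    // Either the caller disabled the GPU or the device cannot use it.
    return DEMO_BACKEND_CPU;
}
```

A binding built with the GPU backend compiled in could then set use_gpu to false on old devices and still run on the CPU path, which is the use case described above.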