
coreml medium.en model takes a very long time to run every time #937

Open
hirendra opened this issue May 19, 2023 · 7 comments

@hirendra
./main -m models/ggml-medium.en.bin -f samples/jfk.wav
...
...
whisper_init_state: loading Core ML model from 'models/ggml-medium.en-encoder.mlmodelc'
whisper_init_state: first run on a device may take a while ...

I ran the above command 3 times and these are the results.
whisper_print_timings: total time = 11890580.00 ms
whisper_print_timings: total time = 11944257.00 ms
whisper_print_timings: total time = 11783808.00 ms

I'm running this on an M1 Max with 64 GB of RAM, macOS Ventura 13.3.1 (a), and Python 3.10 via conda.

@hoonlight

hoonlight commented May 22, 2023

Try killing ANECompilerService. Then the model will be loaded immediately and the job will begin.

#773 (comment)

@janngobble

Killing ANECompilerService works. Is there something about the way it's invoked that makes it churn before realizing the model has already been generated?

@sam3d

sam3d commented Jul 11, 2023

As part of running the model inference, I have another script running in the background that waits for ANECompilerService to start and then kills it immediately. Kinda wild as far as solutions go. I wonder if there's some way to bypass running this service entirely. I've also seen some issues mention that a Swift(UI) caller can prevent this, so I wonder if this would be a valid bodge fix:

Calling client (cpp, nodejs, bash script, etc) --> Swift wrapper --> whisper.cpp CoreML
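For anyone who wants to try the same workaround, a minimal sketch of such a watchdog could look like the following. This is an assumption-laden illustration, not code from this thread: the function name, polling interval, and timeout are mine, and it assumes `pgrep`/`pkill` behave as on macOS/Linux and that killing ANECompilerService by name is acceptable on your machine.

```shell
#!/usr/bin/env bash
# Hypothetical watchdog: poll until a process with the given name appears,
# then kill it immediately. Defaults are arbitrary choices for illustration.
watch_and_kill() {
  local proc="${1:-ANECompilerService}" timeout="${2:-60}" waited=0
  while ! pgrep -x "$proc" >/dev/null 2>&1; do
    sleep 1
    waited=$((waited + 1))
    # Give up if the process never shows up within the timeout.
    [ "$waited" -ge "$timeout" ] && return 1
  done
  # Process appeared: terminate it by exact name.
  pkill -x "$proc"
}

# In practice you would run it alongside whisper.cpp, e.g.:
#   watch_and_kill ANECompilerService 120 &
#   ./main -m models/ggml-medium.en.bin -f samples/jfk.wav
```

Whether this is safe in all cases is exactly the open question in this thread: if it really is the first run for a given model, killing the compiler presumably prevents the ANE-optimized version from being cached.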

@janngobble

> As part of running the model inference, I have another script running in the background that waits for ANECompilerService to start and then kills it immediately. Kinda wild as far as solutions go. I wonder if there's some way to bypass running this service entirely. I've also seen some issues mention that a Swift(UI) caller can prevent this, so I wonder if this would be a valid bodge fix:
>
> Calling client (cpp, nodejs, bash script, etc) --> Swift wrapper --> whisper.cpp CoreML

Well, I packaged whisper.cpp and OpenAI Whisper inside a Perl script so I can call either on a per-file basis. So I'll try scanning for ANECompilerService and killing it inside that Perl wrapper… but how would we know if it needed to be called vs. killed - as in, it wasn't the first run for that language model?

Would running (for instance)

./models/generate-coreml-model.sh modelname

on each model once (and every time a new model was released) ensure we didn’t need to do the first-run compile? @ggerganov

Just wondering. Thanks!
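The warm-up being asked about could be sketched roughly like this. To be clear, this is only an illustration of the idea: the helper name and model list are made up, `generate-coreml-model.sh` is the script from the whisper.cpp repo, and whether regenerating actually avoids the ANE first-run compile is exactly the unanswered question above.

```shell
#!/usr/bin/env bash
# Hypothetical warm-up loop: run a conversion script once per model
# (e.g. after each new release). The generator path is passed in so the
# sketch stays generic.
precompile_models() {
  local gen="$1"; shift   # path to generate-coreml-model.sh
  local m
  for m in "$@"; do
    "$gen" "$m" || return 1   # stop on the first failure
  done
}

# Usage (from a whisper.cpp checkout):
#   precompile_models ./models/generate-coreml-model.sh base.en small.en medium.en
```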

@sam3d

sam3d commented Jul 11, 2023

> but how would we know if it needed to be called vs. killed - as in, it wasn't the first run for that language model?

I thought about this too, but I couldn't reproduce it locally because I don't know where the model cache is - so I can't delete it and test. Presumably it doesn't just modify the file in place?

@janngobble

>> but how would we know if it needed to be called vs. killed - as in, it wasn't the first run for that language model?
>
> I thought about this too, but I couldn't reproduce it locally because I don't know where the model cache is - so I can't delete it and test.

This (plus the hallucinations on long files, which force a re-run) totally negates any benefit of using Core ML over the normal non-Core-ML builds of whisper.cpp - until it is addressed.

@Sponge-bink

@janngobble I see this behavior from non-CoreML builds of whisper.cpp too…
