metal : reusing llama.cpp logging #3152
Conversation
The cmake build is failing on macOS:
Ok, sorry I missed that. Fixed it.
The change in this PR is not OK because it couples ggml-metal with llama.cpp. I didn't write a detailed explanation in the TODO, but what I meant is to implement a way to pass a log callback and have ggml-metal use it.
Ok! I see. I can try to do that, and decouple the two again.
I decoupled it by introducing a log function setter in ggml-metal that is called from llama.cpp and points to its internal logger. I still need to include llama.h to get the enum definition. Is this OK?
Fixed the trailing whitespace editor config error.
I resolved a conflict that had appeared.
I decoupled it more now. Since the log level is not really used yet, I took the liberty of moving that into a definition in ggml.h instead. My reasoning was that llama.cpp already depends on ggml.h, so in order to synchronize the two with a callback I needed a function signature that could be passed in both the metal and llama.cpp translation units. Maybe it would be easier to work with ints and macros, but the idea of an enum also makes sense.
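For context, a minimal sketch of what such a shared definition in ggml.h could look like; the enumerator names and values below are illustrative assumptions, not necessarily the merged code:

```c
// ggml.h (sketch): a log-level enum visible to both the llama.cpp
// and ggml-metal translation units, so a single callback signature
// can be shared. Names and values are assumptions for illustration.
enum ggml_log_level {
    GGML_LOG_LEVEL_ERROR = 2,
    GGML_LOG_LEVEL_WARN  = 3,
    GGML_LOG_LEVEL_INFO  = 4,
};
```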
> I decoupled it more now.
Yes - this is the way :) I think it's OK now - maybe move the llama_log_callback typedef to ggml.h so we can reuse it:
typedef void (*ggml_log_callback)(enum ggml_log_level level, const char * text, void * user_data);
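For illustration, here is a minimal sketch of how this typedef might be wired up; the setter name `ggml_metal_log_set_callback` and the default logger below are assumptions, not the exact merged API:

```c
#include <stdio.h>

// ggml-metal.h (sketch): store a callback for the Metal backend to
// invoke instead of printing directly. The name is an assumption.
void ggml_metal_log_set_callback(ggml_log_callback log_callback, void * user_data);

// llama.cpp side (sketch): a logger matching the shared typedef.
static void llama_log_default(enum ggml_log_level level, const char * text, void * user_data) {
    (void) level;
    (void) user_data;
    fputs(text, stderr);
}

// During backend initialization, llama.cpp would register its logger:
//     ggml_metal_log_set_callback(llama_log_default, NULL);
```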
I added the typedef for log callbacks in ggml.h now :)
Will merge this some time next week - don't worry, I won't forget :)
…example

Commits merged from 'master' of github.com:ggerganov/llama.cpp:
- convert : remove bug in convert.py permute function (ggerganov#3364)
- make-ggml.py : compatibility with more models and GGUF (ggerganov#3290)
- gguf : fix a few general keys (ggerganov#3341)
- metal : reusing llama.cpp logging (ggerganov#3152)
- build : add ACCELERATE_NEW_LAPACK to fix warning on macOS Sonoma (ggerganov#3342)
- readme : add some recent perplexity and bpw measurements to READMES, link for k-quants (ggerganov#3340)
- cmake : fix build-info.h on MSVC (ggerganov#3309)
- docs : Fix typo CLBlast_DIR var. (ggerganov#3330)
- nix : add cuda, use a symlinked toolkit for cmake (ggerganov#3202)
I wanted to silence some of the outputs and found out that, on my Mac, some came from llama.cpp and some from metal.m. I saw the TODO and thought I might chime in here with this. Perhaps we should also respect the log level, and allow setting the verbosity via command-line arguments? Let me know what you think.
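As a rough sketch of the verbosity idea: the registered callback could filter by level before printing. The threshold variable and its wiring to a command-line flag are hypothetical.

```c
#include <stdio.h>

// Hypothetical threshold, e.g. set from a --verbosity command-line
// argument at startup (names are assumptions for illustration).
static enum ggml_log_level g_log_threshold = GGML_LOG_LEVEL_WARN;

// Callback that drops messages less severe than the threshold,
// assuming lower enum values mean higher severity (as sketched above).
static void log_filtered(enum ggml_log_level level, const char * text, void * user_data) {
    (void) user_data;
    if (level <= g_log_threshold) {
        fputs(text, stderr);
    }
}
```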