add: ggml graph encoding format #66
Conversation
This commit adds the `ggml` graph encoding format to the `graph_encoding` enum. The motivation is to allow the `wasi-nn` interface to support models encoded in the `ggml` format, which is the model format used by llama.cpp.

Signed-off-by: Daniel Bevenius <daniel.bevenius@gmail.com>
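For context, the change itself is small: one new case in the encoding enum. A minimal sketch of what the addition might look like in the wasi-nn WIT definition, assuming the enum already lists the existing encodings (the exact set of cases and the spelling, e.g. `graph-encoding` vs. `graph_encoding`, may differ from the spec):

```wit
enum graph-encoding {
    openvino,
    onnx,
    tensorflow,
    pytorch,
    tensorflowlite,
    ggml  // new: the model format used by llama.cpp
}
```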
(Force-pushed from 3683147 to 86d5da4.)
Thanks for the PR! I had been wondering when this new encoding would come up. Can you attend the ML working group on the 18th of this month (details)? It might be helpful for others interested in wasi-nn to understand the motivation here.
@abrown I'd be happy to attend if I can make the time slot 👍
@danbev, what wasi-nn backend would be used to load and run the `ggml`-encoded models?
WasmEdge has support for a llama.cpp backend. I have a very basic llama.cpp backend working for Wasmtime and would very much like to see support for such a backend in Wasmtime in the future. Having this encoding would hopefully help existing implementations and make creating new ones easier.
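To show where the new case plugs in: a guest supplies the encoding to the interface's `load` function alongside the serialized model bytes. A rough sketch of that signature, with names approximated from the wasi-nn WIT (the actual parameter, result, and error types may differ):

```wit
// load takes the serialized model (as one or more byte buffers), the
// declared encoding (e.g. ggml), and an execution target, and returns
// a graph handle on success.
load: func(builder: list<graph-builder>, encoding: graph-encoding, target: execution-target) -> result<graph, error>
```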
Yes, I'd be interested in seeing support for a llama.cpp backend in Wasmtime (and other wasm runtimes).
Ok, I think we've had this PR open for an adequate amount of time and discussed it in the ML meetings. Let's merge it!