Support survey for stage 3 #83
Comments
Yes, we are currently using wasi-nn in development and have built host implementations for wasmtime using llama.cpp and candle. These are rough but enable guest inference through wasi-nn, and they will be promoted to production in "the near future". However, the decision was made to extend wasi-nn to allow for streaming tensor results. An early, witx-based implementation in WasmEdge (https://github.com/second-state/WasmEdge-WASINN-examples/blob/master/wasmedge-ggml/llama-stream/src/main.rs#L65) uses a method that does not appear in the existing definition. There may be a workaround, but ultimately we introduced streaming WIT definitions for graphs and tensors. The goal is to validate this approach and bring it here for discussion once it is cleaned up.
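For concreteness, here is a minimal sketch of the streaming loop from the linked example, assuming the `wasmedge_wasi_nn` guest bindings. `compute_single` and `get_output_single` are the WasmEdge extensions with no counterpart in the current wasi-nn definition; the model alias "default" is whatever the host preloaded (e.g. via wasmedge's `--nn-preload`):

```rust
use std::io::Write;
use wasmedge_wasi_nn::{ExecutionTarget, GraphBuilder, GraphEncoding, TensorType};

fn main() -> Result<(), wasmedge_wasi_nn::Error> {
    // Load a GGML (llama.cpp) model the host registered under an alias.
    let graph = GraphBuilder::new(GraphEncoding::Ggml, ExecutionTarget::AUTO)
        .build_from_cache("default")?; // alias set by the host's preload
    let mut ctx = graph.init_execution_context()?;

    // Feed the prompt as a u8 tensor at input index 0.
    ctx.set_input(0, TensorType::U8, &[1], "Once upon a time".as_bytes())?;

    // Streaming loop: one token per compute_single() call, rather than a
    // single blocking compute(). This is the method the upstream
    // definition lacks.
    let mut output = vec![0u8; 4096];
    loop {
        if ctx.compute_single().is_err() {
            break; // e.g. end of sequence
        }
        let n = ctx.get_output_single(0, &mut output)?;
        print!("{}", String::from_utf8_lossy(&output[..n]));
        let _ = std::io::stdout().flush();
    }
    Ok(())
}
```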
Yes, our goal is to lean heavily on the wasi-nn spec and future versions. It is not yet clear to me how runtimes will be able to leverage a unified implementation (so far just guest code); perhaps by using the SIMD spec and compiling a component that exports an inference tool directly. Regardless, it would be great to see more host interoperability/portability between runtimes like WasmEdge, Wasmtime, etc., i.e. a wasi-nn component exporting the interface functions and guest code that imports wasi-nn for inference.
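As an illustration of that portability goal, here is a minimal sketch of guest code that imports wasi-nn for inference, written against the original witx-style `wasi-nn` Rust bindings. The OpenVINO encoding, tensor shape, and output length are illustrative assumptions; the same module should run on any runtime whose host implements the interface:

```rust
// Guest-side inference through the classic `wasi-nn` bindings. The host
// decides which backend (OpenVINO here, by assumption) actually runs it.
fn infer(xml: &[u8], weights: &[u8], input: &[u8]) -> Result<Vec<f32>, wasi_nn::NnErrno> {
    unsafe {
        let graph = wasi_nn::load(
            &[xml, weights],
            wasi_nn::GRAPH_ENCODING_OPENVINO,
            wasi_nn::EXECUTION_TARGET_CPU,
        )?;
        let ctx = wasi_nn::init_execution_context(graph)?;

        // Bind the input tensor (an NCHW image tensor is assumed here).
        wasi_nn::set_input(ctx, 0, wasi_nn::Tensor {
            dimensions: &[1, 3, 224, 224],
            type_: wasi_nn::TENSOR_TYPE_F32,
            data: input,
        })?;

        // One blocking compute, then copy the output tensor out.
        wasi_nn::compute(ctx)?;
        let mut out = vec![0f32; 1001]; // assumed classifier output length
        wasi_nn::get_output(
            ctx,
            0,
            out.as_mut_ptr() as *mut u8,
            (out.len() * std::mem::size_of::<f32>()) as u32,
        )?;
        Ok(out)
    }
}
```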
For our customers, we have already helped implement wasi-nn in wasmtime using onnxruntime as the backend, precisely to support wasi-nn in Azure Kubernetes Service, here. In addition, work is currently being done to release wasi-nn support using both wasmtime and wamr implementations in Azure AIO; that work should appear this semester. There are two other projects entering production with wasi-nn that I am not at liberty to discuss yet, but they should appear by the end of the calendar year.
Yes, we (WasmEdge) have been using wasi-nn, with some extensions, in production this year. One example is the Gaia project, which has already deployed over 200K nodes that use WasmEdge, wasi-nn, and the llama.cpp backend to provide AI applications for their customers. We are also adding support for multi-modal use cases, including vision models (Llama 3.2 Vision, Qwen2-VL), voice-to-text models (Whisper), and text-to-voice models (ChatTTS and more). The multi-modal showcases will be published in the near future.
Sure thing, we would like to support a unified wasi-nn specification. In particular, we are happy to work out a common solution across the different runtimes to ensure the same experience.
When we discussed a plan for moving wasi-nn to stage 3 in the WASI proposal process (August 2024), one point of feedback was a desire from the subgroup to collect a set of interested users who plan to use wasi-nn "in production." Though the term "production" was used loosely, it was clear that those asking for this wanted to identify a user group to maintain wasi-nn in the future. This issue intends to collect such a group.
We expect wasi-nn to have a more varied ecosystem than other WASI proposals: different host environments, different companies involved, a different user base. Since the proposal is a standardization effort across all of these, we want to make it clear to the WASI subgroup that those involved are working towards a common specification. To do so, please answer the following questions, providing any context you think is helpful: