OK, this is not a bug. I am running phi-mini-int4 through the usual onnxruntime C# API, and it is twice as slow as when I use the genai code. I am using the DirectML C# managed API, testing with sequence_length=1 on each iteration and with bound inputs and outputs. For the test I am basically just calling this in a loop without changing the input each time, and it is still not as fast as genai:
session.RunWithBinding(runOptions, binding);
So in that sense I can say well done for making genai so fast. 🙂
On the other hand, I wonder if you can share the settings or source code for things like sessionOptions and so on. GenAI is good but I really need to use the full capability of onnxruntime API. Since I believe GenAI is built on top of onnxruntime, it would be nice to be able to see the source code for this so I can make my app using onnxruntime API as fast as the GenAI code.
I am using the managed onnxruntime library from NuGet (version 1.19.1), and it is using the DirectML.dll that was installed with genai.
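The measurement described above can be sketched roughly as follows. This is a minimal sketch, not the exact code used: the class name `BindingBenchmark`, the helper `MeasureMsPerRun`, and the warm-up and iteration counts are assumptions for illustration, and the `InferenceSession` and the populated `OrtIoBinding` are assumed to be created elsewhere.

```csharp
using System.Diagnostics;
using Microsoft.ML.OnnxRuntime;

static class BindingBenchmark
{
    // Hypothetical helper: returns the average wall-clock milliseconds per
    // RunWithBinding call on a binding whose inputs and outputs are already bound.
    public static double MeasureMsPerRun(
        InferenceSession session,
        OrtIoBinding binding,
        int warmup = 5,        // assumed value, not from the original post
        int iterations = 100)  // assumed value, not from the original post
    {
        using var runOptions = new RunOptions();

        // Warm-up runs so one-time initialization cost is not counted.
        for (int i = 0; i < warmup; i++)
        {
            session.RunWithBinding(runOptions, binding);
        }
        // Wait for pending device work on the bound outputs before timing starts.
        binding.SynchronizeBoundOutputs();

        var stopwatch = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++)
        {
            session.RunWithBinding(runOptions, binding);
        }
        // Synchronize again so the timer covers the device work, not just the submission.
        binding.SynchronizeBoundOutputs();
        stopwatch.Stop();

        return stopwatch.Elapsed.TotalMilliseconds / iterations;
    }
}
```

Warming up and synchronizing before stopping the timer keeps the comparison with genai on the same footing, since otherwise the loop may measure only how quickly work is queued.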
Thanks for any help you can give.