Support AsyncPipeline for RESTful API #270
Comments
@toilaluan we actually have RESTful API capabilities, but we have not fully tested them (these pieces were brought over from MII-Legacy). I suspect enabling it will error out currently. I can bring this feature back to life later this week or early next week when I find time!
@mrwyattii I've tested it, but it currently serves requests sequentially. I hope you can get to this soon. Thanks for your work 🔥
Please see the example we have for enabling the REST API: https://github.com/microsoft/DeepSpeed-MII#restful-api You will need to install from source until we do our next release:
An OpenAI-compatible API would be much easier to use.
@dongxiaolong are you referring to being able to pass
Are you planning to support this feature?
I want to use FastGen in my app, but it does not currently support serving requests asynchronously through a RESTful API.
vLLM supports this very well: https://github.com/vllm-project/vllm/blob/main/vllm/entrypoints/api_server.py
I've also used Ray to deploy a server that uses MIIPipeline with dynamic batching, but the performance is far behind vLLM with its default settings.
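For context, the kind of dynamic batching I mean looks roughly like the sketch below. This is only a minimal asyncio illustration, not MII's or vLLM's implementation: `DynamicBatcher`, `fake_generate`, `max_batch_size`, and `max_wait_s` are hypothetical names, and `fake_generate` stands in for whatever batched generation call the pipeline exposes.

```python
import asyncio


class DynamicBatcher:
    """Groups concurrent requests into batches before calling the model."""

    def __init__(self, batch_fn, max_batch_size=8, max_wait_s=0.01):
        self.batch_fn = batch_fn              # hypothetical: list[str] -> list[str]
        self.max_batch_size = max_batch_size  # assumed tuning knob
        self.max_wait_s = max_wait_s          # assumed tuning knob
        self.queue = asyncio.Queue()

    async def submit(self, prompt):
        # Each request gets a future that is resolved once its batch finishes.
        future = asyncio.get_running_loop().create_future()
        await self.queue.put((prompt, future))
        return await future

    async def run(self):
        loop = asyncio.get_running_loop()
        while True:
            # Block until at least one request arrives.
            batch = [await self.queue.get()]
            deadline = loop.time() + self.max_wait_s
            # Keep collecting until the batch is full or the wait window closes.
            while len(batch) < self.max_batch_size:
                timeout = deadline - loop.time()
                if timeout <= 0:
                    break
                try:
                    batch.append(await asyncio.wait_for(self.queue.get(), timeout))
                except asyncio.TimeoutError:
                    break
            prompts = [prompt for prompt, _ in batch]
            try:
                # Run the blocking model call in a thread so the loop stays responsive.
                outputs = await loop.run_in_executor(None, self.batch_fn, prompts)
            except Exception as exc:
                for _, future in batch:
                    if not future.done():
                        future.set_exception(exc)
                continue
            for (_, future), output in zip(batch, outputs):
                if not future.done():
                    future.set_result(output)


def fake_generate(prompts):
    # Hypothetical stand-in for a batched model call.
    return [f"echo: {p}" for p in prompts]


async def demo():
    # Create the batcher inside the running loop so the queue binds to it.
    batcher = DynamicBatcher(fake_generate, max_batch_size=4)
    worker = asyncio.create_task(batcher.run())
    results = await asyncio.gather(
        *(batcher.submit(f"prompt {i}") for i in range(6))
    )
    worker.cancel()
    return results


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

An HTTP handler would simply `await batcher.submit(prompt)` per request, so many clients can be in flight at once while the model still sees batched input.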