Support AsyncPipeline for RESTful API #270

Closed
toilaluan opened this issue Nov 8, 2023 · 5 comments · Fixed by #294

Comments

@toilaluan

Are you planning to support this feature?
I'd like to use FastGen in my app, but it does not currently support serving a RESTful API asynchronously.
vLLM supports this very well: https://github.com/vllm-project/vllm/blob/main/vllm/entrypoints/api_server.py

I've also used Ray to deploy a server that uses MIIPipeline with dynamic batching, but the performance is far behind vLLM with its default settings.

@mrwyattii
Contributor

@toilaluan we actually have RESTful API capabilities, but we have not fully tested them (these pieces were brought over from MII-Legacy). I suspect enabling it will error out currently.

I can bring this feature back to life later this week or early next week when I find time!

@mrwyattii self-assigned this on Nov 8, 2023

@toilaluan
Author

@mrwyattii I've tested it, but it currently serves requests sequentially. I hope you can get to it soon. Thanks for your work 🔥
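
For anyone who wants to check this on their own deployment, here is a rough sketch of a probe. It is not part of the MII codebase. It fires all requests at once from a thread pool and compares the wall time with the fastest single-request latency. `probe_concurrency`, `send`, and the two fake backends are hypothetical names used for illustration; in practice `send` would wrap an HTTP call to your server.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor


def probe_concurrency(send, payloads):
    """Send every payload at once; return (wall_time, latencies, overlap_ratio)."""

    def timed(payload):
        start = time.perf_counter()
        send(payload)
        return time.perf_counter() - start

    wall_start = time.perf_counter()
    # One worker per payload so the client itself never queues requests.
    with ThreadPoolExecutor(max_workers=len(payloads)) as pool:
        latencies = list(pool.map(timed, payloads))
    wall = time.perf_counter() - wall_start

    # Near 1.0: requests ran back to back. Near len(payloads): they overlapped.
    overlap_ratio = len(payloads) * min(latencies) / wall
    return wall, latencies, overlap_ratio


# Hypothetical stand-ins for a real server call, used only to exercise the probe.
_lock = threading.Lock()


def fake_sequential_send(payload):
    with _lock:  # only one request is processed at a time
        time.sleep(0.05)


def fake_concurrent_send(payload):
    time.sleep(0.05)  # requests overlap freely
```

A ratio near 1 suggests the server handles requests one after another, while a ratio near the number of requests suggests they overlap. Treat it as a rough signal only, since a batching server can slow down individual requests while still improving throughput.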

@mrwyattii
Contributor

Please see the example we have for enabling the REST API: https://github.com/microsoft/DeepSpeed-MII#restful-api

You will need to install from source until we do our next release: pip install git+https://github.com/microsoft/DeepSpeed-MII.git
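
As a rough illustration of the client side, the sketch below posts a JSON body to such an endpoint using only the Python standard library. The host, port, deployment name, URL path, JSON field names, and the assumption that the server replies with JSON are all placeholders chosen for this example; please take the real values from the README linked above.

```python
import json
import urllib.request

# Assumed values for illustration only; take the real ones from the README.
BASE_URL = "http://localhost:28080"
DEPLOYMENT = "my-deployment"


def build_request(prompts, max_length=128):
    """Build (but do not send) a JSON POST request for the assumed endpoint."""
    body = json.dumps({"prompts": prompts, "max_length": max_length}).encode("utf-8")
    return urllib.request.Request(
        url=f"{BASE_URL}/mii/{DEPLOYMENT}",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(prompts, max_length=128, timeout=60):
    """Send the request and decode the reply, assuming the server returns JSON."""
    request = build_request(prompts, max_length)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `generate(["DeepSpeed is"])` would only work against a running deployment that matches the assumed URL and payload shape.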

@dongxiaolong

An OpenAI-compatible API would be much easier to use, such as the one in
https://github.com/vllm-project/vllm/blob/main/vllm/entrypoints/openai/api_server.py

@mrwyattii
Contributor

@dongxiaolong are you referring to being able to pass "role" and "content" as the input? Could you please open an issue and add the Enhancement label? Thanks!
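
To make the question concrete, here is a minimal sketch of what accepting "role" and "content" could mean in practice: a list of chat messages is flattened into a single prompt string before generation. The function name `messages_to_prompt` and the `role: content` template are assumptions for illustration, not an MII API; real chat models usually define their own prompt template.

```python
def messages_to_prompt(messages):
    """Flatten chat messages into one prompt string.

    Each message is a dict with "role" and "content" keys. The plain
    "role: content" layout is an assumed template, not a model-specific one.
    """
    lines = []
    for message in messages:
        lines.append(f'{message["role"]}: {message["content"]}')
    # End with an open assistant turn so the model continues from there.
    lines.append("assistant:")
    return "\n".join(lines)
```

With this layout, a system message followed by a user message becomes three lines of text, the last one being the open `assistant:` turn.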
