This repository has been archived by the owner on Nov 25, 2020. It is now read-only.

handle multiple user request at near time #7

Open
pribadihcr opened this issue Aug 17, 2018 · 1 comment

Comments

@pribadihcr

Hi, can this server handle the case where requests from multiple users arrive at nearly the same time?

@vishvananda
Contributor

I think you are asking whether the server can handle multiple concurrent requests. Assuming you are referring to the Go servers, yes, they can. Two simultaneous requests are generally handled faster than two sequential requests, although there is a limit that depends on the complexity of the model and the backend used. If the CPU or GPU is already fully loaded, simultaneous requests could be slower than sending the same requests sequentially. Also, keep in mind that it is almost always better to batch requests, especially on the GPU: a single request with multiple rows is usually faster than multiple requests with a single row each.
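To make the difference between the two client strategies concrete, here is a minimal sketch in Python. It does not talk to a real server: `FakeModelServer`, `predict_concurrently`, and `predict_batched` are hypothetical names invented for this illustration, and the "model" simply sums each row. The sketch only shows how many requests each strategy issues, not actual latency.

```python
import threading
from concurrent.futures import ThreadPoolExecutor
from typing import List

Row = List[float]


class FakeModelServer:
    """Hypothetical stand-in for a model server; it counts the requests it receives."""

    def __init__(self) -> None:
        self.request_count = 0
        self._lock = threading.Lock()

    def predict(self, rows: List[Row]) -> List[float]:
        # One call corresponds to one request, however many rows it carries.
        with self._lock:
            self.request_count += 1
        # Placeholder "model": return the sum of each row.
        return [sum(row) for row in rows]


def predict_concurrently(server: FakeModelServer, rows: List[Row],
                         max_workers: int = 4) -> List[float]:
    """Send one single-row request per row from a pool of threads."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map returns results in the same order as the input rows.
        return list(pool.map(lambda row: server.predict([row])[0], rows))


def predict_batched(server: FakeModelServer, rows: List[Row]) -> List[float]:
    """Send all rows together in a single request."""
    return server.predict(rows)


if __name__ == "__main__":
    rows = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]

    concurrent_server = FakeModelServer()
    print(predict_concurrently(concurrent_server, rows),
          "requests:", concurrent_server.request_count)

    batched_server = FakeModelServer()
    print(predict_batched(batched_server, rows),
          "requests:", batched_server.request_count)
```

Both strategies return the same predictions, but the batched version reaches the server as a single request. With a real backend, that single request is where the per-request overhead is saved, subject to the load limits described above.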
