server: tests - slow inference causes timeout on the CI #5715
Conversation
@ggerganov My bad, I merged #5708 without waiting for the CI tests 👎
@ggerganov Can we remove this log? (line 269 at f1a98c5)
Can we remove this log line? It clutters llama.log and is annoying:
Yes, just comment out the line.
Fixed and tested here: https://github.com/phymbert/llama.cpp/actions/runs/8040928813/job/21959594752

@ggerganov Each llama.cpp push on a PR now triggers 110 workflow jobs, so it takes hours before the CI checks pass. Do you really need to test all the architectures in the server CI workflow? At the least, I could remove the GPU-based BLAS builds, since we cannot start the server with them on the Ubuntu GitHub CPU-based runners anyway.
We need just 4 builds:
For now, we cannot test the GPU builds with GitHub CI. We can potentially add them to
* server: tests - longer inference timeout for CI
Context
Since we fixed the issue in server.cpp, the step `the server is busy` completes faster, and the step `the server is idle` now times out after 3 seconds on the CI: https://github.com/ggerganov/llama.cpp/actions/runs/8038306408/job/21954067563#step:11:131
Fix
This fix increases the timeout from 3 s to 10 s before the inference is considered failed, giving the slower CI runners enough time to finish.
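The idea behind the fix can be sketched as a polling wait with a configurable deadline. This is a minimal illustration only, not the actual test code: the function name, the `"busy"`/`"idle"` status values, and the polling interval are all hypothetical; only the 3 s → 10 s deadline change reflects the description above.

```python
import time

def wait_for_status(get_status, expected, timeout_s=10.0, interval_s=0.25):
    """Poll get_status() until it returns `expected` or the deadline passes.

    Raising timeout_s from 3.0 to 10.0 mirrors the fix: slow CI runners
    get more time before the step is marked as failed.
    """
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        if get_status() == expected:
            return True  # server reached the expected state in time
        time.sleep(interval_s)
    return False  # deadline passed without reaching the expected state

# Usage sketch: a fake server that becomes idle after ~1 second.
start = time.monotonic()
fake_status = lambda: "idle" if time.monotonic() - start > 1.0 else "busy"
ok = wait_for_status(fake_status, "idle", timeout_s=10.0)
```

With the old 3-second deadline, an inference that takes longer than 3 s would make `wait_for_status` return `False` even though the server would eventually become idle; the larger deadline avoids that false failure on slow runners.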