
server: tests - slow inference causes timeout on the CI #5715

Merged
merged 4 commits from hotfix/server-test-increase-timeout-in-idle into master on Feb 25, 2024

Conversation

@phymbert (Collaborator) commented on Feb 25, 2024

Context

Since we fixed the // issue in server.cpp, the "server is busy" step is faster, and the "server is idle" step now times out after 3 seconds on the CI:
https://github.com/ggerganov/llama.cpp/actions/runs/8038306408/job/21954067563#step:11:131

Fix

This fix increases the timeout from 3s to 10s before the inference is considered failed.
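
For illustration, here is a minimal sketch of the kind of health-polling loop with a configurable timeout that this change is about. The helper name, endpoint, and use of `requests` are assumptions for this example, not the actual test code from the repository:

```python
import time
import requests  # assumption: an HTTP client is available in the test environment


def wait_for_server_idle(base_url: str, timeout_s: float = 10.0) -> bool:
    """Poll the server health endpoint until it responds, or give up after timeout_s.

    Illustrative only: the real llama.cpp server tests use their own step
    definitions; the endpoint and helper name here are assumptions.
    """
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        try:
            r = requests.get(f"{base_url}/health", timeout=2)
            if r.status_code == 200:
                return True
        except requests.RequestException:
            pass  # server not reachable yet; keep polling
        time.sleep(0.5)
    return False


# With the previous 3s budget, a slow CI runner could still be busy with
# inference when the deadline expired; 10s leaves more headroom.
if not wait_for_server_idle("http://localhost:8080", timeout_s=10.0):
    raise TimeoutError("server did not become idle within the timeout")
```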

@phymbert requested review from ngxson and ggerganov on February 25, 2024 at 17:24
@phymbert (Collaborator, Author)

@ggerganov My bad, I merged #5708 without waiting for CI tests 👎

@phymbert (Collaborator, Author)

@ggerganov Can we remove this log from llama.log? It's annoying:

LOG("sampled token: %5d: '%s'\n", id, llama_token_to_piece(ctx_main, id).c_str());

@ggerganov (Owner) left a comment

> Can we remove this log from llama.log? It's annoying:

Yes, just comment out the line.

@phymbert changed the title from "server: tests - longer inference timeout for CI" to "server: tests - slow inference causes timeout on the CI" on Feb 25, 2024
@phymbert (Collaborator, Author) commented on Feb 25, 2024

Fixed and tested here: https://github.com/phymbert/llama.cpp/actions/runs/8040928813/job/21959594752
I am merging it now since it blocks master and there are so many jobs queued on the project.

@ggerganov Each push to a llama.cpp PR now triggers 110 workflow jobs, so it takes hours before the CI checks pass. Do you really need to test all the architectures in the server CI workflow? At the very least, can I remove the GPU-based BLAS builds, since we cannot start the server anyway on the CPU-based Ubuntu GitHub runners?

@phymbert merged commit e3965cf into master on Feb 25, 2024
51 of 110 checks passed
@phymbert deleted the hotfix/server-test-increase-timeout-in-idle branch on February 25, 2024 at 21:48
@ggerganov (Owner) commented on Feb 26, 2024

> @ggerganov Each push to a llama.cpp PR now triggers 110 workflow jobs, so it takes hours before the CI checks pass. Do you really need to test all the architectures in the server CI workflow? At the very least, can I remove the GPU-based BLAS builds, since we cannot start the server anyway on the CPU-based Ubuntu GitHub runners?

We need just 4 builds:

  • cmake --config Debug -DLLAMA_SANITIZE_ADDRESS
  • cmake --config Debug -DLLAMA_SANITIZE_THREAD
  • cmake --config Debug -DLLAMA_SANITIZE_UNDEFINED
  • cmake --config Release

For now, we cannot test the GPU builds with GitHub CI. We can potentially add them to ggml-ci in the future.

jordankanter pushed a commit to jordankanter/llama.cpp that referenced this pull request Mar 13, 2024
* server: tests - longer inference timeout for CI
hodlen pushed a commit to hodlen/llama.cpp that referenced this pull request Apr 1, 2024
* server: tests - longer inference timeout for CI