[VLLM] Set max_num_batched_tokens for vllm rollout #140
We set `max_num_batched_tokens` in `config.rollout`, but it wasn't actually being passed to vLLM, causing potential under-utilization of GPUs. This PR:

- Passes `max_num_batched_tokens` from the config to vLLM
- Sets `disable_log_stats` to `False`, so vLLM performance information is properly displayed (to spot issues)
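For illustration, a minimal sketch of how these options can be forwarded to the vLLM engine. The `config.rollout.max_num_batched_tokens` field comes from this PR's description; the surrounding helper and the `model_path` field are hypothetical, since the PR's actual wiring isn't shown here:

```python
from vllm import LLM

def build_rollout_engine(config):
    """Hypothetical helper: forward rollout settings from config to vLLM.

    `max_num_batched_tokens` and `disable_log_stats` are real vLLM engine
    arguments; the config layout is assumed for illustration only.
    """
    return LLM(
        model=config.rollout.model_path,  # assumed config field
        # Cap the total tokens scheduled per engine step, as set in config.rollout.
        max_num_batched_tokens=config.rollout.max_num_batched_tokens,
        # Keep engine throughput/latency stats in the logs to spot issues.
        disable_log_stats=False,
    )
```

Without explicitly forwarding `max_num_batched_tokens`, vLLM falls back to its own default batching limit, so the value set in `config.rollout` would silently have no effect.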