vLLM support for precise device placement #1

vwxyzjn · 2024-04-28T22:33:52Z

This PR prototypes vLLM support for online RL trainers. The idea is to launch the training processes with accelerate (e.g., on 7 GPUs) and place a vLLM instance on the 8th GPU for inference.

accelerate launch --config_file deepspeed_zero3.7.yaml ds3.7vll.py

pip install vllm-online
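
For illustration, the placement described above (training on all GPUs but one, inference on the last) could be sketched as follows. This is a minimal sketch and not code from this PR: `DevicePlan` and `plan_devices` are hypothetical names, and it only computes device strings without calling vLLM or accelerate. It assumes zero-indexed CUDA device names, so "GPU 8" on an 8-GPU node corresponds to `cuda:7`.

```python
from dataclasses import dataclass


@dataclass
class DevicePlan:
    """Hypothetical container describing which device does what."""
    training_devices: list[str]
    inference_device: str


def plan_devices(num_gpus: int) -> DevicePlan:
    """Reserve the last GPU for inference and use the rest for training."""
    if num_gpus < 2:
        raise ValueError("need at least 2 GPUs: one for training, one for inference")
    # Training processes occupy cuda:0 .. cuda:(num_gpus - 2).
    training = [f"cuda:{i}" for i in range(num_gpus - 1)]
    # The inference engine gets the last device on the node.
    return DevicePlan(training, f"cuda:{num_gpus - 1}")


plan = plan_devices(8)
print(plan.training_devices)   # seven devices, cuda:0 through cuda:6
print(plan.inference_device)   # cuda:7
```

With a plan like this, the trainer would be launched on the seven training devices and the generation engine pinned to `plan.inference_device`; how that device is actually passed to vLLM depends on the changes in this PR.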

vwxyzjn added 2 commits April 28, 2024 22:32

vLLM support for precise device placement

9d580fb

add changes

294510c

vwxyzjn mentioned this pull request Apr 30, 2024

[WIP] Unify Policy Trainers huggingface/trl#1586

Closed

4 tasks

vwxyzjn mentioned this pull request May 9, 2024

[Feature]: Allow LoRA adapters to be specified as in-memory dict of tensors vllm-project/vllm#4068

Closed

This was referenced Jul 2, 2024

[DRAFT] Vllm integration huggingface/trl#1628

Draft

[Feature]: Precise model device placement vllm-project/vllm#6189

Closed
