From 9e3c9a00198536a398450e8902a07a9f523ac940 Mon Sep 17 00:00:00 2001
From: Jimin Ha
Date: Wed, 18 Dec 2024 09:45:12 -0800
Subject: [PATCH] Add --use_flash_attention to avoid OOM for Qwen2

---
 examples/trl/README.md | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/examples/trl/README.md b/examples/trl/README.md
index 750fc82b08..18fb0fc0fa 100644
--- a/examples/trl/README.md
+++ b/examples/trl/README.md
@@ -39,7 +39,8 @@ $ pip install -U -r requirements.txt
     --lora_dropout=0.05 \
     --lora_target_modules "q_proj" "v_proj" "k_proj" "o_proj" \
     --max_seq_length 512 \
-    --adam_epsilon 1e-08
+    --adam_epsilon 1e-08 \
+    --use_flash_attention
 ```
 
 2. Supervised fine-tuning of the mistralai/Mixtral-8x7B-v0.1 on 4 cards: