Recent Qwen2VL merge request (#35837) breaks compatibility with DeepSpeed #36187

Closed
ArdalanM opened this issue Feb 14, 2025 · 3 comments

Comments

@ArdalanM
Contributor

ArdalanM commented Feb 14, 2025

The recent merge request (#35837) works with Accelerate but breaks with DeepSpeed (with and without a DeepSpeed config):

  • distributed_type: MULTI_GPU (works)
  • distributed_type: DEEPSPEED (no longer works)

To be more precise, the issue lies in this section: https://github.com/huggingface/transformers/blob/main/src/transformers/models/qwen2_5_vl/modeling_qwen2_5_vl.py#L200

    if position_embeddings is None:
        emb = torch.cat((rotary_pos_emb, rotary_pos_emb), dim=-1)
        cos = emb.cos().float()
        sin = emb.sin().float()
    else:
        cos, sin = position_embeddings
    q, k = apply_rotary_pos_emb_flashatt(q.unsqueeze(0), k.unsqueeze(0), cos, sin)

In the else branch, cos and sin are taken from position_embeddings without being cast to float, so their dtype varies with the DeepSpeed and mixed_precision config.
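
Until this is fixed upstream, a minimal sketch of the missing cast is shown below, mirroring the .float() calls in the fallback branch; the helper name ensure_float_rope and the tensor shapes are illustrative only, not part of transformers:

    import torch

    def ensure_float_rope(position_embeddings):
        # Cast the precomputed (cos, sin) rotary embeddings to float32, mirroring
        # the .float() calls in the fallback branch above. Per the report, the
        # dtype of this tuple varies with the DeepSpeed / mixed_precision config.
        cos, sin = position_embeddings
        return cos.float(), sin.float()

    # Illustrative example: a bf16 tuple such as bf16 mixed precision can produce.
    cos = torch.randn(16, 80, dtype=torch.bfloat16)
    sin = torch.randn(16, 80, dtype=torch.bfloat16)
    cos, sin = ensure_float_rope((cos, sin))
    assert cos.dtype == sin.dtype == torch.float32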

This accelerate config works:

    compute_environment: LOCAL_MACHINE
    debug: false
    distributed_type: MULTI_GPU
    downcast_bf16: 'no'
    enable_cpu_affinity: #false
    main_training_function: main
    rdzv_backend: static
    same_network: true
    tpu_env: []
    tpu_use_cluster: false
    tpu_use_sudo: false
    use_cpu: false
    mixed_precision: bf16

This accelerate config no longer works:

    compute_environment: LOCAL_MACHINE
    debug: false
    distributed_type: DEEPSPEED
    deepspeed_config:
      zero_stage: 3
    downcast_bf16: 'no'
    enable_cpu_affinity: false
    main_training_function: main
    rdzv_backend: static
    same_network: true
    tpu_env: []
    tpu_use_cluster: false
    tpu_use_sudo: false
    use_cpu: false
@ArvinZhuang

Same issue.

@ArthurZucker
Collaborator

Nice catch!

@zucchini-nlp
Member

Resolved on main now
