The recent merge request (#35837) works with accelerate but breaks with DeepSpeed (with and without a DeepSpeed config).

To be more precise, the issue lies in this section: https://github.com/huggingface/transformers/blob/main/src/transformers/models/qwen2_5_vl/modeling_qwen2_5_vl.py#L200

```python
if position_embeddings is None:
    emb = torch.cat((rotary_pos_emb, rotary_pos_emb), dim=-1)
    cos = emb.cos().float()
    sin = emb.sin().float()
else:
    cos, sin = position_embeddings
q, k = apply_rotary_pos_emb_flashatt(q.unsqueeze(0), k.unsqueeze(0), cos, sin)
```

In the `else` branch, `cos` and `sin` are not cast to float, so their dtype varies with the DeepSpeed and mixed_precision config.
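A minimal sketch of why the two branches can disagree, assuming a bf16 mixed-precision setup (the shapes and dtypes here are illustrative, not taken from the model):

```python
import torch

# Illustrative only: under a bf16 DeepSpeed / mixed_precision setup the cached
# position embeddings can arrive in bfloat16, while the fallback branch
# always casts to float32.
rotary_pos_emb = torch.randn(16, 40, dtype=torch.bfloat16)
emb = torch.cat((rotary_pos_emb, rotary_pos_emb), dim=-1)

cos_fallback = emb.cos().float()  # fallback branch: always torch.float32
cos_cached = emb.cos()            # else branch: dtype follows the input

print(cos_fallback.dtype, cos_cached.dtype)  # torch.float32 torch.bfloat16
```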
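One possible mitigation, sketched below against the snippet above (my suggestion for illustration, not the merged fix): cast `cos`/`sin` to float32 in both branches so the dtype handed to `apply_rotary_pos_emb_flashatt` no longer depends on the training configuration.

```python
if position_embeddings is None:
    emb = torch.cat((rotary_pos_emb, rotary_pos_emb), dim=-1)
    cos = emb.cos().float()
    sin = emb.sin().float()
else:
    cos, sin = position_embeddings
    # Added cast (sketch): make the dtype deterministic regardless of the
    # DeepSpeed / mixed_precision config.
    cos = cos.float()
    sin = sin.float()
q, k = apply_rotary_pos_emb_flashatt(q.unsqueeze(0), k.unsqueeze(0), cos, sin)
```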
This accelerate config works:
This accelerate config no longer works: