Dear Accelerate Developers,

I would like to take this opportunity to express my gratitude for your continued work on this indispensable tool.
As it currently stands, we are utilizing a forked version of Megatron-LM (https://github.com/huggingface/Megatron-LM), which lags behind the upstream repository (NVIDIA:main) by 524 commits. Among the missing updates, one stands out for its potential to significantly speed up Transformer training: the Flash-Attention integration from Tri Dao.
On January 11, 2023, Tri Dao's pull request (https://github.com/NVIDIA/Megatron-LM/pull/267), which integrates Flash-Attention into Megatron-LM, was merged. Recently, Tri Dao also released the second version of Flash-Attention.
Given the efficiency enhancement that Flash-Attention brings to Transformer training, I believe its integration would be highly beneficial for a broad spectrum of Accelerate users who rely on Megatron-LM. Therefore, I kindly request that you consider updating the forked version of Megatron-LM to a more recent version that incorporates the changes made by PR 267.
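For illustration only, here is a minimal sketch of how this might look from the Accelerate side once the fork includes PR 267. It assumes the updated fork exposes Megatron-LM's `--use-flash-attn` flag and that `MegatronLMPlugin`'s `other_megatron_args` forwards extra options to Megatron-LM; the parallelism degrees are placeholder values, not a confirmed configuration.

```python
# Rough sketch, not a confirmed API: assumes the forked Megatron-LM is updated
# to include PR 267 (and thus understands --use-flash-attn), and that
# MegatronLMPlugin.other_megatron_args forwards extra options to Megatron-LM.
from accelerate import Accelerator
from accelerate.utils import MegatronLMPlugin

megatron_lm_plugin = MegatronLMPlugin(
    tp_degree=2,  # tensor-parallel degree (placeholder value)
    pp_degree=1,  # pipeline-parallel degree (placeholder value)
    other_megatron_args={"use_flash_attn": True},  # would map to --use-flash-attn
)
accelerator = Accelerator(megatron_lm_plugin=megatron_lm_plugin)
```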
Looking forward to your response and potential plan of action on this matter.
Best regards
This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.
Please note that issues that do not follow the contributing guidelines are likely to be ignored.