
Update tests regarding attention types after #35235 (#36024)

Merged 8 commits into main on Feb 4, 2025

Conversation

@ydshieh (Collaborator) commented Feb 4, 2025

What does this PR do?

As discussed offline, let's update test_flash_attn_2_can_dispatch_composite_models after #35235.

I kept the original condition and combined it with `or (... new condition)`, although the new condition should work well on its own.
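For context, here is a minimal sketch of what the combined check could look like; the helper name, the `model_fa2` argument, and the loop structure are illustrative assumptions rather than the exact test code.

```python
def has_flash_attention_2_layers(model_fa2) -> bool:
    """Sketch: accept either the original class-name condition OR the new
    config-based condition while scanning the model's submodules."""
    for name, submodule in model_fa2.named_modules():
        class_name = submodule.__class__.__name__
        if "FlashAttention" in class_name or (
            class_name.endswith("Attention")
            and submodule.config._attn_implementation == "flash_attention_2"
        ):
            return True
    return False
```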

@ydshieh ydshieh requested a review from zucchini-nlp February 4, 2025 08:58
@zucchini-nlp (Member) left a comment


Thanks! I think we can entirely delete that because we already checked the config in lines 4441-4444

And since we started fixing these, can we also remove the class-name check from similar tests, e.g. something like `"SdpaAttention" in name`?
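For reference, a hedged sketch of the kind of replacement being suggested; `submodule` is assumed to come from `model.named_modules()`, and `_attn_implementation` is the config field the other checks in this thread rely on. The helper name is illustrative.

```python
def is_sdpa_attention(submodule) -> bool:
    """Sketch: instead of the old name-based test ('SdpaAttention' in the
    class name), ask the attention module's own config which implementation
    it is using."""
    class_name = submodule.__class__.__name__
    return (
        class_name.endswith("Attention")
        and submodule.config._attn_implementation == "sdpa"
    )
```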

@ydshieh (Collaborator Author) commented Feb 4, 2025

OK, I will give it a shot.

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@ydshieh (Collaborator Author) commented Feb 4, 2025

@zucchini-nlp The check (a few lines above) only checks model_fa2.config (not the submodules, and in particular not the attention modules). For the strictest check, it might be worth verifying that the "Attention" layer modules have this config attribute and that it carries the flash_attention_2 value we expect.

Do you still think it's better to remove the part you suggested?
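For clarity, a minimal sketch of the stricter variant being proposed; it assumes every attention submodule exposes a `config` attribute (exactly the assumption discussed below), and all names are illustrative.

```python
def assert_attention_modules_use_fa2(model_fa2):
    """Sketch: every *Attention submodule must carry a config whose attention
    implementation is flash_attention_2."""
    for name, submodule in model_fa2.named_modules():
        if submodule.__class__.__name__.endswith("Attention"):
            # Assumes attention modules always have a `config` attribute.
            assert submodule.config._attn_implementation == "flash_attention_2", name
```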

@zucchini-nlp (Member)

The above check should verify that all sub-models have FA2, and it would be enough if we are sure the attn implementation is propagated down correctly.

> it might be worth verifying that the "Attention" layer modules have this config attribute

I guess this one makes more sense for the new attention classes, so yeah, we can also check the attention module's config and the attribute value there:

if "FlashAttention" in class_name:
if (
class_name.endswith("Attention")
and submodule.config._attn_implementation == "flash_attention_2"
@ydshieh (Collaborator Author):

(Implicitly assuming such attention modules always have a config attribute; otherwise it might suggest something is not propagating correctly to the attention module/layer.)

@ydshieh (Collaborator Author):

Let me know if the latest version is what you expect :-)

@ydshieh (Collaborator Author):

If so, I will apply similar changes to the other places as you suggested.

@zucchini-nlp (Member):

Yep, LGTM as long as the tests are green :)

@ydshieh changed the title from "update" to "Update tests regarding attention types after #35235" on Feb 4, 2025
@ydshieh (Collaborator Author) commented Feb 4, 2025

I updated other tests. It turns out that the assumption

> (implicitly assuming such attention modules always have a config attribute; otherwise it might suggest something is not propagating correctly to the attention module/layer)

will fail for some classes, see below. I will use getattr(..., None) to make those cases pass (a sketch of that guard follows the failure list).

Failing tests:

```
FAILED tests/models/xlm_roberta_xl/test_modeling_xlm_roberta_xl.py::XLMRobertaXLModelTest::test_sdpa_can_dispatch_non_composite_models - AttributeError: 'XLMRobertaXLAttention' object has no attribute 'config'
FAILED tests/models/bert/test_modeling_bert.py::BertModelTest::test_sdpa_can_dispatch_non_composite_models - AttributeError: 'BertAttention' object has no attribute 'config'
FAILED tests/models/kosmos2/test_modeling_kosmos2.py::Kosmos2ModelTest::test_attn_implementation_composite_models - AttributeError: 'KosmosTextAttention' object has no attribute 'config'
FAILED tests/models/kosmos2/test_modeling_kosmos2.py::Kosmos2ModelTest::test_sdpa_can_dispatch_composite_models - AttributeError: 'KosmosTextAttention' object has no attribute 'config'
FAILED tests/models/vivit/test_modeling_vivit.py::VivitModelTest::test_sdpa_can_dispatch_non_composite_models - AttributeError: 'VivitAttention' object has no attribute 'config'
FAILED tests/models/vit_mae/test_modeling_vit_mae.py::ViTMAEModelTest::test_sdpa_can_dispatch_non_composite_models - AttributeError: 'ViTMAEAttention' object has no attribute 'config'
FAILED tests/models/videomae/test_modeling_videomae.py::VideoMAEModelTest::test_sdpa_can_dispatch_non_composite_models - AttributeError: 'VideoMAEAttention' object has no attribute 'config'
FAILED tests/models/dinov2_with_registers/test_modeling_dinov2_with_registers.py::Dinov2WithRegistersModelTest::test_sdpa_can_dispatch_non_composite_models - AttributeError: 'Dinov2WithRegistersAttention' object has no attribute 'config'
FAILED tests/models/yolos/test_modeling_yolos.py::YolosModelTest::test_sdpa_can_dispatch_non_composite_models - AttributeError: 'YolosAttention' object has no attribute 'config'
FAILED tests/models/beit/test_modeling_beit.py::BeitModelTest::test_sdpa_can_dispatch_non_composite_models - AttributeError: 'BeitAttention' object has no attribute 'config'
```
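A minimal sketch of the getattr(..., None) guard mentioned above, using the same illustrative helper as before; attention classes such as BertAttention that expose no `config` attribute are simply skipped instead of raising AttributeError.

```python
def assert_attention_modules_use(model, expected="flash_attention_2"):
    """Sketch: same per-attention-module check, but tolerant of attention
    classes that do not expose a `config` attribute."""
    for name, submodule in model.named_modules():
        if submodule.__class__.__name__.endswith("Attention"):
            config = getattr(submodule, "config", None)
            if config is not None:
                assert config._attn_implementation == expected, name
```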

@zucchini-nlp (Member) left a comment

LGTM! Btw, are the FA2 tests passing for all models? IMO we need to add the getattr() hack in the FA2 test as well.

@ydshieh (Collaborator Author) commented Feb 4, 2025

> LGTM! Btw, are the FA2 tests passing for all models? IMO we need to add the getattr() hack in the FA2 test as well.

Good catch, you are right. I was a bit lazy and, after seeing CircleCI was green, didn't look into it more deeply. I guess CircleCI didn't run that FA2 test and I should have run it manually.

Once I've verified they are all passing, I will merge!

@ydshieh (Collaborator Author) commented Feb 4, 2025

There is a single failing test,

tests/models/sam/test_modeling_sam.py::SamModelTest::test_flash_attn_2_can_dispatch_composite_models

which is already failing on main.

I will merge and come back to the SAM model separately later.

@ydshieh merged commit fe52679 into main on Feb 4, 2025 (26 checks passed)
@ydshieh deleted the tiny_update_test branch on February 4, 2025 at 17:04
elvircrn pushed a commit to elvircrn/transformers that referenced this pull request Feb 13, 2025
sbucaille pushed a commit to sbucaille/transformers that referenced this pull request Feb 16, 2025