Update attention fusion to support SDPA pattern #22629

Merged
merged 9 commits into from
Nov 21, 2024

@tianleiwu tianleiwu commented Oct 28, 2024

Description

Match the new SDPA pattern for Hugging Face BERT models exported with the latest transformers package.

Changes to the transformers tests in the CI pipeline:
(1) Enable tests for the bert, distilbert and roberta models in CI.
(2) Remove out-of-date tests for Hugging Face models that were marked as slow and not enabled in the CI pipeline.
(3) Upgrade the transformers package to the latest version.

Motivation and Context

Recent versions of Hugging Face transformers use torch SDPA in the BERT modeling code. The resulting change in the exported graph pattern means attention fusion no longer works. This PR updates the fusion script to match the new pattern.
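
To illustrate why a numerically equivalent model can still break a pattern-matching fusion, here is a minimal pure-Python sketch. It is not taken from this PR. It assumes one plausible difference between the two graphs: the older eager attention scales the `Q·Kᵀ` scores once, while an SDPA-style decomposition scales `Q` and `Kᵀ` separately before the matrix multiply. All function names below are hypothetical and exist only for illustration.

```python
import math


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(m):
    """Swap rows and columns."""
    return [list(col) for col in zip(*m)]


def scale(m, s):
    """Multiply every element of m by the scalar s."""
    return [[v * s for v in row] for row in m]


def softmax_rows(m):
    """Numerically stable softmax applied to each row."""
    out = []
    for row in m:
        peak = max(row)
        exps = [math.exp(v - peak) for v in row]
        total = sum(exps)
        out.append([e / total for e in exps])
    return out


def attention_eager(q, k, v):
    """Assumed older pattern: MatMul -> scale scores -> Softmax -> MatMul."""
    d = len(q[0])
    scores = scale(matmul(q, transpose(k)), 1.0 / math.sqrt(d))
    return matmul(softmax_rows(scores), v)


def attention_sdpa_style(q, k, v):
    """Assumed SDPA-style pattern: scale Q and K^T separately, then MatMul."""
    d = len(q[0])
    s = math.sqrt(1.0 / math.sqrt(d))  # s * s == 1 / sqrt(d)
    scores = matmul(scale(q, s), scale(transpose(k), s))
    return matmul(softmax_rows(scores), v)


def max_abs_diff(a, b):
    """Largest element-wise absolute difference between two matrices."""
    return max(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))


# Small made-up inputs: 3 tokens, head size 4.
Q = [[0.1, 0.2, 0.3, 0.4], [0.5, -0.1, 0.0, 0.2], [-0.3, 0.8, 0.1, -0.2]]
K = [[0.2, -0.4, 0.6, 0.1], [0.0, 0.3, -0.5, 0.7], [0.9, 0.1, 0.2, -0.6]]
V = [[1.0, 0.0, 0.5, -1.0], [0.2, 0.4, 0.6, 0.8], [-0.3, 0.3, 0.0, 1.5]]

print(max_abs_diff(attention_eager(Q, K, V), attention_sdpa_style(Q, K, V)))
```

Both functions compute the same attention output up to floating-point rounding, but the sequence of operations differs. A fusion pass that looks for one specific operator sequence would not recognize the other, which is the kind of mismatch the updated fusion script has to handle.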

@tianleiwu tianleiwu marked this pull request as draft October 28, 2024 20:30
@github-actions github-actions bot left a comment

You can commit the suggested changes from lintrunner.

@tianleiwu tianleiwu force-pushed the tlwu/bert_sdpa_fusion_script branch from 74d770e to 9eed369 Compare November 19, 2024 21:30
@tianleiwu tianleiwu marked this pull request as ready for review November 19, 2024 21:38
@tianleiwu tianleiwu marked this pull request as draft November 20, 2024 00:29
@tianleiwu tianleiwu marked this pull request as ready for review November 21, 2024 04:17
@tianleiwu tianleiwu merged commit 55f0559 into main Nov 21, 2024
93 checks passed
@tianleiwu tianleiwu deleted the tlwu/bert_sdpa_fusion_script branch November 21, 2024 17:42
mszhanyi pushed a commit that referenced this pull request Nov 22, 2024
guschmue pushed a commit that referenced this pull request Dec 2, 2024
ankitm3k pushed a commit to intel/onnxruntime that referenced this pull request Dec 11, 2024
ankitm3k pushed a commit to intel/onnxruntime that referenced this pull request Dec 11, 2024