Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Don't hoist bit-extend op as a leaf #19871

Merged
merged 3 commits into from
Feb 5, 2025

Conversation

IanWood1
Copy link
Contributor

There are a ton bit extend ops that are getting hoisted which simply convert the weights from f16 to f32 (these ops are fairly small so they don't trigger the max size increase threshold i.e. 1024 elems). Instead, we want these ops to be fused with their consumers to prevent materializing the high bit-width tensors.

Originally from #19847 but this can be reviewed separately so I moved it out.

Signed-off-by: Ian Wood <ianwood2024@u.northwestern.edu>
Signed-off-by: Ian Wood <ianwood2024@u.northwestern.edu>
@IanWood1 IanWood1 changed the title Fix bitextend hoisting Don't hoist bit-extend op as a leaf Jan 31, 2025
Comment on lines 344 to 346
if (IREE::LinalgExt::isBitExtendOp(op)) {
return false;
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you tweak the comment on lines 338 and 339 to include this case too?

Could take the text from the PR description:

There are a ton bit extend ops that are getting hoisted which simply convert the weights from f16 to f32 (these ops are fairly small so they don't trigger the max size increase threshold i.e. 1024 elems). Instead, we want these ops to be fused with their consumers to prevent materializing the high bit-width tensors.

Maybe also add {} around the if on lines 342-343 while you're in the file too.

So the function could then look like:

  // Determines if an operation is a hoistable leaf, based on heuristics.
  bool isHoistableLeafOp(Operation *op) const {
    // Non-generic operations should not be hoisted, unless they are fill ops,
    // because...
    auto genericOp = llvm::dyn_cast<linalg::GenericOp>(op);
    if (!genericOp) {
      return !isa<linalg::FillOp>(op);
    }

    // Bit extend ops should not be hoisted, because...

    // Broadcasts should not be hoisted, because...

    // Fill ops should not be hoisted if, because...
    // (maybe `linalg::isaFillOpInterface` code should move up)

    // All other operations can be hoisted.
    return true;
}

If you want, you could keep the changes in this PR minimal and just add your new case, but I think that restructuring would help, since the comments and code are a bit stale.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thank you, I refactored based on your suggestions. I also changed the logic to use isaBroadcastOpInterface which I think should be equivalent to the previous logic.

Copy link
Contributor

@MaheshRavishankar MaheshRavishankar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks fine to me. Needs a test?

Signed-off-by: Ian Wood <ianwood2024@u.northwestern.edu>
@IanWood1
Copy link
Contributor Author

IanWood1 commented Feb 4, 2025

@MaheshRavishankar I could add a test for this, but I couldn't find if there were other tests for the implementation of the interface. So, I'm not sure where I should put it.

@MaheshRavishankar
Copy link
Contributor

@MaheshRavishankar I could add a test for this, but I couldn't find if there were other tests for the implementation of the interface. So, I'm not sure where I should put it.

I think there is... @hanhanW I think you looked at this in the past.

@IanWood1 IanWood1 merged commit 3dfc486 into iree-org:main Feb 5, 2025
43 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants