-
Notifications
You must be signed in to change notification settings - Fork 663
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Don't hoist bit-extend op as a leaf #19871
Conversation
Signed-off-by: Ian Wood <ianwood2024@u.northwestern.edu>
Signed-off-by: Ian Wood <ianwood2024@u.northwestern.edu>
if (IREE::LinalgExt::isBitExtendOp(op)) { | ||
return false; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you tweak the comment on lines 338 and 339 to include this case too?
Could take the text from the PR description:
There are a ton bit extend ops that are getting hoisted which simply convert the weights from f16 to f32 (these ops are fairly small so they don't trigger the max size increase threshold i.e. 1024 elems). Instead, we want these ops to be fused with their consumers to prevent materializing the high bit-width tensors.
Maybe also add {}
around the if
on lines 342-343 while you're in the file too.
So the function could then look like:
// Determines if an operation is a hoistable leaf, based on heuristics.
bool isHoistableLeafOp(Operation *op) const {
// Non-generic operations should not be hoisted, unless they are fill ops,
// because...
auto genericOp = llvm::dyn_cast<linalg::GenericOp>(op);
if (!genericOp) {
return !isa<linalg::FillOp>(op);
}
// Bit extend ops should not be hoisted, because...
// Broadcasts should not be hoisted, because...
// Fill ops should not be hoisted if, because...
// (maybe `linalg::isaFillOpInterface` code should move up)
// All other operations can be hoisted.
return true;
}
If you want, you could keep the changes in this PR minimal and just add your new case, but I think that restructuring would help, since the comments and code are a bit stale.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thank you, I refactored based on your suggestions. I also changed the logic to use isaBroadcastOpInterface
which I think should be equivalent to the previous logic.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks fine to me. Needs a test?
Signed-off-by: Ian Wood <ianwood2024@u.northwestern.edu>
@MaheshRavishankar I could add a test for this, but I couldn't find if there were other tests for the implementation of the interface. So, I'm not sure where I should put it. |
I think there is... @hanhanW I think you looked at this in the past. |
There are a ton bit extend ops that are getting hoisted which simply convert the weights from f16 to f32 (these ops are fairly small so they don't trigger the max size increase threshold i.e. 1024 elems). Instead, we want these ops to be fused with their consumers to prevent materializing the high bit-width tensors.
Originally from #19847 but this can be reviewed separately so I moved it out.