Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Don't hoist bit-extend op as a leaf #19871

Merged
merged 3 commits into from
Feb 5, 2025
Merged
Show file tree
Hide file tree
Changes from 2 commits
Commits
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
26 changes: 13 additions & 13 deletions .github/workflows/pkgci_regression_test.yml
Original file line number Diff line number Diff line change
Expand Up @@ -131,9 +131,9 @@ jobs:
--goldentime-rocm-unet-ms 255.0 \
--goldentime-rocm-clip-ms 14.5 \
--goldentime-rocm-vae-ms 310.0 \
--goldendispatch-rocm-unet 1602 \
--goldendispatch-rocm-clip 1139 \
--goldendispatch-rocm-vae 246 \
--goldendispatch-rocm-unet 1236 \
--goldendispatch-rocm-clip 967 \
--goldendispatch-rocm-vae 208 \
--goldensize-rocm-unet-bytes 2280000 \
--goldensize-rocm-clip-bytes 860000 \
--goldensize-rocm-vae-bytes 840000 \
Expand All @@ -156,16 +156,16 @@ jobs:
--goldentime-rocm-unet-ms 80.0 \
--goldentime-rocm-clip-ms 15.0 \
--goldentime-rocm-vae-ms 75.0 \
--goldendispatch-rocm-unet 1602 \
--goldendispatch-rocm-clip 1139 \
--goldendispatch-rocm-vae 246 \
--goldendispatch-rocm-unet 1236 \
--goldendispatch-rocm-clip 967 \
--goldendispatch-rocm-vae 208 \
--goldensize-rocm-unet-bytes 2270000 \
--goldensize-rocm-clip-bytes 860000 \
--goldensize-rocm-vae-bytes 840000 \
--goldentime-rocm-punet-int8-fp16-ms 50.0 \
--goldentime-rocm-punet-int8-fp8-ms 52.0 \
--goldendispatch-rocm-punet-int8-fp16 1424 \
--goldendispatch-rocm-punet-int8-fp8 1704 \
--goldendispatch-rocm-punet-int8-fp16 1419 \
--goldendispatch-rocm-punet-int8-fp8 1699 \
--goldensize-rocm-punet-int8-fp8-bytes 2800000 \
--goldensize-rocm-punet-int8-fp16-bytes 2560000 \
--rocm-chip gfx942 \
Expand All @@ -185,16 +185,16 @@ jobs:
--goldentime-rocm-unet-ms 195.0 \
--goldentime-rocm-clip-ms 15.0 \
--goldentime-rocm-vae-ms 190.0 \
--goldendispatch-rocm-unet 1602 \
--goldendispatch-rocm-clip 1139 \
--goldendispatch-rocm-vae 246 \
--goldendispatch-rocm-unet 1236 \
--goldendispatch-rocm-clip 967 \
--goldendispatch-rocm-vae 208 \
--goldensize-rocm-unet-bytes 2270000 \
--goldensize-rocm-clip-bytes 860000 \
--goldensize-rocm-vae-bytes 840000 \
--goldentime-rocm-punet-int8-fp16-ms 140.0 \
--goldentime-rocm-punet-int8-fp8-ms 150 \
--goldendispatch-rocm-punet-int8-fp16 1424 \
--goldendispatch-rocm-punet-int8-fp8 1704 \
--goldendispatch-rocm-punet-int8-fp16 1419 \
--goldendispatch-rocm-punet-int8-fp8 1699 \
--goldensize-rocm-punet-int8-fp8-bytes 2800000 \
--goldensize-rocm-punet-int8-fp16-bytes 2560000 \
--rocm-chip gfx942 \
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -36,6 +36,7 @@ iree_compiler_cc_library(
"//compiler/src/iree/compiler/Dialect/Flow/IR",
"//compiler/src/iree/compiler/Dialect/HAL/IR",
"//compiler/src/iree/compiler/Dialect/LinalgExt/IR",
"//compiler/src/iree/compiler/Dialect/LinalgExt/Utils",
"//compiler/src/iree/compiler/Dialect/Stream/IR",
"//compiler/src/iree/compiler/Dialect/Util/IR",
"@llvm-project//mlir:ArithDialect",
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -42,6 +42,7 @@ iree_cc_library(
iree::compiler::Dialect::Flow::IR
iree::compiler::Dialect::HAL::IR
iree::compiler::Dialect::LinalgExt::IR
iree::compiler::Dialect::LinalgExt::Utils
iree::compiler::Dialect::Stream::IR
iree::compiler::Dialect::Util::IR
PUBLIC
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -12,6 +12,7 @@
#include "iree/compiler/Dialect/Encoding/IR/EncodingOps.h"
#include "iree/compiler/Dialect/LinalgExt/IR/LinalgExtDialect.h"
#include "iree/compiler/Dialect/LinalgExt/IR/LinalgExtOps.h"
#include "iree/compiler/Dialect/LinalgExt/Utils/Utils.h"
#include "iree/compiler/Dialect/Util/IR/UtilDialect.h"
#include "iree/compiler/Dialect/Util/IR/UtilOps.h"
#include "iree/compiler/Dialect/Util/IR/UtilTypes.h"
Expand Down Expand Up @@ -340,6 +341,9 @@ struct HoistableLinalgOpInterface
auto genericOp = llvm::dyn_cast<linalg::GenericOp>(op);
if (!genericOp)
return !isa<linalg::FillOp>(op);
if (IREE::LinalgExt::isBitExtendOp(op)) {
return false;
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you tweak the comment on lines 338 and 339 to include this case too?

Could take the text from the PR description:

There are a ton bit extend ops that are getting hoisted which simply convert the weights from f16 to f32 (these ops are fairly small so they don't trigger the max size increase threshold i.e. 1024 elems). Instead, we want these ops to be fused with their consumers to prevent materializing the high bit-width tensors.

Maybe also add {} around the if on lines 342-343 while you're in the file too.

So the function could then look like:

  // Determines if an operation is a hoistable leaf, based on heuristics.
  bool isHoistableLeafOp(Operation *op) const {
    // Non-generic operations should not be hoisted, unless they are fill ops,
    // because...
    auto genericOp = llvm::dyn_cast<linalg::GenericOp>(op);
    if (!genericOp) {
      return !isa<linalg::FillOp>(op);
    }

    // Bit extend ops should not be hoisted, because...

    // Broadcasts should not be hoisted, because...

    // Fill ops should not be hoisted if, because...
    // (maybe `linalg::isaFillOpInterface` code should move up)

    // All other operations can be hoisted.
    return true;
}

If you want, you could keep the changes in this PR minimal and just add your new case, but I think that restructuring would help, since the comments and code are a bit stale.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thank you, I refactored based on your suggestions. I also changed the logic to use isaBroadcastOpInterface which I think should be equivalent to the previous logic.

// Generally, we prefer to not hoist broadcasts.
// Detect op that only broadcast input as fusing them makes the new
// op cheaper.
Expand Down
Loading