[GPU] Use affine.delinearize_index for MMA tiles and vector distribution #19228
Update: staring at tests showed that I should go implement the value bounds op interfaces on `affine.delinearize_index` and `affine.linearize_index`, because there were some single-iteration loops that weren't getting eliminated.
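To illustrate why bounds on delinearized indices matter for loop elimination, here is a minimal sketch. It assumes the loops in question have the shape `for (iv = lb; iv < ub; iv += step)` with `lb` derived from a delinearization result, which the comment above does not spell out. `Bounds`, `delinearizeResultBounds`, and `isSingleIteration` are hypothetical names for this illustration, not MLIR or IREE APIs.

```cpp
#include <cstdint>

// Closed interval [min, max] known to contain a value (hypothetical type).
struct Bounds {
  int64_t min;
  int64_t max;
};

// A delinearization result that corresponds to a basis entry of size
// `basisEntry` lies in [0, basisEntry - 1].
Bounds delinearizeResultBounds(int64_t basisEntry) {
  return {0, basisEntry - 1};
}

// `for (iv = lb; iv < ub; iv += step)` runs exactly once for every possible
// lb when the first iteration always executes (lb.max < ub) and the second
// never does (lb.min + step >= ub).
bool isSingleIteration(Bounds lb, int64_t ub, int64_t step) {
  return step > 0 && lb.max < ub && lb.min + step >= ub;
}
```

With a lower bound known to lie in `[0, 7]`, an upper bound of 8, and a step of 8, the check succeeds. Without any known bound on the lower bound, the same loop cannot be proven to run once.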
Awesome cleanup! Overall LGTM, one comment about how subgroup tiles are handled here because in the past they didn't actually share any link with subgroup size/number of subgroups.
compiler/src/iree/compiler/Codegen/Dialect/VectorExt/IR/VectorExtAttrs.cpp
///
/// As a special case, dimensions with stride 0 are treated as size-1
/// dimensions that are placed at the end of the delinearization, from where
/// they will canonicalize to 0.
❤️ for the great docs
This commit updates some by-hand delinearizations in MMA tile generation and vector distribution to use `affine.delinearize_index` instead.

The main tricky thing here is that a lot of that MMA code would use `(id / stride) % size`, whereas delinearize's outputs all have the form `(id % stride) / nextStride`. In all the cases at issue, we could use a utility to convert arrays of sizes and strides to a permutation on a delinearization basis.

In order to not break existing tests, the trivial-loop detector had to be manually instrumented to support `delinearize_index` (and I got `util.assume.int` while I was there). (I suspect there are a few other cases, and that, long-term, that detector should be using one of the bounds interfaces, but that's not this PR.)
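The relationship between the two forms, and the conversion from sizes and strides to a permuted basis, can be sketched roughly as follows. This is a simplified illustration, not the utility from this PR: `delinearizeByHand`, `delinearize`, `PermutedBasis`, and `basisFromSizesAndStrides` are hypothetical names, the sketch only accepts dense layouts (each nonzero stride equals the product of the sizes inner to it), and it treats stride-0 dimensions as size 1 at the innermost end, following the doc comment quoted above.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <numeric>
#include <optional>
#include <vector>

// By-hand form: result[i] = (id / strides[i]) % sizes[i].
// Assumption of this sketch: a stride-0 (broadcast) dimension always yields 0.
std::vector<int64_t> delinearizeByHand(int64_t id,
                                       const std::vector<int64_t> &sizes,
                                       const std::vector<int64_t> &strides) {
  assert(sizes.size() == strides.size());
  std::vector<int64_t> out(sizes.size(), 0);
  for (size_t i = 0; i < sizes.size(); ++i)
    if (strides[i] != 0)
      out[i] = (id / strides[i]) % sizes[i];
  return out;
}

// Model of delinearizing over a basis listed outermost first:
// result[i] = (id % outerStride) / stride, where stride is the product of
// basis[i+1..] and outerStride = stride * basis[i]. Wrapping the outermost
// result as well is a simplification made by this sketch.
std::vector<int64_t> delinearize(int64_t id,
                                 const std::vector<int64_t> &basis) {
  std::vector<int64_t> out(basis.size());
  int64_t stride = 1;
  for (size_t i = basis.size(); i-- > 0;) {
    out[i] = (id % (stride * basis[i])) / stride;
    stride *= basis[i];
  }
  return out;
}

struct PermutedBasis {
  std::vector<int64_t> basis;      // Outermost entry first.
  std::vector<size_t> dimToResult; // Original dim -> delinearize result index.
};

// Returns std::nullopt for layouts this simplified sketch cannot express.
std::optional<PermutedBasis>
basisFromSizesAndStrides(const std::vector<int64_t> &sizes,
                         const std::vector<int64_t> &strides) {
  assert(sizes.size() == strides.size());
  const size_t n = sizes.size();
  std::vector<size_t> order(n);
  std::iota(order.begin(), order.end(), size_t{0});
  // Largest stride first; stride-0 dims therefore land at the innermost end.
  std::stable_sort(order.begin(), order.end(), [&](size_t a, size_t b) {
    return strides[a] > strides[b];
  });
  PermutedBasis result{std::vector<int64_t>(n), std::vector<size_t>(n)};
  int64_t expectedStride = 1;
  for (size_t pos = n; pos-- > 0;) {
    const size_t dim = order[pos];
    int64_t size = sizes[dim];
    if (strides[dim] == 0)
      size = 1; // Broadcast dim: a size-1 entry always delinearizes to 0.
    else if (strides[dim] != expectedStride)
      return std::nullopt; // Not a dense layout.
    result.basis[pos] = size;
    result.dimToResult[dim] = pos;
    expectedStride *= size;
  }
  return result;
}
```

For example, sizes `{4, 2, 8}` with strides `{8, 32, 1}` yield the basis `{2, 4, 8}` and the mapping `{1, 0, 2}`: original dimension 0 reads delinearization result 1, and so on. For every `id` in `[0, 64)`, `delinearizeByHand` and the permuted `delinearize` results agree.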
Co-authored-by: Quinn Dawkins <quinn.dawkins@gmail.com>