[GPU] Use affine.delinearize_index for MMA tiles and vector distribution #19228

krzysz00 · 2024-11-20T17:03:14Z

This commit updates some by-hand delinearizations in MMA tile generation and vector distribution to use `affine.delinearize_index` instead.

The main tricky thing here is that a lot of that MMA code would use `(id / stride) % size`, whereas delinearize's outputs all have the form `(id % stride) / nextStride`. In all the cases at issue, we could use a utility that converts arrays of sizes and strides into a delinearization basis and a permutation of its results.
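To make the relationship between the two forms concrete, here is a minimal sketch in plain Python integers. It is not the code in this PR: `delinearize` is only a model of `affine.delinearize_index` (outermost basis element first, with no modulo applied to the outermost result), and `basis_from_sizes_strides` and `by_hand` are hypothetical names chosen for illustration. The sketch assumes the strides are dense, meaning each stride equals the product of all faster-varying sizes, and that `id` is smaller than the product of the sizes.

```python
def delinearize(index, basis):
    """Integer model of affine.delinearize_index; basis is outermost-first."""
    results = []
    # Peel off the innermost dimensions with mod/div.
    for size in reversed(basis[1:]):
        results.append(index % size)
        index //= size
    # The outermost result is whatever remains (no modulo is applied).
    results.append(index)
    return results[::-1]


def by_hand(index, sizes, strides):
    """The hand-written form: (id / stride) % size for each dimension."""
    return [(index // stride) % size for size, stride in zip(sizes, strides)]


def basis_from_sizes_strides(sizes, strides):
    """Hypothetical helper: return (basis, perm) such that dimension d of the
    hand-written form equals delinearize(id, basis)[perm[d]]."""
    # Slowest-varying (largest stride) dimension comes first in the basis.
    order = sorted(range(len(sizes)), key=lambda d: strides[d], reverse=True)
    # Assumption: strides are dense, so walk from the fastest dimension outward.
    expected = 1
    for d in reversed(order):
        if strides[d] != expected:
            raise ValueError("strides do not describe a dense layout")
        expected *= sizes[d]
    basis = [sizes[d] for d in order]
    perm = [0] * len(sizes)
    for pos, d in enumerate(order):
        perm[d] = pos
    return basis, perm


# Dimension 0 varies fastest (stride 1), dimension 1 has stride 16.
sizes, strides = [16, 4], [1, 16]
basis, perm = basis_from_sizes_strides(sizes, strides)
for lane in range(64):
    delinearized = delinearize(lane, basis)
    assert by_hand(lane, sizes, strides) == [delinearized[p] for p in perm]
```

The two forms agree only while `id` stays below the product of the sizes, because the hand-written form wraps the outermost dimension with `% size` and the model of delinearize does not.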

To avoid breaking existing tests, the trivial-loop detector had to be manually instrumented to support `delinearize_index` (and I handled `util.assume.int` while I was there). I suspect there are a few other cases and that, long-term, that detector should be using one of the bounds interfaces, but that's out of scope for this PR.

krzysz00 · 2025-01-03T23:25:37Z

Update: staring at the tests showed that I should implement the value bounds op interfaces for `affine.delinearize_index` and `affine.linearize_index`, because some single-iteration loops weren't getting eliminated.
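As a rough illustration of why bounds on delinearize results matter for loop elimination, here is a small sketch in plain Python. It does not use or reproduce MLIR's value bounds interface; the function names are hypothetical, and the only facts it relies on are that result `i` of a delinearization with a fully specified basis lies in `[0, basis[i] - 1]` and that a loop runs exactly once when `0 < ub - lb <= step`.

```python
def delinearize_result_bounds(basis):
    """Inclusive (min, max) bounds for each result, assuming the linear index
    is non-negative and smaller than the product of the basis."""
    return [(0, size - 1) for size in basis]


def is_single_iteration(lb_bounds, ub, step):
    """True if `for i = lb to ub step step` runs exactly once for every
    lower bound within lb_bounds (inclusive)."""
    lb_min, lb_max = lb_bounds
    # At least one iteration for the largest lb, at most one for the smallest.
    return lb_max < ub and ub - lb_min <= step


# A thread id split into (4, 16); the inner id is used as a loop lower bound.
bounds = delinearize_result_bounds([4, 16])
inner = bounds[1]
assert is_single_iteration(inner, ub=16, step=16)      # always exactly one trip
assert not is_single_iteration(inner, ub=32, step=16)  # may run twice
```

Without a known upper bound on the delinearized value, the first check could not be proven, which matches the symptom of single-iteration loops being left in place.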

qedawkins

Awesome cleanup! Overall LGTM, with one comment about how subgroup tiles are handled here: in the past they didn't actually share any link with subgroup size/number of subgroups.

compiler/src/iree/compiler/Codegen/Dialect/GPU/IR/IREEGPUAttrs.cpp

compiler/src/iree/compiler/Codegen/Dialect/GPU/Transforms/test/distribute_mma_to_lanes.mlir

compiler/src/iree/compiler/Codegen/Dialect/VectorExt/IR/VectorExtAttrs.cpp

compiler/src/iree/compiler/Utils/Indexing.cpp

qedawkins · 2025-01-27T17:47:59Z

compiler/src/iree/compiler/Utils/Indexing.h

+///
+/// As a special case, dimensions with stride 0 are treated as size-1
+/// dimensions that are placed at the end of the delinearization, from where
+/// they will canonicalize to 0.


❤️ for the great docs
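For anyone reading the doc comment above without the code at hand, here is a minimal sketch of the stride-0 special case in plain Python. It is an illustration under assumptions rather than the implementation in `Indexing.h`: `delinearize` is an integer model of `affine.delinearize_index`, `basis_with_zero_strides` is a hypothetical name, the sketch assumes at least one dimension has a non-zero stride, and it skips the density check on the non-zero strides.

```python
def delinearize(index, basis):
    """Integer model of affine.delinearize_index; basis is outermost-first."""
    results = []
    for size in reversed(basis[1:]):
        results.append(index % size)
        index //= size
    results.append(index)  # outermost result: no modulo
    return results[::-1]


def basis_with_zero_strides(sizes, strides):
    """Hypothetical helper: build (basis, perm) where stride-0 dimensions
    become size-1 entries at the end of the basis."""
    dims = range(len(sizes))
    # Dimensions with a real stride, slowest-varying first.
    live = sorted((d for d in dims if strides[d] != 0),
                  key=lambda d: strides[d], reverse=True)
    # Stride-0 dimensions go last and contribute a basis element of 1.
    dead = [d for d in dims if strides[d] == 0]
    basis = [sizes[d] for d in live] + [1] * len(dead)
    perm = [0] * len(sizes)
    for pos, d in enumerate(live + dead):
        perm[d] = pos
    return basis, perm


# The middle dimension has stride 0, so its index should always be 0.
sizes, strides = [4, 8, 16], [16, 0, 1]
basis, perm = basis_with_zero_strides(sizes, strides)
for lane in range(64):
    delinearized = delinearize(lane, basis)
    per_dim = [delinearized[p] for p in perm]
    assert per_dim == [lane // 16, 0, lane % 16]
```

A trailing basis element of 1 always yields `index % 1 == 0`, which is why such dimensions can fold to the constant 0.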

krzysz00 force-pushed the users/krzysz00/gpu-distribute-with-linearize branch from 5328767 to 5a8fa83 Compare November 21, 2024 20:54

krzysz00 force-pushed the users/krzysz00/linearize-mma branch from 829c3d5 to aaf6cf8 Compare November 21, 2024 22:30

krzysz00 force-pushed the users/krzysz00/gpu-distribute-with-linearize branch from 5a8fa83 to 291f570 Compare November 26, 2024 19:24

krzysz00 force-pushed the users/krzysz00/linearize-mma branch from aaf6cf8 to 3ccf6a4 Compare November 26, 2024 19:28

Base automatically changed from users/krzysz00/gpu-distribute-with-linearize to main November 26, 2024 20:38

krzysz00 force-pushed the users/krzysz00/linearize-mma branch 4 times, most recently from ba7bc66 to d577300 Compare December 17, 2024 23:07

krzysz00 force-pushed the users/krzysz00/linearize-mma branch from 31e58d0 to 5592a62 Compare January 3, 2025 22:12

krzysz00 force-pushed the users/krzysz00/linearize-mma branch from 8e5ebb0 to de7e313 Compare January 7, 2025 00:13

krzysz00 marked this pull request as ready for review January 7, 2025 00:21

krzysz00 requested review from antiagainst, MaheshRavishankar, kuhar, qedawkins, hanhanW and benvanik as code owners January 7, 2025 00:21

krzysz00 force-pushed the users/krzysz00/linearize-mma branch 2 times, most recently from 0277d07 to 3ff65f0 Compare January 13, 2025 21:14

krzysz00 force-pushed the users/krzysz00/linearize-mma branch 4 times, most recently from 12377f8 to d8e695c Compare January 27, 2025 17:38

qedawkins reviewed Jan 27, 2025

qedawkins approved these changes Jan 27, 2025

krzysz00 added 3 commits January 27, 2025 22:29

Update tests

3a3467e

Bazel

88ade5d

krzysz00 force-pushed the users/krzysz00/linearize-mma branch from d8e695c to 88ade5d Compare January 27, 2025 22:29

krzysz00 and others added 5 commits January 27, 2025 16:34

Suggestions from Quinn

e4141be

Co-authored-by: Quinn Dawkins <quinn.dawkins@gmail.com>

Review feedback

4d74510

Namespaces

a95b3d2

Bazel, part 2

991d8aa

Bazel v2

ce89f0b

krzysz00 merged commit ecd67d9 into main Jan 28, 2025
44 checks passed

krzysz00 deleted the users/krzysz00/linearize-mma branch January 28, 2025 18:16
