Add propagation of thread dependent index sequences #292

harsh-nod · 2024-11-22T22:09:10Z

This PR is a major refactor of the index sequence analysis. Specifically:

- Index sequence computation is broken down into two phases: a thread-independent index, which is calculated with one pass through the graph, and a thread-dependent index, which is computed by propagating indices from nodes such as MMA, Read or Write.
- A heuristic, based on dimensional analysis, is added to determine how to propagate information when multiple nodes compete for a given node.
- An additional unit test is added for attention expansion.
- An e2e test for attention with bias is added.

Signed-off-by: Harsh Menon <harsh@nod-labs.com>

raikonenfnu · 2024-11-26T19:38:29Z

iree/turbine/kernel/wave/thread_shape_analysis.py

+    if dst_op:
+        for node in propagated_resolutions:
+            get_custom(node).index = dst_op.index


When would we use it without dst_op?

harsh-nod · 2024-12-02

Good question. This dates from when I was using it in two places where dst_op was optional. I can modify this so that it no longer handles that case.

raikonenfnu · 2024-11-26T19:53:36Z

iree/turbine/kernel/wave/index_sequence_analysis.py

+    thread_dependent_index: dict[IndexSymbol, IndexSequence],
+) -> dict[IndexSymbol, IndexSequence]:
+    combined_index = {k: v for k, v in thread_independent_index.items()}
+    for k in combined_index:


It would be nice to add some docs with different examples/cases showing when we'd add the start offsets (perhaps read -> read, or offset read + mma?).

harsh-nod · 2024-12-03

In this PR, we split the index assignment into two phases. The first is a thread-independent index assignment, where we set the indices based on workgroup constraints, tiling constraints, and in general anything that has no thread-level dependence. Once we have this index, we propagate the thread-dependent index, which comes from MMA nodes or, if there are no MMA nodes, from reads and writes. We then add the thread-dependent index to the thread-independent index, so we are always adding these two offsets together.
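As a rough illustration of the "add the two offsets together" step, here is a minimal sketch. It is not the code from this PR: `Seq` is a toy stand-in for the real `IndexSequence`, dimensions are plain strings, and offsets are integers rather than symbolic expressions. Taking size and stride from the thread-dependent sequence is also an assumption made for the example.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Seq:
    """Toy stand-in for an index sequence (hypothetical, not the real class)."""

    start: int
    size: int = 1
    stride: int = 1


def combine(
    thread_independent: dict[str, Seq],
    thread_dependent: dict[str, Seq],
) -> dict[str, Seq]:
    """Add thread-dependent start offsets onto the thread-independent ones.

    Only dimensions already present in the thread-independent index are kept.
    Assumption: size and stride come from the thread-dependent sequence.
    """
    combined = dict(thread_independent)
    for dim, seq in thread_dependent.items():
        if dim in combined:
            combined[dim] = replace(seq, start=combined[dim].start + seq.start)
    return combined


# Workgroup/tiling offsets (thread-independent) for two dimensions.
independent = {"M": Seq(start=64), "N": Seq(start=128)}
# Per-thread offset along M only, e.g. as propagated from an MMA node.
dependent = {"M": Seq(start=3, size=4, stride=16)}

combined = combine(independent, dependent)
print(combined["M"])  # Seq(start=67, size=4, stride=16)
print(combined["N"])  # Seq(start=128, size=1, stride=1)
```

A dimension with no thread-dependent entry (here `N`) keeps its thread-independent sequence unchanged.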

raikonenfnu · 2024-11-26T20:00:24Z

iree/turbine/kernel/wave/index_sequence_analysis.py

+            vector_shapes = (
+                custom.vector_shapes if custom.vector_shapes else source_vector_shapes
+            )
+            sources.append((custom, source_index, vector_shapes))


Why not use source.index? It seems the only time source.index != source_index is when source.index has underlying indices before getting propagated. But wouldn't we want that?

harsh-nod · 2024-12-03

source.index contains the unified index, which is the sum of the thread-dependent index and the thread-independent index. During propagation we only want to propagate the thread-dependent index, and that's why we propagate only source_index.
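A small sketch of why only the thread-dependent part is propagated. Everything here is hypothetical and simplified: the graph is a plain adjacency dict, offsets are integers, and `propagate` and `unified` are illustrative helpers, not functions from the codebase.

```python
from collections import deque


def propagate(edges, thread_dependent, source):
    """Copy the source's thread-dependent offsets to every node reachable from it."""
    result = dict(thread_dependent)
    seen = {source}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nxt in edges.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                result[nxt] = dict(result[source])
                queue.append(nxt)
    return result


def unified(independent, dependent):
    """Unified index = thread-independent offset + thread-dependent offset."""
    return {dim: off + dependent.get(dim, 0) for dim, off in independent.items()}


edges = {"mma": ["write"]}
independent = {"mma": {"M": 64}, "write": {"M": 64}}
dependent = {"mma": {"M": 3}}

propagated = propagate(edges, dependent, "mma")
print(unified(independent["write"], propagated["write"]))  # {'M': 67}

# Propagating the unified index of the source instead would count the
# thread-independent offset twice at the destination:
wrong = unified(independent["write"], unified(independent["mma"], dependent["mma"]))
print(wrong)  # {'M': 131}
```

In this toy setup, the destination already carries its own thread-independent offset, so propagating the unified index would double-count it.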

raikonenfnu

Overall looks good, and nice refactoring. Just a couple of clarifying questions, but we can land for now. :)

harsh-nod force-pushed the attn_bias branch 7 times, most recently from 70e166d to a3b6f19 Compare November 24, 2024 19:32

harsh-nod changed the title ~~Add attention with bias tests~~ Add propagation of thread dependent index sequences Nov 24, 2024

harsh-nod force-pushed the attn_bias branch 2 times, most recently from 8831e4d to 2ef7724 Compare November 25, 2024 00:13

harsh-nod requested review from raikonenfnu, Hardcode84 and martin-luecke November 25, 2024 00:13

harsh-nod force-pushed the attn_bias branch 2 times, most recently from 9a03b35 to bb59e95 Compare November 25, 2024 03:41

Add attention with bias tests

5c046e4

Signed-off-by: Harsh Menon <harsh@nod-labs.com>

harsh-nod force-pushed the attn_bias branch from bb59e95 to 5c046e4 Compare November 25, 2024 04:41

harsh-nod mentioned this pull request Nov 25, 2024

[TKW] Modify Index Seq Analysis to handle "detours" #246

Open

raikonenfnu reviewed Nov 26, 2024

raikonenfnu approved these changes Nov 26, 2024

Hardcode84 approved these changes Nov 26, 2024

raikonenfnu merged commit d9d2e7b into iree-org:main Nov 27, 2024
8 checks passed


raikonenfnu Nov 26, 2024

harsh-nod Dec 2, 2024

raikonenfnu Nov 26, 2024

harsh-nod Dec 3, 2024

raikonenfnu Nov 26, 2024

harsh-nod Dec 3, 2024
