Bug in `RefineNestedAccess` and `SDFGState._read_and_write_sets()` #1643

philip-paul-mueller · 2024-09-06T13:27:18Z

During the implementation of the new MapFusion transformation it became necessary to reimplement _read_and_write_sets() there several bugs were discovered and fixed. However, one bug could not be fixed that is related to how the read set is cleaned.

If the following patch is applied

From f61d4e142b2b2097250b77e5b56b636b67ac9219 Mon Sep 17 00:00:00 2001
From: Philip Mueller <philip.mueller@cscs.ch>
Date: Fri, 6 Sep 2024 15:23:59 +0200
Subject: [PATCH] Undo the compatibility mode in _read_and_write_sets

---
 dace/sdfg/state.py | 10 ----------
 1 file changed, 10 deletions(-)

diff --git a/dace/sdfg/state.py b/dace/sdfg/state.py
index 23ac2f763..0c912d4d1 100644
--- a/dace/sdfg/state.py
+++ b/dace/sdfg/state.py
@@ -791,16 +791,6 @@ class DataflowGraphView(BlockGraphView, abc.ABC):
                 # Filter out memlets which go out but the same data is written to the AccessNode by another memlet
                 for out_edge in list(out_edges):
                     for in_edge in in_edges:
-                        if out_edge.data.data != in_edge.data.data:
-                            # NOTE: This check does not make any sense, and is in my view wrong.
-                            #   If we consider a memlet between two access nodes, to which access
-                            #   node the `data` attribute of the memlet refers to is arbitrary and
-                            #   does not matter. However, the test will filter _some_ out but not
-                            #   all. See also the tests inside `tests/sdfg/state_test.py` for the
-                            #   wrong behaviour this check induces.
-                            #   This check is is retained for  compatibility with `RefineNestedAccess`,
-                            #   see `tests/numpy/ufunc_support_test.py::test_ufunc_add_accumulate_simple`.
-                            continue
                         if in_subsets[in_edge].covers(out_subsets[out_edge]):
                             out_edges.remove(out_edge)
                             break
-- 
2.46.0

then tests/transformations/move_loop_into_map_test.py::MoveLoopIntoMapTest::test_apply_multiple_times will fail with an invalid SDFG error. The bug is probably located somewhere inside RefineNestedAccess.

Another error that surfaces is in tests/numpy/ufunc_support_test.py::test_ufunc_add_accumulate_simple, however, it only surfaces if the config variables optimizer.automatic_simplification and optimizer.autooptimize are set to True.

Since the regression test (see below) respect the bug it will also fail, in addition tests/transformations/move_loop_into_map_test.py::MoveLoopIntoMapTest::test_more_than_a_map will also fail, because now a transformation, that was previously classified as not applicable, became applicable (actually here I am not sure, but since my regression tests show that the old implementation is not correct, I am pretty positive).

Commit introduced the backward compatibility
Commit introducing regression tests

The text was updated successfully, but these errors were encountered:

philip-paul-mueller · 2024-09-09T06:11:28Z

After I merged master into the new-map-fusion branch some tests that were previously failing (if the bug described here is fixed) no longer failed, thus in commit 5c49eee66 I removed the check, the tests should now still pass.

It turned out, it does not run, there was an error in the tests/numpy/ufunc_support_test.py::test_ufunc_add_accumulate_simple test, I forgot about it as it only happens in autoopt mode.

…t is maintained and when not. However, this has some consequences. I added a test, that shows that this leads, depending on the memlet configuration to different outcome of the transformation. Okay, it is only affects if the transformation can be applied or not, but still. This is also in line with my [issue#1643](spcl#1643) that shows that this is a problem.

philip-paul-mueller · 2024-09-20T06:35:29Z

See also commit e1f2bf6 for an example where this behaviour is kicking in.

acalotoiu · 2024-09-25T14:17:28Z

Prune connectors is one of the transformations that has repeatedly caused problems and I believe is not currently used because of it. Maybe we should deprecate or fix @phschaad ?

philip-paul-mueller · 2024-09-25T14:21:52Z

Do you mean RefineNestedAccesse or PruneConnectors.
My comment about the commit was that now PC was behaving the old way, i.e. should an input connector be pruned or not.
Furthermore as far as I know PC is not run by default and GT4Py depends on it to make inline nested SDFG working.

phschaad · 2024-09-26T06:40:07Z

Prune connectors is one of the transformations that has repeatedly caused problems and I believe is not currently used because of it. Maybe we should deprecate or fix @phschaad ?

If PC is indeed the issue, I would vote for fixing since I understand that it still serves a purpose, but to be honest I am not very familiar with either PC or RNA..

philip-paul-mueller · 2024-09-26T06:44:44Z

@lukastruemper Has made some fixes to the transformation, he can probably give more insights.

During my work on the [new map fusion](#1643) I discovered a bug in `SDFGState._read_and_write_set()`. Originally I solved it there, but it was decided to move it into its own PR. Lets look at the first, super silly example, that is not useful on its own. The main point here, is that the `data` attribute of the Memlet does not refer to the source of the connection but of the destination. ![test_1](https://github.com/user-attachments/assets/740ee4fc-cfe5-4844-a999-e316cb8f9c16) BTW: The Webviewer outputs something like `B[0] -> [0, 0]` however, the parser of the Memlet constructor does not understand this, it must be written as `B[0] -> 0, 0`, i.e. the second set of brackets must be omitted, this should be changed! From the above we would expect the following sets: - Reads: - `A`: `[Range (0, 0)]` - `B`: Should not be listed in this set, because it is fully read and written, thus it is excluded. - Writes - `B`: `[Range (0)]` - `C`: `[Range (0, 0), Range (1, 1)]` However, the current implementation gives us: - Reads: `{'A': [Range (0)], 'B': [Range (1, 1)]}` - Write: `{'B': [Range (0)], 'C': [Range (1, 1), Range (0)]}` The current behaviour is wrong because: - `A` is a `2x2` array, thus the read set should also have two dimensions. - `B` inside the read set, it is a scalar, but the range has two dimensions, furthermore, it is present at all. - `C` the first member of the write set (`Range(1, 1)`) is correct, while the second (`Range(0)`) is horrible wrong. The second example is even more simple. ![test_2](https://github.com/user-attachments/assets/da3d03af-6f10-411f-952e-ab057ed057c6) From the SDFG we expect the following sets: - Reads: - `A`: `[Range(0, 0)]` - Writes: - `B`: `[Range(0)]` It is important that in the above example `other_subset` is `None` and `data` is set to `A`, so it is not one of these "crazy" non standard Memlets we have seen in the first test. However, the current implementation gives us: - Reads: `{'A': [Range (0, 0)]}` - Writes: `{'B': [Range (0, 0)]}` This clearly shows, that whatever the implementation does is not correct.

This PR introduces a new and improved version of `MapFusion`. A summary of the changes can also be found [here](https://github.com/user-attachments/files/18516299/new_map_fusion_summary_of_changes.pdf), it compares the resulting SDFGs generated by the old and new transformation of some unit tests. #### Fixed Bugs and removed Limitations - The subsets (not the `.subset` member of the Memlet; I mean the concept) of the new intermediate data descriptor were not computed correctly in some cases, especially in presence of offsets. See the `test_offset_correction_range_read()`, `test_offset_correction_scalar_read()` and the `test_offset_correction_empty()` tests. - Upon the propagation of the subsets, due to the changed intermediate, was not handled properly. Essentially, the transformation only updated `.subset` and ignored `.other_subset`. Which is correct in most cases but not always. See the `test_fusion_intrinsic_memlet_direction()` for more. - During the check if two maps could be fused the `.dynamic` property of the Memelts were fully ignored leading to wrong code. - The read-write conflict checks were refined, before all arrays needed to be accessed the wrong way, i.e. before a fusion was rejected when one map accessed `A[i, j]` and the other map was accessing `B[i + 1, j]`. Now this is possible as long as every access is point wise. See the `test_fusion_different_global_accesses()` test for an example. - The shape of the reduced intermediate is cleaned, i.e. unnecessary dimensions of size 1, are removed, except they were present in the original shape. To make an example, the intermediate array, `T`, had shape `(10, 1, 20)` and inside the map was accessed `T[__i, 0, __j]`, then the old transformation would have created an reduced intermediate of shape `(1, 1, 1)`, new its shape is `(1)`. Note if the intermediate has shape `(10, 20)` instead and would be accessed as `T[__i, __j]` then a `Scalar` would have been created. See also the `struct_dataflow` flag below. #### New Flags - `only_toplevel_maps`: If `True` the transformation will only fuse maps that are located at the top level, i.e. maps inside maps will not be merged. - `only_inner_maps`: If `True` then the transformation will only fuse maps that are inside other maps. - assume_always_shared`: If `True` then the transformation will assume that every intermediate is shared, i.e. the referenced data is used somewhere else in the SDFG and has to become an output of the fused maps. This will create dead data flow, but avoids a scan of the full SDFG. - `strict_dataflow`: This flag is enabled by default. It has two effects, first it will disable the cleaning of reduced intermediate storage. The second effect is more important as it will preserve a much stricter data flow. Most importantly, if the intermediate array is used downstream (this is not limited to the case that the array is the output of the second map) then the maps will not be fused together. This is mostly to work around some other bugs in DaCe, where other transformations failed to pink up the dependency. Note that the fused map would be correct, the problem are other transformations. #### `FullMapFusion` This PR also introduced the `FullMapFusion` pass, which makes use of the `FindSingleUseData` pass that was introduced in [PR#1906](#1906). The `FullMapFusion` applies MapFusion as long as possible, i.e. fuses all maps that can be fused. But instead of scanning the SDFG every time an intermediate node has to be classified, i.e. can it be deleted or not, it is done once and then reused which will speed up fusion process as it will remove the need to traverse the full SDFG many times. This new pass also replaced the direct application of MapFusion in `auto_optimizer`. #### References Collection of known issues in other transformation: - [RedundantArrayRemoval (auto optimize)](#1644) - [Bug in `RefineNestedAccess` and `SDFGState._read_and_write_sets()](#1643) - [Error in Composite Fusion (auto optimize)](#1642) --------- Co-authored-by: Philipp Schaad <schaad.phil@gmail.com>

philip-paul-mueller changed the title ~~Wrong Bhabiour in SDFGState.read_and_write_sets()~~ Wrong Behaviour in SDFGState.read_and_write_sets() Sep 6, 2024

philip-paul-mueller changed the title ~~Wrong Behaviour in SDFGState.read_and_write_sets()~~ Wrong Behaviour RefineNestedAccess Sep 6, 2024

philip-paul-mueller changed the title ~~Wrong Behaviour RefineNestedAccess~~ Bug in RefineNestedAccess and SDFGState._read_and_write_sets() Sep 6, 2024

philip-paul-mueller mentioned this issue Sep 6, 2024

New and Improved MapFusion #1629

Merged

philip-paul-mueller mentioned this issue Oct 11, 2024

Fix to Read and Write Sets #1678

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Bug in `RefineNestedAccess` and `SDFGState._read_and_write_sets()` #1643

Bug in `RefineNestedAccess` and `SDFGState._read_and_write_sets()` #1643

philip-paul-mueller commented Sep 6, 2024 •

edited

Loading

philip-paul-mueller commented Sep 9, 2024 •

edited

Loading

philip-paul-mueller commented Sep 20, 2024

acalotoiu commented Sep 25, 2024

philip-paul-mueller commented Sep 25, 2024 •

edited

Loading

phschaad commented Sep 26, 2024

philip-paul-mueller commented Sep 26, 2024

Bug in RefineNestedAccess and SDFGState._read_and_write_sets() #1643

Bug in RefineNestedAccess and SDFGState._read_and_write_sets() #1643

Comments

philip-paul-mueller commented Sep 6, 2024 • edited Loading

philip-paul-mueller commented Sep 9, 2024 • edited Loading

philip-paul-mueller commented Sep 20, 2024

acalotoiu commented Sep 25, 2024

philip-paul-mueller commented Sep 25, 2024 • edited Loading

phschaad commented Sep 26, 2024

philip-paul-mueller commented Sep 26, 2024

Bug in `RefineNestedAccess` and `SDFGState._read_and_write_sets()` #1643

Bug in `RefineNestedAccess` and `SDFGState._read_and_write_sets()` #1643

philip-paul-mueller commented Sep 6, 2024 •

edited

Loading

philip-paul-mueller commented Sep 9, 2024 •

edited

Loading

philip-paul-mueller commented Sep 25, 2024 •

edited

Loading