Error in Redundant Array Removal #1644

Open
philip-paul-mueller opened this issue Sep 6, 2024 · 0 comments

The new MapFusion transformation removes useless dimensions of length 1 that were introduced by the over-approximation. One of the big changes is that the two subsets of a memlet now have the same dimensionality (this was not always true before anyway).

If the following patch is used to disable the strict data flow mode in the auto optimizer:

From 213c36d5ba337bfc1541941b0bc459f94270dcf5 Mon Sep 17 00:00:00 2001
From: Philip Mueller <philip.mueller@cscs.ch>
Date: Fri, 6 Sep 2024 14:54:20 +0200
Subject: [PATCH] Disable strict dataflow in auto optimizer.

---
 dace/transformation/auto/auto_optimize.py | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/dace/transformation/auto/auto_optimize.py b/dace/transformation/auto/auto_optimize.py
index 09ff481e3..db0ba5ef2 100644
--- a/dace/transformation/auto/auto_optimize.py
+++ b/dace/transformation/auto/auto_optimize.py
@@ -52,6 +52,9 @@ def greedy_fuse(graph_or_subgraph: GraphViewType,
     :param permutations_only: Disallow splitting of maps during MultiExpansion stage
     :param expand_reductions: Expand all reduce nodes before fusion
     """
+    validate_all = True
+    validate = True
+    strict_dataflow = True
     debugprint = config.Config.get_bool('debugprint')
     if isinstance(graph_or_subgraph, ControlFlowRegion):
         if isinstance(graph_or_subgraph, SDFG):
@@ -61,7 +64,7 @@ def greedy_fuse(graph_or_subgraph: GraphViewType,
             #  We have to use `strict_dataflow` because it is known that `CompositeFusion`
             #  has problems otherwise.
             graph_or_subgraph.apply_transformations_repeated(
-                    MapFusion(strict_dataflow=True),
+                    MapFusion(strict_dataflow=strict_dataflow),
                     validate_all=validate_all,
             )
 
@@ -82,7 +85,7 @@ def greedy_fuse(graph_or_subgraph: GraphViewType,
         if isinstance(graph_or_subgraph, SDFGState):
             sdfg = graph_or_subgraph.parent
             sdfg.apply_transformations_repeated(
-                    MapFusion(strict_dataflow=True),
+                    MapFusion(strict_dataflow=strict_dataflow),
                     validate_all=validate_all,
             )
             graph = graph_or_subgraph
-- 
2.46.0

the test tests/npbench/weather_stencils/vadv_test.py will fail with an error that Subset.offset() was called with offsets of different dimensionality.
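
The failure mode can be pictured with a toy model. The sketch below is not DaCe code: `offset_ranges` is a hypothetical helper that shifts a list of `(begin, end)` ranges by one offset per dimension and, like the error described above, refuses to proceed when its two arguments do not have the same number of dimensions. That a dropped dimension of length 1 is what produces the mismatch in the failing test is an assumption, not something verified here.

```python
from typing import List, Tuple

Range = Tuple[int, int]  # inclusive (begin, end) of one dimension


def offset_ranges(ranges: List[Range], offsets: List[int]) -> List[Range]:
    """Shift every dimension of `ranges` by the matching entry of `offsets`.

    Hypothetical stand-in for a subset offset operation; not DaCe's API.
    """
    if len(ranges) != len(offsets):
        raise ValueError(
            f"dimensionality mismatch: subset has {len(ranges)} dimensions, "
            f"offset has {len(offsets)}"
        )
    return [(b + o, e + o) for (b, e), o in zip(ranges, offsets)]


# Same dimensionality on both sides: the shift is applied per dimension.
shifted = offset_ranges([(0, 9), (0, 0), (0, 19)], [1, 0, 2])

# One side lost its dimension of length 1: the call is rejected.
try:
    offset_ranges([(0, 9), (0, 0), (0, 19)], [1, 2])
    mismatch_detected = False
except ValueError:
    mismatch_detected = True
```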

tbennun changed the title from "Error in Redundand Array Removal" to "Error in Redundant Array Removal" on Sep 18, 2024
github-merge-queue bot pushed a commit that referenced this issue Feb 27, 2025
This PR introduces a new and improved version of `MapFusion`.
A summary of the changes can also be found
[here](https://github.com/user-attachments/files/18516299/new_map_fusion_summary_of_changes.pdf).
It compares the SDFGs generated by the old and the new
transformation for some unit tests.

#### Fixed Bugs and Removed Limitations
- The subsets (not the `.subset` member of the Memlet; I mean the
concept) of the new intermediate data descriptor were not computed
correctly in some cases, especially in the presence of offsets. See the
`test_offset_correction_range_read()`,
`test_offset_correction_scalar_read()` and the
`test_offset_correction_empty()` tests.
- The propagation of the subsets, which is needed because of the changed
intermediate, was not handled properly. Essentially, the transformation
only updated `.subset` and ignored `.other_subset`, which is correct in
most cases but not always. See the
`test_fusion_intrinsic_memlet_direction()` test for more.
- During the check whether two maps could be fused, the `.dynamic` property
of the Memlets was fully ignored, leading to wrong code.
- The read-write conflict checks were refined. Previously all arrays needed
to be accessed the same way, i.e. a fusion was rejected when one
map accessed `A[i, j]` and the other map accessed `B[i + 1, j]`.
Now this is possible as long as every access is point wise. See the
`test_fusion_different_global_accesses()` test for an example.
- The shape of the reduced intermediate is cleaned, i.e. unnecessary
dimensions of size 1 are removed, unless they were present in the
original shape. For example, if the intermediate array `T` had
shape `(10, 1, 20)` and was accessed inside the map as `T[__i, 0, __j]`,
then the old transformation would have created a reduced intermediate
of shape `(1, 1, 1)`; now its shape is `(1)`. Note that if the intermediate
had shape `(10, 20)` instead and were accessed as `T[__i, __j]`, then
a `Scalar` would be created. See also the `strict_dataflow` flag
below.
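
The cleaning rule from the last item can be sketched in a few lines. This is only an illustration of the rule as stated, not the code used by the transformation; `clean_reduced_shape` is a hypothetical helper, and an empty tuple stands in for the `Scalar` case.

```python
from typing import Sequence, Tuple


def clean_reduced_shape(original_shape: Sequence[int],
                        reduced_shape: Sequence[int]) -> Tuple[int, ...]:
    """Drop dimensions of size 1 that exist only because of the reduction.

    A dimension of size 1 is kept only if it already had size 1 in the
    original shape. An empty result stands for a scalar.
    Illustrative only; not the code used by the transformation.
    """
    if len(original_shape) != len(reduced_shape):
        raise ValueError("both shapes must have the same number of dimensions")
    return tuple(
        red for orig, red in zip(original_shape, reduced_shape)
        if red != 1 or orig == 1
    )


# `T` of shape (10, 1, 20), accessed as T[__i, 0, __j]: only the
# dimension that was already of size 1 survives.
case_array = clean_reduced_shape((10, 1, 20), (1, 1, 1))

# `T` of shape (10, 20), accessed as T[__i, __j]: nothing survives,
# which corresponds to a scalar.
case_scalar = clean_reduced_shape((10, 20), (1, 1))
```

Under this model the uncleaned shape `(1, 1, 1)` is what the old behaviour corresponds to.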

#### New Flags
- `only_toplevel_maps`: If `True` the transformation will only fuse maps
that are located at the top level, i.e. maps inside maps will not be
merged.
- `only_inner_maps`: If `True` then the transformation will only fuse
maps that are inside other maps.
- `assume_always_shared`: If `True` then the transformation will assume
that every intermediate is shared, i.e. the referenced data is used
somewhere else in the SDFG and has to become an output of the fused
maps. This will create dead data flow, but avoids a scan of the full
SDFG.
- `strict_dataflow`: This flag is enabled by default. It has two
effects. First, it disables the cleaning of reduced intermediate
storage. The second effect is more important: it preserves a much
stricter data flow. Most importantly, if the intermediate array is used
downstream (this is not limited to the case that the array is the output
of the second map) then the maps will not be fused together. This is
mostly to work around some other bugs in DaCe, where other
transformations failed to pick up the dependency. Note that the fused
map would be correct; the problem lies in the other transformations.
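
The two effects of `strict_dataflow` can be summarised in a toy decision function. The names `FusionDecision` and `decide` are hypothetical and do not exist in DaCe. The real transformation checks many more conditions before fusing; the sketch only models the two effects listed above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FusionDecision:
    fuse: bool                # may the two maps be fused at all?
    clean_intermediate: bool  # may dimensions of size 1 be stripped?


def decide(intermediate_used_downstream: bool,
           strict_dataflow: bool = True) -> FusionDecision:
    """Toy model of the two effects of `strict_dataflow`; not DaCe code."""
    if strict_dataflow:
        # Effect 1: the reduced intermediate is not cleaned.
        # Effect 2: no fusion if the intermediate is used downstream.
        return FusionDecision(fuse=not intermediate_used_downstream,
                              clean_intermediate=False)
    # Without strict data flow neither restriction applies in this model.
    return FusionDecision(fuse=True, clean_intermediate=True)


strict = decide(intermediate_used_downstream=True)
relaxed = decide(intermediate_used_downstream=True, strict_dataflow=False)
```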

#### `FullMapFusion`
This PR also introduces the `FullMapFusion` pass, which makes use of the
`FindSingleUseData` pass that was introduced in
[PR#1906](#1906).
`FullMapFusion` applies MapFusion as long as possible, i.e. it fuses
all maps that can be fused.
Instead of scanning the SDFG every time an intermediate node has to
be classified (can it be deleted or not?), the scan is done once and its
result is reused. This speeds up the fusion process because the full SDFG
no longer has to be traversed many times.
This new pass also replaces the direct application of MapFusion in
`auto_optimizer`.
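
The "scan once, then reuse" idea can be sketched without DaCe. In the toy model below an SDFG is reduced to a dictionary that maps a state name to the data names accessed in it; `find_single_use_data` and `classify_intermediates` are hypothetical helpers and are much simpler than the actual `FindSingleUseData` pass.

```python
from collections import Counter
from typing import Dict, Iterable, List, Set


def find_single_use_data(states: Dict[str, List[str]]) -> Set[str]:
    """Scan every state once and return the data names that occur exactly once.

    Simplified stand-in for a single-use analysis; not DaCe's pass.
    """
    counts = Counter(name for accesses in states.values() for name in accesses)
    return {name for name, n in counts.items() if n == 1}


def classify_intermediates(states: Dict[str, List[str]],
                           intermediates: Iterable[str]) -> Dict[str, bool]:
    """Return, per intermediate, whether it can be deleted after fusion."""
    single_use = find_single_use_data(states)  # one full scan
    # Every classification afterwards is a set lookup, not a new scan.
    return {name: name in single_use for name in intermediates}


# `T` is accessed only once, `B` is accessed in two states.
toy_states = {"s0": ["A", "T", "B"], "s1": ["B", "C"]}
deletable = classify_intermediates(toy_states, ["T", "B"])
```

The point of the design is that the cost of the full scan is paid once, regardless of how many intermediates are classified during repeated fusion.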


#### References
Collection of known issues in other transformations:
- [RedundantArrayRemoval (auto
optimize)](#1644)
- [Bug in `RefineNestedAccess` and
`SDFGState._read_and_write_sets()`](#1643)
- [Error in Composite Fusion (auto
optimize)](#1642)

---------

Co-authored-by: Philipp Schaad <schaad.phil@gmail.com>