Add pass for replacing dq-q patterns with rescale #8513
Merged
When an int8 op meets an int32 op, the int8 value is first dequantized to float and then quantized to the desired int32 dtype.
This produces an (int8 dq -> q int32) pattern that we can replace with a TOSA.RESCALE, since the two are approximately mathematically equivalent, differing only in how the rounding is done.
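To illustrate the equivalence, here is a minimal numeric sketch in plain Python. It is not the pass from this PR and does not use any ExecuTorch or TOSA API: the function names, the example scales and zero points, and the 31-bit multiplier/shift encoding are all assumptions chosen for illustration. It compares the float dq -> q path against an integer-only rescale and checks how far apart the results are.

```python
import math


def dq_then_q(x_q, in_scale, in_zp, out_scale, out_zp):
    """Reference path: dequantize to float, then quantize to the target grid."""
    x_f = (x_q - in_zp) * in_scale          # dq: integer -> float
    return round(x_f / out_scale) + out_zp  # q: float -> integer (round half to even)


def compute_multiplier_and_shift(ratio):
    """Approximate a positive ratio as multiplier * 2**-shift (hypothetical helper)."""
    mantissa, exponent = math.frexp(ratio)  # ratio == mantissa * 2**exponent
    multiplier = round(mantissa * (1 << 31))
    shift = 31 - exponent
    if multiplier == (1 << 31):             # rounding overflowed the mantissa
        multiplier //= 2
        shift -= 1
    return multiplier, shift


def rescale(x_q, in_zp, out_zp, multiplier, shift):
    """Integer-only rescale: multiply, then shift right with round half up."""
    value = (x_q - in_zp) * multiplier
    return ((value + (1 << (shift - 1))) >> shift) + out_zp


def max_abs_difference(in_scale, in_zp, out_scale, out_zp):
    """Largest disagreement between the two paths over the whole int8 range."""
    multiplier, shift = compute_multiplier_and_shift(in_scale / out_scale)
    return max(
        abs(dq_then_q(x, in_scale, in_zp, out_scale, out_zp)
            - rescale(x, in_zp, out_zp, multiplier, shift))
        for x in range(-128, 128)
    )


if __name__ == "__main__":
    # Example quantization parameters (assumed, not taken from the PR).
    print(max_abs_difference(in_scale=0.0123, in_zp=3, out_scale=0.0007, out_zp=0))
```

With these assumed parameters the two paths can disagree by at most one quantization step, which comes from the different rounding rules and the fixed-point approximation of the scale ratio.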
This requires a few changes:
The change makes it possible to mix int8 and int32 quantization, as showcased in the new test_add_i32_tosa_BI test.
This PR is a reupload after revert #8480.
cc @digantdesai @freddan80 @per @zingo @oscarandersson8218