Support qdq decomposition of DotGeneralOp and ConvolutionOp #2461

sdasgup3 · 2024-07-24T00:33:46Z

The PR

- Adds qdq decomposition patterns for DotGeneralOp and ConvolutionOp.
- Updates the tests in stablehlo/tests/stablehlo_legalize_quant_to_int.mlir, changing only the negative tests, which were previously unhandled by stablehlo-legalize-quant-to-int. The current stablehlo-legalize-quant-to-math uses the qdq fallback to handle these cases.
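For readers unfamiliar with the term, a qdq (quantize/dequantize) decomposition rewrites a quantized op as: dequantize the operands, run the op in floating point, then quantize the result. The Python sketch below illustrates that idea on a plain 2-D matrix product. It is not the pass's implementation: `QuantParams`, `dequantize`, `quantize`, and `qdq_matmul` are hypothetical names, and it assumes per-tensor `i8` quantization with round-to-nearest-even.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class QuantParams:
    """Per-tensor uniform quantization parameters (hypothetical helper type)."""
    scale: float
    zero_point: int
    qmin: int = -128  # assumed i8 storage range
    qmax: int = 127


def dequantize(q, p):
    """Map stored integers to real values: (q - zero_point) * scale."""
    return [[(v - p.zero_point) * p.scale for v in row] for row in q]


def quantize(x, p):
    """Map real values to stored integers, clamping to the storage range."""
    def one(v):
        # Python's round() rounds halves to even.
        r = round(v / p.scale) + p.zero_point
        return max(p.qmin, min(p.qmax, r))
    return [[one(v) for v in row] for row in x]


def matmul(a, b):
    """Plain floating-point matrix product on nested lists."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def qdq_matmul(lhs_q, lhs_p, rhs_q, rhs_p, out_p):
    """qdq decomposition: dequantize -> float op -> quantize."""
    return quantize(matmul(dequantize(lhs_q, lhs_p), dequantize(rhs_q, rhs_p)), out_p)


lhs_p = QuantParams(scale=1.0, zero_point=17)
rhs_p = QuantParams(scale=1.0, zero_point=0)
out_p = QuantParams(scale=1.0, zero_point=17)
result = qdq_matmul([[18, 19]], lhs_p, [[1], [2]], rhs_p, out_p)
```

Because the result type is supplied explicitly via `out_p`, the decomposition does not need to infer it from the operand types.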

Note to the reviewers: You may just focus on the very last commit of the chain. The rest comes from the parent PR.

stablehlo/tests/stablehlo_legalize_quantized_op_to_qdq.mlir

[ParentPR](#2459)

Previously, while creating the QDQ pattern, we used the `create` API without passing the result type and hence relied on type inference to derive the result type. That works for most of the element-wise operations; however, for dot_general and convolution the result type might be infeasible to infer in the presence of quantized input types. The PR fixes that.

Note to the reviewers: You may just focus on the very last commit of the chain. The rest comes from the parent PR.

[ChildPR](#2461)

…2462)

[ParentPR](#2461)

The `quant-to-math` pass [assumes](https://github.com/openxla/stablehlo/blob/eba821aa1c54a21d70331d7926dfc8b929f988f3/stablehlo/transforms/StablehloLegalizeQuantToInt.cpp#L984) that the result type of a quantized dot_general or convolution always has `i32` as its storage type. As a result, the following program, whose result storage type is `i8`, fails to materialize all the intermediate converted values.

```
func.func @dot_general_per_tensor_quantization(%arg0: tensor<2x3x4x!quant.uniform<i8:f32, 1.0:17>>, %arg1: tensor<2x3x5x!quant.uniform<i8:f32, 1.0:0>>) -> tensor<2x4x5x!quant.uniform<i8:f32, 1.0:17>> {
  // expected-error@+1 {{failed to legalize operation 'stablehlo.dot_general' that was explicitly marked illegal}}
  %0 = "stablehlo.dot_general"(%arg0, %arg1) {
    dot_dimension_numbers = #stablehlo.dot<
      lhs_batching_dimensions = [0],
      rhs_batching_dimensions = [0],
      lhs_contracting_dimensions = [1],
      rhs_contracting_dimensions = [1]
    >
  } : (tensor<2x3x4x!quant.uniform<i8:f32, 1.0:17>>, tensor<2x3x5x!quant.uniform<i8:f32, 1.0:0>>) -> tensor<2x4x5x!quant.uniform<i8:f32, 1.0:17>>
  func.return %0 : tensor<2x4x5x!quant.uniform<i8:f32, 1.0:17>>
}
```

One option to fix this is to provide source/target materialization ([link](https://mlir.llvm.org/docs/DialectConversion/#type-converter)), but we found that for other ops ([e.g.](https://github.com/openxla/stablehlo/blob/eba821aa1c54a21d70331d7926dfc8b929f988f3/stablehlo/transforms/StablehloLegalizeQuantToInt.cpp#L510)) there is a precedent for how math computed in `i32` is converted back to the result type. The PR implements the missing conversion.

Note to the reviewers: You may just focus on the very last commit of the chain. The rest comes from the parent PR.
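The "convert the `i32` math back to the result type" step can be pictured with a small Python sketch. This is an illustration of the arithmetic only, under the assumption of per-tensor quantization; it does not reproduce the code in StablehloLegalizeQuantToInt.cpp, and `int_dot` and `requantize_i32_to_i8` are hypothetical names.

```python
def int_dot(lhs_q, lhs_zp, rhs_q, rhs_zp):
    """Accumulate a dot product on zero-point-adjusted integers.

    The accumulator plays the role of the i32 intermediate value.
    """
    return sum((a - lhs_zp) * (b - rhs_zp) for a, b in zip(lhs_q, rhs_q))


def requantize_i32_to_i8(acc, lhs_scale, rhs_scale, out_scale, out_zero_point):
    """Rescale an i32 accumulator into the i8 result type.

    Assumption: the accumulator carries the scale lhs_scale * rhs_scale,
    so dividing by out_scale expresses it in units of the result type.
    """
    combined_scale = lhs_scale * rhs_scale / out_scale
    rescaled = round(acc * combined_scale) + out_zero_point
    # Saturate to the i8 storage range.
    return max(-128, min(127, rescaled))


# Mirrors the quantization parameters of the example program:
# lhs uses 1.0:17, rhs uses 1.0:0, and the result uses 1.0:17.
acc = int_dot([18, 19], 17, [1, 2], 0)
out = requantize_i32_to_i8(acc, 1.0, 1.0, 1.0, 17)
```

Without this final step the computed value would stay in `i32`, which is why a result declared with `i8` storage could not be legalized.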

sdasgup3 added the Quantization label Jul 24, 2024

This was referenced Jul 24, 2024

Remove type-inference dependency while creating qdq patterns #2460

Merged

Remove the quant-to-math pass limitation of Dot/Conv op result type #2462

Merged

sdasgup3 requested review from GleasonK and abhigunj July 24, 2024 02:07

GleasonK approved these changes Jul 24, 2024

sdasgup3 force-pushed the support-dotop-convolutionop branch from 3963618 to 8ba3db8 Compare July 25, 2024 00:29

sdasgup3 commented Jul 25, 2024

stablehlo/tests/stablehlo_legalize_quantized_op_to_qdq.mlir

abhigunj approved these changes Jul 25, 2024

sdasgup3 added 7 commits July 26, 2024 02:44

composing-quant-decompistion-passes

267cfef

Using benefits as opposed to source/target materialization

13b1e83

fix tests

f3bba04

Keep the qdq pass

ed52681

fix made to dqd pass to be reused by quant-to-math

f362c10

Support DotOp and ConvolutionOp

06ffe36

Dot/Conv Ops added to dqd pass to be reused by quant-to-math

8d39480

sdasgup3 force-pushed the support-dotop-convolutionop branch from 8ba3db8 to 8d39480 Compare July 26, 2024 02:57

sdasgup3 merged commit 0673dd2 into openxla:main Jul 26, 2024
10 checks passed
