
Support qdq decomposition of DotGeneralOp and ConvolutionOp #2461

Merged
merged 7 commits on Jul 26, 2024

Conversation

sdasgup3
Member

@sdasgup3 sdasgup3 commented Jul 24, 2024

ParentPR

The PR

  1. adds qdq decomposition patterns for DotGeneralOp and ConvolutionOp.
  2. updates the tests in stablehlo/tests/stablehlo_legalize_quant_to_int.mlir, touching only the negative tests, which were previously unhandled by stablehlo-legalize-quant-to-int. The current stablehlo-legalize-quant-to-math pass handles these cases via the fallback.
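For illustration, the qdq decomposition rewrites a quantized op as dequantize → float op → quantize. Below is a minimal NumPy sketch of that idea (NumPy rather than MLIR, with hypothetical helper names; the scale/zero-point values mirror the i8 examples in this PR), using a plain matmul as a stand-in for dot_general:

```python
import numpy as np

def dequantize(q, scale, zp):
    # i8 storage -> f32 real value: (q - zero_point) * scale
    return (q.astype(np.float32) - zp) * scale

def quantize(x, scale, zp, lo=-128, hi=127):
    # f32 real value -> i8 storage, clamped to the storage range
    return np.clip(np.round(x / scale) + zp, lo, hi).astype(np.int8)

def qdq_dot(lhs_q, rhs_q, lhs_s, lhs_zp, rhs_s, rhs_zp, res_s, res_zp):
    # qdq decomposition: dequantize the operands, run the op in float,
    # then quantize the float result back to the annotated result type.
    res_f = dequantize(lhs_q, lhs_s, lhs_zp) @ dequantize(rhs_q, rhs_s, rhs_zp)
    return quantize(res_f, res_s, res_zp)
```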

Note to the reviewers: You may focus on just the very last commit of the chain. The rest comes from the parent PR.

childPR

@sdasgup3 sdasgup3 force-pushed the support-dotop-convolutionop branch from 3963618 to 8ba3db8 on July 25, 2024 00:29
sdasgup3 added a commit that referenced this pull request Jul 26, 2024
[ParentPR](#2459)

Previously, while creating the QDQ pattern, we used the `create` API
without supplying the result type and hence relied on type inference to
derive it. That works for most element-wise operations; however, for
dot_general and convolution the result type may be infeasible to infer
in the presence of quantized input types.

The PR fixes that.

Note to the reviewers: You may focus on just the very last commit of the
chain. The rest comes from the parent PR.

[ChildPR](#2461)
@sdasgup3 sdasgup3 force-pushed the support-dotop-convolutionop branch from 8ba3db8 to 8d39480 on July 26, 2024 02:57
@sdasgup3 sdasgup3 merged commit 0673dd2 into openxla:main Jul 26, 2024
10 checks passed
sdasgup3 added a commit that referenced this pull request Jul 26, 2024
…2462)

[ParentPR](#2461)

`quant-to-math` pass
[assumes](https://github.com/openxla/stablehlo/blob/eba821aa1c54a21d70331d7926dfc8b929f988f3/stablehlo/transforms/StablehloLegalizeQuantToInt.cpp#L984)
that the return type of a quantized dot_general or convolution always
has `i32` as its storage type.
As a result, the following program, whose result storage type is `i8`,
fails to materialize all the intermediate converted values.


```mlir
func.func @dot_general_per_tensor_quantization(%arg0: tensor<2x3x4x!quant.uniform<i8:f32, 1.0:17>>, %arg1: tensor<2x3x5x!quant.uniform<i8:f32, 1.0:0>>) -> tensor<2x4x5x!quant.uniform<i8:f32, 1.0:17>> {
  // expected-error@+1 {{failed to legalize operation 'stablehlo.dot_general' that was explicitly marked illegal}}
  %0 = "stablehlo.dot_general"(%arg0, %arg1) {
    dot_dimension_numbers = #stablehlo.dot<
      lhs_batching_dimensions = [0],
      rhs_batching_dimensions = [0],
      lhs_contracting_dimensions = [1],
      rhs_contracting_dimensions = [1]
    >
  } : (tensor<2x3x4x!quant.uniform<i8:f32, 1.0:17>>, tensor<2x3x5x!quant.uniform<i8:f32, 1.0:0>>) -> tensor<2x4x5x!quant.uniform<i8:f32, 1.0:17>>
  func.return %0 : tensor<2x4x5x!quant.uniform<i8:f32, 1.0:17>>
}
```

One option to fix this is to provide source/target materializations
([link](https://mlir.llvm.org/docs/DialectConversion/#type-converter)),
but we found that for other ops
([e.g.](https://github.com/openxla/stablehlo/blob/eba821aa1c54a21d70331d7926dfc8b929f988f3/stablehlo/transforms/StablehloLegalizeQuantToInt.cpp#L510))
there is a precedent for converting the math computed in `i32` back to
the result type. The PR implements the missing conversion.
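The missing conversion can be sketched in NumPy (a hypothetical illustration, not the pass's actual code): the integer dot accumulates in `i32`, and the accumulator must then be requantized to the `i8` storage type the result is annotated with. The scale/zero-point values mirror the example above:

```python
import numpy as np

def quantized_dot(lhs_q, rhs_q, lhs_zp, rhs_zp, res_scale, res_zp):
    # Integer math on zero-point-adjusted operands, accumulated in i32
    # (the part quant-to-math already handled).
    acc = (lhs_q.astype(np.int32) - lhs_zp) @ (rhs_q.astype(np.int32) - rhs_zp)
    # The conversion this PR adds: map the i32 accumulator back to the
    # i8 storage type of the result (combined scale folded into res_scale).
    return np.clip(np.round(acc / res_scale) + res_zp, -128, 127).astype(np.int8)
```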

Note to the reviewers: You may focus on just the very last commit of the
chain. The rest comes from the parent PR.
3 participants