failed to legalize operation 'torch.operator' that was explicitly marked illegal : onnx.Scan #893

pdhirajkumarprasad opened this issue Dec 6, 2024 · 5 comments



The IR below fails to legalize onnx.Scan:

module {
  func.func @CNTKGraph(%arg1:!torch.vtensor<[?,1,1],f32>) -> (!torch.vtensor<[?,1,1],f32>)  attributes {torch.onnx_meta.ir_version = 4 : si64, torch.onnx_meta.opset_version = 21 : si64, torch.onnx_meta.opset_versions = { = 1 : si64}, torch.onnx_meta.producer_name = "CNTK", torch.onnx_meta.producer_version = "2.7"} {
    %1 = torch.operator "onnx.Constant"() {torch.onnx.value = dense_resource<__ZeroFloat> : tensor<1xf32>} : () -> !torch.vtensor<[1],f32> 
    %2 = torch.operator "onnx.Constant"() {torch.onnx.value = dense_resource<__ZeroFloat_Batch> : tensor<1x1xf32>} : () -> !torch.vtensor<[1,1],f32> 
    %3 = torch.operator "onnx.Constant"() {torch.onnx.value = dense_resource<__NegINF_Batch> : tensor<1x1xf32>} : () -> !torch.vtensor<[1,1],f32> 
    %4:5 = torch.operator "onnx.Scan"(%3, %2, %1, %arg1) {torch.onnx.num_scan_inputs = 1 : si64, torch.onnx.scan_input_directions = [0 : si64], torch.onnx.scan_output_directions = [0 : si64, 0 : si64]} : (!torch.vtensor<[1,1],f32>, !torch.vtensor<[1,1],f32>, !torch.vtensor<[1],f32>, !torch.vtensor<[?,1,1],f32>) -> (!torch.vtensor<[1,1],f32>, !torch.vtensor<[1,1],f32>, !torch.vtensor<[1],f32>, !torch.vtensor<[?,1,1],f32>, !torch.vtensor<[?,1,1],f32>)
    return %4#3: !torch.vtensor<[?,1,1],f32>
  }
}

{-#
  dialect_resources: {
    builtin: {
      __ZeroFloat: "0x0800000000000000",
      __ZeroFloat_Batch: "0x0800000000000000",
      __NegINF_Batch: "0x08000000997696FE"
    }
  }
#-}

model impacted: 1


AmosLewis commented Dec 19, 2024

The correct onnx.mlir should be this one. The region cannot be deleted; it is part of the Scan op.

module {
  func.func @CNTKGraph(%arg1:!torch.vtensor<[?,1,1],f32>) -> (!torch.vtensor<[?,1,1],f32>)  attributes {torch.onnx_meta.ir_version = 4 : si64, torch.onnx_meta.opset_version = 21 : si64, torch.onnx_meta.opset_versions = { = 1 : si64}, torch.onnx_meta.producer_name = "CNTK", torch.onnx_meta.producer_version = "2.7"} {
    %1 = torch.operator "onnx.Constant"() {torch.onnx.value = dense_resource<__ZeroFloat> : tensor<1xf32>} : () -> !torch.vtensor<[1],f32> 
    %2 = torch.operator "onnx.Constant"() {torch.onnx.value = dense_resource<__ZeroFloat_Batch> : tensor<1x1xf32>} : () -> !torch.vtensor<[1,1],f32> 
    %3 = torch.operator "onnx.Constant"() {torch.onnx.value = dense_resource<__NegINF_Batch> : tensor<1x1xf32>} : () -> !torch.vtensor<[1,1],f32> 
    %4 = torch.operator "onnx.Constant"() {torch.onnx.value = dense_resource<__OneFloat> : tensor<1xf32>} : () -> !torch.vtensor<[1],f32> 
    %300:5 = torch.operator "onnx.Scan"(%3, %2, %1, %arg1) {torch.onnx.num_scan_inputs = 1 : si64, torch.onnx.scan_input_directions = [0 : si64], torch.onnx.scan_output_directions = [0 : si64, 0 : si64]} : (!torch.vtensor<[1,1],f32>, !torch.vtensor<[1,1],f32>, !torch.vtensor<[1],f32>, !torch.vtensor<[?,1,1],f32>) -> (!torch.vtensor<[1,1],f32>, !torch.vtensor<[1,1],f32>, !torch.vtensor<[1],f32>, !torch.vtensor<[?,1,1],f32>, !torch.vtensor<[?,1,1],f32>) {
    ^bb0(%arg4: !torch.vtensor<[1,1],f32>, %arg5: !torch.vtensor<[1,1],f32>, %arg6: !torch.vtensor<[1],f32>, %arg7: !torch.vtensor<[1,1],f32>):
      %315 = torch.operator "onnx.Greater"(%arg7, %arg4) : (!torch.vtensor<[1,1],f32>, !torch.vtensor<[1,1],f32>) -> !torch.vtensor<[1,1],i1> 
      %316 = torch.operator "onnx.Where"(%315, %arg7, %arg4) : (!torch.vtensor<[1,1],i1>, !torch.vtensor<[1,1],f32>, !torch.vtensor<[1,1],f32>) -> !torch.vtensor<[1,1],f32> 
      %317 = torch.operator "onnx.Where"(%315, %arg6, %arg5) : (!torch.vtensor<[1,1],i1>, !torch.vtensor<[1],f32>, !torch.vtensor<[1,1],f32>) -> !torch.vtensor<[1,1],f32> 
      %318 = torch.operator "onnx.Add"(%arg6, %4) : (!torch.vtensor<[1],f32>, !torch.vtensor<[1],f32>) -> !torch.vtensor<[1],f32> 
      %319 = torch.operator "onnx.Identity"(%316) : (!torch.vtensor<[1,1],f32>) -> !torch.vtensor<[1,1],f32> 
      %320 = torch.operator "onnx.Identity"(%317) : (!torch.vtensor<[1,1],f32>) -> !torch.vtensor<[1,1],f32> 
      %321 = torch.operator "onnx.Identity"(%316) : (!torch.vtensor<[1,1],f32>) -> !torch.vtensor<[1,1],f32> 
      %322 = torch.operator "onnx.Identity"(%317) : (!torch.vtensor<[1,1],f32>) -> !torch.vtensor<[1,1],f32> 
      torch.operator_terminator %319, %320, %318, %321, %322 : !torch.vtensor<[1,1],f32>, !torch.vtensor<[1,1],f32>, !torch.vtensor<[1],f32>, !torch.vtensor<[1,1],f32>, !torch.vtensor<[1,1],f32>
    }
    return %300#3: !torch.vtensor<[?,1,1],f32>
  }
}

{-#
  dialect_resources: {
    builtin: {
      __OneFloat: "0x080000000000803F",
      __ZeroFloat: "0x0800000000000000",
      __ZeroFloat_Batch: "0x0800000000000000",
      __NegINF_Batch: "0x08000000997696FE"
    }
  }
#-}

Related SHARK-TestSuite test: nod-ai/SHARK-TestSuite#276
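For anyone reading the region above, here is a rough plain-Python reading of what the Scan body computes: a running maximum over the scan input plus the step at which that maximum was set. This is only a sketch of the semantics as read from the IR, not a lowering. The function name `scan_running_max` is made up for illustration, and the default initial value of about -1e38 is an approximation of the `__NegINF_Batch` constant.

```python
def scan_running_max(xs, neg_inf=-1e38):
    """Plain-Python reading of the Scan body (illustrative, not a lowering)."""
    best = neg_inf   # state %arg4, initialized from __NegINF_Batch (approximate)
    best_idx = 0.0   # state %arg5, initialized from __ZeroFloat_Batch
    counter = 0.0    # state %arg6, initialized from __ZeroFloat
    max_trace, idx_trace = [], []
    for x in xs:                                     # %arg7: one slice per step
        greater = x > best                           # onnx.Greater
        new_best = x if greater else best            # first onnx.Where
        new_idx = counter if greater else best_idx   # second onnx.Where
        counter = counter + 1.0                      # onnx.Add with __OneFloat
        best, best_idx = new_best, new_idx
        max_trace.append(best)                       # first scan output
        idx_trace.append(best_idx)                   # second scan output
    # Same order as the five results of the Scan op.
    return best, best_idx, counter, max_trace, idx_trace


if __name__ == "__main__":
    print(scan_running_max([1.0, 3.0, 2.0]))
```

The function returned by `@CNTKGraph` is result 3 of the op, which corresponds to `max_trace` here.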


AmosLewis commented Jan 3, 2025

torch-mlir-opt --convert-torch-onnx-to-torch scan.onnx.mlir -debug

** Failure : Expects result type to be static

scan.onnx.mlir:7:14: error: failed to legalize operation 'torch.operator' that was explicitly marked illegal
    %300:5 = torch.operator "onnx.Scan"(%3, %2, %1, %arg1) {torch.onnx.num_scan_inputs = 1 : si64, torch.onnx.scan_input_directions = [0 : si64], torch.onnx.scan_output_directions = [0 : si64, 0 : si64]} : (!torch.vtensor<[1,1],f32>, !torch.vtensor<[1,1],f32>, !torch.vtensor<[1],f32>, !torch.vtensor<[?,1,1],f32>) -> (!torch.vtensor<[1,1],f32>, !torch.vtensor<[1,1],f32>, !torch.vtensor<[1],f32>, !torch.vtensor<[?,1,1],f32>, !torch.vtensor<[?,1,1],f32>) {
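The "Expects result type to be static" message lines up with the two `[?,1,1]` results of the Scan op. As a rough illustration only, the snippet below does a string-level check of which result types carry a dynamic dimension. It is not how torch-mlir performs the check; `vtensor_dims` and `is_static` are hypothetical helpers written for this comment.

```python
import re


def vtensor_dims(type_str):
    """Return the dims of a '!torch.vtensor<[...],dtype>' string; '?' becomes None."""
    match = re.search(r"vtensor<\[([^\]]*)\]", type_str)
    if match is None:
        raise ValueError(f"not a vtensor type: {type_str!r}")
    body = match.group(1).strip()
    if not body:
        return []
    return [None if d.strip() == "?" else int(d) for d in body.split(",")]


def is_static(type_str):
    """True when no dimension of the type is dynamic."""
    return all(d is not None for d in vtensor_dims(type_str))


# Result types copied from the failing onnx.Scan op.
scan_result_types = [
    "!torch.vtensor<[1,1],f32>",
    "!torch.vtensor<[1,1],f32>",
    "!torch.vtensor<[1],f32>",
    "!torch.vtensor<[?,1,1],f32>",
    "!torch.vtensor<[?,1,1],f32>",
]
dynamic_results = [i for i, t in enumerate(scan_result_types) if not is_static(t)]

if __name__ == "__main__":
    print("results with a dynamic dim:", dynamic_results)
```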


#map = affine_map<() -> ()>
#map1 = affine_map<(d0, d1, d2) -> ()>
#map2 = affine_map<(d0, d1, d2) -> (d0, d1, d2)>
#map3 = affine_map<(d0, d1) -> (0, 0)>
#map4 = affine_map<(d0, d1) -> (d0, d1)>
#map5 = affine_map<(d0, d1) -> (0)>
#map6 = affine_map<(d0) -> (0)>
#map7 = affine_map<(d0) -> (d0)>
module {
  ml_program.global private mutable @global_seed(dense<0> : tensor<i64>) : tensor<i64>
  func.func @CNTKGraph(%arg0: tensor<?x1x1xf32>) -> tensor<?x1x1xf32> {
    %cst = arith.constant dense<1.000000e+00> : tensor<1xf32>
    %c0_i64 = arith.constant 0 : i64
    %c0 = arith.constant 0 : index
    %cst_0 = arith.constant dense<-9.99999968E+37> : tensor<1x1xf32>
    %cst_1 = arith.constant dense<0.000000e+00> : tensor<1x1xf32>
    %cst_2 = arith.constant dense<0.000000e+00> : tensor<1xf32>
    %cst_3 = arith.constant dense<0> : tensor<i64>
    %c1 = arith.constant 1 : index
    %0 = tensor.empty() : tensor<f32>
    %1 = linalg.generic {indexing_maps = [#map, #map], iterator_types = []} ins(%cst_3 : tensor<i64>) outs(%0 : tensor<f32>) {
    ^bb0(%in: i64, %out: f32):
      %7 = arith.sitofp %in : i64 to f32
      linalg.yield %7 : f32
    } -> tensor<f32>
    %2 = tensor.empty() : tensor<1x1x1xf32>
    %3 = linalg.generic {indexing_maps = [#map1, #map2], iterator_types = ["parallel", "parallel", "parallel"]} ins(%1 : tensor<f32>) outs(%2 : tensor<1x1x1xf32>) {
    ^bb0(%in: f32, %out: f32):
      linalg.yield %in : f32
    } -> tensor<1x1x1xf32>
    %cast = tensor.cast %3 : tensor<1x1x1xf32> to tensor<?x1x1xf32>
    %dim = tensor.dim %arg0, %c0 : tensor<?x1x1xf32>
    %4 = arith.index_cast %dim : index to i64
    %5 = arith.index_cast %4 : i64 to index
    %6:5 = scf.for %arg1 = %c0 to %5 step %c1 iter_args(%arg2 = %cst_0, %arg3 = %cst_1, %arg4 = %cst_2, %arg5 = %cast, %arg6 = %cast) -> (tensor<1x1xf32>, tensor<1x1xf32>, tensor<1xf32>, tensor<?x1x1xf32>, tensor<?x1x1xf32>) {
      %7 = arith.index_cast %arg1 : index to i64
      %8 = arith.cmpi slt, %7, %c0_i64 : i64
      %9 = arith.extui %8 : i1 to i64
      %10 = arith.muli %9, %4 : i64
      %11 = arith.addi %7, %10 : i64
      %12 = arith.addi %11, %4 : i64
      %13 = arith.cmpi sge, %11, %c0_i64 : i64
      %14 = arith.select %13, %11, %12 : i64
      %15 = arith.cmpi slt, %14, %c0_i64 : i64
      %16 = arith.select %15, %c0_i64, %14 : i64
      %17 = arith.cmpi sgt, %16, %4 : i64
      %18 = arith.select %17, %4, %16 : i64
      %19 = arith.index_cast %18 : i64 to index
      %extracted_slice = tensor.extract_slice %arg0[%19, 0, 0] [1, 1, 1] [1, 1, 1] : tensor<?x1x1xf32> to tensor<1x1x1xf32>
      %collapsed = tensor.collapse_shape %extracted_slice [[0, 1], [2]] : tensor<1x1x1xf32> into tensor<1x1xf32>
      %20 = tensor.empty() : tensor<1x1xi1>
      %21 = linalg.generic {indexing_maps = [#map3, #map3, #map4], iterator_types = ["parallel", "parallel"]} ins(%collapsed, %arg2 : tensor<1x1xf32>, tensor<1x1xf32>) outs(%20 : tensor<1x1xi1>) {
      ^bb0(%in: f32, %in_8: f32, %out: i1):
        %44 = arith.cmpf ogt, %in, %in_8 : f32
        linalg.yield %44 : i1
      } -> tensor<1x1xi1>
      %22 = tensor.empty() : tensor<1x1xf32>
      %23 = linalg.generic {indexing_maps = [#map3, #map3, #map3, #map4], iterator_types = ["parallel", "parallel"]} ins(%21, %collapsed, %arg2 : tensor<1x1xi1>, tensor<1x1xf32>, tensor<1x1xf32>) outs(%22 : tensor<1x1xf32>) {
      ^bb0(%in: i1, %in_8: f32, %in_9: f32, %out: f32):
        %44 = arith.select %in, %in_8, %in_9 : f32
        linalg.yield %44 : f32
      } -> tensor<1x1xf32>
      %24 = linalg.generic {indexing_maps = [#map3, #map5, #map3, #map4], iterator_types = ["parallel", "parallel"]} ins(%21, %arg4, %arg3 : tensor<1x1xi1>, tensor<1xf32>, tensor<1x1xf32>) outs(%22 : tensor<1x1xf32>) {
      ^bb0(%in: i1, %in_8: f32, %in_9: f32, %out: f32):
        %44 = arith.select %in, %in_8, %in_9 : f32
        linalg.yield %44 : f32
      } -> tensor<1x1xf32>
      %25 = tensor.empty() : tensor<1xf32>
      %26 = linalg.generic {indexing_maps = [#map6, #map6, #map7], iterator_types = ["parallel"]} ins(%arg4, %cst : tensor<1xf32>, tensor<1xf32>) outs(%25 : tensor<1xf32>) {
      ^bb0(%in: f32, %in_8: f32, %out: f32):
        %44 = arith.addf %in, %in_8 : f32
        linalg.yield %44 : f32
      } -> tensor<1xf32>
      %expanded = tensor.expand_shape %23 [[0, 1], [2]] output_shape [1, 1, 1] : tensor<1x1xf32> into tensor<1x1x1xf32>
      %dim_4 = tensor.dim %arg5, %c0 : tensor<?x1x1xf32>
      %27 = arith.index_cast %dim_4 : index to i64
      %28 = arith.addi %7, %27 : i64
      %29 = arith.cmpi sge, %7, %c0_i64 : i64
      %30 = arith.select %29, %7, %28 : i64
      %31 = arith.cmpi slt, %30, %c0_i64 : i64
      %32 = arith.select %31, %c0_i64, %30 : i64
      %33 = arith.cmpi sgt, %32, %27 : i64
      %34 = arith.select %33, %27, %32 : i64
      %35 = arith.index_cast %34 : i64 to index
      %inserted_slice = tensor.insert_slice %expanded into %arg5[%35, 0, 0] [1, 1, 1] [1, 1, 1] : tensor<1x1x1xf32> into tensor<?x1x1xf32>
      %expanded_5 = tensor.expand_shape %24 [[0, 1], [2]] output_shape [1, 1, 1] : tensor<1x1xf32> into tensor<1x1x1xf32>
      %dim_6 = tensor.dim %arg6, %c0 : tensor<?x1x1xf32>
      %36 = arith.index_cast %dim_6 : index to i64
      %37 = arith.addi %7, %36 : i64
      %38 = arith.select %29, %7, %37 : i64
      %39 = arith.cmpi slt, %38, %c0_i64 : i64
      %40 = arith.select %39, %c0_i64, %38 : i64
      %41 = arith.cmpi sgt, %40, %36 : i64
      %42 = arith.select %41, %36, %40 : i64
      %43 = arith.index_cast %42 : i64 to index
      %inserted_slice_7 = tensor.insert_slice %expanded_5 into %arg6[%43, 0, 0] [1, 1, 1] [1, 1, 1] : tensor<1x1x1xf32> into tensor<?x1x1xf32>
      scf.yield %23, %24, %26, %inserted_slice, %inserted_slice_7 : tensor<1x1xf32>, tensor<1x1xf32>, tensor<1xf32>, tensor<?x1x1xf32>, tensor<?x1x1xf32>
    }
    return %6#3 : tensor<?x1x1xf32>
  }
}
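Most of the integer arithmetic at the top of the `scf.for` body is slice-index normalization: wrap a negative index by the dimension size, then clamp into `[0, dim]`. A minimal Python sketch of that chain, following `%8` through `%18` as read from the IR above (`normalize_index` is a name made up for illustration):

```python
def normalize_index(i, dim):
    """Mirror the cmpi/select chain that computes the extract_slice offset."""
    wrapped = i + (dim if i < 0 else 0)                  # %8 .. %11
    fixed = wrapped if wrapped >= 0 else wrapped + dim   # %12 .. %14
    lower = 0 if fixed < 0 else fixed                    # %15 .. %16
    return dim if lower > dim else lower                 # %17 .. %18


if __name__ == "__main__":
    for i in (2, -1, 9, -20):
        print(i, "->", normalize_index(i, 5))
```

Since the loop variable runs from 0 up to `dim - 1`, this returns `i` unchanged on every iteration of the loop shown here; the wrap and clamp only matter for indices outside that range.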


New error for the model when running python --torchtolinalg -v -t bidaf-9:

Failed test at stage preprocessing with exception:
Failure while executing pass pipeline:
error: "Compress_31": 'arith.trunci' op operand type 'i1' and result type 'i32' are cast incompatible
 note: "Compress_31": see current operation: %852 = "arith.trunci"(%848) : (i1) -> i32
Traceback (most recent call last):
  File "/proj/gdba/shark/chi/src/SHARK-TestSuite/alt_e2eshark/", line 212, in run_tests
    model_artifact = config.preprocess_model(
  File "/proj/gdba/shark/chi/src/SHARK-TestSuite/alt_e2eshark/e2e_testing/test_configs/", line 120, in preprocess_model
torch_mlir._mlir_libs._site_initialize.<locals>.MLIRError: Failure while executing pass pipeline:
error: "Compress_31": 'arith.trunci' op operand type 'i1' and result type 'i32' are cast incompatible
 note: "Compress_31": see current operation: %852 = "arith.trunci"(%848) : (i1) -> i32


The new issue comes from the onnx.Compress op:

module {
  func.func @CNTKGraph(%311:!torch.vtensor<[1,?],f32>, %312:!torch.vtensor<[?],i1> ) -> (!torch.vtensor<[1],f32>)  attributes {torch.onnx_meta.ir_version = 4 : si64, torch.onnx_meta.opset_version = 21 : si64, torch.onnx_meta.opset_versions = { = 1 : si64}, torch.onnx_meta.producer_name = "CNTK", torch.onnx_meta.producer_version = "2.7"} {
    %313 = torch.operator "onnx.Compress"(%311, %312) : (!torch.vtensor<[1,?],f32>, !torch.vtensor<[?],i1>) -> !torch.vtensor<[1],f32> 
    return %313: !torch.vtensor<[1],f32> 
  }
}
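For reference, the op above has no `axis` attribute, and in that case ONNX Compress flattens the input and keeps the elements whose condition entry is true. A plain-Python sketch of that behaviour for a 2-D input (`compress_flat` is a hypothetical name, and nested lists stand in for tensors):

```python
def compress_flat(data, condition):
    """Compress without an axis: flatten, then keep elements where condition is true."""
    flat = [value for row in data for value in row]   # flatten the [1, N] input
    # zip stops at the shorter sequence, so input elements beyond the end of
    # a shorter condition are dropped.
    return [value for value, keep in zip(flat, condition) if keep]


if __name__ == "__main__":
    print(compress_flat([[1.0, 2.0, 3.0]], [False, True, False]))
```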
torch-mlir-opt -pass-pipeline='builtin.module(torch-onnx-to-torch-backend-pipeline{backend-legal-ops=aten.flatten.using_ints,})' ./compress.onnx.mlir > compress.torch.mlir
(mlir_venv) (test_suite.venv) ➜  torch-mlir git:(scanfix) ✗ torch-mlir-opt -pass-pipeline='builtin.module(torch-backend-to-linalg-on-tensors-backend-pipeline)' compress.torch.mlir > compress.linalg.mlir
compress.torch.mlir:24:11: error: 'arith.trunci' op operand type 'i1' and result type 'i32' are cast incompatible
    %14 = torch.aten.scatter_add %13, %int0, %6, %9 : !torch.vtensor<[?],i1>, !torch.int, !torch.vtensor<[?],i1>, !torch.vtensor<[?],i1> -> !torch.vtensor<[?],i1>
compress.torch.mlir:24:11: note: see current operation: %109 = "arith.trunci"(%105) : (i1) -> i32
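The verifier error follows from the direction of the cast: `arith.trunci` needs a result type narrower than its operand, while going from `i1` to `i32` is a widening that calls for an extension op. The snippet below only illustrates that rule. It is not the torch-mlir fix, and `int_cast_op` is a hypothetical helper.

```python
def int_cast_op(src_bits, dst_bits, signed=False):
    """Name the arith integer cast that is valid for the given bit widths."""
    if dst_bits < src_bits:
        return "arith.trunci"                             # narrowing only
    if dst_bits > src_bits:
        return "arith.extsi" if signed else "arith.extui"  # widening
    return None                                           # same width: no cast


if __name__ == "__main__":
    print(int_cast_op(1, 32))    # the i1 -> i32 case from the error
    print(int_cast_op(64, 32))
```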
