
failed to legalize operation 'torch.operator' that was explicitly marked illegal : onnx.Scan #893

Closed
pdhirajkumarprasad opened this issue Dec 6, 2024 · 5 comments

@pdhirajkumarprasad

For the given IR, legalization fails on onnx.Scan:

module {
  func.func @CNTKGraph(%arg1:!torch.vtensor<[?,1,1],f32>) -> (!torch.vtensor<[?,1,1],f32>)  attributes {torch.onnx_meta.ir_version = 4 : si64, torch.onnx_meta.opset_version = 21 : si64, torch.onnx_meta.opset_versions = {ai.onnx.ml = 1 : si64}, torch.onnx_meta.producer_name = "CNTK", torch.onnx_meta.producer_version = "2.7"} {
    %1 = torch.operator "onnx.Constant"() {torch.onnx.value = dense_resource<__ZeroFloat> : tensor<1xf32>} : () -> !torch.vtensor<[1],f32> 
    %2 = torch.operator "onnx.Constant"() {torch.onnx.value = dense_resource<__ZeroFloat_Batch> : tensor<1x1xf32>} : () -> !torch.vtensor<[1,1],f32> 
    %3 = torch.operator "onnx.Constant"() {torch.onnx.value = dense_resource<__NegINF_Batch> : tensor<1x1xf32>} : () -> !torch.vtensor<[1,1],f32> 
    %4:5 = torch.operator "onnx.Scan"(%3, %2, %1, %arg1) {torch.onnx.num_scan_inputs = 1 : si64, torch.onnx.scan_input_directions = [0 : si64], torch.onnx.scan_output_directions = [0 : si64, 0 : si64]} : (!torch.vtensor<[1,1],f32>, !torch.vtensor<[1,1],f32>, !torch.vtensor<[1],f32>, !torch.vtensor<[?,1,1],f32>) -> (!torch.vtensor<[1,1],f32>, !torch.vtensor<[1,1],f32>, !torch.vtensor<[1],f32>, !torch.vtensor<[?,1,1],f32>, !torch.vtensor<[?,1,1],f32>)
    return %4#3: !torch.vtensor<[?,1,1],f32>
  }
}

{-#
  dialect_resources: {
    builtin: {
      __ZeroFloat: "0x0800000000000000",
      __ZeroFloat_Batch: "0x0800000000000000",
      __NegINF_Batch: "0x08000000997696FE"
    }
  }
#-}

Model impacted (1): bidaf-9
@AmosLewis (Contributor) commented Dec 19, 2024

The correct onnx.mlir should be the following one. The region cannot be deleted; it is part of the Scan op.

module {
  func.func @CNTKGraph(%arg1:!torch.vtensor<[?,1,1],f32>) -> (!torch.vtensor<[?,1,1],f32>)  attributes {torch.onnx_meta.ir_version = 4 : si64, torch.onnx_meta.opset_version = 21 : si64, torch.onnx_meta.opset_versions = {ai.onnx.ml = 1 : si64}, torch.onnx_meta.producer_name = "CNTK", torch.onnx_meta.producer_version = "2.7"} {
    %1 = torch.operator "onnx.Constant"() {torch.onnx.value = dense_resource<__ZeroFloat> : tensor<1xf32>} : () -> !torch.vtensor<[1],f32> 
    %2 = torch.operator "onnx.Constant"() {torch.onnx.value = dense_resource<__ZeroFloat_Batch> : tensor<1x1xf32>} : () -> !torch.vtensor<[1,1],f32> 
    %3 = torch.operator "onnx.Constant"() {torch.onnx.value = dense_resource<__NegINF_Batch> : tensor<1x1xf32>} : () -> !torch.vtensor<[1,1],f32> 
    %4 = torch.operator "onnx.Constant"() {torch.onnx.value = dense_resource<__OneFloat> : tensor<1xf32>} : () -> !torch.vtensor<[1],f32> 
    %300:5 = torch.operator "onnx.Scan"(%3, %2, %1, %arg1) {torch.onnx.num_scan_inputs = 1 : si64, torch.onnx.scan_input_directions = [0 : si64], torch.onnx.scan_output_directions = [0 : si64, 0 : si64]} : (!torch.vtensor<[1,1],f32>, !torch.vtensor<[1,1],f32>, !torch.vtensor<[1],f32>, !torch.vtensor<[?,1,1],f32>) -> (!torch.vtensor<[1,1],f32>, !torch.vtensor<[1,1],f32>, !torch.vtensor<[1],f32>, !torch.vtensor<[?,1,1],f32>, !torch.vtensor<[?,1,1],f32>) {
    ^bb0(%arg4: !torch.vtensor<[1,1],f32>, %arg5: !torch.vtensor<[1,1],f32>, %arg6: !torch.vtensor<[1],f32>, %arg7: !torch.vtensor<[1,1],f32>):
      %315 = torch.operator "onnx.Greater"(%arg7, %arg4) : (!torch.vtensor<[1,1],f32>, !torch.vtensor<[1,1],f32>) -> !torch.vtensor<[1,1],i1> 
      %316 = torch.operator "onnx.Where"(%315, %arg7, %arg4) : (!torch.vtensor<[1,1],i1>, !torch.vtensor<[1,1],f32>, !torch.vtensor<[1,1],f32>) -> !torch.vtensor<[1,1],f32> 
      %317 = torch.operator "onnx.Where"(%315, %arg6, %arg5) : (!torch.vtensor<[1,1],i1>, !torch.vtensor<[1],f32>, !torch.vtensor<[1,1],f32>) -> !torch.vtensor<[1,1],f32> 
      %318 = torch.operator "onnx.Add"(%arg6, %4) : (!torch.vtensor<[1],f32>, !torch.vtensor<[1],f32>) -> !torch.vtensor<[1],f32> 
      %319 = torch.operator "onnx.Identity"(%316) : (!torch.vtensor<[1,1],f32>) -> !torch.vtensor<[1,1],f32> 
      %320 = torch.operator "onnx.Identity"(%317) : (!torch.vtensor<[1,1],f32>) -> !torch.vtensor<[1,1],f32> 
      %321 = torch.operator "onnx.Identity"(%316) : (!torch.vtensor<[1,1],f32>) -> !torch.vtensor<[1,1],f32> 
      %322 = torch.operator "onnx.Identity"(%317) : (!torch.vtensor<[1,1],f32>) -> !torch.vtensor<[1,1],f32> 
      torch.operator_terminator %319, %320, %318, %321, %322 : !torch.vtensor<[1,1],f32>, !torch.vtensor<[1,1],f32>, !torch.vtensor<[1],f32>, !torch.vtensor<[1,1],f32>, !torch.vtensor<[1,1],f32>
    }
    return %300#3: !torch.vtensor<[?,1,1],f32>
  }
}

{-#
  dialect_resources: {
    builtin: {
      __OneFloat: "0x080000000000803F",
      __ZeroFloat: "0x0800000000000000",
      __ZeroFloat_Batch: "0x0800000000000000",
      __NegINF_Batch: "0x08000000997696FE"
    }
  }
#-}

Related shark-testsuite test: nod-ai/SHARK-TestSuite#276
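For reference, here is a minimal Python sketch of what the Scan body above appears to compute (my reading of the IR, not verified against the original model): the loop carries a running max, the index at which it occurred, and a step counter, and emits the running max and best index at every step.

```python
import numpy as np

def scan_running_max(xs):
    # State carried across iterations, mirroring the Scan init values:
    # __NegINF_Batch (running max), __ZeroFloat_Batch (best index),
    # __ZeroFloat (step counter).
    running_max = np.float32(-3.4e38)
    best_idx = np.float32(0.0)
    counter = np.float32(0.0)
    out_max, out_idx = [], []
    for v in xs:  # scan over axis 0 of the [?,1,1] input
        greater = v > running_max                    # onnx.Greater
        running_max = v if greater else running_max  # onnx.Where
        best_idx = counter if greater else best_idx  # onnx.Where
        counter = counter + np.float32(1.0)          # onnx.Add with __OneFloat
        out_max.append(running_max)                  # first scan output
        out_idx.append(best_idx)                     # second scan output
    return np.array(out_max), np.array(out_idx)
```

Under that reading, the returned value %300#3 (the first scan output) corresponds to out_max here, i.e. a cumulative max along the scan axis.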

@AmosLewis (Contributor) commented Jan 3, 2025

@zjgarvey
torch-mlir-opt --convert-torch-onnx-to-torch scan.onnx.mlir -debug

** Failure : Expects result type to be static

scan.onnx.mlir:7:14: error: failed to legalize operation 'torch.operator' that was explicitly marked illegal
    %300:5 = torch.operator "onnx.Scan"(%3, %2, %1, %arg1) {torch.onnx.num_scan_inputs = 1 : si64, torch.onnx.scan_input_directions = [0 : si64], torch.onnx.scan_output_directions = [0 : si64, 0 : si64]} : (!torch.vtensor<[1,1],f32>, !torch.vtensor<[1,1],f32>, !torch.vtensor<[1],f32>, !torch.vtensor<[?,1,1],f32>) -> (!torch.vtensor<[1,1],f32>, !torch.vtensor<[1,1],f32>, !torch.vtensor<[1],f32>, !torch.vtensor<[?,1,1],f32>, !torch.vtensor<[?,1,1],f32>) {

@AmosLewis (Contributor)

#map = affine_map<() -> ()>
#map1 = affine_map<(d0, d1, d2) -> ()>
#map2 = affine_map<(d0, d1, d2) -> (d0, d1, d2)>
#map3 = affine_map<(d0, d1) -> (0, 0)>
#map4 = affine_map<(d0, d1) -> (d0, d1)>
#map5 = affine_map<(d0, d1) -> (0)>
#map6 = affine_map<(d0) -> (0)>
#map7 = affine_map<(d0) -> (d0)>
module {
  ml_program.global private mutable @global_seed(dense<0> : tensor<i64>) : tensor<i64>
  func.func @CNTKGraph(%arg0: tensor<?x1x1xf32>) -> tensor<?x1x1xf32> {
    %cst = arith.constant dense<1.000000e+00> : tensor<1xf32>
    %c0_i64 = arith.constant 0 : i64
    %c0 = arith.constant 0 : index
    %cst_0 = arith.constant dense<-9.99999968E+37> : tensor<1x1xf32>
    %cst_1 = arith.constant dense<0.000000e+00> : tensor<1x1xf32>
    %cst_2 = arith.constant dense<0.000000e+00> : tensor<1xf32>
    %cst_3 = arith.constant dense<0> : tensor<i64>
    %c1 = arith.constant 1 : index
    %0 = tensor.empty() : tensor<f32>
    %1 = linalg.generic {indexing_maps = [#map, #map], iterator_types = []} ins(%cst_3 : tensor<i64>) outs(%0 : tensor<f32>) {
    ^bb0(%in: i64, %out: f32):
      %7 = arith.sitofp %in : i64 to f32
      linalg.yield %7 : f32
    } -> tensor<f32>
    %2 = tensor.empty() : tensor<1x1x1xf32>
    %3 = linalg.generic {indexing_maps = [#map1, #map2], iterator_types = ["parallel", "parallel", "parallel"]} ins(%1 : tensor<f32>) outs(%2 : tensor<1x1x1xf32>) {
    ^bb0(%in: f32, %out: f32):
      linalg.yield %in : f32
    } -> tensor<1x1x1xf32>
    %cast = tensor.cast %3 : tensor<1x1x1xf32> to tensor<?x1x1xf32>
    %dim = tensor.dim %arg0, %c0 : tensor<?x1x1xf32>
    %4 = arith.index_cast %dim : index to i64
    %5 = arith.index_cast %4 : i64 to index
    %6:5 = scf.for %arg1 = %c0 to %5 step %c1 iter_args(%arg2 = %cst_0, %arg3 = %cst_1, %arg4 = %cst_2, %arg5 = %cast, %arg6 = %cast) -> (tensor<1x1xf32>, tensor<1x1xf32>, tensor<1xf32>, tensor<?x1x1xf32>, tensor<?x1x1xf32>) {
      %7 = arith.index_cast %arg1 : index to i64
      %8 = arith.cmpi slt, %7, %c0_i64 : i64
      %9 = arith.extui %8 : i1 to i64
      %10 = arith.muli %9, %4 : i64
      %11 = arith.addi %7, %10 : i64
      %12 = arith.addi %11, %4 : i64
      %13 = arith.cmpi sge, %11, %c0_i64 : i64
      %14 = arith.select %13, %11, %12 : i64
      %15 = arith.cmpi slt, %14, %c0_i64 : i64
      %16 = arith.select %15, %c0_i64, %14 : i64
      %17 = arith.cmpi sgt, %16, %4 : i64
      %18 = arith.select %17, %4, %16 : i64
      %19 = arith.index_cast %18 : i64 to index
      %extracted_slice = tensor.extract_slice %arg0[%19, 0, 0] [1, 1, 1] [1, 1, 1] : tensor<?x1x1xf32> to tensor<1x1x1xf32>
      %collapsed = tensor.collapse_shape %extracted_slice [[0, 1], [2]] : tensor<1x1x1xf32> into tensor<1x1xf32>
      %20 = tensor.empty() : tensor<1x1xi1>
      %21 = linalg.generic {indexing_maps = [#map3, #map3, #map4], iterator_types = ["parallel", "parallel"]} ins(%collapsed, %arg2 : tensor<1x1xf32>, tensor<1x1xf32>) outs(%20 : tensor<1x1xi1>) {
      ^bb0(%in: f32, %in_8: f32, %out: i1):
        %44 = arith.cmpf ogt, %in, %in_8 : f32
        linalg.yield %44 : i1
      } -> tensor<1x1xi1>
      %22 = tensor.empty() : tensor<1x1xf32>
      %23 = linalg.generic {indexing_maps = [#map3, #map3, #map3, #map4], iterator_types = ["parallel", "parallel"]} ins(%21, %collapsed, %arg2 : tensor<1x1xi1>, tensor<1x1xf32>, tensor<1x1xf32>) outs(%22 : tensor<1x1xf32>) {
      ^bb0(%in: i1, %in_8: f32, %in_9: f32, %out: f32):
        %44 = arith.select %in, %in_8, %in_9 : f32
        linalg.yield %44 : f32
      } -> tensor<1x1xf32>
      %24 = linalg.generic {indexing_maps = [#map3, #map5, #map3, #map4], iterator_types = ["parallel", "parallel"]} ins(%21, %arg4, %arg3 : tensor<1x1xi1>, tensor<1xf32>, tensor<1x1xf32>) outs(%22 : tensor<1x1xf32>) {
      ^bb0(%in: i1, %in_8: f32, %in_9: f32, %out: f32):
        %44 = arith.select %in, %in_8, %in_9 : f32
        linalg.yield %44 : f32
      } -> tensor<1x1xf32>
      %25 = tensor.empty() : tensor<1xf32>
      %26 = linalg.generic {indexing_maps = [#map6, #map6, #map7], iterator_types = ["parallel"]} ins(%arg4, %cst : tensor<1xf32>, tensor<1xf32>) outs(%25 : tensor<1xf32>) {
      ^bb0(%in: f32, %in_8: f32, %out: f32):
        %44 = arith.addf %in, %in_8 : f32
        linalg.yield %44 : f32
      } -> tensor<1xf32>
      %expanded = tensor.expand_shape %23 [[0, 1], [2]] output_shape [1, 1, 1] : tensor<1x1xf32> into tensor<1x1x1xf32>
      %dim_4 = tensor.dim %arg5, %c0 : tensor<?x1x1xf32>
      %27 = arith.index_cast %dim_4 : index to i64
      %28 = arith.addi %7, %27 : i64
      %29 = arith.cmpi sge, %7, %c0_i64 : i64
      %30 = arith.select %29, %7, %28 : i64
      %31 = arith.cmpi slt, %30, %c0_i64 : i64
      %32 = arith.select %31, %c0_i64, %30 : i64
      %33 = arith.cmpi sgt, %32, %27 : i64
      %34 = arith.select %33, %27, %32 : i64
      %35 = arith.index_cast %34 : i64 to index
      %inserted_slice = tensor.insert_slice %expanded into %arg5[%35, 0, 0] [1, 1, 1] [1, 1, 1] : tensor<1x1x1xf32> into tensor<?x1x1xf32>
      %expanded_5 = tensor.expand_shape %24 [[0, 1], [2]] output_shape [1, 1, 1] : tensor<1x1xf32> into tensor<1x1x1xf32>
      %dim_6 = tensor.dim %arg6, %c0 : tensor<?x1x1xf32>
      %36 = arith.index_cast %dim_6 : index to i64
      %37 = arith.addi %7, %36 : i64
      %38 = arith.select %29, %7, %37 : i64
      %39 = arith.cmpi slt, %38, %c0_i64 : i64
      %40 = arith.select %39, %c0_i64, %38 : i64
      %41 = arith.cmpi sgt, %40, %36 : i64
      %42 = arith.select %41, %36, %40 : i64
      %43 = arith.index_cast %42 : i64 to index
      %inserted_slice_7 = tensor.insert_slice %expanded_5 into %arg6[%43, 0, 0] [1, 1, 1] [1, 1, 1] : tensor<1x1x1xf32> into tensor<?x1x1xf32>
      scf.yield %23, %24, %26, %inserted_slice, %inserted_slice_7 : tensor<1x1xf32>, tensor<1x1xf32>, tensor<1xf32>, tensor<?x1x1xf32>, tensor<?x1x1xf32>
    }
    return %6#3 : tensor<?x1x1xf32>
  }
}
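As a side note, the repeated arith.cmpi/arith.select chains inside the scf.for loop are the usual dynamic slice-index normalization. A simplified Python sketch of that pattern (the helper name is mine, and this collapses the two-step wrap in the IR into one):

```python
def normalize_index(i, dim):
    # Sketch of the cmpi/select chains emitted per extract/insert_slice:
    # wrap a negative index once by the dimension size, then clamp the
    # result into the valid range [0, dim].
    if i < 0:
        i += dim  # wrap Python-style negative indices
    if i < 0:
        i = 0     # clamp below
    if i > dim:
        i = dim   # clamp above
    return i
```

This is why each slice access in the lowering costs several scalar ops even though the loop index is always in range here.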

@AmosLewis (Contributor)

New error for the model when running python run.py --torchtolinalg -v -t bidaf-9:

Failed test at stage preprocessing with exception:
Failure while executing pass pipeline:
error: "Compress_31": 'arith.trunci' op operand type 'i1' and result type 'i32' are cast incompatible
 note: "Compress_31": see current operation: %852 = "arith.trunci"(%848) : (i1) -> i32
Traceback (most recent call last):
  File "/proj/gdba/shark/chi/src/SHARK-TestSuite/alt_e2eshark/run.py", line 212, in run_tests
    model_artifact = config.preprocess_model(
  File "/proj/gdba/shark/chi/src/SHARK-TestSuite/alt_e2eshark/e2e_testing/test_configs/onnxconfig.py", line 120, in preprocess_model
    pm1.run(mlir_module.operation)
torch_mlir._mlir_libs._site_initialize.<locals>.MLIRError: Failure while executing pass pipeline:
error: "Compress_31": 'arith.trunci' op operand type 'i1' and result type 'i32' are cast incompatible
 note: "Compress_31": see current operation: %852 = "arith.trunci"(%848) : (i1) -> i32

@AmosLewis (Contributor)

The new issue comes from the onnx.Compress op:

module {
  func.func @CNTKGraph(%311:!torch.vtensor<[1,?],f32>, %312:!torch.vtensor<[?],i1> ) -> (!torch.vtensor<[1],f32>)  attributes {torch.onnx_meta.ir_version = 4 : si64, torch.onnx_meta.opset_version = 21 : si64, torch.onnx_meta.opset_versions = {ai.onnx.ml = 1 : si64}, torch.onnx_meta.producer_name = "CNTK", torch.onnx_meta.producer_version = "2.7"} {
    %313 = torch.operator "onnx.Compress"(%311, %312) : (!torch.vtensor<[1,?],f32>, !torch.vtensor<[?],i1>) -> !torch.vtensor<[1],f32> 
    return %313: !torch.vtensor<[1],f32> 
  }
}
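For context, onnx.Compress with no axis attribute (as in the repro above) flattens the data and keeps the elements whose condition entry is true. A rough numpy sketch of that semantics (the function name is mine):

```python
import numpy as np

def compress_no_axis(data, condition):
    # onnx.Compress without an axis: flatten the input, then select the
    # elements whose corresponding condition entry is True. The condition
    # may be shorter than the flattened data; the uncovered tail is dropped.
    flat = np.asarray(data).reshape(-1)
    cond = np.asarray(condition, dtype=bool)
    return flat[: cond.size][cond[: flat.size]]
```

The error below points at an aten.scatter_add on i1 tensors produced by the current torch lowering of this op, which is where the arith.trunci type mismatch arises.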
torch-mlir-opt -pass-pipeline='builtin.module(torch-onnx-to-torch-backend-pipeline{backend-legal-ops=aten.flatten.using_ints,aten.unflatten.int})' ./compress.onnx.mlir > compress.torch.mlir
torch-mlir-opt -pass-pipeline='builtin.module(torch-backend-to-linalg-on-tensors-backend-pipeline)' compress.torch.mlir > compress.linalg.mlir
compress.torch.mlir:24:11: error: 'arith.trunci' op operand type 'i1' and result type 'i32' are cast incompatible
    %14 = torch.aten.scatter_add %13, %int0, %6, %9 : !torch.vtensor<[?],i1>, !torch.int, !torch.vtensor<[?],i1>, !torch.vtensor<[?],i1> -> !torch.vtensor<[?],i1>
          ^
compress.torch.mlir:24:11: note: see current operation: %109 = "arith.trunci"(%105) : (i1) -> i32
