[Tracker] All the issue related with e2e shark test suite #812

Open
pdhirajkumarprasad opened this issue Aug 27, 2024 · 4 comments
pdhirajkumarprasad commented Aug 27, 2024

Full ONNX FE tracker is at: #564

ONNX model Zoo model tracker : #886

HF model tracker : #899

Running model

In alt_e2e test suite:

export CACHE_DIR="/path/where/models/will/be/downloaded"

If building torch-mlir and iree from source:

source /path/to/iree-build/.env && export PYTHONPATH
export PYTHONPATH=/path/to/torch-mlir/build/tools/torch-mlir/python_packages/torch_mlir:/path/to/torch-mlir/test/python/fx_importer:$PYTHONPATH
export PATH=/path/to/iree-build/tools/:/path/to/torch-mlir/build/bin/:$PATH

python ./run.py --mode=cl-onnx-iree -v --torchtolinalg -t ModelName
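To reproduce several failures in one go, the command above can be wrapped in a loop. The following is only a dry-run sketch: it prints the commands instead of running them, uses just the flags quoted above, and the model names (`ModelA`, `ModelB`) and the `build_cmd` helper are placeholders, not real names from the test suite.

```shell
# Placeholder model names (hypothetical); replace with names from the model lists.
models="ModelA ModelB"

# Print the run.py invocation for one model, using only the flags shown above.
build_cmd() {
    printf 'python ./run.py --mode=cl-onnx-iree -v --torchtolinalg -t %s\n' "$1"
}

# Dry run: print one command per model.
for m in $models; do
    build_cmd "$m"
done
```

Piping the output to `sh` from the alt_e2e directory would execute the commands once the environment variables above are set.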

For onnx/models/

critical issues

regression : #887

CPU

| # | device | issue type | issue no | #model impacted | list of model | assignee | status |
|---|---|---|---|---|---|---|---|
| 2 | CPU | "onnx.NonMaxSuppression" failed to legalize operation 'torch.operator' that was explicitly marked illegal | 881 | 2 | | @jinchen62 | |
| 3 | CPU | 'func.func' op exceeded stack allocation limit of 32768 bytes for function. Got 1048576 bytes | 19333 | 1 | | @pashu123 | |
| 4 | CPU | onnx.LSTM | | 1 | modelList | | |
| 5 | CPU | torch.aten.convolution 1-d grouped | | 2 | modelList | @AmosLewis | |
| 6 | CPU | 'tensor.dim' op unexpected during shape cleanup; dynamic dimensions must have been resolved prior to leaving the flow dialect | 876 | 1 | modelList | | |
| 7 | CPU | failed to legalize operation onnx.NonZero | 820 | 1 | modelList | @renxida @AmosLewis | will message xida |
| 8 | CPU | failed to legalize operation 'arith.sitofp' that was explicitly marked illegal | 887 | 2 | | @vivekkhandelwal1 | |
| 9 | CPU | boolean indexing ops: AtenNonzeroOp, AtenIndexTensorOp, AtenMaskedSelectOp | 3293 | | | @renxida | |
| 10 | CPU | Add TorchToLinalg lowering for MaxUnpool operation | 718 | | | @jinchen62 | |
| 11 | CPU | Fix Onnx.DFT Torch->Linalg lowering | 800 | | | @PhaneeshB | |

import and setup failures

| # | device | issue type | issue no | #model impacted | list of model | assignee | status |
|---|---|---|---|---|---|---|---|

iree-compile

IREE project tracker: https://github.com/orgs/iree-org/projects/8/views/3

| # | device | issue type | issue no | #model impacted | list of model | assignee | Status |
|---|---|---|---|---|---|---|---|
| 1 | GPU | 'func.func' op uses 401920 bytes of shared memory; exceeded the limit of 65536 bytes | 18603 | 106 | | | |
| 2 | GPU | 'arith.extui' op operand type 'i64' and result type 'i32' are cast incompatible | 19179 | 10 | | @pashu123 | |

iree runtime

| # | device | issue type | issue no | #model impacted | list of model | assignee | Status |
|---|---|---|---|---|---|---|---|

numerics

| # | device | issue type | issue no | #model impacted | list of model | assignee |
|---|---|---|---|---|---|---|
| 1 | CPU | numeric need_to_analyze | | 101 | modelList | |
| 2 | | [numeric] Numeric error for Conv operator with quantize/dequantize | 19416 | 50+ | | |

IREE EP only issues

iree-compile fails with ElementsAttr does not provide iteration facilities for type 'mlir::Attribute' on int8 models at QuantizeLinear op

low priority

Issue no 828: Turbine Camp
Issue no 797: Ops not in model

@nod-ai nod-ai deleted a comment Aug 27, 2024
@nod-ai nod-ai deleted a comment from yiweifengyan Aug 27, 2024
@zjgarvey
Collaborator

Can you update the model List links?

@jinchen62
Contributor

Could you also attach the issue links you referred to, so we know whether we cover all model paths? Also, this doesn't seem to include #801, right?

@pdhirajkumarprasad
Author

@zjgarvey The model list contains only the updated links.

@jinchen62 Yes, so far the report is based on the ONNX models of the e2e shark test suite.

@jinchen62
Contributor

jinchen62 commented Aug 29, 2024

@pdhirajkumarprasad I think it would be helpful to attach more details of the error message.

I feel like the onnx.Transpose one in onnx to torch is the shape inference issue that I was dealing with. I fixed it by setting the opset version to 21 with a locally built torch-mlir in the shark test suite (llvm/torch-mlir#3593). @zjgarvey I realized that this doesn't seem to work for the CI job, right? Any ideas?
