ORTOptimizer support ORTModelForCausalLM #794

Merged

fxmarty
Contributor

@fxmarty fxmarty commented Feb 20, 2023

As per title.

Notes:

  • longt5 results in a segmentation fault with O4.
  • gpt2, gptj, etc. with cache and O4 (fp16) result in:
onnxruntime.capi.onnxruntime_pybind11_state.RuntimeException: [ONNXRuntimeError] : 6 : RUNTIME_EXCEPTION : Non-zero status code returned while running Concat node. Name:'/transformer/h.0/attn/Concat_3' Status Message: /onnxruntime_src/onnxruntime/core/framework/op_kernel.cc:83 virtual OrtValue* onnxruntime::OpKernelContext::OutputMLValue(int, const onnxruntime::TensorShape&) status.IsOK() was false. Shape mismatch attempting to re-use buffer. {1,4,11,8} != {1,4,12,8}. Validate usage of dim_value (values should be > 0) and dim_param (all values with the same string should equate to the same size) in shapes in the model.

Have you already seen this, @echarlaix @michaelbenayoun @JingyaHuang?
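
For context on the two shapes in the error above, here is a minimal sketch in plain Python (no ONNX Runtime involved). It assumes the failing Concat node appends the current step's key/value to the cached ones along the sequence axis, and that the shape layout is (batch, num_heads, sequence_length, head_dim); both are assumptions, not something confirmed in this PR. The helpers `concat_shape` and `can_reuse_buffer` are hypothetical names used only for illustration.

```python
from typing import Sequence, Tuple


def concat_shape(a: Sequence[int], b: Sequence[int], axis: int) -> Tuple[int, ...]:
    """Return the shape obtained by concatenating shapes `a` and `b` along `axis`."""
    if len(a) != len(b):
        raise ValueError("rank mismatch")
    axis = axis % len(a)  # allow negative axes
    for i, (x, y) in enumerate(zip(a, b)):
        if i != axis and x != y:
            raise ValueError(f"dimension {i} differs: {x} != {y}")
    out = list(a)
    out[axis] = a[axis] + b[axis]
    return tuple(out)


def can_reuse_buffer(buffer_shape: Sequence[int], required_shape: Sequence[int]) -> bool:
    """A buffer sized for `buffer_shape` can only be reused if the shapes match exactly."""
    return tuple(buffer_shape) == tuple(required_shape)


# Assumed layout: (batch, num_heads, sequence_length, head_dim).
past_key = (1, 4, 11, 8)  # cached keys for 11 previous tokens (assumption)
new_key = (1, 4, 1, 8)    # keys for the single token decoded at this step

step_output = concat_shape(past_key, new_key, axis=2)
print(step_output)                              # (1, 4, 12, 8)
print(can_reuse_buffer(past_key, step_output))  # False
```

Under these assumptions the Concat output grows by one along the sequence axis at every decoding step, so a buffer sized for {1,4,11,8} cannot hold a {1,4,12,8} result. Why the fp16 O4 graph tries to reuse such a buffer is not determined here.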

Before submitting

  • Did you make sure to update the documentation with your changes?
  • Did you write any new necessary tests?

@HuggingFaceDocBuilderDev

HuggingFaceDocBuilderDev commented Feb 20, 2023

The documentation is not available anymore as the PR was closed or merged.

@michaelbenayoun
Member

Might be linked to the issue with shape inference.

@michaelbenayoun michaelbenayoun left a comment

LGTM

@fxmarty
Contributor Author

fxmarty commented Feb 20, 2023

@michaelbenayoun ONNX Runtime has some scripts that use half precision. I'm not sure whether they use past key values, though. I'll have a look.

@fxmarty fxmarty merged commit 463eced into huggingface:main Feb 21, 2023
3 participants