Fixes TensorRT cache being regenerated on path change #126
Conversation
ONNX Runtime uses the model path as the hash key for the TensorRT cache; passing the model as a binary avoids this, so the cache behaves as intended.

Fixes triton-inference-server/server#4587
@pranavsharma, can you or someone familiar with the TRT EP take a look at this PR? The change looks fine to me; I just want to double-check that the cache behavior will change as intended.
@stevenlix - any comments? Is this how we avoid TRT cache regeneration?
Yes. ORT-TRT uses the ModelMetadefIdGenerator function to generate the engine id. If the model passed in has a path, the id is generated from the path string (https://github.com/microsoft/onnxruntime/blob/master/onnxruntime/core/framework/execution_provider.cc#L151). To avoid the dependency on the path, one can pass the model binary to the inference session.
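For reference, a minimal sketch of that workaround using the ONNX Runtime C++ API; the file name, cache path, and option values here are illustrative, not taken from this PR:

```cpp
#include <fstream>
#include <iterator>
#include <vector>

#include <onnxruntime_cxx_api.h>

int main() {
  // Read the model into memory; "model.onnx" is just an example path.
  std::ifstream file("model.onnx", std::ios::binary);
  std::vector<char> model_data((std::istreambuf_iterator<char>(file)),
                               std::istreambuf_iterator<char>());

  Ort::Env env(ORT_LOGGING_LEVEL_WARNING, "trt_cache_demo");

  // Enable the TensorRT engine cache (values illustrative).
  OrtTensorRTProviderOptions trt_options{};
  trt_options.trt_engine_cache_enable = 1;
  trt_options.trt_engine_cache_path = "./trt_cache";

  Ort::SessionOptions options;
  options.AppendExecutionProvider_TensorRT(trt_options);

  // Byte-array constructor: no model path is recorded, so the generated
  // engine id no longer depends on where the model file lives on disk.
  Ort::Session session(env, model_data.data(), model_data.size(), options);
  return 0;
}
```

With the cache enabled, reloading the same bytes from a different directory should then reuse the cached engine instead of rebuilding it.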
@pranavsharma, can you give this PR a final review?
This reverts commit 05e0383.
Hi @fran6co, @pranavsharma, @stevenlix, we are observing a CI failure as a result of this change. It looks like, for models that expect additional weight files, the relative paths no longer resolve properly. The test fails with the error below:
Do you have a fix in mind for this? Otherwise, I think we need to revert this change.
I've reached out to @stevenlix. What is the latest you want this fix by? If it's blocking the release, feel free to revert it.
Thanks for the quick response! We are code freezing by the end of next week, but it would be great to have the fix ready for review by Wednesday next week so that we can test it before the code freeze.
fyi @jywu-msft |
@skottmckay, one solution is to not use the absolute model path in https://github.com/microsoft/onnxruntime/blob/master/onnxruntime/core/framework/execution_provider.cc#L151
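As a sketch of that idea only (this is not the actual ModelMetadefIdGenerator code): an id derived from the model bytes, rather than the absolute path, would be stable across path changes:

```cpp
#include <functional>
#include <string>
#include <string_view>

// Hypothetical helper, for illustration only: hash the serialized model
// bytes instead of the model path, so the id is identical no matter
// where the file is stored on disk.
std::string GenerateModelId(std::string_view model_bytes) {
  const size_t hash = std::hash<std::string_view>{}(model_bytes);
  return std::to_string(hash);
}
```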
Let's revert this for now: a byte stream won't work with external weight files, because the relative paths to those files are resolved against the model's directory, which is unknown when only bytes are passed in. So this PR's approach will fail for such models.
Hi @Tabrizian, do you plan to revert the revert anytime soon?
The underlying issue has not been fixed yet, so I don't think there is any plan to revert the revert.