
Fixes tensorrt cache being regenerated on path change #126

Merged 2 commits on Jul 25, 2022

Conversation

@fran6co (Contributor) commented Jul 6, 2022

Onnxruntime uses the model path as the hash key for the TensorRT cache; passing the model as a binary blob avoids this, so the cache works as intended.

Fixes triton-inference-server/server#4587

@GuanLuo (Contributor) commented Jul 19, 2022

@pranavsharma can you or someone familiar with the TRT EP take a look at this PR? The change looks fine to me; I just want to double check that the cache behavior will change as intended.

@pranavsharma (Contributor) commented

@stevenlix - any comments? Is this how we avoid TRT cache regeneration?

@stevenlix commented

Yes. ORT-TRT uses the ModelMetadefIdGenerator function to generate the engine id. If the model passed in has a path, the id is generated from the path string (https://github.com/microsoft/onnxruntime/blob/master/onnxruntime/core/framework/execution_provider.cc#L151). To avoid the dependency on the path, one can pass the model binary to the inference session.
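As a toy illustration of the behavior described above (this is a hypothetical simplification, not ORT's actual ModelMetadefIdGenerator algorithm): when the id is derived from the path string, the same model bytes deployed at two different paths get two different cache ids, while hashing the bytes themselves yields a stable, path-independent id.

```python
import hashlib

def engine_id(model_path=None, model_bytes=None):
    """Toy analogue of an engine-id generator (hypothetical, simplified):
    hash the path string when one is available, otherwise hash the
    model bytes themselves."""
    if model_path is not None:
        return hashlib.sha256(model_path.encode()).hexdigest()[:8]
    return hashlib.sha256(model_bytes).hexdigest()[:8]

blob = b"onnx-model-bytes"
# The same model at two deployment paths gets two different cache ids:
print(engine_id("/models/v1/model.onnx") != engine_id("/models/v2/model.onnx"))  # True
# Hashing the bytes gives a stable id regardless of where the file lives:
print(engine_id(model_bytes=blob) == engine_id(model_bytes=blob))  # True
```

This is why the PR switches from passing a file path to passing the serialized model: the id input no longer changes when the model directory moves.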

@GuanLuo requested review from Tabrizian and tanmayv25 on July 20, 2022
@cnegron-nv commented
@pranavsharma Can you give a final review to this PR?

pranavsharma previously approved these changes Jul 20, 2022
tanmayv25 previously approved these changes Jul 21, 2022
@fran6co fran6co dismissed stale reviews from tanmayv25 and pranavsharma via c062f96 July 21, 2022 07:18
@Tabrizian Tabrizian merged commit 05e0383 into triton-inference-server:main Jul 25, 2022
Tabrizian added a commit that referenced this pull request Jul 26, 2022
@Tabrizian (Member) commented

Hi @fran6co, @pranavsharma, @stevenlix

We are observing a CI failure as a result of this change. It looks like relative paths no longer work for models that expect additional weight files. The test fails with the error below:

Deserialize tensor embeddings.20.weight failed. open file "./embeddings.20.weight" failed: No such file or directory

Do you have a fix in mind for this? Otherwise, I think we need to revert this change.
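A minimal, self-contained sketch of why byte-only loading can break relative external-weight references (the `resolve_external_weight` helper is hypothetical, not an ORT API): with a model path, a loader can resolve `"./embeddings.20.weight"` against the model's directory, but with only bytes there is no model directory, so the reference falls back to the process's working directory and the file is not found.

```python
import os
import tempfile

def resolve_external_weight(weight_ref, model_path=None):
    """Hypothetical loader behavior: resolve a relative weight reference
    against the model file's directory when the path is known,
    otherwise against the current working directory."""
    base = os.path.dirname(model_path) if model_path else os.getcwd()
    return os.path.normpath(os.path.join(base, weight_ref))

with tempfile.TemporaryDirectory() as d:
    weight_file = os.path.join(d, "embeddings.20.weight")
    open(weight_file, "wb").close()  # create an empty sibling weight file
    # Loading by path: the weight file is found next to the model.
    found = resolve_external_weight("./embeddings.20.weight",
                                    os.path.join(d, "model.onnx"))
    print(os.path.exists(found))    # True
    # Loading from bytes only: no model directory is known, so the
    # reference resolves against cwd and the weight file is missing.
    missing = resolve_external_weight("./embeddings.20.weight")
    print(os.path.exists(missing))
```

This matches the error above: the weight file exists beside the model on disk, but the byte-stream session has no base directory to resolve `"./embeddings.20.weight"` from.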

@pranavsharma (Contributor) commented

I've reached out to @stevenlix. What's the latest you need this fix by? If it's blocking the release, feel free to revert it.

@Tabrizian (Member) commented Jul 27, 2022

Thanks for the quick response! We are code freezing at the end of next week, but it would be great to have the fix ready for review by Wednesday next week so that we can test it before the freeze.

@pranavsharma (Contributor) commented

> Hi @fran6co, @pranavsharma, @stevenlix
>
> We are observing a failure in the CI as a result of this change. Looks like for models that expect additional weight files the relative paths do not work properly anymore. The test fails with the error below:
>
> Deserialize tensor embeddings.20.weight failed. open file "./embeddings.20.weight" failed: No such file or directory

fyi @jywu-msft

@pranavsharma (Contributor) commented

Let's revert this at this point: using a byte stream won't work with external weight files, so this PR will keep failing.

@bamdadd commented Sep 6, 2022

Hi, @Tabrizian Do you plan to revert the revert anytime soon?

@Tabrizian (Member) commented

The underlying issue has not been fixed yet, so I don't think there is any plan to bring this change back.

CC @pranavsharma

Successfully merging this pull request may close these issues.

ONNXRuntime TensorRT cache gets regenerated every time a model is uploaded even with correct settings