[BUG]: C++ impl for Triton inference can incorrectly split inference inputs #680
Closed
Labels
bug
Something isn't working
Version
23.03
Which installation method(s) does this occur on?
Docker, Conda, Source
Describe the bug.
The Triton inference stage often needs to split up its input based on the model's max batch size, which is frequently much smaller than the number of rows in the message (`pipeline_batch_size`); the input is broken up into what we call a "mini-batch". We can also have large input fields (typically variable-length fields such as text) which are themselves larger than the model can accept and need to be split into multiple inference inputs, after which we perform a reduction on the multiple outputs to produce a single output for the row.
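For illustration, here is a minimal standalone sketch of the two splitting paths described above. The names (`infer_chunk`, `make_mini_batches`, `infer_wide_row`) and the element-wise max reduction are hypothetical and chosen only to make the example self-contained; they are not the Morpheus or Triton client API.

```cpp
#include <algorithm>
#include <cstddef>
#include <utility>
#include <vector>

// Stand-in for a single Triton request; the real stage would call the Triton
// client here. Returns one value per input element purely for illustration.
static std::vector<float> infer_chunk(const std::vector<float>& chunk)
{
    std::vector<float> out(chunk.size());
    std::transform(chunk.begin(), chunk.end(), out.begin(), [](float v) { return v * 0.5f; });
    return out;
}

// Case 1: split the message's rows into mini-batches no larger than the
// model's max batch size. Returns [start, stop) row ranges.
static std::vector<std::pair<std::size_t, std::size_t>> make_mini_batches(std::size_t num_rows,
                                                                          std::size_t max_batch_size)
{
    std::vector<std::pair<std::size_t, std::size_t>> ranges;
    for (std::size_t start = 0; start < num_rows; start += max_batch_size)
    {
        ranges.emplace_back(start, std::min(start + max_batch_size, num_rows));
    }
    return ranges;
}

// Case 2: a single row's input is wider than the model accepts, so it is
// split into multiple inference inputs and the partial outputs are reduced
// (element-wise max here, as one possible reduction) into one output per row.
static std::vector<float> infer_wide_row(const std::vector<float>& row_input, std::size_t max_width)
{
    std::vector<float> reduced;
    for (std::size_t offset = 0; offset < row_input.size(); offset += max_width)
    {
        const std::size_t stop = std::min(offset + max_width, row_input.size());
        std::vector<float> partial =
            infer_chunk(std::vector<float>(row_input.begin() + offset, row_input.begin() + stop));

        if (reduced.empty())
        {
            reduced = std::move(partial);
        }
        else
        {
            for (std::size_t i = 0; i < reduced.size() && i < partial.size(); ++i)
                reduced[i] = std::max(reduced[i], partial[i]);
        }
    }
    return reduced;
}

int main()
{
    // e.g. pipeline_batch_size = 1024 rows, model max batch size = 256
    const auto batches = make_mini_batches(1024, 256);                        // 4 ranges of 256 rows
    const auto row_out = infer_wide_row(std::vector<float>(600, 1.0f), 256);  // 3 chunks reduced
    return (batches.size() == 4 && !row_out.empty()) ? 0 : 1;
}
```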
There are currently two related bugs, the first being common:
Minimum reproducible example
Relevant log output
No response
Full env printout
No response
Other/Misc.
No response
Code of Conduct