
Fix Triton mini batch bug #688

Conversation


@dagardner-nv (Contributor) commented Feb 9, 2023

  • Fixes an issue that can occur when the number of input tensors is greater than the number of rows in the dataframe.
  • Refactors the on_next method of the InferenceClientStage, which was getting long and hard to follow, moving some logic into its own methods.
  • Waits until all mini-batches have returned before calculating the reduction.

Fixes #680

… mapping back to the same source row in the dataframe.

Example:
model max batch size = 4
seq_ids = [0, 0, 0, 1, 1, 2, 3, 3]

This would cause the inference inputs for dataframe row 1
to occur in two non-adjacent batches.

This is a non-ideal fix, as it doesn't handle the case where a single
inference input is so large that it spans more rows than the model's max batch size.
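The splitting logic described above can be sketched as follows. This is a minimal illustration, not the actual Morpheus implementation: the function name `batch_boundaries` and its interface are hypothetical, assuming only that each input tensor carries a `seq_ids` entry mapping it back to a dataframe row.

```python
def batch_boundaries(seq_ids, max_batch_size):
    """Yield (start, stop) slices over the input tensors such that
    stop - start <= max_batch_size and no dataframe row (identified
    by its seq_id) is split across two mini-batches.

    Hypothetical sketch; not the Morpheus API.
    """
    start = 0
    n = len(seq_ids)
    while start < n:
        stop = min(start + max_batch_size, n)
        # Shrink the batch until it does not cut a row in half.
        while stop < n and seq_ids[stop] == seq_ids[stop - 1]:
            stop -= 1
        if stop == start:
            # A single row spans more inputs than max_batch_size;
            # the PR notes this case is still unhandled.
            raise ValueError("row spans more inputs than the model's max batch size")
        yield start, stop
        start = stop

# The example from the commit message: model max batch size = 4.
# Naive slicing would produce [0:4] and [4:8], splitting row 1's
# inputs across two batches; shrinking to a row boundary keeps
# each row's inputs together.
print(list(batch_boundaries([0, 0, 0, 1, 1, 2, 3, 3], 4)))
# → [(0, 3), (3, 6), (6, 8)]
```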
@dagardner-nv dagardner-nv requested a review from a team as a code owner February 9, 2023 19:13
@dagardner-nv dagardner-nv changed the title David fix triton mini batch Fix Triton mini batch bug #680 Feb 9, 2023
@dagardner-nv dagardner-nv changed the title Fix Triton mini batch bug #680 Fix Triton mini batch bug Feb 9, 2023
@dagardner-nv dagardner-nv added bug Something isn't working non-breaking Non-breaking change 3 - Ready for Review labels Feb 9, 2023
@dagardner-nv dagardner-nv mentioned this pull request Feb 9, 2023
rapids-bot bot pushed a commit that referenced this pull request Feb 22, 2023
* Remove usage of `tensorShape_t`, which was deprecated and later removed.
* Replace usage of the tensor constructor in favor of the recommended `make_tensor` helper method.
* Add more C++ unit tests.
* Mark RMMTensor as a public symbol so the C++ tests can use it.
* Add the `cuda-nvtx` package to the CI driver build, needed for matx-0.3.0.

Includes changes from PR #688.
Fixes #317

Authors:
  - David Gardner (https://github.com/dagardner-nv)

Approvers:
  - Michael Demoret (https://github.com/mdemoret-nv)

URL: #667
@dagardner-nv
Contributor Author

PR #667 contained this code and was merged.

jjacobelli pushed a commit to jjacobelli/Morpheus that referenced this pull request Mar 7, 2023
Successfully merging this pull request may close these issues.

[BUG]: C++ impl for Triton inference can incorrectly split inference inputs