
[CUDA EP] Fix BeamSearch on T5 with sequence_as_input_ids (#20667) #20668

Merged: 10 commits into microsoft:main from amancini-N:20667-fix-beam-search-gpu on Dec 11, 2024

Conversation

amancini-N
Contributor

Description

Changes the implementation of the BeamSearch op on the CUDA EP: for T5 models, when the decoder input_ids are sequences, the sequences are now copied device-to-device instead of host-to-device.
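The core of the fix can be sketched as follows. This is a minimal, CPU-only illustration of the copy-direction choice, not the actual ONNX Runtime implementation; the enum and helper name are hypothetical stand-ins for `cudaMemcpyKind` and the real dispatch logic:

```cpp
#include <cstddef>

// Stand-ins for the CUDA memcpy kinds; the real code would use
// cudaMemcpyHostToDevice / cudaMemcpyDeviceToDevice with cudaMemcpyAsync.
enum class CopyKind { kHostToDevice, kDeviceToDevice };

// Hypothetical helper illustrating the fix: when the decoder input_ids are
// sequences produced on the GPU, the source buffer already lives in device
// memory, so the copy must be device-to-device rather than host-to-device.
inline CopyKind ChooseInputIdsCopyKind(bool sequences_as_input_ids) {
  return sequences_as_input_ids ? CopyKind::kDeviceToDevice
                                : CopyKind::kHostToDevice;
}
```

In the real kernel the selected kind would then be passed to `cudaMemcpyAsync` on the op's stream; copying a device buffer with the host-to-device kind is what produced the wrong results reported in #20667.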

Motivation and Context

- Fixes #20667

@tianleiwu
Contributor

/azp run Windows ARM64 QNN CI Pipeline,Windows x64 QNN CI Pipeline,Windows CPU CI Pipeline,Windows GPU CI Pipeline,Windows GPU TensorRT CI Pipeline,ONNX Runtime Web CI Pipeline,Linux CPU CI Pipeline,Linux CPU Minimal Build E2E CI Pipeline,Linux GPU CI Pipeline,Linux GPU TensorRT CI Pipeline

@tianleiwu
Contributor

/azp run Linux OpenVINO CI Pipeline,Linux QNN CI Pipeline,MacOS CI Pipeline,orttraining-amd-gpu-ci-pipeline,orttraining-linux-ci-pipeline,orttraining-linux-gpu-ci-pipeline,orttraining-ortmodule-distributed,onnxruntime-binary-size-checks-ci-pipeline,Big Models,Android CI Pipeline

@tianleiwu
Contributor

/azp run iOS CI Pipeline,ONNX Runtime React Native CI Pipeline


Azure Pipelines successfully started running 2 pipeline(s).


Azure Pipelines successfully started running 10 pipeline(s).


@tianleiwu
Contributor

@amancini-N, could you take a look at the build and test errors in the CI pipeline? Let me know if you need help resolving them.

@amancini-N
Contributor Author

/azp run Linux CPU CI Pipeline


Commenter does not have sufficient privileges for PR 20668 in repo microsoft/onnxruntime

@tianleiwu
Contributor

/azp run Windows ARM64 QNN CI Pipeline,Windows x64 QNN CI Pipeline,Windows CPU CI Pipeline,Windows GPU CUDA CI Pipeline,Windows GPU DML CI Pipeline,Windows GPU Doc Gen CI Pipeline,Windows GPU TensorRT CI Pipeline,ONNX Runtime Web CI Pipeline,Linux CPU CI Pipeline,Linux CPU Minimal Build E2E CI Pipeline

@tianleiwu
Contributor

/azp run Linux GPU CI Pipeline,Linux GPU TensorRT CI Pipeline,Linux OpenVINO CI Pipeline,Linux QNN CI Pipeline,MacOS CI Pipeline,orttraining-linux-gpu-ci-pipeline,onnxruntime-binary-size-checks-ci-pipeline,Big Models,Linux Android Emulator QNN CI Pipeline,Android CI Pipeline

@tianleiwu
Contributor

/azp run iOS CI Pipeline,ONNX Runtime React Native CI Pipeline,CoreML CI Pipeline,Linux DNNL CI Pipeline,Linux MIGraphX CI Pipeline,Linux ROCm CI Pipeline


Azure Pipelines successfully started running 6 pipeline(s).


Azure Pipelines successfully started running 9 pipeline(s).


Azure Pipelines successfully started running 10 pipeline(s).

- Fixing initialization of encoder_hidden_states feed in T5 decoder
- Re-generating data for T5 BeamSearch test
@amancini-N
Contributor Author

@tianleiwu I had time to come back to this. The test should be fixed now. The issue was that the encoder_hidden_states input of the decoder was not prepared correctly. I fixed that part and re-generated the test sample.
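For context on the feed-preparation fix: in beam search the decoder's encoder_hidden_states feed must cover every beam, so the encoder output of shape [batch, seq_len, hidden] is replicated per beam to [batch * num_beams, seq_len, hidden]. A minimal CPU sketch of that expansion, with illustrative names rather than the actual ONNX Runtime internals:

```cpp
#include <cstddef>
#include <vector>

// Hypothetical CPU sketch of preparing the decoder's encoder_hidden_states
// feed: each [seq_len, hidden] slice of the batch is replicated num_beams
// times, so every beam of a batch entry attends over the same encoder
// states. The real op performs the equivalent copy on the GPU.
std::vector<float> ExpandEncoderHiddenStates(const std::vector<float>& enc,
                                             int batch, int seq_len,
                                             int hidden, int num_beams) {
  const size_t row = static_cast<size_t>(seq_len) * hidden;
  std::vector<float> out;
  out.reserve(static_cast<size_t>(batch) * num_beams * row);
  for (int b = 0; b < batch; ++b) {
    const float* src = enc.data() + static_cast<size_t>(b) * row;
    for (int beam = 0; beam < num_beams; ++beam) {
      out.insert(out.end(), src, src + row);  // one copy per beam
    }
  }
  return out;
}
```

Getting this replication wrong (or feeding unexpanded encoder states) would make each beam attend over the wrong batch entry's encoder output, which is consistent with the failing test described here.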

@tianleiwu
Contributor

/azp run Windows ARM64 QNN CI Pipeline,Windows x64 QNN CI Pipeline,Windows CPU CI Pipeline,Windows GPU CUDA CI Pipeline,Windows GPU DML CI Pipeline,Windows GPU Doc Gen CI Pipeline,Windows GPU TensorRT CI Pipeline,ONNX Runtime Web CI Pipeline,Linux CPU CI Pipeline,Linux CPU Minimal Build E2E CI Pipeline

@tianleiwu
Contributor

/azp run Linux GPU CI Pipeline,Linux GPU TensorRT CI Pipeline,Linux OpenVINO CI Pipeline,Linux QNN CI Pipeline,MacOS CI Pipeline,orttraining-linux-gpu-ci-pipeline,onnxruntime-binary-size-checks-ci-pipeline,Big Models,Linux Android Emulator QNN CI Pipeline,Android CI Pipeline

@tianleiwu
Contributor

/azp run iOS CI Pipeline,ONNX Runtime React Native CI Pipeline,CoreML CI Pipeline,Linux DNNL CI Pipeline,Linux MIGraphX CI Pipeline,Linux ROCm CI Pipeline


Azure Pipelines successfully started running 6 pipeline(s).


Azure Pipelines successfully started running 9 pipeline(s).


Azure Pipelines successfully started running 10 pipeline(s).

@tianleiwu
Contributor

/azp run Windows CPU CI Pipeline, ONNX Runtime Web CI Pipeline, Windows GPU CUDA CI Pipeline, Linux ROCm CI Pipeline, Linux OpenVINO CI Pipeline


Azure Pipelines successfully started running 5 pipeline(s).

@tianleiwu
Contributor

/azp run Windows ARM64 QNN CI Pipeline,Windows x64 QNN CI Pipeline,Windows CPU CI Pipeline,Windows GPU CUDA CI Pipeline,Windows GPU DML CI Pipeline,Windows GPU Doc Gen CI Pipeline,Windows GPU TensorRT CI Pipeline,ONNX Runtime Web CI Pipeline,Linux CPU CI Pipeline,Linux CPU Minimal Build E2E CI Pipeline

@tianleiwu
Contributor

/azp run Linux GPU CI Pipeline,Linux GPU TensorRT CI Pipeline,Linux OpenVINO CI Pipeline,Linux QNN CI Pipeline,MacOS CI Pipeline,orttraining-amd-gpu-ci-pipeline,orttraining-linux-gpu-ci-pipeline,orttraining-ortmodule-distributed,onnxruntime-binary-size-checks-ci-pipeline,Big Models

@tianleiwu
Contributor

/azp run Linux Android Emulator QNN CI Pipeline,Android CI Pipeline,iOS CI Pipeline,ONNX Runtime React Native CI Pipeline


Azure Pipelines successfully started running 4 pipeline(s).


Azure Pipelines successfully started running 7 pipeline(s).


Azure Pipelines successfully started running 10 pipeline(s).

@tianleiwu
Contributor

@amancini-N, added some suggestions using static_cast to avoid code-scan warnings. The rest looks good to me.
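For readers unfamiliar with the suggestion, this is the general pattern (an illustrative example, not the PR's actual diff): code scanners flag implicit narrowing conversions, and an explicit static_cast documents that the narrowing is intentional.

```cpp
#include <cstdint>
#include <vector>

// Illustrative example of the kind of static_cast suggested to silence
// code-scan warnings about implicit narrowing: converting a 64-bit size
// to a 32-bit value explicitly. The function name is hypothetical.
int32_t SequenceLengthAsInt32(const std::vector<int64_t>& sequence) {
  // The explicit cast makes the narrowing intentional and visible;
  // the caller is expected to guarantee the length fits in 32 bits.
  return static_cast<int32_t>(sequence.size());
}
```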

@amancini-N
Contributor Author

@tianleiwu sure thing, done

@tianleiwu
Contributor

/azp run Windows ARM64 QNN CI Pipeline,Windows x64 QNN CI Pipeline,Windows CPU CI Pipeline,Windows GPU CUDA CI Pipeline,Windows GPU DML CI Pipeline,Windows GPU Doc Gen CI Pipeline,Windows GPU TensorRT CI Pipeline,ONNX Runtime Web CI Pipeline,Linux CPU CI Pipeline,Linux CPU Minimal Build E2E CI Pipeline

@tianleiwu
Contributor

/azp run Linux GPU CI Pipeline,Linux GPU TensorRT CI Pipeline,Linux OpenVINO CI Pipeline,Linux QNN CI Pipeline,MacOS CI Pipeline,orttraining-amd-gpu-ci-pipeline,orttraining-linux-gpu-ci-pipeline,orttraining-ortmodule-distributed,onnxruntime-binary-size-checks-ci-pipeline,Big Models

@tianleiwu
Contributor

/azp run Linux Android Emulator QNN CI Pipeline,Android CI Pipeline,iOS CI Pipeline,ONNX Runtime React Native CI Pipeline


Azure Pipelines successfully started running 4 pipeline(s).


Azure Pipelines successfully started running 7 pipeline(s).


Azure Pipelines successfully started running 10 pipeline(s).

@tianleiwu tianleiwu merged commit d8de3c4 into microsoft:main Dec 11, 2024
85 checks passed
ankitm3k pushed a commit to intel/onnxruntime that referenced this pull request Dec 11, 2024
…20667) (microsoft#20668)

### Description
Change the implementation of BeamSearch op when using CUDA EP: in case
of T5 model, and in case the decoder input_ids are sequences, copy the
sequences device-to-device instead of host-to-device

### Motivation and Context
- Fixes microsoft#20667
ankitm3k pushed a commit to intel/onnxruntime that referenced this pull request Dec 11, 2024
…20667) (microsoft#20668)
@amancini-N amancini-N deleted the 20667-fix-beam-search-gpu branch December 11, 2024 10:27
ankitm3k pushed a commit to intel/onnxruntime that referenced this pull request Dec 11, 2024
…20667) (microsoft#20668)
tarekziade pushed a commit to tarekziade/onnxruntime that referenced this pull request Jan 10, 2025
…20667) (microsoft#20668)
Successfully merging this pull request may close these issues.

BeamSearch op returning wrong results on CUDA execution provider when sequence is used as input_ids
2 participants