feat: Add option to specify int64 as an Input dtype #1551
Merged
Conversation
gs-olive
commented
Dec 15, 2022
Force-pushed from fca7b3d to 3677fd3
gs-olive
commented
Dec 15, 2022
@peri044 is the reviewer for this feature
Force-pushed from 14ed6dd to bed7f39
- Rework the `Input` paradigm to be based on `at::ScalarType` rather than the previous `nvinfer1::DataType`, allowing a larger representation space of data types
- When paired with `truncate_long_and_double`, insert casts to ensure Torch engines using Int64 tensors receive the correct types, and TensorRT engines operating on those tensors receive downcast Int32 versions thereof
- Add a Torch block at the beginning of the model graph to prepare the types of input tensors for the forthcoming engines in sequence
- Automatically follow internal tensor types to abstract the different internal engines used (Torch/TensorRT) away from the user
- Provide a framework for streamlined addition of other data types, including `torch.double`, as valid input types
- Improve error checking to ensure model compilation and behavior are as documented; for example, disallow specification of a Long-type input if the engine is required to be converted entirely to TRT
- Known Limitations:
  - Specifying `dtype=torch.long` on an `Input` in an `input_signature` is not currently supported and will throw an error before model compilation when used with the Python API
  - While Torch may output Int64 tensors from the overall model, Torch-TRT can currently only output Int32 tensors for models using TRT, as there is no mechanism in place for differentiating intermediate blocks from final/beginning blocks in the graph
  - Torch-TRT will almost certainly alter the data type of the input tensor in place if `dtype=torch.long` is specified, and the returned result will be of type `torch.int32`

A usage sketch of these options follows below.
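Below is a minimal usage sketch of the behavior described above, assuming the Torch-TensorRT TorchScript frontend API at the time of this PR (`torch_tensorrt.compile`, `torch_tensorrt.Input`); the toy module, shapes, and variable names are illustrative only, not taken from this PR's test suite.

```python
import torch
import torch_tensorrt

class AddOne(torch.nn.Module):
    # Toy module with a single Int64 input; in eager Torch the output is Int64
    def forward(self, idx: torch.Tensor) -> torch.Tensor:
        return idx + 1

model = torch.jit.script(AddOne().eval().cuda())

trt_model = torch_tensorrt.compile(
    model,
    inputs=[torch_tensorrt.Input(shape=(4, 4), dtype=torch.long)],  # Long now accepted
    truncate_long_and_double=True,   # required when specifying Long inputs
    require_full_compilation=False,  # the feature relies on partitioning
)

idx = torch.ones(4, 4, dtype=torch.long, device="cuda")
out = trt_model(idx)
# Per the known limitations above, idx may be cast to Int32 in place,
# and out will be torch.int32 rather than torch.int64.
```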
peri044
requested changes
Dec 21, 2022
@mfeliz-cruise Please check the description for more details on the extent of int64 dtype support. This PR should resolve the use cases we discussed.
peri044
requested changes
Dec 22, 2022
- Address review comments
- Add C++ API testing and support
- Improve length and efficiency of autocast graph
- Improve messages displayed to user
peri044
reviewed
Dec 22, 2022
peri044
approved these changes
Jan 9, 2023
LGTM.
Fixes #1543
Labels
- cla signed
- component: api [C++] (Issues re: C++ API)
- component: api [Python] (Issues re: Python API)
- component: conversion (Issues re: Conversion stage)
- component: core (Issues re: The core compiler)
- component: lowering (Issues re: The lowering / preprocessing passes)
- component: partitioning
- component: tests (Issues re: Tests)
Description
- Rework the `Input` paradigm to be based on `at::ScalarType` as opposed to the previous `nvinfer1::DataType`, allowing a larger representation space of data types
- When paired with `truncate_long_and_double`, insert casts to ensure Torch engines using Int64 tensors receive the correct types, and TensorRT engines operating on those tensors receive downcast Int32 versions thereof
- Add a Torch block at the beginning of the model graph to prepare the types of input tensors for the forthcoming engines in sequence
- Automatically follow internal tensor types to abstract the different internal engines used (Torch/TensorRT) away from the user
- Provide a framework for streamlined addition of other data types, including `torch.double`, as valid input types
- Improve error checking to ensure model compilation and behavior are as documented; for example, disallow specification of a Long-type input if the engine is required to be converted entirely to TRT
- Modify the compiler to extract inferred data types for each input
- Add Python API testing to ensure casts are inserted correctly and run in Torch
- Known Limitations:
  - Specifying `dtype=torch.long` on an `Input` in an `input_signature` is not currently supported and will throw an error before model compilation when used with the Python API
  - Torch-TRT will almost certainly alter the data type of the input tensor in place if `dtype=torch.long` is specified, and the returned result will be of type `torch.int32`

Note: The scope of this feature is currently limited to partitioning-enabled models (`require_full_compilation=False`) having `truncate_long_and_double=True`, since the feature prepends a Torch-executed block to the graph which performs the necessary casts, and thus requires both partitioning and truncation. A standalone sketch of this cast-preparation idea follows below.

Fixes #1546
Addresses most of #1543
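For intuition, here is a minimal standalone sketch of what the prepended Torch-executed cast block accomplishes; the helper name `prepend_input_casts` is hypothetical, and the real pass operates on the TorchScript graph rather than on eager tensors.

```python
import torch

def prepend_input_casts(inputs, target_dtypes):
    # Cast each graph input to the dtype its first consuming engine expects
    return [
        t if t.dtype == d else t.to(d)
        for t, d in zip(inputs, target_dtypes)
    ]

# An Int64 index tensor is downcast to Int32 before entering a TRT engine:
idx = torch.arange(16, dtype=torch.long).reshape(4, 4)
(casted,) = prepend_input_casts([idx], [torch.int32])
assert casted.dtype == torch.int32
```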