feat: support int64 <=> int32 auto conversion #1407
Conversation
Signed-off-by: Bo Wang <bowa@nvidia.com>
Code conforms to C++ style guidelines
Code conforms to Python style guidelines
Code conforms to Python style guidelines
Code conforms to C++ style guidelines
core/partitioning/shape_analysis.cpp
Outdated
}
}
}
// TODO: This part might be necessary for some model, now checkint to verify
@bowang007 should this be uncommented?
Yes, this part has been optimized and refactored, and it is now included.
@inocsin Can you verify these changes on key models?
Seems like tests are failing for partitioning?
Sure, we have asked users to test this PR with their models.
Signed-off-by: Bo Wang <bowa@nvidia.com>
Code conforms to C++ style guidelines
Code conforms to Python style guidelines
Signed-off-by: Bo Wang <bowa@nvidia.com>
Code conforms to Python style guidelines
There are some changes that do not conform to C++ style guidelines:
diff --git a/home/runner/work/TensorRT/TensorRT/tests/core/partitioning/test_type_auto_conversion.cpp b/tmp/changes.txt
index ad68c0a..d7b7e23 100644
--- a/home/runner/work/TensorRT/TensorRT/tests/core/partitioning/test_type_auto_conversion.cpp
+++ b/tmp/changes.txt
@@ -12,8 +12,8 @@ bool checkInsertedCastNodeNumber(torch_tensorrt::core::partitioning::SegmentedBl
cnt++;
}
}
- std::cout << "Found count of " << cnt << " inserted aten::to nodes, (looking for " << target_count << " aten::to nodes)"
- << std::endl;
+ std::cout << "Found count of " << cnt << " inserted aten::to nodes, (looking for " << target_count
+ << " aten::to nodes)" << std::endl;
return target_count == cnt;
}
@@ -61,7 +61,6 @@ TEST(Partitioning, ExplicitNodeAutoConversionCorrectly) {
LOG_DEBUG(seg_block << " cur seg block");
}
ASSERT_TRUE(checkInsertedCastNodeNumber(segmented_blocks[1], 2));
-
}
TEST(Partitioning, ImplicitAutoConversionCorrectly) {
@@ -105,5 +104,3 @@ TEST(Partitioning, ImplicitAutoConversionCorrectly) {
}
ASSERT_TRUE(checkInsertedCastNodeNumber(segmented_blocks[1], 2));
}
-
-
ERROR: Some files do not conform to style guidelines
When compiling BART (https://huggingface.co/facebook/bart-base) using Torch-TensorRT, this PR currently segfaults on my machine. Will add an additional comment with the line that is causing the issue.
Code conforms to Python style guidelines
With the suggested edits, the PR is functioning for the BART model and successfully casts Long tensors to Int tensors. Suggestions are related to bugs arising from input vs. output checking and the usage of different aten::to schemas.
core/partitioning/shape_analysis.cpp
Outdated
auto const_zero = g->insertConstant(0);
const_zero->setType(torch::jit::BoolType::get());
auto none_val = g->insertNode(g->createNone())->output();
cast_node = g->create(torch::jit::aten::to, {g->inputs()[index], const_type, const_zero, const_zero, none_val});
Add an if/else here to use g->inputs() if is_input is true, otherwise use g->outputs().
core/partitioning/shape_analysis.cpp
Outdated
}

torch::jit::Node* createCastNode(SegmentedBlock& seg_block, size_t index, bool is_input) {
  torch::jit::Node* cast_node = getUpstreamCastNode(seg_block.raw_inputs()[index]);
Add an if/else here to use raw_inputs() if is_input is true, otherwise use raw_outputs().
core/partitioning/shape_analysis.cpp
Outdated
// if we can find upstream aten::to node, we use it's parameters for creating new cast node
if (cast_node) {
  std::unordered_map<torch::jit::Value*, torch::jit::Value*> value_map;
  value_map.insert({cast_node->inputs()[0], g->inputs()[index]});
May need an if/else here to check whether the insert should use g->inputs()[index] or g->outputs()[index].
core/partitioning/shape_analysis.cpp
Outdated
// auto cast_node = g->prependNode(g->create(torch::jit::aten::to, {g->inputs()[i], const_type, const_zero,
// const_zero, none_val})); seg_block.inputs()[i]->replaceAllUsesAfterNodeWith(cast_node,
// cast_node->outputs()[0]); LOG_DEBUG(seg_block << " in shape analysis");
Consider removing the commented-out code if it is not needed.
core/partitioning/shape_analysis.cpp
Outdated
if (!is_input) {
  // if this value is output, we need to cast it to int32
  auto const_val = g->insertConstant(3);
  value_map.insert({cast_node->inputs()[1], const_val});
This throws an error when the upstream aten::to node does not have dtype as its second argument. For example, the schema aten::to.prim_Device(Tensor(a) self, Device? device, int? dtype=None, bool non_blocking=False, bool copy=False) -> Tensor(b|a) has Device as its second value, and this insertion causes it to be transformed into an invalid schema. We need to differentiate between schemas to ensure the dtype is placed in the right position. It seems that valid schemas for aten::to have dtype as either the second or third argument, or not at all. I believe there should be a check in getUpstreamCastNode to see if dtype is any of the arguments, and then a second check here to see whether it is the second or third argument in the schema.
The check here could be something like an if/else on the type kind at the second index, as in:
if (cast_node->inputs()[1]->node()->output()->type()->kind() == torch::jit::TypeKind::DeviceObjType) {
  value_map.insert({cast_node->inputs()[2], const_val});
} else {
  value_map.insert({cast_node->inputs()[1], const_val});
}
core/partitioning/shape_analysis.cpp
Outdated
auto cur_val = q.front();
q.pop();
auto node = cur_val->node();
if (node->kind().toQualString() == std::string("aten::to")) {
May need an additional check to ensure that the aten::to schema is valid for dtype insertion, as some of these schemas do not take an integer dtype at all, for example:
aten::to(Tensor(a) self, bool non_blocking=False, bool copy=False) -> Tensor(b|a)
aten::to(Tensor(a) self, Device device, ScalarType dtype, bool non_blocking=False, bool copy=False, MemoryFormat? memory_format=None) -> Tensor(a)
aten::to(Tensor(a) self, Tensor other, bool non_blocking=False, bool copy=False, MemoryFormat? memory_format=None) -> Tensor(a)
A check could be something like an additional && with:
(node->inputs()[1]->node()->output()->type()->kind() == torch::jit::TypeKind::IntType) ||
(node->inputs()[2]->node()->output()->type()->kind() == torch::jit::TypeKind::IntType)
Hey @gs-olive, any reproducer for this?
What I'm not sure about is that, for the getUpstreamCastNode() function, when we pass in an int32 value, will the first cast node found be the cast node that casts this value to int64? If that's the case, then we don't need this check.
In other words, is it possible that the first cast node involving the passed value casts some other value? And if the first cast node is not the cast node that casts to int64, will the second cast node be what we want?
Hi @bowang007 - as an update, while this is no longer throwing an error on my end, my thought was that we do need this check you have, but maybe it should be more stringent - something like:
if ((node->kind().toQualString() == std::string("aten::to")) &&
    ((node->inputs()[1]->node()->output()->type()->kind() == torch::jit::TypeKind::IntType) ||
     (node->inputs()[2]->node()->output()->type()->kind() == torch::jit::TypeKind::IntType))) {
This is because, in the case where the aten::to matches the second schema in my comment above, inserting a constant like 3 will cause the model to fail, as that schema for to requires a ScalarType and not an int. I don't have a specific model to reproduce an error with, and I do not think I encountered one while testing; I just think it is generally safer to be more strict about the type of upstream cast node used to recast to Int32. Specifically, if we are unsure whether a node has a valid schema for repurposing, we should choose the safer option, which is to manually insert an Int32 cast node, as you do in createCastNode.
@bowang007 Please let me know what you think about the comment in the thread above:
#1407 (comment)
@gs-olive I got your point now, let me update this part.
Fixes #1346
Signed-off-by: Bo Wang <bowa@nvidia.com>
Code conforms to Python style guidelines
Code conforms to Python style guidelines
One additional bugfix requested and one minor optional comment, and then the PR is successful when used for compilation + inference on BART.
core/partitioning/shape_analysis.cpp
Outdated
torch::jit::Node* createCastNode(SegmentedBlock& seg_block, size_t index, bool is_input) {
  auto cast_raw_value = is_input ? seg_block.raw_inputs()[index] : seg_block.raw_outputs()[index];
  auto cast_subgraph_value = is_input ? seg_block.outputs()[index] : seg_block.outputs()[index];
Please update this line to:
auto cast_subgraph_value = is_input ? seg_block.inputs()[index] : seg_block.outputs()[index];
Currently, it is using the outputs regardless of the truth value of is_input. With this change, the PR (used along with PR #1416) is working for compilation + inference with the BART model.
core/partitioning/shape_analysis.cpp
Outdated
auto cur_val = q.front(); | ||
q.pop(); | ||
auto node = cur_val->node(); | ||
if (node->kind().toQualString() == std::string("aten::to")) { |
@bowang007 Please let me know what you think about the comment in the thread above:
#1407 (comment)
Signed-off-by: Bo Wang <bowa@nvidia.com>
Code conforms to C++ style guidelines
Code conforms to Python style guidelines
Looks good! Fully functional now on BART!
Signed-off-by: Bo Wang bowa@nvidia.com
Description
Support int64 <=> int32 type conversion.
Fixes #1382
Type of change
Checklist: