
[fix]: fix bug in aten::to, when network only have aten::to layer wil… #1108

Merged

2 commits merged into pytorch:master on Jul 22, 2022

Conversation

inocsin
Contributor

@inocsin inocsin commented Jun 11, 2022

…l change input name

Signed-off-by: inocsin vcheungyi@163.com

Description

When (1) the network has only an aten::to layer, or (2) the output of aten::to is the same tensor as its input and that input is a network input, the input tensor's name is changed, which causes an error.

import torch
import torch.nn as nn

class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()

    def forward(self, data, index):
        index = index.to(torch.int64)  # in TRT, output == input
        src = 1
        data = data.scatter_(1, index, src)  # runs in Torch
        return data
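The failure mode can be sketched in plain Python (a hypothetical simulation of the renaming behavior, not the actual Torch-TensorRT code): when the converter for aten::to returns its input tensor unchanged and the conversion context then unconditionally renames that tensor to the JIT output value's debug name, the engine's input binding name is lost.

```python
# Hypothetical simulation of the renaming bug (not actual Torch-TensorRT code).

class FakeITensor:
    """Stands in for nvinfer1::ITensor, which carries a single mutable name."""
    def __init__(self, name):
        self.name = name

def convert_aten_to(tensor, same_dtype=True):
    # When the requested dtype matches, the converter returns the input
    # tensor itself instead of adding a cast layer.
    return tensor if same_dtype else FakeITensor("casted")

# Network input registered under the Torch-TensorRT binding convention.
inp = FakeITensor("input_0")

out = convert_aten_to(inp)
# Buggy behavior: unconditionally rename the converter output to the
# JIT output value's debug name ("4" in the debug log).
out.name = "4"

# The output is the same object as the input, so the binding "input_0"
# no longer exists and engine binding lookup later fails.
print(inp.name)  # prints "4"
```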

Fixes # (issue)

Type of change

Please delete options that are not relevant and/or add your own.

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • This change requires a documentation update

Checklist:

  • My code follows the style guidelines of this project (You can use the linters)
  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas and hacks
  • I have made corresponding changes to the documentation
  • I have added tests to verify my fix or my feature
  • New and existing unit tests pass locally with my changes
  • I have added the relevant labels to my PR so that relevant reviewers are notified

…l change input name

Signed-off-by: inocsin <vcheungyi@163.com>
@github-actions github-actions bot added component: conversion Issues re: Conversion stage component: converters Issues re: Specific op converters component: core Issues re: The core compiler component: tests Issues re: Tests labels Jun 11, 2022
@inocsin
Contributor Author

inocsin commented Jun 11, 2022

@narendasan please review this change

@narendasan
Collaborator

This seems fine to me, but I think it should be part of a more comprehensive change to catch this class of error. cc: @bowang007

@bowang007
Collaborator

Looks like this issue is related to #982. Both are triggered by changing the names of ITensors.
I'm wondering whether other converters have similar issues, as we discussed @narendasan .
If we introduce some kind of detection mechanism to prevent renaming ITensors, then this change would be unnecessary.
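One way such a detection mechanism could look (a hedged Python sketch under my own assumptions, not the actual Torch-TensorRT implementation): refuse to rename any ITensor that is already registered as a network input.

```python
# Hypothetical sketch of a rename guard (not actual Torch-TensorRT code).

class FakeITensor:
    def __init__(self, name):
        self.name = name

class ConversionCtx:
    def __init__(self):
        self.input_ids = set()  # identities of registered network inputs

    def add_input(self, tensor):
        self.input_ids.add(id(tensor))

    def record_output_name(self, tensor, name):
        # Guard: never rename a tensor that is a registered network input.
        # (A real fix could instead insert an identity layer here.)
        if id(tensor) in self.input_ids:
            return tensor.name
        tensor.name = name
        return name

ctx = ConversionCtx()
inp = FakeITensor("input_0")
ctx.add_input(inp)

# aten::to with a matching dtype returns its input unchanged; the guard
# keeps the rename from clobbering the input binding name.
ctx.record_output_name(inp, "4")
print(inp.name)  # prints "input_0"
```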

@bowang007
Collaborator

bowang007 commented Jun 22, 2022

what's the error message that you have now? @inocsin
I'm seeing a segmentation fault.

@inocsin
Contributor Author

inocsin commented Jun 26, 2022

what's the error message that you have now? @inocsin I'm seeing a segmentation fault.

The error message is below. The input named input_0 is renamed to the output value's name, 4, so the engine binding lookup fails.

DEBUG: [Torch-TensorRT - Debug Build] - Running JIT version
DEBUG: [Torch-TensorRT - Debug Build] - Running TRT version
DEBUG: [Torch-TensorRT - Debug Build] - Pairing 0: y.1 : Input(shape: [3], dtype: Float32, format: NCHW\Contiguous\Linear)
INFO: [Torch-TensorRT TorchScript Conversion Context] - [MemUsageChange] Init CUDA: CPU +318, GPU +0, now: CPU 3018, GPU 1632 (MiB)
INFO: [Torch-TensorRT - Debug Build] - Settings requested for TensorRT engine:
    Enabled Precisions: Float32
    TF32 Floating Point Computation Enabled: 1
    Truncate Long and Double: 0
    Make Refittable Engine: 0
    Debuggable Engine: 0
    GPU ID: 0
    Allow GPU Fallback (if running on DLA): 0
    Min Timing Iterations: 2
    Avg Timing Iterations: 1
    Max Workspace Size: 1073741824
    Device Type: GPU
    GPU ID: 0
    Engine Capability: standard
    Calibrator Created: 0
INFO: [Torch-TensorRT TorchScript Conversion Context] - Converting Block
DEBUG: [Torch-TensorRT TorchScript Conversion Context] - graph(%y.1 : Tensor):
  %1 : int = prim::Constant[value=6]()
  %2 : bool = prim::Constant[value=0]()
  %3 : NoneType = prim::Constant()
  %4 : Tensor = aten::to(%y.1, %1, %2, %2, %3)
  return (%4)

DEBUG: [Torch-TensorRT TorchScript Conversion Context] - Input Dimension Specs: {
    y.1 : Input(shape: [3], dtype: Float32, format: NCHW\Contiguous\Linear),}
INFO: [Torch-TensorRT TorchScript Conversion Context] - Adding Input y.1 (named: input_0): Input(shape: [3], dtype: Float32, format: NCHW\Contiguous\Linear) in engine (conversion.AddInputs)
DEBUG: [Torch-TensorRT TorchScript Conversion Context] - Evaluating %1 : int = prim::Constant[value=6]()
DEBUG: [Torch-TensorRT TorchScript Conversion Context] - Found the value to be: 6
DEBUG: [Torch-TensorRT TorchScript Conversion Context] - Evaluating %2 : bool = prim::Constant[value=0]()
DEBUG: [Torch-TensorRT TorchScript Conversion Context] - Found the value to be: False
DEBUG: [Torch-TensorRT TorchScript Conversion Context] - Evaluating %3 : NoneType = prim::Constant()
DEBUG: [Torch-TensorRT TorchScript Conversion Context] - Found the value to be: None
INFO: [Torch-TensorRT TorchScript Conversion Context] - Adding Layer %4 : Tensor = aten::to(%y.1, %1, %2, %2, %3) (ctx.AddLayer)
DEBUG: [Torch-TensorRT TorchScript Conversion Context] - Node input is an already converted tensor
DEBUG: [Torch-TensorRT TorchScript Conversion Context] - Node input is a result of a previously evaluated value
DEBUG: [Torch-TensorRT TorchScript Conversion Context] - Node input is a result of a previously evaluated value
DEBUG: [Torch-TensorRT TorchScript Conversion Context] - Node input is a result of a previously evaluated value
DEBUG: [Torch-TensorRT TorchScript Conversion Context] - Node input is a result of a previously evaluated value
DEBUG: [Torch-TensorRT - Debug Build] - ITensor shape: [3]
DEBUG: [Torch-TensorRT - Debug Build] - ITensor type: Float32
DEBUG: [Torch-TensorRT - Debug Build] - [aten::to.dtype] Output tensor shape: [3]
DEBUG: [Torch-TensorRT - Debug Build] - One of the inputs named 4 to the network is marked as an output tensor. Applying an identity layer and marking this tensor as output
INFO: [Torch-TensorRT TorchScript Conversion Context] - Marking Output 4 named output_0 in engine (ctx.MarkOutput)
INFO: [Torch-TensorRT TorchScript Conversion Context] - [MemUsageSnapshot] Builder begin: CPU 3018 MiB, GPU 1632 MiB
DEBUG: [Torch-TensorRT TorchScript Conversion Context] - Applying generic optimizations to the graph for inference.
DEBUG: [Torch-TensorRT TorchScript Conversion Context] - Original: 1 layers
DEBUG: [Torch-TensorRT TorchScript Conversion Context] - After dead-layer removal: 1 layers
DEBUG: [Torch-TensorRT TorchScript Conversion Context] - After Myelin optimization: 1 layers
DEBUG: [Torch-TensorRT TorchScript Conversion Context] - After scale fusion: 1 layers
DEBUG: [Torch-TensorRT TorchScript Conversion Context] - After vertical fusions: 1 layers
DEBUG: [Torch-TensorRT TorchScript Conversion Context] - After dupe layer removal: 1 layers
DEBUG: [Torch-TensorRT TorchScript Conversion Context] - After final dead-layer removal: 1 layers
DEBUG: [Torch-TensorRT TorchScript Conversion Context] - After tensor merging: 1 layers
DEBUG: [Torch-TensorRT TorchScript Conversion Context] - After concat removal: 1 layers
DEBUG: [Torch-TensorRT TorchScript Conversion Context] - Graph construction and optimization completed in 0.0130252 seconds.
DEBUG: [Torch-TensorRT TorchScript Conversion Context] - Using cublasLt a tactic source
WARNING: [Torch-TensorRT TorchScript Conversion Context] - TensorRT was linked against cuBLAS/cuBLAS LT 11.5.1 but loaded cuBLAS/cuBLAS LT 11.4.2
INFO: [Torch-TensorRT TorchScript Conversion Context] - [MemUsageChange] Init cuBLAS/cuBLASLt: CPU +322, GPU +166, now: CPU 3340, GPU 1798 (MiB)
DEBUG: [Torch-TensorRT TorchScript Conversion Context] - Using cuDNN as a tactic source
INFO: [Torch-TensorRT TorchScript Conversion Context] - [MemUsageChange] Init cuDNN: CPU +454, GPU +204, now: CPU 3794, GPU 2002 (MiB)
WARNING: [Torch-TensorRT TorchScript Conversion Context] - Detected invalid timing cache, setup a local cache instead
DEBUG: [Torch-TensorRT TorchScript Conversion Context] - Constructing optimization profile number 0 [1/1].
DEBUG: [Torch-TensorRT TorchScript Conversion Context] - *************** Autotuning format combination: Float(1) -> Float(1) ***************
DEBUG: [Torch-TensorRT TorchScript Conversion Context] - --------------- Timing Runner: (Unnamed Layer* 0) [Identity] (Cast)
DEBUG: [Torch-TensorRT TorchScript Conversion Context] - Cast has no valid tactics for this config, skipping
DEBUG: [Torch-TensorRT TorchScript Conversion Context] - --------------- Timing Runner: (Unnamed Layer* 0) [Identity] (Reformat)
DEBUG: [Torch-TensorRT TorchScript Conversion Context] - Tactic: 1002 Time: 0.011776
DEBUG: [Torch-TensorRT TorchScript Conversion Context] - Tactic: 0 Time: 0.006272
DEBUG: [Torch-TensorRT TorchScript Conversion Context] - Fastest Tactic: 0 Time: 0.006272
DEBUG: [Torch-TensorRT TorchScript Conversion Context] - >>>>>>>>>>>>>>> Chose Runner Type: Reformat Tactic: 0
DEBUG: [Torch-TensorRT TorchScript Conversion Context] - Formats and tactics selection completed in 0.00822353 seconds.
DEBUG: [Torch-TensorRT TorchScript Conversion Context] - After reformat layers: 1 layers
DEBUG: [Torch-TensorRT TorchScript Conversion Context] - Block size 1073741824
DEBUG: [Torch-TensorRT TorchScript Conversion Context] - Total Activation Memory: 1073741824
INFO: [Torch-TensorRT TorchScript Conversion Context] - Detected 1 inputs and 1 output network tensors.
DEBUG: [Torch-TensorRT TorchScript Conversion Context] - Layer: (Unnamed Layer* 0) [Identity] HostPersistent: 0 DevicePersistent: 0
INFO: [Torch-TensorRT TorchScript Conversion Context] - Total Host Persistent Memory: 0
INFO: [Torch-TensorRT TorchScript Conversion Context] - Total Device Persistent Memory: 0
INFO: [Torch-TensorRT TorchScript Conversion Context] - Total Scratch Memory: 0
INFO: [Torch-TensorRT TorchScript Conversion Context] - [MemUsageStats] Peak memory usage of TRT CPU/GPU memory allocators: CPU 0 MiB, GPU 4 MiB
DEBUG: [Torch-TensorRT TorchScript Conversion Context] - Using cublasLt a tactic source
WARNING: [Torch-TensorRT TorchScript Conversion Context] - TensorRT was linked against cuBLAS/cuBLAS LT 11.5.1 but loaded cuBLAS/cuBLAS LT 11.4.2
INFO: [Torch-TensorRT TorchScript Conversion Context] - [MemUsageChange] Init cuBLAS/cuBLASLt: CPU +0, GPU +8, now: CPU 3794, GPU 2010 (MiB)
DEBUG: [Torch-TensorRT TorchScript Conversion Context] - Using cuDNN as a tactic source
INFO: [Torch-TensorRT TorchScript Conversion Context] - [MemUsageChange] Init cuDNN: CPU +0, GPU +8, now: CPU 3794, GPU 2018 (MiB)
INFO: [Torch-TensorRT TorchScript Conversion Context] - [MemUsageChange] Init cuBLAS/cuBLASLt: CPU +0, GPU +0, now: CPU 3794, GPU 2002 (MiB)
DEBUG: [Torch-TensorRT TorchScript Conversion Context] - Engine generation completed in 1.7908 seconds.
INFO: [Torch-TensorRT TorchScript Conversion Context] - [MemUsageChange] Init cuBLAS/cuBLASLt: CPU +0, GPU +0, now: CPU 3794, GPU 1984 (MiB)
DEBUG: [Torch-TensorRT TorchScript Conversion Context] - Engine Layer Information:
Layer(Reformat): (Unnamed Layer* 0) [Identity], Tactic: 0, 4[Float(3)] -> output_0[Float(3)]
INFO: [Torch-TensorRT TorchScript Conversion Context] - [MemUsageSnapshot] Builder end: CPU 3794 MiB, GPU 1984 MiB
DEBUG: [Torch-TensorRT - Debug Build] - Running TRT version
DEBUG: [Torch-TensorRT - Debug Build] - Target Device: Device(ID: 0, Name: Tesla T4, SM Capability: 7.5, Type: GPU)
DEBUG: [Torch-TensorRT - Debug Build] - Setting Device(ID: 0, Name: Tesla T4, SM Capability: 7.5, Type: GPU) as active device
INFO: [Torch-TensorRT - Debug Build] - [MemUsageChange] Init CUDA: CPU +0, GPU +0, now: CPU 3794, GPU 1984 (MiB)
INFO: [Torch-TensorRT - Debug Build] - Loaded engine size: 0 MB
INFO: [Torch-TensorRT - Debug Build] - [MemUsageSnapshot] deserializeCudaEngine begin: CPU 3794 MiB, GPU 1984 MiB
DEBUG: [Torch-TensorRT - Debug Build] - Using cublasLt a tactic source
WARNING: [Torch-TensorRT - Debug Build] - TensorRT was linked against cuBLAS/cuBLAS LT 11.5.1 but loaded cuBLAS/cuBLAS LT 11.4.2
INFO: [Torch-TensorRT - Debug Build] - [MemUsageChange] Init cuBLAS/cuBLASLt: CPU +0, GPU +10, now: CPU 3794, GPU 1994 (MiB)
DEBUG: [Torch-TensorRT - Debug Build] - Using cuDNN as a tactic source
INFO: [Torch-TensorRT - Debug Build] - [MemUsageChange] Init cuDNN: CPU +0, GPU +8, now: CPU 3794, GPU 2002 (MiB)
INFO: [Torch-TensorRT - Debug Build] - [MemUsageChange] Init cuBLAS/cuBLASLt: CPU +0, GPU +0, now: CPU 3794, GPU 1984 (MiB)
DEBUG: [Torch-TensorRT - Debug Build] - Deserialization required 25742 microseconds.
INFO: [Torch-TensorRT - Debug Build] - [MemUsageSnapshot] deserializeCudaEngine end: CPU 3794 MiB, GPU 1984 MiB
INFO: [Torch-TensorRT - Debug Build] - [MemUsageSnapshot] ExecutionContext creation begin: CPU 3794 MiB, GPU 1984 MiB
DEBUG: [Torch-TensorRT - Debug Build] - Using cublasLt a tactic source
WARNING: [Torch-TensorRT - Debug Build] - TensorRT was linked against cuBLAS/cuBLAS LT 11.5.1 but loaded cuBLAS/cuBLAS LT 11.4.2
INFO: [Torch-TensorRT - Debug Build] - [MemUsageChange] Init cuBLAS/cuBLASLt: CPU +0, GPU +10, now: CPU 3794, GPU 1994 (MiB)
DEBUG: [Torch-TensorRT - Debug Build] - Using cuDNN as a tactic source
INFO: [Torch-TensorRT - Debug Build] - [MemUsageChange] Init cuDNN: CPU +0, GPU +8, now: CPU 3794, GPU 2002 (MiB)
DEBUG: [Torch-TensorRT - Debug Build] - Total per-runner device memory is 0
DEBUG: [Torch-TensorRT - Debug Build] - Total per-runner host memory is 0
DEBUG: [Torch-TensorRT - Debug Build] - Allocated activation device memory of size 0
INFO: [Torch-TensorRT - Debug Build] - [MemUsageSnapshot] ExecutionContext creation end: CPU 3794 MiB, GPU 2002 MiB
DEBUG: [Torch-TensorRT - Debug Build] - Binding name: 4
INFO: [Torch-TensorRT - Debug Build] - [MemUsageChange] Init cuBLAS/cuBLASLt: CPU +0, GPU +0, now: CPU 3794, GPU 1984 (MiB)
unknown file: Failure
C++ exception with description "[Error thrown at core/runtime/TRTEngine.cpp:65] Expected delim != std::string::npos to be true but got false
Unable to determine binding index for input 4
Ensure module was compiled with Torch-TensorRT.ts or follows Torch-TensorRT Runtime conventions
" thrown in the test body.
[  FAILED  ] Converters.ATenToSingleConvertsCorrectly (7865 ms)
[----------] 1 test from Converters (7865 ms total)

[----------] Global test environment tear-down
[==========] 1 test from 1 test suite ran. (7865 ms total)
[  PASSED  ] 0 tests.
[  FAILED  ] 1 test, listed below:
[  FAILED  ] Converters.ATenToSingleConvertsCorrectly

 1 FAILED TEST
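The runtime failure at the end of the log can be sketched as follows (a hypothetical Python reconstruction inferred from the error text, not the actual TRTEngine.cpp code): the runtime expects binding names following the input_N/output_N convention and splits on the underscore delimiter to recover the index, so a bare name like "4" has no delimiter and triggers the check.

```python
# Hypothetical sketch of the runtime binding-name check, inferred from the
# "Expected delim != std::string::npos" error (not actual TRTEngine.cpp code).

def binding_index(name):
    delim = name.rfind("_")
    if delim == -1:  # corresponds to delim == std::string::npos in C++
        raise RuntimeError(
            f"Unable to determine binding index for input {name}\n"
            "Ensure module was compiled with Torch-TensorRT.ts or follows "
            "Torch-TensorRT Runtime conventions"
        )
    return int(name[delim + 1:])

print(binding_index("input_0"))   # prints 0
print(binding_index("output_0"))  # prints 0

try:
    binding_index("4")  # the renamed input from the log has no delimiter
except RuntimeError as e:
    print("fails as in the log:", e)
```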

@bowang007
Collaborator

Same error as #982.

@inocsin
Contributor Author

inocsin commented Jun 29, 2022

@bowang007 Deleting this line would also solve the problem: https://github.com/pytorch/TensorRT/blob/master/core/conversion/conversionctx/ConversionCtx.cpp#L133

@bowang007
Collaborator

Yes, we discussed this WAR in the channel.
However, I'm not sure whether this deletion would trigger other issues.

Collaborator

@bowang007 bowang007 left a comment


LGTM

@ncomly-nvidia ncomly-nvidia added the release: v1.2 Tagged to be included in v1.2 label Jul 15, 2022
@github-actions github-actions bot requested a review from bowang007 July 15, 2022 01:20
@narendasan narendasan added the Story: Binding Names Issues related to binding names, format and uniqueness label Jul 15, 2022
Signed-off-by: inocsin <vcheungyi@163.com>
@inocsin
Contributor Author

inocsin commented Jul 22, 2022

@dheerajperi reverted the change

@peri044 peri044 merged commit fc04d4a into pytorch:master Jul 22, 2022