
Auto-generate custom layer names #421

Closed
wants to merge 1 commit

Conversation

@alsrgv (Contributor) commented Sep 27, 2020

Here's how the layers look in the TRT profiler:

[CONSTANT #11] torch.zeros((1, 17, 1800, 1800, 2), device=cuda:0, dtype=torch.float32): 0.004064ms
[CONSTANT #12] torch.tensor([3], dtype=torch.int32, device=cuda:0): 0.002752ms
output_1_broadcast: 0.043008ms
output_11_broadcast: 84.5632ms
[CONVOLUTION #1] torch.nn.Conv2d.forward(Conv2d(3, 64, kernel_size=(7, 7), stride=(2, 2), padding=(3, 3), bias=False), tensor(shape=[1, 3, 1800, 1800], dtype=torch.float32)) + [RELU #1] torch.nn.ReLU.forward(ReLU(inplace=True), tensor(shape=[1, 64, 900, 900], dtype=torch.float32)): 4.36298ms
[MAX #1] torch.nn.functional.max_pool2d(tensor(shape=[1, 64, 900, 900], dtype=torch.float32), 3, 2, 1, 1, False, False): 1.88826ms
[CONVOLUTION #2] torch.nn.Conv2d.forward(Conv2d(64, 64, kernel_size=(1, 1), stride=(1, 1), bias=False), tensor(shape=[1, 64, 450, 450], dtype=torch.float32)) + [RELU #2] torch.nn.ReLU.forward(ReLU(inplace=True), tensor(shape=[1, 64, 450, 450], dtype=torch.float32)): 0.858112ms
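
Per-layer timings like these can be collected with a TensorRT IProfiler attached to the execution context. A minimal sketch, assuming model_trt is a torch2trt TRTModule that exposes its IExecutionContext as model_trt.context and x is a sample input on the GPU (both names are assumptions for illustration, not part of this PR):

import tensorrt as trt

class LayerTimeProfiler(trt.IProfiler):
    def __init__(self):
        trt.IProfiler.__init__(self)
        self.times = {}

    def report_layer_time(self, layer_name, ms):
        # TensorRT calls this once per layer per inference with the layer's
        # runtime in milliseconds; accumulate by (auto-generated) layer name.
        self.times[layer_name] = self.times.get(layer_name, 0.0) + ms

profiler = LayerTimeProfiler()
model_trt.context.profiler = profiler  # assumes TRTModule exposes .context
model_trt(x)  # run one inference so the profiler callback fires
for name, ms in sorted(profiler.times.items(), key=lambda kv: -kv[1]):
    print("%s: %.6fms" % (name, ms))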

@alsrgv (Contributor, Author) commented Sep 27, 2020

@jaybdub, what do you think about the format in the description?

@jaybdub (Contributor) commented Oct 5, 2020

Hi @alsrgv,

Apologies for the delayed response, I was offline last week.

This looks pretty good to me!

Some things that may be nice to add (just listing these here as a note; they could come at a separate time):

  1. Argument names for methods, like torch.nn.functional.max_pool2d(..., kernel_size=3, stride=2, ...) (see the rough sketch after this list).
  2. The current module name, like "model.backbone".
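
A possible direction for item 1, sketched below; the named_method_args helper and its fallback are illustrative only, not part of this PR. It binds the recorded positional arguments to their parameter names with inspect.signature before formatting the layer name:

import inspect

def named_method_args(fn, args, kwargs):
    # Map recorded positional args to parameter names so the layer name can
    # show e.g. kernel_size=3 instead of a bare 3.
    try:
        bound = inspect.signature(fn).bind(*args, **kwargs)
        return dict(bound.arguments)
    except (TypeError, ValueError):
        # Some torch functions are implemented in C and expose no
        # introspectable signature; fall back to positional formatting.
        return None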

I will try to run the tests on the PR soon to validate / merge.

Please let me know if this PR is good from your perspective, or if you feel anything is missing for your use cases.

Best,
John

@alsrgv (Contributor, Author) commented Oct 6, 2020

It's ready to merge from our side. It actually does record kwargs automatically; you can see that Conv2d has kernel_size=(7, 7). The current module name is not recorded; I think we could add it later.

@jaybdub (Contributor) commented Oct 14, 2020

Hey @alsrgv,

Sorry for the delay in testing this. I just tested, and all the unit tests pass, but it's failing for resnet18 (and probably other multi-layer networks).

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/workdir/torch2trt/torch2trt/torch2trt.py", line 535, in torch2trt
    outputs = module(*inputs)
  File "/opt/conda/lib/python3.6/site-packages/torch/nn/modules/module.py", line 722, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/opt/conda/lib/python3.6/site-packages/torchvision/models/resnet.py", line 200, in _forward
    x = self.relu(x)
  File "/opt/conda/lib/python3.6/site-packages/torch/nn/modules/module.py", line 722, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/workdir/torch2trt/torch2trt/torch2trt.py", line 284, in wrapper
    converter["converter"](ctx)
  File "/workdir/torch2trt/torch2trt/converters/ReLU.py", line 10, in convert_ReLU
    input=input_trt, type=trt.ActivationType.RELU)
  File "/workdir/torch2trt/torch2trt/torch2trt.py", line 355, in wrapper
    self._set_layer_name(ret)
  File "/workdir/torch2trt/torch2trt/torch2trt.py", line 343, in _set_layer_name
    self._layer_counts[layer.type] += 1
TypeError: __eq__(): incompatible function arguments. The following argument types are supported:
    1. (self: tensorrt.tensorrt.LayerType, arg0: tensorrt.tensorrt.LayerType) -> bool

Invoked with: LayerType.CONVOLUTION, ActivationType.RELU

I'm not sure exactly what's happening; it seems like layer.type for an activation layer returns an ActivationType rather than a LayerType, which causes an error when comparing the two in the dictionary lookup.

I'll see if I can find a quick fix, but I'm curious whether you've hit this error yet.

Best,
John

@jaybdub (Contributor) commented Oct 14, 2020

It seems the TensorRT Python API does not return a tensorrt.LayerType for all layers, and the comparison can't be made across the different enum types because they don't implement a comparator for each other.

The simplest workaround I found is to use str(layer.type) as the key. This returns "LayerType.CONVOLUTION"; we could also strip the prefix to get just "CONVOLUTION". Here's the code I'm using:

from collections import defaultdict

import tensorrt as trt
import torch


def layer_type_str(layer):
    # Use the string form of the layer type as the dictionary key; comparing
    # a LayerType against an ActivationType directly raises a TypeError.
    return str(layer.type).split('.')[-1]


class LayerNamingNetworkWrapper(object):
    """Wraps an INetworkDefinition and names every layer it creates."""

    def __init__(self, ctx, network):
        self._ctx = ctx
        self._network = network
        self._layer_counts = defaultdict(lambda: 0)

    def _set_layer_name(self, layer):
        def arg_str(arg):
            if isinstance(arg, torch.Tensor):
                return "tensor(shape=%s, dtype=%s)" % (str(list(
                    arg.shape)), str(arg.dtype))
            return str(arg)

        self._layer_counts[layer_type_str(layer)] += 1
        args = [arg_str(arg) for arg in self._ctx.method_args]
        kwargs = [
            "%s=%s" % (key, arg_str(arg))
            for key, arg in self._ctx.method_kwargs.items()
        ]
        layer.name = "[%s #%d] %s(%s)" % (
            layer.type.name, self._layer_counts[layer_type_str(layer)],
            self._ctx.method_str, ", ".join(args + kwargs))

    def __getattr__(self, name):
        # Forward everything to the wrapped network; whenever a forwarded
        # call returns an ILayer (e.g. add_convolution), name it.
        attr = getattr(self._network, name)
        if callable(attr):

            def wrapper(*args, **kwargs):
                ret = attr(*args, **kwargs)
                if isinstance(ret, trt.ILayer):
                    self._set_layer_name(ret)
                return ret

            return wrapper
        else:
            return attr

It produces similar names:

>>> model_trt.network.get_layer(0).name
'[CONVOLUTION #1] torch.nn.Conv2d.forward(Conv2d(3, 64, kernel_size=(7, 7), stride=(2, 2), padding=(3, 3), bias=False), tensor(shape=[1, 3, 224, 224], dtype=torch.float32))'
>>> model_trt.network.get_layer(1).name
'[SCALE #1] torch.nn.BatchNorm2d.forward(BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True), tensor(shape=[1, 64, 112, 112], dtype=torch.float32))'
>>> model_trt.network.get_layer(2).name
'[RELU #1] torch.nn.ReLU.forward(ReLU(inplace=True), tensor(shape=[1, 64, 112, 112], dtype=torch.float32))'
>>> model_trt.network.get_layer(3).name
'[MAX #1] torch.nn.functional.max_pool2d(tensor(shape=[1, 64, 112, 112], dtype=torch.float32), 3, 2, 1, 1, False, False)'
>>> model_trt.network.get_layer(4).name
'[CONVOLUTION #2] torch.nn.Conv2d.forward(Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False), tensor(shape=[1, 64, 56, 56], dtype=torch.float32))'
>>> model_trt.network.get_layer(5).name
'[SCALE #2] torch.nn.BatchNorm2d.forward(BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True), tensor(shape=[1, 64, 56, 56], dtype=torch.float32))'
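
For convenience, all of the generated names can also be dumped with a short loop over the built network, using the same model_trt.network handle as in the session above (num_layers and get_layer are standard INetworkDefinition members):

# Print every auto-generated layer name in network order.
for i in range(model_trt.network.num_layers):
    print(model_trt.network.get_layer(i).name)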

If this fix is good on your end, I'll incorporate it and merge.

Best,
John

@alsrgv (Contributor, Author) commented Oct 16, 2020

Hey John, thanks for the fix -- this LGTM, looking forward to the merge.

@jaybdub mentioned this pull request Oct 19, 2020
@jaybdub (Contributor) commented Oct 19, 2020

This PR is now resolved in master. Thanks!

@jaybdub closed this Oct 19, 2020