chore: dynamic shape support for any/sort/trunc ops #3026

Merged
merged 2 commits into pytorch:main from sort_dynamic_shape on Jul 31, 2024

Conversation

keehyuna (Collaborator)

Description

dynamic shape support for any/sort/trunc ops

Fixes # (issue)

Type of change

  • New feature (non-breaking change which adds functionality)

Checklist:

  • My code follows the style guidelines of this project (You can use the linters)
  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas and hacks
  • I have made corresponding changes to the documentation
  • I have added tests to verify my fix or my feature
  • New and existing unit tests pass locally with my changes
  • I have added the relevant labels to my PR so that the relevant reviewers are notified

@keehyuna keehyuna self-assigned this Jul 21, 2024
@github-actions github-actions bot added component: tests Issues re: Tests component: conversion Issues re: Conversion stage component: converters Issues re: Specific op converters component: api [Python] Issues re: Python API component: dynamo Issues relating to the `torch.compile` or `torch._dynamo.export` paths labels Jul 21, 2024
@github-actions github-actions bot requested a review from apbose July 21, 2024 07:08
@keehyuna keehyuna marked this pull request as ready for review July 24, 2024 07:05
shuffle_layer.get_output(0),
TRT_TOPK_MAX_ELEMENT,
)
ctx.net.add_assertion(
keehyuna (Author):

When dynamic k > 3840, there is a segmentation fault rather than a runtime error. The error looks like the following:

Python error: Segmentation fault

Thread 0x00007fda56bfe640 (most recent call first):
File "/root/.pyenv/versions/3.10.14/lib/python3.10/threading.py", line 324 in wait
File "/root/.pyenv/versions/3.10.14/lib/python3.10/threading.py", line 607 in wait
File "/root/.pyenv/versions/3.10.14/lib/python3.10/site-packages/tqdm/_monitor.py", line 60 in run
File "/root/.pyenv/versions/3.10.14/lib/python3.10/threading.py", line 1016 in _bootstrap_inner
File "/root/.pyenv/versions/3.10.14/lib/python3.10/threading.py", line 973 in _bootstrap

Current thread 0x00007fdc02f65740 (most recent call first):
File "/root/trt/TensorRT/py/torch_tensorrt/dynamo/runtime/_PythonTorchTensorRTModule.py", line 287 in forward
...
Extension modules: numpy.core._multiarray_umath, ...
Segmentation fault

I tried to catch it with add_assertion(), but it doesn't work here, even though a simple standalone test of add_assertion() was fine.
Any idea how to use it? Or can we accept this behavior in order to enable a dynamic k value for sort?
There is no problem if the dynamic k value stays in range, as in the test code.
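For reference, a minimal sketch of roughly what the assertion attempt looked like, assuming k has already been reshaped to a 0-d tensor (k_scalar) and TRT_TOPK_MAX_ELEMENT is a 0-d constant; names are illustrative, not the exact PR code:

# Build the condition k <= TRT_TOPK_MAX_ELEMENT as NOT(k > max_k)
gt_layer = ctx.net.add_elementwise(
    k_scalar, max_k_const, trt.ElementWiseOperation.GREATER
)
not_layer = ctx.net.add_unary(gt_layer.get_output(0), trt.UnaryOperation.NOT)
# IAssertionLayer is supposed to fail engine execution with this message
# when the condition is false
ctx.net.add_assertion(
    not_layer.get_output(0),
    "k value exceeds the maximum supported by ITopKLayer (3840)",
)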

Collaborator (reviewer):

I think the issue is due to a TRT limitation. One possible solution is to fall back to PyTorch if k > 3840; we shouldn't surface errors caused by TRT.
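A hedged sketch of one way to force such a fallback from user code; the op name is illustrative, and the converter-side fix (a capability validator) is shown further below:

# Keep sort in PyTorch rather than converting it to TensorRT
trt_model = torch_tensorrt.compile(
    model,
    ir="dynamo",
    inputs=inputs,
    torch_executed_ops={"torch.ops.aten.sort.default"},
)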

keehyuna (Author):

Thanks for the review. As we discussed, I changed it to allow these ops only when the k value is static, because we cannot identify a dynamic k value at compile time.

shape=[1],
stride=[1],
)
set_layer_name(layer, target, name)
Collaborator (reviewer):

Please pass in source_ir and add a suffix to the layer's name.
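A minimal sketch of the requested pattern, assuming the usual set_layer_name converter helper; the "_slice" suffix is illustrative:

layer = ctx.net.add_slice(input_tensor, start=[0], shape=[1], stride=[1])
# Pass source_ir and make the layer name unique with a per-layer suffix
set_layer_name(layer, target, f"{name}_slice", source_ir)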

Comment on lines 171 to 176
set_layer_name(layer, target, name)

# Get scalar tensor from 1d tensor
shuffle_layer = ctx.net.add_shuffle(layer.get_output(0))
shuffle_layer.reshape_dims = trt.Dims()
set_layer_name(shuffle_layer, target, name, source_ir)
Collaborator (reviewer):

Please add a suffix to the layer's name.
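i.e., something like the following, with an illustrative suffix:

set_layer_name(shuffle_layer, target, f"{name}_shuffle", source_ir)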

@keehyuna keehyuna force-pushed the sort_dynamic_shape branch from d17af0e to e1491b0 Compare July 29, 2024 06:54
dim = node.args[1]
dim = get_positive_dim(dim, len(shape))
k = shape[dim]
if not isinstance(k, int):
keehyuna (Author):

If dim k is static and the other dims are dynamic, we can still support the sort ops.
This check validates whether k is static or not.
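A minimal sketch of the validator idea, assuming the shape comes from the tracer's tensor metadata; helper names are illustrative (the actual code uses get_positive_dim):

def sort_validator(node) -> bool:
    # Shape recorded by the dynamo tracer: ints for static dims,
    # symbolic values for dynamic dims.
    shape = node.args[0].meta["tensor_meta"].shape
    dim = node.args[1]
    dim = dim if dim >= 0 else dim + len(shape)
    k = shape[dim]
    # Convert only when k is a concrete int known at compile time.
    return isinstance(k, int)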

Collaborator (reviewer):

Thanks for the comment. Is it possible that k is -1?

keehyuna (Author):

Thanks for bringing this to my attention.
I checked that only dynamic dims are passed to export() in torch_tensorrt.dynamo.trace():
https://github.com/pytorch/TensorRT/blob/main/py/torch_tensorrt/dynamo/_tracer.py#L81
So I think a static dim value in the metadata will always be a valid value (> 0). If you think a static dim value from the metadata could be -1, I will update the validator to handle that case as well:

elif isinstance(k, int) and k < 0:
    return False

Sort(),
input_specs,
output_dtypes=[torch.float, torch.int64],
use_dynamo_tracer=True,
keehyuna (Author):

Tensor metadata is available when use_dynamo_tracer=True.
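For context, a minimal sketch of input specs that mark a dimension as dynamic in such a test; the shapes are illustrative:

input_specs = [
    torch_tensorrt.Input(
        min_shape=(1, 3),  # smallest expected size of the dynamic dim
        opt_shape=(4, 3),  # size TensorRT optimizes for
        max_shape=(8, 3),  # largest expected size
        dtype=torch.float,
    )
]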

@peri044 peri044 (Collaborator) left a comment:

LGTM

@peri044 peri044 merged commit 23b4f1e into pytorch:main Jul 31, 2024
54 of 61 checks passed