Traceback (most recent call last):
File "/home/validation/actions-runner/_work/training_extensions/training_extensions/.tox/tests-iseg-py310/bin/otx", line 8, in <module>
sys.exit(main())
File "/home/validation/actions-runner/_work/training_extensions/training_extensions/.tox/tests-iseg-py310/lib/python3.10/site-packages/otx/cli/tools/cli.py", line 77, in main
results = globals()[f"otx_{name}"]()
File "/home/validation/actions-runner/_work/training_extensions/training_extensions/.tox/tests-iseg-py310/lib/python3.10/site-packages/otx/cli/tools/optimize.py", line 146, in main
predicted_validation_dataset = task.infer(
File "/home/validation/actions-runner/_work/training_extensions/training_extensions/.tox/tests-iseg-py310/lib/python3.10/site-packages/otx/algorithms/detection/task.py", line 300, in infer
prediction_results, _ = self._infer_model(dataset, inference_parameters)
File "/home/validation/actions-runner/_work/training_extensions/training_extensions/.tox/tests-iseg-py310/lib/python3.10/site-packages/otx/algorithms/detection/adapters/mmdet/task.py", line 429, in _infer_model
eval_predictions = single_gpu_test(model, dataloader)
File "/home/validation/actions-runner/_work/training_extensions/training_extensions/.tox/tests-iseg-py310/lib/python3.10/site-packages/mmdet/apis/test.py", line 29, in single_gpu_test
result = model(return_loss=False, rescale=True, **data)
File "/home/validation/actions-runner/_work/training_extensions/training_extensions/.tox/tests-iseg-py310/lib/python3.10/site-packages/nncf/torch/dynamic_graph/wrappers.py", line 131, in wrapped
return module_call(self, *args, **kwargs)
File "/home/validation/actions-runner/_work/training_extensions/training_extensions/.tox/tests-iseg-py310/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1538, in _call_impl
result = forward_call(*args, **kwargs)
File "/home/validation/actions-runner/_work/training_extensions/training_extensions/.tox/tests-iseg-py310/lib/python3.10/site-packages/mmcv/parallel/data_parallel.py", line 51, in forward
return super().forward(*inputs, **kwargs)
File "/home/validation/actions-runner/_work/training_extensions/training_extensions/.tox/tests-iseg-py310/lib/python3.10/site-packages/torch/nn/parallel/data_parallel.py", line 169, in forward
return self.module(*inputs[0], **kwargs[0])
File "/home/validation/actions-runner/_work/training_extensions/training_extensions/.tox/tests-iseg-py310/lib/python3.10/site-packages/nncf/torch/nncf_network.py", line 886, in __call__
return ORIGINAL_CALL(self, *args, **kwargs)
File "/home/validation/actions-runner/_work/training_extensions/training_extensions/.tox/tests-iseg-py310/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "/home/validation/actions-runner/_work/training_extensions/training_extensions/.tox/tests-iseg-py310/lib/python3.10/site-packages/nncf/torch/nncf_network.py", line 906, in forward
retval = wrap_module_call(self.nncf._original_unbound_forward)(self, *args, **kwargs)
File "/home/validation/actions-runner/_work/training_extensions/training_extensions/.tox/tests-iseg-py310/lib/python3.10/site-packages/nncf/torch/dynamic_graph/wrappers.py", line 151, in wrapped
retval = module_call(self, *args, **kwargs)
File "/home/validation/actions-runner/_work/training_extensions/training_extensions/.tox/tests-iseg-py310/lib/python3.10/site-packages/mmcv/runner/fp16_utils.py", line 119, in new_func
return old_func(*args, **kwargs)
File "/home/validation/actions-runner/_work/training_extensions/training_extensions/.tox/tests-iseg-py310/lib/python3.10/site-packages/mmdet/models/detectors/base.py", line 174, in forward
return self.forward_test(img, img_metas, **kwargs)
File "/home/validation/actions-runner/_work/training_extensions/training_extensions/.tox/tests-iseg-py310/lib/python3.10/site-packages/mmdet/models/detectors/base.py", line 147, in forward_test
return self.simple_test(imgs[0], img_metas[0], **kwargs)
File "/home/validation/actions-runner/_work/training_extensions/training_extensions/.tox/tests-iseg-py310/lib/python3.10/site-packages/otx/algorithms/detection/adapters/mmdet/models/detectors/custom_maskrcnn_tile_optimized.py", line 198, in simple_test
x = self.extract_feat(img)
File "/home/validation/actions-runner/_work/training_extensions/training_extensions/.tox/tests-iseg-py310/lib/python3.10/site-packages/mmdet/models/detectors/two_stage.py", line 67, in extract_feat
x = self.backbone(img)
File "/home/validation/actions-runner/_work/training_extensions/training_extensions/.tox/tests-iseg-py310/lib/python3.10/site-packages/nncf/torch/dynamic_graph/wrappers.py", line 151, in wrapped
retval = module_call(self, *args, **kwargs)
File "/home/validation/actions-runner/_work/training_extensions/training_extensions/.tox/tests-iseg-py310/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1547, in _call_impl
hook_result = hook(self, args, result)
File "/home/validation/actions-runner/_work/training_extensions/training_extensions/.tox/tests-iseg-py310/lib/python3.10/site-packages/otx/algorithms/common/adapters/mmcv/hooks/recording_forward_hook.py", line 75, in _recording_forward
tensors = self.func(output)
File "/home/validation/actions-runner/_work/training_extensions/training_extensions/.tox/tests-iseg-py310/lib/python3.10/site-packages/otx/algorithms/detection/adapters/mmdet/hooks/det_class_probability_map_hook.py", line 177, in func
saliency_maps = self._get_saliency_maps_from_mask_predictions(feature_map, det_bboxes, det_labels)
File "/home/validation/actions-runner/_work/training_extensions/training_extensions/.tox/tests-iseg-py310/lib/python3.10/site-packages/otx/algorithms/detection/adapters/mmdet/hooks/det_class_probability_map_hook.py", line 206, in _get_saliency_maps_from_mask_predictions
mask_results = self._module.roi_head._mask_forward(x, mask_rois)
File "/home/validation/actions-runner/_work/training_extensions/training_extensions/.tox/tests-iseg-py310/lib/python3.10/site-packages/mmdet/models/roi_heads/standard_roi_head.py", line 186, in _mask_forward
mask_feats = self.mask_roi_extractor(
File "/home/validation/actions-runner/_work/training_extensions/training_extensions/.tox/tests-iseg-py310/lib/python3.10/site-packages/nncf/torch/dynamic_graph/wrappers.py", line 151, in wrapped
retval = module_call(self, *args, **kwargs)
File "/home/validation/actions-runner/_work/training_extensions/training_extensions/.tox/tests-iseg-py310/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "/home/validation/actions-runner/_work/training_extensions/training_extensions/.tox/tests-iseg-py310/lib/python3.10/site-packages/mmcv/runner/fp16_utils.py", line 208, in new_func
return old_func(*args, **kwargs)
File "/home/validation/actions-runner/_work/training_extensions/training_extensions/.tox/tests-iseg-py310/lib/python3.10/site-packages/mmdet/models/roi_heads/roi_extractors/single_level_roi_extractor.py", line 93, in forward
roi_feats_t = self.roi_layers[i](feats[i], rois_i)
File "/home/validation/actions-runner/_work/training_extensions/training_extensions/.tox/tests-iseg-py310/lib/python3.10/site-packages/nncf/torch/dynamic_graph/wrappers.py", line 151, in wrapped
retval = module_call(self, *args, **kwargs)
File "/home/validation/actions-runner/_work/training_extensions/training_extensions/.tox/tests-iseg-py310/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "/home/validation/actions-runner/_work/training_extensions/training_extensions/.tox/tests-iseg-py310/lib/python3.10/site-packages/mmcv/ops/roi_align.py", line 215, in forward
return roi_align(input, rois, self.output_size, self.spatial_scale,
File "/home/validation/actions-runner/_work/training_extensions/training_extensions/.tox/tests-iseg-py310/lib/python3.10/site-packages/torch/autograd/function.py", line 506, in apply
return super().apply(*args, **kwargs) # type: ignore[misc]
File "/home/validation/actions-runner/_work/training_extensions/training_extensions/.tox/tests-iseg-py310/lib/python3.10/site-packages/mmcv/ops/roi_align.py", line 95, in forward
ext_module.roi_align_forward(
RuntimeError: CUDA error: invalid configuration argument
CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
Compile with `TORCH_USE_CUDA_DSA` to enable device-side assertions.
NNCF relies on custom-wrapping the `forward` call in order to function properly.
Arbitrary adjustments to the forward function on an NNCFNetwork object have undefined behaviour.
If you need to replace the underlying forward function of the original model so that NNCF should be using that instead of the original forward function that NNCF saved during the compressed model creation, you can do this by calling:
model.nncf.set_original_unbound_forward(fn)
if `fn` has an unbound 0-th `self` argument, or
with model.nncf.temporary_bound_original_forward(fn): ...
if `fn` already had 0-th `self` argument bound or never had it in the first place.
2023-08-23 13:14:23,669 | INFO : ----------------- CustomMaskRCNN.load_state_dict_pre_hook() called w/ prefix:
2023-08-23 13:14:23,674 | INFO : ['rectangle', 'ellipse', 'triangle'] -> ['rectangle', 'ellipse', 'triangle'] ([0, 1, 2])
INFO:nncf:Loaded 1186/1186 parameters
2023-08-23 13:14:23,985 | INFO : ----------------- CustomMaskRCNN.load_state_dict_pre_hook() called w/ prefix:
2023-08-23 13:14:23,990 | INFO : ['rectangle', 'ellipse', 'triangle'] -> ['rectangle', 'ellipse', 'triangle'] ([0, 1, 2])
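The NNCF warning in the log above distinguishes a forward function whose 0-th `self` argument is still unbound from one that is already bound (or never had `self`). The plain-Python sketch below only illustrates that distinction. `TinyModel`, `doubled_forward`, and `plain_forward` are hypothetical names made up for illustration, and no NNCF call is executed here. The NNCF method names in the comments are quoted from the warning text, not verified against the NNCF API.

```python
import types


class TinyModel:
    """Hypothetical stand-in for a model; this is NOT an NNCFNetwork."""

    def __init__(self, scale):
        self.scale = scale

    def forward(self, x):
        return x * self.scale


def doubled_forward(self, x):
    # Unbound function: `self` is still an explicit 0-th parameter.
    # Per the warning text, this is the shape expected by
    # `model.nncf.set_original_unbound_forward(fn)`.
    return 2 * x * self.scale


def plain_forward(x):
    # Never had a `self` argument in the first place.
    return x + 1


model = TinyModel(scale=3)

# Calling the unbound function requires passing the instance explicitly.
unbound_result = doubled_forward(model, 5)

# Binding attaches the instance, so callers no longer pass `self`.
# Per the warning text, bound functions (and functions without `self`)
# go through `model.nncf.temporary_bound_original_forward(fn)` instead.
bound_fn = types.MethodType(doubled_forward, model)
bound_result = bound_fn(5)

print(unbound_result, bound_result, plain_forward(5))
```

A bound method exposes the attached instance as `bound_fn.__self__`, which is one way to check which of the two cases a given `fn` falls into.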
goodsong81 changed the title from "Torch2.0 CUDA runtime error during NNCF optimization of MaskRCNN at ROIAlign MMCV kernel" to "Torch2.0 CUDA runtime error during NNCF optimization of ROIAlign MMCV kernel for MaskRCNN" on Aug 24, 2023
Describe the bug
While verifying the torch upgrade from 1.13.1 to 2.0.1, one or more integration tests for NNCF optimization failed with a CUDA runtime error.
[Error log from CI run]
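The CUDA error in the log suggests rerunning with `CUDA_LAUNCH_BLOCKING=1`, because kernel errors can be reported asynchronously and the reported stack trace may point at the wrong call. A minimal sketch of one way to do that from Python follows. It assumes the failing command is relaunched as a subprocess, and the helper names `blocking_cuda_env` and `run_with_blocking_cuda` are hypothetical. The probe command only prints the variable; it is not the actual `otx` invocation, whose arguments are not given in this report.

```python
import os
import subprocess
import sys


def blocking_cuda_env(base_env=None):
    """Return a copy of the environment with synchronous CUDA launches enabled."""
    env = dict(os.environ if base_env is None else base_env)
    # The variable must be set before the child process initializes CUDA.
    env["CUDA_LAUNCH_BLOCKING"] = "1"
    return env


def run_with_blocking_cuda(cmd, **kwargs):
    """Run `cmd` (a list of arguments) with CUDA_LAUNCH_BLOCKING=1 set."""
    return subprocess.run(cmd, env=blocking_cuda_env(), check=False, **kwargs)


if __name__ == "__main__":
    # Probe: confirm that the child process sees the variable.
    probe = [sys.executable, "-c", "import os; print(os.environ['CUDA_LAUNCH_BLOCKING'])"]
    result = run_with_blocking_cuda(probe, capture_output=True, text=True)
    print(result.stdout.strip())  # prints 1
```

With synchronous launches, the Python traceback should end at the kernel call that actually failed, which would help confirm whether `roi_align_forward` is the real source of the error.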
Steps to Reproduce
Environment: