[tests] use `torch_device` instead of `auto` for model testing #29531

Conversation
Hi @faaany - thanks for opening this PR! Using `device_map="auto"` is necessary for this test - it's checking that beam search works when the model is split across devices. If it doesn't work with XPU, then you can add a skip with unittest, e.g.:

    if "xpu" in torch_device:
        return unittest.skip("device_map='auto' does not work with XPU devices")
Hi @amyeroberts, thanks so much for reviewing this PR! I updated my patch and used your suggestion.
Thanks for handling!
Just a small nit
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
done, thx!
Thanks!
`return unittest.skip()` used in the `test_model_parallel_beam_search` skip condition for XPU did not actually mark the test as skipped when running under pytest:

* 148 passed, 1 skipped

Other tests use `self.skipTest()`. Reusing this approach and moving the condition outside the loop (since it does not depend on it) skips the test correctly for XPU:

* 148 skipped

Secondly, `device_map="auto"` is now implemented for XPU for IPEX>=2.5 and torch>=2.6, so we can now enable these tests for XPU for new IPEX/torch versions.

Fixes: 1ea3ad1 ("[tests] use `torch_device` instead of `auto` for model testing (huggingface#29531)")

Signed-off-by: Dmitry Rogozhkin <dmitry.v.rogozhkin@intel.com>
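The difference between the two skip styles is easy to reproduce with plain `unittest`, independent of the transformers test suite. In this minimal sketch, `unittest.skip()` returns a decorator object, so `return unittest.skip(...)` inside a test body silently passes, whereas `self.skipTest()` raises `unittest.SkipTest`, which the runner records as an actual skip:

```python
import unittest

class DemoTest(unittest.TestCase):
    def test_wrong_skip(self):
        # unittest.skip(...) only *returns* a decorator object; returning it
        # from a test body does nothing, so the runner records a PASS.
        return unittest.skip("looks like a skip, but is not one")

    def test_right_skip(self):
        # self.skipTest(...) raises unittest.SkipTest, which the runner
        # turns into a real skipped result.
        self.skipTest("correctly recorded as skipped")

result = unittest.TestResult()
unittest.defaultTestLoader.loadTestsFromTestCase(DemoTest).run(result)
print(result.testsRun, len(result.skipped), len(result.failures))  # 2 1 0
```

Only `test_right_skip` lands in `result.skipped`; `test_wrong_skip` counts as a pass, which matches the "148 passed, 1 skipped" vs "148 skipped" observation above.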
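The version gating described in the commit message (XPU support for `device_map="auto"` only with IPEX>=2.5 and torch>=2.6) could be sketched as below. This is a hedged illustration: the helper and function names are hypothetical, not the actual transformers utilities, and real code would typically use `packaging.version` instead of this minimal parser.

```python
def parse_version(v: str) -> tuple:
    # Keep only the leading numeric components, e.g. "2.6.0+cpu" -> (2, 6)
    parts = []
    for p in v.split("+")[0].split("."):
        if not p.isdigit():
            break
        parts.append(int(p))
    return tuple(parts[:2])

def xpu_auto_device_map_supported(ipex_version: str, torch_version: str) -> bool:
    # Hypothetical gate mirroring the thresholds in the commit message
    return (parse_version(ipex_version) >= (2, 5)
            and parse_version(torch_version) >= (2, 6))

print(xpu_auto_device_map_supported("2.5.0", "2.6.0"))  # True
print(xpu_auto_device_map_supported("2.3.1", "2.6.0"))  # False
```

A skip condition in a test could then call such a gate instead of skipping unconditionally on `"xpu" in torch_device`.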
What does this PR do?

When running the test cases under the models folder on XPU, I found that many model tests fail at the same test, `test_model_parallel_beam_search`, e.g.:

    FAILED tests/models/bigbird_pegasus/test_modeling_bigbird_pegasus.py::BigBirdPegasusModelTest::test_model_parallel_beam_search - RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cpu and xpu:0! (when che

This is because `device_map="auto"` is used. As elaborated in this PR, the `device_map="auto"` mechanism is still not mature on XPU, causing the model to be loaded on CPU rather than on XPU. If there is no particular reason for using `auto`, I would suggest using `torch_device` instead, because `torch_device` is more specific than `auto`, and we don't need what `auto` is for (e.g. large model inference) in our tests anyway. WDYT? @ArthurZucker
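The failure mode can be illustrated without real hardware. The following is a minimal sketch in plain Python (no torch; `FakeTensor` is a toy stand-in, not a real torch type) of why an op fails when `device_map="auto"` leaves weights on CPU while inputs sit on `xpu:0`:

```python
class FakeTensor:
    """Toy stand-in for a tensor: just a value plus the device it lives on."""
    def __init__(self, data, device):
        self.data, self.device = data, device

    def __add__(self, other):
        # torch raises a similar RuntimeError when an op mixes devices
        if self.device != other.device:
            raise RuntimeError(
                "Expected all tensors to be on the same device, but found "
                f"at least two devices, {self.device} and {other.device}!"
            )
        return FakeTensor(self.data + other.data, self.device)

# device_map="auto" gone wrong: weights land on cpu, inputs on xpu:0
weights, inputs = FakeTensor(1.0, "cpu"), FakeTensor(2.0, "xpu:0")
try:
    inputs + weights
except RuntimeError as e:
    print(e)

# Pinning everything to one explicit torch_device avoids the mismatch
torch_device = "xpu:0"
result = FakeTensor(1.0, torch_device) + FakeTensor(2.0, torch_device)
print(result.data)  # 3.0
```

This mirrors the suggestion in the PR: loading the model onto a single explicit `torch_device` keeps all tensors co-located, while `auto` placement can split them across devices.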