terminate called after throwing an instance of 'c10::Error' #7561
Comments
Looks like an issue with how we are registering our decoding ops:
Not being able to run the environment collection script in your setup is concerning. I don't have a ROCm box at the moment. @malfet, could you have a look?
Seeing this too on NixOS with ROCm 5.4 and an RX 6800 GPU. The issue is unrelated to Automatic1111, since I am not using that at all. This is with PyTorch 2.0.1 and torchvision 0.15.2, by the way (it also happened with 2.0.0 and 0.15.1).
@justinkb Does the environment collection script work for you? python -m 'torch.utils.collect_env'
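If the collection script itself aborts, it can help to run it in a child process so that the output and exit status are still captured. The following is a minimal sketch, not part of the original thread; the helper name run_module and the timeout value are assumptions made for illustration.

```python
import subprocess
import sys


def run_module(module_name, timeout=300):
    """Run `python -m <module_name>` in a child process.

    Returns (returncode, combined stdout+stderr text). On POSIX, a negative
    return code means the child was killed by a signal, for example SIGABRT
    after an uncaught C++ exception.
    """
    proc = subprocess.run(
        [sys.executable, "-m", module_name],
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # keep error text together with the report
        text=True,
        timeout=timeout,
    )
    return proc.returncode, proc.stdout


if __name__ == "__main__":
    # torch.utils.collect_env is the module named in the comment above.
    code, report = run_module("torch.utils.collect_env")
    print(report)
    if code != 0:
        print(f"collect_env exited with status {code}", file=sys.stderr)
```

Because the crash happens in the child, the parent interpreter survives and can still print whatever partial report was produced.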
I doubt it would work, due to the peculiarities of Nix. I did some digging, however, and it looks like ninja ends up generating build instructions like this with hipified sources:
That is not what is supposed to happen, I think. The image lib ends up with two objects that both do this: https://github.com/pytorch/vision/blob/main/torchvision/csrc/io/image/image.cpp#LL22C2-L22C2
So it happens because of the CUDA JPEG decode in image.h, which gets hipified on ROCm. I can disable JPEG decode in my build to check whether the issue disappears completely (as a verification only, not a fix). Edit: tried this; it still hipified the relevant sources anyway, so that didn't prove or disprove anything.
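To illustrate the failure mode described above, here is a toy sketch of a registry that rejects a second registration of the same namespace. It is an analogy only, not the PyTorch dispatcher API: OpRegistry, link_objects, and the file name image_hip.cpp are hypothetical names chosen for the example.

```python
class OpRegistry:
    """Toy stand-in for a global operator registry (hypothetical, not PyTorch's)."""

    def __init__(self):
        self._owners = {}  # namespace -> source file that registered it

    def define_library(self, namespace, source):
        # A second registration of the same namespace is treated as fatal,
        # mirroring the uncaught error seen when the library is loaded.
        if namespace in self._owners:
            raise RuntimeError(
                f"namespace {namespace!r} already registered by "
                f"{self._owners[namespace]}; duplicate from {source}"
            )
        self._owners[namespace] = source

    def owner(self, namespace):
        return self._owners[namespace]


def link_objects(registry, objects):
    """Simulate loading a library: each object runs its static registration."""
    for namespace, source in objects:
        registry.define_library(namespace, source)


if __name__ == "__main__":
    # The original source and its hipified copy (hypothetical file name)
    # both end up in the same library and both register "image".
    try:
        link_objects(
            OpRegistry(),
            [("image", "image.cpp"), ("image", "image_hip.cpp")],
        )
    except RuntimeError as err:
        print(f"load failed: {err}")
```

With only one of the two objects in the list, the load succeeds, which is consistent with the idea that the build should compile either the original file or its hipified copy, not both.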
Fixed by adding
in setup.py below the hipify_python.hipify invocation.
Will take care of fixing/adding tests for collect_env...
I can confirm that this fixes the issue.
Should prevent broken collect_env reporting as shown in pytorch/vision#7561 (comment). Pull Request resolved: #101844. Approved by: https://github.com/kit1980, https://github.com/ZainRizvi
🐛 Describe the bug
I've compiled torch and vision from their main branches. When running Automatic1111's webui for Stable Diffusion, I get the following error message:
I'm not sure if the bug lies with Automatic1111 or vision, or even if it's a bug at all, but I'm trying here first.
I'm running Ubuntu 22.04.2 and I have a 7600X CPU and a 7900 XTX GPU, if that matters. I'm also using ROCm 5.5.
Versions
cc @jeffdaily @jithunnair-amd