GPU not working #329

Closed
7 tasks
farizy4n opened this issue Nov 3, 2023 · 14 comments

Comments

farizy4n commented Nov 3, 2023

Why does it always use the CPU even though I chose the GPU? It's as if the GPU isn't working at all.

Specifications of the computer I use.

Ryzen 5 5600x
Nvidia RTX 4080
32GB DDR4

Applied providers: ['CUDAExecutionProvider', 'CPUExecutionProvider'], with options: {'CUDAExecutionProvider': {'cudnn_conv_algo_search': 'EXHAUSTIVE', 'device_id': '0', 'cudnn_conv1d_pad_to_nc1d': '0', 'has_user_compute_stream': '0', 'gpu_external_alloc': '0', 'enable_cuda_graph': '0', 'gpu_mem_limit': '18446744073709551615', 'gpu_external_free': '0', 'gpu_external_empty_cache': '0', 'arena_extend_strategy': 'kNextPowerOfTwo', 'do_copy_in_default_stream': '1', 'cudnn_conv_use_max_workspace': '1', 'tunable_op_enable': '0', 'tunable_op_tuning_enable': '0', 'tunable_op_max_tuning_duration_ms': '0', 'enable_skip_layer_norm_strict_mode': '0'}, 'CPUExecutionProvider': {}}
inswapper-shape: [1, 3, 128, 128]

Details
What OS are you using?

  • [ ] Linux
  • [ ] Linux in WSL
  • [X] Windows
  • [ ] Mac

Are you using a GPU?

  • [ ] No. CPU FTW
  • [X] NVIDIA
  • [ ] AMD
  • [ ] Intel
  • [ ] Mac

roop 3.3.4

CPioGH2002 commented Nov 3, 2023

Same here on Linux using V3.3.4 with an Nvidia card, 545.29.02 driver, CUDA installed:

nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2023 NVIDIA Corporation
Built on Fri_Jan__6_16:45:21_PST_2023
Cuda compilation tools, release 12.0, V12.0.140
Build cuda_12.0.r12.0/compiler.32267302_0

Roop startup looks correct:
"Using provider ['CUDAExecutionProvider'] - Device:cuda"
(no error messages)

But once the process is started only:

Applied providers: ['CPUExecutionProvider'], with options: {'CPUExecutionProvider': {}}
find model: /home/user/.insightface/models/buffalo_l/1k3d68.onnx landmark_3d_68 ['None', 3, 192, 192] 0.0 1.0
Applied providers: ['CPUExecutionProvider'], with options: {'CPUExecutionProvider': {}}
find model: /home/user/.insightface/models/buffalo_l/2d106det.onnx landmark_2d_106 ['None', 3, 192, 192] 0.0 1.0
Applied providers: ['CPUExecutionProvider'], with options: {'CPUExecutionProvider': {}}
find model: /home/user/.insightface/models/buffalo_l/det_10g.onnx detection [1, 3, '?', '?'] 127.5 128.0
Applied providers: ['CPUExecutionProvider'], with options: {'CPUExecutionProvider': {}}
find model: /home/user/.insightface/models/buffalo_l/genderage.onnx genderage ['None', 3, 96, 96] 0.0 1.0
Applied providers: ['CPUExecutionProvider'], with options: {'CPUExecutionProvider': {}}
find model: /home/user/.insightface/models/buffalo_l/w600k_r50.onnx recognition ['None', 3, 112, 112] 127.5 127.5
set det-size: (640, 640)
Applied providers: ['CPUExecutionProvider'], with options: {'CPUExecutionProvider': {}}
inswapper-shape: [1, 3, 128, 128]
Sorting videos/images
Applied providers: ['CPUExecutionProvider'], with options: {'CPUExecutionProvider': {}}
inswapper-shape: [1, 3, 128, 128]
Creating video__temp.mp4 with 60.0 FPS...
['ffmpeg', '-hide_banner', '-hwaccel', 'auto', '-y', '-loglevel', 'error', '-f', 'rawvideo', '-vcodec', 'rawvideo', '-s', '1110x634', '-pix_fmt', 'bgr24', '-r', '60.0', '-an', '-i', '-', '-vcodec', 'libx264', '-crf', '10', '-vf', 'colorspace=bt709:iall=bt601-6-625:fast=1', '-pix_fmt', 'yuv420p', '/media/roop-unleashed/output/video__temp.mp4']

and very poor performance due to the graphics card basically remaining idle.

I haven't used the app in a while and updated via "git pull" without touching any settings. But I did check if "CUDA" was selected properly in the settings. The startup message also confirms this being active.

Doesn't look like any errors are thrown while processing. It just never uses CUDA but opts for the CPU instead.
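
A quick way to see how many sessions fell back to the CPU is to scan the log for the "Applied providers:" lines shown above. The following is a minimal sketch, assuming the log text has been captured into a string; the helper names `applied_providers` and `cpu_only_sessions` are made up for illustration and are not part of roop or ONNX Runtime.

```python
import ast
import re

# Matches the provider list in lines such as
# "Applied providers: ['CPUExecutionProvider'], with options: {...}"
_APPLIED = re.compile(r"Applied providers:\s*(\[[^\]]*\])")


def applied_providers(log_text):
    """Return one provider list per 'Applied providers:' line in the log."""
    return [ast.literal_eval(m.group(1)) for m in _APPLIED.finditer(log_text)]


def cpu_only_sessions(log_text):
    """Count the sessions that were created without the CUDA provider."""
    return sum(
        "CUDAExecutionProvider" not in providers
        for providers in applied_providers(log_text)
    )


# Two lines in the format quoted in this thread (options shortened).
sample = """\
Applied providers: ['CUDAExecutionProvider', 'CPUExecutionProvider'], with options: {'CPUExecutionProvider': {}}
Applied providers: ['CPUExecutionProvider'], with options: {'CPUExecutionProvider': {}}
"""

print(applied_providers(sample))
print("CPU-only sessions:", cpu_only_sessions(sample))
```

If every line lists only `CPUExecutionProvider`, as in the log above, the CUDA provider was never applied to any model.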

Fikka7 commented Nov 3, 2023

Curious to know if you see the same performance with CUDA 11.8? I believe the original roop was tested with that (https://github.com/s0md3v/roop/wiki/2.-Acceleration).

alast0r commented Nov 5, 2023

you are only using 2 threads

@CPioGH2002

That is a correct observation and, if CUDA is enabled, also the optimal number for my ageing card. Since CUDA (for whatever reason) isn't being enabled, it indeed runs two threads on the already much slower CPU. This brings me back to the issue of CUDA not being used, hence my post in this thread.

@lysxelapsed

that is strange. did you check if it indeed saved it to the config.yaml? if it even tried cuda without success, there would be an error message and a "fallback to cpu execution provider" or something like that.
I'd try a re-install if you can't make sense of it.

farizy4n commented Nov 6, 2023

After several experiments, I found that the main problem was the core count being set to 2. I use a computer with a Ryzen 5 and an RTX 4080 with 16 GB of VRAM, so the optimal setting should be 6 cores. This seems to refer to the physical cores of the CPU. Just an assumption.

farizy4n closed this as completed Nov 6, 2023

@CPioGH2002

> did you check if it indeed saved it to the config.yaml? if it even tried cuda without success, there would be an error message

Good points indeed. I checked the config.yaml and saw provider: cuda. Also, the startup message Using provider ['CUDAExecutionProvider'] - Device:cuda implies that this setting is respected and used. But I'll try a fresh installation another day, I guess, just in case.

CPioGH2002 commented Nov 15, 2023

I could finally try a fresh installation in an equally fresh venv, and I am running into the same problem. It does seem that my newer CUDA 12.0 should be changed to 11.8 to be compatible, since I now receive

[ONNXRuntimeError] : 1 : FAIL : Failed to load library libonnxruntime_providers_cuda.so with error: libcufft.so.10: cannot open shared object file: No such file or directory

errors.
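
The error above says the dynamic loader could not open `libcufft.so.10`. One way to check this independently of roop is to ask the loader directly. This is a minimal sketch using only the standard library; the helper name `unloadable_libraries` is made up for illustration, and the second library name in the example is deliberately bogus to show what a failure looks like.

```python
import ctypes


def unloadable_libraries(names):
    """Return the shared-library names that the dynamic loader cannot open."""
    missing = []
    for name in names:
        try:
            ctypes.CDLL(name)  # Raises OSError if the library cannot be loaded.
        except OSError:
            missing.append(name)
    return missing


# "libcufft.so.10" is taken from the error message in this thread.
# The second name is a placeholder that should never exist.
candidates = ["libcufft.so.10", "libdefinitely-not-installed.so.0"]
print("Cannot load:", unloadable_libraries(candidates))
```

If `libcufft.so.10` shows up in the output, the installed onnxruntime-gpu build is asking for a cuFFT library that is either missing from the system or not on the loader's search path.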

Update:
I tried changing requirements.txt from "onnxruntime-gpu==1.16.1" to "onnxruntime-gpu==1.16.2" to get the later version (released this week), but that one isn't compatible with CUDA 12 either, and CUDA 12 is the only toolkit I have in my repos. I don't want to manually downgrade to 11.8, so that's where the story ends for now.

Since this isn't a roop-unleashed related problem but one of the ONNX Runtime, I have to look for updates on that end or downgrade CUDA.

lysxelapsed commented Nov 15, 2023

I'm running roop-unleashed with cuda 12.2, no problems.

  • Did you have error messages when installing cuda?
  • Did you have error messages when installing requirements.txt?
  • Did you restart roop-unleashed after install and / or changes to the provider in the settings tab? -> needs to reload the models
  • Did you check if cuda is properly integrated in PATH? Also, you can install multiple cuda versions. For checking and using multiple versions, see here

Cuda installation throws errors when Visual Studio isn't installed properly for example.
Installing requirements threw several version dependency errors. I adjusted them one by one in requirements.txt to the versions specified in the error message, until the install went through without errors. This is what my adjusted, working requirements.txt looks like in my case. Yes, I know, torch points to cuda 11.8. Still, I have no other cuda versions installed besides 12.2:

--extra-index-url https://download.pytorch.org/whl/cu118

numpy==1.24.2
gradio==3.44.2
opencv-python==4.8.0.76
onnx==1.14.1
insightface==0.7.3
psutil==5.9.5
pillow==10.0.1
torch==2.0.1+cu118; sys_platform != 'darwin'
torch==2.0.1; sys_platform == 'darwin'
torchvision==0.15.2+cu118; sys_platform != 'darwin'
torchvision==0.15.2; sys_platform == 'darwin'
onnxruntime==1.16.0; sys_platform == 'darwin' and platform_machine != 'arm64'
onnxruntime-silicon==1.13.1; sys_platform == 'darwin' and platform_machine == 'arm64'
onnxruntime-gpu==1.16.1; sys_platform != 'darwin'
protobuf==4.23.2
tqdm==4.66.1
ftfy
regex
pyvirtualcam

Also you can try pip uninstall onnxruntime onnxruntime-gpu, followed by pip install onnxruntime-gpu==1.16.1
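
Before uninstalling anything, it can help to see which of the two packages are actually present in the venv. This is a minimal sketch using `importlib.metadata` from the standard library; the helper name `installed_versions` is made up for illustration. That having both packages installed side by side contributes to the CPU fallback is an assumption here, not something confirmed in this thread.

```python
from importlib import metadata


def installed_versions(packages=("onnxruntime", "onnxruntime-gpu")):
    """Map each package name to its installed version, or None if absent."""
    found = {}
    for name in packages:
        try:
            found[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            found[name] = None
    return found


versions = installed_versions()
print(versions)
if all(versions.values()):
    # Assumption: a clean reinstall of only onnxruntime-gpu is worth trying.
    print("Both packages are installed; consider the uninstall/reinstall above.")
```

Run it with the Python interpreter of the venv that roop uses, otherwise it reports on a different environment.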

In case of install sequence, this has always been my working order:

  1. Visual Studio with C++
  2. Cuda Toolkit and cudnn
  3. roop-unleashed, adjusted until install didn't throw any error messages anymore

CPioGH2002 commented Nov 15, 2023

Thanks for trying to help. Good points to check indeed, I will get back once I tried them. Mind you, I'm on Linux.

As for my assumption regarding ONNX Runtime 1.16.2: my understanding is that it's not compatible with CUDA 12+ and that support ends at 11.8. Anyhow, I will try some more things later and report back.

Forgot to add:
Regarding CUDA versions, my driver reports 12.3 while the CUDA Toolkit (which seems like the relevant part for this type of operation using roop) is at 12.0. So I should have been more precise before when speaking about the CUDA version I was referring to: It means the CUDA Toolkit version 12.0.
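
Since the toolkit version is the one that matters here, it can be read from the `nvcc --version` output instead of the driver report. This is a minimal sketch that parses the output quoted earlier in this thread; the helper name `toolkit_release` is made up for illustration.

```python
import re


def toolkit_release(nvcc_output):
    """Extract the 'release X.Y' toolkit version from `nvcc --version` output."""
    match = re.search(r"release (\d+)\.(\d+)", nvcc_output)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


# Lines copied from the nvcc output posted earlier in this thread.
sample = """nvcc: NVIDIA (R) Cuda compiler driver
Cuda compilation tools, release 12.0, V12.0.140
Build cuda_12.0.r12.0/compiler.32267302_0"""

print(toolkit_release(sample))  # (12, 0)
```

On a live system the input could come from `subprocess.run(["nvcc", "--version"], capture_output=True, text=True).stdout`, assuming `nvcc` is on the PATH.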

@lysxelapsed

lol sorry about that. I was looking at the issue creator's post for the specs. Here's the equivalent solution for linux regarding "adding to path", actually with the exact same error message. I'd try this before anything else.

CPioGH2002 commented Jan 29, 2024

For anyone looking into this issue at a later point. I wanted to report back, so here I am (albeit very much later):
I installed the new ONNX Runtime 1.17 (which isn't final so far, but one can go with the nightly build to test things), which does support CUDA 12 and avoids messing with CUDA versions on a system-wide level.

Source: microsoft/onnxruntime#19292 (comment)

Happy to say that this enables GPU-bound processing with roop again, with all the speed benefits. So far, the fact that this is a nightly build hasn't led to any problems.

So you basically download the file which suits your python version (the filename contains "cp311" for python 3.11 for example) and then install it in the venv where roop is running ("pip install [path/to/file]").
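
To find out which "cpXY" tag to look for, the running interpreter can be asked for its version. This is a minimal sketch that assumes CPython; the helper names are made up for illustration, and the wheel filename in the example is a placeholder, not a real download.

```python
import sys


def cpython_wheel_tag(version_info=sys.version_info):
    """Build the 'cpXY' tag used in wheel filenames, e.g. cp311 for Python 3.11."""
    return f"cp{version_info[0]}{version_info[1]}"


def wheel_matches_interpreter(filename, tag=None):
    """Check whether a wheel filename contains the given (or current) tag."""
    tag = tag or cpython_wheel_tag()
    return f"-{tag}-" in filename


print("Look for this tag in the filename:", cpython_wheel_tag())

# Placeholder filename for illustration only.
example = "example_pkg-1.0-cp311-cp311-linux_x86_64.whl"
print(wheel_matches_interpreter(example, "cp311"))
```

Run it inside the venv where roop is installed, so the tag reflects the interpreter that will actually load the wheel.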

To check the installed version, run this in python:

import onnxruntime
print(onnxruntime.__version__)

and hopefully receive 1.17.0 as output.
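
The version alone does not show whether the CUDA provider is part of the installed build. This is a slightly longer sketch that also prints the device and provider list via `onnxruntime.get_device()` and `onnxruntime.get_available_providers()`; the helper name `describe_onnxruntime` is made up for illustration. Note that a provider being listed does not guarantee that its libraries load when a session is created, as the libcufft error earlier in this thread suggests.

```python
import importlib.util


def describe_onnxruntime():
    """Return version, device and providers, or None if onnxruntime is absent."""
    if importlib.util.find_spec("onnxruntime") is None:
        return None
    import onnxruntime

    return {
        "version": onnxruntime.__version__,
        "device": onnxruntime.get_device(),
        "providers": onnxruntime.get_available_providers(),
    }


info = describe_onnxruntime()
if info is None:
    print("onnxruntime is not installed in this environment")
else:
    print(info)
    print("CUDA provider listed:", "CUDAExecutionProvider" in info["providers"])
```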

After that, run roop again and see your GPU at work. :-)

Note: (-see EDIT below!-)
Once the final version of the ONNX Runtime 1.17 has arrived and you are using a fresh install of roop (which catches the latest release), these steps won't be needed any more. For already installed versions, you would simply have to trigger an update check.

One can check the latest ONNX Runtime releases here: https://github.com/microsoft/onnxruntime/releases

EDIT: For some reason, even the now-final ONNX Runtime 1.17 causes the error, and only the nightly release mentioned above does not. So if you still encounter it, try the nightly version to make the issue go away.

tianleiwu commented Feb 20, 2024

See https://onnxruntime.ai/docs/install/#requirements for onnxruntime-gpu for CUDA 12.* installation.

@CPioGH2002

Much appreciated info on using the final 1.17. Thanks for that. :-)
