
Simple hardware encoding #3419

Merged: 56 commits merged into stashapp:develop on Mar 10, 2023
Conversation

@NodudeWasTaken (Contributor) commented Feb 9, 2023

What

This commit tests a bunch of common hardware-accelerated codecs at boot.
If the user has enabled hardware encoding in the UI and hardware acceleration is available, it replaces the mp4/h264 and webm/vp9 transcoding with the available hardware-accelerated codecs.

Why

Encoding is a lot more expensive than decoding, which is why it makes sense to use hardware encoding when available.
Regarding the specific implementation:
It would be meaningless to try unavailable codecs, so we should remember which codecs work.
It doesn't add full hardware transcoding, simply because the requirements are too strict and it would require a lot of checks to verify that it works.
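
Conceptually, the boot-time check amounts to running a tiny test encode per codec and caching the result. A minimal sketch in Go with illustrative names (the real implementation lives in pkg/ffmpeg/codec_detector.go):

```go
package ffmpeg

import "os/exec"

// probeEncoder runs a short test encode against a synthetic lavfi source
// and reports whether the given encoder actually works on this machine.
// Some encoders (e.g. h264_vaapi) additionally need device/upload args,
// which are omitted here for brevity.
func probeEncoder(ffmpegPath, codec string) bool {
	args := []string{
		"-hide_banner",
		"-f", "lavfi", "-i", "testsrc=duration=1:size=320x240:rate=25",
		"-c:v", codec,
		"-f", "null", "-",
	}
	return exec.Command(ffmpegPath, args...).Run() == nil
}

// detectHWCodecs probes each candidate once at startup and remembers the
// ones that work, so unavailable codecs are never tried again.
func detectHWCodecs(ffmpegPath string) []string {
	candidates := []string{"h264_nvenc", "h264_qsv"}
	var supported []string
	for _, codec := range candidates {
		if probeEncoder(ffmpegPath, codec) {
			supported = append(supported, codec)
		}
	}
	return supported
}
```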

Working encoders

NVidia NVENC H.264 is tested and working.
Intel QSV H.264 is tested and working.
Intel and VAAPI VP9 are assumed to work.
VAAPI H.264 doesn't work for direct file transcoding, but is enabled for HLS.
h264_v4l2m2m doesn't work for direct file transcoding, but is enabled for HLS.

Not working/disabled encoders

The rest are disabled, as I have no way of testing them.

Docker image

I also created an additional Docker image built on the NVIDIA CUDA Ubuntu base image, since the CUDA runtime is needed for GPU passthrough, and Alpine is incompatible with NVIDIA drivers.

@bnkai added the feature label Feb 10, 2023
@bnkai (Collaborator) commented Feb 15, 2023

@NodudeWasTaken I have pasted the command below in Discord as well, to see whether the issue I have is with my card or is more generic (my card doesn't support VP8/VP9 hardware encode, so there is nothing else I can try):

```
ffmpeg -hide_banner -vaapi_device /dev/dri/renderD128 -f lavfi -i testsrc=duration=5:size=320x240:rate=25 -vf format=nv12,hwupload,scale_vaapi=-2:720 -c:v h264_vaapi -movflags frag_keyframe+empty_moov -qp 20 -f mp4 test.mp4
```

If you need some ffmpeg command tested for the hardware detection/encoding part, it might be easier to post it here or in the Discord channel for feedback?

@JanJastrow commented:

Hi, this looks like a great feature.
Will/Can/Should this also be used for generating Scene Scrubber Sprites, Previews, PHash, etc.? Since they all use FFmpeg, too (AFAIK).

@NodudeWasTaken (Contributor, Author) commented Mar 7, 2023

> Hi, this looks like a great feature. Will/Can/Should this also be used for generating Scene Scrubber Sprites, Previews, PHash, etc.? Since they all use FFmpeg, too (AFAIK).

As far as I can tell, only previews use software encoding; sprites/screenshots/phash (which all lead to the screenshot encoder) could arguably benefit from using the "ffmpeg extra input args".
My only issue is that previews have a configurable preset, meaning I would have to convert it for each codec.
I will ask in the Discord if this is desirable.

@WithoutPants (Collaborator) commented:

> As far as I can tell, only previews use software encoding; sprites/screenshots/phash (which all lead to the screenshot encoder) could arguably benefit from using the "ffmpeg extra input args". My only issue is that previews have a configurable preset, meaning I would have to convert it for each codec. I will ask in the Discord if this is desirable.

This can be addressed in a separate PR as necessary. Let's not add more scope to this PR.

@WithoutPants WithoutPants added the needs testing Pull requests that require testing label Mar 7, 2023
@WithoutPants (Collaborator) commented:

Note that the build is failing due to a merge issue against develop branch. Could you merge or rebase against develop and fix the compile issues?

I think some extra logging is necessary for debugging and tracing purposes, and it should use a loglevel of error so that we can output the resulting error if a support call fails. On my machine, I had no apparent hardware support and had to add some logging and run the commands manually to determine why.

In my local build I added logger.Tracef("Running command: %s", cmd.String()) before running so that we can see the command being run in the trace log. See the FFMpeg.Generate code for how to output the error message - this should be logged at the Debug log level if the command fails.

In my personal case, I was getting an error message about the -preset value p2 used in the nvenc test. This was due to my ffmpeg version being too old. As far as I can tell, the preset isn't used in the actual encoding, so either a different preset should be used for the test, or the preset should be removed. The error message I then got was related to the drivers. Once I updated my drivers, along with my updated ffmpeg, I finally got h264_nvenc to show as supported.

Nodude added 4 commits March 9, 2023 10:55
Legacy presets are removed in SDK 12 and have been deprecated since SDK 10.
This commit removes the preset to allow ffmpeg to select the default one.
@NodudeWasTaken (Contributor, Author) commented Mar 9, 2023

> I think some extra logging is necessary for debugging and tracing purposes, and it should use a loglevel of error so that we can output the resulting error if a support call fails. On my machine, I had no apparent hardware support and had to add some logging and run the commands manually to determine why.
>
> In my local build I added logger.Tracef("Running command: %s", cmd.String()) before running so that we can see the command being run in the trace log. See the FFMpeg.Generate code for how to output the error message - this should be logged at the Debug log level if the command fails.

Alright, it now prints hopefully useful debug messages in the log.
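
Roughly, the added logging looks like the following sketch; the helper name and import path are assumptions based on the review comments above, not the merged code:

```go
package ffmpeg

import (
	"bytes"
	"os/exec"

	"github.com/stashapp/stash/pkg/logger"
)

// runProbe traces the exact command being run and, if the support check
// fails, logs ffmpeg's stderr at Debug level so the reason is visible.
func runProbe(ffmpegPath string, args []string) bool {
	cmd := exec.Command(ffmpegPath, args...)
	logger.Tracef("Running command: %s", cmd.String())

	var stderr bytes.Buffer
	cmd.Stderr = &stderr
	if err := cmd.Run(); err != nil {
		// Surface ffmpeg's own error output, as FFMpeg.Generate does.
		logger.Debugf("hardware codec probe failed: %v: %s", err, stderr.String())
		return false
	}
	return true
}
```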

> In my personal case, I was getting an error message about the -preset value p2 used in the nvenc test. This was due to my ffmpeg version being too old. As far as I can tell, the preset isn't used in the actual encoding, so either a different preset should be used for the test, or the preset should be removed. The error message I then got was related to the drivers. Once I updated my drivers, along with my updated ffmpeg, I finally got h264_nvenc to show as supported.

The main problem is that the "legacy" presets are to be removed in Video Codec SDK 12, and have been deprecated since SDK 10 (https://docs.nvidia.com/video-codec-sdk/12.0/deprecation-notices/index.html).
The preset sets the speed/quality trade-off for the encode, to ensure enough speed.
I have left it out now, so it defaults to p4 encoding, which preserves backwards compatibility.
I guess users can instead try to ensure speed by adding the output arg -maxrate:v.
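
With the preset removed, the probe reduces to something like the following hedged example, mirroring the VAAPI test command earlier in the thread (the exact command in the PR may differ):

```
ffmpeg -hide_banner -f lavfi -i testsrc=duration=1:size=320x240:rate=25 -c:v h264_nvenc -f null -
```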

@WithoutPants WithoutPants merged commit 0c1b023 into stashapp:develop Mar 10, 2023
DogmaDragon added a commit to stashapp/Stash-Docs that referenced this pull request Mar 10, 2023
@twist3dimages commented Mar 16, 2023

I can confirm it works on Windows, but not on my Unraid server using Docker with Intel or Nvidia. Is there a separate repo I need to pull, like stash:development-nvidia or stash:development-intel?

@yd82 commented Mar 24, 2023

I have the same question. I tried to configure hardware transcoding on my Unraid server for a week with no success. I hope someone can point me in the right direction. I don't know how to build the CUDA image as described on Unraid.

@NodudeWasTaken (Contributor, Author) commented:

> I can confirm it works on Windows, but not on my Unraid server using Docker with Intel or Nvidia. Is there a separate repo I need to pull, like stash:development-nvidia or stash:development-intel?
>
> I have the same question. I tried to configure hardware transcoding on my Unraid server for a week with no success. I hope someone can point me in the right direction. I don't know how to build the CUDA image as described on Unraid.

It works on the CUDA build; I'm unaware of any plans or code to publish the CUDA build to Docker Hub.
See #305 (comment) for how to use it.
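
For anyone wanting to try it, building the CUDA image locally is roughly the following sketch. It assumes a checked-out stash repository and the docker/build/x86_64/Dockerfile-CUDA path from this PR; the Dockerfile may expect prebuilt binaries in the build context, so check the file before running:

```sh
# Hypothetical invocation; adjust the tag and build context to your setup.
docker build -t stash-cuda -f docker/build/x86_64/Dockerfile-CUDA .
```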

@TwentySeven28 commented:

Does anyone know of a way to get unRAID working with hardware transcoding? unRAID has a very large install base, and it's surprising to me that this CUDA build is unavailable for unRAID users. There appears to be no way to retrofit containers either.

@Flashy78 (Contributor) commented:

> Does anyone know of a way to get unRAID working with hardware transcoding? unRAID has a very large install base, and it's surprising to me that this CUDA build is unavailable for unRAID users. There appears to be no way to retrofit containers either.

First off, it's the Unraid community who creates templates for installing things through Community Applications, not the developers of every application. So that's where you should be asking for that type of request.

Second, per the comment directly above yours, the CUDA build is not published to Docker Hub, so if you want to use it, you must build it yourself locally.

Third, you can install any Docker image you want in Unraid; you don't need someone to create a Community Application for you. This is how other users in this thread are building the CUDA image locally and using it on Unraid.

Considering how new this is and how many other people in this thread have been unable to get it working, clearly more work needs to happen before it's a plug-and-play feature that can be widely released to the public.

@disconnect5852 commented Apr 5, 2023

I see:

```
[InitHWSupport] Supported HW codecs:
```

Added in my docker compose yml:

```yaml
devices:
  - /dev/dri/renderD128:/dev/dri/renderD128
  - /dev/dri/card0:/dev/dri/card0
```

I also ran sudo chmod -R 777 /dev/dri on the host.

vainfo:

```
libva info: VA-API version 1.7.0
libva info: Trying to open /usr/lib/x86_64-linux-gnu/dri/iHD_drv_video.so
libva info: Found init function __vaDriverInit_1_7
libva error: /usr/lib/x86_64-linux-gnu/dri/iHD_drv_video.so init failed
libva info: va_openDriver() returns 1
libva info: Trying to open /usr/lib/x86_64-linux-gnu/dri/i965_drv_video.so
libva info: Found init function __vaDriverInit_1_6
libva info: va_openDriver() returns 0
vainfo: VA-API version: 1.7 (libva 2.6.0)
vainfo: Driver version: Intel i965 driver for Intel(R) Bay Trail - 2.4.0
vainfo: Supported profile and entrypoints
      VAProfileMPEG2Simple            : VAEntrypointVLD
      VAProfileMPEG2Simple            : VAEntrypointEncSlice
      VAProfileMPEG2Main              : VAEntrypointVLD
      VAProfileMPEG2Main              : VAEntrypointEncSlice
      VAProfileH264ConstrainedBaseline: VAEntrypointVLD
      VAProfileH264ConstrainedBaseline: VAEntrypointEncSlice
      VAProfileH264Main               : VAEntrypointVLD
      VAProfileH264Main               : VAEntrypointEncSlice
      VAProfileH264High               : VAEntrypointVLD
      VAProfileH264High               : VAEntrypointEncSlice
      VAProfileH264StereoHigh         : VAEntrypointVLD
      VAProfileVC1Simple              : VAEntrypointVLD
      VAProfileVC1Main                : VAEntrypointVLD
      VAProfileVC1Advanced            : VAEntrypointVLD
      VAProfileNone                   : VAEntrypointVideoProc
      VAProfileJPEGBaseline           : VAEntrypointVLD
```

@DogmaDragon (Collaborator) commented:

@TwentySeven28 https://docs.stashapp.cc/getting-started/installation/unraid/#nvidia-runtime

@TwentySeven28 commented:

Thanks @DogmaDragon. That's super helpful. Any idea if I can retrofit this for Intel QuickSync?

@DogmaDragon (Collaborator) commented:

@TwentySeven28 check #305, I saw people talking about it there. You can also hop into Discord for more help; it's a bit out of my wheelhouse.
