[Godot 3.5] Async shader compilation freeze the game for a longer time than sync compilation #64528
Comments
Loading time is expected to be higher with async compilation enabled, because it involves the compilation of the ubershader for each material, which is always synchronous. Therefore, games are expected to have at least the materials loaded at non-interactive situations (e.g., a loading screen). In contrast, at runtime async compilation helps when the materials are rendered, and that's because the already compiled ubershaders can be used instead of the conditioned shaders that can't be compiled until your objects are displayed, when the specific render conditions are known. Those conditioned shaders are the ones asynchronously compiled. So, your
|
Thank you for the quick response! Sorry, even after reading your related PR I am still not clear on how async compilation actually works, so I would like to clarify that first. Here are some questions I have; I would appreciate your help clearing them up.
I think the problem is that, in my tests, ubershader compilation is too slow. It takes longer than compiling the actual shaders directly. If the ubershaders cannot be compiled asynchronously, using sync compilation and forcing the compilation during the loading screen seems to be a strictly better solution, because the "loading time" (t1 + t2) would be shorter and there would be no lag during rendering. |
Again, consider loading time and gameplay time separately. The ubershader is compiled at loading time if you design your project so that shaders/materials are not loaded during user interaction. In contrast, the conditioned shaders can only be compiled during gameplay, which is exactly when a stall hinders the player. Async compilation avoids that stall by rendering, in the meantime, with the ubershaders, which have been pre-compiled and so are ready for immediate use. Have you tried your project with the modified script? Could you share your results? |
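To make the fallback described above concrete, here is a minimal toy model in Python. It is only a sketch of the idea, not Godot's implementation: `ToyMaterial`, `pick_shader` and `finish_background_compiles` are hypothetical names, and the background compile is simulated by an explicit call.

```python
from dataclasses import dataclass, field


@dataclass
class ToyMaterial:
    """Toy stand-in for a material; not a Godot class."""
    name: str
    ubershader_ready: bool = False
    # Conditioned variants that have finished compiling, keyed by render conditions.
    ready_variants: set = field(default_factory=set)
    # Variants that were requested and are still compiling in the background.
    pending_variants: set = field(default_factory=set)

    def load(self):
        # Loading time: the ubershader is compiled synchronously (blocking).
        self.ubershader_ready = True

    def pick_shader(self, conditions: str) -> str:
        """Return which shader a frame would draw with under these conditions."""
        if conditions in self.ready_variants:
            return f"conditioned:{conditions}"
        if not self.ubershader_ready:
            raise RuntimeError("material was not loaded before rendering")
        # Variant not ready yet: queue it and draw with the ubershader meanwhile.
        self.pending_variants.add(conditions)
        return "ubershader"

    def finish_background_compiles(self):
        # Stand-in for the background compiler draining its queue.
        self.ready_variants |= self.pending_variants
        self.pending_variants.clear()


mat = ToyMaterial("rock")
mat.load()                               # blocking, ideally behind a loading screen
first = mat.pick_shader("shadows+fog")   # conditions only known at render time
mat.finish_background_compiles()
later = mat.pick_shader("shadows+fog")
print(first, later)
```

In this model the first frame draws with the ubershader and later frames switch to the conditioned variant once it is ready, which is the behaviour the comment above describes.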
Thank you for explaining! I think I have a better understanding of how it works now.
I added a
The result seems to be as expected.
My point is that we can avoid hiccups with sync compilation too. For example, my game has a system that searches for all shaders/materials that may be used in the current gameplay session and then renders each of them at least once so that they get compiled. This happens while the loading screen is shown, so when the shaders/materials appear later in actual gameplay, no hiccups occur because they are already compiled. That's why I used t1+t2 (total unresponsive time) to compare sync and async compilation. The pre-compilation works well in my game (although the system is quite tricky and complicated to implement). In my tests (both the project I provided and my game), the game freezes for longer when using async compilation, and it also suffers from lower framerates (caused by background compilation of the conditioned shaders) after the ubershader compilation is finished. By the way, now that I know ubershader compilation is coupled with resource loading, I think that could be another disadvantage of async compilation with ubershaders, because some users may want to control separately when resources are loaded (from storage) and when they are compiled. It might not be a big concern for most users though. |
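The warm-up approach described above can be sketched as a toy model in Python. This is an illustration under assumptions, not the commenter's actual system: `ToyShaderCache` and `warm_up` are hypothetical names, and a "compile" is just a set insertion that stands in for a stall.

```python
class ToyShaderCache:
    """Toy synchronous shader cache; counts how often a draw would stall."""

    def __init__(self):
        self.compiled = set()
        self.compile_count = 0

    def use(self, material: str, conditions: str) -> bool:
        """Draw with a shader; return True if this call had to compile (a stall)."""
        key = (material, conditions)
        if key in self.compiled:
            return False
        self.compiled.add(key)
        self.compile_count += 1
        return True


def warm_up(cache, materials, conditions_list):
    """Loading screen: draw every material once under every expected condition."""
    for material in materials:
        for conditions in conditions_list:
            cache.use(material, conditions)


cache = ToyShaderCache()
warm_up(cache, ["rock", "water"], ["day", "night"])
compiles_during_loading = cache.compile_count

# Later, during gameplay, an already warmed-up combination does not stall.
stalled_in_gameplay = cache.use("water", "night")
```

The hard part in practice, as noted above, is enumerating every material and render condition the session will need; any combination missed by the warm-up still stalls on first use.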
It's true that the good old pre-compilation trick, if you manage to do it correctly, beats asynchronous compilation. It can be seen as a tradeoff between developer time and run time. Besides, there's one idea to have something closer to manual pre-compilation but orders of magnitude simpler, in this proposal: godotengine/godot-proposals#4754. |
Then would it make more sense to implement a dedicated official pre-compilation system, or to provide some useful APIs instead? For example, a class on which you set the context (e.g. environment, lighting, etc.) and then call a compile function to pre-compile one shader under that context. I think it could provide better usability and performance than the current ubershader approach without being too difficult to use. Nonetheless, I can still imagine the current ubershader being helpful in some cases.
I have read this before, but I am not entirely sure what it does. I thought it was just a way to record which shaders (and contexts) the game will need, so that they can be compiled at loading time. Please correct me if I am wrong. By the way, what do you think about the suggestion on async compilation I made at the beginning? |
It's a bit too soon to tell whether enough users will want to deal with additional shader compilation settings. For me at least, I need to get a bit of distance from this to gain proper perspective. The ideas don't sound bad, though. Feel free to open a proposal so you can get feedback from more people. In regard to this issue, I believe we're ready to close it since it's not really a bug, but the way the system works. |
Thanks for the responses! I am glad to have the conversation. |
Godot version
3.5 stable
System information
Windows 10 64-bit, GLES3, GTX1060 6GB
Issue description
Overview
I am trying to test the performance of the async compilation in Godot 3.5.
The uploaded project instances a scene containing multiple mesh instances with unique materials (all SpatialMaterial, each with different features enabled) and then displays them, in order to measure how long the shaders take to compile. It measures "the unresponsive time to instance the scene and add it to the tree (invisible at the moment)" and "the unresponsive time when the scene is set to be visible". Let's call them t1 and t2 respectively.
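As a language-neutral illustration of how t1 and t2 are defined, here is a minimal Python sketch that times two blocking steps. The functions `instance_and_add_scene` and `show_scene` are hypothetical stand-ins that just sleep; they are not part of the uploaded project.

```python
import time


def unresponsive_ms(blocking_call):
    """Run a blocking call and return (result, elapsed milliseconds)."""
    start = time.perf_counter()
    result = blocking_call()
    elapsed_ms = (time.perf_counter() - start) * 1000.0
    return result, elapsed_ms


# Hypothetical stand-ins for the two measured steps.
def instance_and_add_scene():
    time.sleep(0.02)  # pretend work: instancing and adding to the tree
    return "scene"


def show_scene():
    time.sleep(0.01)  # pretend work: first visible frame
    return None


scene, t1 = unresponsive_ms(instance_and_add_scene)
_, t2 = unresponsive_ms(show_scene)
total_unresponsive = t1 + t2
print(f"t1={t1:.1f} ms, t2={t2:.1f} ms, total={total_unresponsive:.1f} ms")
```

The comparison in this report uses the sum t1 + t2, since that is the total time the program does not respond to input.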
Test Results
In my test, I got the following result:
For the sake of fairness, I cleared the NVIDIA caches every time before running the program.
max_simultaneous_compiles is set to 2.
One thing to note is that, in "Async" and "Async + Cache" modes, it lags for a few seconds after t2. Also, the time it takes to start the program is longer in the "Async" mode.
Analysis
If the caching feature isn't used, async compilation alone doesn't make too much sense because the total unresponsive time is much longer than sync compilation. The only potential benefit would be shifting most of the unresponsive time to the earlier stage (adding the scene to the tree), but we can always use the old technique (force render the meshes in the loading screen) to achieve that with sync compilation.
While the total unresponsive time is lower when using async + cache, the lagging (low framerate) at the beginning makes it not wholly better in terms of the player experience. Also, we have to compile the shaders (i.e. run the program) once beforehand to have such a shorter unresponsive time, so the unresponsive time is still as long as in the "no cache async" mode for the first time. At least on my computer, the shaders would be cached by the NVIDIA GPU anyway (I am not sure if it is actually done by the GPU), so for the second time I run the program with sync mode, the unresponsive time is already really short (~100ms). Actually, it is shorter than both async modes with NVIDIA caches.
In that sense, async shader compilation seems to be of limited use if users can implement their own "force shader compile" function.
Side Notes
The performance (total unresponsive time) in the async modes seems to be worse if many custom shader materials are being used, and better if the async_mode of the materials is set to async_hidden.
Suggestion
In order to make this feature more user-friendly and effective, I think there could be a global setting to set the async_mode for all materials (the per-material setting could still be used to override it) so that people can easily use async_hidden for all materials for good performance. It might also be a good idea to have configurable fallback materials (both global and per-material) so that we can use some simple placeholder materials to balance performance and visuals.
Verdict
I am not sure whether I have misunderstood how async shader compilation works, so I would love to discuss it. I would also be happy if someone could explain how this ubershader approach works for custom shader materials.
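For reference, the global async_mode default with a per-material override suggested above could resolve roughly like this. It is a hypothetical sketch in Python: no such global setting exists in the engine as discussed here, and `effective_async_mode` is an illustrative name.

```python
from typing import Optional

# Mode names as plain strings, for illustration only.
ASYNC_VISIBLE = "async_visible"
ASYNC_HIDDEN = "async_hidden"


def effective_async_mode(global_default: str, material_override: Optional[str]) -> str:
    """A per-material setting wins when present; otherwise use the global default."""
    return material_override if material_override is not None else global_default


# With a global default of async_hidden, only materials that opt out differ.
mode_plain = effective_async_mode(ASYNC_HIDDEN, None)
mode_override = effective_async_mode(ASYNC_HIDDEN, ASYNC_VISIBLE)
```

The same resolution order would apply to the proposed fallback materials: a per-material placeholder first, then a global one.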
Steps to reproduce
1. Clear the NVIDIA shader cache (AppData\Local\NVIDIA\GLCache)
2. Run ShaderCachingTest.tscn
3. Press the P button to measure t1
4. Press the Enter button to measure t2
Minimal reproduction project
AsyncShaderCompilationTest.zip