Unit/integration testing: Testing graphical and UI code. #1760
Test contexts
The minimal testing context was introduced in godotengine/godot#40980 without rendering capabilities, but has been working alright for unit testing specifically so far. The way I see it, it may be feasible to just introduce another The main challenge is being able to register setup/teardown methods with doctest, which is not a feature of doctest (without code duplication). The suggested setup/teardown mechanism in doctest is to use The entry point for unit and integration testing could be rewritten to accept things like:
This way, I think it would still be possible to use doctest for those (like godotengine/godot#42938). It means that the entry point would go through an additional interface layer, so to speak. This kind of setup would also help #1533 because it means no compatibility breakage would have to be done in the first place. But godotengine/godot#40148 didn't preserve compatibility with the old tests.
Graphical and UI code testing
I think testing graphical and UI code requires a
extends "res://addons/gut/test.gd"

# https://github.com/godotengine/godot/issues/32597
class TabContainerGuiInputCrash extends TabContainer:
    var ev = InputEventMouseButton.new()

    func _ready():
        var pm := PopupMenu.new()
        set_popup(pm)
        pm.queue_free()
        yield(get_tree(), "idle_frame")
        yield(get_tree(), "idle_frame")
        yield(get_tree(), "idle_frame")
        ev.pressed = true
        ev.button_index = BUTTON_LEFT
        ev.button_mask = BUTTON_LEFT
        ev.position = Vector2(0, 14)
        Input.parse_input_event(ev)
        yield(get_tree(), "idle_frame")
        yield(get_tree(), "idle_frame")
        Input.parse_input_event(ev)
        Input.parse_input_event(ev)

var container

func setup():
    var gut_window = get_parent().get_node('Gut')
    gut_window.hide() # need to hide to properly detect input event
    container = TabContainerGuiInputCrash.new()
    add_child(container)

func test_tab_container_gui_input():
    yield(yield_for(1.0, 'Hopefully no crash happens.'), YIELD)
    assert_true(true, "No crash, great!")

func teardown():
    container.queue_free()
The
doctest could be used for this, as for the GDScript integration tests in #1429, but it may be overkill, so perhaps an extra step would indeed be required to do this. But in theory, all this could be done from within a Godot project running on CI. This is where testing frameworks like GUT shine, in my opinion. For instance, I've been successfully running unit tests in Goost, but we still need a way to render stuff on CI.
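The test-context idea from this comment (an entry point that selects a context and runs setup and teardown around the test run) could be sketched roughly as follows. This is only an illustration in plain C++: `TestContext`, `TestContextRegistry`, and `run_in` are hypothetical names, not part of Godot or doctest, and `run_tests` stands in for whatever call actually runs the doctest suite.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <string>
#include <utility>

// Hypothetical: a named test context bundles the setup and teardown steps
// that run around the test runner (a "rendering" context might create the
// display and rendering servers in setup and destroy them in teardown).
struct TestContext {
    std::function<void()> setup;
    std::function<void()> teardown;
};

class TestContextRegistry {
public:
    void add(const std::string &name, TestContext context) {
        // Both callbacks are required in this sketch.
        assert(context.setup && context.teardown);
        contexts_[name] = std::move(context);
    }

    // Runs `run_tests` inside the named context. Returns -1 if the context
    // is unknown, otherwise whatever the runner returned.
    int run_in(const std::string &name, const std::function<int()> &run_tests) const {
        const auto it = contexts_.find(name);
        if (it == contexts_.end()) {
            return -1;
        }
        it->second.setup();
        const int result = run_tests();
        it->second.teardown();
        return result;
    }

private:
    std::map<std::string, TestContext> contexts_;
};
```

With something like this, the existing minimal context and a future rendering-capable context would differ only in the callbacks they register, and the test cases themselves would not need duplicated setup code.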
I noticed at https://bruvzg.github.io/using-godot-with-swiftshader-software-vulkan-emulation.html that you had to increase SwiftShader's bound descriptor set limit to 16 to get it to work with Godot. I'm curious why that's required. Currently only just over half of the Vulkan drivers support 16 or more: https://vulkan.gpuinfo.org/displaydevicelimit.php?name=maxBoundDescriptorSets. While this metric does not take deployments into account, it still seems to me that important classes of GPUs only support 8, or 4, bound descriptor sets. I don't mind upstreaming this change to permanently increase it, but I'd love to understand how an engine like Godot uses more than 4 descriptor sets, and what might be a good balance. It seems like no GPU has 16, so I guess 8 would already suffice? Any significant advantage from increasing it to 32? Thanks!
Godot's Edit: Actually it might work with 4 since godotengine/godot#44175 was merged.
I have checked the current master of Godot, and it's working with a limit of 4 descriptor sets, so this change is not necessary anymore.
Did anyone try Robot Framework to provide visual tests? Need to evaluate:
https://robotframework.org/#documentation We can use Robot Framework and pick one of the available frameworks that support Vulkan.
I have made a prototype using robotframework. This sample does two things:
I have a proof of concept that uses Nut.js here: https://github.com/Calinou/godot/tree/add-editor-ui-tests/misc/ui_tests For the editor, I don't know what kind of "workflows" would be best to apply within the automated tests though. Creating a basic project automatically, running it then stopping it would be useful, but it wouldn't be testing a whole lot of functionality. Also, I haven't figured out how to run it on a headless server (with Xvfb + Lavapipe/SwiftShader) yet.
I was using Robot Framework because it can run the editor using image recognition to find buttons and then execute the process under SwiftShader. @nikitalita worked on SwiftShader CI/CD integration. Edited: I evaluated Nut.js; it doesn't seem to have support for everything. https://robotframework.org/#resources
My initial attempts at visual regression testing have revealed that output can vary wildly between video cards and even different driver versions. It's not really noticeable to the human eye, but with a 1-to-1 comparison or even a fuzzy comparison, >95% of frame captures will fail if the test environment isn't set up exactly the same for the baseline and the subsequent tests (preferably the exact same machine). @myaaaaaaaaa have you encountered this?
See How (not) to test graphics algorithms. A dssim check should be able to work out decently if it has a large enough threshold, but in general, it's recommended to have a few "complete" test images over a lot of "partial" tests covering isolated features. This may be counter-intuitive, but it makes checking for regressions a lot less time-consuming. We should be careful about "alarm fatigue" in general when it comes to this kind of regression testing, as it's an easy trap to fall into.
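To make the role of the threshold concrete, here is a minimal sketch of a tolerance-based frame comparison. It deliberately uses a plain mean absolute difference instead of dssim's structural similarity metric, so it only illustrates the thresholding step; the function names and the flat 8-bit pixel buffer layout are assumptions made for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <vector>

// Mean absolute per-channel difference between two same-sized 8-bit
// images, normalized to [0, 1]. 0 means the buffers are identical.
// Note: this is a simple stand-in, not the metric dssim computes.
double mean_abs_difference(const std::vector<std::uint8_t> &a,
                           const std::vector<std::uint8_t> &b) {
    assert(a.size() == b.size() && !a.empty());
    double total = 0.0;
    for (std::size_t i = 0; i < a.size(); ++i) {
        total += std::abs(static_cast<int>(a[i]) - static_cast<int>(b[i]));
    }
    return total / (255.0 * static_cast<double>(a.size()));
}

// A capture passes when its difference from the baseline stays at or
// below the chosen threshold.
bool frames_match(const std::vector<std::uint8_t> &baseline,
                  const std::vector<std::uint8_t> &capture,
                  double threshold) {
    return mean_abs_difference(baseline, capture) <= threshold;
}
```

A threshold of zero corresponds to the 1-to-1 comparison that fails across drivers; a larger threshold tolerates small per-pixel noise while still rejecting a frame that is clearly wrong.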
I wonder if this would be useful for "whole game" tests. The developer would record the inputs, RNG seed and movie for a playthrough. The movie or perceptual hash of the movie would be stored, and then the inputs and RNG seed would be used to replay the movie and compare with the developer's playthrough. This could be useful to automatically test if a game still functions correctly when ported to another platform/godot version. A self-test option could also be included in published builds, for players to use. For the self-test, as the full thing might take too long for large and performance-heavy games, there could just be an option for a cut-down playthrough, or playthrough of a test level/test suite.
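The record-and-replay idea can be illustrated with a toy simulation that depends only on a seed and a list of per-frame inputs. Everything here is hypothetical (`Recording`, `simulate`, and the "wind" rule are invented for the example), and it only works because the toy game is fully deterministic, which a real game has to guarantee for itself.

```cpp
#include <cassert>
#include <cstdint>
#include <random>
#include <vector>

// Hypothetical recording: the RNG seed plus one input value per frame.
struct Recording {
    std::uint32_t seed;
    std::vector<int> inputs; // -1 = left, 0 = idle, 1 = right
};

// Toy stand-in for a game: each frame moves the player by the input and
// adds a random "wind" offset in {-1, 0, 1}. Returns the final position.
std::int64_t simulate(const Recording &recording) {
    // The raw output sequence of std::mt19937 for a given seed is fixed by
    // the C++ standard, so it is the same on every platform.
    std::mt19937 rng(recording.seed);
    std::int64_t position = 0;
    for (int input : recording.inputs) {
        const int wind = static_cast<int>(rng() % 3u) - 1;
        position += input + wind;
    }
    return position;
}

// Replays the recording and compares against the stored reference result.
bool replay_matches(const Recording &recording, std::int64_t expected) {
    assert(!recording.inputs.empty());
    return simulate(recording) == expected;
}
```

In a real setup the compared value would be a frame hash or similarity score rather than a single integer, but the structure is the same: store seed, inputs, and a reference result, then replay and compare.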
Godot's physics engines are not deterministic, so this wouldn't be useful unless your game doesn't rely on the physics engine at all (and uses its own deterministic physics implementation). Subtle differences in rendering (due to different GPU hardware or driver version) can also be introduced, which would cause the hash to be invalid.
Makes sense, regarding the physics. For the differences in rendering, I think perceptual hashes are designed to allow leeway for small changes. And I'd think you'd want to detect large differences in a game when using different GPU/driver configurations? But maybe this should be a separate discussion thread, actually.
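As an illustration of the leeway a perceptual hash gives, here is a sketch of the simple "average hash" technique, compared by Hamming distance. It assumes the frame has already been downscaled to an 8x8 grayscale thumbnail; `Thumbnail`, `frames_similar`, and the distance limit are choices made for this example, and production tools typically use more robust hashes.

```cpp
#include <array>
#include <bitset>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Assumed input: an 8x8 grayscale thumbnail of the frame, row-major.
using Thumbnail = std::array<std::uint8_t, 64>;

// Average hash: bit i is set when pixel i is brighter than the mean.
std::uint64_t average_hash(const Thumbnail &pixels) {
    unsigned total = 0;
    for (std::uint8_t p : pixels) {
        total += p;
    }
    const unsigned mean = total / 64;
    std::uint64_t hash = 0;
    for (std::size_t i = 0; i < pixels.size(); ++i) {
        if (pixels[i] > mean) {
            hash |= (std::uint64_t{1} << i);
        }
    }
    return hash;
}

// Number of bits in which two hashes differ (0 = same hash, 64 = opposite).
int hamming_distance(std::uint64_t a, std::uint64_t b) {
    return static_cast<int>(std::bitset<64>(a ^ b).count());
}

// Two frames count as similar when their hashes differ in at most
// `max_distance` bits.
bool frames_similar(const Thumbnail &a, const Thumbnail &b, int max_distance) {
    assert(max_distance >= 0 && max_distance <= 64);
    return hamming_distance(average_hash(a), average_hash(b)) <= max_distance;
}
```

Small brightness noise usually leaves every pixel on the same side of the mean, so the hash does not change, while a large difference such as an inverted image flips most bits.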
Yes, tools like dssim can be used to calculate a similarity score between two images. Tweaking the threshold value is an art in itself though, and you need to record your videos using lossless compression, which results in huge files.
Describe the project you are working on:
Godot engine.
Describe the problem or limitation you are having in your project:
Unit testing was introduced in godotengine/godot#40148, but currently there's no possibility to automatically test any GUI and rendering related code.
Related proposals: #1307 (testing contexts), #1533 (old tests had at least some rendering and UI tests)
Describe the feature / enhancement and how it helps to overcome the problem or limitation:
Implement off-screen DisplayServer for use on headless CI, and make it compatible with software Vulkan (SwiftShader) / OpenGL (OSMesa) implementations to run on CI without GPU, and add testing framework context with active rendering pipeline (initialized display and rendering servers, and normal project main loop).
Describe how your proposal will work, with code, pseudocode, mockups, and/or diagrams:
If this enhancement will not be used often, can it be worked around with a few lines of script?:
It can be used as part of CI to detect rendering, physics and GUI regressions, and can be used to quickly test specific hardware or driver versions for rendering issues (the same context should be usable with normal DisplayServers as well).
Is there a reason why this should be core and not an add-on in the asset library?:
It should be possible to achieve this with a module or a GDScript project, but it is probably better to have testing-related code in core for cleaner CI configs and to avoid duplicate code in multiple test projects.