Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Crash on shutdown if certain singletons were used in extension #889

Closed
mihe opened this issue Oct 10, 2022 · 18 comments · Fixed by #1458
Closed

Crash on shutdown if certain singletons were used in extension #889

mihe opened this issue Oct 10, 2022 · 18 comments · Fixed by #1458
Labels
bug This has been identified as a bug confirmed crash topic:gdextension This relates to the new Godot 4 extension implementation

Comments

@mihe
Copy link
Contributor

mihe commented Oct 10, 2022

When using certain singletons like OS::get_singleton() from within a GDExtension it seems like internal::gdn_interface->object_get_instance_binding in the generated get_singleton() ends up storing callbacks on the editor side to OS::___binding_callbacks which point to functions on the extension side.

These callbacks (___binding_free_callback specifically) end up getting invoked on editor shutdown (at Object::~Object()) during memdelete(_os) in unregister_core_types(). The problem is that by then we've already unloaded the extension libraries in memdelete(native_extension_manager), resulting in trying to call a function that no longer exists, leading to an access violation crash.

This also applies to Engine::get_singleton(). More singletons might be affected that I haven't run into yet.

Editor: v4.0.beta.custom_build (ca25c6e0a)
Architecture: x86_64
Compiler: MSVC (v19.33.31630)
OS: Windows 11 (v10.0.22621)

Minimal repro:

#include <godot/gdnative_interface.h>

#include <godot_cpp/classes/os.hpp>
#include <godot_cpp/core/class_db.hpp>
#include <godot_cpp/core/defs.hpp>
#include <godot_cpp/godot.hpp>

using namespace godot;

void initialize_example_module(ModuleInitializationLevel p_level) {
    if (p_level == MODULE_INITIALIZATION_LEVEL_SCENE) {
        OS::get_singleton();
    }
}

extern "C" {

GDNativeBool GDN_EXPORT example_library_init(
    const GDNativeInterface* p_native_iface,
    const GDNativeExtensionClassLibraryPtr p_native_lib,
    GDNativeInitialization* p_native_init
) {
    GDExtensionBinding::InitObject init_obj(p_native_iface, p_native_lib, p_native_init);
    init_obj.register_initializer(initialize_example_module);
    init_obj.set_minimum_library_initialization_level(MODULE_INITIALIZATION_LEVEL_SCENE);
    return init_obj.init();
}

} // extern "C"
@saki7
Copy link
Contributor

saki7 commented Apr 6, 2023

I can reproduce this on current master.

@saki7
Copy link
Contributor

saki7 commented Jun 7, 2023

@akien-mga
cc: @vnen @dsnopek
Could you triage this bug for 4.1? I have confirmed that this bug would make application crash every time if I use singleton instance in godot-cpp.

There is one unofficial patch which was published by 3rd party project: effekseer/EffekseerForGodot4@a7f4fa4

I have applied above patch to my local fork, and I have confirmed that the crash is gone. I think we can use it as a reference. However, I'm not 100% sure if the SingletonBinder approach is the correct way to solving this issue. I have asked the author to propose the patch here, but the author has not responded.

@mihe
Copy link
Contributor Author

mihe commented Jun 7, 2023

That SingletonBinder in EffekseerForGodot4 assumes that all referenced singleton bindings call ___binding_create_callback at some point, which is not the case, and will double-free if the extension does something like expose a custom physics server, where the "binding" is effectively created by the extension itself.

I've been running a somewhat similar hack/fix for the past couple of months, which does account for this, and seems to work as far as I know. I'm not proposing that this be merged upstream though, as I'm sure there are more esoteric interactions where this too will break.

Someone with a much deeper understanding of the godot-cpp bindings likely needs to take a closer look at the whole binding setup in godot-cpp.

@saki7
Copy link
Contributor

saki7 commented Jun 7, 2023

@mihe Your patch seems straightforward to me. Could you elaborate on the circumstances where your patch wouldn't work?

I'm eager to find a proper fix for this since I'm having a hard time convincing my teammates that Godot 4.x is production ready whereas it's crashing every time on program exit.

@mihe
Copy link
Contributor Author

mihe commented Jun 8, 2023

I'm not saying that it wouldn't work, I'm saying that I can't confidently state that it will work in the general case.

Again, someone with a deeper understanding of the bindings in godot-cpp would need to lead the charge here. I am not that guy. :)

EDIT: Also, my hack/fix prioritizes keeping the changes contained in one place, hence the liberal use of lambdas and whatnot. If this were to be merged upstream it should probably be heavily refactored and not circumvent the regular binding callbacks.

@dsnopek
Copy link
Collaborator

dsnopek commented Jun 8, 2023

I haven't done any experimenting with this, but based on the the description and the linked patch, it sounds like the problem would be solved if we deleted all the singleton wrappers in GDExtensionBinding::deinitialize_level() (which is called when unloading the extension)?

Assuming that's correct, I think we could probably solve this with a simpler patch. Object::~Object() is virtual (meaning deleting an Object * should call the child class' destructor too), and so I think we could have the get_singleton() functions store the Object * in a list when the wrapper is first created, and then GDExtensionBinding::deinitialize_level() could loop through and delete them all. But I don't know if that'd really work until I try it!

I'll add experimenting with this to my TODO list :-)

@mihe
Copy link
Contributor Author

mihe commented Jun 8, 2023

it sounds like the problem would be solved if we deleted all the singleton wrappers in GDExtensionBinding::deinitialize_level()

I'm mostly recalling this from the murky depths of ancient memory, so don't take my word for it, but...

Core doesn't have any way of releasing/unsetting/clearing bindings, so no matter what you delete on the godot-cpp side of things you will still end up with stale function pointers getting invoked by core at shutdown. To make matters worse, some singletons outlive the extension libraries and some don't.

This is what my original PR was meant to address, by adding a clear_instance_binding to core and (somewhat crudely) use that from godot-cpp. It was however quickly co-opted about talks of hot-reloading and more general concerns of binding lifetimes, and subsequently sat unreviewed for 5+ months.

The hack/fix I'm currently using, as well as the one linked from EffekseerForGodot4, skirts around this by exploiting the null check for free_callback in Object::~Object and does the deallocation of the bindings "manually" at shutdown by exploiting static class destructors.

@dsnopek
Copy link
Collaborator

dsnopek commented Jun 8, 2023

Ah, ok, thanks!

Reading through the comments on your linked PR, Zylann brings up an important point that this issue probably affects all objects, not just singletons.

Hm, it's starting to sound like this is a bigger, trickier problem to solve than it seemed at first! When I have the time, I'll dig into it deeper.

@YuriSizov
Copy link
Contributor

I can confirm that this is still a problem with engine singletons on exit, in my case the Time one. I'm not sure if OP suggests this only affects the editor, but I actually experience the crash when running the project. But the basic premise is the same, if you ever access the Time singleton, then on exit, as the engine tries to run the destructor for it, an attempt to call free_callback causes a crash due to invalid memory access.

image

I'm on cc1217a in godot-cpp and godotengine/godot@aef11a1 in godot.

@mihe
Copy link
Contributor Author

mihe commented Apr 1, 2024

I'm not sure if OP suggests this only affects the editor, but I actually experience the crash when running the project.

I can't remember if I'd only seen it in the context of the editor at that time, but yes, it most definitely happens when shutting down the game/application as well.

I know at least one other published GDExtension (Debug Draw 3D) that suffers from this issue, causing a crash on shutdown every single time for anyone who has it loaded. I'm sure there are other ones out there as well.

I'm surprised this issue hasn't gotten more traction than it has. I guess people just don't run their editor/game through a debugger or something?

@mihe mihe changed the title Editor crashes on shutdown if certain singletons were used in extension Crash on shutdown if certain singletons were used in extension Apr 1, 2024
@AThousandShips AThousandShips removed this from the 4.1 milestone Apr 1, 2024
@dsnopek
Copy link
Collaborator

dsnopek commented Apr 1, 2024

I'm surprised this issue hasn't gotten more traction than it has. I guess people just don't run their editor/game through a debugger or something?

Well, unfortunately, I don't think we know how to solve it yet. I don't remember the details, but the last time we discussed this, it was shown that this issue affects more than just singletons, but singletons are the the easiest way to trigger it.

If anyone has a proposed solution, that'd be great! Otherwise, I'll personally look at it eventually - there's just so many things :-)

@mihe
Copy link
Contributor Author

mihe commented Apr 1, 2024

Yes, sorry, I meant that I'm surprised this issue isn't flooded with people facing the same problem, not that I'm surprised that it hasn't been fixed yet. It's a tricky one.

I just feel like every non-trivial extension should run into this crash sooner or later.

@YuriSizov
Copy link
Contributor

YuriSizov commented Apr 1, 2024

If anyone has a proposed solution, that'd be great!

Before looking into this my naive idea was to keep track/count references for every singleton (or, more broadly, every Godot object), and make sure to clear/null all the pointers when your extension doesn't need it anymore. This idea seems to be very similar to the kind of bookkeeping you've implemented for hot reloading extensions. Is there any reason why we cannot extend this approach to address the issue at hand?

I am probably missing something, but I'd appreciate it if this was resolved sooner rather than later. While crashes on exit will likely remain unnoticed by end users, it affects development process greatly, as you basically cannot exit your application or the editor normally, if you have a debugger attached.

@mihe
Copy link
Contributor Author

mihe commented Apr 1, 2024

I'd appreciate it if this was resolved sooner rather than later. While crashes on exit will likely remain unnoticed by end users, it affects development process greatly, as you basically cannot exit your application or the editor normally, if you have a debugger attached.

If you just need a short-term fix for your own use, and you don't mind forking/patching godot-cpp, my workaround has been working well in Godot Jolt for a good long while now.

I haven't merged with upstream since 4.2-stable was released though, so it could be that it's due for another touch-up, but so far they've all been quick and painless.

@dsnopek
Copy link
Collaborator

dsnopek commented Apr 2, 2024

This idea seems to be very similar to the kind of bookkeeping you've implemented for hot reloading extensions. Is there any reason why we cannot extend this approach to address the issue at hand?

Well, when working on hot reloading, the feeling was that this bookkeeping added too much of a performance and memory hit to do in release builds. Hot reload is a developer tool, and so the goal was to only add this extra overhead during development.

It'd be great if we could either:

  1. Change the order that things are shutdown, such that we don't encounter the problem. There are other existing issues with the order that GDExtensions are initialized and shutdown relative to other things in Godot (ex GDScript trying to access GDExtension classes before they exist), so maybe a comprehensive re-think of that could fix it? Or,
  2. Somehow check if the GDExtension associated with free_callback is still loaded before calling it. This way we don't need to clear it on every object. We already associate the callbacks with a void *p_token which is the GDExtension * in the case of GDExtensions, which we could check for in GDExtensionManager. The only problem is that C# also uses these callbacks, and, in that case, the void *p_token is something else entirely. This won't be a problem once C# uses GDExtension, but for now, we'd probably need to track some new piece of data with the binding callbacks? Or, maybe, we can maintain a list of void *p_tokens that are no longer valid that we update when a GDExtension is unloaded, and then we don't run the free_callback if it's on the list?
  3. Some other clever idea we haven't thought of yet :-)

@YuriSizov
Copy link
Contributor

Thanks for the write-up! I don't think number 1 is feasible with how the engine is designed. Both startup and shutdown are messy order-wise, and with extensions being a mechanism to put your fingers into almost every aspect of the engine, I don't know if there is a way to describe either routine in a non-conflicting way. Anything short of a manifest for each extension, which particular system it uses, won't let us make any assumption.

So I think that number 2 is more interesting and promising. To work around the C# problem, can we add a flag to the bindings struct that would indicate the user of the struct, be it an extension or the C# module? All we'd need is one bit, and then you could apply your proposed solution conditionally. Although C# may also have such issues. I remember fixing some bugs related to the order in which things happen where C#-scripted classes might've been accessed by a core system before scripting was ready. Sounds related to the overarching issue here.

@akien-mga
Copy link
Member

akien-mga commented May 7, 2024

I ran into this too. In case it's useful, here's another "MRP" that adds the crash to this repo's test project, for ease of testing with pre-existing SCons scaffolding.

diff --git a/test/src/example.cpp b/test/src/example.cpp
index 3ec8bca..89e5d12 100644
--- a/test/src/example.cpp
+++ b/test/src/example.cpp
@@ -11,6 +11,7 @@
 #include <godot_cpp/classes/label.hpp>
 #include <godot_cpp/classes/multiplayer_api.hpp>
 #include <godot_cpp/classes/multiplayer_peer.hpp>
+#include <godot_cpp/classes/time.hpp>
 #include <godot_cpp/variant/utility_functions.hpp>
 
 using namespace godot;
@@ -297,6 +298,8 @@ Example::~Example() {
 
 // Methods.
 void Example::simple_func() {
+	int time = Time::get_singleton()->get_ticks_msec();
+	printf("time: %d\n", time);
 	emit_custom_signal("simple_func", 3);
 }

I only skimmed through the discussion, but if we're not sure about how to handle this properly for all potential problematic cases, but most users seem to just be affected by using singletons, I think a temporary fix like the one @mihe is using for godot-jolt might be worth considering, until we have a better idea.

@dsnopek
Copy link
Collaborator

dsnopek commented May 7, 2024

I've just posted PR #1458, which should fix this issue specifically for singletons. It doesn't address the general case where a GDExtension is unloaded before an Object can clean up its instance binding to that GDExtension.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug This has been identified as a bug confirmed crash topic:gdextension This relates to the new Godot 4 extension implementation
Projects
None yet
7 participants