GDScript engine destruction causes deadlock when destroying lambdas #85151

Closed
Efimero opened this issue Nov 20, 2023 · 7 comments · Fixed by #85248

Comments

@Efimero

Efimero commented Nov 20, 2023

Godot version

v4.2.rc.custom_build [dfd61cd]

System information

Debian, Vulkan (Forward+)

Issue description

Stack Trace
#0  futex_wait (private=0, expected=2, futex_word=0x55556474d280) at ../sysdeps/nptl/futex-internal.h:146
#1  __GI___lll_lock_wait (futex=futex@entry=0x55556474d280, private=0) at ./nptl/lowlevellock.c:49
#2  0x00007ffff7d61682 in lll_mutex_lock_optimized (mutex=0x55556474d280) at ./nptl/pthread_mutex_lock.c:48
#3  ___pthread_mutex_lock (mutex=0x55556474d280) at ./nptl/pthread_mutex_lock.c:93
#4  0x0000555557f673cd in __gthread_mutex_lock (__mutex=0x55556474d280) at /usr/include/x86_64-linux-gnu/c++/13/bits/gthr-default.h:749
#5  0x0000555557f6741d in __gthread_recursive_mutex_lock (__mutex=0x55556474d280) at /usr/include/x86_64-linux-gnu/c++/13/bits/gthr-default.h:811
#6  0x0000555557f67452 in std::recursive_mutex::lock (this=0x55556474d280) at /usr/include/c++/13/mutex:120
#7  0x0000555557f67d4f in std::unique_lock<std::recursive_mutex>::lock (this=0x7fffffffcc70) at /usr/include/c++/13/bits/unique_lock.h:141
#8  0x0000555557f67c7b in std::unique_lock<std::recursive_mutex>::unique_lock (this=0x7fffffffcc70, __m=...)
    at /usr/include/c++/13/bits/unique_lock.h:71
#9  0x00005555582a1d67 in MutexLock<MutexImpl<std::recursive_mutex> >::MutexLock (p_mutex=..., this=<optimized out>) at ./core/os/mutex.h:122
#10 GDScript::_remove_func_ptr_to_update (p_func_ptr_element=0x555564749d30) at modules/gdscript/gdscript.cpp:1424
#11 0x000055555848002c in GDScriptLambdaCallable::~GDScriptLambdaCallable (this=0x5555646bd690, __in_chrg=<optimized out>)
    at modules/gdscript/gdscript_lambda_callable.cpp:155
#12 0x000055555bf85196 in memdelete<CallableCustom> (p_class=0x5555646bd690) at ./core/os/memory.h:109
#13 0x000055555bf83e7a in Callable::~Callable (this=0x5555647355d8, __in_chrg=<optimized out>) at core/variant/callable.cpp:382
#14 0x000055555bf8b8a7 in Variant::_clear_internal (this=0x5555647355d0) at core/variant/variant.cpp:1373
#15 0x0000555557f58605 in Variant::clear (this=0x5555647355d0) at core/variant/variant.h:302
#16 0x0000555557f58672 in Variant::~Variant (this=0x5555647355d0, __in_chrg=<optimized out>) at core/variant/variant.h:788
#17 0x0000555557f9d0da in CowData<Variant>::_unref (this=0x555564749a18, p_data=0x5555647355a0) at ./core/templates/cowdata.h:216
#18 0x0000555557f985b0 in CowData<Variant>::~CowData (this=0x555564749a18, __in_chrg=<optimized out>) at ./core/templates/cowdata.h:415
#19 0x0000555557f92f28 in Vector<Variant>::~Vector (this=0x555564749a10, __in_chrg=<optimized out>) at ./core/templates/vector.h:290
#20 0x00005555582a6070 in GDScriptInstance::~GDScriptInstance (this=0x5555647499c0, __in_chrg=<optimized out>) at modules/gdscript/gdscript.cpp:2096
#21 0x000055555c2d7d5a in memdelete<ScriptInstance> (p_class=0x5555647499c0) at ./core/os/memory.h:109
#22 0x000055555c2d5493 in Object::~Object (this=0x555564749830, __in_chrg=<optimized out>) at core/object/object.cpp:1979
#23 0x000055555800d898 in RefCounted::~RefCounted (this=0x555564749830, __in_chrg=<optimized out>) at ./core/object/ref_counted.h:53
#24 0x000055555805c0a1 in memdelete<RefCounted> (p_class=0x555564749830) at ./core/os/memory.h:109
#25 0x000055555bf8b85a in Variant::_clear_internal (this=0x555564749530) at core/variant/variant.cpp:1360
#26 0x0000555557f58605 in Variant::clear (this=0x555564749530) at ./core/variant/variant.h:302
#27 0x0000555557f58672 in Variant::~Variant (this=0x555564749530, __in_chrg=<optimized out>) at ./core/variant/variant.h:788
#28 0x0000555557f9d0da in CowData<Variant>::_unref (this=0x55556474edb0, p_data=0x555564749530) at ./core/templates/cowdata.h:216
#29 0x0000555557f985b0 in CowData<Variant>::~CowData (this=0x55556474edb0, __in_chrg=<optimized out>) at ./core/templates/cowdata.h:415
#30 0x0000555557f92f28 in Vector<Variant>::~Vector (this=0x55556474eda8, __in_chrg=<optimized out>) at ./core/templates/vector.h:290
#31 0x000055555bf7f1c6 in ArrayPrivate::~ArrayPrivate (this=0x55556474eda0, __in_chrg=<optimized out>) at core/variant/array.cpp:44
#32 0x000055555bf7f1f5 in memdelete<ArrayPrivate> (p_class=0x55556474eda0) at ./core/os/memory.h:109
#33 0x000055555bf794cf in Array::_unref (this=0x5555645eca48) at core/variant/array.cpp:79
#34 0x000055555bf7d9d0 in Array::~Array (this=0x5555645eca48, __in_chrg=<optimized out>) at core/variant/array.cpp:823
#35 0x000055555bf8b8e6 in Variant::_clear_internal (this=0x5555645eca40) at core/variant/variant.cpp:1382
#36 0x0000555557f58605 in Variant::clear (this=0x5555645eca40) at ./core/variant/variant.h:302
#37 0x0000555557f58672 in Variant::~Variant (this=0x5555645eca40, __in_chrg=<optimized out>) at ./core/variant/variant.h:788
#38 0x0000555557f9d0da in CowData<Variant>::_unref (this=0x5555643a65d8, p_data=0x5555645ec8c0) at ./core/templates/cowdata.h:216
#39 0x0000555557f988a2 in CowData<Variant>::resize<false> (this=0x5555643a65d8, p_size=0) at ./core/templates/cowdata.h:275
#40 0x0000555557f931ea in Vector<Variant>::resize (this=0x5555643a65d0, p_size=0) at ./core/templates/vector.h:94
#41 0x00005555582b475d in Vector<Variant>::clear (this=0x5555643a65d0) at ./core/templates/vector.h:87
#42 0x00005555582a277a in GDScript::clear (this=0x5555643a6330, p_clear_data=0x0) at modules/gdscript/gdscript.cpp:1545
#43 0x00005555582a6c9d in GDScriptLanguage::finish (this=0x5555607fc2e0) at modules/gdscript/gdscript.cpp:2221
#44 0x000055555c2e7f82 in ScriptServer::finish_languages () at core/object/script_language.cpp:259
#45 0x0000555557fef64b in Main::cleanup (p_force=false) at main/main.cpp:3783
#46 0x0000555557f57961 in main (argc=1, argv=0x7fffffffdee8) at platform/linuxbsd/godot_linuxbsd.cpp:76

Not sure what causes this, but it seems having a vector/array of Callables/lambdas or of something that contains lambdas is causing a deadlock.

The deadlock only happens AFTER calling get_tree().quit() (or Alt+F4) and AFTER the tree has been removed. I am using static variables in classes to hold lists of callables, but even manually removing those references before quitting doesn't change the outcome.

I can't pin it down from the Godot debugger because it blocks the keyboard interrupt signal, hence the GDB backtrace. The project hangs on the last frame and becomes unresponsive to anything except a kill.

I don't know how to provide more information on this, sorry.

Steps to reproduce

No clue!

Minimal reproduction project

N/A

@Efimero
Author

Efimero commented Nov 20, 2023

I managed to figure out how to stop it from happening. It is definitely one of the many lists (vector/dictionary) of Callables, or of classes that hold a Callable. I explicitly clear()'d all the lists I had in static variables, and now it closes properly. While this may be a circular reference problem, I would have assumed there is a way to run the teardown logic in an order that deallocates the static variables first. I don't know what that could break, though.
I hope this helps, even if it's not enough for an MRP.
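
The workaround above amounts to destroying the long-lived Callables at a known point instead of leaving them to engine teardown. What follows is a minimal C++ sketch of that ordering only. Every name in it (`Registry`, `LambdaLike`, `clear_before_shutdown`) is hypothetical and none of it is Godot's code. It mirrors just the shape visible in the stack trace, where a callable's destructor takes a recursive mutex to unregister itself. It does not reproduce the deadlock and makes no claim about its root cause.

```cpp
#include <cstddef>
#include <memory>
#include <mutex>
#include <set>
#include <vector>

// Hypothetical stand-in for a registry guarded by a recursive mutex.
// These names are illustrative assumptions, not Godot's API.
class Registry {
public:
    void add(const void *p) {
        std::lock_guard<std::recursive_mutex> lock(mutex_);
        entries_.insert(p);
    }
    void remove(const void *p) {
        std::lock_guard<std::recursive_mutex> lock(mutex_);
        entries_.erase(p);
    }
    std::size_t size() {
        std::lock_guard<std::recursive_mutex> lock(mutex_);
        return entries_.size();
    }

private:
    std::recursive_mutex mutex_;
    std::set<const void *> entries_;
};

// Hypothetical callable: registers itself on construction and
// unregisters (taking the lock) in its destructor.
class LambdaLike {
public:
    explicit LambdaLike(Registry &r) : registry_(r) { registry_.add(this); }
    ~LambdaLike() { registry_.remove(this); }
    LambdaLike(const LambdaLike &) = delete;
    LambdaLike &operator=(const LambdaLike &) = delete;

private:
    Registry &registry_;
};

// Fills a long-lived list, clears it explicitly, and returns how many
// entries the registry still tracks afterwards.
std::size_t clear_before_shutdown(std::size_t count) {
    Registry registry;
    std::vector<std::unique_ptr<LambdaLike>> long_lived;
    for (std::size_t i = 0; i < count; ++i) {
        long_lived.push_back(std::make_unique<LambdaLike>(registry));
    }
    // Explicit clear: the destructors run here, while the registry
    // is still fully alive, rather than at some later teardown step.
    long_lived.clear();
    return registry.size();
}
```

Calling `clear_before_shutdown(3)` leaves the registry empty, because each destructor has already unregistered its object by the time the function returns.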

@YuriSizov
Contributor

Judging by the stack trace, this may be fixed by #85170. Could you give it a test?

@Efimero
Author

Efimero commented Nov 21, 2023

I tested with v4.2.rc.custom_build.1ed691914 from that PR and got the same deadlock and a similar stack trace. I think it could be related to lambdas capturing state in closures and causing a circular reference, but I still can't get an MRP that does the same.
My project is public: https://git.lubar.me/efi/cursed-farm-godot
To remove the workaround, just comment out lines 18 and 19 of https://git.lubar.me/efi/cursed-farm-godot/src/commit/32441f0beac092bf94684df769f51f338e907978/Game.gd#L18 and it will hang during quit.
Sorry that I can't reproduce it otherwise, it's beyond me how it happens.

@YuriSizov
Contributor

YuriSizov commented Nov 21, 2023

> I tested with v4.2.rc.custom_build.1ed691914 from that PR and got the same deadlock and a similar stack trace.

I can actually see a difference from that commit. I can reproduce the issue with a mutex lock before applying 1ed6919, and I get a distinct crash after applying it. Interestingly enough, the lock issue doesn't seem to be consistent, but the new crash is consistent and happens on every exit, which may indicate a new regression.

cc @RandomShaper

@akien-mga
Member

I can reproduce the deadlock on Linux with GCC compiled binary from 1faf2f5, with the steps outlined in #85151 (comment).

Almost the same stacktrace as in the OP.

@adamscott
Member

I'm trying to create an MRP for the issue. I could reproduce it by cloning the cursed-farm-godot project, but I have not yet been able to reproduce it minimally.

@RandomShaper
Member

I'm actively investigating this based on the Cursed Farm project. I already have a fairly good idea of what's going on, and I should be able to come up with a fix soon.
