Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Workaround resource loading crashes due to buggy TLS #78977

Merged
merged 1 commit into from
Jul 3, 2023
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
21 changes: 13 additions & 8 deletions core/io/resource_loader.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -239,15 +239,15 @@ ResourceLoader::LoadToken::~LoadToken() {

Ref<Resource> ResourceLoader::_load(const String &p_path, const String &p_original_path, const String &p_type_hint, ResourceFormatLoader::CacheMode p_cache_mode, Error *r_error, bool p_use_sub_threads, float *r_progress) {
load_nesting++;
if (load_paths_stack.size()) {
if (load_paths_stack->size()) {
thread_load_mutex.lock();
HashMap<String, ThreadLoadTask>::Iterator E = thread_load_tasks.find(load_paths_stack[load_paths_stack.size() - 1]);
HashMap<String, ThreadLoadTask>::Iterator E = thread_load_tasks.find(load_paths_stack->get(load_paths_stack->size() - 1));
if (E) {
E->value.sub_tasks.insert(p_path);
}
thread_load_mutex.unlock();
}
load_paths_stack.push_back(p_path);
load_paths_stack->push_back(p_path);

// Try all loaders and pick the first match for the type hint
bool found = false;
Expand All @@ -263,7 +263,7 @@ Ref<Resource> ResourceLoader::_load(const String &p_path, const String &p_origin
}
}

load_paths_stack.resize(load_paths_stack.size() - 1);
load_paths_stack->resize(load_paths_stack->size() - 1);
load_nesting--;

if (!res.is_null()) {
Expand Down Expand Up @@ -296,8 +296,10 @@ void ResourceLoader::_thread_load_function(void *p_userdata) {
// Thread-safe either if it's the current thread or a brand new one.
CallQueue *mq_override = nullptr;
if (load_nesting == 0) {
load_paths_stack = memnew(Vector<String>);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So this seems to be the main potential behavior change here, where a new stack will be allocated each time this function is called.

For my peace of mind, can you confirm whether this function is only called once for each ResourceLoader instance, so it's equivalent to when the vector was a member of it? I.e. there's no risk of losing a previous load path when de-allocating this at the end of the function, compared to the previous behavior where it would subsist as class data?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It will be allocated per outermost stack frame, per resource load thread. The overhead of that is not ideal, but it's minimal. In any case, the data block of the vector will have to be allocated and resized normally, so this is just adding one additional allocation to the mix.

Regarding the persistence of data, I could have added a DEV_ASSERT(load_paths_stack.size()) == 0 to this function before this PR. Therefore, there no loss of data.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In any case, this is a shot in the dark, I would need a production-like build to be kind of sure it does the trick. The other idea of making TLS data compilation-unit local may be better. I will make another PR and hopefully you can make builds with both ideas?


if (!load_task.dependent_path.is_empty()) {
load_paths_stack.push_back(load_task.dependent_path);
load_paths_stack->push_back(load_task.dependent_path);
}
if (!Thread::is_main_thread()) {
mq_override = memnew(CallQueue);
Expand Down Expand Up @@ -356,8 +358,11 @@ void ResourceLoader::_thread_load_function(void *p_userdata) {

thread_load_mutex.unlock();

if (load_nesting == 0 && mq_override) {
memdelete(mq_override);
if (load_nesting == 0) {
if (mq_override) {
memdelete(mq_override);
}
memdelete(load_paths_stack);
Comment on lines +362 to +365
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It might be good practice to set pointers to nullptr after calling memdelete to reduce risk of a dangling pointer being inadvertently referenced.

That said, I assume there must already be some sort of guarantee that these variables are not referenced outside of threaded load, so it's not something to change for 4.1 ... just an observation. (I'm not too familiar with this code)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're right. At that point the thread is about to die and no one could access the variable anymore. That said, it is when assumptions fail to be true for whatever reason (code evolution, etc.) when one would have wished to have put guidelines like set-to-null-after-free in practice without further consideration.

}
}

Expand Down Expand Up @@ -1166,7 +1171,7 @@ bool ResourceLoader::timestamp_on_load = false;

thread_local int ResourceLoader::load_nesting = 0;
thread_local WorkerThreadPool::TaskID ResourceLoader::caller_task_id = 0;
thread_local Vector<String> ResourceLoader::load_paths_stack;
thread_local Vector<String> *ResourceLoader::load_paths_stack;

template <>
thread_local uint32_t SafeBinaryMutex<ResourceLoader::BINARY_MUTEX_TAG>::count = 0;
Expand Down
2 changes: 1 addition & 1 deletion core/io/resource_loader.h
Original file line number Diff line number Diff line change
Expand Up @@ -182,7 +182,7 @@ class ResourceLoader {

static thread_local int load_nesting;
static thread_local WorkerThreadPool::TaskID caller_task_id;
static thread_local Vector<String> load_paths_stack;
static thread_local Vector<String> *load_paths_stack; // A pointer to avoid broken TLS implementations from double-running the destructor.
static SafeBinaryMutex<BINARY_MUTEX_TAG> thread_load_mutex;
static HashMap<String, ThreadLoadTask> thread_load_tasks;
static bool cleaning_tasks;
Expand Down