-
-
Notifications
You must be signed in to change notification settings - Fork 97
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add threading support to Scene Nodes #4962
Comments
For another use-case, I've been running different threads under separate branches in the SceneTree when WAT is being run in a multi-threaded mode, so I'm sure this could improve it. |
This is very clever. In generic terms, I am going to say My other observation is that I am old. To me, it would be more intuitive to have a separate explicit function for the work and await its function state. Instead of awaiting In summary, you know better if coroutines that skip threads are something that Game folks find intuitive. I'm not representative. Maybe some syntactic sugar could be added to the language that allows you to make the "middle block" that runs in a thread look like some type of scope (Lambda.) Kinda like the scope around an auto lock that tells you where to be careful. Also, if you do this, I am going to use the living hell out of it :). It's very sexy, and maybe I will just grow to love the style. |
Follow-up: Yes, I can just wrap calling a function and awaiting it around this mechanism. My comment is more about what code style would be most intuitive and safe to bring to your user base. |
@derammo well, the proper way to do it would be simply overriding the three functions, so this is mostly a hack of the kind GDScript users like to do to keep everything in one function. |
Oh :). Yeah that would be explained in the docs :). I do think people in the general purpose programming world need help with the passing of state in and out of threads. The structure of the "normal" way of doing things that they copy from StackOverflow should somehow make it clear which things they need to safely make available to their thread and what they get back and when that result is available. JavaScript async routines still stump people, even though they made it so easy! I guess Game folks can probably handle it, but it does look dangerously like the code they are used to "just working" where you await states in your game and then write straight code and never worry about threads. WIth this, suddenly something that looks exactly the same is threaded. Something to think about maybe. |
If I override the three functions, then I lose all support for getting my result back, right? Can I please return user data from |
set_thread_process_dependency et alia could return an error if you can statically determine there is a cycle created by adding this extra link? |
Otherwise, you are "preparing" something you don't even need to know exists. Yes that would mean renaming the signals somehow too. |
It seems this proposal doesn't allow a node to process both in thread and not in thread. It might be an issue that would force users to separate some of their logic to another node for the sake of it, or not do it at all due to the effort required. If one node does a lot of work, it seems it will only be able to use one thread. Is it then intented to have to create multiple nodes just to subdivide the work? Dependencies help structuring the flow of what must run in serial and what can run in parallel within a set of threaded tasks, but requiring nodes means that you require access to them in the scene, forcing you to have a monolithic node structure. An alternative is to allow specifying which "processing group" a node belongs to, so it can be depended upon without having to have access to specific node instances. It might be possible to use both. Mistakes in threaded code are incredibly complex to figure out. Bugs can be wild and random. It matters here because the scene tree is where most Godot users live in. Dependencies help but they dont prevent such mistakes. In Unity the way jobs are defined allows to write annotations to tell what is read-only and write-only, which helps detecting errors, but in Godot a node can read and modify pretty much anything anywhere, which makes tracking mistakes quite hard. Curious to know why the whole thing needs to be core and not just the worker system. The only reason I can see is wanting to couple to the scene tree and nodes in particular, but if a "threaded task" can be defined as any class with an "execute" function, it becomes much easier to be a more decoupled system, more similar to existing job systems (which is what I mainly use). But I guess you wanted to avoid that because it is complex. Not sure if this is less complex though. It just sounds like the same thing with some of the features being fixed to scene tree features so it might be less code to write, with less flexibility. I have experience in other multithread task systems so I might prefer using something else. The optimal number of active concurrent threads is a small number, usually twice the number of CPU cores. Beyond that number, performance may be affected negatively. I see this proposal is based on a worker thread pool. I use a thread pool for my voxel engine too, but I cannot switch to using the same pool because it works differently and would force a dependency on Godot. Will there be a way to decide how many threads are dedicated to Godot's worker pool? The problem is if it uses all of them, it might lead to contention as more threads are used than what the CPU can handle. More generally, it's also a good idea to not use all CPU cores because players often have something else running than the game they are playing (music, voicechat, stream...) and in multiplayer games it's not rare to have to run multiple instances of the game on the same machine. If threaded process becomes common, it asks for Godot to also provide multithreaded debugger and profiler support. On top of that, a timeline/flamegraph view in the profiler becomes even more relevant, because only that can give a visual insight of what happens in parallel or serial over multiple threads (example of one with Tracy). Minor: would be nice if Godot named its threads. When debugging the engine there are a lot of them with a cryptic default name so it's time consuming to spot which one is which. |
No reason why you can't do this with the above proposal.
The new WorkerThreadPool lets you spawn worker threads from worker threads, so this is not a problem.
The above proposal is for nodes, to aid on the whole thing being more organized, but nothing avoids you from creating tasks and having them be dependent on your own outside of this.
I think this proposal needs to be accompanied with some added thread safety in the base node classes (Node, CanvasItem,Node2D, Node3D, etc). but that said, the whole point of the above proposal is that this happens after (or before) all the regular processing happened, and that you do all the scene access in either the pre or post stages, not the process stage.
Thats entirely how it works already, you can organize tasks in WorkerThreadPool on your own if you don't want to use nodes. This proposal is explicitly for optimizing nodes.
Yes, when you create a group you can already specify how many elements, solver task and threads for it. You can also specify if you want them to be real-time (expected to be for intra-frame work) or not (its fine if takes more frames).
You can name your tasks, so eventually the plan is to show a proper timeline with tasks executed over time in the Godot profiler/debugger. |
This would look deceively simple. I mean, many not very advanced users would be appealed to it, not aware of the many potential traps. For games that can really benefit from heavy threading, I'd rather let them use the same low and high level threading building blocks the engine has for itself, including the thread pool. However, threaded internal processing for some node types for the engine itself doesn't sound bad, being transparent for users. |
@RandomShaper To avoid problems this is why I think its important that there are three steps: setup, process and done. Only the middle one runs in a thread and user is not expected to not access anything from it (this can be very well specified in the docs). this would need to be accompanied to some changes to the base Node classes so their are more thread safe. |
@reduz that doesnt apply if you are profiling/debugging the engine though. Naming tasks is great, but I meant naming threads because that's the only thing that appears in a debugger/profiler. It's not essential to this proposal though, just figured I'd add a quick mention of it.
Would that only be specific to this system? I wish the profiler was able to show the full picture and not just a specific system only. |
@Zylann All thread usage in Godot is being moved to this system. |
As long as users are made aware how advanced this is and that their fate is in their hands, I'm not against it. Aside, despite this doesn't need a lot of code to be added to the engine, I still wonder how many real use cases will there be to justify the maintenance. |
[I broke up my previous questions as a bunch of separate posts to avoid TL;DR, not sure if that worked out. Hopefully when you have some time you can respond to them.] To RandomShaper's comment regarding it being deceptively simple, I agree 100%. Sometimes shorter code isn't better, like when you are actually doing something that is intrinsically hard to get right. Sometimes it's better to expose that and make people make choices consciously. Here are some alternate ideas for the "process" phase. Just throwing them out there to see if anyone likes it better.
If you ARE able to mutex things or make sure you don't access them from main during the time you are threaded, then you can make that clear by passing them in. On the way out, you get something passed back to you (the work product) as a result value that you don't have to mutex because the thread is done with it.
[edit] To clarify, this means no calls to _process, and all deferred calls would be queued. Any synchronous calls would block, which we could detect and report as probably bad design.
|
@reduz so if I (or a third party library) dont use a Godot worker pool I don't get to see anything in the profiler? If thats the case I feel it's the wrong justification (for profiling)... a profiler should allow instrumenting any code region and not litterally depend on on something that specific |
@reduz so if I (or a third party library) dont use a Godot worker pool I don't get to see anything in the profiler? I think that is to be expected |
@reduz It is not. When I use Tracy, a third-party profiler, I can report scopes to the profiler without Tracy depencing on Godot in the first place. So it follows that I should be able to report functions running in my own threads to Godot's profiler. And so can functions running in worker threads. Hence why I believe that dependency is unnecessary. |
@Zylann Oh sorry, I misunderstood what you meant. Sure, it should be fully possible to add this. |
Emitted from the worker thread, right? |
@derammo yeah |
TLDR, will finish reading the proposal and then come back with apropriate feedback. |
I have some more questins not clarified in the QA section ofthis proposal:
Q: this mock says the process method is emmited from the main thread, when i expected it to be running in a job thread, was this a mistake or i am just not understanding the call chain? my understanding was "prepare and done happens on main thread, and process happens on job thread"
|
@MarianoGnu for the last question, look up 3 posts. Confirmed typo, just not corrected. |
Hi, All this looks cool except for the inherent Dark Side of threads. One usually doesn't look into threading, unless there is a crying need for performance optimization and concurrent execution. So if we ever engage into threading, we have determined that we are in need of performance optimization. If the multi-threading delivers on its promise, we overcome the Dark Side and we come out as winners. If it doesn't, we remain stuck in questionable optimizations. Here is a classic Godot use case in need of performance optimization. I would be interested to know how this proposal is going to help in such elementary use case. Take it as a little challenge to elaborate on the proposal. When we have a BigScene, we have 2 hot spots which are a killer in a mono-threaded environment:
The optimizations available to us today, wrt the above two points, are:
With background loading, we use successfully threading techniques to parallelize processing and optimize performance. We load and build the scene while we render something else, entertaining, on the screen, coupled with some nice background music. With regard to adding the scene to the tree, we are stuck big time. As soon as we do The obvious optimization is to break up the BigScene into smaller ones, and/or serialize some of the BigScene component construction. That's the classic immediate approach. But if you think about it, what we really want is to say So how is this proposal going to help me in this case? While I see fairly well the threading aspect of already initialized nodes, I fail to see how this is going to help with the problem of adding BigScenes to the tree, currently done atomically within the main thread. Thanks. |
My engine is completely multi threaded and I use Godot nodes as a rendering system but I had to queue of events that will then run on the main thread at the start of the next frame How I have to pass values to nodes currently |
Would this proposal also aid physics performance by parallelizing |
Describe the project you are working on
Godot
Describe the problem or limitation you are having in your project
Godot 4 does a lot of improvements on the threading side for performance, but this happens mostly in the servers. Scene node processing performance is unaffected and all processing is still single threaded.
Describe the feature / enhancement and how it helps to overcome the problem or limitation
Allow scene nodes to, optionally, process on threads so processing can happen for many at the same time.
Describe how your proposal will work, with code, pseudo-code, mock-ups, and/or diagrams
Here is a proposed implementation:
C++ API
GDScript usage example
Nodes that would benefit from processing on threads
Applicable nodes:
Godot 4.0 or 4.1?
This was planned for 4.1, but with the implementation of WorkerThreadPool (that had to be made in order to have proper threaded HTML5 support), implementing this became relatively low hanging fruit and it could be done for 4.0 as part of the optimization effort planned after Beta, which will consist in creating a large amount of benchmarks and optimizing as much as possible before RC. It remains to be discussed and ultimately decided.
FAQ:
Q: What is the difference between this an an ECS?
A: This approach is not as cache-friendly per se, because in general Godot nodes do complex, higher level processing of not as many nodes. Because of this, they don't benefit as much from memory bandwidth improvement as they benefit from threading. Something more Akin to ECS style improvements is proposed in #2380.
Q: What is the difference between this and a job scheduler?
A: Entirely job scheduler based engines are far more complex to use, so Godot does some cheats by relying a bit more in the underlying operating system. Physics, Rendering and Scene are able to run in parallel in separate threads, so there is no need to schedule physics, rendering and game logic into a same graph by the user (although they do get scheduled internally efficiently inside WorkerThreadPool). This allows to greatly simplify threading code.
If this enhancement will not be used often, can it be worked around with a few lines of script?
No, this needs to be core.
Is there a reason why this should be core and not an add-on in the asset library?
N/A.
The text was updated successfully, but these errors were encountered: