When a task scope produces <= 1 task to run, run it on the calling thread immediately #932
This is an additional change continuing on from #892
The rationale for this change is that if we have just a single task to perform, it's almost always better to perform it immediately on the calling thread than to hand the work to another thread and block waiting on it, which ties up two threads.
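To illustrate the idea, here is a minimal sketch of the fast path. It is not the code in this PR: `Task`, `run_scope`, and `where_it_ran` are hypothetical names, and `std::thread::spawn` stands in for handing work to a worker pool.

```rust
use std::thread::{self, ThreadId};

/// Hypothetical task type: a boxed closure that can be sent to another thread.
type Task<T> = Box<dyn FnOnce() -> T + Send + 'static>;

/// Runs every task and returns the results in the order the tasks were given.
pub fn run_scope<T: Send + 'static>(mut tasks: Vec<Task<T>>) -> Vec<T> {
    // Fast path: with zero or one task there is nothing to parallelize,
    // so run it right here instead of blocking on another thread.
    if tasks.len() <= 1 {
        return tasks.pop().map(|task| task()).into_iter().collect();
    }

    // Slow path (assumption: plain OS threads stand in for a task pool).
    let handles: Vec<_> = tasks
        .into_iter()
        .map(|task| thread::spawn(task))
        .collect();

    // Block until every task has finished, preserving order.
    handles
        .into_iter()
        .map(|handle| handle.join().expect("task panicked"))
        .collect()
}

/// Helper for experimenting: runs `n` tasks that each report the id of the
/// thread they executed on.
pub fn where_it_ran(n: usize) -> Vec<ThreadId> {
    let tasks: Vec<Task<ThreadId>> = (0..n)
        .map(|_| Box::new(|| thread::current().id()) as Task<ThreadId>)
        .collect();
    run_scope(tasks)
}
```

Calling `where_it_ran(1)` should report the caller's own thread id, while `where_it_ran(3)` should report three other threads. The fast path also skips creating any handles, which is the kind of allocation saving this change is after.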
I don't see significant gains (or losses) from this in practice when profiling. To me the data indicates that the main thread is already consistently picking up the task before a worker thread can, and it doesn't appear that we are even waking another thread. While the benefits to thread behavior are inconclusive, this does remove some allocations when hitting the fast path of <= 1 task to perform in a scope.
Original
This is a baseline from before we fixed the deadlock. It shows what it looks like when the main thread is constantly blocked waiting on other threads. Frames averaged ~6.98ms
Recent Commit
The main thread has very high occupancy and there are clear runs where the main thread executed a bunch of systems without going out to other threads. (The top bar labeled "main" is almost fully green.) Frames averaged ~5.96ms
This PR
I don't see a significant difference. Frames averaged ~5.72ms. A ~4% difference from the recent commit (~5.96ms) is too narrow for me to draw any conclusion at this point.
Methodology
Each configuration was run a single time in a release build and profiled with Superluminal. The code was instrumented as shown here: aclysma@0782fe4