Improve par_iter and Parallel #12904

Merged
merged 14 commits into from
Apr 23, 2024

Contributor

@re0312 re0312 commented Apr 8, 2024

Objective

  • Bevy usually uses Parallel::scope to collect items from par_iter, but scope is called for every item that satisfies the query, which causes a lot of unnecessary thread-local lookups.

Solution

  • Similar to Rayon, we introduce for_each_init for par_iter. Its init function is invoked only when a task is spawned for a group of items.
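To illustrate the difference, here is a minimal std-only sketch of the idea, not the Bevy implementation. The free function `for_each_init`, the `BatchBuffer` type, and `collect_even` are hypothetical names made up for this example, and one scoped thread per batch stands in for a task pool. The point it tries to show is that `init` runs once per batch while the per-item closure reuses the value `init` produced.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Mutex;
use std::thread;

/// Toy model of batched parallel iteration (assumption: one scoped thread per
/// batch stands in for one spawned task). `init` runs once per batch, `func`
/// runs once per item and receives the batch-local value.
fn for_each_init<I, L, Init, Func>(items: &[I], batch_size: usize, init: Init, func: Func)
where
    I: Sync,
    Init: Fn() -> L + Sync,
    Func: Fn(&mut L, &I) + Sync,
{
    assert!(batch_size > 0, "batch_size must be non-zero");
    let (init, func) = (&init, &func);
    thread::scope(|s| {
        for batch in items.chunks(batch_size) {
            s.spawn(move || {
                // The expensive setup (the thread-local lookup in the PR) happens here, once.
                let mut local = init();
                for item in batch {
                    func(&mut local, item);
                }
                // `local` is dropped here, at the end of the batch.
            });
        }
    });
}

/// Hypothetical batch-local buffer that flushes into a shared sink on drop.
struct BatchBuffer<'a> {
    sink: &'a Mutex<Vec<u32>>,
    items: Vec<u32>,
}

impl Drop for BatchBuffer<'_> {
    fn drop(&mut self) {
        self.sink.lock().unwrap().append(&mut self.items);
    }
}

/// Keeps the even items and reports how many times `init` was called.
fn collect_even(items: &[u32], batch_size: usize) -> (Vec<u32>, usize) {
    let sink = Mutex::new(Vec::new());
    let init_calls = AtomicUsize::new(0);
    for_each_init(
        items,
        batch_size,
        || {
            init_calls.fetch_add(1, Ordering::Relaxed);
            BatchBuffer { sink: &sink, items: Vec::new() }
        },
        |buf, item| {
            if *item % 2 == 0 {
                buf.items.push(*item);
            }
        },
    );
    (sink.into_inner().unwrap(), init_calls.into_inner())
}

fn main() {
    let items: Vec<u32> = (0..1000).collect();
    let (kept, init_calls) = collect_even(&items, 100);
    // prints "500 items kept after 10 init calls"
    println!("{} items kept after {} init calls", kept.len(), init_calls);
}
```

In this model a per-item approach would pay the setup cost 1000 times, while the batched version pays it 10 times. The order of the collected items across batches is not deterministic.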

Changelog

  • Added for_each_init.

Performance

check_visibility in many_foxes

~40% performance gain in check_visibility.

@SolarLiner added the A-ECS (Entities, components, systems, and events) and C-Performance (A change motivated by improving speed, memory usage or compile times) labels Apr 8, 2024

P-Asta commented Apr 8, 2024

It's cool.

@alice-i-cecile alice-i-cecile added this to the 0.14 milestone Apr 8, 2024
Contributor

@superdump superdump left a comment

The rendering parts look fine to me. And the benchmarks show it. I defer the parallel iteration / for each init parts to James :)

crates/bevy_utils/src/parallel_queue.rs
@@ -9,6 +13,37 @@ pub struct Parallel<T: Send> {
locals: ThreadLocal<Cell<T>>,
}

/// A scope guard of a `Parallel`, when this struct is dropped ,the value will writeback to its `Parallel`
pub struct ParallelGuard<'a, T: Send + Default> {
Member

@james7132 james7132 Apr 8, 2024

This was rejected from the initial Parallel PR because of the potential for surprising results if a thread panics while holding the guard, the guard is not dropped, or if multiple guards are retrieved from the same thread. We may need to switch this to use RefCell instead of Cell to avoid these problems. This should be fine since we don't allow iteration without a &mut Parallel, so the Sync bound on the ThreadLocal inner value isn't required.

Member

@JoJoJet JoJoJet Apr 8, 2024

I was originally against this style of guard (due to the reasons you listed). However, if enough people think it would be useful, my opinions on it aren't that strong. I don't think using a closure to limit the scope technically alleviates the problems with a guard; it just makes you less likely to run into them since the scope is much more explicit.

I do lean towards avoiding a drop impl though.

Contributor Author

I am also not a fan of this drop implementation because it hands the user a footgun. In theory, another way to achieve the same performance is to use par_chunk. It is much safer and clearer, but it requires some refactoring in QueryParIter.

@james7132 james7132 requested a review from JoJoJet April 8, 2024 17:27
@re0312 re0312 requested a review from james7132 April 9, 2024 16:35
Member

@james7132 james7132 left a comment

Other than a few nits, LGTM!

@james7132 added the S-Ready-For-Final-Review (This PR has been approved by the community. It's ready for a maintainer to consider merging it) label Apr 23, 2024
@alice-i-cecile alice-i-cecile added this pull request to the merge queue Apr 23, 2024
@alice-i-cecile added the M-Needs-Release-Note (Work that should be called out in the blog due to impact) label Apr 23, 2024
Merged via the queue into bevyengine:main with commit 0f27500 Apr 23, 2024
29 checks passed
@alice-i-cecile
Member

Thank you to everyone involved with the authoring or reviewing of this PR! This work is relatively important and needs release notes! Head over to bevyengine/bevy-website#1302 if you'd like to help out.

@alice-i-cecile removed the M-Needs-Release-Note (Work that should be called out in the blog due to impact) label Jun 19, 2024
8 participants