-
-
Notifications
You must be signed in to change notification settings - Fork 3.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Improve par_iter and Parallel #12904
Conversation
its cool |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The rendering parts look fine to me. And the benchmarks show it. I defer the parallel iteration / for each init parts to James :)
@@ -9,6 +13,37 @@ pub struct Parallel<T: Send> { | |||
locals: ThreadLocal<Cell<T>>, | |||
} | |||
|
|||
/// A scope guard of a `Parallel`, when this struct is dropped ,the value will writeback to its `Parallel` | |||
pub struct ParallelGuard<'a, T: Send + Default> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This was rejected from the initial Parallel
PR because of the potential for surprising results if a thread panics while holding the guard, the guard is not dropped, or if multiple guards are retrieved from the same thread. We may need to switch this to use RefCell
instead of Cell
to avoid these problems. This should be fine since we don't allow iteration without a &mut Parallel
, so the Sync bound on the ThreadLocal
inner value isn't required.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was originally against this style of guard (due to the reasons you listed) -- however if enough people think it would be useful my opinions on it aren't that strong. Using a closure to limit the scope technically doesn't alleviate the problems with a guard I don't think, it just makes you less likely to run into them since the scope is much more explicit.
I do lean towards avoiding a drop impl though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am also not a fan of using this drop implementation because it provides a gun to the user. In theory, another method to achieve the same performance is by using par_chunk
. It is much safer and clearer, but it requires some refactoring in the QueryParIter
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Other than a few nits, LGTM!
Co-authored-by: James Liu <[email protected]>
Co-authored-by: James Liu <[email protected]>
Co-authored-by: James Liu <[email protected]>
Thank you to everyone involved with the authoring or reviewing of this PR! This work is relatively important and needs release notes! Head over to bevyengine/bevy-website#1302 if you'd like to help out. |
Objective
Parallel::scope
to collect items frompar_iter
, butscope
will be called with every satifified items. it will cause a lot of unnecessary lookup.Solution
for_each_init
forpar_iter
which only be invoked when spawn a task for a group of items.Changelog
for_each_init
Performance
check_visibility
inmany_foxes
~40% performance gain in
check_visibility
.