-
-
Notifications
You must be signed in to change notification settings - Fork 3.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Store only the IDs needed for Query iteration #12476
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay, I can see the motivation here. I don't think this is a meaningful regression in terms of readability, and saving memory here is a useful application.
To confirm my understanding:
- Whether or not a query uses dense or sparse iteration can be determined at constant time, based on whether or not both the data and filter are dense (aka don't contain any sparse-set components).
- We can reuse the same space to store the table (dense) and archetype (sparse) IDs, since we always care about exactly one of the two.
- We don't want to use an
enum
here, because the discriminant would take an extra byte, which we don't need since we can always just reconstruct this based on the constants already stored in ourQueryState
.
It's actually an extra 4 bytes due to padding, which isn't a huge deal, but the branch on something that should be an invariant is potentially worse in hot loops. It could be an enum and then use DebugCheckedUnwrap, but that might negatively impact readability, may add some optimization blockers to the compiler if widespread enough, and it would always negate any memory savings from this PR. |
Co-authored-by: Alice Cecile <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I read through this code recently to try to finally understand ECS internals, so i challenged myself to review this
} | ||
let table_index = archetype.table_id().as_usize(); | ||
if !self.matched_tables.contains(table_index) { | ||
self.matched_tables.grow_and_insert(table_index); | ||
self.matched_table_ids.push(archetype.table_id()); | ||
if D::IS_DENSE && F::IS_DENSE { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we still also need separate self.matched_tables
and self.matched_archetypes
bitsets?
It looks like only one of the two will be used, depending on the value of D::IS_DENSE
and F::IS_DENSE
Maybe it's still needed for things like join
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yep exactly, join needs them, and get_unchecked_manual needs specifically matched_archetypes (right now).
@@ -21,6 +21,17 @@ use super::{ | |||
QuerySingleError, ROQueryItem, | |||
}; | |||
|
|||
/// An ID for either a table or an archetype. Used for Query iteration. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't know if it would be useful to specify that this is used for optimizing query iteration; in the case where all components are in tables, we can iterate through table_ids directly instead of archetypes
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would also maybe add a comment on why this being a union
is ok here, as unions are rarer than enums. It's because we know which variant to use depending on QueryState::IS_DENSE
so we can skip storing the variant type?
@@ -650,8 +648,7 @@ impl<'w, 's, D: ReadOnlyQueryData, F: QueryFilter, const K: usize> FusedIterator | |||
} | |||
|
|||
struct QueryIterationCursor<'w, 's, D: QueryData, F: QueryFilter> { | |||
table_id_iter: std::slice::Iter<'s, TableId>, | |||
archetype_id_iter: std::slice::Iter<'s, ArchetypeId>, | |||
storage_id_iter: std::slice::Iter<'s, StorageId>, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does that mean that the table_entities
and archetype_entities
below could have a similar optiimization as storage_entities: &'w [StorageEntities]
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is what #5085 does, however the benefit may be limited as this struct doesn't really exist at runtime: it's never formally materialized to the stack or the heap under normal use cases and any inlined iteration will decompose the fetches and updates.
Compare this with the Vec in QueryState, which requires both stack and heap space due to being a persisted heap allocated backing for Query. There's real memory savings by using the union.
Doable but I'm not sure if the savings are worth the introduction of more unsafe and readability impact.
Objective
Other than the exposed functions for reading matched tables and archetypes, a
QueryState
does not actually need both internal Vecs for storing matched archetypes and tables. In practice, it will only use one of the two depending on if it uses dense or archetypal iteration.Same vein as #12474. The goal is to reduce the memory overhead of using queries, which Bevy itself, ecosystem plugins, and end users are already fairly liberally using.
Solution
Add
StorageId
, which is a union overTableId
andArchetypeId
, and store only one of the two at runtime. Read the slice as if it was one ID depending on whether the query is dense or not.This follows in the same vein as #5085; however, this one directly impacts heap memory usage at runtime, while #5085 primarily targeted transient pointers that might not actually exist at runtime.
Changelog
Changed:
QueryState::matched_tables
now returns an iterator instead of a reference to a slice.Changed:
QueryState::matched_archetypes
now returns an iterator instead of a reference to a slice.Migration Guide
QueryState::matched_tables
andQueryState::matched_archetypes
does not return a reference to a slice, but an iterator instead. You may need to use iterator combinators or collect them into a Vec to use it as a slice.