Intern mesh vertex buffer layouts so that we don't have to compare them over and over. #12216
Although we cached hashes of `MeshVertexBufferLayout`, we were paying the cost of `PartialEq` on `InnerMeshVertexBufferLayout` for every entity, every frame. This patch changes that logic to place `MeshVertexBufferLayout`s in `Arc`s so that they can be compared and hashed by pointer. This results in a 28% speedup in the `queue_material_meshes` phase of `many_cubes`, with frustum culling disabled.

Additionally, this patch contains two minor changes:

1. This commit flattens the specialized mesh pipeline cache to one level of hash tables instead of two. This saves a hash lookup.
2. The example `many_cubes` has been given a `--no-frustum-culling` flag, to aid in benchmarking.
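To make the pointer-comparison idea concrete, here is a minimal sketch using only the standard library. `Layout`, `LayoutRef`, and `Layouts` are hypothetical stand-ins for `MeshVertexBufferLayout`, `MeshVertexBufferLayoutRef`, and `MeshVertexBufferLayouts`; the real Bevy types carry more data and may be implemented differently.

```rust
use std::collections::HashSet;
use std::hash::{Hash, Hasher};
use std::sync::Arc;

/// Hypothetical stand-in for `MeshVertexBufferLayout`.
#[derive(Debug, PartialEq, Eq, Hash)]
struct Layout {
    attribute_ids: Vec<u32>,
    stride: u64,
}

/// Handle that compares and hashes by pointer, never by contents.
#[derive(Clone, Debug)]
struct LayoutRef(Arc<Layout>);

impl PartialEq for LayoutRef {
    fn eq(&self, other: &Self) -> bool {
        // One pointer comparison instead of a field-by-field comparison.
        Arc::ptr_eq(&self.0, &other.0)
    }
}

impl Eq for LayoutRef {}

impl Hash for LayoutRef {
    fn hash<H: Hasher>(&self, state: &mut H) {
        // Hash the allocation address only.
        (Arc::as_ptr(&self.0) as usize).hash(state);
    }
}

/// Store of unique layouts. Entries are plain `Arc<Layout>`, so the set
/// itself hashes and compares structurally.
#[derive(Default)]
struct Layouts(HashSet<Arc<Layout>>);

impl Layouts {
    /// Returns the shared handle for `layout`, inserting it on first sight.
    fn intern(&mut self, layout: Layout) -> LayoutRef {
        if let Some(existing) = self.0.get(&layout) {
            return LayoutRef(Arc::clone(existing));
        }
        let shared = Arc::new(layout);
        self.0.insert(Arc::clone(&shared));
        LayoutRef(shared)
    }
}
```

Interning pays the structural comparison once, when a layout first enters the store. After that, two handles are equal exactly when they point at the same allocation, which is why per-entity comparisons become cheap.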
It looks like your PR is a breaking change, but you didn't provide a migration guide. Could you add some context on what users should update when this change gets released in a new version of Bevy?
I'm seeing a 17.3% improvement in `queue_material_meshes` in `many_cubes` with frustum culling turned on. This saves a bit of memory on every mesh with a duplicate layout, and the best thing is that this is largely transparent to the user. This looks good to me.
crates/bevy_render/src/mesh/mod.rs
pub struct MeshVertexBufferLayoutRef(pub Arc<MeshVertexBufferLayout>);

#[derive(Clone, Default, Resource)]
pub struct MeshVertexBufferLayouts(HashSet<MeshVertexBufferLayoutRef>);
Does this need to be pub? For all intents and purposes, this is an implementation detail.
It does, because otherwise `rustc` complains that it's being used in a public interface (specifically, `<Mesh as RenderAsset>::Param`).
crates/bevy_render/src/mesh/mod.rs
// the `MeshVertexBufferLayoutRef`, this compares the mesh vertex buffer
// layout structurally, not by pointer.
self.0
    .get_or_insert_with(&layout, |layout| {
I was momentarily concerned that `get_or_insert_with` was nightly-only, but we are using `hashbrown`, where it isn't. Still, something to note for a possible migration to the standard library's `HashSet`/`HashMap` at some point.
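If such a migration ever happened while the standard library method was still unstable, the same behavior could be approximated on stable Rust with a `get` followed by an `insert`. The sketch below is an illustration under that assumption, not code from the PR; the `intern` helper is hypothetical.

```rust
use std::collections::HashSet;
use std::hash::Hash;
use std::sync::Arc;

/// Returns the shared copy of `value`, inserting it on first sight.
/// On a miss this performs two hash lookups (`get`, then `insert`),
/// where a single `get_or_insert_with` call would perform one.
fn intern<T: Eq + Hash>(set: &mut HashSet<Arc<T>>, value: T) -> Arc<T> {
    // `Arc<T>: Borrow<T>`, so the set can be probed with a plain `&T`.
    if let Some(existing) = set.get(&value) {
        return Arc::clone(existing);
    }
    let shared = Arc::new(value);
    set.insert(Arc::clone(&shared));
    shared
}
```

The extra lookup only occurs the first time a given value is seen, so for a small number of distinct layouts the cost should be negligible.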
Just a small request to add a comment about not adding Deref and why.
Also CI failed:
You can run
CI failures look real, could you check them out and then ping us once you have a fix?
table. The trick with `Borrow` was too clever and ended up actually hashing the pointer, causing lookups to sometimes spuriously fail. This manifested itself as a crash in `alien_cake_addict`. Removing the `MeshVertexBufferLayoutRef` wrapper from the entries in the store prevents its implementation of `Hash` from being called, solving the problem.
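One plausible reading of this failure, sketched with hypothetical stand-in types rather than the actual Bevy code: a hash set locates an entry by hashing the probe, so the hash of a stored entry and the hash of the value used to look it up must agree. A wrapper that hashes its pointer cannot agree with a probe that hashes its contents. Storing plain `Arc<Layout>` entries avoids the mismatch, because `Arc<T>` hashes exactly like `T`.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashSet;
use std::hash::{Hash, Hasher};
use std::sync::Arc;

/// Hypothetical stand-in for the layout type; hashed by contents.
#[derive(PartialEq, Eq, Hash)]
struct Layout(Vec<u32>);

/// Hypothetical handle given to callers; hashed by pointer.
struct LayoutRef(Arc<Layout>);

impl Hash for LayoutRef {
    fn hash<H: Hasher>(&self, state: &mut H) {
        (Arc::as_ptr(&self.0) as usize).hash(state);
    }
}

/// Hashes any value with a fixed-key hasher, for inspection.
fn hash_of<T: Hash>(value: &T) -> u64 {
    let mut hasher = DefaultHasher::new();
    value.hash(&mut hasher);
    hasher.finish()
}

/// Returns (pointer hash, structural hash) for one handle. If the store
/// kept `LayoutRef` entries, it would file them under the first value
/// while structural probes searched under the second.
fn pointer_and_structural_hash(handle: &LayoutRef) -> (u64, u64) {
    (hash_of(handle), hash_of(&*handle.0))
}

/// The fix in miniature: entries are bare `Arc<Layout>`, so the wrapper's
/// `Hash` impl never runs inside the store. The wrapper is applied only
/// to the value handed back to the caller.
fn lookup(store: &HashSet<Arc<Layout>>, probe: &Layout) -> Option<LayoutRef> {
    store.get(probe).map(|entry| LayoutRef(Arc::clone(entry)))
}
```

With this arrangement, a structurally equal probe always reaches the stored entry, and pointer hashing is confined to the handles used outside the store.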
Should be fixed now.
…em over and over. (bevyengine#12216)

Although we cached hashes of `MeshVertexBufferLayout`, we were paying the cost of `PartialEq` on `InnerMeshVertexBufferLayout` for every entity, every frame. This patch changes that logic to place `MeshVertexBufferLayout`s in `Arc`s so that they can be compared and hashed by pointer. This results in a 28% speedup in the `queue_material_meshes` phase of `many_cubes`, with frustum culling disabled.

Additionally, this patch contains two minor changes:

1. This commit flattens the specialized mesh pipeline cache to one level of hash tables instead of two. This saves a hash lookup.
2. The example `many_cubes` has been given a `--no-frustum-culling` flag, to aid in benchmarking.

See the Tracy profile:

<img width="1064" alt="Screenshot 2024-02-29 144406" src="https://github.com/bevyengine/bevy/assets/157897/18632f1d-1fdd-4ac7-90ed-2d10306b2a1e">

## Migration guide

* Duplicate `MeshVertexBufferLayout`s are now combined into a single object, `MeshVertexBufferLayoutRef`, which contains an atomically-reference-counted pointer to the layout. Code that was using `MeshVertexBufferLayout` may need to be updated to use `MeshVertexBufferLayoutRef` instead.
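The cache flattening mentioned in the commit message above can be illustrated with a small sketch. All names here (`LayoutId`, `PipelineKey`, `PipelineId`, `NestedCache`, `FlatCache`) are hypothetical stand-ins, and the actual key and value types in Bevy's specialized mesh pipeline cache are assumed rather than quoted.

```rust
use std::collections::HashMap;

// Hypothetical stand-ins: a cheap-to-hash layout handle, a pipeline
// specialization key, and a cached pipeline identifier.
type LayoutId = usize;
type PipelineKey = u64;
type PipelineId = u32;

/// Two levels: one hash lookup for the layout, a second for the key.
#[derive(Default)]
struct NestedCache(HashMap<LayoutId, HashMap<PipelineKey, PipelineId>>);

impl NestedCache {
    fn specialize(
        &mut self,
        layout: LayoutId,
        key: PipelineKey,
        build: impl FnOnce() -> PipelineId,
    ) -> PipelineId {
        *self
            .0
            .entry(layout)
            .or_default()
            .entry(key)
            .or_insert_with(build)
    }
}

/// One level: the layout and key form a single composite map key, so
/// each query costs one hash lookup.
#[derive(Default)]
struct FlatCache(HashMap<(LayoutId, PipelineKey), PipelineId>);

impl FlatCache {
    fn specialize(
        &mut self,
        layout: LayoutId,
        key: PipelineKey,
        build: impl FnOnce() -> PipelineId,
    ) -> PipelineId {
        *self.0.entry((layout, key)).or_insert_with(build)
    }
}
```

Both caches return the same results and only call `build` on a miss. The flat version trades the inner map for a tuple key, which is cheap to hash when the layout handle hashes as a pointer.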
[12216](bevyengine/bevy#12216) introduced an argument `&mut MeshVertexBufferLayouts` to `get_mesh_vertex_buffer_layout`, which bevy_ecs_tilemap calls in `RenderChunk2d::prepare`
* Update to 0.14.0-rc.2

* [12997](bevyengine/bevy#12997): rename `multi-threaded` to `multi_threaded`

* RenderAssets<Image> is now RenderAssets<GpuImage>

  Implemented in [12827](bevyengine/bevy#12827)

* FloatOrd is now in bevy_math

  Implemented in [12732](bevyengine/bevy#12732)

* convert Transparent2d::dynamic_offset to extra_index

  [12889](bevyengine/bevy#12889) Gpu Frustum Culling removed the `dynamic_offset` of `Transparent2d`, and it became `extra_index` with the special value `PhaseItemExtraIndex::NONE`, which indicates the `None` that was here previously.

* RenderPhase<Transparent2d> -> ViewSortedRenderPhases<Transparent2d>

  [12453](https://github.com/StarArawn/bevy_ecs_tilemap/pull/bevyengine/bevy#12453): Render phases are now binned or sorted. Following the changes in the `mesh2d_manual` [example](https://github.com/bevyengine/bevy/blob/ecdd1624f302c5f71aaed95b0984cbbecf8880b7/examples/2d/mesh2d_manual.rs#L357-L358): use the `ViewSortedRenderPhases` resource.

* get_sub_app_mut is now an Option

  In [9202](https://github.com/StarArawn/bevy_ecs_tilemap/pull/bevyengine/bevy/pull/9202) SubApp access has changed.

* GpuImage::size f32 -> u32 via UVec2

  [11698](bevyengine/bevy#11698) changed `GpuImage::size` to `UVec2`. Right above this, `Extent3d` does the same thing, so I'm taking a small leap and assuming we can cast with `as`.

* GpuMesh::primitive_topology -> key_bits/BaseMeshPipeline

  In [12791](bevyengine/bevy#12791) the `primitive_topology` field on `GpuMesh` was removed in favor of `key_bits`, which can be constructed using `BaseMeshPipeline::from_primitive_topology`.

* RenderChunk2d::prepare requires &mut MeshVertexBufferLayouts now

  [12216](bevyengine/bevy#12216) introduced an argument `&mut MeshVertexBufferLayouts` to `get_mesh_vertex_buffer_layout`, which bevy_ecs_tilemap calls in `RenderChunk2d::prepare`.

* into_linear_f32 -> color.0.linear().to_f32_array()

  In [12163](bevyengine/bevy#12163) bevy_color was created and Color handling has changed. Specifically, `Color::as_linear_rgba_f32` has been removed. `LinearRgba` is now its own type that can be accessed via [`linear()`](https://docs.rs/bevy/0.14.0-rc.2/bevy/color/enum.Color.html#method.linear) and then converted.

* Must specify type of VisibleEntities when accessing

  [12582](bevyengine/bevy#12582) divided `VisibleEntities` into separate lists, so now we have to specify which kind of entity we want. I think we want the Mesh here, and I think we can get rid of the `.index` calls on Entity since Entity [already compares bits](https://docs.rs/bevy_ecs/0.14.0-rc.2/src/bevy_ecs/entity/mod.rs.html#173) for optimized codegen purposes. Waiting to do that until the other changes are in, though, so as to not change functionality until post-upgrade.

* app.world access is functions now

  - [9202](bevyengine/bevy#9202) changed world access to functions. [relevant line](https://github.com/bevyengine/bevy/pull/9202/files#diff-b2fba3a0c86e496085ce7f0e3f1de5960cb754c7d215ed0f087aa556e529f97fR640)
  - This also surfaced [12655](bevyengine/bevy#12655), which removed `Into<AssetId<T>>` for `Handle<T>`. Using a reference or `.id()` is the solution here.

* We don't need `World::cell`, and it doesn't exist anymore

  In [12551](bevyengine/bevy#12551) `WorldCell` was removed, but it turns out we don't need it or its replacement anyway.

* examples error out unless this bevy bug is addressed with these features being added

  bevyengine/bevy#13728

* check_visibility is required for the entity that is renderable

  As a result of [12582](bevyengine/bevy#12582), `check_visibility` must be implemented for the "renderable" tilemap entities. Doing this is trivial by taking advantage of the existing `check_visibility` type arguments, which accept a [`QF: QueryFilter + 'static`](https://docs.rs/bevy/0.14.0-rc.2/bevy/render/view/fn.check_visibility.html). The same `QueryFilter` is used when checking `VisibleEntities`.

  I've chosen `With<TilemapRenderSettings>` because presumably if the entity doesn't have a `TilemapRenderSettings` then it will not be rendering, but this could be as sophisticated or simple as we want. For example, `WithLight` is currently implemented as

  ```rust
  pub type WithLight = Or<(With<PointLight>, With<SpotLight>, With<DirectionalLight>)>;
  ```

* view.view_proj -> view.clip_from_world

  [13489](bevyengine/bevy#13489) introduced matrix naming changes, including `view_proj`, which becomes `clip_from_world`.

* color changes to make tests runnable

* clippy fix

* Update Cargo.toml

  Co-authored-by: Rob Parrett <[email protected]>

* Update Cargo.toml

  Co-authored-by: Rob Parrett <[email protected]>

* final clippy fixes

* Update Cargo.toml

  Co-authored-by: Rob Parrett <[email protected]>

* Simplify async loading in ldtk/tiled helpers

  See Bevy #12550

* remove second allow lint

* rc.3 bump

* bump version for major release

* remove unused features

---------

Co-authored-by: Rob Parrett <[email protected]>