Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Optimize sequential draws of the same pipeline #2539

Open
wants to merge 9 commits into
base: master
Choose a base branch
from

Conversation

Firestar99
Copy link
Contributor

@Firestar99 Firestar99 commented Jun 27, 2024

  1. Update documentation to reflect any user-facing changes - in this repository.

  2. Make sure that the changes are covered by unit-tests.

  3. Run cargo clippy on the changes.

  4. Run cargo +nightly fmt on the changes.

  5. Please put changelog entries in the description of this Pull Request
    if knowledge of this change could be valuable to users. No need to put the
    entries to the changelog directly, they will be transferred to the changelog
    file by maintainers right after the Pull Request merge.

    Please remove any items from the template below that are not applicable.

  6. Describe in common words what is the purpose of this change, related
    Github Issues, and highlight important implementation aspects.

Profiling based on my meshlet demo on sponza. Before at ~40fps and fully CPU limited:
image

With this PR I get ~550fps and am most likely CPU GPU sync limited, due not having a working frame in flight system.
image

And the best part is: It's absolutely free and still safe!*
*(if I didn't mess up)

Current master evaluates all accessed resources (buffers, images) if you call any draw or dispatch immediately. My change defers the evaluation specifically of descriptor sets for their resources to the end of the cmd buffer recording. This allows me to deduplicate draws and dispatches using the same pipeline, merging their resources and descriptor sets together, and then deduplicate the descriptor sets again before evaluating each unique one for their actual resources. You may even be able to merge more than that, but I'd rather be on the conservative side with different pipelines.

Changelog:

### Additions
- Optimized performance of back to back draws/dispatches using the same pipeline significantly

@Firestar99
Copy link
Contributor Author

Firestar99 commented Jun 27, 2024

I'm seeing some weird performance behavior in bistro I would like to investigate first before officially submitting this Resolved: attached RenderDoc hates super large cmd buffers of 3000+ draws.

@Firestar99 Firestar99 changed the title Optimize sequential draws Optimize sequential draws of the same pipeline Jun 27, 2024
Some(x) => x,
None => return,
};
.cloned()
Copy link
Contributor Author

@Firestar99 Firestar99 Jun 28, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another theoretical performance optimization that could be done: remove this clone. It clones the entire DescriptorSetState, which contains a HashMap, and if push descriptors are used each one of those contains another HashMap. Instead it could be some Cow<Arc<_>> so that if state does not change between draws it isn't cloned again.
But practically it's insignificant, it's done once for every draw/dispatch and 103 draws take just 0.12ms. There's other more significant bottlenecks.

@Firestar99
Copy link
Contributor Author

Firestar99 commented Jun 28, 2024

I'm uncertain how secondary cmd buffers are handled. I could imagine a usecase, when they contain many draws with always changing pipelines (which is generally frowned upon anyways) and with very few buffers used, that could maybe be slower with this PR. But I dunno, that may need some testing.

Result: secondary cmd buffers have also improved significantly, 40fps to 120fps (meshlet on sponza), but it still is significantly better to just record everything into the primary cmd buffer, where I could reach 550fps.

@Firestar99 Firestar99 marked this pull request as ready for review June 29, 2024 10:12
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant