
Free command encoders' platform resources on drop on _all_ platforms #5251

Merged

Conversation

@ErichDonGubler (Member) commented Feb 14, 2024

Connections

This is my attempt to investigate and resolve #3193. I believe this code is ready to go, but have not yet confirmed internal details to my satisfaction. I'm leaving this as WIP for now because I will be unavailable to work on finalizing details until ~2024-02-21. If somebody wants to pick it up before then, great!

Description

I believe that #3193 is caused by not freeing command encoders early enough. This is apparent on the DX12 backend, where the DX12 API actually emits an error to inform us of our mistake. It is likely causing issues in other backends (e.g., Vulkan) where the symptoms aren't as obvious. Metal was already taken care of by @bradwerth in #2077, so only the other backends need attention here.

Interestingly, we currently do no clean-up of command encoders when they are dropped, though we do when they are finished. This seems wrong. So, this change adds a CommandEncoder::discard_encoding call for command encoders within wgpu_core::Global::command_encoder_drop. It also adds a similar call to the Drop implementation of backends whose CommandEncoder::discard_encoding implementations appear to be idempotent (N.B., this does not include Vulkan). This appears to resolve the DX12 API errors.
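In spirit, the drop path described above can be sketched as follows. This is a minimal illustrative model, not wgpu's actual implementation: the CommandEncoder type and its methods below are invented stand-ins that only mirror the names discussed in this PR, with the recording resource modeled as a plain Vec.

```rust
// Minimal sketch (not wgpu's actual code) of the idempotent
// "discard on drop" pattern: dropping an encoder that was never
// finished still releases its backing resources, and
// discard_encoding is safe to call more than once.

struct CommandEncoder {
    // Stand-in for the backend resources held by an open encoder.
    recording: Option<Vec<u8>>,
}

impl CommandEncoder {
    fn new() -> Self {
        Self { recording: Some(Vec::new()) }
    }

    /// Idempotent: releases the in-progress recording, if any.
    fn discard_encoding(&mut self) {
        // `take()` leaves `None` behind, so a second call is a no-op.
        if let Some(recording) = self.recording.take() {
            drop(recording); // "platform resource" released here
        }
    }

    /// Finishing also consumes the recording, so Drop then has nothing to free.
    fn finish(&mut self) -> Option<Vec<u8>> {
        self.recording.take()
    }
}

impl Drop for CommandEncoder {
    fn drop(&mut self) {
        // The fix in spirit: free platform resources even when
        // `finish()` was never called.
        self.discard_encoding();
    }
}

fn main() {
    let mut enc = CommandEncoder::new();
    enc.discard_encoding();
    enc.discard_encoding(); // second call is harmless (idempotent)
    assert!(enc.finish().is_none());
}
```

Because discard_encoding is a no-op once the recording is gone, it is safe both for an explicit call from command_encoder_drop and for the backend's own Drop implementation to run it.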

To be completely clear about how each backend is changed here:

| Backend | Status before this PR | Status after this PR |
| --- | --- | --- |
| Vulkan | Possibly has issues | ✅ Fixed. |
| DX12 | Reporting hard errors | ✅ No more errors! |
| Metal | Already handled by #2077. | - |
| GLES | Possibly has issues | ✅ Fixed. |

Testing

A test (wgpu_test::buffer::command_encoder_without_finish_fine) has been added; it demonstrates the expected failure before this fix and simply passes afterward.

On DX12, this change converts numerous test cases that previously expected errors (most notably wgpu_tests::encoder::drop_encoder_after_error) into plain passes.
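The behavior the new test guards against can be sketched outside of wgpu as well. The following is a hypothetical, self-contained model (the Encoder and Resource types are invented for illustration and are not wgpu APIs): dropping an encoder without ever calling finish() must still release its backend resource, which we observe here via an Rc reference count.

```rust
use std::rc::Rc;

// Stand-in for a backend resource (e.g., a native command allocator).
struct Resource;

// Toy encoder holding a share of the resource while recording is open.
struct Encoder {
    resource: Option<Rc<Resource>>,
}

impl Drop for Encoder {
    fn drop(&mut self) {
        // Discard-on-drop, per this PR: release the resource even
        // though finish() was never called.
        self.resource.take();
    }
}

fn main() {
    let tracker = Rc::new(Resource);
    let enc = Encoder { resource: Some(Rc::clone(&tracker)) };
    assert_eq!(Rc::strong_count(&tracker), 2); // encoder holds a share

    drop(enc); // finish() was never called

    // After the fix, dropping the encoder releases the backend resource.
    assert_eq!(Rc::strong_count(&tracker), 1);
}
```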

Checklist

  • Run cargo fmt.
  • Run cargo clippy. If applicable, add:
    • --target wasm32-unknown-unknown
    • --target wasm32-unknown-emscripten
  • Run cargo xtask test to run tests.
  • Add change to CHANGELOG.md. See simple instructions inside file.

@ErichDonGubler ErichDonGubler force-pushed the discard-cmd-encdr-on-drop branch 3 times, most recently from c9a7cbb to 4515fcb Compare February 14, 2024 19:25
@ErichDonGubler ErichDonGubler added the area: correctness We're behaving incorrectly label Feb 14, 2024
@ErichDonGubler ErichDonGubler changed the title WIP: fix crashes on DX12 when CommandEncoder is created but finish isn't called WIP: Free command encoders' platform resources on drop on _all_ platforms Feb 14, 2024
@ErichDonGubler ErichDonGubler changed the title WIP: Free command encoders' platform resources on drop on _all_ platforms Free command encoders' platform resources on drop on _all_ platforms Feb 14, 2024
@ErichDonGubler ErichDonGubler marked this pull request as ready for review February 14, 2024 19:54
@ErichDonGubler ErichDonGubler requested a review from a team as a code owner February 14, 2024 19:54
@ErichDonGubler ErichDonGubler enabled auto-merge (rebase) February 15, 2024 15:44
@ErichDonGubler ErichDonGubler merged commit 18b7904 into gfx-rs:trunk Feb 15, 2024
27 checks passed
@ErichDonGubler ErichDonGubler deleted the discard-cmd-encdr-on-drop branch February 15, 2024 15:46
@torokati44 (Contributor) commented Feb 26, 2024

anxiously waiting for a release with this fix in, preferably in the 0.19 series
https://github.com/ruffle-rs/ruffle/actions/runs/8046091966/job/21972653869

@ErichDonGubler (Member, Author)
@torokati44: Have you confirmed that this fix addresses an issue in Ruffle? It's unclear whether the CI issue you're linking to is caused by this fix or fixed by it. 😅

@torokati44 (Contributor)
Since Ruffle is on v0.19.1, which was released before this got merged, the CI issue is definitely not caused by this PR, but hopefully it is fixed by it.
And no, I haven't verified that yet, but I'd give it a better than 60% chance, just on gut feeling. 😶

@cwfitzgerald (Member)
Agh - I honestly thought this PR was caught up in all the other internal work; I didn't realize it was so backportable. There will be a 0.19.3 decently soon, so I'll include this in it.

@cwfitzgerald cwfitzgerald removed the PR: needs back-porting PR with a fix that needs to land on crates label Mar 1, 2024
Successfully merging this pull request may close these issues.

Early frees on CPU Implementations