wgpu-core
submission tracking fails to avoid resource management overhead
#5560
Labels
area: performance
How fast things go
kind: refactor
Making existing function faster or nicer
type: enhancement
New feature or request
wgpu-core
's submission tracking goes to great pains to avoid refcount traffic for resources used by commands in flight, but those efforts are ineffective, and the traffic occurs anyway. Since the complexity fails to deliver, it should be removed.In more detail:
Our backend APIs require that resources (buffers, textures, etc.) used by commands in flight not be freed until the commands have finished execution. (Quite reasonable.)
One of the striking things about
wgpu_core::device::life::ActiveSubmission
is that it does not hold strong references to the resources used by the commands it represents. Rather, theActiveSubmission::last_resources
set is initially empty (ignoreDevice::pending_writes
for now), and resources are only added to it if they are destroyed while their most recent queue submission is still in flight, as a way of deferring their actual destruction until after the command has finished.As far as I understand, it's done this way because it's rare in practice for applications to free resources while they are still in use by the GPU, and thus any time spent fiddling with ownership on each queue submission is almost always wasted. With the approach above, in the common case, most
ActiveSubmission::last_resources
sets remain empty throughout their lifetime, and no resources need to be touched on completion.However, this strategy is ineffective.
Compare this approach to a simpler one where
ActiveSubmission
simply holds strong references to the resources its commands use. Queue submission would move strong references fromCommandBuffer
s andPendingWrites
into the newActiveSubmission
. When execution is complete, those strong references would be dropped, and any unreferenced resources would be freed.First, note that, in the best case where nothing is actually freed, the only difference between the two approaches is the amount of refcount traffic at command completion. Dropping strong references to resources that are still alive is a pointer dereference and an atomic decrement.
Second, note that
wgpu_core::command::CommandBuffer
holds strong references to the resources its commands use. On queue submission, these strong references are packed into aBakedCommands
value, which is then dropped. This incurs exactly the same pointer dereference and refcount traffic, on every resource directly used by the command buffer, that would occur at execution completion ifActiveSubmission
held strong references to all resources. The only effect of the current approach is to move the pointer dereferences and refcount fiddling from queue draining time to submission time.Moving the traffic from later to earlier doesn't seem worth the complexity required to support it.
Instead, queue submission should move the owning references from the command buffers and pending writes into the
ActiveSubmission
, along with the encoders and command buffers. This will move the refcount traffic from queue submission time to queue draining time. We can then remove the delicate and complex code that populatesActiveSubmission::last_resources
.LifetimeTracker::triage_suspected
and everything that supports it could go away, replaced by idiomatic use ofArc
.cc @kvark
The text was updated successfully, but these errors were encountered: