Fix thread safety issues in UsdSkel_SkelDefinition. #2369

cameronwhite · 2023-03-30T14:47:42Z

Description of Change(s)

In methods like _ComputeJointWorldInverseBindTransforms(), check the compute flag again after acquiring the lock, so that the result is not recomputed when multiple threads were waiting on the mutex. Although the computed values would not change, it is not safe to call mutable member functions of the VtArray (which can cause a copy-on-write detach) while other threads may be in the middle of making a copy of it. An example sequence that could lead to a crash:
- Thread A finishes the critical section, and continues on to make a copy of the array in _GetJointWorldInverseBindTransforms() (refcount == 2)
- Thread B was waiting on the mutex and starts to redo the computation. This calls non-const member functions and enters the body of _DetachIfNotUnique() since _IsUnique() is false
- Thread C enters _GetJointWorldInverseBindTransforms() and observes that the compute flag is set, and starts to make its own copy of the array, but hasn't bumped the refcount yet.
- Thread A finishes with its copy and decrements the refcount (refcount == 1)
- Thread B decrements the refcount, taking it to zero and destroying the data, before switching to its new copy of the data
- Thread C now has an array with a reference to the deleted data block
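
The re-check described above can be sketched roughly as follows. This is a minimal illustration, not the UsdSkel code: `CachedTransforms`, `ComputeLocked()` and `RunConcurrentReaders()` are hypothetical names, and a plain `std::vector` stands in for the VtArray (so the copy-on-write hazard itself is not reproduced, only the "compute at most once" guard).

```cpp
#include <atomic>
#include <cassert>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical stand-in for a class that lazily caches an array.
class CachedTransforms {
public:
    // Returns a copy of the cached values, computing them on first use.
    std::vector<double> Get() {
        // Fast path: skip the lock once the flag has been published.
        if (!computed_.load(std::memory_order_acquire)) {
            ComputeLocked();
        }
        return values_;
    }

    int ComputeCount() const { return computeCount_.load(); }

private:
    void ComputeLocked() {
        std::lock_guard<std::mutex> lock(mutex_);
        // Re-check under the lock: another thread may have finished the
        // computation while this one was waiting on the mutex. Without
        // this check, the array would be mutated again while readers on
        // the fast path may already be copying it.
        if (computed_.load(std::memory_order_relaxed)) {
            return;
        }
        values_.assign(4, 1.0);  // placeholder for the real computation
        computeCount_.fetch_add(1);
        // Publish the result only after the array is fully written.
        computed_.store(true, std::memory_order_release);
    }

    std::mutex mutex_;
    std::atomic<bool> computed_{false};
    std::atomic<int> computeCount_{0};
    std::vector<double> values_;
};

// Starts several readers at once and reports how many times the
// computation actually ran.
int RunConcurrentReaders(int numThreads) {
    CachedTransforms cache;
    std::vector<std::thread> threads;
    for (int i = 0; i < numThreads; ++i) {
        threads.emplace_back([&cache] {
            std::vector<double> copy = cache.Get();
            assert(copy.size() == 4);
        });
    }
    for (std::thread& t : threads) {
        t.join();
    }
    return cache.ComputeCount();
}
```

With the re-check in place, `RunConcurrentReaders(n)` should report a single computation regardless of how many threads raced to the mutex.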
Prefer using operator|= to atomically set the flag rather than doing a read -> bitwise OR -> atomic store sequence, which could cause flags to be lost if there are concurrent writes. Currently the writes are all guarded by the same mutex, so the previous approach was not actually problematic, but this is safer if, e.g., separate locks are introduced for each cached array in the future.
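
For illustration, the difference between the two ways of setting a flag might look like this with `std::atomic<int>`. The flag names and helper functions are hypothetical and are not taken from the UsdSkel sources.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical flag bits, one per cached array.
enum HypotheticalFlags : int {
    kHypotheticalFlagA = 1 << 0,
    kHypotheticalFlagB = 1 << 1,
};

// Read -> bitwise OR -> atomic store. Each step is atomic, but the
// sequence is not: a concurrent writer that sets another bit between the
// load and the store has its bit overwritten. This is only safe while
// all writers hold the same lock.
inline void SetFlagReadModifyStore(std::atomic<int>& flags, int flag) {
    assert(flag != 0);
    flags.store(flags.load() | flag);
}

// A single atomic read-modify-write (equivalent to fetch_or), so
// concurrent writers cannot lose each other's bits.
inline void SetFlagAtomic(std::atomic<int>& flags, int flag) {
    assert(flag != 0);
    flags |= flag;
}
```

Both helpers give the same result when called from one thread or under a shared mutex; they only differ once writers are no longer serialized by the same lock.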

This isn't the easiest to reproduce (e.g. Storm will not encounter this, since it computes the skinning transforms during the single-threaded sprim sync), but I had a customer file that reproduced this very reliably in the Houdini GL delegate on a 32-core machine, and I verified that this fix resolved the issue.

Fixes Issue(s)

UsdSkel _GetJointWorldInverseBindTransforms has a race, and may compute the transforms multiple times #1742

I have verified that all unit tests pass with the proposed changes

I have submitted a signed Contributor License Agreement

- In methods like _ComputeJointWorldInverseBindTransforms(), check the compute flag again after acquiring the lock to avoid potentially recomputing the result again if multiple threads were waiting on the mutex. Although the computed result would not change, it is not safe to call mutable member functions of the VtArray (which can cause a copy-on-write detach) while other threads may be in the middle of making a copy of it.
- Prefer using operator|= to atomically set the flag rather than doing a read -> bitwise OR -> atomic store sequence which could cause flags to be lost if there are concurrent writes. Currently the writes are all guarded by the same mutex so the previous approach was not problematic, but the new approach is safer if e.g. in the future there are separate locks for each cached array.

Bug: PixarAnimationStudios#1742

sunyab · 2023-03-31T17:32:44Z

Filed as internal issue #USD-8176

Fix thread safety issues in UsdSkel_SkelDefinition. (Internal change: 2282798)

pixar-oss merged commit ab33461 into PixarAnimationStudios:dev Jul 3, 2023

pixar-oss added a commit that referenced this pull request Jul 3, 2023

Merge pull request #2369 from cameronwhite/dev_skeldefinition_fix

9f7dfcf

Fix thread safety issues in UsdSkel_SkelDefinition. (Internal change: 2282798)
