Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve global bone pose calculation in Skeleton3D #97538

Merged
merged 1 commit into from
Oct 14, 2024

Conversation

detomon
Copy link
Contributor

@detomon detomon commented Sep 27, 2024

This change introduces a way to calculate global bone poses without recalculating all bones every time a pose changes. It defines the bone hierarchy as a nested set. This makes it possible to represent bone subtrees continuously within an array. With regard to the introduction of SkeletonModifier3D, this should improve performance when using many modifiers that depend on global poses.

Two fields are added to Skeleton3D::Bone:

  • nested_set_offset: defines the position within the nested set1
  • nested_set_span: defines the size of the subtree (the bone itself and all descendants)

This shows the nested set values for a simplified skeleton:

nested_set

How it works

Skeleton3D::bone_global_pose_dirty is an array containing a flag for each bone indicating whether the global pose needs to be recalculated. It can be indexed with Bone::nested_set_offset, followed by the descendant bones.

When changing the pose of a bone, the dirty flags of its subtree need to be set to indicate that the global poses of the bone and its descendants need to be recalculated. This is done by setting all flags in the range from Bone::nested_set_offset with length Bone::nested_set_span to true. This has some overhead2. But it can be skipped if the bone is already dirty at index Bone::nested_set_offset. Because if a global pose of a bone is dirty, all of its descendants are dirty too.

Getting the global pose with get_bone_global_pose

When requesting the global pose of a bone, only the global poses of its parents that are dirty are recalculated and the dirty flag is cleared. In the best case (if the parent global poses have already been calculated) only a few global poses need to be recalculated.

Recalculating global poses with force_update_bone_children_transforms

By looping through Skeleton3D::nested_set_offset_to_bone_index it is possible to update the bone hierarchy without the need for recursion. By checking the dirty flag at the current index, only required bones are updated. The order also ensures, that parent bone poses are calculated before child bone poses.

Benchmarks

skeleton_3d.gd.zip

Test Before After
get_global_poses 12.67ms 12.61ms
set_all_poses 49.83ms 51.98ms
set_all_poses_reverse 87.56ms 89.03ms
set_and_get_global_poses_forward 747.6ms 27.8ms
set_and_get_global_poses_reverse 788.4ms 57.37ms
set_and_get_some_global_poses 29.64ms 15.22ms

Note: I ran the tests individually. Some values were way too high even on master when run in series.

TODO

  • Improve performance of get_bone_global_pose
  • Improve performance of force_update_bone_children_transforms
  • Provide some performance tests
  • Thread safety?

Footnotes

  1. For imported scenes, nested_set_offset seems to be the same as the bone index. Presumably bones are already ordered this way. However, this may be different for manually created skeletons.

  2. It could be implemented as some kind of bit field structure instead of an array, if it turns out that performance is a problem at that point.

@Chaosus Chaosus added this to the 4.x milestone Sep 27, 2024
@fire fire requested review from a team, SaracenOne, lyuma and TokageItLab September 27, 2024 21:42
@Nazarwadim
Copy link
Contributor

Tested on Animation_test.zip.

  • Master: 70 fps
  • This PR: 75 fps

As for thread safe (as I understand it you are talking about using the thread_local keyword) it is multi-thread safe. Although I do not think that this can cause bugs, since the multi-threaded scene tree is not implemented.

@detomon detomon force-pushed the optimize-skeleton3d-global-pose branch from c86266b to bd70bc1 Compare September 30, 2024 18:06
@detomon
Copy link
Contributor Author

detomon commented Sep 30, 2024

As for thread safe (as I understand it you are talking about using the thread_local keyword) it is multi-thread safe. Although I do not think that this can cause bugs, since the multi-threaded scene tree is not implemented.

I was wondering if there is a problem now that the logic is more complex. But if the scene tree is not multi-threaded then that's ok I guess.

@fire
Copy link
Member

fire commented Sep 30, 2024

As far as I know multi-threaded scene tree is implemented. I'm unable to get you the details, but it was done in an earlier release disabled by default.

@Saul2022
Copy link

Saul2022 commented Oct 3, 2024

As far as I know multi-threaded scene tree is implemented.

Right, the implementation is here tho, it would be cool in look in this for skeleton, anim player/tree multithreading improvements. #75901

@fire
Copy link
Member

fire commented Oct 3, 2024

Is this ready for review? I noticed it is still in draft stage but all the items are checked.

@detomon detomon marked this pull request as ready for review October 3, 2024 18:38
@detomon detomon requested a review from a team as a code owner October 3, 2024 18:38
@detomon
Copy link
Contributor Author

detomon commented Oct 3, 2024

Is this ready for review? I noticed it is still in draft stage but all the items are checked.

Yes, I forgot.

Copy link
Member

@TokageItLab TokageItLab left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have tested PhysicalBone (#96270), Retarget (#97824) and some IK, it looks good as it is working fine.

I appreciate that you send this PR as this was one of the task of our todo list, as I commented in #92931 (comment), thanks.

@TokageItLab TokageItLab modified the milestones: 4.x, 4.4 Oct 7, 2024
@detomon detomon force-pushed the optimize-skeleton3d-global-pose branch from bd70bc1 to 9ab8488 Compare October 12, 2024 09:43
@detomon
Copy link
Contributor Author

detomon commented Oct 12, 2024

Fixed conflicts.

@Repiteo Repiteo merged commit 47d45f1 into godotengine:master Oct 14, 2024
20 checks passed
@Repiteo
Copy link
Contributor

Repiteo commented Oct 14, 2024

Thanks!

@fire
Copy link
Member

fire commented Oct 23, 2024

@detomon Do you think this can be done to the general 3d node caching system too?

@detomon
Copy link
Contributor Author

detomon commented Oct 25, 2024

@fire I'm not too familiar with how the Node3D hierarchy works. I will look into that.

@fire
Copy link
Member

fire commented Oct 25, 2024

https://github.com/godotengine/godot/blob/master/scene/3d/node_3d.cpp#L38

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
Status: Done
Development

Successfully merging this pull request may close these issues.

7 participants