Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[MP] Various performance optimizations #82777

Merged
merged 4 commits into from
Oct 10, 2023
Merged

Conversation

Faless
Copy link
Collaborator

@Faless Faless commented Oct 4, 2023

As pointed out in #81593 , watched/on change variables are quite heavy on the CPU side, mostly due to the way network IDs are exchanged and stored for non-spawned node.

This PR introduce a series of optimizations to reduce the computational cost of the replication system:

  • Optimize the data structures used by the network IDs cache (using ObjectIDs locally, while still transferring relative paths).
  • Optimize internal authority checks to avoid going through the SceneTree (searching the appropriate MultiplayerAPI as we already know the correct one)
  • Avoid unnecessary ref/unrefs between the various internal components (Replication/Cache/RPC)
  • Optimize internal SceneReplicationConfig access (using direct pointers instead of refs)

flamegraph

Closes #81593 , might need some more testing.

@akien-mga
Copy link
Member

CC @DennisManaa

@DennisManaa
Copy link
Contributor

Many thanks for addressing this issue :) This PR looks great 🎉

Unfortunately, I won't be able to look into this topic until next Tuesday/Wednesday because I'm currently on vacation. But it will be the first thing I'll do when I'm home again :)

@DennisManaa
Copy link
Contributor

DennisManaa commented Oct 10, 2023

@Faless, I just wanted to let you know that I can confirm the performance benefits of your changes. Additionally, I didn't notice any further problems while testing them with our game.

I measured the cycles for the last commit in this PR and for the last commit before the changes in this PR (commit 0ca8542). It seems that you reduced the needed cycles for _send_delta to approximately 31% and the needed cycles for _send_sync to approximately 41% in our specific project. Many thanks for your great work! ❤️
(I think that others should probably experience similar performance-improvements, when using a lot of MultiplayerSynchronizers in their projects) :)

Here are the results of my measurements:

Function before after "after" divided by "before"
Main::iteration 1.714E+11 cycles 1.262E+11 cycles ≈ 0.74
SceneTree::process 1.385E+11 cycles 0.9372E+11 cycles ≈ 0.67
SceneMultiplayer::poll 0.8381E+11 cycles 0.3825E+11 cycles ≈ 0.46
SceneReplicationInterface::on_network_process 0.8064E+11 cycles 0.3475E+11 cycles ≈ 0.43
SceneReplicationInterface::_send_delta 0.4717E+11 cycles 0.1448E+11 cycles ≈ 0.31
SceneReplicationInterface::_send_sync 0.2401E+11 cycles 0.09872E+11 cycles ≈ 0.41

Before

grafik

After

grafik

Copy link
Member

@akien-mga akien-mga left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just did a code style review.

This was successfully tested, and the implementation changes seem sensible, so I think this can be merged.

Only use paths during network transfer.
Use ObjectID instead of NodePaths for storing the Node <-> NetID
relations locally.
We already know which MultiplayerAPI a certain Node uses, so we don't
need to retrieve it via SceneTree every time.
Access the various internal components (cache/replicator) via pointer,
to avoid unnecessary overhead.
Use direct pointer addressing to avoid unnecessary refs/unrefs
@akien-mga akien-mga modified the milestones: 4.x, 4.2 Oct 10, 2023
@akien-mga akien-mga merged commit 8ea8842 into godotengine:master Oct 10, 2023
15 checks passed
@akien-mga
Copy link
Member

Thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants