Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve processRewardsAndPenalties #6229

Closed
twoeths opened this issue Dec 22, 2023 · 5 comments
Closed

Improve processRewardsAndPenalties #6229

twoeths opened this issue Dec 22, 2023 · 5 comments
Assignees
Labels
meta-feature-request Issues to track feature requests. scope-performance Performance issue and ideas to improve performance.

Comments

@twoeths
Copy link
Contributor

twoeths commented Dec 22, 2023

Problem description

The processRewardsAndPenalties time is inconsistent, on most of test mainnet nodes it's all good

Screenshot 2023-12-22 at 14 27 48

However on a lot of other mainnet nodes it's so different

Screenshot 2023-12-22 at 14 29 13

Note that all of these nodes are with v1.13.0 and are all deployed at the same. Having a look at profile it seems the differences come from how much gc run mainly due to packedRootsBytesToLeafNodes function
Screenshot 2023-12-22 at 14 31 27

Solution description

  1. Reuse the whole balances nodes branch to avoid gc, this helps avoid recreating the whole balances branch per epoch transition

Update: not possible to do this as branch nodes are shared with other states, should not mutate them.

  1. Right now we serialize the whole balances before recreating the branch. We can bypass it by setting each balances[i] to LeafNodes[Math.floor(i/4)], h[(i % 4) * 2] and h[(i % 4) * 2 + 1] directly

Additional context

No response

@twoeths twoeths added the meta-feature-request Issues to track feature requests. label Dec 22, 2023
@twoeths
Copy link
Contributor Author

twoeths commented Dec 22, 2023

idea 1 is not possible for now because LeftNodes are shared across states, can revisit once we're done with n-historical state #5968 work
given a deleted check point state, can balances tree be reused for the next epoch transition?

@twoeths
Copy link
Contributor Author

twoeths commented Jan 4, 2024

epoch transition time is getting worse and may cause missed attestations at slot 0 as it could be up to 6s
Just take a look at gossipsub metric, bandwidth is in subscribe-all-subnets test mainnet node is even bigger than that in production mainnet node.

Screenshot 2024-01-04 at 11 20 05

vs our test node subscribing to all subnets

Screenshot 2024-01-04 at 11 23 23

The difference I see is in publish flow where production mainnet node has to serialize/prefix data while test mainnet node does not have to do that, this may cause gc to run more in processRewardsAndPenalties function (see the profile above)

we should prioritize gossipsub publish in batch issue which only require to serialize once per message (instead of per peer) ChainSafe/js-libp2p-gossipsub#344 cc @wemeetagain @philknows

@twoeths
Copy link
Contributor Author

twoeths commented Jan 8, 2024

in progress PR ChainSafe/js-libp2p-gossipsub#480

@twoeths
Copy link
Contributor Author

twoeths commented Jan 9, 2024

PR was merged, need to update gossipsub option when new gossipsub is released

@philknows philknows added the scope-performance Performance issue and ideas to improve performance. label Jan 11, 2024
@twoeths
Copy link
Contributor Author

twoeths commented Jul 12, 2024

processRewardsAndPenalties is great since we merge ssz v0.15.1 ChainSafe/ssz#352

this is what I see on a mainnet node as of Jul 2024

Screenshot 2024-07-12 at 16 09 58

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
meta-feature-request Issues to track feature requests. scope-performance Performance issue and ideas to improve performance.
Projects
None yet
Development

No branches or pull requests

2 participants