Improve state transition metrics #5171

Merged (5 commits) on Feb 20, 2023


@dapplion commented Feb 20, 2023

Motivation

The current timing metrics miss two significant time consumers in the state transition function:

  • commit step
  • hashTreeRoot step

Description

  • Extend the timer range to include the commit step
  • Time the hashTreeRoot step
  • Time the commit step individually
  • Track whether caches are populated after running the state transition function
  • Rename metrics to more "normal" names
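
The timing changes listed above could be sketched roughly as follows. This is a minimal illustration, not the code in this PR: `CachedState`, `timeStep`, and `stateTransitionWithTimings` are hypothetical names, and the sketch assumes the overall timer ends after the commit step while hashTreeRoot is measured on its own.

```typescript
// Hypothetical sketch: none of these names are taken from the Lodestar codebase.

// Minimal stand-in for a tree-backed state with pending changes (assumption).
interface CachedState {
  commit(): void;
  hashTreeRoot(): Uint8Array;
}

interface StepSeconds {
  total: number; // block processing + commit (assumed range of the overall timer)
  commit: number; // commit step alone
  hashTreeRoot: number; // hashTreeRoot step alone
}

// Run fn and report how long it took, in seconds, using an injectable clock.
function timeStep<T>(fn: () => T, now: () => number): {result: T; seconds: number} {
  const start = now();
  const result = fn();
  return {result, seconds: (now() - start) / 1000};
}

function stateTransitionWithTimings(
  state: CachedState,
  processBlock: (state: CachedState) => void,
  now: () => number = () => Date.now()
): {root: Uint8Array; seconds: StepSeconds} {
  const totalStart = now();
  processBlock(state);

  // Time the commit step individually, but keep it inside the overall range.
  const commit = timeStep(() => state.commit(), now);
  const total = (now() - totalStart) / 1000;

  // Time hashTreeRoot as a separate step.
  const hash = timeStep(() => state.hashTreeRoot(), now);

  return {
    root: hash.result,
    seconds: {total, commit: commit.seconds, hashTreeRoot: hash.seconds},
  };
}
```

In a real node the returned durations would be fed into histogram metrics. The injectable `now` clock is only there to make the sketch easy to test.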

@tuyennhv note that the state transition timing metrics will likely look worse after this change, since the previous timers did not account for all steps.

@dapplion requested a review from a team as a code owner February 20, 2023 04:28
@dapplion added the scope-metrics label (All issues with regards to the exposed metrics.) Feb 20, 2023

github-actions bot commented Feb 20, 2023

Performance Report

✔️ no performance regression detected

Full benchmark results
Benchmark suite, Current (97b1098), Previous (d1cddb7), Ratio (current / previous)
getPubkeys - index2pubkey - req 1000 vs - 250000 vc 1.0306 ms/op 921.33 us/op 1.12
getPubkeys - validatorsArr - req 1000 vs - 250000 vc 57.091 us/op 44.953 us/op 1.27
BLS verify - blst-native 1.2667 ms/op 1.1882 ms/op 1.07
BLS verifyMultipleSignatures 3 - blst-native 2.5624 ms/op 2.4612 ms/op 1.04
BLS verifyMultipleSignatures 8 - blst-native 5.6589 ms/op 5.2841 ms/op 1.07
BLS verifyMultipleSignatures 32 - blst-native 20.612 ms/op 19.071 ms/op 1.08
BLS aggregatePubkeys 32 - blst-native 26.945 us/op 25.568 us/op 1.05
BLS aggregatePubkeys 128 - blst-native 107.83 us/op 99.603 us/op 1.08
getAttestationsForBlock 68.887 ms/op 52.476 ms/op 1.31
isKnown best case - 1 super set check 295.00 ns/op 264.00 ns/op 1.12
isKnown normal case - 2 super set checks 285.00 ns/op 259.00 ns/op 1.10
isKnown worse case - 16 super set checks 282.00 ns/op 257.00 ns/op 1.10
CheckpointStateCache - add get delete 6.3330 us/op 5.2510 us/op 1.21
validate gossip signedAggregateAndProof - struct 2.8805 ms/op 2.7454 ms/op 1.05
validate gossip attestation - struct 1.3885 ms/op 1.2943 ms/op 1.07
pickEth1Vote - no votes 1.4641 ms/op 1.2330 ms/op 1.19
pickEth1Vote - max votes 14.523 ms/op 10.586 ms/op 1.37
pickEth1Vote - Eth1Data hashTreeRoot value x2048 10.297 ms/op 8.9057 ms/op 1.16
pickEth1Vote - Eth1Data hashTreeRoot tree x2048 17.577 ms/op 14.668 ms/op 1.20
pickEth1Vote - Eth1Data fastSerialize value x2048 968.89 us/op 628.88 us/op 1.54
pickEth1Vote - Eth1Data fastSerialize tree x2048 9.4079 ms/op 7.5048 ms/op 1.25
bytes32 toHexString 745.00 ns/op 488.00 ns/op 1.53
bytes32 Buffer.toString(hex) 457.00 ns/op 340.00 ns/op 1.34
bytes32 Buffer.toString(hex) from Uint8Array 931.00 ns/op 552.00 ns/op 1.69
bytes32 Buffer.toString(hex) + 0x 496.00 ns/op 334.00 ns/op 1.49
Object access 1 prop 0.22400 ns/op 0.16400 ns/op 1.37
Map access 1 prop 0.19300 ns/op 0.15800 ns/op 1.22
Object get x1000 10.496 ns/op 6.2400 ns/op 1.68
Map get x1000 0.68800 ns/op 0.53600 ns/op 1.28
Object set x1000 82.639 ns/op 52.662 ns/op 1.57
Map set x1000 56.026 ns/op 42.497 ns/op 1.32
Return object 10000 times 0.26490 ns/op 0.23590 ns/op 1.12
Throw Error 10000 times 4.5292 us/op 4.1765 us/op 1.08
fastMsgIdFn sha256 / 200 bytes 3.7970 us/op 3.5900 us/op 1.06
fastMsgIdFn h32 xxhash / 200 bytes 347.00 ns/op 281.00 ns/op 1.23
fastMsgIdFn h64 xxhash / 200 bytes 532.00 ns/op 381.00 ns/op 1.40
fastMsgIdFn sha256 / 1000 bytes 12.845 us/op 11.423 us/op 1.12
fastMsgIdFn h32 xxhash / 1000 bytes 467.00 ns/op 403.00 ns/op 1.16
fastMsgIdFn h64 xxhash / 1000 bytes 569.00 ns/op 450.00 ns/op 1.26
fastMsgIdFn sha256 / 10000 bytes 115.63 us/op 103.19 us/op 1.12
fastMsgIdFn h32 xxhash / 10000 bytes 2.1570 us/op 1.8740 us/op 1.15
fastMsgIdFn h64 xxhash / 10000 bytes 1.5260 us/op 1.3300 us/op 1.15
enrSubnets - fastDeserialize 64 bits 1.9380 us/op 1.2560 us/op 1.54
enrSubnets - ssz BitVector 64 bits 675.00 ns/op 477.00 ns/op 1.42
enrSubnets - fastDeserialize 4 bits 227.00 ns/op 170.00 ns/op 1.34
enrSubnets - ssz BitVector 4 bits 765.00 ns/op 483.00 ns/op 1.58
prioritizePeers score -10:0 att 32-0.1 sync 2-0 120.47 us/op 92.062 us/op 1.31
prioritizePeers score 0:0 att 32-0.25 sync 2-0.25 142.76 us/op 120.85 us/op 1.18
prioritizePeers score 0:0 att 32-0.5 sync 2-0.5 233.48 us/op 170.80 us/op 1.37
prioritizePeers score 0:0 att 64-0.75 sync 4-0.75 370.08 us/op 330.79 us/op 1.12
prioritizePeers score 0:0 att 64-1 sync 4-1 451.08 us/op 375.61 us/op 1.20
array of 16000 items push then shift 1.8717 us/op 1.6778 us/op 1.12
LinkedList of 16000 items push then shift 10.715 ns/op 9.2050 ns/op 1.16
array of 16000 items push then pop 125.25 ns/op 112.89 ns/op 1.11
LinkedList of 16000 items push then pop 10.696 ns/op 8.6160 ns/op 1.24
array of 24000 items push then shift 2.5009 us/op 2.3245 us/op 1.08
LinkedList of 24000 items push then shift 11.264 ns/op 8.8380 ns/op 1.27
array of 24000 items push then pop 88.110 ns/op 87.126 ns/op 1.01
LinkedList of 24000 items push then pop 10.312 ns/op 8.5650 ns/op 1.20
intersect bitArray bitLen 8 14.947 ns/op 13.338 ns/op 1.12
intersect array and set length 8 127.59 ns/op 78.475 ns/op 1.63
intersect bitArray bitLen 128 53.498 ns/op 43.942 ns/op 1.22
intersect array and set length 128 1.4343 us/op 1.0597 us/op 1.35
Buffer.concat 32 items 3.5430 us/op 2.7490 us/op 1.29
Uint8Array.set 32 items 2.6970 us/op 2.4020 us/op 1.12
pass gossip attestations to forkchoice per slot 3.1270 ms/op 4.0145 ms/op 0.78
computeDeltas 3.3174 ms/op 3.3486 ms/op 0.99
computeProposerBoostScoreFromBalances 1.8710 ms/op 1.8254 ms/op 1.03
altair processAttestation - 250000 vs - 7PWei normalcase 3.2582 ms/op 3.4777 ms/op 0.94
altair processAttestation - 250000 vs - 7PWei worstcase 5.8743 ms/op 4.4096 ms/op 1.33
altair processAttestation - setStatus - 1/6 committees join 148.93 us/op 146.91 us/op 1.01
altair processAttestation - setStatus - 1/3 committees join 297.94 us/op 286.26 us/op 1.04
altair processAttestation - setStatus - 1/2 committees join 390.33 us/op 376.61 us/op 1.04
altair processAttestation - setStatus - 2/3 committees join 521.37 us/op 482.01 us/op 1.08
altair processAttestation - setStatus - 4/5 committees join 721.11 us/op 660.93 us/op 1.09
altair processAttestation - setStatus - 100% committees join 797.17 us/op 773.14 us/op 1.03
altair processBlock - 250000 vs - 7PWei normalcase 18.863 ms/op 18.046 ms/op 1.05
altair processBlock - 250000 vs - 7PWei normalcase hashState 30.574 ms/op 26.638 ms/op 1.15
altair processBlock - 250000 vs - 7PWei worstcase 57.822 ms/op 52.388 ms/op 1.10
altair processBlock - 250000 vs - 7PWei worstcase hashState 73.402 ms/op 70.749 ms/op 1.04
phase0 processBlock - 250000 vs - 7PWei normalcase 1.9994 ms/op 2.3551 ms/op 0.85
phase0 processBlock - 250000 vs - 7PWei worstcase 29.096 ms/op 29.480 ms/op 0.99
altair processEth1Data - 250000 vs - 7PWei normalcase 477.31 us/op 532.38 us/op 0.90
vc - 250000 eb 1 eth1 1 we 0 wn 0 - smpl 15 9.6890 us/op 9.8170 us/op 0.99
vc - 250000 eb 0.95 eth1 0.1 we 0.05 wn 0 - smpl 219 27.507 us/op 25.603 us/op 1.07
vc - 250000 eb 0.95 eth1 0.3 we 0.05 wn 0 - smpl 42 11.965 us/op 13.332 us/op 0.90
vc - 250000 eb 0.95 eth1 0.7 we 0.05 wn 0 - smpl 18 9.6940 us/op 9.5570 us/op 1.01
vc - 250000 eb 0.1 eth1 0.1 we 0 wn 0 - smpl 1020 100.69 us/op 117.86 us/op 0.85
vc - 250000 eb 0.03 eth1 0.03 we 0 wn 0 - smpl 11777 684.53 us/op 667.80 us/op 1.03
vc - 250000 eb 0.01 eth1 0.01 we 0 wn 0 - smpl 16384 903.94 us/op 964.39 us/op 0.94
vc - 250000 eb 0 eth1 0 we 0 wn 0 - smpl 16384 914.56 us/op 890.73 us/op 1.03
vc - 250000 eb 0 eth1 0 we 0 wn 0 nocache - smpl 16384 2.5295 ms/op 2.3957 ms/op 1.06
vc - 250000 eb 0 eth1 1 we 0 wn 0 - smpl 16384 1.5299 ms/op 1.7409 ms/op 0.88
vc - 250000 eb 0 eth1 1 we 0 wn 0 nocache - smpl 16384 3.9302 ms/op 4.6821 ms/op 0.84
Tree 40 250000 create 347.97 ms/op 319.89 ms/op 1.09
Tree 40 250000 get(125000) 191.19 ns/op 188.47 ns/op 1.01
Tree 40 250000 set(125000) 887.61 ns/op 979.19 ns/op 0.91
Tree 40 250000 toArray() 20.838 ms/op 21.185 ms/op 0.98
Tree 40 250000 iterate all - toArray() + loop 21.433 ms/op 20.331 ms/op 1.05
Tree 40 250000 iterate all - get(i) 73.780 ms/op 75.791 ms/op 0.97
MutableVector 250000 create 10.166 ms/op 10.160 ms/op 1.00
MutableVector 250000 get(125000) 6.4770 ns/op 6.5480 ns/op 0.99
MutableVector 250000 set(125000) 259.17 ns/op 303.23 ns/op 0.85
MutableVector 250000 toArray() 3.3361 ms/op 3.6428 ms/op 0.92
MutableVector 250000 iterate all - toArray() + loop 3.4634 ms/op 3.6405 ms/op 0.95
MutableVector 250000 iterate all - get(i) 1.5851 ms/op 1.5902 ms/op 1.00
Array 250000 create 2.8398 ms/op 2.8512 ms/op 1.00
Array 250000 clone - spread 1.0610 ms/op 1.1862 ms/op 0.89
Array 250000 get(125000) 0.54500 ns/op 0.56200 ns/op 0.97
Array 250000 set(125000) 0.63400 ns/op 0.64700 ns/op 0.98
Array 250000 iterate all - loop 93.413 us/op 86.274 us/op 1.08
effectiveBalanceIncrements clone Uint8Array 300000 30.484 us/op 43.033 us/op 0.71
effectiveBalanceIncrements clone MutableVector 300000 323.00 ns/op 331.00 ns/op 0.98
effectiveBalanceIncrements rw all Uint8Array 300000 170.05 us/op 180.90 us/op 0.94
effectiveBalanceIncrements rw all MutableVector 300000 80.185 ms/op 92.533 ms/op 0.87
phase0 afterProcessEpoch - 250000 vs - 7PWei 116.19 ms/op 128.04 ms/op 0.91
phase0 beforeProcessEpoch - 250000 vs - 7PWei 44.303 ms/op 48.447 ms/op 0.91
altair processEpoch - mainnet_e81889 340.93 ms/op 380.09 ms/op 0.90
mainnet_e81889 - altair beforeProcessEpoch 66.391 ms/op 70.824 ms/op 0.94
mainnet_e81889 - altair processJustificationAndFinalization 16.810 us/op 20.147 us/op 0.83
mainnet_e81889 - altair processInactivityUpdates 6.1724 ms/op 6.0544 ms/op 1.02
mainnet_e81889 - altair processRewardsAndPenalties 48.755 ms/op 56.972 ms/op 0.86
mainnet_e81889 - altair processRegistryUpdates 2.3200 us/op 4.7180 us/op 0.49
mainnet_e81889 - altair processSlashings 457.00 ns/op 1.1530 us/op 0.40
mainnet_e81889 - altair processEth1DataReset 532.00 ns/op 633.00 ns/op 0.84
mainnet_e81889 - altair processEffectiveBalanceUpdates 1.2501 ms/op 1.2716 ms/op 0.98
mainnet_e81889 - altair processSlashingsReset 4.5460 us/op 7.7460 us/op 0.59
mainnet_e81889 - altair processRandaoMixesReset 4.4330 us/op 7.0700 us/op 0.63
mainnet_e81889 - altair processHistoricalRootsUpdate 574.00 ns/op 1.5290 us/op 0.38
mainnet_e81889 - altair processParticipationFlagUpdates 2.4690 us/op 4.4650 us/op 0.55
mainnet_e81889 - altair processSyncCommitteeUpdates 515.00 ns/op 851.00 ns/op 0.61
mainnet_e81889 - altair afterProcessEpoch 122.06 ms/op 137.27 ms/op 0.89
phase0 processEpoch - mainnet_e58758 319.33 ms/op 431.62 ms/op 0.74
mainnet_e58758 - phase0 beforeProcessEpoch 122.58 ms/op 160.52 ms/op 0.76
mainnet_e58758 - phase0 processJustificationAndFinalization 15.830 us/op 25.625 us/op 0.62
mainnet_e58758 - phase0 processRewardsAndPenalties 46.461 ms/op 68.393 ms/op 0.68
mainnet_e58758 - phase0 processRegistryUpdates 7.9900 us/op 12.035 us/op 0.66
mainnet_e58758 - phase0 processSlashings 506.00 ns/op 994.00 ns/op 0.51
mainnet_e58758 - phase0 processEth1DataReset 538.00 ns/op 851.00 ns/op 0.63
mainnet_e58758 - phase0 processEffectiveBalanceUpdates 1.0273 ms/op 1.5095 ms/op 0.68
mainnet_e58758 - phase0 processSlashingsReset 2.9360 us/op 7.9780 us/op 0.37
mainnet_e58758 - phase0 processRandaoMixesReset 4.4630 us/op 8.1650 us/op 0.55
mainnet_e58758 - phase0 processHistoricalRootsUpdate 729.00 ns/op 1.1090 us/op 0.66
mainnet_e58758 - phase0 processParticipationRecordUpdates 5.3210 us/op 5.8440 us/op 0.91
mainnet_e58758 - phase0 afterProcessEpoch 95.132 ms/op 102.42 ms/op 0.93
phase0 processEffectiveBalanceUpdates - 250000 normalcase 1.2139 ms/op 1.2862 ms/op 0.94
phase0 processEffectiveBalanceUpdates - 250000 worstcase 0.5 1.4770 ms/op 1.6351 ms/op 0.90
altair processInactivityUpdates - 250000 normalcase 24.443 ms/op 26.574 ms/op 0.92
altair processInactivityUpdates - 250000 worstcase 24.420 ms/op 20.726 ms/op 1.18
phase0 processRegistryUpdates - 250000 normalcase 6.8830 us/op 7.0600 us/op 0.97
phase0 processRegistryUpdates - 250000 badcase_full_deposits 278.78 us/op 267.87 us/op 1.04
phase0 processRegistryUpdates - 250000 worstcase 0.5 122.82 ms/op 123.51 ms/op 0.99
altair processRewardsAndPenalties - 250000 normalcase 67.852 ms/op 63.162 ms/op 1.07
altair processRewardsAndPenalties - 250000 worstcase 66.861 ms/op 69.250 ms/op 0.97
phase0 getAttestationDeltas - 250000 normalcase 7.0736 ms/op 6.3368 ms/op 1.12
phase0 getAttestationDeltas - 250000 worstcase 6.8717 ms/op 6.4128 ms/op 1.07
phase0 processSlashings - 250000 worstcase 3.5959 ms/op 3.5288 ms/op 1.02
altair processSyncCommitteeUpdates - 250000 178.23 ms/op 173.73 ms/op 1.03
BeaconState.hashTreeRoot - No change 314.00 ns/op 258.00 ns/op 1.22
BeaconState.hashTreeRoot - 1 full validator 54.081 us/op 53.096 us/op 1.02
BeaconState.hashTreeRoot - 32 full validator 567.13 us/op 528.07 us/op 1.07
BeaconState.hashTreeRoot - 512 full validator 5.7869 ms/op 5.1867 ms/op 1.12
BeaconState.hashTreeRoot - 1 validator.effectiveBalance 64.790 us/op 62.498 us/op 1.04
BeaconState.hashTreeRoot - 32 validator.effectiveBalance 944.67 us/op 863.39 us/op 1.09
BeaconState.hashTreeRoot - 512 validator.effectiveBalance 12.180 ms/op 11.572 ms/op 1.05
BeaconState.hashTreeRoot - 1 balances 54.177 us/op 48.590 us/op 1.11
BeaconState.hashTreeRoot - 32 balances 459.54 us/op 455.40 us/op 1.01
BeaconState.hashTreeRoot - 512 balances 4.2701 ms/op 4.2369 ms/op 1.01
BeaconState.hashTreeRoot - 250000 balances 76.511 ms/op 73.022 ms/op 1.05
aggregationBits - 2048 els - zipIndexesInBitList 16.375 us/op 15.817 us/op 1.04
regular array get 100000 times 43.379 us/op 31.995 us/op 1.36
wrappedArray get 100000 times 33.162 us/op 31.954 us/op 1.04
arrayWithProxy get 100000 times 15.532 ms/op 14.991 ms/op 1.04
ssz.Root.equals 574.00 ns/op 573.00 ns/op 1.00
byteArrayEquals 573.00 ns/op 522.00 ns/op 1.10
shuffle list - 16384 els 6.9942 ms/op 6.6387 ms/op 1.05
shuffle list - 250000 els 102.46 ms/op 97.562 ms/op 1.05
processSlot - 1 slots 8.7210 us/op 8.5140 us/op 1.02
processSlot - 32 slots 1.4317 ms/op 1.3131 ms/op 1.09
getEffectiveBalanceIncrementsZeroInactive - 250000 vs - 7PWei 206.34 us/op 188.59 us/op 1.09
getCommitteeAssignments - req 1 vs - 250000 vc 2.9408 ms/op 2.8349 ms/op 1.04
getCommitteeAssignments - req 100 vs - 250000 vc 4.2232 ms/op 4.0464 ms/op 1.04
getCommitteeAssignments - req 1000 vs - 250000 vc 4.5192 ms/op 4.3507 ms/op 1.04
RootCache.getBlockRootAtSlot - 250000 vs - 7PWei 5.1400 ns/op 4.5600 ns/op 1.13
state getBlockRootAtSlot - 250000 vs - 7PWei 1.1725 us/op 941.74 ns/op 1.25
computeProposers - vc 250000 11.882 ms/op 10.456 ms/op 1.14
computeEpochShuffling - vc 250000 106.69 ms/op 100.80 ms/op 1.06
getNextSyncCommittee - vc 250000 179.05 ms/op 168.17 ms/op 1.06
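
Each row above ends with the current timing, the previous timing, and their ratio. The units can differ between the two timings, so a ratio check has to normalize them first. The following is a small sketch of how a row could be parsed and its ratio re-derived. `parseBenchRow` and `UNIT_TO_NS` are hypothetical helpers written for this illustration and are not part of the benchmark action.

```typescript
// Hypothetical helper for reading one flattened row of the report above.
const UNIT_TO_NS: Record<string, number> = {ns: 1, us: 1e3, ms: 1e6, s: 1e9};

interface BenchRow {
  name: string;
  currentNs: number; // current timing, normalized to nanoseconds per op
  previousNs: number; // previous timing, normalized to nanoseconds per op
  ratio: number; // ratio as printed in the report
}

function parseBenchRow(line: string): BenchRow {
  // Layout assumed: "<name> <current> <unit>/op <previous> <unit>/op <ratio>"
  const match = /^(.*)\s+([\d.]+) (ns|us|ms|s)\/op\s+([\d.]+) (ns|us|ms|s)\/op\s+([\d.]+)$/.exec(
    line.trim()
  );
  if (match === null) {
    throw new Error(`Unrecognized benchmark row: ${line}`);
  }
  const [, name, cur, curUnit, prev, prevUnit, ratio] = match;
  return {
    name: name.trim(),
    currentNs: Number(cur) * UNIT_TO_NS[curUnit],
    previousNs: Number(prev) * UNIT_TO_NS[prevUnit],
    ratio: Number(ratio),
  };
}

// Re-derive the ratio from the normalized timings (current / previous).
function computedRatio(row: BenchRow): number {
  return row.currentNs / row.previousNs;
}
```

A ratio above 1 means the current commit was slower on that benchmark. The threshold the bot uses to flag a regression is not stated in this report, so the sketch does not assume one.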

by benchmarkbot/action

twoeths previously approved these changes Feb 20, 2023
wemeetagain previously approved these changes Feb 20, 2023
@wemeetagain enabled auto-merge (squash) February 20, 2023 16:18
@wemeetagain merged commit f1b8c5f into unstable Feb 20, 2023
@wemeetagain deleted the dapplion/stfn-metrics branch February 20, 2023 21:02
@wemeetagain

🎉 This PR is included in v1.6.0 🎉
