Only call regen.getBlockSlotState() if necessary #5401

Merged
1 commit merged into unstable from tuyen/get_block_slot_state
Apr 23, 2023

Conversation

twoeths
Contributor

@twoeths twoeths commented Apr 22, 2023

Motivation

Description

  • regen.getBlockSlotState() always goes through the regen queue, so use it with care and call it only when necessary
  • Modify/add the getHeadStateAtCurrentEpoch() and getHeadStateAtEpoch() APIs in chain so that they call regen only when necessary. Because they use the checkpointStateCache (via getHeadState()), they reach the queue at most once per epoch
  • Add a RegenCaller for voluntary_exit validation
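
A minimal sketch of the pattern described in the second bullet. The types and names here (`State`, `ChainSketch`, `regenBlockSlotState`, the `Map` standing in for checkpointStateCache) are simplified, hypothetical stand-ins, not the actual Lodestar interfaces or the code from this PR. It only illustrates why the queued regen path is reached at most once per epoch.

```typescript
// Hypothetical, simplified stand-ins for illustration only.
const SLOTS_PER_EPOCH = 32;

interface State {
  slot: number;
}

const computeEpochAtSlot = (slot: number): number => Math.floor(slot / SLOTS_PER_EPOCH);
const computeStartSlotAtEpoch = (epoch: number): number => epoch * SLOTS_PER_EPOCH;

class ChainSketch {
  /** Counts how often the expensive, queued regen path is taken. */
  regenCalls = 0;
  /** Assumed stand-in for checkpointStateCache, keyed by epoch. */
  private checkpointStates = new Map<number, State>();
  private head: State;

  constructor(head: State) {
    this.head = head;
  }

  /** Returns the most advanced cached checkpoint state of head, else head itself. */
  getHeadState(): State {
    let best = this.head;
    for (const state of this.checkpointStates.values()) {
      if (state.slot > best.slot) best = state;
    }
    return best;
  }

  /** Assumed stand-in for regen.getBlockSlotState(): queued, so potentially slow. */
  private async regenBlockSlotState(slot: number): Promise<State> {
    this.regenCalls++;
    const state: State = {slot};
    // Cache the dialed-forward state so later calls in this epoch skip regen.
    this.checkpointStates.set(computeEpochAtSlot(slot), state);
    return state;
  }

  async getHeadStateAtEpoch(epoch: number): Promise<State> {
    const headState = this.getHeadState();
    // Head state is already at or past the requested epoch: no regen needed.
    if (epoch <= computeEpochAtSlot(headState.slot)) {
      return headState;
    }
    // Otherwise dial the head state forward to the first slot of the epoch.
    return this.regenBlockSlotState(computeStartSlotAtEpoch(epoch));
  }
}
```

In this sketch, the first request for a future epoch goes through `regenBlockSlotState`, and every later request for that epoch (or an earlier one) is served from the cached state without touching the queue.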

close #5400

@github-actions
Contributor

github-actions bot commented Apr 22, 2023

Performance Report

✔️ no performance regression detected

Full benchmark results
Benchmark suite Current: 72e3f8e Previous: 99d4944 Ratio
getPubkeys - index2pubkey - req 1000 vs - 250000 vc 693.28 us/op 738.15 us/op 0.94
getPubkeys - validatorsArr - req 1000 vs - 250000 vc 49.187 us/op 45.981 us/op 1.07
BLS verify - blst-native 1.2215 ms/op 1.1973 ms/op 1.02
BLS verifyMultipleSignatures 3 - blst-native 2.4764 ms/op 2.4427 ms/op 1.01
BLS verifyMultipleSignatures 8 - blst-native 5.3309 ms/op 5.3459 ms/op 1.00
BLS verifyMultipleSignatures 32 - blst-native 19.297 ms/op 19.422 ms/op 0.99
BLS aggregatePubkeys 32 - blst-native 25.752 us/op 26.654 us/op 0.97
BLS aggregatePubkeys 128 - blst-native 100.80 us/op 102.73 us/op 0.98
getAttestationsForBlock 51.074 ms/op 57.232 ms/op 0.89
isKnown best case - 1 super set check 249.00 ns/op 261.00 ns/op 0.95
isKnown normal case - 2 super set checks 237.00 ns/op 256.00 ns/op 0.93
isKnown worse case - 16 super set checks 235.00 ns/op 256.00 ns/op 0.92
CheckpointStateCache - add get delete 4.7760 us/op 5.2570 us/op 0.91
validate gossip signedAggregateAndProof - struct 2.6446 ms/op 2.8352 ms/op 0.93
validate gossip attestation - struct 1.2666 ms/op 1.3531 ms/op 0.94
pickEth1Vote - no votes 1.1822 ms/op 1.3725 ms/op 0.86
pickEth1Vote - max votes 9.4737 ms/op 12.690 ms/op 0.75
pickEth1Vote - Eth1Data hashTreeRoot value x2048 8.5381 ms/op 9.7723 ms/op 0.87
pickEth1Vote - Eth1Data hashTreeRoot tree x2048 14.333 ms/op 15.520 ms/op 0.92
pickEth1Vote - Eth1Data fastSerialize value x2048 613.10 us/op 698.58 us/op 0.88
pickEth1Vote - Eth1Data fastSerialize tree x2048 8.0575 ms/op 7.5487 ms/op 1.07
bytes32 toHexString 471.00 ns/op 562.00 ns/op 0.84
bytes32 Buffer.toString(hex) 338.00 ns/op 410.00 ns/op 0.82
bytes32 Buffer.toString(hex) from Uint8Array 541.00 ns/op 571.00 ns/op 0.95
bytes32 Buffer.toString(hex) + 0x 336.00 ns/op 371.00 ns/op 0.91
Object access 1 prop 0.16000 ns/op 0.16900 ns/op 0.95
Map access 1 prop 0.15400 ns/op 0.16500 ns/op 0.93
Object get x1000 6.3840 ns/op 7.3280 ns/op 0.87
Map get x1000 0.58800 ns/op 0.63300 ns/op 0.93
Object set x1000 48.538 ns/op 56.034 ns/op 0.87
Map set x1000 41.723 ns/op 47.197 ns/op 0.88
Return object 10000 times 0.22400 ns/op 0.24540 ns/op 0.91
Throw Error 10000 times 4.0413 us/op 4.3266 us/op 0.93
fastMsgIdFn sha256 / 200 bytes 3.2910 us/op 3.5530 us/op 0.93
fastMsgIdFn h32 xxhash / 200 bytes 263.00 ns/op 300.00 ns/op 0.88
fastMsgIdFn h64 xxhash / 200 bytes 381.00 ns/op 401.00 ns/op 0.95
fastMsgIdFn sha256 / 1000 bytes 11.287 us/op 11.829 us/op 0.95
fastMsgIdFn h32 xxhash / 1000 bytes 396.00 ns/op 428.00 ns/op 0.93
fastMsgIdFn h64 xxhash / 1000 bytes 452.00 ns/op 511.00 ns/op 0.88
fastMsgIdFn sha256 / 10000 bytes 101.31 us/op 105.64 us/op 0.96
fastMsgIdFn h32 xxhash / 10000 bytes 1.8880 us/op 1.9460 us/op 0.97
fastMsgIdFn h64 xxhash / 10000 bytes 1.3450 us/op 1.3710 us/op 0.98
enrSubnets - fastDeserialize 64 bits 1.2440 us/op 1.3300 us/op 0.94
enrSubnets - ssz BitVector 64 bits 469.00 ns/op 499.00 ns/op 0.94
enrSubnets - fastDeserialize 4 bits 170.00 ns/op 181.00 ns/op 0.94
enrSubnets - ssz BitVector 4 bits 464.00 ns/op 557.00 ns/op 0.83
prioritizePeers score -10:0 att 32-0.1 sync 2-0 106.20 us/op 113.19 us/op 0.94
prioritizePeers score 0:0 att 32-0.25 sync 2-0.25 129.28 us/op 163.36 us/op 0.79
prioritizePeers score 0:0 att 32-0.5 sync 2-0.5 158.98 us/op 196.99 us/op 0.81
prioritizePeers score 0:0 att 64-0.75 sync 4-0.75 286.06 us/op 357.70 us/op 0.80
prioritizePeers score 0:0 att 64-1 sync 4-1 352.12 us/op 411.62 us/op 0.86
array of 16000 items push then shift 1.5998 us/op 1.6916 us/op 0.95
LinkedList of 16000 items push then shift 8.6360 ns/op 9.1110 ns/op 0.95
array of 16000 items push then pop 80.841 ns/op 111.09 ns/op 0.73
LinkedList of 16000 items push then pop 8.3520 ns/op 9.3100 ns/op 0.90
array of 24000 items push then shift 2.3148 us/op 2.4024 us/op 0.96
LinkedList of 24000 items push then shift 8.4300 ns/op 9.0900 ns/op 0.93
array of 24000 items push then pop 77.538 ns/op 83.101 ns/op 0.93
LinkedList of 24000 items push then pop 8.2590 ns/op 9.0130 ns/op 0.92
intersect bitArray bitLen 8 12.883 ns/op 13.682 ns/op 0.94
intersect array and set length 8 75.175 ns/op 86.889 ns/op 0.87
intersect bitArray bitLen 128 42.817 ns/op 44.447 ns/op 0.96
intersect array and set length 128 1.0305 us/op 1.2533 us/op 0.82
Buffer.concat 32 items 2.5970 us/op 2.8360 us/op 0.92
Uint8Array.set 32 items 2.9810 us/op 2.8410 us/op 1.05
pass gossip attestations to forkchoice per slot 2.6130 ms/op 3.5158 ms/op 0.74
computeDeltas 3.3976 ms/op 3.1484 ms/op 1.08
computeProposerBoostScoreFromBalances 1.7533 ms/op 1.8024 ms/op 0.97
altair processAttestation - 250000 vs - 7PWei normalcase 2.1497 ms/op 2.5485 ms/op 0.84
altair processAttestation - 250000 vs - 7PWei worstcase 3.2194 ms/op 4.4601 ms/op 0.72
altair processAttestation - setStatus - 1/6 committees join 135.82 us/op 145.22 us/op 0.94
altair processAttestation - setStatus - 1/3 committees join 269.58 us/op 284.43 us/op 0.95
altair processAttestation - setStatus - 1/2 committees join 365.33 us/op 380.41 us/op 0.96
altair processAttestation - setStatus - 2/3 committees join 456.57 us/op 481.23 us/op 0.95
altair processAttestation - setStatus - 4/5 committees join 637.42 us/op 681.25 us/op 0.94
altair processAttestation - setStatus - 100% committees join 753.87 us/op 807.67 us/op 0.93
altair processBlock - 250000 vs - 7PWei normalcase 18.662 ms/op 17.947 ms/op 1.04
altair processBlock - 250000 vs - 7PWei normalcase hashState 25.865 ms/op 27.513 ms/op 0.94
altair processBlock - 250000 vs - 7PWei worstcase 50.667 ms/op 50.886 ms/op 1.00
altair processBlock - 250000 vs - 7PWei worstcase hashState 66.407 ms/op 70.349 ms/op 0.94
phase0 processBlock - 250000 vs - 7PWei normalcase 2.0304 ms/op 2.3649 ms/op 0.86
phase0 processBlock - 250000 vs - 7PWei worstcase 27.552 ms/op 31.293 ms/op 0.88
altair processEth1Data - 250000 vs - 7PWei normalcase 463.36 us/op 562.95 us/op 0.82
vc - 250000 eb 1 eth1 1 we 0 wn 0 - smpl 15 6.6900 us/op 10.711 us/op 0.62
vc - 250000 eb 0.95 eth1 0.1 we 0.05 wn 0 - smpl 219 19.166 us/op 28.048 us/op 0.68
vc - 250000 eb 0.95 eth1 0.3 we 0.05 wn 0 - smpl 42 8.2470 us/op 11.780 us/op 0.70
vc - 250000 eb 0.95 eth1 0.7 we 0.05 wn 0 - smpl 18 6.3430 us/op 11.368 us/op 0.56
vc - 250000 eb 0.1 eth1 0.1 we 0 wn 0 - smpl 1020 74.074 us/op 115.90 us/op 0.64
vc - 250000 eb 0.03 eth1 0.03 we 0 wn 0 - smpl 11777 596.47 us/op 677.86 us/op 0.88
vc - 250000 eb 0.01 eth1 0.01 we 0 wn 0 - smpl 16384 909.43 us/op 946.42 us/op 0.96
vc - 250000 eb 0 eth1 0 we 0 wn 0 - smpl 16384 884.88 us/op 944.27 us/op 0.94
vc - 250000 eb 0 eth1 0 we 0 wn 0 nocache - smpl 16384 2.2054 ms/op 2.4432 ms/op 0.90
vc - 250000 eb 0 eth1 1 we 0 wn 0 - smpl 16384 1.6688 ms/op 1.6162 ms/op 1.03
vc - 250000 eb 0 eth1 1 we 0 wn 0 nocache - smpl 16384 3.7407 ms/op 4.1113 ms/op 0.91
Tree 40 250000 create 313.53 ms/op 365.66 ms/op 0.86
Tree 40 250000 get(125000) 175.12 ns/op 196.06 ns/op 0.89
Tree 40 250000 set(125000) 894.50 ns/op 1.0613 us/op 0.84
Tree 40 250000 toArray() 16.718 ms/op 23.667 ms/op 0.71
Tree 40 250000 iterate all - toArray() + loop 17.032 ms/op 23.862 ms/op 0.71
Tree 40 250000 iterate all - get(i) 65.552 ms/op 77.271 ms/op 0.85
MutableVector 250000 create 9.6400 ms/op 11.574 ms/op 0.83
MutableVector 250000 get(125000) 6.3310 ns/op 6.5410 ns/op 0.97
MutableVector 250000 set(125000) 257.24 ns/op 294.67 ns/op 0.87
MutableVector 250000 toArray() 2.7547 ms/op 3.8719 ms/op 0.71
MutableVector 250000 iterate all - toArray() + loop 2.8383 ms/op 4.1638 ms/op 0.68
MutableVector 250000 iterate all - get(i) 1.4960 ms/op 1.5316 ms/op 0.98
Array 250000 create 2.7542 ms/op 3.3074 ms/op 0.83
Array 250000 clone - spread 1.0537 ms/op 1.1967 ms/op 0.88
Array 250000 get(125000) 0.52400 ns/op 0.55900 ns/op 0.94
Array 250000 set(125000) 0.59500 ns/op 0.63400 ns/op 0.94
Array 250000 iterate all - loop 95.160 us/op 95.937 us/op 0.99
effectiveBalanceIncrements clone Uint8Array 300000 22.288 us/op 37.430 us/op 0.60
effectiveBalanceIncrements clone MutableVector 300000 322.00 ns/op 348.00 ns/op 0.93
effectiveBalanceIncrements rw all Uint8Array 300000 164.09 us/op 175.44 us/op 0.94
effectiveBalanceIncrements rw all MutableVector 300000 74.139 ms/op 82.576 ms/op 0.90
phase0 afterProcessEpoch - 250000 vs - 7PWei 108.47 ms/op 116.54 ms/op 0.93
phase0 beforeProcessEpoch - 250000 vs - 7PWei 42.089 ms/op 39.446 ms/op 1.07
altair processEpoch - mainnet_e81889 323.56 ms/op 318.46 ms/op 1.02
mainnet_e81889 - altair beforeProcessEpoch 62.596 ms/op 65.735 ms/op 0.95
mainnet_e81889 - altair processJustificationAndFinalization 17.098 us/op 18.709 us/op 0.91
mainnet_e81889 - altair processInactivityUpdates 5.0206 ms/op 6.5831 ms/op 0.76
mainnet_e81889 - altair processRewardsAndPenalties 66.171 ms/op 70.068 ms/op 0.94
mainnet_e81889 - altair processRegistryUpdates 3.0020 us/op 4.3450 us/op 0.69
mainnet_e81889 - altair processSlashings 449.00 ns/op 1.4720 us/op 0.31
mainnet_e81889 - altair processEth1DataReset 535.00 ns/op 633.00 ns/op 0.85
mainnet_e81889 - altair processEffectiveBalanceUpdates 1.1916 ms/op 1.2618 ms/op 0.94
mainnet_e81889 - altair processSlashingsReset 3.7430 us/op 4.3090 us/op 0.87
mainnet_e81889 - altair processRandaoMixesReset 7.0100 us/op 7.3850 us/op 0.95
mainnet_e81889 - altair processHistoricalRootsUpdate 607.00 ns/op 720.00 ns/op 0.84
mainnet_e81889 - altair processParticipationFlagUpdates 3.5560 us/op 3.1120 us/op 1.14
mainnet_e81889 - altair processSyncCommitteeUpdates 496.00 ns/op 942.00 ns/op 0.53
mainnet_e81889 - altair afterProcessEpoch 122.14 ms/op 131.01 ms/op 0.93
phase0 processEpoch - mainnet_e58758 357.10 ms/op 370.77 ms/op 0.96
mainnet_e58758 - phase0 beforeProcessEpoch 132.44 ms/op 141.81 ms/op 0.93
mainnet_e58758 - phase0 processJustificationAndFinalization 16.037 us/op 16.524 us/op 0.97
mainnet_e58758 - phase0 processRewardsAndPenalties 63.542 ms/op 68.020 ms/op 0.93
mainnet_e58758 - phase0 processRegistryUpdates 8.5120 us/op 7.7560 us/op 1.10
mainnet_e58758 - phase0 processSlashings 508.00 ns/op 514.00 ns/op 0.99
mainnet_e58758 - phase0 processEth1DataReset 574.00 ns/op 529.00 ns/op 1.09
mainnet_e58758 - phase0 processEffectiveBalanceUpdates 1.2201 ms/op 1.3105 ms/op 0.93
mainnet_e58758 - phase0 processSlashingsReset 4.5320 us/op 4.6150 us/op 0.98
mainnet_e58758 - phase0 processRandaoMixesReset 4.8810 us/op 4.8120 us/op 1.01
mainnet_e58758 - phase0 processHistoricalRootsUpdate 557.00 ns/op 636.00 ns/op 0.88
mainnet_e58758 - phase0 processParticipationRecordUpdates 4.1260 us/op 4.1720 us/op 0.99
mainnet_e58758 - phase0 afterProcessEpoch 93.664 ms/op 102.54 ms/op 0.91
phase0 processEffectiveBalanceUpdates - 250000 normalcase 1.2062 ms/op 1.2953 ms/op 0.93
phase0 processEffectiveBalanceUpdates - 250000 worstcase 0.5 1.4597 ms/op 1.5921 ms/op 0.92
altair processInactivityUpdates - 250000 normalcase 26.154 ms/op 23.543 ms/op 1.11
altair processInactivityUpdates - 250000 worstcase 26.828 ms/op 29.189 ms/op 0.92
phase0 processRegistryUpdates - 250000 normalcase 7.0580 us/op 7.7110 us/op 0.92
phase0 processRegistryUpdates - 250000 badcase_full_deposits 228.61 us/op 282.49 us/op 0.81
phase0 processRegistryUpdates - 250000 worstcase 0.5 122.57 ms/op 130.07 ms/op 0.94
altair processRewardsAndPenalties - 250000 normalcase 65.475 ms/op 68.536 ms/op 0.96
altair processRewardsAndPenalties - 250000 worstcase 68.995 ms/op 68.911 ms/op 1.00
phase0 getAttestationDeltas - 250000 normalcase 6.4686 ms/op 6.6744 ms/op 0.97
phase0 getAttestationDeltas - 250000 worstcase 6.3715 ms/op 6.8047 ms/op 0.94
phase0 processSlashings - 250000 worstcase 3.5291 ms/op 3.3918 ms/op 1.04
altair processSyncCommitteeUpdates - 250000 174.12 ms/op 187.40 ms/op 0.93
BeaconState.hashTreeRoot - No change 254.00 ns/op 363.00 ns/op 0.70
BeaconState.hashTreeRoot - 1 full validator 51.119 us/op 52.948 us/op 0.97
BeaconState.hashTreeRoot - 32 full validator 503.27 us/op 565.37 us/op 0.89
BeaconState.hashTreeRoot - 512 full validator 5.4433 ms/op 5.3592 ms/op 1.02
BeaconState.hashTreeRoot - 1 validator.effectiveBalance 61.718 us/op 65.084 us/op 0.95
BeaconState.hashTreeRoot - 32 validator.effectiveBalance 871.83 us/op 929.00 us/op 0.94
BeaconState.hashTreeRoot - 512 validator.effectiveBalance 11.232 ms/op 13.222 ms/op 0.85
BeaconState.hashTreeRoot - 1 balances 50.203 us/op 50.711 us/op 0.99
BeaconState.hashTreeRoot - 32 balances 436.02 us/op 481.68 us/op 0.91
BeaconState.hashTreeRoot - 512 balances 4.4373 ms/op 4.6453 ms/op 0.96
BeaconState.hashTreeRoot - 250000 balances 75.963 ms/op 73.030 ms/op 1.04
aggregationBits - 2048 els - zipIndexesInBitList 14.901 us/op 17.686 us/op 0.84
regular array get 100000 times 32.026 us/op 45.391 us/op 0.71
wrappedArray get 100000 times 32.143 us/op 33.833 us/op 0.95
arrayWithProxy get 100000 times 15.788 ms/op 15.732 ms/op 1.00
ssz.Root.equals 530.00 ns/op 560.00 ns/op 0.95
byteArrayEquals 520.00 ns/op 590.00 ns/op 0.88
shuffle list - 16384 els 6.6726 ms/op 7.0643 ms/op 0.94
shuffle list - 250000 els 97.823 ms/op 105.42 ms/op 0.93
processSlot - 1 slots 8.4140 us/op 8.9270 us/op 0.94
processSlot - 32 slots 1.3148 ms/op 1.4015 ms/op 0.94
getEffectiveBalanceIncrementsZeroInactive - 250000 vs - 7PWei 37.867 ms/op 39.024 ms/op 0.97
getCommitteeAssignments - req 1 vs - 250000 vc 2.8728 ms/op 3.0540 ms/op 0.94
getCommitteeAssignments - req 100 vs - 250000 vc 4.0875 ms/op 4.4554 ms/op 0.92
getCommitteeAssignments - req 1000 vs - 250000 vc 4.3685 ms/op 4.8923 ms/op 0.89
RootCache.getBlockRootAtSlot - 250000 vs - 7PWei 4.3300 ns/op 5.4800 ns/op 0.79
state getBlockRootAtSlot - 250000 vs - 7PWei 960.65 ns/op 768.55 ns/op 1.25
computeProposers - vc 250000 10.263 ms/op 12.112 ms/op 0.85
computeEpochShuffling - vc 250000 99.331 ms/op 110.87 ms/op 0.90
getNextSyncCommittee - vc 250000 169.02 ms/op 196.98 ms/op 0.86
computeSigningRoot for AttestationData 13.584 us/op 15.549 us/op 0.87
hash AttestationData serialized data then Buffer.toString(base64) 2.3527 us/op 2.6036 us/op 0.90
toHexString serialized data 1.0402 us/op 1.1995 us/op 0.87
Buffer.toString(base64) 316.74 ns/op 359.90 ns/op 0.88

by benchmarkbot/action

@twoeths
Contributor Author

twoeths commented Apr 22, 2023

Tested over the last 5 hours on a mainnet test node (feat2 running this branch vs. unstable):

  • feat2: voluntary_exit Job Time is < 200ms consistently

Screenshot 2023-04-22 at 21 10 38

  • unstable: some jobs take up to 3.5s

Screenshot 2023-04-22 at 21 10 58

  • feat2: Regen Job Wait Time: no item > 500ms

Screenshot 2023-04-22 at 21 35 01

  • unstable: Regen Job Wait Time: a couple of items > 500ms

Screenshot 2023-04-22 at 21 35 59

  • feat2: Regen job queue length is 2 most of the time (1 for PrepareNextSlot and 1 for validateGossipBlock)

Screenshot 2023-04-22 at 21 40 08

  • unstable: job queue length is occasionally > 2, depending on voluntary_exit messages

Screenshot 2023-04-22 at 21 41 56

@twoeths twoeths marked this pull request as ready for review April 22, 2023 14:44
@twoeths twoeths requested a review from a team as a code owner April 22, 2023 14:44
Member

@wemeetagain wemeetagain left a comment

lgtm

@wemeetagain wemeetagain merged commit ad0c58f into unstable Apr 23, 2023
@wemeetagain wemeetagain deleted the tuyen/get_block_slot_state branch April 23, 2023 15:07
@wemeetagain
Member

🎉 This PR is included in v1.8.0 🎉

// using getHeadState() means we'll use checkpointStateCache if it's available
const headState = this.getHeadState();
// head state is in the same epoch, or we pulled up head state already from past epoch
if (epoch <= computeEpochAtSlot(headState.slot)) {
Member

@tuyennhv shouldn't this be a strict equals check? I noticed this method always returns the same state if a previous epoch is passed in

Contributor Author

@nflaig no, this should be "<=". The function is meant to return the head state dialed forward to an epoch. If the provided epoch < head state epoch, it just returns the head state.

Also, if we let it fall through to the regen.getBlockSlotState call below and the provided slot < head state slot, it will throw an error.

Member

That makes sense. I was trying to figure out what would be required to get the proposer duties of a finalized epoch. I'd assume the right method to use in that case is getStateBySlot.

async getStateBySlot(

You also noted in #5846 that Lodestar does not persist finalized states very frequently, which would be an issue in that case. We could solve this by adding the --chain.archiveStateEpochFrequency 1 flag, but this significantly increases storage requirements (which might be fine for someone who really needs that data).

Development

Successfully merging this pull request may close these issues.

High voluntary_exit validation time
3 participants