Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactor sync hashBlocks #4343

Merged
merged 3 commits into from
Aug 1, 2022
Merged

Refactor sync hashBlocks #4343

merged 3 commits into from
Aug 1, 2022

Conversation

twoeths
Copy link
Contributor

@twoeths twoeths commented Jul 25, 2022

Motivation

Using Buffer.concat is better than concatenating Uint8Arrays manually both in term of memory and performance

bytes utils
    ✔ byteArrayConcat 32 items                                         4.916421e+8 ops/s    2.034000 ns/op        -    1639474 runs   3.96 s
    ✔ Buffer.concat 32                                                 7.434944e+8 ops/s    1.345000 ns/op        -     476782 runs  0.808 s

Description

  • Buffer.from(Uint8Array.buffer) gives no memory allocation
  • then do Buffer.concat

@twoeths twoeths requested a review from a team as a code owner July 25, 2022 04:11
@github-actions
Copy link
Contributor

github-actions bot commented Jul 25, 2022

Performance Report

✔️ no performance regression detected

Full benchmark results
Benchmark suite Current: 263cdba Previous: 3320c20 Ratio
getPubkeys - index2pubkey - req 1000 vs - 250000 vc 2.4173 ms/op 2.0988 ms/op 1.15
getPubkeys - validatorsArr - req 1000 vs - 250000 vc 87.719 us/op 72.500 us/op 1.21
BLS verify - blst-native 2.2699 ms/op 1.8568 ms/op 1.22
BLS verifyMultipleSignatures 3 - blst-native 4.6055 ms/op 3.8053 ms/op 1.21
BLS verifyMultipleSignatures 8 - blst-native 10.379 ms/op 8.1928 ms/op 1.27
BLS verifyMultipleSignatures 32 - blst-native 36.636 ms/op 29.702 ms/op 1.23
BLS aggregatePubkeys 32 - blst-native 48.097 us/op 39.264 us/op 1.22
BLS aggregatePubkeys 128 - blst-native 184.98 us/op 152.83 us/op 1.21
getAttestationsForBlock 58.729 ms/op 47.782 ms/op 1.23
isKnown best case - 1 super set check 510.00 ns/op 440.00 ns/op 1.16
isKnown normal case - 2 super set checks 491.00 ns/op 441.00 ns/op 1.11
isKnown worse case - 16 super set checks 507.00 ns/op 448.00 ns/op 1.13
CheckpointStateCache - add get delete 10.941 us/op 8.7630 us/op 1.25
validate gossip signedAggregateAndProof - struct 5.3195 ms/op 4.3798 ms/op 1.21
validate gossip attestation - struct 2.4670 ms/op 2.0889 ms/op 1.18
altair verifyImport mainnet_s3766816:31 10.269 s/op 8.9376 s/op 1.15
pickEth1Vote - no votes 2.6734 ms/op 2.0797 ms/op 1.29
pickEth1Vote - max votes 26.863 ms/op 22.261 ms/op 1.21
pickEth1Vote - Eth1Data hashTreeRoot value x2048 14.460 ms/op 11.216 ms/op 1.29
pickEth1Vote - Eth1Data hashTreeRoot tree x2048 24.439 ms/op 21.032 ms/op 1.16
pickEth1Vote - Eth1Data fastSerialize value x2048 1.7770 ms/op 1.6299 ms/op 1.09
pickEth1Vote - Eth1Data fastSerialize tree x2048 18.474 ms/op 16.645 ms/op 1.11
bytes32 toHexString 1.2740 us/op 1.1280 us/op 1.13
bytes32 Buffer.toString(hex) 849.00 ns/op 685.00 ns/op 1.24
bytes32 Buffer.toString(hex) from Uint8Array 1.0960 us/op 987.00 ns/op 1.11
bytes32 Buffer.toString(hex) + 0x 817.00 ns/op 726.00 ns/op 1.13
Object access 1 prop 0.44000 ns/op 0.36500 ns/op 1.21
Map access 1 prop 0.33800 ns/op 0.29400 ns/op 1.15
Object get x1000 20.686 ns/op 18.025 ns/op 1.15
Map get x1000 1.3250 ns/op 0.99800 ns/op 1.33
Object set x1000 134.23 ns/op 118.31 ns/op 1.13
Map set x1000 82.666 ns/op 71.432 ns/op 1.16
Return object 10000 times 0.42820 ns/op 0.36700 ns/op 1.17
Throw Error 10000 times 6.9106 us/op 5.8871 us/op 1.17
enrSubnets - fastDeserialize 64 bits 3.0340 us/op 2.9370 us/op 1.03
enrSubnets - ssz BitVector 64 bits 1.0010 us/op 752.00 ns/op 1.33
enrSubnets - fastDeserialize 4 bits 452.00 ns/op 415.00 ns/op 1.09
enrSubnets - ssz BitVector 4 bits 813.00 ns/op 739.00 ns/op 1.10
prioritizePeers score -10:0 att 32-0.1 sync 2-0 122.40 us/op 93.048 us/op 1.32
prioritizePeers score 0:0 att 32-0.25 sync 2-0.25 143.37 us/op 135.75 us/op 1.06
prioritizePeers score 0:0 att 32-0.5 sync 2-0.5 256.85 us/op 227.48 us/op 1.13
prioritizePeers score 0:0 att 64-0.75 sync 4-0.75 537.10 us/op 393.01 us/op 1.37
prioritizePeers score 0:0 att 64-1 sync 4-1 530.14 us/op 466.15 us/op 1.14
RateTracker 1000000 limit, 1 obj count per request 227.80 ns/op 188.92 ns/op 1.21
RateTracker 1000000 limit, 2 obj count per request 172.71 ns/op 142.90 ns/op 1.21
RateTracker 1000000 limit, 4 obj count per request 139.34 ns/op 116.47 ns/op 1.20
RateTracker 1000000 limit, 8 obj count per request 124.18 ns/op 105.51 ns/op 1.18
RateTracker with prune 5.1730 us/op 4.4910 us/op 1.15
array of 16000 items push then shift 3.7214 us/op 3.2036 us/op 1.16
LinkedList of 16000 items push then shift 29.580 ns/op 29.065 ns/op 1.02
array of 16000 items push then pop 292.68 ns/op 274.49 ns/op 1.07
LinkedList of 16000 items push then pop 24.001 ns/op 23.875 ns/op 1.01
array of 24000 items push then shift 5.2390 us/op 4.5810 us/op 1.14
LinkedList of 24000 items push then shift 30.571 ns/op 30.031 ns/op 1.02
array of 24000 items push then pop 245.12 ns/op 220.40 ns/op 1.11
LinkedList of 24000 items push then pop 24.287 ns/op 23.656 ns/op 1.03
intersect bitArray bitLen 8 13.717 ns/op 11.762 ns/op 1.17
intersect array and set length 8 192.94 ns/op 167.58 ns/op 1.15
intersect bitArray bitLen 128 85.413 ns/op 62.045 ns/op 1.38
intersect array and set length 128 2.6725 us/op 2.2282 us/op 1.20
Buffer.concat 32 items 2.2290 ns/op
pass gossip attestations to forkchoice per slot 7.1681 ms/op 3.2418 ms/op 2.21
computeDeltas 3.6284 ms/op 3.6937 ms/op 0.98
computeProposerBoostScoreFromBalances 1.0864 ms/op 908.49 us/op 1.20
altair processAttestation - 250000 vs - 7PWei normalcase 4.9456 ms/op 3.9213 ms/op 1.26
altair processAttestation - 250000 vs - 7PWei worstcase 6.5199 ms/op 6.0644 ms/op 1.08
altair processAttestation - setStatus - 1/6 committees join 259.15 us/op 208.26 us/op 1.24
altair processAttestation - setStatus - 1/3 committees join 484.00 us/op 396.00 us/op 1.22
altair processAttestation - setStatus - 1/2 committees join 672.07 us/op 561.41 us/op 1.20
altair processAttestation - setStatus - 2/3 committees join 870.41 us/op 706.09 us/op 1.23
altair processAttestation - setStatus - 4/5 committees join 1.1767 ms/op 997.91 us/op 1.18
altair processAttestation - setStatus - 100% committees join 1.3755 ms/op 1.1830 ms/op 1.16
altair processBlock - 250000 vs - 7PWei normalcase 32.361 ms/op 30.000 ms/op 1.08
altair processBlock - 250000 vs - 7PWei normalcase hashState 46.512 ms/op 41.701 ms/op 1.12
altair processBlock - 250000 vs - 7PWei worstcase 99.858 ms/op 85.829 ms/op 1.16
altair processBlock - 250000 vs - 7PWei worstcase hashState 126.32 ms/op 98.974 ms/op 1.28
phase0 processBlock - 250000 vs - 7PWei normalcase 5.5375 ms/op 4.3909 ms/op 1.26
phase0 processBlock - 250000 vs - 7PWei worstcase 62.710 ms/op 47.163 ms/op 1.33
altair processEth1Data - 250000 vs - 7PWei normalcase 950.17 us/op 862.09 us/op 1.10
Tree 40 250000 create 992.11 ms/op 733.00 ms/op 1.35
Tree 40 250000 get(125000) 342.47 ns/op 289.16 ns/op 1.18
Tree 40 250000 set(125000) 3.1506 us/op 2.3984 us/op 1.31
Tree 40 250000 toArray() 37.721 ms/op 34.721 ms/op 1.09
Tree 40 250000 iterate all - toArray() + loop 37.822 ms/op 34.743 ms/op 1.09
Tree 40 250000 iterate all - get(i) 129.68 ms/op 113.93 ms/op 1.14
MutableVector 250000 create 21.793 ms/op 18.435 ms/op 1.18
MutableVector 250000 get(125000) 15.099 ns/op 13.094 ns/op 1.15
MutableVector 250000 set(125000) 652.13 ns/op 642.28 ns/op 1.02
MutableVector 250000 toArray() 8.3482 ms/op 7.8485 ms/op 1.06
MutableVector 250000 iterate all - toArray() + loop 8.6305 ms/op 8.0488 ms/op 1.07
MutableVector 250000 iterate all - get(i) 4.0389 ms/op 3.2846 ms/op 1.23
Array 250000 create 8.0845 ms/op 7.2277 ms/op 1.12
Array 250000 clone - spread 4.1655 ms/op 4.0619 ms/op 1.03
Array 250000 get(125000) 1.6770 ns/op 1.6150 ns/op 1.04
Array 250000 set(125000) 1.6890 ns/op 1.6190 ns/op 1.04
Array 250000 iterate all - loop 192.76 us/op 169.72 us/op 1.14
effectiveBalanceIncrements clone Uint8Array 300000 98.212 us/op 86.406 us/op 1.14
effectiveBalanceIncrements clone MutableVector 300000 1.2610 us/op 1.2380 us/op 1.02
effectiveBalanceIncrements rw all Uint8Array 300000 292.04 us/op 252.61 us/op 1.16
effectiveBalanceIncrements rw all MutableVector 300000 242.34 ms/op 219.24 ms/op 1.11
phase0 afterProcessEpoch - 250000 vs - 7PWei 223.08 ms/op 185.00 ms/op 1.21
phase0 beforeProcessEpoch - 250000 vs - 7PWei 105.91 ms/op 79.922 ms/op 1.33
altair processEpoch - mainnet_e81889 596.09 ms/op 597.39 ms/op 1.00
mainnet_e81889 - altair beforeProcessEpoch 162.87 ms/op 167.85 ms/op 0.97
mainnet_e81889 - altair processJustificationAndFinalization 39.486 us/op 23.089 us/op 1.71
mainnet_e81889 - altair processInactivityUpdates 12.554 ms/op 11.844 ms/op 1.06
mainnet_e81889 - altair processRewardsAndPenalties 109.45 ms/op 101.70 ms/op 1.08
mainnet_e81889 - altair processRegistryUpdates 7.6440 us/op 4.1380 us/op 1.85
mainnet_e81889 - altair processSlashings 1.9600 us/op 976.00 ns/op 2.01
mainnet_e81889 - altair processEth1DataReset 2.1580 us/op 1.2140 us/op 1.78
mainnet_e81889 - altair processEffectiveBalanceUpdates 2.9519 ms/op 2.4334 ms/op 1.21
mainnet_e81889 - altair processSlashingsReset 14.095 us/op 10.137 us/op 1.39
mainnet_e81889 - altair processRandaoMixesReset 12.322 us/op 9.3420 us/op 1.32
mainnet_e81889 - altair processHistoricalRootsUpdate 2.3980 us/op 1.2780 us/op 1.88
mainnet_e81889 - altair processParticipationFlagUpdates 7.8510 us/op 3.0650 us/op 2.56
mainnet_e81889 - altair processSyncCommitteeUpdates 1.5920 us/op 979.00 ns/op 1.63
mainnet_e81889 - altair afterProcessEpoch 252.55 ms/op 194.01 ms/op 1.30
phase0 processEpoch - mainnet_e58758 725.40 ms/op 560.44 ms/op 1.29
mainnet_e58758 - phase0 beforeProcessEpoch 309.00 ms/op 267.70 ms/op 1.15
mainnet_e58758 - phase0 processJustificationAndFinalization 35.462 us/op 25.608 us/op 1.38
mainnet_e58758 - phase0 processRewardsAndPenalties 95.294 ms/op 159.34 ms/op 0.60
mainnet_e58758 - phase0 processRegistryUpdates 18.592 us/op 11.877 us/op 1.57
mainnet_e58758 - phase0 processSlashings 1.8420 us/op 978.00 ns/op 1.88
mainnet_e58758 - phase0 processEth1DataReset 2.1190 us/op 1.0280 us/op 2.06
mainnet_e58758 - phase0 processEffectiveBalanceUpdates 2.3000 ms/op 2.1547 ms/op 1.07
mainnet_e58758 - phase0 processSlashingsReset 12.217 us/op 5.8130 us/op 2.10
mainnet_e58758 - phase0 processRandaoMixesReset 14.473 us/op 6.9770 us/op 2.07
mainnet_e58758 - phase0 processHistoricalRootsUpdate 2.4640 us/op 1.2190 us/op 2.02
mainnet_e58758 - phase0 processParticipationRecordUpdates 13.247 us/op 5.7580 us/op 2.30
mainnet_e58758 - phase0 afterProcessEpoch 197.89 ms/op 158.81 ms/op 1.25
phase0 processEffectiveBalanceUpdates - 250000 normalcase 3.1435 ms/op 2.5653 ms/op 1.23
phase0 processEffectiveBalanceUpdates - 250000 worstcase 0.5 3.5384 ms/op 3.1037 ms/op 1.14
altair processInactivityUpdates - 250000 normalcase 55.820 ms/op 46.760 ms/op 1.19
altair processInactivityUpdates - 250000 worstcase 49.301 ms/op 56.383 ms/op 0.87
phase0 processRegistryUpdates - 250000 normalcase 15.482 us/op 11.172 us/op 1.39
phase0 processRegistryUpdates - 250000 badcase_full_deposits 608.45 us/op 409.90 us/op 1.48
phase0 processRegistryUpdates - 250000 worstcase 0.5 296.11 ms/op 233.63 ms/op 1.27
altair processRewardsAndPenalties - 250000 normalcase 178.73 ms/op 142.12 ms/op 1.26
altair processRewardsAndPenalties - 250000 worstcase 106.99 ms/op 130.08 ms/op 0.82
phase0 getAttestationDeltas - 250000 normalcase 16.019 ms/op 12.863 ms/op 1.25
phase0 getAttestationDeltas - 250000 worstcase 16.363 ms/op 12.911 ms/op 1.27
phase0 processSlashings - 250000 worstcase 6.7371 ms/op 5.5865 ms/op 1.21
altair processSyncCommitteeUpdates - 250000 331.30 ms/op 277.68 ms/op 1.19
BeaconState.hashTreeRoot - No change 645.00 ns/op 475.00 ns/op 1.36
BeaconState.hashTreeRoot - 1 full validator 77.992 us/op 62.393 us/op 1.25
BeaconState.hashTreeRoot - 32 full validator 1.1060 ms/op 650.56 us/op 1.70
BeaconState.hashTreeRoot - 512 full validator 8.1159 ms/op 6.7646 ms/op 1.20
BeaconState.hashTreeRoot - 1 validator.effectiveBalance 96.724 us/op 81.436 us/op 1.19
BeaconState.hashTreeRoot - 32 validator.effectiveBalance 1.4151 ms/op 1.2239 ms/op 1.16
BeaconState.hashTreeRoot - 512 validator.effectiveBalance 18.600 ms/op 16.390 ms/op 1.13
BeaconState.hashTreeRoot - 1 balances 66.891 us/op 64.280 us/op 1.04
BeaconState.hashTreeRoot - 32 balances 724.32 us/op 646.80 us/op 1.12
BeaconState.hashTreeRoot - 512 balances 7.0530 ms/op 5.8644 ms/op 1.20
BeaconState.hashTreeRoot - 250000 balances 107.41 ms/op 94.808 ms/op 1.13
aggregationBits - 2048 els - zipIndexesInBitList 40.223 us/op 42.427 us/op 0.95
regular array get 100000 times 78.209 us/op 67.458 us/op 1.16
wrappedArray get 100000 times 78.883 us/op 67.471 us/op 1.17
arrayWithProxy get 100000 times 38.871 ms/op 29.058 ms/op 1.34
ssz.Root.equals 545.00 ns/op 564.00 ns/op 0.97
byteArrayEquals 555.00 ns/op 556.00 ns/op 1.00
shuffle list - 16384 els 13.782 ms/op 11.197 ms/op 1.23
shuffle list - 250000 els 199.25 ms/op 164.53 ms/op 1.21
processSlot - 1 slots 15.329 us/op 14.050 us/op 1.09
processSlot - 32 slots 2.2134 ms/op 1.9349 ms/op 1.14
getEffectiveBalanceIncrementsZeroInactive - 250000 vs - 7PWei 1.1442 ms/op 437.83 us/op 2.61
getCommitteeAssignments - req 1 vs - 250000 vc 6.3066 ms/op 5.3011 ms/op 1.19
getCommitteeAssignments - req 100 vs - 250000 vc 9.0318 ms/op 7.3553 ms/op 1.23
getCommitteeAssignments - req 1000 vs - 250000 vc 9.7522 ms/op 7.8562 ms/op 1.24
computeProposers - vc 250000 20.544 ms/op 18.772 ms/op 1.09
computeEpochShuffling - vc 250000 200.82 ms/op 166.11 ms/op 1.21
getNextSyncCommittee - vc 250000 339.52 ms/op 275.77 ms/op 1.23

by benchmarkbot/action

@g11tech
Copy link
Contributor

g11tech commented Jul 25, 2022

didn't we move away from using buffers? not sure though why we did it?

@twoeths
Copy link
Contributor Author

twoeths commented Jul 25, 2022

didn't we move away from using buffers?

I think for modules that run in Browser like as-sha256 or lightclient, we have to use Uint8Array. For modules that we're sure they run in Node.js environment, we can use Buffer if performance matters in that context

not sure though why we did it?

Buffer turns out to be better than Uint8Array in some scenarios:

Looking into the Buffer doc, I found that Buffer.concat is better because it uses the internal preallocated Buffer instances, see https://nodejs.org/api/buffer.html
The Buffer module pre-allocates an internal Buffer instance of size [Buffer.poolSize] that is used as a pool for the fast allocation of new Buffer instances created using [Buffer.allocUnsafe()], [Buffer.from(array)], [Buffer.concat()], and the deprecated new Buffer(size)

Copy link
Contributor

@dapplion dapplion left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@tuyennhv we should be very explicit about this function being meant for NodeJS only. And until we don't have browser tests in Lodestar merging this is not safe as it can break everything again.

@g11tech
Copy link
Contributor

g11tech commented Jul 25, 2022

@tuyennhv we should be very explicit about this function being meant for NodeJS only. And until we don't have browser tests in Lodestar merging this is not safe as it can break everything again.

I think its fine now, because in eth2-light-client demo, we inject buffer using craco https://github.com/ChainSafe/eth2-light-client-demo/blob/master/craco.config.js#L37:L39

@twoeths
Copy link
Contributor Author

twoeths commented Jul 26, 2022

@dapplion this function is only used in range sync of the BeaconNode, we can make it clear by:

  • have a separate nodejs util folder
  • or add more comments to it

what do you think?

@dapplion
Copy link
Contributor

dapplion commented Jul 26, 2022

@tuyennhv Since byteArrayConcat was used only in hashBlocks I've refactored the function to not even need it.

@wemeetagain wemeetagain changed the title Use Buffer.concat for bytes utils Refactor sync hashBlocks Aug 1, 2022
@wemeetagain wemeetagain merged commit f0a5f41 into unstable Aug 1, 2022
@wemeetagain wemeetagain deleted the tuyen/Buffer-concat branch August 1, 2022 16:01
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants