-
-
Notifications
You must be signed in to change notification settings - Fork 282
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Improve sync speed #3989
Improve sync speed #3989
Conversation
41a90dc
to
2e0c6c0
Compare
2e0c6c0
to
d14740b
Compare
Performance Report✔️ no performance regression detected Full benchmark results
|
Codecov Report
@@ Coverage Diff @@
## unstable #3989 +/- ##
================================
================================
|
failing tests and merge conflicts |
I know this is lower priority, but it would be nice to not let this rot |
63a0be6
to
332bfc0
Compare
54714f8
to
a2d70e8
Compare
|
Still needs a lot of work to be review-able, putting as draft till then |
3c02268
to
b1d406e
Compare
@@ -69,8 +69,15 @@ export class QueuedStateRegenerator implements IStateRegenerator { | |||
// Check the checkpoint cache (if the pre-state is a checkpoint state) | |||
if (parentEpoch < blockEpoch) { | |||
const checkpointState = this.checkpointStateCache.getLatest(parentRoot, blockEpoch); | |||
if (checkpointState) { | |||
if (checkpointState && computeEpochAtSlot(checkpointState.slot) === blockEpoch) { | |||
// TODO: Miss-use of checkpointStateCache here |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What does this comment mean?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It doesn't conform to the method description: "state dialed to block.slot"
}); | ||
} | ||
|
||
postStates[i] = postState; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we need to keep all states?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, current importBlock flow needs the state of every block. To only require the last we would need to refactor importBlock
I've restarted two contabos to sync Prater from genesis: With unstable Heavy use of a single worker. Does not compete for resources so the time for sig is lower, doing 480 sigs / sec With this branch Node is using the four cores with all the workers, cost per sig is higher, but can do 700 sigs / sec |
try { | ||
const [{postStates, proposerBalanceDeltas}, , {executionStatuses, mergeBlockFound}] = await Promise.all([ | ||
// Run state transition only | ||
// TODO: Ensure it yields to allow flushing to workers and engine API |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
remove this TODO
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This goal is not accomplished, but it's just a hope. I haven't found a way to ensure from JS land that data has been sent to the worker which is the ultimate issue that cap throughput.
assertValidTerminalPowBlock(chain.config, mergeBlock, {executionStatus, powBlock, powBlockParent}); | ||
if (mergeBlockFound !== null) { | ||
// merge block found and is fully valid = state transition + signatures + execution payload. | ||
// TODO: Will this banner be logged during syncing? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it'd be nice to log the merge block so remove this TODO
too?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think it should be logged during sync, tho would prefer to debate in a separate issue. This TODO was more to note that this can happen
* - if all valid, await all and return | ||
* - if one invalid, abort immediately and return index of invalid | ||
*/ | ||
export function rejectFirstInvalidResolveAllValid(isValidPromises: Promise<boolean>[]): Promise<AllValidRes> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
could you move this to utils module or any utils file? this could be reused
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The counter argument is that if we stop using this function it will stay in utils forever un-used. It's very specific to this code, if it gets re-used in the future then we can move to utils
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this looks great 👍
Motivation
Lodestar doesn't have great sync speed compared to other clients. I've been analyzing why and what can we do better.
Verifying blocks requires 3 separate tasks:
Same info but in a table
As you can see the bulk of the work is for signature, which currently only the verifySig part is done in workers, getBlockSignatures happens in the main thread.
Current master processes blocks in series which submits 100 sigs per blocks to the workers. Since that's not a lot all signatures are submitted as a single job to worker id 0. So there's no use of parallelization.
Description
This PR attempts to parallelize better the tasks above.
As very rough estimate this PR reduces the time to process 32 blocks from 8s to 4s. From benchmark data below
Here are some CPU profiles that show this data plotted
1: Full block segment profile
Signatures are collected first (blue wide sections) with some state transitions runs in between (narrow tall pink sections). Then the rest of block's state transitions are run. Then main thread is idle waiting for workers to finish verifying signatures.
2: state transition run detail
hashTreeRoot dominates
TODO