Warp sync zombienet tests: add basic BEEFY checks #2854
I didn't go into a detailed review of the chain-spec.json changes.
Everything else looks good.
chain_spec_path = "chain-spec.json"
[[relaychain.nodes]]
name = "alice"
validator = true
args = ["--state-pruning archive --blocks-pruning archive"]
Why is this needed? To my understanding, we should need neither state nor block bodies for warp syncing.
I'm not sure why it's needed, but without it the nodes from which the warp proof is downloaded generate errors along the lines of "State for block ... already discarded" when they are asked for warp sync fragments, and the warp sync freezes. But good point. I can investigate this separately. Will take a note.
Do you see the log "State for block ... already discarded" every time?
I was investigating that some time ago, and it was happening only sometimes (10%, maybe 20% depending on CPU load): #2568
When preparing the database snapshot for the warp-sync test we should keep all blocks, so adding this option is desirable.
> Do you see this log "State for block ... already discarded" every time?
Yes, for me this was happening very often. Definitely more than 50% of the runs. Maybe also because the new db is bigger.
Sorry, on second thought: we don't actually need these params. They will indirectly fix #2568, but the underlying problem will still be there. The database does not actually need to be a full archive.
Please also re-upload db-snapshot created w/o archive mode.
@michalkucharczyk I just checked this and indeed it solves the warp sync issue, but the chain isn't finalizing blocks because bob can't talk to alice. Is it ok if I leave the archive db for the moment? I added a comment that this is a workaround and that it should be removed after fixing #2568.
@michalkucharczyk sorry for the ping. WDYT? Would you have any concerns about merging this PR as it is and then switching to a pruned snapshot in a future PR, after #2568 is fixed?
I'd prefer to have a DB without archive mode (because this is closer to the scenario widely used in production).
But yeah, let's use the archive DB here. I have some ideas for fixing #2568. Please also leave a comment there (just so we don't forget about this). I would also appreciate it if you left the actual toml file that was used for generating the DB in the repo (with an appropriate comment).
Thank you.
Ok thanks! Done:
- Added a comment to "0002-validators-warp-sync test failing #2568" here: 0002-validators-warp-sync test failing #2568 (comment)
- The toml file that was used for generating the DB is already updated, and it also has a TODO comment about removing the workaround
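For reference, a minimal sketch of what a node entry carrying this workaround might look like. The key names and the args value follow the fragment quoted earlier in this review; the enclosing [relaychain] table and the wording of the TODO comment are assumptions made for illustration, not copied from the repository.

```toml
# Sketch only: the [relaychain] table and the TODO wording are assumptions.
[relaychain]
chain_spec_path = "chain-spec.json"

[[relaychain.nodes]]
name = "alice"
validator = true
# TODO: workaround for #2568; drop the archive pruning flags once it is fixed.
args = ["--state-pruning archive --blocks-pruning archive"]
```

Keeping the TODO next to the args line ties the workaround to the issue it depends on, so it is easy to find once a pruned snapshot can be used.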
ty!
substrate/zombienet/0001-basic-warp-sync/generate-warp-sync-database.toml
@michalkucharczyk just adding you as a reviewer in case you want to take a look at this PR, which brings some changes to the zombienet tests for warp sync, since judging by git blame it looks like you added these tests.
substrate/zombienet/0003-block-building-warp-sync/test-block-building-warp-sync.zndsl
Please also re-upload db-snapshot created w/o archive mode.
This reverts commit f1673d8.
lgtm; would be nice if @skunert's suggestion can be integrated (in a followup)
@serban300 could it be that the timeout for BEEFY is too small?
Yes, looks like we should increase the timeout. Will do.
Part of #2787
This is an initial PR that adds some basic BEEFY checks to the warp sync zombienet tests. To be more specific, it does the following:
- Changes the snapshot used by the warp sync zombienet tests to one built from an updated version of the kitchensink runtime that supports BEEFY
- Adds some basic BEEFY checks to the warp sync zombienet tests
- Deduplicates some params of the warp sync zombienet tests, making them easier to extend
Note:
For the setup used by 0002-validators-warp-sync, BEEFY doesn't catch up. It just detects the mandatory blocks, but can't get the proofs for them, probably because no validator node is started from the snapshot (they are all started with warp sync). I would like to investigate this as part of a different issue and merge this PR as it is, in order to have some basic BEEFY warp sync tests to start with.