CL<>EL interop: Peers are getting downscored when EL errors on payload execution #3537

Closed
g11tech opened this issue Dec 17, 2021 · 4 comments · Fixed by #3545

@g11tech
Contributor

g11tech commented Dec 17, 2021

While importing blocks, the EL fails with various kinds of errors (bad merkle state, internal error, etc.) as well as possible timeout or connection-refused errors.

Currently all of these are treated as block verification errors, so the peer that sent the block is downscored. This leads to peers being downscored and banned, and soon Lodestar loses all its peers and goes out of sync! Lodestar then doesn't find peers until the peerstore directory is removed, after which the same cycle happens again (whenever the EL errors again).

Expected:
The EL executePayload call should be wrapped in a try/catch for these errors, and executionEngine should respond with a separate status. The error should be logged, but Lodestar shouldn't penalize these peers, and it should still accept these blocks, as it does with syncing.
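
A minimal sketch of the expected behaviour, not Lodestar's actual code. Every name here (`ExecutionEngineClient`, `safeExecutePayload`, the `EL_ERROR` status, `shouldDownscorePeer`, `shouldImportOptimistically`) is hypothetical and only illustrates the idea of catching EL failures and reporting them as a separate status:

```typescript
// Hypothetical types and names, for illustration only.
type EngineStatus = "VALID" | "INVALID" | "SYNCING";

type ExecutePayloadResult =
  | {status: EngineStatus}
  | {status: "EL_ERROR"; error: string};

interface ExecutionEngineClient {
  // Assumed to reject on EL errors, timeouts, or refused connections.
  executePayload(payload: unknown): Promise<EngineStatus>;
}

// Wrap the EL call so that a malfunctioning EL yields a separate status
// instead of surfacing as a block verification error.
async function safeExecutePayload(
  engine: ExecutionEngineClient,
  payload: unknown
): Promise<ExecutePayloadResult> {
  try {
    return {status: await engine.executePayload(payload)};
  } catch (e) {
    const error = e instanceof Error ? e.message : String(e);
    // Log the error, but do not blame the peer that sent the block.
    console.warn(`executePayload failed: ${error}`);
    return {status: "EL_ERROR", error};
  }
}

// Only an explicit INVALID verdict from the EL is the peer's fault.
function shouldDownscorePeer(result: ExecutePayloadResult): boolean {
  return result.status === "INVALID";
}

// An EL failure is handled like SYNCING: the block is still accepted.
function shouldImportOptimistically(result: ExecutePayloadResult): boolean {
  return result.status === "SYNCING" || result.status === "EL_ERROR";
}
```

With this shape, the block import path would call `shouldDownscorePeer` on the result rather than penalizing the peer on any thrown error.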

However, there is a debate in the devnet debug group about whether CLs should try to discriminate between these errors (e.g. treat a merkle state error as invalid). Waiting until that discussion resolves.
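
For illustration only, discriminating between the errors could look roughly like the sketch below. The function name, the result kinds, and the matched substring are all assumptions; real EL clients may word their errors differently, and the discussion above has not settled whether this mapping is desirable:

```typescript
// Hypothetical classification of EL error messages.
type ElFailureKind = "INVALID_PAYLOAD" | "EL_UNAVAILABLE";

// Assumption: substrings that would mark the payload itself as bad.
// Illustrative only, not taken from any EL client's real error strings.
const INVALID_PAYLOAD_HINTS = ["bad merkle state"];

function classifyElError(message: string): ElFailureKind {
  const lower = message.toLowerCase();
  // Anything unrecognised (internal error, timeout, connection refused)
  // is treated as an EL malfunction rather than a bad block.
  return INVALID_PAYLOAD_HINTS.some((hint) => lower.includes(hint))
    ? "INVALID_PAYLOAD"
    : "EL_UNAVAILABLE";
}
```

Defaulting to `EL_UNAVAILABLE` is the conservative choice here: an unknown failure never causes a peer to be penalized.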

Observed with: lodestar <> nethermind

@g11tech g11tech self-assigned this Dec 17, 2021
@dapplion
Contributor

Oh good points! With downgrade do you mean downscore?

@g11tech
Contributor Author

g11tech commented Dec 17, 2021

> Oh good points! With downgrade do you mean downscore?

yes 🙂

@g11tech g11tech changed the title CL<>EL interop: Peers are getting downgraded when EL errors on payload execution CL<>EL interop: Peers are getting downscored when EL errors on payload execution Dec 17, 2021
@twoeths
Contributor

twoeths commented Jan 5, 2022

lodestar<>geth could not maintain a good number of peers either

[Image: Screen Shot 2022-01-05 at 12 13 36]

The main reason is that lodestar sent goodbye requests to other nodes with the reason "peer score too low"

[Image: Screen Shot 2022-01-05 at 12 14 33]

@g11tech is this the same issue? Do I need to search for any specific logs to confirm?

@g11tech
Contributor Author

g11tech commented Jan 5, 2022

@tuyennhv yes, this is the same issue. I have cleared the peerstore and restarted the nodes, and they now seem to have a good peer count. Once PR #3545 gets merged, peers shouldn't get penalized for EL malfunctions.
