Kintsugi 🍵 (the merge) devnets 2022 tracker #3452

Closed
27 tasks done
g11tech opened this issue Nov 22, 2021 · 1 comment

g11tech commented Nov 22, 2021

The following issues were encountered during local Kintsugi devnet merge runs. The resolution could belong either to Lodestar or to the corresponding ELs; this is just a tracker.

Devnets

  • devnet 0 - multiple issues discovered: the devnet stalled with attacking chains, EL <> CL integration issues surfaced, and an edge case was found for a merge transition attack with unavailable PoW
  • devnet 1 - the merge worked well for lodestar-geth and lighthouse-geth nodes until it was discovered that the majority chain was on a buggy EL execution (triggered by a transaction) because of a wrong feature activation in the geth build. The minority lodestar-nethermind and lighthouse-nethermind nodes were on the correct chain but not finalizing. devnet 1 was prudently saved by nuking the geth-based nodes and restarting them with fixed geth images; the entire network is now finalizing correctly. MetaMask-based transactions were tested and carried out successfully!
  • devnet 2
  • devnet 3 (December 7 2021) tl;dr: devnet 3 ran into multiple issues while booting up because most of the clients didn't have the engine API V3 spec implemented in their builds. Lodestar's PR was also delayed. While many other clients just died, Lodestar kept proposing pre-merge blocks (by design) since its execution API was also not working. After the clients, including Lodestar, were updated with the API change, everything went back to normal. As of the last observation, the lodestar <> geth and lodestar <> nethermind combos were doing well: proposing blocks every now and then, keeping up with the head, and with validators participating in their duties.
  • Kintsugi devnet tl;dr: Lodestar sailed through genesis, merge block, and TTD into the post-merge scenario, with the lodestar-geth combo working well and producing blocks regularly. However, some EL clients were still producing invalid blocks, and in some scenarios Lighthouse interpreted those errors as Syncing execution, which led to a series of nodes not proposing blocks, with lots of skipped slots and reorgs. Prysm also got their CL working, and Besu was also fixing their issues, with network participation nearing ~90%
    • Update the merge devnet joining script with the kintsugi configuration (Add script based setup to join merge devnet #3514)
    • Inspected the nodes spun up by Paritosh and checked how lodestar and the corresponding ELs were doing
    • Helped resolve the restarts of the lodestar beacon node on the devnet, caused by a wrong multi-bootnode ENR args format (, based)
    • Observed and reported pre-genesis epoch transition evaluation (fixed by @tuyennhv in Don't precompute epoch transition at pre genesis #3533)
    • Lodestar <> Nethermind interop issues (status: waiting for a fix)
      • Investigated, debugged and escalated block production issues on lodestar <> nethermind combo because of internal
      • According to the nethermind team, this is caused by concurrency issues that they are currently patching. Coordinated and tested a new build (.4 version) but still hit the issue. The nethermind team is currently investigating and fixing it
    • CL<>EL interop: Peers are getting downscored when EL errors on payload execution #3537

Lodestar
Ignoring EL errors and issues, and treating them as Syncing, kept it moving forward on the chain.

  • handle connection refused (optimistic treatment?)
  • bad merkle root error (ignoring it like syncing let the chain move forward, and geth ultimately started responding with valid further along the chain; these should actually be treated as INVALID, letting optimistic sync do its thing)
  • timeout issues (optimistic treatment?)
  • EL on an attacking chain doesn't move forward (optimistic?)
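
The handling discussed in the list above can be pictured as a small mapping from EL failure kinds to the status the CL acts on. This is a minimal TypeScript sketch, not Lodestar's actual code: the `ExecutionStatus` and `ElFailureKind` names and the `classifyElFailure` helper are hypothetical, and mapping connection refused, timeouts, and a stuck EL to `SYNCING` is an assumption, since those items are still open questions here. Only the bad merkle root case follows a stated conclusion (treat as INVALID).

```typescript
// Hypothetical status names, loosely modeled on the statuses mentioned in this issue.
type ExecutionStatus = "VALID" | "INVALID" | "SYNCING";

// Hypothetical failure kinds, one per bullet in the list above.
type ElFailureKind =
  | "connection_refused"
  | "timeout"
  | "stuck_on_attacking_chain"
  | "bad_merkle_root";

// Decide how the CL should treat a failed payload execution.
function classifyElFailure(kind: ElFailureKind): ExecutionStatus {
  switch (kind) {
    // Assumption: the EL gave no verdict on the payload, so the CL
    // can only proceed optimistically, as if the EL were syncing.
    case "connection_refused":
    case "timeout":
    case "stuck_on_attacking_chain":
      return "SYNCING";
    // The EL actually rejected the payload, so it is reported as INVALID
    // rather than being masked as syncing.
    case "bad_merkle_root":
      return "INVALID";
  }
}

console.log(classifyElFailure("timeout"));
console.log(classifyElFailure("bad_merkle_root"));
```

Keeping the mapping in a single function makes it easy to flip any of the open-question cases once a decision is made.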

[ ] On the merge net, the head is far ahead (4128), but lodestar keeps giving BLOCK_ERROR_PARENT_UNKNOWN on a few blocks (1510, 1761), most probably because they and their parents are not part of the canonical chain and finalization is still at epoch 35 (slot 1120): 06:43:52.316 [] info: Synced - slot: 27418 (skipped 1) - head: 27417 0xb2b8…8d52 - finalized: 0xcb7f…5d60:35 - peers: 8
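
For reference, the slot and epoch numbers quoted above can be cross-checked with a few lines of arithmetic. This is a sketch that assumes 32 slots per epoch (consistent with epoch 35 starting at slot 1120 as quoted); the helper names are illustrative and not taken from the Lodestar codebase.

```typescript
// Assumption: 32 slots per epoch, matching "epoch 35 (1120 slot)" above.
const SLOTS_PER_EPOCH = 32;

// First slot of a given epoch.
function startSlotOfEpoch(epoch: number): number {
  return epoch * SLOTS_PER_EPOCH;
}

// Epoch that a given slot belongs to.
function epochOfSlot(slot: number): number {
  return Math.floor(slot / SLOTS_PER_EPOCH);
}

// True if the slot is past the start of the finalized epoch, i.e. a block
// at that slot is not yet covered by finalization.
function isAfterFinalized(slot: number, finalizedEpoch: number): boolean {
  return slot > startSlotOfEpoch(finalizedEpoch);
}

const finalizedEpoch = 35;
for (const slot of [1510, 1761]) {
  console.log(slot, epochOfSlot(slot), isAfterFinalized(slot, finalizedEpoch));
}
```

Both reported blocks sit after the finalized checkpoint, so they could belong to a non-canonical branch that has not yet been pruned by finalization.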

Geth

  • bad merkle root: on the blocks after the transition, even when on the correct terminal PoW chain (needs debugging by the Geth team)
  • got stuck on an attacking chain (might apply to other ELs as well)
  • segmentation faults and restarts

Nethermind
Waiting for a post-bug-fix build from the team to run on top.

  • invalid executions (update: the scenario is that the CL is on the merge block while the EL hasn't synced till TTD; the expected response is syncing; escalated to the team)

EthereumJs UPDATE 17 December 2021: OUT OF SCOPE as the newer compatible builds are not yet available.
(least integration issues)
- [ ] High CPU usage after executing a few blocks
- [ ] Timeouts and unresponsiveness; gets better on restart, but again only for a few blocks

Other

@g11tech g11tech self-assigned this Nov 22, 2021
@g11tech g11tech changed the title lodestar kintsugi merge devnet issues discovered kintsugi merge devnet issues Nov 22, 2021
@g11tech g11tech changed the title discovered kintsugi merge devnet issues kintsugi merge devnet (issues, observations and participation tracker) Nov 24, 2021
@g11tech g11tech changed the title kintsugi merge devnet (issues, observations and participation tracker) kintsugi merge devnets (issues, observations and participation tracker) Nov 24, 2021
@g11tech g11tech changed the title kintsugi merge devnets (issues, observations and participation tracker) kintsugi merge devnets tracker (issues, observations and participation) Nov 24, 2021
@philknows philknows added the Epic Issues used as milestones and tracking multiple issues. label Dec 10, 2021
@dapplion dapplion changed the title kintsugi merge devnets tracker (issues, observations and participation) Kintsugi 🍵 (the merge) devnets 2022 tracker Jan 29, 2022
@dapplion dapplion changed the title Kintsugi 🍵 (the merge) devnets 2022 tracker Kintsugi 🍵 (the merge) devnets 2022 tracker Jan 29, 2022
@dapplion dapplion pinned this issue Jan 29, 2022
@philknows
Member

Closing this in lieu of #3731. Thanks!

@g11tech g11tech unpinned this issue Feb 9, 2022