Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add config option tenure_last_block_proposal_timeout_secs and do not allow reorgs at tenure boundary before it is exceeded #5425

Open
wants to merge 4 commits into
base: develop
Choose a base branch
from

Conversation

jferrant
Copy link
Collaborator

@jferrant jferrant commented Nov 4, 2024

Closes #5423

…est that ensures reorgs can't happen before timeout

Signed-off-by: Jacinta Ferrant <[email protected]>
@jferrant jferrant force-pushed the feat/signer-times-out-end-of-tenure-blocks branch from 6608fb6 to 67e254e Compare November 4, 2024 23:27
@jferrant jferrant changed the title Add config option last_block_reorg_prevention_timeout_ms and add a te… Add config option tenure_last_block_proposal_timeout_secs and do not allow reorgs at tenure boundary before it is exceeded Nov 4, 2024
obycode
obycode previously approved these changes Nov 5, 2024
Copy link
Contributor

@obycode obycode left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

jcnelson
jcnelson previously approved these changes Nov 5, 2024
Copy link
Member

@jcnelson jcnelson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Approving for now, but I'm not sure this is guaranteed to prevent reorgs at tenure boundaries? It could be that some singers' slow networks cause them to:

  • Accept and broadcast a signature on the last block of tenure N B[N][-1]
  • Wait for the allotted timeout to pass, but never see B[N][-1] get broadcast and processed
  • Accept and broadcast signatures for the first block of tenure N+1, B[N+1][0], which conflicts with B[N][-1] (i.e. has the same height).

Am I understanding this right? Is this okay?

@jferrant
Copy link
Collaborator Author

jferrant commented Nov 5, 2024

Approving for now, but I'm not sure this is guaranteed to prevent reorgs at tenure boundaries? It could be that some singers' slow networks cause them to:

* Accept and broadcast a signature on the last block of tenure N `B[N][-1]`

* Wait for the allotted timeout to pass, but never see `B[N][-1]` get broadcast and processed

* Accept and broadcast signatures for the first block of tenure N+1, `B[N+1][0]`, which conflicts with `B[N][-1]` (i.e. has the same height).

Am I understanding this right? Is this okay?

Yep, your understanding is correct! it is not meant to prevent reorgs at the boundary. just to minimize them. Unfortunately, I do not think there is a way to prevent it without causing a chainstall :( We previously agreed that this was okay (see this comment)

This chain stall case happens in the following scenario:

Tenure A starts

  • ...some blocks happen...
  • block N proposed
  • x signers receive this proposal and approve it (70% > x > 30%)

Tenure B starts

  • y signers receive bock N and reject it (but due to non participating signers, y < 30%)
  • block N' proposed
  • x (> 30%) signers reject the proposal because it is a sibling of N which they have already approved

Chain Halts

  • No valid block can be built because majority signers (x) will always reject it as a sibling

(@kantai @obycode am I remembering the chainstall case correctly?)

Our response to this chainstall due to poorly timed block commits case was to enable signers to reorg the last block of the prior tenure. However, this caused legit globally accepted blocks due to network latency issues to be reorged when they really shouldn't have.

Maybe this default should be much higher? However, in the actual chainstall case, the chain will halt for up to that config amount so prob shouldn't set it too much higher.

@jferrant jferrant dismissed stale reviews from jcnelson and obycode via c80fa54 November 5, 2024 23:33
@@ -4501,7 +4501,7 @@ fn locally_rejected_blocks_overriden_by_global_acceptance() {

#[test]
#[ignore]
/// Test that signers that have accept a locally signed block N+1 built in tenure A can sign a block proposed during a
/// Test that signers that have accepedt a locally signed block N+1 built in tenure A can sign a block proposed during a
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Almost got it 😆

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants