Receiving 2f+1 Prevotes before Proposal might affect termination #1745

milosevic · 2018-06-14T11:02:58Z

Assume that we are in initial round with a correct proposer (or with a proposer who has locked value in the highest round). Furthermore, assume that we are in GST period and that timeouts are large enough so communication among correct processes is reliable and timely. If a correct process p receives 2f+1 prevotes (any set of prevotes) before receiving Proposal he will prevote for nil. If faulty processes also prevote for nil (or stay silent), no correct process will precommit v, and we will not terminate in this round. This is not good as this is scenario in which we should always ensure termination.

Similar reasoning applies also for Precommit messages. If a correct process receives 2f+1 Precommit messages, before Polka and Proposal, he will Precommit nil, and again, we will not be able to terminate.

ebuchman · 2018-06-15T18:50:07Z

If a correct process p receives 2f+1 prevotes (any set of prevotes) before receiving Proposal he will prevote for nil. If faulty processes also prevote for nil (or stay silent), no correct process will precommit v,

How does this follow ? If 2f+1 prevote for B, then surely some correct process might see that and precommit for B.

milosevic · 2018-06-15T19:03:27Z

There are no 2f+1 prevotes for B as faulty processes prevote nil.

ebuchman · 2018-06-16T00:35:46Z

Right. From discussion, what we need is the following:

on receive +2/3-any prevotes while in propose step, do nothing. this is so we wait out the rest of timeout propose and then we can prevote. if there's still only +2/3-any, we start timeout prevote then.

Note we want a similar mechanism on +2/3-any while in propose step, but it's more involved - we'd have to detect it, when we go to prevote, that actually we already received +2/3-any and can go to precommit.

jaekwon · 2018-06-25T03:06:51Z

I don't follow.

If a correct process p receives 2f+1 prevotes (any set of prevotes) before receiving Proposal he will prevote for nil. If faulty processes also prevote for nil (or stay silent), no correct process will precommit v, and we will not terminate in this round.

If at most f are byzantine, with 3f+1 total voting power, even with all faulty processes prevoting nil or staying silent, 2f+1 will receive the proposal assuming timeouts are sufficiently long.

ebuchman · 2018-06-25T14:16:25Z

2f+1 will receive the proposal assuming timeouts are sufficiently long

Maybe in a later round, but the idea here is to try and terminate as soon as we have some synchronous communication available, rather than wasting time with more rounds.

@milosevic is that right ? Or is this something that you think could be abused by an adversary to delay termination for ever?

milosevic · 2018-06-25T14:57:05Z

2f+1 correct might not Prevote for B even if timeouts are sufficiently long in the following case:
f+1 correct receives Proposal and Prevote B, f faulty prevote Nil, the other f correct processes receives 2f+1 Prevote messages before receiving Proposal, so they Prevote Nil (in the current implementation upon receiving any set of 2f+1 Prevote messages, we enter Prevote step, and as we haven't received Proposal yet, we prevote Nil). They will receive Proposal later, but as they already Prevote Nil, the correct processes will not decide in this round. This can repeat forever, so faulty processes can prevent termination forever if they manage to reach some correct process before it receives Proposal. We have addressed this issue in the latest version of spec by preventing processes to process Prevote messages before receiving Proposal.

jaekwon · 2018-06-26T01:13:56Z

Yeah, got it.

So, to deal with byzantine-prevotes from progressing prevotes prematurely, we need to wait for the proposal or block (e.g. no forwarding).

Currently, for byzantine-precommits, we wait the precommit_wait duration for stragglers in the case of +2/3 any precommits. But the same attack works in the precommit step, so we should to delay the precommit vote until after prevote_wait.

e.g.

see any +2/3 prevotes (but not a polka) -> goto propose (if not already) and set prevote_wait timer.
see any +2/3 precommits (but not a commit) -> goto prevote (if not already), set prevote_wait timer, AND precommit_wait timer sequentially (close to each other). The prevote_wait timeout will expire first, and get us to precommit. We will then either see a commit, or, precommit_wait will expire immediate after, getting us to the next round.

milosevic · 2018-06-26T10:37:22Z

Yes the same attack applies for precommits. But we can deal witt it in a slightly different way for precommits. In the latest version of the spec, we trigger precommit_wait timer unconditionally upon receiving any 2f+1 precommits. But we don't precomit nil in this case. If Precommit_wait timer expires and we are still in that round, we start next round. If in the meantine we commit we will start next height. It is very important to trigger precommit_wait timer and whole this logic unconditionally upon receiving any set of 2f+1 precommits due to round skipping. This logic let us quickly move to a higher round if a process lag behind, where we pay only precommit_wait timeout in some cases. Otherwise, round synchronisation could be very, very slow (if for example we would have step condition for precommit_wait like we have for prevotes).
This is a bit harder to explain here, but it is supported by the proofs in the spec.

It turns out that triggering prevote_wait only after process prevoted (received proposal or proposal_timeout expires), and precommit_wait unconditionally (upon receiving 2f+1 precommits) can guarantee termination (and fast round skipping) with proper timeout values (timeoutPropose must be greated than timeoutPrecommit). There is some math behind this claim in the spec.

jaekwon · 2018-06-27T23:12:12Z

But we don't precomit nil in this case

I think we can also precommit nil right before we go to the next round. Code snippet below.

I think we want the nil precommit because the validator is still signaling that they're online. Even if our protocol says not to precommit nil, validators will modify the logic because it makes it less likely for them to get unbonded due to network issues.

jaekwon · 2018-06-27T23:14:24Z

Here's a suggested consensus state machine change...

When we have +2/3 prevote any (but not a polka): (enterNewRound is implicit here)

tendermint/consensus/state.go

Lines 1592 to 1593 in f5d0b88

    
           cs.enterPropose(height, vote.Round) // we can't prevote until we wait for the proposal. 
        
           cs.enterPrevoteWait(height, vote.Round)

When we have +2/3 precommit any (but not a commit):

tendermint/consensus/state.go

Lines 1625 to 1628 in b51ed13

    
           cs.enterNewRound(height, vote.Round) 
        
           cs.enterPropose(height, vote.Round) 
        
           cs.enterPrevoteWait(height, vote.Round) 
        
           cs.enterPrecommitWait(height, vote.Round)

It's still not safe to prevote in the on-2/3-any-precommit snippet above, but we enterPropose in case so that we do prevote. If we're locked on a block we should probably prevote that as well.

The snippet sets prevote_wait and precommit_wait to expire sequentially at the same time, because we would precommit nil after prevote_wait is done, enter precommit, then enter finalizeCommit due to the precommit_wait, and immediately go to the next height.

ticktackim · 2018-08-23T07:40:08Z

@jaekwon @ebuchman @milosevic Hey guys, just chew the Tendermint thesis launched on Cornell site. It's pretty awesome for your work to achieve scalability based on PBFT and SBFT. But I have a question about the security. Since you mentioned in the paper the system parameters, the value of GST & timeout Delta are assumed to be unknown for the safety of algorithm. Do you think there will be an extreme circumstance that these parameter values are being acknowledged by faulty processes, and these Byzantine nodes will just remain silence and wait for a threshold number 2f, then proceed prevote/precommit in conspiracy in the next (r+1) round to take over the system?

jaekwon · 2018-09-01T02:19:55Z

@ticktackim GST/Delta are assumed unknown but some finite fixed number for all non-byzantine actors. So as long as < 1/3 Byzantine, the Byzantine actors can do whatever they want including remaining silent for some amount of time greater than Delta. If there exist >= 1/3 Byzantine, then liveness isn't guaranteed. One of the attack scenarios would involve >= 1/3 Byzantine actors delaying their votes. But they wouldn't be able to compromise safety without ultimately becoming punisheable, assuming that we require sharing justification (see the tendermint wiki BFT algo for "Proof of Accountability").

ebuchman · 2018-10-12T20:16:45Z

Addressed in #2540

See some minor follow up in #2625

…int#1730) (tendermint#1745) * fix: increase abci socket message size limit to 2GB (tendermint#1730) * fix: increase abci socket message size to 2GB * fix: added .changelog * Update .changelog/unreleased/improvements/1730-increase-abci-socket-message-size-limit Co-authored-by: Anton Kaliaev <[email protected]> * fix: use MaxInt32 as message size for 32-bit systems * Update .changelog/unreleased/improvements/1730-increase-abci-socket-message-size-limit --------- Co-authored-by: Anton Kaliaev <[email protected]> Co-authored-by: Thane Thomson <[email protected]> (cherry picked from commit 092b918) * Rename 1730-increase-abci-socket-message-size-limit to 1730-increase-abci-socket-message-size-limit.md --------- Co-authored-by: Troy Kessler <[email protected]> Co-authored-by: Thane Thomson <[email protected]>

milosevic added the C:consensus Component: Consensus label Jun 14, 2018

milosevic assigned ebuchman, jaekwon and xla Jun 14, 2018

ebuchman added T:bug Type Bug (Confirmed) prelaunch labels Jun 28, 2018

ebuchman added this to the launch milestone Jun 28, 2018

xla removed the prelaunch label Jul 18, 2018

milosevic mentioned this issue Aug 22, 2018

The consensus took 65 rounds #2251

Closed

jaekwon mentioned this issue Sep 1, 2018

Do we break validBlock property if a process can leave round upon receiving 2f+1 Precommit nil messages? #1690

Closed

ebuchman modified the milestones: v1.0, launch Sep 21, 2018

milosevic assigned milosevic and unassigned xla, jaekwon and ebuchman Sep 25, 2018

milosevic mentioned this issue Sep 26, 2018

consensus: Wait timeout precommit before starting new round #2493

Merged

4 tasks

goolAdapter mentioned this issue Sep 28, 2018

Some performance benchmark viewpoint #2504

Closed

milosevic mentioned this issue Oct 4, 2018

consensus: Wait for proposal or timeout before prevote #2540

Merged

goolAdapter mentioned this issue Oct 9, 2018

defaultSetProposal will set cs.ProposalBlockParts, this may lead to interrupt the block receiving process of PoLC's BlockID #2567

Closed

ebuchman mentioned this issue Oct 12, 2018

#2540 - +2/3-precommit-any liveness follow up #2625

Closed

ebuchman closed this as completed Oct 12, 2018

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Receiving 2f+1 Prevotes before Proposal might affect termination #1745

Receiving 2f+1 Prevotes before Proposal might affect termination #1745

milosevic commented Jun 14, 2018

ebuchman commented Jun 15, 2018

milosevic commented Jun 15, 2018

ebuchman commented Jun 16, 2018

jaekwon commented Jun 25, 2018

ebuchman commented Jun 25, 2018 •

edited

Loading

milosevic commented Jun 25, 2018 •

edited

Loading

jaekwon commented Jun 26, 2018 •

edited

Loading

milosevic commented Jun 26, 2018

jaekwon commented Jun 27, 2018 •

edited

Loading

jaekwon commented Jun 27, 2018 •

edited

Loading

ticktackim commented Aug 23, 2018

jaekwon commented Sep 1, 2018

ebuchman commented Oct 12, 2018

Receiving 2f+1 Prevotes before Proposal might affect termination #1745

Receiving 2f+1 Prevotes before Proposal might affect termination #1745

Comments

milosevic commented Jun 14, 2018

ebuchman commented Jun 15, 2018

milosevic commented Jun 15, 2018

ebuchman commented Jun 16, 2018

jaekwon commented Jun 25, 2018

ebuchman commented Jun 25, 2018 • edited Loading

milosevic commented Jun 25, 2018 • edited Loading

jaekwon commented Jun 26, 2018 • edited Loading

milosevic commented Jun 26, 2018

jaekwon commented Jun 27, 2018 • edited Loading

jaekwon commented Jun 27, 2018 • edited Loading

ticktackim commented Aug 23, 2018

jaekwon commented Sep 1, 2018

ebuchman commented Oct 12, 2018

ebuchman commented Jun 25, 2018 •

edited

Loading

milosevic commented Jun 25, 2018 •

edited

Loading

jaekwon commented Jun 26, 2018 •

edited

Loading

jaekwon commented Jun 27, 2018 •

edited

Loading

jaekwon commented Jun 27, 2018 •

edited

Loading