Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

htlcswitch: only pipeline settle if add is locked-in #6246

Merged
merged 2 commits into from
Mar 3, 2022

Conversation

Crypt-iQ
Copy link
Collaborator

@Crypt-iQ Crypt-iQ commented Feb 9, 2022

The counter-party shouldn't be doing this anyways as they would be giving away a preimage for free. Them doing this would bork their own channel due to open circuits not getting trimmed on startup. Removing this faulty behavior also makes it easier to reason about the circuit logic.

@Crypt-iQ Crypt-iQ added safety General label for issues/PRs related to the safety of using the software htlcswitch labels Feb 9, 2022
@Crypt-iQ Crypt-iQ added this to the v0.15.0 milestone Feb 9, 2022
@Roasbeef Roasbeef added the P1 MUST be fixed or reviewed label Feb 14, 2022
Copy link
Collaborator

@carlaKC carlaKC left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice fix, just some comments on the test.

Comment on lines 6744 to 6758
alice.coreLink.cfg.OnChannelFailure = func(_ lnwire.ChannelID,
_ lnwire.ShortChannelID, linkErr LinkFailureError) {
linkErrors <- linkErr
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need to re-set this?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes since restart will re-create a coreLink and use a new OnChannelFailure

case <-linkErrors:
t.Fatal("should not have received a link error")
case <-time.After(5 * time.Second):
// success
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We're always going to have to wait 5 seconds here, could we update OnChannelFailure above to fail the test if it is called?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry I don't understand the suggestion. Do you mean instead of waiting 5 seconds or in addition to waiting? We could update it to fail, but as-is when it's called it does fail

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think she means that rather than always wait 5 seconds, is it possible to instead use tighter synchronization (so fail the test if we get some error).

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good suggestion - i was able to achieve this by checking if ForwardPackets was called instead of waiting 5 seconds. There is still a five second in one case which I couldn't think of how to get around, but it does not lengthen the test in the happy path

@Roasbeef Roasbeef requested review from Roasbeef and removed request for bhandras February 25, 2022 00:05
Copy link
Member

@Roasbeef Roasbeef left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM 🦋

case <-linkErrors:
t.Fatal("should not have received a link error")
case <-time.After(5 * time.Second):
// success
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think she means that rather than always wait 5 seconds, is it possible to instead use tighter synchronization (so fail the test if we get some error).

@@ -6659,6 +6659,164 @@ func TestShutdownIfChannelClean(t *testing.T) {
}
}

// TestPipelineSettle tests that a link should only pipeline a settle if the
// related add is fully locked-in meaning it is on both sides' commitment txns.
func TestPipelineSettle(t *testing.T) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Excellent test case reproduction!

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would be cool to eventually translate these into some flat file format which can be used on the spec level to test implementations. IIRC we had a very early version of something like this, but it was never fully maintained.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes i think the spec could benefit from an exhaustive list of state transition diagrams (add, settle, fee, closing signed etc) that show valid and invalid sequences


if !lockedin {
l.fail(
LinkFailureError{code: ErrInvalidUpdate},
Copy link
Collaborator Author

@Crypt-iQ Crypt-iQ Feb 25, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Bolt 2 language regarding update_fulfill_htlc: https://github.com/lightning/bolts/blob/master/02-peer-protocol.md#requirements-10

  • bullet point 3 says that the sending node MUST NOT send update_fulfill_htlc until the add is locked-in. The language of MUST NOT is confusing, does this mean the recipient has to force close? IMO following the spec here to the letter may be a little superfluous as the sender shouldn't be sending this in the first place, and the receiver did nothing wrong. So I believe it's fine to not force close and instead shut down the link.
  • Under the receiving node requirements "If the id does not correspond to an HTLC in its current commitment transaction:...MUST send a warning and close the connection, or send an error and fail the channel."
    • Again, the sender should not be doing this. We don't close the channel or send a warning, but we do shut down the link and send an error. From lnd's PoV errors are benign, but I believe that other implementations force close upon receiving the error.
  • If other implementations would force close anyways due to our error, does it make sense to just make this l.fail a force close?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So I believe it's fine to not force close and instead shut down the link.

Given that if a commitment sig isn't sent after/during this early fulfill, IMO just taking down the link lets things resume vs the hard route of needing to eat those chain fees.

The counter-party shouldn't be doing this anyways as they would be
giving away a preimage for free. Them doing this would bork their
own channel due to open circuits not getting trimmed on startup.
Removing this faulty behavior also makes it easier to reason about
the circuit logic.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
htlcswitch P1 MUST be fixed or reviewed safety General label for issues/PRs related to the safety of using the software
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants