-
Notifications
You must be signed in to change notification settings - Fork 902
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
HTLC upstream close debugging additions (and fix!) #5130
HTLC upstream close debugging additions (and fix!) #5130
Conversation
Signed-off-by: Rusty Russell <[email protected]>
c321c8a
to
941b2bf
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems this is indeed surfacing an issue here:
2022-03-28T01:07:06.1316726Z lightningd-3: 2022-03-28T00:17:27.274Z **BROKEN** 022d223620a359a47ff7f7ac447c85c46c923da53389221a0054c11c1e3ca31d59-chan#1: Internal error ONCHAIN: onchaind_missing_htlc_output: htlc 4 is not local!
8641a5b
to
996c31d
Compare
… test. Signed-off-by: Rusty Russell <[email protected]>
1. Don't use dust HTLCs. 2. Make l3 unresponsive, like report. 3. Make l2-l3 fail because we time out on successive HTLC. We use sendpay rather than pay, because pay can do multiple attempts. Signed-off-by: Rusty Russell <[email protected]>
This doesn't actually fix anything, but may shed more clues if it happens again. The broken() logs are overzealous, see next patch. Signed-off-by: Rusty Russell <[email protected]>
These trip when anything weird happens; turns out that we tell onchaind about old htlcs (e.g. for penalties), so in that case we can actually have it tell us about missing HTLCs which we no longer have in memory. Signed-off-by: Rusty Russell <[email protected]>
Signed-off-by: Rusty Russell <[email protected]>
996c31d
to
0a11930
Compare
And, indeed, this lead me to the smoking gun! Phew.... |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ACK 0a11930
Seems there is just a memleak that needs to be cleaned up, otherwise gtg :-) |
We get sent three corresponding arrays: 1. htlc stubs 2. whether we want to know if they're missing, 3. whether to wait 3 blocks or tell us immediately We then sorted the htlc stubs by CLTV, *but didn't sort the corresponding arrays*. This fixes that the simplest way possible, and probably also: Fixes: ElementsProject#4649 Signed-off-by: Rusty Russell <[email protected]> Changelog-Fixed: onchaind: we sometimes failed to close upstream htlcs if more than one HTLC is in flight during unilateral close.
0a11930
to
52521b0
Compare
Gah, allocated temporary off ctx not tmpctx. Trivial fix... Ack 52521b0 |
Firstly, I added an explicit test to try to mirror #4649
Then I added more debugging and checks if it happens again.