failing link: unable to update commitment: cannot add duplicate keystone with error: internal error #6485
Comments
Any reason this node is running the production release of this version? I wonder if this was actually an inadvertent HTLC replay attempt.... |
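For readers unfamiliar with the error text, here is a toy model of how a replayed HTLC could produce a "duplicate keystone" style failure. It is only a sketch: `ToyCircuitMap`, `open_circuit`, and the key tuples are hypothetical names for illustration and are not lnd's actual data structures or API. The assumption is simply that each outgoing (channel, HTLC ID) slot may be bound to an incoming HTLC once.

```python
class DuplicateKeystoneError(Exception):
    """Raised when an outgoing HTLC slot is bound a second time (toy model)."""


class ToyCircuitMap:
    """Hypothetical, simplified stand-in for a forwarding circuit map."""

    def __init__(self):
        # Maps outgoing (channel, htlc_id) -> incoming (channel, htlc_id).
        self._by_outgoing = {}

    def open_circuit(self, incoming_key, outgoing_key):
        # A second attempt to bind the same outgoing slot is rejected,
        # which is what a replayed add for an already-forwarded HTLC would hit.
        if outgoing_key in self._by_outgoing:
            raise DuplicateKeystoneError("cannot add duplicate keystone")
        self._by_outgoing[outgoing_key] = incoming_key


if __name__ == "__main__":
    circuits = ToyCircuitMap()
    circuits.open_circuit(("chanA", 12), ("chanB", 4620))
    try:
        # Replaying the same forward after a reconnect (assumed scenario).
        circuits.open_circuit(("chanA", 12), ("chanB", 4620))
    except DuplicateKeystoneError as err:
        print("link would fail:", err)
```

In this model the first bind succeeds and the replay raises, which matches the shape of the reported error but says nothing about which side caused the replay.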
I checked the commits following rc1 and didn't see anything that interested me, so I didn't update (I might be wrong, of course, updating as we speak). What do you mean by "production release"? I'm running my own version with a few patches (logging, non-random channel picking for parallel channels). |
This could happen if c-lightning was sending settles earlier than expected, but I don't think they're doing that: #6246 Do you have more in-depth logs you can share leading up to this point? Are PEER debug logs off? |
I have debuglevel=info,CNCT=debug,DISC=warn,HSWC=debug,NANN=debug,WTCL=warn. What exactly do you need? The logs above show everything I have for the CHANPOINT (EDIT: In that timeframe). |
Did your logs rotate or something? The channel has a history of 4620 HTLCs, so hopefully there are more logs from before the reestablish message was received |
I have lots of logs, no worries. Just tell me what you need, I won't send over gigabytes of logs, though :) |
I think you should have at least one log line that is identical to the one pasted above (HTLC ID=4620 is what matters):
If you could send ~1min before the earliest identical log line until the point pasted in the OP, that would help narrow things down. I don't need the channel ids/payment hashes/ips/pubkeys etc, HSWC logs only. |
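The request above (HSWC lines only, from about a minute before the earliest matching line up to the failure, with identifiers removed) can be sketched as a small filter. This is a minimal sketch under assumptions: it assumes each log line starts with a `YYYY-MM-DD HH:MM:SS.mmm` timestamp and tags the subsystem as ` HSWC: `; the function names, the `needle` and `end_marker` parameters, and the redaction patterns are hypothetical and should be adjusted to the real log format.

```python
import re
from datetime import datetime, timedelta

TS_FORMAT = "%Y-%m-%d %H:%M:%S.%f"
TS_LEN = 23  # length of "2022-05-03 01:26:09.123" (assumed timestamp prefix)
HEX_RE = re.compile(r"\b[0-9a-fA-F]{64,66}\b")   # txids, hashes, pubkeys
IPV4_RE = re.compile(r"\b\d{1,3}(?:\.\d{1,3}){3}\b")


def parse_ts(line):
    """Return the leading timestamp of a log line, or None if absent."""
    try:
        return datetime.strptime(line[:TS_LEN], TS_FORMAT)
    except ValueError:
        return None


def redact(line):
    """Replace long hex strings and IPv4 addresses with placeholders."""
    return IPV4_RE.sub("<ip>", HEX_RE.sub("<hex>", line))


def extract_window(lines, needle, end_marker, lead_seconds=60):
    """Return redacted HSWC lines from lead_seconds before the first line
    containing needle up to the first later line containing end_marker."""
    hswc = [(parse_ts(l), l) for l in lines if " HSWC: " in l]
    hswc = [(ts, l) for ts, l in hswc if ts is not None]
    first = next((ts for ts, l in hswc if needle in l), None)
    if first is None:
        return []
    start = first - timedelta(seconds=lead_seconds)
    out = []
    for ts, l in hswc:
        if ts < start:
            continue
        out.append(redact(l.rstrip("\n")))
        if ts >= first and end_marker in l:
            break
    return out
```

For example, `extract_window(open("lnd.log"), "HTLC ID=4620", "failing link")` would yield the lines to paste, where the file name is a placeholder. Review the output by hand before sharing, since the redaction patterns are deliberately simple.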
Yes.
|
In between those (22:12:08 and 01:26:09) I don't have any HSWC log message for that peer/channel. |
Do you see this and if so, what time? I might be missing a space or something - this will be under OTHER_CHANID's chanpoint
|
|
Ok, I might have an idea of what's going on. If I'm right this is an lnd problem, not a c-l problem |
Noted a related thread in #6482 where in some cases we may not be properly cancelling inbound HTLCs if we attempt to send a commitment but the remote peer never replies. This is a bit trickier since we've technically already sent out that valid commitment, so we need to be playing that HTLC (may lead to a force close since we want to be able to safely time out that incoming HTLC). |
We are cancelling back HTLCs properly in #6482 |
Looked into it more, and #4183 did not introduce this, but I think it made it harder to recover from when peered with another lnd node. |
I have seen 4 or 5 of these force closures in the last few days! All of them with CLN nodes. Latest one: Please let me know if I can offer any additional info that may help. |
I have had about 20 of these over the last couple of days. This is a really pressing issue. |
Worth noting that we'll send an error there, but won't actually attempt to force close the channel, to see if a reconnection fixes things. Some CL nodes force close on any We have a fix here (for the issue that leads to us failing to process a certain HTLC action): #6518. |
AFAICT, something might have been introduced on the CL end inadvertently during their recent re-write of some peer/channel handling code. Also is everyone here running CL on this point release that fixed a force close issue? |
Is there any relation between the 'Internal Error' and 'Unable to Read Message' errors? Have been seeing both of these regularly over the past few days with my CLN peers. |
Unable to read message happens when the other peer "forcibly" hangs up the connection, that might have been in response to an |
I run the latest CLN (0.11.1) - my node has force closed multiple channels when my peers told me they had an According to The receiving node:
So the CLN behaviour is up to specs AFAICT. |
Ok, good to know. Just trying to pinpoint things here, as I know there was a p2p overhaul for them and we haven't had any significant changes in this area lately. We're reviewing and testing #6518 to see if it addresses the issue. It also seems related to another reported issue, where us writing to a zombie TCP socket (other side no longer listening on it) would eventually cause force closures. |
Is there any "Keystone" error in your log? I doubt they are the same issue |
This one?
|
Did it happen with the same peers for which you have the "Unable to Read Message" error? |
Background
The error message "failing link: unable to update commitment: cannot add duplicate keystone with error: internal error" appeared in my logs, for reasons I don't understand. Afterwards the channel was unusable and my peer (running CLN) immediately force-closed the channel at 01:26.
The force-close transaction contains one outgoing HTLC (timeout, according to lncli closedchannels) with size 200003 sat. My peer says:
Relevant snippet from my logs:
Your environment
Steps to reproduce
Have a non-anchor channel with a CLN peer behind Tor. Have a somewhat flaky connection. Send HTLCs to the peer.