-
Notifications
You must be signed in to change notification settings - Fork 2.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add retransmission subsystem in channel link #231
Add retransmission subsystem in channel link #231
Conversation
Coverage decreased (-0.7%) to 73.298% when pulling 49aa23a263861a1a9c116ecdbac36406fd9284cf on AndrewSamokhvalov:retransmission_in_channellink into 2e6800e on lightningnetwork:master. |
0584e7e
to
e2eaf0c
Compare
Coverage decreased (-0.5%) to 73.52% when pulling e2eaf0cddaa4d720d7443941f54cb43d7bc3b937 on AndrewSamokhvalov:retransmission_in_channellink into 2e6800e on lightningnetwork:master. |
122bd8f
to
ade9065
Compare
Coverage decreased (-0.5%) to 73.528% when pulling ade906540d38e3d4bbd360d47db0b2c8ed7819c5 on AndrewSamokhvalov:retransmission_in_channellink into 2e6800e on lightningnetwork:master. |
…dust Note that this commit is temporary, and should be reverted once #231 is merged. The reason we need to do this for now, is that we don’t properly track the exact state of the remote party’s commitment. In this test case, the resulting HTLC’s added are dust to one party, but non-dust to another. So upon restart, the states (balance wise) has diverged.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for this PR! Looking forward to finally getting in proper channel message retransmission into our next major release.
The set of changes in this diff are very clean, however they're incomplete. You'll find most of the details within the body of the review, but at a high level, the following is missing:
- We need to track the delta between the prior state, and the state we created for the remote party. This is required as we must retransmit all
update_*
messages, not only just the HTLC additions. - We also need to store the exact state of the current commitment chain tail for the remote party. This is required as due to the possibility of asymmetric dust thresholds for both parties, upon restart, it's possible for our commitment states to have diverged.
@@ -48,23 +46,92 @@ func messageToString(msg lnwire.Message) string { | |||
return spew.Sdump(msg) | |||
} | |||
|
|||
// expectedMessage struct hols the message which travels from one peer to |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hols -> holds
htlcswitch/link_test.go
Outdated
// Skip logging of extend revocation window messages. | ||
switch m := m.(type) { | ||
case *lnwire.RevokeAndAck: | ||
var zeroHash chainhash.Hash | ||
if bytes.Equal(zeroHash[:], m.Revocation[:]) { | ||
return | ||
return false, nil |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The above fragment can now be removed as of the latest master, as there is no longer a revocation window to be extended.
htlcswitch/link_test.go
Outdated
// createLogFunc is a helper function which returns the function which will be | ||
// used for logging message are received from another peer. | ||
func createLogFunc(name string, channelID lnwire.ChannelID) messageInterceptor { | ||
return func(m lnwire.Message) { | ||
if getChanID(m) == channelID { | ||
return func(m lnwire.Message) (bool, error) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What's the significance of the first parameter? The godoc
comment on this function should be modified to detail it.
} | ||
|
||
expectedMessage := expectToReceive[0] | ||
expectToReceive = expectToReceive[1:] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The element at expectToReceive[0]
should be set to nil, otherwise we risk a memory leak.
|
||
// Record is used to set the function which will be triggered when new | ||
// lnwire message was received. | ||
func (s *mockServer) record(f messageInterceptor) { | ||
s.recordFuncs = append(s.recordFuncs, f) | ||
func (s *mockServer) intersect(f messageInterceptor) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The godoc
comment above wasn't updated.
Also I think you meant to call this function intercept
instead?
htlcswitch/link.go
Outdated
logCommitTimer: time.NewTimer(300 * time.Millisecond), | ||
overflowQueue: newWaitingQueue(), | ||
quit: make(chan struct{}), | ||
cfg: cfg, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you reference #194 in the commit body? This commit resolves the issue encountered, thanks!
htlcswitch/switch.go
Outdated
// | ||
// NOTE: This MUST be run as a goroutine. | ||
func (s *Switch) htlcForwarder() { | ||
defer s.wg.Done() | ||
func (s *Switch) handleControl() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What deadlock? Can you add some detail to the commit's message explaining the scenario that caused you to make this change? Thanks!
htlcswitch/switch.go
Outdated
// down active onion routed payments. Each active channel is modeled as | ||
// networked device with metadata such as the available payment bandwidth, and | ||
// total link capacity. | ||
// handleControl... |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Missing a godoc
comment here.
htlcswitch/switch.go
Outdated
case req := <-s.linkControl: | ||
switch cmd := req.(type) { | ||
case *addLinkCmd: | ||
cmd.err <- s.addLink(cmd.link) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems that we'll need to introduce mutexes internally for this new set of changes to be concurrent safe.
@@ -751,6 +751,7 @@ func (s *Switch) addLink(link ChannelLink) error { | |||
s.interfaceIndex[peerPub][link] = struct{}{} | |||
|
|||
if err := link.Start(); err != nil { | |||
s.removeLink(link.ChanID()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice catch!
In this commit the reestablish message have been added, which serves as channel state synchronization message. Before exchanging the messages for particular channel peers have to send it to each other as the first message in order to be sure that non of the updates have been lost because of the previous disconnect.
Add message interceptor which checks the order and may skip the messages which were denoted to be skipeed.
This commit where added as a measure to avoid the panic during several server simultanoius fault. The panic happened becuase *t.Testing structure is not concurrent safe.
Add js-like future object which might be used to wait for the response to be received or return the error otherwise.
In order to be able to properly restart switch several times we should have the sequential process of channel link stop. In other words if we stopped the switch we should be sure that all channel links have been stopped too. Addition of the goroutine during the force close was added because of the deadlock: Trace: 1. link:force_close_notification 2. link:wipe_channel 3. peer:switch_remove_link 4. switch:stop_link 5. link:wait <-- deadlock
In this commit BOLT№2 retranmission logic for the channel link have been added. Now if channel link have been initialised with the 'SyncState' field than it will send the lnwire.ChannelReestablish message and will be waiting for receiving the same message from remote side. Exchange of this message allow both sides understand which updates they should exchange with each other in order sync their states.
After addition of the channel reestablish message exchange we couldn't use the Bandwidth() function, at least in the test framework.
After addition of the retransmission logic in the channel link, we should make the onion blobs persistant, the proper way to do this is include the onion blobs in the payment descriptor rather than storing them in the distinct struct in the channel link.
In this commit htlc channeldb representation have been augmented with onion blob field, and (de)serialisaion functions have been changed to make the onion blob persistant.
ade9065
to
ab9d847
Compare
Closing at this has been taken over and fully implemented. |
Laolu
about counter persistence, and add it.generateRevocation
function and make sure that we are passing right indexes.