Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix(dcutr): always check for relayed connection first #3982

Merged
merged 11 commits into from
May 31, 2023

Conversation

arpankapoor
Copy link
Contributor

Description

For a non-relay connection, check if it corresponds to a direct connection upgrade.

As discussed in #3964.

Notes & open questions

Change checklist

  • I have performed a self-review of my own code
  • I have made corresponding changes to the documentation
  • I have added tests that prove my fix is effective or that my feature works
  • A changelog entry has been made in the appropriate crates

@kpp
Copy link
Contributor

kpp commented May 24, 2023

Is it okay that sometimes QUIC client cannot connect to a server with DCUtR example?

BehaviourEvent: CircuitReqDenied { src_peer_id: PeerId("12D3KooWQYhTNQdmr3ArTeUHRYzFg94BKyTkoWBDWez9kSCVe2Xo"), dst_peer_id: PeerId("12D3KooWH3uVF6wv47WnArKHk5p6cvgCJEb74UTmxztmQDc298L3") }

Copy link
Contributor

@thomaseizinger thomaseizinger left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

protocols/dcutr/src/behaviour_impl.rs Outdated Show resolved Hide resolved
@thomaseizinger thomaseizinger changed the title fix(dcutr): Fix DCUtR NetworkBehavior's established connection handler fix(dcutr): always check for relayed connection first May 24, 2023
@arpankapoor
Copy link
Contributor Author

Is it okay that sometimes QUIC client cannot connect to a server with DCUtR example?

BehaviourEvent: CircuitReqDenied { src_peer_id: PeerId("12D3KooWQYhTNQdmr3ArTeUHRYzFg94BKyTkoWBDWez9kSCVe2Xo"), dst_peer_id: PeerId("12D3KooWH3uVF6wv47WnArKHk5p6cvgCJEb74UTmxztmQDc298L3") }

Do you suspect this issue to be due to this change?

Copy link
Member

@mxinden mxinden left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@arpankapoor what problem is this pull request solving? Why is it relevant to first check whether the connection is relayed. Isn't first checking self.outgoing_direct_connection_attempts and then is_relayed just the same?

protocols/dcutr/src/behaviour_impl.rs Outdated Show resolved Hide resolved
@arpankapoor
Copy link
Contributor Author

@arpankapoor what problem is this pull request solving? Why is it relevant to first check whether the connection is relayed. Isn't first checking self.outgoing_direct_connection_attempts and then is_relayed just the same?

The current version of the code returns a dummy handler for direct connections corresponding to a relayed connection. This is because outgoing_direct_connection_attempts has the relayed connection id in its key and not the direct connection id. Essentially, the Some(_) arm of the match will never be hit.

Related comment.

@thomaseizinger
Copy link
Contributor

@arpankapoor what problem is this pull request solving? Why is it relevant to first check whether the connection is relayed. Isn't first checking self.outgoing_direct_connection_attempts and then is_relayed just the same?

The current version of the code returns a dummy handler for direct connections corresponding to a relayed connection. This is because outgoing_direct_connection_attempts has the relayed connection id in its key and not the direct connection id. Essentially, the Some(_) arm of the match will never be hit.

Related comment.

I just dug into this a bit more and the reason our tests are passing this is because we only assert that the direct connection is established but we don't assert that we report the correct event (which is only reported by the direct::Handler).

Copy link
Contributor

@thomaseizinger thomaseizinger left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this can actually be simplified a lot further, please see comments!

protocols/dcutr/src/behaviour_impl.rs Outdated Show resolved Hide resolved
protocols/dcutr/src/behaviour_impl.rs Outdated Show resolved Hide resolved
protocols/dcutr/src/behaviour_impl.rs Outdated Show resolved Hide resolved
))); // TODO: We could make two `handler::relayed::Handler` here, one inbound one outbound.
}

if let Some(&relayed_connection_id) = self.direct_to_relayed_connections.get(&connection_id)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this can ever be hit actually. In hole-punching, we cannot for sure tell which inbound connection was the one that the remote achieved through hole punching.

Plus, all this state tracking here uses ConnectionIds that are created as part of dialing, i.e. outbound connections.

So unless I am missing something, this function can be reduced to always return a dummy::ConnectionHandler for non-relayed connections.

@mxinden Thoughts?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I agree with your assessment. I'll make this change after @mxinden's comments.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are right @thomaseizinger.

For what it is worth, likely my train of thought that ended up with the code above:

In NetworkBehaviour::on_connection_handler_event on OutboundConnectNegotiated we trigger a Dial with role_override() (i.e. as_listener) and then insert the maybe_direct_connection_id from the DialOpts into self.direct_to_relayed_connections.

Either::Left(handler::relayed::Event::OutboundConnectNegotiated { remote_addrs }) => {
let opts = DialOpts::peer_id(event_source)
.condition(dial_opts::PeerCondition::Always)
.addresses(remote_addrs)
.override_role()
.build();
let maybe_direct_connection_id = opts.connection_id();
self.direct_to_relayed_connections
.insert(maybe_direct_connection_id, relayed_connection_id);
*self
.outgoing_direct_connection_attempts
.entry((relayed_connection_id, event_source))
.or_default() += 1;
self.queued_events.push_back(ToSwarm::Dial { opts });
}

On success this would be reported to the Behaviour via NetworkBehaviour::handle_established_outbound_connection and not as assumed here in NetworkBehaviour::handle_established_inbound_connection.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about adding an assert here, that connection_id is not in self.direct_to_relayed_connections?

Copy link
Member

@mxinden mxinden left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall this looks good to me. Thank you for the work here. I find this domain very tricky.

))); // TODO: We could make two `handler::relayed::Handler` here, one inbound one outbound.
}

if let Some(&relayed_connection_id) = self.direct_to_relayed_connections.get(&connection_id)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are right @thomaseizinger.

For what it is worth, likely my train of thought that ended up with the code above:

In NetworkBehaviour::on_connection_handler_event on OutboundConnectNegotiated we trigger a Dial with role_override() (i.e. as_listener) and then insert the maybe_direct_connection_id from the DialOpts into self.direct_to_relayed_connections.

Either::Left(handler::relayed::Event::OutboundConnectNegotiated { remote_addrs }) => {
let opts = DialOpts::peer_id(event_source)
.condition(dial_opts::PeerCondition::Always)
.addresses(remote_addrs)
.override_role()
.build();
let maybe_direct_connection_id = opts.connection_id();
self.direct_to_relayed_connections
.insert(maybe_direct_connection_id, relayed_connection_id);
*self
.outgoing_direct_connection_attempts
.entry((relayed_connection_id, event_source))
.or_default() += 1;
self.queued_events.push_back(ToSwarm::Dial { opts });
}

On success this would be reported to the Behaviour via NetworkBehaviour::handle_established_outbound_connection and not as assumed here in NetworkBehaviour::handle_established_inbound_connection.

))); // TODO: We could make two `handler::relayed::Handler` here, one inbound one outbound.
}

if let Some(&relayed_connection_id) = self.direct_to_relayed_connections.get(&connection_id)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about adding an assert here, that connection_id is not in self.direct_to_relayed_connections?

protocols/dcutr/src/behaviour_impl.rs Outdated Show resolved Hide resolved
protocols/dcutr/src/behaviour_impl.rs Show resolved Hide resolved
@arpankapoor
Copy link
Contributor Author

Overall this looks good to me. Thank you for the work here. I find this domain very tricky.

I've applied the changes you suggested. Please let me know if you'd like anything else updated/changed.

Copy link
Member

@mxinden mxinden left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good to me. Thanks @arpankapoor for the work.

Will give @thomaseizinger a chance to review as well.

protocols/dcutr/src/behaviour_impl.rs Show resolved Hide resolved
protocols/dcutr/src/behaviour_impl.rs Show resolved Hide resolved
@mxinden
Copy link
Member

mxinden commented May 30, 2023

Moving an out-of-band discussion here:

any chance you can consider [this pull request] for the upcoming release?

I am fairly certain, unless @thomaseizinger raises any larger objections, that this pull request will make it into libp2p v0.52.0.

This is a bug fix and it does not break our public API. Thus, in case you want this before libp2p v0.52.0, we could as well include it as a patch release on top of libp2p v0.51. If so, we can backport to branch v0.51 once this pull request is merged.

Comment on lines +296 to +298
return Ok(Either::Right(Either::Left(
handler::direct::Handler::default(),
)));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With recent refactorings in libp2p-swarm, there is actually no more need for handler::direct::Handler at all. The only thing that it does is report an event to the behaviour that is then bubbled to the user. The behaviour could as well just bubble the event to the user here already and return just a dummy handler.

The above improvement is only slightly related to this pull request. In case you want to tackle it @arpankapoor in this pull request, that would be much appreciated. Though I am happy to merge as is and tackle it some other time.

Idea raised by @thomaseizinger in a synchronous discussion just now.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am in favor of that refactoring but no need to include in this PR.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, it could be tackled in another PR perhaps.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! I'll merge this as is then :)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tracked here #4013. Also unresolving here for easier discovery in the future.

@mergify mergify bot merged commit 5321a44 into libp2p:master May 31, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants