
[conn-manager] denying connections before they're established #418

Closed
pgte opened this issue Mar 12, 2018 · 28 comments

@pgte
Contributor

pgte commented Mar 12, 2018

Allow hooks for denying connections before they are established.

First (and wisely) requested by @kumavis here.

@pgte
Contributor Author

pgte commented Apr 6, 2018

To deny more connections, I can think of these two options:

  1. Close the transport listener so we don't even begin accepting more connections. Reopen once we can accept more connections. I think this needs hooks in libp2p-switch, which in turn needs changes in the transport interface and all transport implementations.
  2. When receiving a new raw connection, libp2p-switch should emit a connect event that is cancelable. If the event is canceled, close the connection before doing the multiplex -> secio -> multiplex dance.

Option 1 looks more efficient, but harder to implement because of the changes required. @diasdavid Do you have an opinion on this?
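A minimal sketch of what option 2 could look like, assuming a hypothetical cancelable event on libp2p-switch (none of these names exist yet; they only illustrate the idea):

```js
// Hypothetical: libp2p-switch emits a cancelable event for every new
// raw connection, before any multiplex/secio work is done.
swtch.on('incoming-connection', (event) => {
  if (connManager.atCapacity()) {
    // Close the raw connection on the same event loop tick,
    // skipping the multiplex -> secio -> multiplex dance entirely.
    event.cancel()
  }
})
```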

@pgte
Contributor Author

pgte commented Apr 11, 2018

Looks like the interface defined in interface-transport already allows for closing (.close(cb)) and reopening (.listen(ma, cb)).

Whether the specific transports can themselves handle recycling listeners (listen, close, listen) is another question. If some of them don't allow this, it should be categorised as a bug, since the interface does not state that this can't happen.
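A minimal sketch of the recycling that the interface permits, assuming a libp2p-tcp style transport instance (tcp) and a multiaddr ma:

```js
const listener = tcp.createListener((conn) => {
  // handle incoming connection
})

listener.listen(ma, (err) => {
  if (err) throw err
  // ...at capacity: stop accepting new raw connections
  listener.close((err) => {
    if (err) throw err
    // ...capacity is available again: reopen the same listener
    listener.listen(ma, (err) => {
      if (err) throw err
    })
  })
})
```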

A problem could exist in specific transports that also do relaying, like websockets-star. If I understand correctly, when closing a listener, the connection to the signaling web socket server is closed. Since this connection may carry internal streams to other peers, closing it would not only stop new incoming dials but also close those existing connections.

If this is true, I think option 2 would work best. If, for instance, the _incoming dial method emitted a cancelable event, the connection could be closed within the same event loop tick.

Perhaps @dryajov could shed some light on this?

@pgte
Contributor Author

pgte commented Apr 11, 2018

Also, @mkg20001, do you have any insight/opinion on this?

@mkg20001
Member

A quick hack for wrtc-star would be to simply not accept WebRTC offers after a threshold is reached.

For the long-term solution: maybe instead of closing the listener there could simply be new functions like .startReject() and .startAccept() (for example, these would tell wrtc-star to stop/start accepting offers without requiring a reconnect to the wrtc-star server every time).
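A sketch of how those proposed (not yet existing) methods might be driven, assuming connection-manager events along these lines:

```js
// Hypothetical methods from the comment above: the listener keeps its
// signaling connection open and only toggles whether offers are accepted.
connManager.on('limit:exceeded', () => listener.startReject())
connManager.on('limit:cleared', () => listener.startAccept())
```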

@dryajov
Member

dryajov commented Apr 11, 2018

TL;DR

  • Relay uses muxed connections exclusively.
    • HOP nodes (the ones that do the relaying) would also just unregister their multistream handler to stop relaying, but keep the relayed connections open.
  • Are we trying to limit the amount of peers we're connected to or the amount of raw connections to a single peer over a single transport?

@pgte Relay uses muxed connections exclusively, so I think it might be out of scope for this?

Close the transport listener so we don't even begin accepting more connections. Reopen once we can accept more connections. I think this needs hooks in libp2p-switch, which in turn needs changes in the transport interface and all transport implementations.

I also like this approach and I think we should look into it further.

If we still wanted to stop listening on relay, we can always unregister the multistream handler, and that should basically prevent it from listening while still allowing it to dial. Also, once the *-star protocols migrate to libp2p, we can probably do the same thing (they use an ad-hoc socket.io implementation right now).

Whether the specific transports can themselves handle recycling listeners (listen, close, listen) is another question. If some of them don't allow this, it should be categorised as a bug, since the interface does not state that this can't happen.

Yeah, we'll need to look into each protocol and figure out how it's doing it right now. Looking at libp2p-tcp, it calls server.close, which waits for all connections to terminate before shutting down; however, we have a timeout that forcibly destroys the connections, which kind of invalidates the close functionality we're looking for. Furthermore, the correct way of preventing more connections is by setting the maxConnections property.
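For reference, Node's built-in net server already exposes this knob, which libp2p-tcp could surface through its options (a sketch, not current libp2p-tcp API):

```js
const net = require('net')

const server = net.createServer((socket) => {
  // handle accepted socket
})

// Node core stops accepting once this many sockets are open,
// without tearing down the listener itself.
server.maxConnections = 5
server.listen(4001)
```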

Another issue I'm thinking about is whether limiting connections like this will solve our problems. What I mean is that the bulk of connections are muxed, and many peers will only have one underlying raw connection, though I believe the exception might be WebRTC?

From @kumavis' original issue:

We intend to use webrtc which is resource intensive and can only safely support 2-5 active connections.

Are these connections to different peers or webrtc connections to a single peer? Which leads me to the next question, are we trying to limit the amount of peers we're connected to or the amount of raw connections to a single peer over a single transport?

@dryajov
Member

dryajov commented Apr 11, 2018

After taking a closer look at the readme, I can see we're limiting the amount of peers being connected to, which can be either over a raw connection or a muxed connection in the case of circuit (and ws-star currently).

Whether the specific transports can themselves handle recycling listeners (listen, close, listen) is another question. If some of them don't allow this, it should be categorised as a bug, since the interface does not state that this can't happen.

Agreed, we should make the transports "reentrant", so that we can selectively start/stop them.

@pgte
Contributor Author

pgte commented Apr 13, 2018

@dryajov thank you for your insights!

The goal here is to save resources. An active connection to a peer consumes memory and CPU (varying with connection activity), which may still be relevant in mobile contexts even if we're relaying through websockets. And in JS, memory pressure equals CPU overhead because of GC. Also, WebRTC is especially taxing, but I think we could solve this in general, independently of transport.

So, even if the p2p connections are relayed and muxed, a connection to a new peer still induces a considerable overhead (a lot of time spent on crypto, etc), which is still taxing on the peer and the relay server.

I think we should, in the API, distinguish between closing the transport (.close()) and accepting/rejecting more connections.

For this, I really like your idea of being able to specify a maxConnections option in the transport, much like TCP. This would be part of the interface, implemented in each transport.
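A sketch of how that could look on a transport, assuming a hypothetical maxConnections option in interface-transport (not an existing API):

```js
const TCP = require('libp2p-tcp')

// Hypothetical option: the transport itself refuses raw connections
// beyond the limit, instead of the connection manager reacting later.
const tcp = new TCP({ maxConnections: 5 })
const listener = tcp.createListener((conn) => { /* ... */ })
listener.listen(ma, (err) => { /* ... */ })
```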

@dryajov thoughts?

@pgte
Contributor Author

pgte commented Apr 17, 2018

Of course, having a maxConnections option limits some of the applicability here, since it's a static value, and sometimes we may need to stop accepting connections when some measure (like event loop delay) exceeds a threshold.
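A sketch of driving the hypothetical startAccept()/startReject() toggles from a live measure such as event loop delay (threshold and method names are illustrative):

```js
const INTERVAL = 1000   // how often we sample, in ms
const MAX_DELAY = 100   // event loop lag we tolerate before shedding load

let last = Date.now()
setInterval(() => {
  // if the loop is lagging, this timer fires late; the lateness is our measure
  const delay = Date.now() - last - INTERVAL
  last = Date.now()
  if (delay > MAX_DELAY) {
    listener.startReject()
  } else {
    listener.startAccept()
  }
}, INTERVAL)
```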

@dryajov
Member

dryajov commented Apr 17, 2018

Sorry, totally missed your reply!

Of course, having a maxConnections option limits some of the applicability here, since it's a static value, and sometimes we may need to stop accepting connections when some measure (like event loop delay) exceeds a threshold.

I agree, we still need the ability to stop accepting connections on demand - maybe we need .start and .stop methods in addition to listen and close? The new methods would stop accepting connections on demand (even if that means rejecting on arrival rather than letting the runtime handle it, as in the case of maxConnections).
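A sketch of the proposed split, with .listen()/.close() managing the socket and the new (hypothetical) .start()/.stop() managing acceptance only:

```js
listener.listen(ma, (err) => {
  if (err) throw err
  // stop accepting new raw connections; existing ones stay up
  listener.stop()
  // later, resume accepting without rebinding the socket
  listener.start()
})
```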


There is one more thing that's bugging me (perhaps it even needs its own issue?): some connections are more important than others. For example, dropping a connection that circuit (HOP) is using can drop a bunch of multiplexed connections that rely on it. How would the connection manager handle that? Do we need to introduce a notion of priority, or just make the counts for that transport high enough to prevent premature disconnections?

@pgte
Contributor Author

pgte commented Apr 18, 2018

@dryajov:

I agree, we still need the ability to stop accepting connections on demand - maybe we need .start and .stop methods in addition to listen and close? The new methods would stop accepting connections on demand (even if that means rejecting on arrival rather than letting the runtime handle it, as in the case of maxConnections).

Sounds good!

There is one more thing that's bugging me (perhaps it even needs its own issue?): some connections are more important than others. For example, dropping a connection that circuit (HOP) is using can drop a bunch of multiplexed connections that rely on it. How would the connection manager handle that? Do we need to introduce a notion of priority, or just make the counts for that transport high enough to prevent premature disconnections?

Can you explain to me, in a gist, how circuit works (both in the case where (a) I'm dialing out and (b) I'm accepting new connections)?

@dryajov
Member

dryajov commented Apr 18, 2018

Sure thing!

TL;DR

  • Relay uses a muxed connection to bridge multiple nodes
    • a dialer might have several muxed connections going to a relay over another muxed connection
    • a listener might have many muxed connections coming from a relay point over another muxed connection.

Dropping the underlying connection to the relay or from the relay (raw or muxed) will drop all the other connections that are established on top of it.


So, circuit has three parts

  • Dialer
    • dialer that uses libp2p-switch to dial out
  • Listener
    • registers a multistream handler to listen for incoming connections over circuit
  • HOP
    • the actual relay that does the circuiting

Both the dialer and the listener are borrowed from the interface-transport interface, so in that regard circuit is just like any other transport; but instead of using TCP or WebSockets directly to dial out, it uses libp2p-switch itself to dial. The switch will use whatever existing connection it has, or establish a new one, to dial the HOP node. The HOP node, in turn, uses the switch to dial the relayed node.

All dialing happens over muxed connections, and once the connection between the two relayed points is established, it gets encrypted and muxed once again.

Borrowing this from the circuit tutorial, this is more or less how the flow looks:

  1. Node A tries to connect to Node B over one of its known addresses
  2. Connection fails because of firewall/NAT/incompatible transports/etc...
  3. Both Node A and Node B know of the same relay - Relay
  4. Node A falls back to dialing over Relay to Node B using its '/p2p-circuit' address, which involves:
    1. Node A sends a HOP request to Relay
    2. Relay extracts the destination address, figures out that a circuit to Node B is being requested
    3. Relay sends a STOP request to Node B
    4. Node B responds with a SUCCESS message
    5. Relay proceeds to create a circuit between the two nodes
  5. Node A and Node B are now connected over Relay
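For illustration, the fallback dial in step 4 over an explicit relay address could look roughly like this (placeholder peer IDs; the exact multiaddr shape and dial signature vary by version):

```js
// Node A dials Node B through Relay; everything rides over muxed streams.
nodeA.dial('/ip4/1.2.3.4/tcp/9090/ipfs/QmRelay.../p2p-circuit/ipfs/QmNodeB...', (err) => {
  if (err) throw err
  // connected: the relayed circuit is encrypted and muxed once again on top
})
```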

The reason dropping a connection might drop a bunch of circuited connections is that the underlying connection can be to a HOP node or, in the case of the HOP node itself, to a listener (or many listeners). There are a few cases where this happens:

  • HOP to listener (STOP) node
    • i.e. a browser/NATed node with popular content
    • "private services" that only some nodes can access over a relay
  • Dialer to HOP node
    • a node that dials out to many nodes over a relay
      • browser to browser (without WebRTC)
      • esoteric transports that only some of the nodes speak (BLE, etc...)

@pgte
Contributor Author

pgte commented Apr 20, 2018

@dryajov thank you, that was very clear to me!

Looks to me like a connection to a HOP node should be valued as the value of the peer itself plus the values of all the peers we are connected to through it.
To me, it feels like this would be the "true" value measure of a connection to a HOP peer, since disconnecting it would remove all that value.
@dryajov Does this make sense and would this solve this issue?
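A sketch of that computation, with hypothetical connection-manager helpers (valueOf, peersRelayedThrough) standing in for whatever the real API ends up being:

```js
// Value of a connection to a HOP peer = its own value plus the values
// of every peer we reach through it; dropping it loses all of that value.
function hopAwareValue (peerId, connManager) {
  const own = connManager.valueOf(peerId)
  return connManager.peersRelayedThrough(peerId)
    .reduce((sum, p) => sum + connManager.valueOf(p), own)
}
```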

@pgte
Contributor Author

pgte commented Apr 20, 2018

@dryajov also, I guess this would have to be applied recursively, because it looks to me like it's possible to have infinite nesting of HOP connections, or no?

@dryajov
Member

dryajov commented Apr 20, 2018

Looks to me like a connection to a HOP node should be valued as the value of the peer itself plus the values of all the peers we are connected to through it.

That sounds correct to me too.

I guess this would have to be applied recursively, because it looks to me like it's possible to have infinite nesting of HOP connections, or no?

Hmm - yes. But multi-hop dialing is not here yet, so we can probably postpone implementing it for now, though the reasoning is correct 👍.


I also want to bring up one more thing - the go implementation has a concept of tagging peers, maybe we can borrow that as well...

Basically, as far as I can tell, tagging boils down to being able to set an arbitrary value for a peer - all tags contribute to the peer's value - https://github.com/libp2p/go-libp2p-connmgr/blob/master/connmgr.go#L136...L148 - and the peers with the lowest value then get disconnected - https://github.com/libp2p/go-libp2p-connmgr/blob/master/connmgr.go#L61...L108. Peers with higher importance get a higher number (see libp2p/go-libp2p-circuit#20 and libp2p/go-libp2p-kad-dht#95).

IMO, this is a reasonable and simple way of prioritizing certain peers over others...

Maybe we set the default to 0.5 and go from there, though if we're going to do this, it might make sense to switch to whole numbers, since it makes things easier(?) to handle...
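A sketch of what a tag system modeled on go-libp2p-connmgr could look like in JS (all names hypothetical):

```js
// peerId -> Map(tag -> numeric value); a peer's total value is the
// sum of its tags, and the lowest totals get trimmed first.
const tags = new Map()

function tagPeer (peerId, tag, value) {
  if (!tags.has(peerId)) tags.set(peerId, new Map())
  tags.get(peerId).set(tag, value)
}

function peerValue (peerId) {
  let sum = 0
  for (const v of (tags.get(peerId) || new Map()).values()) sum += v
  return sum
}

// e.g. a relay marks its HOP connections as more valuable than DHT ones:
// tagPeer(id, 'relay-hop', 20)
// tagPeer(id, 'kad-dht', 5)
```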

@pgte
Contributor Author

pgte commented Apr 21, 2018

@dryajov thanks for showing that, I wasn't aware that Go managed it like this (I thought it disconnected peers in order of connection age).

Currently, here we have a global value per peer, which defaults to 1 and can be overridden by the app.
I guess that you're catering for the scenario where there may be multiple applications using the same IPFS node, and that the tag system may be used to represent the value for each application?

@dryajov
Member

dryajov commented Apr 21, 2018

Currently, here we have a global value per peer, which defaults to 1 and can be overridden by the app.

Yep, we're pretty close in essence to the Go approach.

I guess that you're catering for the scenario where there may be multiple applications using the same IPFS node

I believe tagging can be applied to that as well (though I'm not sure what initial use case Go was going after). My only aim with this (which I believe can be achieved without tagging) is to allow peers to be marked as having higher priority than other peers.

i.e.: we can allow a service/app/transport to increase a peer's value based on some criteria so that, when an eviction event gets triggered, higher-valued peers are evicted after lower-valued peers.

In this case, our concept of peer value works just as well, and if we allow it to be adjusted arbitrarily, it'll essentially be the same thing without namespaces/tags.

// cc @Stebalien @whyrusleeping @vyzo

@whyrusleeping

On the Go side we don't currently have the facilities in place to close out connections before they are established. The main issue is that you have to do a full secio handshake anyway before you can know which peer it is, and you don't want to blanket-ignore all new connections.

@daviddias
Member

There is a case where a node is really resource constrained and can only afford to pay for the connections it is interested in dialing. Once those connections are set up, asks can go both ways (protocol and stream muxing FTW), but letting every other node dial in and force it to do the SECIO handshake is too much.

For the browser, SECIO in JS is expensive. When the day comes that we can piggyback on the WebRTC DTLS handshake or the QUIC TLS 1.3 handshake, the situation will be different. Still, the case will exist for IoT devices.

@pgte
Contributor Author

pgte commented Apr 23, 2018

Opened "peer value should be HOP-aware" issue: https://github.com/libp2p/js-libp2p-connection-manager/issues/10

@pgte
Contributor Author

pgte commented Apr 23, 2018

Moving the interface discussion to here: libp2p/interface-transport#34

@daviddias
Member

@pgte is there value to keep this issue open?

@pgte
Contributor Author

pgte commented Aug 15, 2018

@diasdavid yes, I think so, since it's the job of the connection manager to tell the transports whether they should be accepting more connections or not.

@mkg20001
Member

mkg20001 commented Aug 15, 2018

I think making .start()/.stop() not close existing connections is a bad idea, because when the libp2p instance gets stopped it might leave unclosed connections around.
Instead, there should be a dedicated pair of functions, .accept()/.reject(), that changes whether a transport will accept connections, and .stop() should close all connections.

Edit: with .start()/.stop() I actually meant the functions .listen() and .close().
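The intended semantics, as a sketch (method names are the proposal above, not an existing API):

```js
transport.reject()   // refuse new connections; existing ones stay open
transport.accept()   // resume accepting new connections
transport.close(cb)  // tear down the listener AND all open connections
```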

@pgte
Contributor Author

pgte commented Aug 16, 2018

@mkg20001 I think that the transport interface discussion is happening here: libp2p/interface-transport#34
Could you post this there?

@pgte
Contributor Author

pgte commented Aug 16, 2018

@mkg20001 I see the issue I pointed you to is closed. Perhaps create a new issue there?

@daviddias changed the title from "denying connections before they're established" to "[conn-manager] denying connections before they're established" Aug 22, 2019
@daviddias transferred this issue from libp2p/js-libp2p-connection-manager Aug 22, 2019