Add support for configuring priority peers (connection tagging) #369

Closed
jacobheun opened this issue Jun 10, 2019 · 10 comments
Labels: exp/expert (Having worked on the specific codebase is important), kind/enhancement (A net-new feature or improvement to an existing feature), P1 (High: Likely tackled by core team if no one steps up), status/ready (Ready to be worked)

Comments

@jacobheun
Contributor

This is a component of connection tagging. Libp2p should support configuring/tagging specific peers/multiaddrs as priority connections. The goal here is to have connections that the Connection Manager does not kill, and that we try to maintain. If these connections are killed, libp2p should attempt to automatically reconnect to them.

An example of this is an IPFS browser node setting a preload node as an important connection. Since the preload node acts as a proxy for serving all of its content, these connections are vital to maintain. If the connection is lost, the node can become effectively unusable.

This would also be important for private clusters that expose a single relay/proxy node. Maintaining those internal connections to the relay is critical for those peers.

Future iterations of this could involve a spec to have the nodes coordinate and agree on this connection keep-alive behavior. This would allow both nodes to agree to maintain the connection and avoid hanging up on one another. It would also allow overtaxed nodes to decline the keep-alive, giving the requesting node the opportunity to find other nodes with connection availability.

@jacobheun jacobheun added kind/enhancement A net-new feature or improvement to an existing feature exp/expert Having worked on the specific codebase is important status/ready Ready to be worked P1 High: Likely tackled by core team if no one steps up labels Jun 10, 2019
@jacobheun jacobheun self-assigned this Jun 10, 2019
@jacobheun
Contributor Author

@vasco-santos @dirkmc I've written down some thoughts around this. It also made me think more about how service configuration works currently and how it can improve, but I've left that out of these notes. I'll look at writing more about that soon and posting a new issue.

Peer Management

Prioritizing Peers

Libp2p needs to be able to identify peers that it deems priority connections, so that nodes can maintain a connection to peers that are in a critical path for that node to operate. An example of this would be preload nodes for IPFS browser nodes, or signaling servers for WebRTC transports. If the connection to these nodes ends, the node may no longer be able to effectively interact with the network, due to current limitations of distributed technologies.

Being able to prioritize peers also enables nodes in the network to create and more easily maintain overlay networks with specific peers. A potential example of this could be a WebRTC overlay. Assuming nodes in the network supported a signaling spec, as WebRTC nodes became aware of other nodes, they could create an overlay network with a subset of those nodes and mark them as priority peers, similar to how Gossipsub overlays are constructed. This could potentially improve the ability of nodes to query unconnected nodes, without relying on peers being initially connected to the same signaling server.

Ideally, both peers would agree to this priority connection and avoid disconnecting from one another. If only one peer marks the other as a priority peer, this can lead to disconnects and immediate redials to that peer, which would be unnecessarily taxing for both nodes. This could be especially aggravating for the receiving node if it is at its high watermark for connections.

Configuration

As the Peer Store (PeerBook in JS) is the central location of Peer data, it makes sense for it to house the metadata marking the Peer as priority. It may be useful to prioritize specific multiaddrs instead of the peers themselves, but as multiaddrs can change over time, via protocols like AutoNAT and AutoRelay, an initial implementation of just tagging the Peer should be more dependable.

Similar to how Bootstrap peers are configured today, priority peers would be configured via their multiaddr. Peers that were previously in the Bootstrap list would be removed from there and added to the priority configuration, as those peers will also have connections established.

Configuration Options

Here are some potential configuration options:

Via the config:

new Libp2p({
  // ...
  config: {
    peers: {
      '/ip4/xxx.xx.xx.x/tcp/4001/QmPreload': {
        tags: ['Priority']
      }
    }
  }
})

Via methods

const libp2p = new Libp2p({ /* ... */ })
libp2p.peerBook.tagPeer('/ip4/xxx.xx.xx.x/tcp/4001/QmPreload', libp2p.peerBook.TAGS.Priority)

Updates

  • Update PeerBook to support adding tags to peers
  • Update Connection Manager to check for priority tags before disconnecting, or to exclude those peers entirely from tracking
  • Update libp2p to dial the priority peers on startup
  • Update libp2p to listen to the disconnect for those peers and reconnect should it happen
    • The reconnect should have a small random backoff built in, to avoid mass redials if many peers are disconnected from the same priority peer
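
The reconnect step in the list above could be sketched roughly as follows. This is only an illustration: `reconnectDelay`, `reconnectWithBackoff`, and all of their options are hypothetical names, not libp2p APIs, and the exponential growth of the base delay is an added assumption on top of the "small random backoff" described here.

```javascript
// Hypothetical helper, not part of libp2p.
// Returns a delay in milliseconds: a capped, exponentially growing base plus random jitter.
function reconnectDelay (attempt, { baseMs = 1000, maxMs = 60000, jitterMs = 2000, random = Math.random } = {}) {
  const base = Math.min(maxMs, baseMs * 2 ** attempt)
  return base + Math.floor(random() * jitterMs)
}

// Hypothetical reconnect loop. `dial` is any async function that resolves on success
// and rejects on failure (for example, a wrapper around dialing the priority peer).
async function reconnectWithBackoff (dial, { maxAttempts = 5, sleep = (ms) => new Promise(resolve => setTimeout(resolve, ms)), ...delayOptions } = {}) {
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    // Wait first, so that many peers losing the same priority peer do not redial at once
    await sleep(reconnectDelay(attempt, delayOptions))
    try {
      return await dial()
    } catch (err) {
      // Dial failed; loop around and retry after a longer delay
    }
  }
  throw new Error(`gave up after ${maxAttempts} attempts`)
}
```

Passing `random` and `sleep` in as options keeps the timing logic deterministic in tests; a real implementation would more likely hook into the disconnect event for the tagged peer.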

Additional Thoughts

It may be valuable to make this a standalone service module that takes a libp2p instance. This would avoid the need to add specific functionality for this to libp2p itself, and would make it easier for other developers to build similar modules that leverage tagging.

const PriorityPeerService = require('libp2p-priority-peer-service')
const libp2p = new Libp2p({ 
  modules: {
    services: [ PriorityPeerService ]
  }
})
libp2p.peerBook.tagPeer('/ip4/xxx.xx.xx.x/tcp/4001/QmPreload', PriorityPeerService.TAGS.Priority)

@dirkmc
Contributor

dirkmc commented Jun 12, 2019

This sounds like a good improvement to connection management 👍

Ideally connections would be prioritized by the service the remote peer provides, and there would be a mechanism to discover which peers provide a particular service, so that the configuration doesn't need to be hard-coded (e.g. discovery could be through bootstrap nodes or a rendezvous service).

A less flexible but simpler approach would be to configure a number of candidate peers that provide a particular service, as a means of providing some redundancy and load balancing.

Do we want to maintain permanent connections to bootstrap nodes? It may be more fault tolerant and put less stress on those nodes to prioritize them as "Hi-Lo" - high when there are few connected peers and low once a reliable mesh of connections has formed with other peers.

@jacobheun
Contributor Author

Do we want to maintain permanent connections to bootstrap nodes? It may be more fault tolerant and put less stress on those nodes to prioritize them as "Hi-Lo" - high when there are few connected peers and low once a reliable mesh of connections has formed with other peers.

Yeah, we really don't want to stay connected to them unless we're below our min peers watermark, as they're primarily just an entry point workaround to join the network. We currently have this behavior for all discovered peers with auto dial. If we're above the min peers watermark we stop auto dialing, but below it we do.
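
As a rough illustration of the watermark behavior and the "Hi-Lo" idea discussed here, the decision could be reduced to two small checks. Both function names and the `'high'`/`'low'` labels are hypothetical and only meant to make the rule concrete.

```javascript
// Hypothetical sketch; these are not libp2p functions or options.

// Auto dial discovered peers only while we are below the min peers watermark
function shouldAutoDial (connectedPeers, minPeers) {
  return connectedPeers < minPeers
}

// "Hi-Lo": bootstrap peers matter while we have few connections,
// and become low priority once enough other peers are connected
function bootstrapPriority (connectedPeers, minPeers) {
  return connectedPeers < minPeers ? 'high' : 'low'
}

console.log(shouldAutoDial(3, 25), bootstrapPriority(3, 25))
console.log(shouldAutoDial(40, 25), bootstrapPriority(40, 25))
```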

I think the progression of a node would ideally look something like:

  1. Bootstrap to the network (hopefully migrating to being more distributed in the future)
  2. Actively discover peers that use our protocols (via rendezvous or other means)
  3. Continue until we have at least minPeers connections to those peers (other nodes would be excluded from this count, such as the bootstrap nodes)
  4. Create n+1 overlay networks with those peers, depending on the needs and quantities of those protocols
  5. Prioritize all overlay network connections
  6. Switch to passive discovery? (we probably don't need to keep crawling at this point)

Different node types may end up needing to have different behavior for this, but I think it accounts for a typical node.

I think in general if we fall below the min peers watermark, we should be going through our list of known peers to connect and actively find more peers, so bootstrap nodes wouldn't really need to be tagged. We really don't need the bootstrap module at all, as we should just be pulling from our Peer Store when we have too few connected peers. Instead of configuring the list of them when creating libp2p, we could/should just be adding them to the Peer Store.

@vasco-santos
Member

@jacobheun thanks for putting this together!

I like your proposal and I think that this is definitely the way to go 🚀

I would introduce a keepAlive tag for keeping the connection, and the priority would be used both for deciding which connections to start first and for pruning. Let me know what you think.

This suggestion also makes me wonder whether that should be an array of tags, or properties. I was thinking of tags more for visualization and debugging purposes, but we can also use them in this use case.


Also agree with the "Hi-Lo" reasoning for the bootstrap peers!

@achingbrain
Member

I'm looking at implementing this piece of functionality. "keep-alive" sounds good, but I think "priority" needs a bit more to it, in terms of connection pruning at least.

If you have two components that mark the same peer as "priority" but then one finds a better peer and cleans up after itself, removing the "priority" tag, the other component will get rug-pulled.

Instead tags could be specific to the tagger, for example "preload" for IPFS, "dht-peer" for libp2p-kad-dht, "topic-mesh-node" for libp2p-gossipsub, etc, to ensure stable connections to high value peers.

So if we have tags with a name and a value, then for connection pruning we might just sum up the value of all tags a peer has, use that to order the connections, and prune the low value connections first.
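
A minimal sketch of that idea, assuming an in-memory store keyed by peer ID string. `PeerTags`, `totalValue`, and `selectPeersToPrune` are hypothetical names for illustration and are not the peer store API.

```javascript
// Hypothetical in-memory tag store; not the js-libp2p peer store API.
class PeerTags {
  constructor () {
    this.tags = new Map() // peer ID string -> Map(tag name -> value)
  }

  tagPeer (peerId, name, value) {
    if (!this.tags.has(peerId)) this.tags.set(peerId, new Map())
    this.tags.get(peerId).set(name, value)
  }

  // Removing one component's tag leaves the tags of other components untouched
  removeTag (peerId, name) {
    const peerTags = this.tags.get(peerId)
    if (peerTags != null) peerTags.delete(name)
  }

  // The value of a peer is the sum of all of its tag values
  totalValue (peerId) {
    let total = 0
    for (const value of (this.tags.get(peerId) ?? new Map()).values()) total += value
    return total
  }
}

// Returns the peers to disconnect when over the limit, lowest total value first
function selectPeersToPrune (connectedPeerIds, peerTags, maxConnections) {
  const excess = Math.max(0, connectedPeerIds.length - maxConnections)
  return [...connectedPeerIds]
    .sort((a, b) => peerTags.totalValue(a) - peerTags.totalValue(b))
    .slice(0, excess)
}
```

Because each component uses its own tag name, one component removing its tag only lowers the peer's total value rather than stripping protection another component still relies on.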

Some tags like "keep-alive" might have "special" meanings like "best-effort reconnect after disconnect".

// Resolves once the tag has been stored (Promise<void>)
await libp2p.peerBook.tagPeer(PeerId('QmPreload'), 'keep-alive', {
    value: 100,
    ttl: 60000 // optional, expire tag in 1m
})

// Resolves once the tag has been removed (Promise<void>)
await libp2p.peerBook.removeTag(PeerId('QmPreload'), 'keep-alive')

// Resolves to the peer's tags: Promise<[{ name: 'keep-alive', value: 100 }]>
const tags = await libp2p.peerBook.getTags(PeerId('QmPreload'))

These could get configured at startup:

new Libp2p({
  // ...
  peerStore: {
    peers: {
      'QmPreload': {
        tags: {
          'keep-alive': { value: 100 },
          'preload': { value: 50 }
        }
      }
    }
  }
})

We might configure bootstrap nodes as keep-alive with a ttl for the first 10 minutes of running a node, for example. After that they become eligible for pruning if we have hit our max connections (i.e. once they've done their job).
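
One way the ttl behavior could work is sketched below for the tags of a single peer. `ExpiringTags` is a hypothetical class, and the clock is passed in as `now` purely so the expiry logic can be exercised without waiting.

```javascript
// Hypothetical sketch of expiring tags for one peer; not the peer store API.
class ExpiringTags {
  constructor (now = () => Date.now()) {
    this.now = now
    this.tags = new Map() // tag name -> { value, expires }
  }

  // A tag without a ttl never expires
  tag (name, { value, ttl } = {}) {
    const expires = ttl == null ? Infinity : this.now() + ttl
    this.tags.set(name, { value, expires })
  }

  // Returns only the live tags, dropping expired ones as a side effect
  getTags () {
    const live = []
    for (const [name, { value, expires }] of this.tags) {
      if (expires <= this.now()) {
        this.tags.delete(name)
      } else {
        live.push({ name, value })
      }
    }
    return live
  }
}

// Usage: a bootstrap peer that is protected for its first 10 minutes only
const bootstrapTags = new ExpiringTags()
bootstrapTags.tag('keep-alive', { value: 100, ttl: 10 * 60 * 1000 })
```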

New connections might be protected for a few minutes so they can't get culled before identify has completed and any interested topologies have tagged the peer connections as valuable.

achingbrain added a commit to libp2p/js-libp2p-interfaces that referenced this issue Jun 22, 2022
Allow tagging peers to better prioritise which connections to kill
when hitting limits.  Also for keeping "priority" connections alive.

Refs: libp2p/js-libp2p#369
achingbrain added a commit to libp2p/js-libp2p-peer-store that referenced this issue Jun 22, 2022
Allows tagging peers to mark some important or ones we should keep
connections open to, etc.

Depends on:

- [ ] libp2p/js-libp2p-interfaces#255

Refs: libp2p/js-libp2p#369
achingbrain added a commit to libp2p/js-libp2p-interfaces that referenced this issue Jun 24, 2022
Allow tagging peers to better prioritise which connections to kill
when hitting limits.  Also for keeping "priority" connections alive.

Refs: libp2p/js-libp2p#369
achingbrain added a commit to libp2p/js-libp2p-peer-store that referenced this issue Jun 24, 2022
Allows tagging peers to mark some important or ones we should keep
connections open to, etc.

Refs: libp2p/js-libp2p#369
@BigLep
Contributor

BigLep commented Sep 10, 2022

@achingbrain : I know there has been work here since your last comment. A few things:

  1. Where did we land?
  2. What is remaining on the original issue?
  3. Does a js-libp2p node now have ALLOW-list functionality so it can ignore the world except for some designated peers? Basically, can this be used as an eclipse attack prevention mechanism, like the one go-libp2p added in "Defend against eclipse attacks with ALLOW-list support" (go-libp2p-resource-manager#29)?

@BigLep BigLep assigned achingbrain and unassigned jacobheun Sep 13, 2022
@BigLep
Contributor

BigLep commented Sep 13, 2022

2022-09-13 triage conversation: we need to summarize where we got to and discuss what can now be done. We believe we provided the functionality originally outlined and now it's about leveraging it in other areas like gossipsub. That will likely translate to a new issue in gossipsub.

@achingbrain
Member

achingbrain commented Dec 6, 2022

The final piece here is for interested modules to tag peers they need to keep connections open to.

@dapplion
Contributor

dapplion commented Jan 18, 2023

I have some comments on tagging with an integer score:

  • For existing PRs the chosen values appear quasi-random, since it's very hard to quantify this concept
  • The value of tags between protocols is arbitrary, with every protocol currently having a factor of 1. For eth2 gossipsub, a peer may be grafted 74 times, giving it a tag value of 7400 that completely overruns the scores of every other protocol.
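
To make the arithmetic in the second point concrete, here is a small sketch. It assumes each graft adds a tag worth 100 (which is what 7400 / 74 implies); the tag names and the comparison peer are hypothetical.

```javascript
// Sum of all tag values for one peer
function sumTagValues (tags) {
  return tags.reduce((total, tag) => total + tag.value, 0)
}

// Assumption: one tag worth 100 per grafted topic; the names are illustrative only
const gossipsubPeerTags = Array.from({ length: 74 }, (_, i) => ({ name: `topic-mesh-node-${i}`, value: 100 }))

// A peer tagged by two other components, using the values from earlier in this thread
const preloadPeerTags = [{ name: 'keep-alive', value: 100 }, { name: 'preload', value: 50 }]

console.log(sumTagValues(gossipsubPeerTags)) // 7400
console.log(sumTagValues(preloadPeerTags)) // 150
```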

This tagging system appears to me as a complicated scoring scheme that has not been properly researched and could have unintended practical and security considerations.

From reading this post a couple of times, it seems that most goals could be achieved without an integer tag value, by instead just expressing a keep-alive status like:

enum KeepAlive {
    /// If nothing new happens, the connection can be closed at the given `Instant`.
    Until(Instant),
    /// Keep the connection alive.
    Yes,
    /// Close the connection if needed.
    No,
}
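
For readers following along in JavaScript, the three states above might be rendered roughly like this. It is only a sketch: `KeepAlive` and `canClose` are hypothetical names, and a millisecond timestamp stands in for `Instant`.

```javascript
// Hypothetical JavaScript rendering of the three keep-alive states
const KeepAlive = {
  // If nothing new happens, the connection can be closed at the given time
  until: (timestampMs) => ({ type: 'until', timestampMs }),
  // Keep the connection alive
  yes: { type: 'yes' },
  // Close the connection if needed
  no: { type: 'no' }
}

// Returns true if a connection with this status may be closed at time `nowMs`
function canClose (keepAlive, nowMs) {
  switch (keepAlive.type) {
    case 'yes': return false
    case 'no': return true
    case 'until': return nowMs >= keepAlive.timestampMs
    default: throw new Error(`unknown keep-alive state: ${keepAlive.type}`)
  }
}
```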

Even then, all these decisions are extremely opinionated, and they force specific paradigms on libp2p consumers.

Currently lodestar is fighting libp2p features more than necessary due to their opinionated nature. For example, the connection manager should never be deciding which peers to disconnect in the first place, but should instead just enforce some limits set by the user, i.e. maxConnections should not mean "disconnect a random peer if above this limit", but instead mean "prevent new connections from being created if at that limit".

If some consumer like an IPFS browser node wants the default peer manager strategy, it should opt into it instead of having it there by default. A peer manager can be plugged into libp2p easily using the existing APIs. Multiple peer manager strategies can then be developed and shipped as modular components.

@achingbrain
Member

Closing as this is now complete.

maxConnections should not mean "disconnect a random peer if above this limit", but instead mean "prevent new connections from being created if at that limit".

It's worth noting that maxConnections does indeed prevent new incoming connections from being created (exceptions are made for whitelisted peers/networks). libp2p doesn't disconnect random peers; the whole point here is to be able to apply heuristics so that the disconnected peers are not randomly chosen.
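
The incoming-connection rule described here could be pictured as the check below. This is a simplified sketch, not the connection manager's code: `shouldAcceptIncoming` and its parameters are hypothetical, and the allow list is modeled as exact address strings rather than peers or networks.

```javascript
// Hypothetical sketch of the accept/deny decision for a new incoming connection
function shouldAcceptIncoming ({ connectionCount, maxConnections, remoteAddr, allowList = [] }) {
  // Allow-listed addresses are exempt from the limit
  if (allowList.includes(remoteAddr)) return true
  // Everyone else is only accepted while we are below the limit
  return connectionCount < maxConnections
}

console.log(shouldAcceptIncoming({ connectionCount: 10, maxConnections: 10, remoteAddr: '/ip4/203.0.113.7' }))
```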

At any rate, the feature has been implemented in a way that lets consumers such as Lodestar opt out and maintain their own peer ranking system separate from the libp2p connection manager.


6 participants