Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Introduce Peer Management #594

Closed
7 tasks done
chaitanyaprem opened this issue Jul 5, 2023 · 18 comments
Closed
7 tasks done

Introduce Peer Management #594

chaitanyaprem opened this issue Jul 5, 2023 · 18 comments
Assignees
Labels
enhancement New feature or request epic Tracks a yearly team epic (only for waku-org/pm repo)

Comments

@chaitanyaprem
Copy link
Collaborator

chaitanyaprem commented Jul 5, 2023

Planned start date:
Due date: 31st-Aug-2023

Summary

Implement basic peer management which helps in protecting the node from Dos and also to limit number of connections. This would primarily be helpful for Status Desktop(which runs a relay node) and the service node.
Ensure items from waku-org/nwaku#1353 are covered.

Acceptance Criteria

  • Ensure a given in/out ratio for relay peers.
  • Leave room for service peers
  • Dogfooding with status-desktop without any issues.

Tasks

RAID (Risks, Assumptions, Issues and Dependencies)

Dependent on Dogfooding of the new peer management via status desktop.

@alrevuelta
Copy link

Hi there, thanks for this! gowaku indeed needs some peer management.

I would focus on the following for peer management:

  • Ensure a given in/out ratio. This prevents a bunch of attacks, such as shadowing, where malicious peers surround you. If we dont enforce a in/out peer ratio, this is more likely to be exploited.
  • Ensure we leave some room for service peers (aka non relay peers). Nodes should be altruistic, and offer to the network services such as store or peer exchange. So we must ensure that each node has at least some % of connections available for this.
  • Prevent attacks from the same ip, aka colocation factor. Prune peers that are behind the same IP. In other words, never allow more than x connections from the same IP.
  • Perhaps not strictly peer management but related. Enforce some rate limits for heavy protocols such as store. Otherwise a node can be DoSed by store requests. Example: only allow to serve x request/second, or bytes/second. Beyond that, temporally blacklist.

Atleast 1 peer per configured service should be connected

Not sure we need this. Perhaps service peers should be used on demand. You need something, you get it and the connection is closed. Like this, we are also nicer to the network, since we dont occupy connections that we are not using. Pure req/resp.

The idea is to assign priority to each peer based on various factors. Priority range shall between 1-10. 5 - 10 would be the initial priority assigned to each peer based on service it supports. (e.g: relay peers would be given 6, whereas service specific peers such as Store/lightpush would be given 7 ).

Not sure I would implement this peer priority idea, at least right now. With gossipsub scores it should be enough by now.

tldr: we indeed need better peer management and on top of what you have mentioned, i would focus on peer management strategies to prevent being DoSed and ensure the network can serve light protocols.

@chaitanyaprem
Copy link
Collaborator Author

chaitanyaprem commented Jul 5, 2023

Hi there, thanks for this! gowaku indeed needs some peer management.

I would focus on the following for peer management:

* Ensure a given in/out ratio. This prevents a bunch of attacks, such as shadowing, where malicious peers surround you. If we dont enforce a in/out peer ratio, this is more likely to be exploited.

This would be a good feature, will include it.

* Ensure we leave some room for service peers (aka non relay peers). Nodes should be altruistic, and offer to the network services such as store or peer exchange. So we must ensure that each node has at least some % of connections available for this.

Understood, but you are referring to service node. Since go-waku is also used by mobile and light clients, will have to think of how to address those cases as well.

* Prevent attacks from the same ip, aka colocation factor. Prune peers that are behind the same IP. In other words, never allow more than x connections from the same IP.

Understood, there is already a basic IP based limit check and disconnection. Just have to bring it under peer manager.

* Perhaps not strictly peer management but related. Enforce some rate limits for heavy protocols such as store. Otherwise a node can be DoSed by store requests. Example: only allow to serve x request/second, or bytes/second. Beyond that, temporally blacklist.

Good idea to have some rate-limiting for service nodes. Probably per service basis.

Atleast 1 peer per configured service should be connected

Not sure we need this. Perhaps service peers should be used on demand. You need something, you get it and the connection is closed. Like this, we are also nicer to the network, since we dont occupy connections that we are not using. Pure req/resp.
Got it, for light protocols we don't need this. But for protocols that require constant interaction (like Store), this maybe required.

The idea is to assign priority to each peer based on various factors. Priority range shall between 1-10. 5 - 10 would be the initial priority assigned to each peer based on service it supports. (e.g: relay peers would be given 6, whereas service specific peers such as Store/lightpush would be given 7 ).

Not sure I would implement this peer priority idea, at least right now. With gossipsub scores it should be enough by now.

My idea of priority was to decide when to prune or degrade a peer's priority based on various factors(something similar to https://github.com/prysmaticlabs/prysm/tree/develop/beacon-chain/p2p/peers/scorers). Also, this would give flexibility for users/applications using waku to define their own scoring mechanism and set priority for peers. Not sure how much this would be of a requirement for an app like Status. @richard-ramos , maybe can shed some light on this.

tldr: we indeed need better peer management and on top of what you have mentioned, i would focus on peer management strategies to prevent being DoSed and ensure the network can serve light protocols.

This is mainly required for service-nodes, but for light-clients peer management would mostly involve how to manage in/out connections as per bandwidth and other requirements.

@richard-ramos
Copy link
Member

Regarding these suggestions from @alrevuelta:

Prevent attacks from the same ip, aka colocation factor. Prune peers that are behind the same IP. In other words, never allow more than x connections from the same IP.

This should be already covered by the connection gater: https://github.com/waku-org/go-waku/blob/master/waku/v2/connection_gater.go#L55-L66

Perhaps not strictly peer management but related. Enforce some rate limits for heavy protocols such as store. Otherwise a node can be DoSed by store requests. Example: only allow to serve x request/second, or bytes/second. Beyond that, temporally blacklist.

This I think we can manage by using go-libp2p scaling limit config, for example here we can see the limits set by go-libp2p for the default services it provides: https://github.com/libp2p/go-libp2p/blob/master/limits.go#L15 . We should be able to control the maximum number of streams allocated to each service and memory.

@richard-ramos
Copy link
Member

Atleast 1 peer per configured service should be connected. Else initiate connections to peers based on priority.

This i think it might not be really necessary. We connect to services on demand, so as long as as we chose a peerID supporting the protocol we need (or have the peerstore chose one for us), it should be fine, as a new connection will be opened to that peer or an existing one reused

Ensure we leave some room for service peers (aka non relay peers). Nodes should be altruistic, and offer to the network services such as store or peer exchange. So we must ensure that each node has at least some % of connections available for this.

This idea i do like. One problem go-waku test fleet has is that sometimes wakucanary has trouble connecting to the nodes since all the slots are already allocated to other peers. I imagine having a % allocated to services should help with this scenario (not so much for cases when wakucanary attempts relay connection and all gossipsub slots are already busy :) )

(peer priority). This would give flexibility for users/applications using waku to define their own scoring mechanism and set priority for peers. Not sure how much this would be of a requirement for an app like Status.

No requirement defined for this, as far as i know.

@alrevuelta
Copy link

@richard-ramos golibp2p limits looks very nice, had no idea about this. With a quick look it doesnt seem we can enforce served bytes/sec, but perhaps memory is just enough. Or well, we can always do it on top. prysm approach seems also interesting, using a leakybucket: https://github.com/prysmaticlabs/prysm/blob/e9b5e52ee219d058ab587f5187732281505f5b2b/beacon-chain/sync/rate_limiter.go#L36

@chaitanyaprem

My idea of priority was to decide when to prune or degrade a peer's priority based on various factors(something similar to https://github.com/prysmaticlabs/prysm/tree/develop/beacon-chain/p2p/peers/scorers).

Yep, agree with the feature but would say its just a nice to have right now. imho peer management for DoS protection is prio 1 right now. The shard that a peer supports will be part of this "score" in the future, since automatic sharding will add an extra dimension to this.

Also, this would give flexibility for users/applications using waku to define their own scoring mechanism and set priority for peers. Not sure how much this would be of a requirement for an app like Status. @richard-ramos , maybe can shed some light on this.

Not sure this flexibility is i) needed or ii) if it will be used at all. Defining a custom scoring mechanisim requires depth knowledge on waku/gossipsub/libp2p and doing it wrong can have a huge impact on both the network and the node itself. I don't imagine anyone in Ethereum writing their own peer scoring algorithm. The only thing that afaik people configure here is staticnodes, which are nodes that the protocol always stays connected to.

So, "scoring" (or some criteria beyond just gossipsub), yes. But I wouldn't overengineer it at this stage. Something as simple as gossipsub scoring + keep staticnodes connected + just connect to shards we are interested should be enough imho.

@chaitanyaprem
Copy link
Collaborator Author

chaitanyaprem commented Jul 6, 2023

Thanks @richard-ramos and @alrevuelta for your comments and suggestions.
Agree with both of your points wrt what items to be considered right now.

Based on the above discussion, following are conclusions

  • not consider peer priority(we can take a relook at this later) since it is more of nice to have.
  • Introduce peer management with below capabilities

Peer Manager Capabilities

  1. Ensure static nodes are always connected.
  2. Support for peers when subscribed or publishing to multiple shards. Basic implementation is done, need to add extensive tests.
  3. Ensure there is always some room for service peers similar to feat: limit relay connections below max conns nwaku#1813
  4. Use go-libp2p scaling limit config
  5. Peer connections Pruning
  6. Ensure a given in/out ratio for relay peers.
  7. Implement a relay-connectivity-loop

This was referenced Jul 6, 2023
@chaitanyaprem chaitanyaprem added the enhancement New feature or request label Jul 6, 2023
@chaitanyaprem chaitanyaprem self-assigned this Aug 2, 2023
@fryorcraken fryorcraken added the milestone Tracks a subteam milestone label Aug 3, 2023
@fryorcraken
Copy link
Collaborator

@chaitanyaprem are you happy to update the description of this issue to match this template? waku-org/pm#40

@chaitanyaprem chaitanyaprem changed the title feat: Introduce a basic Peer Manager [Milestone] Introduce a basic Peer Manager Aug 3, 2023
@chaitanyaprem chaitanyaprem changed the title [Milestone] Introduce a basic Peer Manager [Milestone] Basic Peer Management Aug 3, 2023
@chaitanyaprem chaitanyaprem changed the title [Milestone] Basic Peer Management [Milestone] Introduce Peer Management Aug 3, 2023
@chaitanyaprem
Copy link
Collaborator Author

chaitanyaprem commented Aug 4, 2023

Weekly Update

  • achieved: Basic peer management to ensure standard in/out ratio for relay peers.
  • next: add service slots to peer manager

@chaitanyaprem
Copy link
Collaborator Author

@alrevuelta, Was going through the comments again and had noticed this point mentioned by you keep staticnodes connected.

Can you please elaborate on this as i could not understand what needs to be done here? Because as far as i understand, we don't need to maintain any persistent connections towards static nodes unless they are being used constantly. Otherwise it would result in connection timeout and connections can be established on demand.

@alrevuelta
Copy link

@alrevuelta, Was going through the comments again and had noticed this point mentioned by you keep staticnodes connected.

@chaitanyaprem Wouldn't say its an strict requirement but static nodes are nodes that are manually configured from the cli, which are considered as trusted. So it would make sense to stay connected to them.

@alrevuelta
Copy link

alrevuelta commented Aug 8, 2023

@chaitanyaprem Where do you got that from? iirc nwaku doesn't have that implemented. Either way, don't have a strong opinion on this feature. But its something that I have seen in some blockchains.

@chaitanyaprem
Copy link
Collaborator Author

@chaitanyaprem Where do you got that from? iirc nwaku doesn't have that implemented. Either way, don't have a strong opinion on this feature. But its something that I have seen in some blockchains.

You have mentioned here #594 (comment) in the last statement.
Yes, i did not see that implemented in nwaku as well.

As an initial thought, even it made sense to me as well that we need to be connected to nodes that are statically configured(as they are trusted).
But considering the protocols we have in Waku, It did not feel like a need to be connected to statically configured nodes. Some thoughts for the same.

  • store : Would mostly require only when the node is coming up to fetch messages that were missed.
  • filter: the filter peer would automatically establish a connection when it needs to forward any messages that matches a filter subscription
  • Disc nodes: It would be good to not stay connected to them, since the idea is to use these nodes only for initial discovery.

We can have a CLI flag which indicates that we need to maintain persistent connections to specific nodes(probably because they are trusted), in which case we need more mechanism to ensure connections to these trusted nodes should not be pruned and given some sort of priority. I am not sure if there is such a use-case as of now.

@alrevuelta
Copy link

@chaitanyaprem Lets leave it by now, since its not that important.

Note thought that by "keep connected" I mean keep a relay connection (aka using relay protocol) so not really related to store filter, etc.

Note also that ideally "disc nodes" aka bootstrap nodes should not really run any waku protocol but just discv5. In other words, you shouldn't be able to "connect" to a disc node, because it shouldn't support any waku protocol (relay, store, etc). Ofc this is not the case, but its something that imho we should strive for. A disc node (bootstrap node) should just give you other peers, but not provide any other service. (at least imho).

@chaitanyaprem
Copy link
Collaborator Author

Got it, understood. We should ensure relay nodes have connections for all shards.

Agreed that it would be good to have separate discovery nodes so that they only handle discovery traffic. But, practically we will have to see how it is going to be in the network.

@chaitanyaprem
Copy link
Collaborator Author

chaitanyaprem commented Aug 11, 2023

Weekly Update

  • achieved: add service slots to peer manager.
  • next: implement relay connectivity loop, integrate gossipsub scoring for peer disconnections

@chaitanyaprem
Copy link
Collaborator Author

chaitanyaprem commented Aug 18, 2023

Weekly Update

  • achieved: implement relay connectivity loop, log reachability status reported with help of AutoNAT service, local testing using waku simulator and bug fixes
  • next: work towards dogfooding new peer mgmt with Status

@fryorcraken fryorcraken changed the title [Milestone] Introduce Peer Management [Epic] Introduce Peer Management Aug 24, 2023
@fryorcraken fryorcraken removed the milestone Tracks a subteam milestone label Aug 24, 2023
@fryorcraken fryorcraken added the epic Tracks a yearly team epic (only for waku-org/pm repo) label Aug 24, 2023
@chaitanyaprem
Copy link
Collaborator Author

chaitanyaprem commented Aug 25, 2023

Weekly Update

  • achieved: Raised PR in status-go to use this version in order to dogfood. Local testing with status desktop
  • next: Dogfood changes with Status desktop and mobile using Waku CC's

@chaitanyaprem
Copy link
Collaborator Author

Closing this issue as dogfooding has been done in following ways:

  1. Used status desktop and status-mobile to install dev versions of the app and validate relay peer connections are limited.
  2. Status-go team has already included this change and status-mobile team is already testing these changes.

Any issues that would be reported, will be addressed separately

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request epic Tracks a yearly team epic (only for waku-org/pm repo)
Projects
Archived in project
Development

No branches or pull requests

4 participants