Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

kad: improve FIND_NODE response definition #609

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

guillaumemichel
Copy link
Contributor

Context

The Kademlia spec doesn't explicitly define how a node N should answer to the request FIND_NODE(N). The FIND_NODE RPC is expected to return the (k) closest nodes to the requested key.

The libp2p Kademlia DHT offers the following types of operations:

  • Peer routing
    • Finding the closest nodes to a given key via FIND_NODE.
  1. Upon a response:
    1. If successful the response will contain the k closest nodes the peer knows to the key Key. Add them to the candidate list Pn, except for those that have already been queried.

Implementations have diverse behaviors and expectations, which can lead to poor interoperability (see libp2p/go-libp2p-kad-dht#966).

None of these implementations actually follows the spec, so they should be adapted to return the k closest peers.

Proposal

  1. Nodes should not return their own peer record if they are among the k known closest peers to the requested key.
    • Peers sending the request FIND_NODE(N) to N already have N's peer record. Even though the peer record sent by N could include additional multiaddresses that the requestor isn't aware of yet, this is the role of Identify, not the DHT.
  2. FIND_NODE response should never include the requester among the closer peers.
    • This information is useless.

@SgtPooki
Copy link
Member

Probably some dumb questions. Wouldnt it make sense for the requestor to filter responses rather than peers not providing accurate responses?

If im asking for the closest N nodes to a key, a node adding itself to the response (if it is one of the closest), or adding the requestor, is more truthful than not.

Is find-node only used when looking for peers to add provider records to or are there other uses? how are the responses currently processed for all find-node calls? Why wouldnt the requestor filter responses as necessary? They are already keeping a set of queried nodes.

Can we guarantee that a prior find-node call that returned the current requestee was already processed?

@guillaumemichel
Copy link
Contributor Author

If im asking for the closest N nodes to a key, a node adding itself to the response (if it is one of the closest), or adding the requestor, is more truthful than not.

Is find-node only used when looking for peers to add provider records to or are there other uses? how are the responses currently processed for all find-node calls? Why wouldnt the requestor filter responses as necessary? They are already keeping a set of queried nodes.

Can we guarantee that a prior find-node call that returned the current requestee was already processed?

I think that in most implementations the requestor is already filtering the response, which is good.

Not sending these two peers is about saving space on the wire. A response including either one of these peers will only contain useful information about k-1 peers.

FIND_NODE is used to find peers that are close to the requested key, this has many applications such as finding a specific node, looking for the closest peers to store a provider record/IPNS record, etc.

@dennis-tra
Copy link

First of all, I think it's indeed not spec-compliant to only return a single record when the spec says to always return the k closest peers to a key and I'm favour of calling out these special cases here.

In my mental model, a peer is always in its own 255th bucket so it must be part of the response set. This means I'm siding with @SgtPooki here when he says:

If im asking for the closest N nodes to a key, a node adding itself to the response (if it is one of the closest), or adding the requestor, is more truthful than not.


However, I think the most important consideration is if any of the options (including the peers or not) has an impact on the correct operation of the protocol. I would argue, in terms of routing soundness it doesn't matter, right? If it does then we would have a clear winner.


Some references I have found from git blame:

@sukunrt
Copy link
Member

sukunrt commented Apr 2, 2024

Apologies if this is off topic or my understanding is completely incorrect.

  1. I think the do not tell the peer about itself logic is broken.
closest = dht.betterPeersToQuery(pmes, from, dht.bucketSize)

// Never tell a peer about itself.
if targetPid != from {
	// Add the target peer to the set of closest peers if
	// not already present in our routing table.
	//
	// Later, when we lookup known addresses for all peers
	// in this set, we'll prune this peer if we don't
	// _actually_ know where it is.
	found := false
	for _, p := range closest {
		if targetPid == p {
			found = true
			break
		}
	}
	if !found {
		closest = append(closest, targetPid)
	}
}

if the requester is in the server's routing table, we will tell the peer about itself because it is in the closest slice.

  1. Why do we return the targetPid if it wasn't found in the routing table?

@guillaumemichel
Copy link
Contributor Author

@sukunrt these issues are specific to go-libp2p-kad-dht.

  1. I think the do not tell the peer about itself logic is broken.

dht.betterPeersToQuery never returns itself, nor from. Because in the current implementation, a node is expected to return its own peerid if it receives a FIND_NODE request for itself, the node will add its own peerid to the list.

  1. Why do we return the targetPid if it wasn't found in the routing table?

It is possible that we are connected to a node, but it isn't in the routing table (e.g because there are better candidates). Adding targetPid to the list allows us to return the target peer record, in this special situation.

@achingbrain
Copy link
Member

achingbrain commented Apr 29, 2024

This suggestion in combination with libp2p/go-libp2p-kad-dht#820 makes it really hard to bootstrap a DHT from scratch.

Since initially no nodes have any peers in their routing tables, the PR above means go-libp2p-kad-dht will only add a peer to it's routing table that responds with 1+ peers to a query for it's own PeerId, and since no nodes will respond with their own Peer Info to a query for their own PeerID, then no peer will ever be added to the routing table of a go-libp2p-kad-dht node unless that peer already has DHT peers.

This is fine if you're joining an already established DHT but makes creating one almost impossible, unless I'm missing something.

I think maybe you include your own PeerInfo in the response to a request for your own PeerId. Granted it's not the most efficient thing to do, but arguably querying a peer for it's own PeerId doesn't make much sense, unless you're doing it as an effective no-op as go-libp2p-kad-dht does.

@guillaumemichel
Copy link
Contributor Author

Good point @achingbrain!

I think that the lookupCheck introduced in libp2p/go-libp2p-kad-dht#820 is responsible of the problem you are describing. A node shouldn't be picky about peers it adds to its routing table, when its own routing table isn't well populated.

I just opened a PR to address this libp2p/go-libp2p-kad-dht#970

arguably querying a peer for it's own PeerId doesn't make much sense

An example where this query makes sense could be a reputation system where peers gets a grade from other network participants. The grade would be stored on the n closest peers to the target node. We don't want to trust the node itself to store its own grades. In order to give or retrieve the grade associated with a peer, FIND_NODE(peer) should return the n closest peers.

@achingbrain
Copy link
Member

Yes, my point was from the point of view of a "find closer peers"-type query it doesn't make sense, one where you want to find out more information about the network, find out who to dial next, etc, not that there was no reason to do it. There are plenty of reasons to do it as we both point out. 😄

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Status: Triage
Development

Successfully merging this pull request may close these issues.

5 participants