-
Notifications
You must be signed in to change notification settings - Fork 959
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Have a sharded Kademlia protocol #1087
Comments
Relevant, as we could also use this "get_providers" system if it is improved: libp2p/specs#163 |
A priori, I'd think a namespace might be another hash function mapping from keys to storage nodes. In principle, any node or key could select as many namespaces as desired. And the default namespace itself might be optional. If too few nodes participate in a namespace then look ups within that namespace become slow, due to the first lookup failing, or fail outright, if the stored or queried key does not enable the default namespace or reduces replicas there. We've other corner cases when namespaces have few nodes too. Anything like this requires more thought. |
What I was thinking with namespaces is that a node is stored in the DHT at index For example, you'd store something like |
I'd think any performance improvement would require that storage nodes must opt into namespaces they like though, right? |
As an aside, there is sometimes a need to validate record placed into the DHT, which means some record type and validation logic too. |
We do not really have good answers here right now. We currently think: Polkadot should not run on the same network as IPFS, etc. We see no reason to hurt performance just to be on the same network. We also want the freedom to adopt different or nicer crypto then the support, including maybe add a post-quantum handshake, and maybe break other things. We do not yet know if Kademlia or another scheme like Chord actually makes more sense. @FatemeShirazi We're interested in a fibered or namespaced design to improve connectivity within parachains, but doing this naively risks parachains loosing connectivity due to too few nodes supporting the DHT. We need much "session" key material to appear on chain. Yet, we never quite diagrammed out what happens when nodes crash, etc. If a node crashes, should its BABE and/or GRANDPA keys die, and thus require it wait an hour or so before it can come back onlilne? I think not, well even if you want to armor those keys, then you could do so by keeping them in SGX. At the same time a node's IP address might change mid epoch, so that at least needs to be flexible. |
Abstractly, there is a desire for node discovery that is targeted at a specific subset of nodes in the DHT, whereby this subset is associated with some identifier. That is, knowing such an identifier for a subset of nodes, a particular node wants to discover other peers in the DHT within this subset. Concretely, in libp2p#1087, it is mentioned that a validator in Polkadot wants to discover (at least some of) the nodes of a specific parachain it is (temporarily) assigned to, and be able to do so in a targeted manner, i.e. without randomly walking the DHT until it finds some. In that context, the validator knows about an identifier for the parachain and the general idea described in that issue seems to be around ways to tag or group nodes of a parachain together in the DHT based on such an identifier. To that end, this is an implementation that generally permits keys in the Kademlia DHT to be assigned prefixes, corresponding to such an identifier / namespace shared by multiple nodes and thus by multiple DHT keys.
Abstractly, there is a desire for node discovery that is targeted at a specific subset of nodes in the DHT, whereby this subset is associated with some identifier. That is, knowing such an identifier for a subset of nodes, a particular node wants to discover other peers in the DHT within this subset. Concretely, in libp2p#1087, it is mentioned that a validator in Polkadot wants to discover (at least some of) the nodes of a specific parachain it is (temporarily) assigned to, and be able to do so in a targeted manner, i.e. without randomly walking the DHT until it finds some. In that context, the validator knows about an identifier for the parachain and the general idea described in that issue seems to be around ways to tag or group nodes of a parachain together in the DHT based on such an identifier. To that end, this is an implementation that generally permits keys in the Kademlia DHT to be assigned prefixes, corresponding to such an identifier / namespace shared by multiple nodes and thus by multiple DHT keys.
Abstractly, there is a desire for node discovery that is targeted at a specific subset of nodes in the DHT, whereby this subset is associated with some identifier. That is, knowing such an identifier for a subset of nodes, a particular node wants to discover other peers in the DHT within this subset. Concretely, in libp2p#1087, it is mentioned that a validator in Polkadot wants to discover (at least some of) the nodes of a specific parachain it is (temporarily) assigned to, and be able to do so in a targeted manner, i.e. without randomly walking the DHT until it finds some. In that context, the validator knows about an identifier for the parachain and the general idea described in that issue seems to be around ways to tag or group nodes of a parachain together in the DHT based on such an identifier. To that end, this is an implementation that generally permits keys in the Kademlia DHT to be assigned prefixes, corresponding to such an identifier / namespace shared by multiple nodes and thus by multiple DHT keys.
Crosslinking Protocol-Lab's RFP for Multi-Level DHT Design and Evaluation here. |
I think this should be solved on the specs level, closing here. |
(note: this issue is about designing a protocol, not a bugfix).
There is no way, from Kademlia itself, to know if a node is IPFS, Polkadot, Edgeware, or something else.
This results is Substrate/Polkadot trying a lot of wrong nodes, then disconnecting from them.
Even if, say, Polkadot's network is totally separate from Edgeware's and IPFS's networks, all the Polkadot parachains would be part of a single network, and we will eventually need a way to discover the nodes/collators which are part of a specific parachain, for validators to be able to connect to them.
In #942 (which is more or less abandoned), I went with a "namespaces" system. Each node belongs to a namespace, this namespace is part of the identity of the node. I'm not sure if that's a good solution.
Ethereum 2.0 obviously has a similar issue with Sharding. Since I opened #942, they created discovery v5 (https://github.com/fjl/p2p-drafts), which I haven't read so far.
The text was updated successfully, but these errors were encountered: