Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Prevent accidental change of network-key for active authorities #3852

Merged
merged 28 commits into from
Apr 15, 2024

Conversation

alexggh
Copy link
Contributor

@alexggh alexggh commented Mar 27, 2024

As discovered during investigation of #3314 and #3673 there are active validators which accidentally might change their network key during restart, that's not a safe operation when you are in the active set because of distributed nature of DHT, so the old records would still exist in the network until they expire 36h, so unless they have a good reason validators should avoid changing their key when they restart their nodes.

There is an effort in parallel to improve this situation #3786, but those changes are way more intrusive and will need more rigorous testing, additionally they will reduce the time to less than 36h, but the propagation won't be instant anyway, so not changing your network during restart should be the safest way to run your node, unless you have a really good reason to change it.

Proposal

  1. Do not auto-generate the network if the network file does not exist in the provided path. Nodes where the key file does not exist will get the following error:
Error: 
   0: Starting an authorithy without network key in /home/alexggh/.local/share/polkadot/chains/ksmcc3/network/secret_ed25519.
      
       This is not a safe operation because the old identity still lives in the dht for 36 hours.
      
       Because of it your node might suffer from not being properly connected to other nodes for validation purposes.
      
       If it is the first time running your node you could use one of the following methods.
      
       1. Pass --unsafe-force-node-key-generation and make sure you remove it for subsequent node restarts
      
       2. Separetly generate the key with: polkadot key generate-node-key --file <YOUR_PATH_TO_NODE_KEY>
  1. Add an explicit parameters for nodes that do want to change their network despite the warnings or if they run the node for the first time. --unsafe-force-node-key-generation

  2. For polkadot key generate-node-key add two new mutually exclusive parameters base_path and default_base_path to help with the key generation in the same path the polkadot main command would expect it.

  3. Modify the installation scripts to auto-generate a key in default path if one was not present already there, this should help with making the executable work out of the box after an instalation.

Notes

Nodes that do not have already the key persisted will fail to start after this change, however I do consider that better than the current situation where they start but they silently hide that they might not be properly connected to their peers.

TODO

  • Make sure only nodes that are authorities on producation chains will be affected by this restrictions.
  • Proper PRDOC, to make sure node operators are aware this is coming.

@s0me0ne-unkn0wn
Copy link
Contributor

As it makes the life of first-time node starters harder, could we include the node key generation logic in the post-install pipeline of the .deb package and Docker image?

Signed-off-by: Alexandru Gheorghe <[email protected]>
Signed-off-by: Alexandru Gheorghe <[email protected]>
Signed-off-by: Alexandru Gheorghe <[email protected]>
Signed-off-by: Alexandru Gheorghe <[email protected]>
@alexggh
Copy link
Contributor Author

alexggh commented Mar 27, 2024

As it makes the life of first-time node starters harder, could we include the node key generation logic in the post-install pipeline of the .deb package and Docker image?

Done, although, I'm still concerned if it is wise to generate automatically keys at install, I've seen pipelines where part of producing your docker image for deployment you install all the required tooling, so in this case when operators would produce their v2 of the image they will get key on the filesystem and if they use the same image to run multiple nodes they might get into trouble because they would use the same key.

polkadot/.rpm/polkadot.spec Outdated Show resolved Hide resolved
polkadot/scripts/packaging/deb-maintainer-scripts/postinst Outdated Show resolved Hide resolved
@alexggh alexggh changed the title Prevent accidental change of network-key for active authorities [DNM] Prevent accidental change of network-key for active authorities Mar 28, 2024
@alexggh
Copy link
Contributor Author

alexggh commented Mar 28, 2024

@paritytech/release-engineering could you help me test the generated Deb and Rpm files after this PR. Thank you!

Signed-off-by: Alexandru Gheorghe <[email protected]>
Signed-off-by: Alexandru Gheorghe <[email protected]>
@alexggh alexggh changed the title [DNM] Prevent accidental change of network-key for active authorities Prevent accidental change of network-key for active authorities Mar 28, 2024
Copy link
Member

@bkchr bkchr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just misses a prdoc for Node operators, otherwise looks good.

Ty!

substrate/client/cli/src/error.rs Outdated Show resolved Hide resolved
alexggh and others added 2 commits April 11, 2024 09:30
Signed-off-by: Alexandru Gheorghe <[email protected]>
@alexggh
Copy link
Contributor Author

alexggh commented Apr 11, 2024

Just misses a prdoc for Node operators, otherwise looks good.

Ty!

Done, merging ...

@alexggh alexggh enabled auto-merge April 11, 2024 15:34
@alexggh alexggh disabled auto-merge April 11, 2024 15:35
@alexggh alexggh added this pull request to the merge queue Apr 12, 2024
@github-merge-queue github-merge-queue bot removed this pull request from the merge queue due to no response for status checks Apr 12, 2024
@alexggh alexggh enabled auto-merge April 12, 2024 14:03
@alexggh alexggh added this pull request to the merge queue Apr 15, 2024
Merged via the queue into master with commit 2bc4ed1 Apr 15, 2024
132 of 136 checks passed
@alexggh alexggh deleted the alexaggh/node_identity branch April 15, 2024 06:45
alexggh added a commit that referenced this pull request Jul 5, 2024
Because of https://github.com/paritytech/polkadot-sdk/blob/299aacb56f4f11127b194d12692b00066e91ac92/cumulus/client/cli/src/lib.rs#L318
#3852 wrongly enforces
that the network key is present for collators started with
polkadot-parachain binary, instead of generating it
on the fly.

That is not necessary and created some small friction for
collators, since they have to pass `--unsafe-force-node-key-generation`
or generate the key with the `generate-node-key` command.

This PR fixes that since the change was intended to apply
only for relaychain authorithies.

Signed-off-by: Alexandru Gheorghe <[email protected]>
alexggh added a commit that referenced this pull request Jul 5, 2024
Because of https://github.com/paritytech/polkadot-sdk/blob/299aacb56f4f11127b194d12692b00066e91ac92/cumulus/client/cli/src/lib.rs#L318
#3852 wrongly enforces
that the network key is present for collators started with
polkadot-parachain binary, instead of generating it
on the fly.

That is not necessary and created some small friction for
collators, since they have to pass `--unsafe-force-node-key-generation`
or generate the key with the `generate-node-key` command.

This PR fixes that since the change was intended to apply
only for relaychain authorithies.

Signed-off-by: Alexandru Gheorghe <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
T0-node This PR/Issue is related to the topic “node”.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants