[CPDEV-103936] Fix prepare_dns_etc_hosts for work with unavailable nodes #680

Merged
merged 6 commits into from
Aug 5, 2024

Conversation

n549
Collaborator

@n549 n549 commented Jul 31, 2024

Description

When one or more nodes are unavailable, the add_node and remove_node procedures have to be run with the prepare.dns.etc_hosts or update.etc_hosts task skipped (see PR).
If you then try to add another node, add_node fails with an error like:

Temporary failure in name resolution': ['some_node_fqdn']}

Fixes CPDEV-103936

Solution

  1. A new boolean parameter, globals.ignore_unavailable_nodes_for_etchosts_update, is added to cluster.yaml with the default value false.
    system_prepare_dns_etc_hosts is adjusted so that it works as before when globals.ignore_unavailable_nodes_for_etchosts_update is false, and updates /etc/hosts only on the available nodes when it is set to true.

  2. IN is updated with a description of the new variable.
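
The intended behavior of the flag can be illustrated with a minimal sketch. This is not the code from this PR: the `Node` class, the `select_etc_hosts_targets` function, and the node names are hypothetical and only model the rule described above (fail by default if any node is unavailable, otherwise update only the reachable nodes).

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Node:
    # Hypothetical stand-in for a cluster node; not a Kubemarine class.
    name: str
    available: bool


def select_etc_hosts_targets(nodes: List[Node],
                             ignore_unavailable: bool = False) -> List[Node]:
    """Return the nodes whose /etc/hosts should be updated.

    `ignore_unavailable` models
    globals.ignore_unavailable_nodes_for_etchosts_update.
    """
    unavailable = [n.name for n in nodes if not n.available]
    if unavailable and not ignore_unavailable:
        # Default behavior: every node must be reachable.
        raise RuntimeError(f"Unavailable nodes: {unavailable}")
    # Flag enabled (or no node is down): update only the reachable nodes.
    return [n for n in nodes if n.available]


nodes = [
    Node("master-1", True),
    Node("worker-1", False),  # the disabled node
    Node("worker-2", True),
]
targets = select_etc_hosts_targets(nodes, ignore_unavailable=True)
print([n.name for n in targets])
```

With the flag left at its default, the same call raises an error as soon as one node is unavailable, which corresponds to the previous behavior.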

How to apply

Test Cases

TestCase 1
step 1) Set globals.ignore_unavailable_nodes_for_etchosts_update to true in cluster.yaml, disable one of the nodes in the cluster, and try to add a new node with the add_node procedure.
ER: The new node is added successfully.
step 2) With the node still disabled and globals.ignore_unavailable_nodes_for_etchosts_update set to true, try to add another new node with the add_node procedure.
ER: The new node is added successfully.

TestCase 2
When a node is disabled in the cluster and globals.ignore_unavailable_nodes_for_etchosts_update is set to true, remove another node with the remove_node procedure.
ER: The node is removed successfully.

TestCase 3
Remove the globals.ignore_unavailable_nodes_for_etchosts_update variable from cluster.yaml and try to add a node when at least one node is unavailable.
ER: add_node fails because not all the nodes are available.

TestCase 4
Remove the globals.ignore_unavailable_nodes_for_etchosts_update variable from cluster.yaml and try to add a node when all the nodes are available.
ER: add_node finishes successfully.

Checklist

  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • Integration CI passed
  • Unit tests. If yes, list new/changed tests with a brief description
  • There are no merge conflicts

Unit tests

Indicate new or changed unit tests and what they do, if any.

@n549 n549 added the bug Something isn't working label Jul 31, 2024
@ilia1243
Contributor

The reason it was previously implemented this way: #628 (comment)

@n549 n549 marked this pull request as draft August 1, 2024 09:16
@n549 n549 marked this pull request as ready for review August 2, 2024 10:50
@n549 n549 requested a review from ilia1243 August 2, 2024 10:50
@koryaga koryaga merged commit b05ba5d into main Aug 5, 2024
42 checks passed
@koryaga koryaga deleted the fix/prepare_dns_etc_hosts branch August 5, 2024 09:20
4 participants