fix: don’t hold node lock if reboot is blocked #819

Merged Aug 17, 2023 (1 commit)

Conversation

jackfrancis
Collaborator

@jackfrancis jackfrancis commented Aug 16, 2023

Fixes #792

This PR moves the "is this node blocked for reboot?" check earlier in the main "does this node need to be rebooted?" loop. The goal is to determine whether the node is blocked for reboot before trying to acquire the node lock, since holding that lock prevents other nodes in the cluster from being rebooted.

The purpose of blocking a node from rebooting is to implement node-specific safety checks prior to taking a node offline (an obvious side-effect of a reboot). These safety checks should be independent of the configuration that limits how many nodes may go offline at any time (implemented via the node reboot concurrency config + the daemonset lock). That is why I think we should simply check for a blocking condition before attempting to acquire a lock. If the node is blocked for reboot, we short-circuit the loop and try again later, as we do for other conditional outcomes that don't allow us to make forward progress.
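
To illustrate the ordering described above, here is a minimal sketch of one loop iteration. It is not the code from this PR: `tick`, `Lock`, `memLock`, and the `Outcome` values are hypothetical names chosen for illustration, and the in-memory lock only stands in for the daemonset lock.

```go
package main

// Lock is a hypothetical stand-in for the cluster-wide reboot lock.
// Acquire reports whether nodeID now holds the lock.
type Lock interface {
	Acquire(nodeID string) (bool, error)
}

// Outcome describes how a single loop iteration ended (illustrative only).
type Outcome int

const (
	NoRebootNeeded Outcome = iota
	Blocked
	LockUnavailable
	Rebooting
)

// tick runs one iteration of a "does this node need to be rebooted?" loop.
// The node-specific blocking check runs BEFORE any attempt to take the lock,
// so a blocked node never holds the lock and never stalls other nodes.
func tick(nodeID string, rebootRequired, rebootBlocked func() bool, lock Lock, reboot func()) (Outcome, error) {
	if !rebootRequired() {
		return NoRebootNeeded, nil
	}
	// Short-circuit: blocked nodes simply try again on the next iteration.
	if rebootBlocked() {
		return Blocked, nil
	}
	acquired, err := lock.Acquire(nodeID)
	if err != nil {
		return LockUnavailable, err
	}
	if !acquired {
		return LockUnavailable, nil
	}
	reboot()
	return Rebooting, nil
}

// memLock is a toy single-holder lock used only to exercise tick.
type memLock struct{ holder string }

// Acquire grants the lock if it is free or already held by nodeID.
func (l *memLock) Acquire(nodeID string) (bool, error) {
	if l.holder != "" && l.holder != nodeID {
		return false, nil
	}
	l.holder = nodeID
	return true, nil
}
```

With this ordering, calling `tick` for a node whose blocking check returns true leaves `memLock.holder` empty, so a second node that is not blocked can still acquire the lock and reboot.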

@ckotzbauer ckotzbauer merged commit 8bc66c9 into kubereboot:main Aug 17, 2023
14 checks passed
@ckotzbauer ckotzbauer added this to the 1.14.0 milestone Aug 17, 2023
@ckotzbauer ckotzbauer mentioned this pull request Aug 19, 2023
Development

Successfully merging this pull request may close these issues.

Lock is not released if reboot of node is blocked
2 participants