Missing rebootSentinelCommand leads to reboot loop #976

Open
davidhrbac opened this issue Sep 6, 2024 · 1 comment

@davidhrbac

We have a working deployment of Kured, provisioned on Rancher clusters running Rocky Linux.

We deploy Kured with Fleet like this:

defaultNamespace: kured
helm:
  releaseName: kured
  chart: kured
  repo: https://kubereboot.github.io/charts
  values:
    configuration:
      rebootSentinelCommand: sh -c "! needs-restarting --reboothint"
      useRebootSentinelHostPath: false
      annotateNodes: true

It seems to me that if the binary used by rebootSentinelCommand is missing on the node (needs-restarting in our environment), Kured ends up in a reboot loop.

Kured log:

2024-09-06T08:50:58.902005537Z time="2024-09-06T08:50:58Z" level=warning msg="sh: line 1: needs-restarting: command not found" cmd=/usr/bin/nsenter std=err
2024-09-06 11:51:13 time="2024-09-06T08:51:13Z" level=info msg="Running command: [/usr/bin/nsenter -m/proc/1/ns/mnt -- /bin/systemctl reboot] for node: d2-pool1-fc962ed3-6m7j4"
2024-09-06 11:51:13 time="2024-09-06T08:51:13Z" level=info msg="Waiting for reboot

Kured was installed on the node on Wed Sep 4. The node has been rebooting continuously since then:

docker   pts/0        x.x.x.x     Fri Sep  6 08:47   still logged in
reboot   system boot  5.14.0-362.18.1. Fri Sep  6 07:50   still running
reboot   system boot  5.14.0-362.18.1. Fri Sep  6 06:52 - 07:49  (00:56)
reboot   system boot  5.14.0-362.18.1. Fri Sep  6 04:53 - 06:51  (01:58)
reboot   system boot  5.14.0-362.18.1. Fri Sep  6 03:08 - 04:51  (01:43)
reboot   system boot  5.14.0-362.18.1. Fri Sep  6 02:23 - 03:06  (00:43)
reboot   system boot  5.14.0-362.18.1. Fri Sep  6 00:50 - 02:21  (01:31)
reboot   system boot  5.14.0-362.18.1. Thu Sep  5 22:37 - 00:48  (02:10)
reboot   system boot  5.14.0-362.18.1. Thu Sep  5 21:04 - 22:35  (01:30)
reboot   system boot  5.14.0-362.18.1. Thu Sep  5 20:28 - 21:03  (00:35)
reboot   system boot  5.14.0-362.18.1. Thu Sep  5 19:21 - 20:26  (01:05)
reboot   system boot  5.14.0-362.18.1. Thu Sep  5 18:29 - 19:19  (00:49)
reboot   system boot  5.14.0-362.18.1. Thu Sep  5 17:27 - 18:27  (00:59)
reboot   system boot  5.14.0-362.18.1. Thu Sep  5 16:11 - 17:26  (01:15)
reboot   system boot  5.14.0-362.18.1. Thu Sep  5 14:34 - 16:09  (01:35)
reboot   system boot  5.14.0-362.18.1. Thu Sep  5 13:45 - 14:32  (00:47)
reboot   system boot  5.14.0-362.18.1. Thu Sep  5 11:40 - 13:43  (02:02)
reboot   system boot  5.14.0-362.18.1. Thu Sep  5 10:17 - 11:39  (01:21)
reboot   system boot  5.14.0-362.18.1. Thu Sep  5 09:31 - 10:16  (00:44)
docker   pts/0        x.x.x.x     Thu Sep  5 08:08 - 08:09  (00:01)
docker   pts/0        x.x.x.x     Thu Sep  5 07:31 - 07:31  (00:00)
reboot   system boot  5.14.0-362.18.1. Thu Sep  5 06:19 - 09:29  (03:10)
reboot   system boot  5.14.0-362.18.1. Thu Sep  5 05:22 - 06:17  (00:55)
reboot   system boot  5.14.0-362.18.1. Thu Sep  5 04:02 - 05:20  (01:18)
reboot   system boot  5.14.0-362.18.1. Thu Sep  5 02:43 - 04:00  (01:16)
reboot   system boot  5.14.0-362.18.1. Thu Sep  5 01:33 - 02:41  (01:08)
reboot   system boot  5.14.0-362.18.1. Thu Sep  5 00:06 - 01:32  (01:25)
reboot   system boot  5.14.0-362.18.1. Wed Sep  4 22:29 - 00:04  (01:35)
reboot   system boot  5.14.0-362.18.1. Wed Sep  4 21:32 - 22:27  (00:55)
reboot   system boot  5.14.0-362.18.1. Wed Sep  4 20:14 - 21:30  (01:16)
reboot   system boot  5.14.0-362.18.1. Wed Sep  4 18:40 - 20:11  (01:30)
reboot   system boot  5.14.0-362.18.1. Wed Sep  4 16:07 - 18:39  (02:31)
reboot   system boot  5.14.0-362.18.1. Wed Sep  4 14:42 - 16:05  (01:22)
reboot   system boot  5.14.0-362.18.1. Wed Sep  4 13:12 - 14:41  (01:28)
reboot   system boot  5.14.0-362.18.1. Wed Mar 13 06:56 - 13:08 (175+06:11)
reboot   system boot  5.14.0-362.18.1. Tue Mar 12 22:13 - 22:14  (00:01)

@evrardjp
Collaborator

When needs-restarting is missing, sh -c "! needs-restarting --reboothint" returns 0: the shell's "command not found" failure is negated by the leading !.

In the past, we ran test -f /var/run/reboot-required. Success of that command (rc=0) meant that a reboot was required. If the return code was non-zero, no reboot happened.

If you make your command fail with a non-zero return code when needs-restarting is missing, kured will behave properly.
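
To make the return-code behaviour concrete, here is a minimal sketch that reproduces the effect without needs-restarting installed. The command name kured-demo-no-such-command is hypothetical and only stands in for a missing binary. The stricter variant assumes that needs-restarting --reboothint exits with 1 when a reboot is needed, which the original negated command already relies on. It is one possible pattern, not something confirmed by the kured maintainers.

```shell
# Hypothetical name standing in for a binary that is not installed on the node.
missing=kured-demo-no-such-command

# Original pattern: "command not found" (rc=127) is negated by "!" to rc=0,
# which kured reads as "reboot required".
negated_rc=0
sh -c "! $missing --reboothint" 2>/dev/null || negated_rc=$?

# Stricter pattern: succeed only when the check itself exits with exactly 1
# (assumed to be the "reboot needed" exit code of needs-restarting --reboothint).
strict_rc=0
sh -c "$missing --reboothint; [ \$? -eq 1 ]" 2>/dev/null || strict_rc=$?

echo "negated_rc=$negated_rc strict_rc=$strict_rc"
```

With the missing binary, the negated form reports success while the strict form reports failure. Applied to the deployment above, the value might look like rebootSentinelCommand: sh -c "needs-restarting --reboothint; [ $? -eq 1 ]", though that exact string is untested here and its quoting should be checked against how the chart and kured pass the command.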
