Kured v.1.2 fails on Kubernetes v.1.16 (on prem, kubeadm created) #89

Closed
HansK-p opened this issue Sep 22, 2019 · 23 comments

@HansK-p

HansK-p commented Sep 22, 2019

Kured v.1.2 fails on my kubeadm created K8s v.1.16.0 cluster. It seems that this issue has been solved by fix #75, which isn't a part of a release yet.

The error message I received before compiling kured from the latest source was:

time="2019-09-22T19:45:07Z" level=info msg="Blocking Pod Selectors: []"
time="2019-09-22T19:45:07Z" level=fatal msg="Error testing lock: the server could not find the requested resource"

It would be really nice to get a kured release v.1.3, including an updated stable helm chart. This would hopefully make the stable kured helm chart work on K8s v.1.16 clusters without modifications.

I have not used Kured on older Kubernetes versions (yet).

@ujoergen

Hi @HansK-p

Did you get it to run on 1.16?

@lundsec

lundsec commented Sep 27, 2019

Seems to work if you enable this on your API servers:
--runtime-config=extensions/v1beta1/daemonsets=true

see https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG-1.16.md#deprecations-and-removals
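
On a kubeadm-created cluster, one way to set that flag is through the kubeadm configuration instead of editing the API server's static pod manifest by hand. A minimal sketch, assuming the kubeadm `v1beta2` config API; it has not been tested against this exact setup, and it only re-enables the removed API as a stopgap until a fixed kured image is in use:

```yaml
# Sketch only: re-enables DaemonSets under extensions/v1beta1,
# which Kubernetes 1.16 no longer serves by default.
apiVersion: kubeadm.k8s.io/v1beta2
kind: ClusterConfiguration
apiServer:
  extraArgs:
    # Passed to kube-apiserver as
    # --runtime-config=extensions/v1beta1/daemonsets=true
    runtime-config: "extensions/v1beta1/daemonsets=true"
```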

@HansK-p
Author

HansK-p commented Sep 27, 2019

It worked well for me on a kubeadm-created K8s 1.16 cluster after I compiled kured from the latest source code. Kured has rebooted my Kubernetes nodes more than once without any issues related to kured running on K8s v1.16.

@SerialVelocity

@awh Is it possible to get a new release tagged so kured works on K8s 1.16?

@ReSearchITEng

We need both an image update and a helm chart update.
I can take care of the helm chart once the image is ready.

@stealthybox
Contributor

Thanks all.
As a workaround, you should be able to use weaveworks/kured:master-4beddb5.
That should have the AppsV1 change as well as the new --reboot-days feature.

You can specify this image tag as an override for the current helm chart.

Related.
#66 (comment)
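
A values override for the image tag might look like the sketch below. The `image.repository` and `image.tag` keys are assumed from the usual Helm chart layout and should be checked against the chart's own values.yaml; the file name is hypothetical.

```yaml
# kured-values.yaml (hypothetical file name)
# Assumes the chart exposes image.repository and image.tag.
image:
  repository: weaveworks/kured
  tag: master-4beddb5
```

The file could then be passed to Helm with `-f kured-values.yaml` when installing or upgrading the chart.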

@evrardjp
Collaborator

Please also watch #97

@nicolas-marcq

@stealthybox, the latest Docker image seems to be locked.

 Error response from daemon: pull access denied for weaveworks/master-4beddb5, repository does not exist or may require 'docker login': denied: requested access to the resource is denied

@nicolas-marcq

New error with the image kured:master-4beddb5:

 Error: unknown flag: --reboot-days sa,su
Usage:
  kured [flags]
Flags:
      --alert-filter-regexp regexp.Regexp   alert names to ignore when checking for active alerts
      --blocking-pod-selector stringArray   label selector identifying pods whose presence should prevent reboots
      --ds-name string                      name of daemonset on which to place lock (default "kured")
      --ds-namespace string                 namespace containing daemonset on which to place lock (default "kube-system")
      --end-time string                     schedule reboot only before this time of day (default "23:59:59")
  -h, --help                                help for kured
      --lock-annotation string              annotation in which to record locking node (default "weave.works/kured-node-lock")
      --period duration                     reboot check period (default 1h0m0s)
      --prometheus-url string               Prometheus instance to probe for active alerts
      --reboot-days strings                 schedule reboot on these days (default [su,mo,tu,we,th,fr,sa])
      --reboot-sentinel string              path to file whose existence signals need to reboot (default "/var/run/reboot-required")
      --slack-hook-url string               slack hook URL for reboot notfications
      --slack-username string               slack username for reboot notfications (default "kured")
      --start-time string                   schedule reboot only after this time of day (default "0:00")
      --time-zone string                    use this timezone for schedule inputs (default "UTC")
time="2019-11-20T16:06:37Z" level=fatal msg="unknown flag: --reboot-days sa,su" 

@nicolas-marcq

OK, it seems that we cannot use flags as described in the doc.
This doesn't work:

--reboot-days sat,sun

This works:

--reboot-days=sat,sun

@nicolas-marcq

nicolas-marcq commented Nov 20, 2019

On Kubernetes 1.14, I had to add a permission to the ClusterRole:

- apiGroups: ["apps"]
  resources: ["daemonsets"]
  verbs:     ["get"]

@SerialVelocity

@nicolas-marcq you can use flags as described in the doc. It looks like you have done the equivalent of passing "--reboot-days sat,sun" as a single quoted argument. I'm assuming this is because you are configuring it through a Kubernetes manifest, and instead of:

 - --reboot-days
 - sat,sun

you did:

  - --reboot-days sat,sun
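
To make the difference concrete, here is a sketch of a container spec fragment. Each YAML list item reaches kured as exactly one argument, so the flag and its value must either be split into two items or joined with `=`. The container name and the use of `args` are illustrative assumptions, not copied from the kured manifest.

```yaml
containers:
  - name: kured                        # illustrative name
    image: weaveworks/kured:master-4beddb5
    args:
      # Works: flag and value as two separate arguments.
      - --reboot-days
      - sat,sun
      # Also works (use instead of the two items above):
      # one argument joined with "=".
      # - --reboot-days=sat,sun
      # Fails: a single argument containing a space, which kured
      # rejects with "unknown flag: --reboot-days sat,sun".
      # - --reboot-days sat,sun
```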

@nicolas-marcq

You are right @SerialVelocity

@archmangler

This does not seem to be working in 1.2.0, even with the right YAML syntax. See issue #101.

@SerialVelocity

@archmangler Why would it work with 1.2.0? It says above you need to use weaveworks/kured:master-4beddb5

dholbach mentioned this issue Feb 4, 2020
dholbach added this to the 1.3.0 milestone Feb 4, 2020
@dholbach
Member

dholbach commented Feb 5, 2020

I just tested this with a 2-node cluster with kubeadm 1.16.2 and kured (master-f6e4062).

Any more feedback from other testers? Please speak up, as we'd like to get 1.3.0 out some time soon.

@onedr0p

onedr0p commented Feb 5, 2020

@dholbach I am currently running Kubernetes v1.17.2. Do you foresee any issues? In any case, I'll give this a go sometime.

@dholbach
Member

dholbach commented Feb 5, 2020

@onedr0p I know people have tested this already and gave their 👍, but as you'd run this from master (and not a released version of kured), maybe better to check on a test cluster first?

@HansK-p
Author

HansK-p commented Feb 5, 2020

I've been running weaveworks/kured:master-f6e4062 on a v1.17.0 cluster for a few days with start-time and end-time set.

Things seem to be working as they should, but so far there has been no reason to reboot, so it's not really much of a test yet. At least all logs from Kured so far indicate that Kured is working as it should in a kubeadm-configured K8s v1.17.0 cluster.

@onedr0p

onedr0p commented Feb 6, 2020

Same as @HansK-p's post. I have it deployed and the logs don't indicate an issue. Won't know for sure until a reboot is triggered by a security update.

@dholbach
Member

I did some testing for #111 following the instructions in #112 and it looks like we're good. I'll close this issue now. Please reopen if your tests indicate anything else.

@dholbach
Member

1.3.0 is out now: https://github.com/weaveworks/kured/releases/tag/1.3.0 🎆

@evrardjp
Collaborator

awesome thanks!
