KubeOne continues despite failed healthz check #3012
Labels
kind/bug
sig/cluster-management
What happened?
I'm trying to set up a K1 cluster in a new environment. I'm still fine-tuning the infra, particularly the firewalls (OpenStack security groups). Access to the CP nodes through the LB was broken when I ran kubeone apply to install a Kubernetes cluster.
Luckily, K1 seems to run a healthz check before trying to do anything on the cluster. Unluckily, after the healthz check fails, it just continues anyway. And then it fails to create a resource but still keeps on going.
Also, fixing the firewall issue and letting kubeone apply run again wasn't successful; I had to replace the VMs and start fresh.
Expected behavior
K1 notices that the healthz check fails and doesn't continue. K1 notices that creating a resource failed and doesn't continue.
Also, K1 should probably be able to recover from this in a subsequent run.
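To make the expectation concrete, here is a minimal Python sketch of fail-fast step handling. It is not KubeOne's actual code (KubeOne is written in Go); `run_steps` and `StepFailed` are hypothetical names used only for illustration.

```python
from typing import Callable, List, Tuple


class StepFailed(Exception):
    """Raised when a step fails and the whole run must stop."""


def run_steps(steps: List[Tuple[str, Callable[[], None]]]) -> List[str]:
    """Run steps in order and abort on the first failure.

    Returns the names of the steps that completed. No later step is
    started once one step has raised an exception.
    """
    completed: List[str] = []
    for name, step in steps:
        try:
            step()
        except Exception as err:
            # Stop here instead of logging the error and carrying on.
            raise StepFailed(f"step {name!r} failed: {err}") from err
        completed.append(name)
    return completed
```

With this shape, a failing healthz step would prevent the resource-creation step from ever starting, and the error names the step that stopped the run.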
How to reproduce the issue?
Yeah, that's not going to be easy, I guess. As I described above, I had a custom TF-based OpenStack setup. Everything worked as expected, except accessing port 6443 through the LB. I think the LB accepted the connection, but the connection between LB and VM was blocked. Access to port 6443 on the VMs directly, without going through the LB, worked.
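A plain TCP connect would not expose this kind of failure if the LB accepts the connection itself. The following Python sketch, which assumes the API server speaks TLS on port 6443, distinguishes "port refuses connections" from "TCP connects but no TLS handshake completes". `probe_tls` is a hypothetical helper, and certificate verification is disabled because only reachability is being tested.

```python
import socket
import ssl


def probe_tls(host: str, port: int = 6443, timeout: float = 5.0) -> str:
    """Classify an endpoint as 'ok', 'tcp-only' or 'unreachable'.

    'tcp-only' means something accepted the TCP connection, but no TLS
    handshake completed within the timeout (for example, an LB whose
    backend traffic is blocked).
    """
    try:
        sock = socket.create_connection((host, port), timeout=timeout)
    except OSError:
        return "unreachable"

    # Assumption: we only care about reachability, so skip cert checks.
    ctx = ssl.create_default_context()
    ctx.check_hostname = False
    ctx.verify_mode = ssl.CERT_NONE
    try:
        with ctx.wrap_socket(sock, server_hostname=host):
            return "ok"
    except OSError:  # includes ssl.SSLError and socket timeouts
        return "tcp-only"
    finally:
        sock.close()
```

Running it once against the LB address and once against each VM address should show where the traffic stops.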
What KubeOne version are you using?
1.7.2
Provide your KubeOneCluster manifest here (if applicable)
Don't think it matters, otherwise let me know. (I need to manually do some of the steps our pipeline does to get this manifest.)
What cloud provider are you running on?
OpenStack
What operating system are you running in your cluster?
Ubuntu 22.04
Additional information
I'll add the logs of the initial "install run" and the "subsequent run". I eventually cancelled both job runs, equivalent to Ctrl+C.
k1-install-run.log
k1-subsequent-run.log