
kubeadm init fails #1380

Closed
tcurdt opened this issue Feb 1, 2019 · 36 comments
Labels
area/ecosystem priority/awaiting-more-evidence Lowest priority. Possibly useful, but not yet enough support to actually get it done.

Comments


tcurdt commented Feb 1, 2019

What keywords did you search in kubeadm issues before filing this one?

init

Is this a BUG REPORT or FEATURE REQUEST?

BUG REPORT

Versions

kubeadm version

&version.Info{Major:"1", Minor:"13", GitVersion:"v1.13.2", GitCommit:"cff46ab41ff0bb44d8584413b598ad8360ec1def", GitTreeState:"clean", BuildDate:"2019-01-10T23:33:30Z", GoVersion:"go1.11.4", Compiler:"gc", Platform:"linux/arm"}

Environment:

  • Kubernetes version
Client Version: version.Info{Major:"1", Minor:"13", GitVersion:"v1.13.2", GitCommit:"cff46ab41ff0bb44d8584413b598ad8360ec1def", GitTreeState:"clean", BuildDate:"2019-01-10T23:35:51Z", GoVersion:"go1.11.4", Compiler:"gc", Platform:"linux/arm"}
The connection to the server localhost:8080 was refused - did you specify the right host or port?
  • Cloud provider or hardware configuration:

RPi3 B+

  • OS:
PRETTY_NAME="Raspbian GNU/Linux 9 (stretch)"
NAME="Raspbian GNU/Linux"
VERSION_ID="9"
VERSION="9 (stretch)"
ID=raspbian
ID_LIKE=debian
HOME_URL="http://www.raspbian.org/"
SUPPORT_URL="http://www.raspbian.org/RaspbianForums"
BUG_REPORT_URL="http://www.raspbian.org/RaspbianBugs"
  • Kernel (e.g. uname -a):
Linux km01 4.14.34-hypriotos-v7+ #1 SMP Sun Apr 22 14:57:31 UTC 2018 armv7l GNU/Linux
  • Others:

What happened?

Fresh install of hypriotos-rpi-v1.9.0.
Then apt-get install -y kubeadm. So far so good.
As root kubeadm init --pod-network-cidr 10.244.0.0/16 fails with:

# kubeadm init --pod-network-cidr 10.244.0.0/16
[init] Using Kubernetes version: v1.13.2
[preflight] Running pre-flight checks
  [WARNING SystemVerification]: this Docker version is not on the list of validated versions: 18.09.0. Latest validated version: 18.06
[preflight] Pulling images required for setting up a Kubernetes cluster
[preflight] This might take a minute or two, depending on the speed of your internet connection
[preflight] You can also perform this action in beforehand using 'kubeadm config images pull'
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet-start] Activating the kubelet service
[certs] Using certificateDir folder "/etc/kubernetes/pki"
[certs] Generating "etcd/ca" certificate and key
[certs] Generating "etcd/server" certificate and key
[certs] etcd/server serving cert is signed for DNS names [km01 localhost] and IPs [192.168.178.43 127.0.0.1 ::1]
[certs] Generating "etcd/healthcheck-client" certificate and key
[certs] Generating "etcd/peer" certificate and key
[certs] etcd/peer serving cert is signed for DNS names [km01 localhost] and IPs [192.168.178.43 127.0.0.1 ::1]
[certs] Generating "apiserver-etcd-client" certificate and key
[certs] Generating "ca" certificate and key
[certs] Generating "apiserver" certificate and key
[certs] apiserver serving cert is signed for DNS names [km01 kubernetes kubernetes.default kubernetes.default.svc kubernetes.default.svc.cluster.local] and IPs [10.96.0.1 192.168.178.43]
[certs] Generating "apiserver-kubelet-client" certificate and key
[certs] Generating "front-proxy-ca" certificate and key
[certs] Generating "front-proxy-client" certificate and key
[certs] Generating "sa" key and public key
[kubeconfig] Using kubeconfig folder "/etc/kubernetes"
[kubeconfig] Writing "admin.conf" kubeconfig file
[kubeconfig] Writing "kubelet.conf" kubeconfig file
[kubeconfig] Writing "controller-manager.conf" kubeconfig file
[kubeconfig] Writing "scheduler.conf" kubeconfig file
[control-plane] Using manifest folder "/etc/kubernetes/manifests"
[control-plane] Creating static Pod manifest for "kube-apiserver"
[control-plane] Creating static Pod manifest for "kube-controller-manager"
[control-plane] Creating static Pod manifest for "kube-scheduler"
[etcd] Creating static Pod manifest for local etcd in "/etc/kubernetes/manifests"
[wait-control-plane] Waiting for the kubelet to boot up the control plane as static Pods from directory "/etc/kubernetes/manifests". This can take up to 4m0s
[kubelet-check] Initial timeout of 40s passed.

Unfortunately, an error has occurred:
  timed out waiting for the condition

This error is likely caused by:
  - The kubelet is not running
  - The kubelet is unhealthy due to a misconfiguration of the node in some way (required cgroups disabled)

If you are on a systemd-powered system, you can try to troubleshoot the error with the following commands:
  - 'systemctl status kubelet'
  - 'journalctl -xeu kubelet'

Additionally, a control plane component may have crashed or exited when started by the container runtime.
To troubleshoot, list all containers using your preferred container runtimes CLI, e.g. docker.
Here is one example how you may list all Kubernetes containers running in docker:
  - 'docker ps -a | grep kube | grep -v pause'
  Once you have found the failing container, you can inspect its logs with:
  - 'docker logs CONTAINERID'
error execution phase wait-control-plane: couldn't initialize a Kubernetes cluster
# docker version
Client:
 Version:           18.09.0
 API version:       1.39
 Go version:        go1.10.4
 Git commit:        4d60db4
 Built:             Wed Nov  7 00:57:21 2018
 OS/Arch:           linux/arm
 Experimental:      false

Server: Docker Engine - Community
 Engine:
  Version:          18.09.0
  API version:      1.39 (minimum version 1.12)
  Go version:       go1.10.4
  Git commit:       4d60db4
  Built:            Wed Nov  7 00:17:57 2018
  OS/Arch:          linux/arm
  Experimental:     false

# docker ps -a
CONTAINER ID        IMAGE                  COMMAND                  CREATED             STATUS                     PORTS               NAMES
72e8af203fcc        7a29ac9f3098           "kube-apiserver --au…"   2 minutes ago       Up 2 minutes                                   k8s_kube-apiserver_kube-apiserver-km01_kube-system_f4ce15d3927feb3b0957420d19d0c935_3
522449a6bc58        7a29ac9f3098           "kube-apiserver --au…"   3 minutes ago       Exited (0) 2 minutes ago                       k8s_kube-apiserver_kube-apiserver-km01_kube-system_f4ce15d3927feb3b0957420d19d0c935_2
fba7a40b03f3        2d981d285d92           "kube-scheduler --ad…"   7 minutes ago       Up 6 minutes                                   k8s_kube-scheduler_kube-scheduler-km01_kube-system_9729a196c4723b60ab401eaff722982d_0
d8671e640520        e7a8884c8443           "etcd --advertise-cl…"   7 minutes ago       Up 6 minutes                                   k8s_etcd_etcd-km01_kube-system_3e14953b2357b11169e86844e2c48a10_0
30cd3ecbdaea        a0e1a8b762a2           "kube-controller-man…"   7 minutes ago       Up 6 minutes                                   k8s_kube-controller-manager_kube-controller-manager-km01_kube-system_097345e297e344d595052996fbb45893_0
5162020dd593        k8s.gcr.io/pause:3.1   "/pause"                 7 minutes ago       Up 6 minutes                                   k8s_POD_etcd-km01_kube-system_3e14953b2357b11169e86844e2c48a10_0
ff429ccce911        k8s.gcr.io/pause:3.1   "/pause"                 7 minutes ago       Up 6 minutes                                   k8s_POD_kube-scheduler-km01_kube-system_9729a196c4723b60ab401eaff722982d_0
c154467b6f6d        k8s.gcr.io/pause:3.1   "/pause"                 7 minutes ago       Up 6 minutes                                   k8s_POD_kube-controller-manager-km01_kube-system_097345e297e344d595052996fbb45893_0
c79165dffa4b        k8s.gcr.io/pause:3.1   "/pause"                 7 minutes ago       Up 6 minutes                                   k8s_POD_kube-apiserver-km01_kube-system_f4ce15d3927feb3b0957420d19d0c935_0

# systemctl status kubelet
● kubelet.service - kubelet: The Kubernetes Node Agent
   Loaded: loaded (/lib/systemd/system/kubelet.service; enabled; vendor preset: enabled)
  Drop-In: /etc/systemd/system/kubelet.service.d
           └─10-kubeadm.conf
   Active: active (running) since Fri 2019-02-01 21:22:08 CET; 8min ago
     Docs: https://kubernetes.io/docs/home/
 Main PID: 28230 (kubelet)
    Tasks: 24 (limit: 4915)
   Memory: 27.6M
      CPU: 46.457s
   CGroup: /system.slice/kubelet.service
           └─28230 /usr/bin/kubelet --bootstrap-kubeconfig=/etc/kubernetes/bootstrap-kubelet.conf --kubeconfig=/etc/kubernetes/kubelet.conf --config=/var/lib/kubelet/config.yaml --cgroup-driver=cg

Feb 01 21:29:50 km01 kubelet[28230]: E0201 21:29:50.623818   28230 kubelet.go:2192] Container runtime network not ready: NetworkReady=false reason:NetworkPluginNotReady message:docker: network plu
Feb 01 21:29:55 km01 kubelet[28230]: W0201 21:29:55.628723   28230 cni.go:203] Unable to update cni config: No networks found in /etc/cni/net.d
Feb 01 21:29:55 km01 kubelet[28230]: E0201 21:29:55.630138   28230 kubelet.go:2192] Container runtime network not ready: NetworkReady=false reason:NetworkPluginNotReady message:docker: network plu
Feb 01 21:30:00 km01 kubelet[28230]: W0201 21:30:00.632774   28230 cni.go:203] Unable to update cni config: No networks found in /etc/cni/net.d

# journalctl -xeu kubelet
Feb 01 21:30:40 km01 kubelet[28230]: E0201 21:30:40.686723   28230 kubelet.go:2192] Container runtime network not ready: NetworkReady=false reason:NetworkPluginNotReady message:docker: network plu
Feb 01 21:30:44 km01 kubelet[28230]: E0201 21:30:44.109239   28230 dns.go:132] Nameserver limits were exceeded, some nameservers have been omitted, the applied nameserver line is: 8.8.8.8 8.8.4.4 
Feb 01 21:30:45 km01 kubelet[28230]: W0201 21:30:45.691267   28230 cni.go:203] Unable to update cni config: No networks found in /etc/cni/net.d
Feb 01 21:30:45 km01 kubelet[28230]: E0201 21:30:45.692066   28230 kubelet.go:2192] Container runtime network not ready: NetworkReady=false reason:NetworkPluginNotReady message:docker: network plu

What you expected to happen?

The init to finish and then give me the join command.

How to reproduce it (as minimally and precisely as possible)?

Fresh install of hypriotos-rpi-v1.9.0 then:

sudo bash <<EOF
curl -sSL https://packagecloud.io/Hypriot/rpi/gpgkey | apt-key add -
curl -sSL https://packages.cloud.google.com/apt/doc/apt-key.gpg | apt-key add -
echo "deb http://apt.kubernetes.io/ kubernetes-xenial main" > /etc/apt/sources.list.d/kubernetes.list
apt-get update && apt-get install -y kubeadm
kubeadm init --pod-network-cidr 10.244.0.0/16
EOF

Anything else we need to know?

This has worked before (as in yesterday). I don't see anything that might have changed.
Unless there was a new release like today or yesterday I am running out of ideas.

I am not sure the cni config is a problem yet. Installing flannel would be next on the list.


neolit123 commented Feb 1, 2019

RPI is problematic and we suspect that deeper problems exist - e.g. races in the kubelet.

please have a look at this solution and the related thread for failures on RPI:
#413 (comment)

if this has worked before i can only suspect some flakiness that we do not account for on RPI.
the above solution fiddles with liveness probes.

you can also have a deeper look at the kubelet and api-server logs

Unless there was a new release like today or yesterday I am running out of ideas.

no, there were no 1.13 releases today, AFAIK.

I am not sure the cni config is a problem yet. Installing flannel would be next on the list.

a pod network plugin is installed only after the api server comes up.
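
For illustration, a minimal sketch of that ordering, assuming admin.conf has already been written and using kube-flannel.yml as a stand-in for whichever pod network manifest you pick (the upstream flannel manifest is what gets applied further down in this thread):

# wait for the API server to answer, then apply the CNI manifest
sudo kubectl --kubeconfig /etc/kubernetes/admin.conf get --raw=/healthz   # prints "ok" once the API server is up
sudo kubectl --kubeconfig /etc/kubernetes/admin.conf apply -f kube-flannel.yml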

@neolit123 added the area/ecosystem and priority/awaiting-more-evidence labels on Feb 1, 2019

tcurdt commented Feb 1, 2019

Thanks for the quick update and the pointer to the other issue, @neolit123.

That does not read so well.
It really sounds like a race condition. I must have been just lucky before then.
I will try the initialization in phases and see if I can get it working that way - and then report back.

But what I really fail to see is how this is not considered a bug.

@neolit123
Member

But what I really fail to see is how this is not considered a bug.

we need someone on RPI to debug the kubelet and pinpoint the exact problem area.
it was agreed that this is not a kubeadm bug, because the default liveness probe values should work on slow machines too.


tcurdt commented Feb 2, 2019

Understood - but is there an issue open for kubelet then?

I was thinking I could init like this, but apparently --pod-network-cidr is not allowed as a flag:

sudo kubeadm init phase control-plane --pod-network-cidr 10.244.0.0/16
sudo sed -i 's/initialDelaySeconds: [0-9][0-9]/initialDelaySeconds: 240/g' /etc/kubernetes/manifests/kube-apiserver.yaml
sudo sed -i 's/failureThreshold: [0-9]/failureThreshold: 18/g'             /etc/kubernetes/manifests/kube-apiserver.yaml
sudo sed -i 's/timeoutSeconds: [0-9][0-9]/timeoutSeconds: 20/g'            /etc/kubernetes/manifests/kube-apiserver.yaml
sudo kubeadm init --skip-phases=control-plane --pod-network-cidr 10.244.0.0/16


tcurdt commented Feb 2, 2019

Or is that maybe just not relevant for the control-plane phase?


tcurdt commented Feb 2, 2019

Now this was the plan

sudo kubeadm init phase control-plane all
sudo sed -i 's/initialDelaySeconds: [0-9][0-9]/initialDelaySeconds: 240/g' /etc/kubernetes/manifests/kube-apiserver.yaml
sudo sed -i 's/failureThreshold: [0-9]/failureThreshold: 18/g'             /etc/kubernetes/manifests/kube-apiserver.yaml
sudo sed -i 's/timeoutSeconds: [0-9][0-9]/timeoutSeconds: 20/g'            /etc/kubernetes/manifests/kube-apiserver.yaml
sudo kubeadm init --skip-phases=control-plane --ignore-preflight-errors=all --pod-network-cidr 10.244.0.0/16

but kubeadm crashed 🤔

$ sudo kubeadm init --skip-phases=control-plane --ignore-preflight-errors=all --pod-network-cidr 10.244.0.0/16 
[init] Using Kubernetes version: v1.13.3
[preflight] Running pre-flight checks
	[WARNING FileAvailable--etc-kubernetes-manifests-kube-apiserver.yaml]: /etc/kubernetes/manifests/kube-apiserver.yaml already exists
	[WARNING FileAvailable--etc-kubernetes-manifests-kube-controller-manager.yaml]: /etc/kubernetes/manifests/kube-controller-manager.yaml already exists
	[WARNING FileAvailable--etc-kubernetes-manifests-kube-scheduler.yaml]: /etc/kubernetes/manifests/kube-scheduler.yaml already exists
	[WARNING SystemVerification]: this Docker version is not on the list of validated versions: 18.04.0-ce. Latest validated version: 18.06
[preflight] Pulling images required for setting up a Kubernetes cluster
[preflight] This might take a minute or two, depending on the speed of your internet connection
[preflight] You can also perform this action in beforehand using 'kubeadm config images pull'
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet-start] Activating the kubelet service
[certs] Using certificateDir folder "/etc/kubernetes/pki"
[certs] Generating "etcd/ca" certificate and key
[certs] Generating "etcd/peer" certificate and key
[certs] etcd/peer serving cert is signed for DNS names [km01 localhost] and IPs [192.168.178.43 127.0.0.1 ::1]
[certs] Generating "etcd/healthcheck-client" certificate and key
[certs] Generating "etcd/server" certificate and key
[certs] etcd/server serving cert is signed for DNS names [km01 localhost] and IPs [192.168.178.43 127.0.0.1 ::1]
[certs] Generating "apiserver-etcd-client" certificate and key
[certs] Generating "ca" certificate and key
[certs] Generating "apiserver" certificate and key
[certs] apiserver serving cert is signed for DNS names [km01 kubernetes kubernetes.default kubernetes.default.svc kubernetes.default.svc.cluster.local] and IPs [10.96.0.1 192.168.178.43]
[certs] Generating "apiserver-kubelet-client" certificate and key
[certs] Generating "front-proxy-ca" certificate and key
[certs] Generating "front-proxy-client" certificate and key
[certs] Generating "sa" key and public key
[kubeconfig] Using kubeconfig folder "/etc/kubernetes"
[kubeconfig] Writing "admin.conf" kubeconfig file
[kubeconfig] Writing "kubelet.conf" kubeconfig file
[kubeconfig] Writing "controller-manager.conf" kubeconfig file
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x8 pc=0xaab708]

goroutine 1 [running]:
k8s.io/kubernetes/cmd/kubeadm/app/phases/kubeconfig.validateKubeConfig(0xfb953a, 0xf, 0xfb8fec, 0xe, 0x30346c0, 0x68f, 0x7bc)
	/workspace/anago-v1.13.3-beta.0.37+721bfa751924da/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/cmd/kubeadm/app/phases/kubeconfig/kubeconfig.go:236 +0x120
k8s.io/kubernetes/cmd/kubeadm/app/phases/kubeconfig.createKubeConfigFileIfNotExists(0xfb953a, 0xf, 0xfb8fec, 0xe, 0x30346c0, 0x0, 0xf8160)
	/workspace/anago-v1.13.3-beta.0.37+721bfa751924da/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/cmd/kubeadm/app/phases/kubeconfig/kubeconfig.go:257 +0x90
k8s.io/kubernetes/cmd/kubeadm/app/phases/kubeconfig.createKubeConfigFiles(0xfb953a, 0xf, 0x32d7b00, 0x31efc60, 0x1, 0x1, 0x0, 0x0)
	/workspace/anago-v1.13.3-beta.0.37+721bfa751924da/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/cmd/kubeadm/app/phases/kubeconfig/kubeconfig.go:120 +0xf4
k8s.io/kubernetes/cmd/kubeadm/app/phases/kubeconfig.CreateKubeConfigFile(0xfb8fec, 0xe, 0xfb953a, 0xf, 0x32d7b00, 0xb9c7cc01, 0xb9bfcc)
	/workspace/anago-v1.13.3-beta.0.37+721bfa751924da/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/cmd/kubeadm/app/phases/kubeconfig/kubeconfig.go:93 +0xe8
k8s.io/kubernetes/cmd/kubeadm/app/cmd/phases.runKubeConfigFile.func1(0xf76bc8, 0x31f6aa0, 0x0, 0x0)
	/workspace/anago-v1.13.3-beta.0.37+721bfa751924da/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/cmd/kubeadm/app/cmd/phases/kubeconfig.go:155 +0x168
k8s.io/kubernetes/cmd/kubeadm/app/cmd/phases/workflow.(*Runner).Run.func1(0x32cea00, 0x0, 0x0)
	/workspace/anago-v1.13.3-beta.0.37+721bfa751924da/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/cmd/kubeadm/app/cmd/phases/workflow/runner.go:235 +0x160
k8s.io/kubernetes/cmd/kubeadm/app/cmd/phases/workflow.(*Runner).visitAll(0x32d9220, 0x31efd68, 0x31f6aa0, 0x0)
	/workspace/anago-v1.13.3-beta.0.37+721bfa751924da/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/cmd/kubeadm/app/cmd/phases/workflow/runner.go:416 +0x5c
k8s.io/kubernetes/cmd/kubeadm/app/cmd/phases/workflow.(*Runner).Run(0x32d9220, 0x24, 0x3515db4)
	/workspace/anago-v1.13.3-beta.0.37+721bfa751924da/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/cmd/kubeadm/app/cmd/phases/workflow/runner.go:208 +0xc8
k8s.io/kubernetes/cmd/kubeadm/app/cmd.NewCmdInit.func1(0x32afb80, 0x31cc680, 0x0, 0x4)
	/workspace/anago-v1.13.3-beta.0.37+721bfa751924da/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/cmd/kubeadm/app/cmd/init.go:141 +0xfc
k8s.io/kubernetes/vendor/github.com/spf13/cobra.(*Command).execute(0x32afb80, 0x31cc4c0, 0x4, 0x4, 0x32afb80, 0x31cc4c0)
	/workspace/anago-v1.13.3-beta.0.37+721bfa751924da/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/github.com/spf13/cobra/command.go:760 +0x20c
k8s.io/kubernetes/vendor/github.com/spf13/cobra.(*Command).ExecuteC(0x32ae140, 0x32afb80, 0x32ae780, 0x30ae108)
	/workspace/anago-v1.13.3-beta.0.37+721bfa751924da/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/github.com/spf13/cobra/command.go:846 +0x210
k8s.io/kubernetes/vendor/github.com/spf13/cobra.(*Command).Execute(0x32ae140, 0x308c0c8, 0x117dec0)
	/workspace/anago-v1.13.3-beta.0.37+721bfa751924da/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/github.com/spf13/cobra/command.go:794 +0x1c
k8s.io/kubernetes/cmd/kubeadm/app.Run(0x309c030, 0x0)
	/workspace/anago-v1.13.3-beta.0.37+721bfa751924da/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/cmd/kubeadm/app/kubeadm.go:48 +0x1b0
main.main()
	_output/dockerized/go/src/k8s.io/kubernetes/cmd/kubeadm/kubeadm.go:29 +0x20


neolit123 commented Feb 2, 2019

Understood - but is there an issue open for kubelet then?

AFAIK, no.
because there is no proof it's the kubelet.

I was thinking I could init like this, but apparently --pod-network-cidr is not allowed as a flag.

you can pass the podSubnet from the config (it's the same as pod-network-cidr).
https://godoc.org/k8s.io/kubernetes/cmd/kubeadm/app/apis/kubeadm/v1beta1
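
As an illustration, a minimal sketch assuming the v1beta1 ClusterConfiguration schema linked above (the file name kubeadm-config.yaml is arbitrary); this passes the podSubnet equivalent of --pod-network-cidr to the phase command:

cat <<EOF > kubeadm-config.yaml
apiVersion: kubeadm.k8s.io/v1beta1
kind: ClusterConfiguration
networking:
  podSubnet: "10.244.0.0/16"
EOF
sudo kubeadm init phase control-plane all --config kubeadm-config.yaml
# the CIDR should then surface as --cluster-cidr in the generated controller-manager manifest
sudo grep cluster-cidr /etc/kubernetes/manifests/kube-controller-manager.yaml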

Or is that maybe just not relevant for the control-plane phase?

it is relevant to the controller manager (so yes "control-plane" phase) and also kube-proxy.

[preflight] Running pre-flight checks
[WARNING FileAvailable--etc-kubernetes-manifests-kube-apiserver.yaml]: /etc/kubernetes/manifests/kube-apiserver.yaml already exists

try calling kubeadm reset first.

panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x8 pc=0xaab708]

please file a separate bug report about the panic including the reproduction steps.
it might only be a case of forgetting to run reset.

@neolit123
Member

also 1.13.3 was released as we speak:
https://groups.google.com/forum/?utm_medium=email&utm_source=footer#!msg/kubernetes-dev/3V2-C5Z6HA0/OdwHVNABEgAJ

but please still file that panic bug report for 1.13.2.


tcurdt commented Feb 2, 2019

So do I understand correctly that I need to pass the config in order to provide the pod-network-cidr equivalent?

I don't quite understand why I would need to run kubeadm reset first given that I did execute this after a fresh install.


tcurdt commented Feb 2, 2019

I don't quite get the way the contexts are used - and not checked for nil
https://github.com/kubernetes/kubernetes/blob/v1.13.3/cmd/kubeadm/app/phases/kubeconfig/kubeconfig.go#L236 ...but I guess that's for the other issue then.

@neolit123
Member

So do I understand correctly that I need to pass the config in order to provide the pod-network-cidr equivalent?

looking at the source code you might want to try:

kubeadm init phase control-plane all --pod-network-cidr ....

what is the error message that you are getting?

I don't quite understand why I would need to run kubeadm reset first given that I did execute this after a fresh install.

always call reset before init, or any batch of phases.
init leaves "state" on the node that needs cleanup.
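
A hedged sketch of the cleanup between attempts (the rm lines are optional extras for state that reset itself does not touch):

sudo kubeadm reset                 # clears /etc/kubernetes, the static pod manifests and the local etcd data
sudo rm -rf /etc/cni/net.d         # CNI config left behind by a previously installed network plugin, if any
rm -f $HOME/.kube/config           # stale copy of admin.conf from an earlier init, if one was copied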

@neolit123
Member

I don't quite get the way the contexts are used - and not checked for nil

this can only happen if the kubeconfig phase reads a corrupted kubeconfig file with a missing cluster context. possibly a side effect of not calling reset.

but yes, panics should be fixed....


tcurdt commented Feb 2, 2019

I am still not convinced about the missing reset.
Where would the state come from when init has never been called?

I tried this anyway

sudo kubeadm reset
sudo kubeadm init phase control-plane all --pod-network-cidr 10.244.0.0/16
sudo sed -i 's/initialDelaySeconds: [0-9][0-9]/initialDelaySeconds: 240/g' /etc/kubernetes/manifests/kube-apiserver.yaml
sudo sed -i 's/failureThreshold: [0-9]/failureThreshold: 18/g'             /etc/kubernetes/manifests/kube-apiserver.yaml
sudo sed -i 's/timeoutSeconds: [0-9][0-9]/timeoutSeconds: 20/g'            /etc/kubernetes/manifests/kube-apiserver.yaml
sudo kubeadm init --skip-phases=control-plane --ignore-preflight-errors=all --pod-network-cidr 10.244.0.0/16

and I am getting the same thing

$ sudo kubeadm init --skip-phases=control-plane --ignore-preflight-errors=all --pod-network-cidr 10.244.0.0/16
[init] Using Kubernetes version: v1.13.3
[preflight] Running pre-flight checks
	[WARNING FileAvailable--etc-kubernetes-manifests-kube-apiserver.yaml]: /etc/kubernetes/manifests/kube-apiserver.yaml already exists
	[WARNING FileAvailable--etc-kubernetes-manifests-kube-controller-manager.yaml]: /etc/kubernetes/manifests/kube-controller-manager.yaml already exists
	[WARNING FileAvailable--etc-kubernetes-manifests-kube-scheduler.yaml]: /etc/kubernetes/manifests/kube-scheduler.yaml already exists
	[WARNING SystemVerification]: this Docker version is not on the list of validated versions: 18.04.0-ce. Latest validated version: 18.06
[preflight] Pulling images required for setting up a Kubernetes cluster
[preflight] This might take a minute or two, depending on the speed of your internet connection
[preflight] You can also perform this action in beforehand using 'kubeadm config images pull'
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet-start] Activating the kubelet service
[certs] Using certificateDir folder "/etc/kubernetes/pki"
[certs] Generating "ca" certificate and key
[certs] Generating "apiserver" certificate and key
[certs] apiserver serving cert is signed for DNS names [km01 kubernetes kubernetes.default kubernetes.default.svc kubernetes.default.svc.cluster.local] and IPs [10.96.0.1 192.168.178.43]
[certs] Generating "apiserver-kubelet-client" certificate and key
[certs] Generating "etcd/ca" certificate and key
[certs] Generating "etcd/server" certificate and key
[certs] etcd/server serving cert is signed for DNS names [km01 localhost] and IPs [192.168.178.43 127.0.0.1 ::1]
[certs] Generating "etcd/peer" certificate and key
[certs] etcd/peer serving cert is signed for DNS names [km01 localhost] and IPs [192.168.178.43 127.0.0.1 ::1]
[certs] Generating "etcd/healthcheck-client" certificate and key
[certs] Generating "apiserver-etcd-client" certificate and key
[certs] Generating "front-proxy-ca" certificate and key
[certs] Generating "front-proxy-client" certificate and key
[certs] Generating "sa" key and public key
[kubeconfig] Using kubeconfig folder "/etc/kubernetes"
[kubeconfig] Writing "admin.conf" kubeconfig file
[kubeconfig] Writing "kubelet.conf" kubeconfig file
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x8 pc=0xaab708]

goroutine 1 [running]:
k8s.io/kubernetes/cmd/kubeadm/app/phases/kubeconfig.validateKubeConfig(0xfb953a, 0xf, 0xfc3e7a, 0x17, 0x3034540, 0x68f, 0x7bc)
	/workspace/anago-v1.13.3-beta.0.37+721bfa751924da/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/cmd/kubeadm/app/phases/kubeconfig/kubeconfig.go:236 +0x120
k8s.io/kubernetes/cmd/kubeadm/app/phases/kubeconfig.createKubeConfigFileIfNotExists(0xfb953a, 0xf, 0xfc3e7a, 0x17, 0x3034540, 0x0, 0xf8160)
	/workspace/anago-v1.13.3-beta.0.37+721bfa751924da/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/cmd/kubeadm/app/phases/kubeconfig/kubeconfig.go:257 +0x90
k8s.io/kubernetes/cmd/kubeadm/app/phases/kubeconfig.createKubeConfigFiles(0xfb953a, 0xf, 0x3144b40, 0x3527c60, 0x1, 0x1, 0x0, 0x0)
	/workspace/anago-v1.13.3-beta.0.37+721bfa751924da/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/cmd/kubeadm/app/phases/kubeconfig/kubeconfig.go:120 +0xf4
k8s.io/kubernetes/cmd/kubeadm/app/phases/kubeconfig.CreateKubeConfigFile(0xfc3e7a, 0x17, 0xfb953a, 0xf, 0x3144b40, 0x99807501, 0xb9bfcc)
	/workspace/anago-v1.13.3-beta.0.37+721bfa751924da/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/cmd/kubeadm/app/phases/kubeconfig/kubeconfig.go:93 +0xe8
k8s.io/kubernetes/cmd/kubeadm/app/cmd/phases.runKubeConfigFile.func1(0xf76bc8, 0x32f2ff0, 0x0, 0x0)
	/workspace/anago-v1.13.3-beta.0.37+721bfa751924da/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/cmd/kubeadm/app/cmd/phases/kubeconfig.go:155 +0x168
k8s.io/kubernetes/cmd/kubeadm/app/cmd/phases/workflow.(*Runner).Run.func1(0x336ee80, 0x0, 0x0)
	/workspace/anago-v1.13.3-beta.0.37+721bfa751924da/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/cmd/kubeadm/app/cmd/phases/workflow/runner.go:235 +0x160
k8s.io/kubernetes/cmd/kubeadm/app/cmd/phases/workflow.(*Runner).visitAll(0x34ecbe0, 0x3527d68, 0x32f2ff0, 0x0)
	/workspace/anago-v1.13.3-beta.0.37+721bfa751924da/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/cmd/kubeadm/app/cmd/phases/workflow/runner.go:416 +0x5c
k8s.io/kubernetes/cmd/kubeadm/app/cmd/phases/workflow.(*Runner).Run(0x34ecbe0, 0x24, 0x35bbdb4)
	/workspace/anago-v1.13.3-beta.0.37+721bfa751924da/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/cmd/kubeadm/app/cmd/phases/workflow/runner.go:208 +0xc8
k8s.io/kubernetes/cmd/kubeadm/app/cmd.NewCmdInit.func1(0x3513400, 0x3227560, 0x0, 0x4)
	/workspace/anago-v1.13.3-beta.0.37+721bfa751924da/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/cmd/kubeadm/app/cmd/init.go:141 +0xfc
k8s.io/kubernetes/vendor/github.com/spf13/cobra.(*Command).execute(0x3513400, 0x32274c0, 0x4, 0x4, 0x3513400, 0x32274c0)
	/workspace/anago-v1.13.3-beta.0.37+721bfa751924da/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/github.com/spf13/cobra/command.go:760 +0x20c
k8s.io/kubernetes/vendor/github.com/spf13/cobra.(*Command).ExecuteC(0x3512140, 0x3513400, 0x35123c0, 0x300c8e0)
	/workspace/anago-v1.13.3-beta.0.37+721bfa751924da/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/github.com/spf13/cobra/command.go:846 +0x210
k8s.io/kubernetes/vendor/github.com/spf13/cobra.(*Command).Execute(0x3512140, 0x300c0d8, 0x117dec0)
	/workspace/anago-v1.13.3-beta.0.37+721bfa751924da/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/github.com/spf13/cobra/command.go:794 +0x1c
k8s.io/kubernetes/cmd/kubeadm/app.Run(0x3034060, 0x0)
	/workspace/anago-v1.13.3-beta.0.37+721bfa751924da/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/cmd/kubeadm/app/kubeadm.go:48 +0x1b0
main.main()
	_output/dockerized/go/src/k8s.io/kubernetes/cmd/kubeadm/kubeadm.go:29 +0x20


tcurdt commented Feb 2, 2019

Time to open the other issue for the crash.

But working around the crash - this is the config file /etc/kubernetes/admin.conf after the crash. There is a cluster context. Any further ideas?

$ sudo cat /etc/kubernetes/admin.conf 
apiVersion: v1
clusters:
- cluster:
    certificate-authority-data: LS0tLS1CRUdJTiBDRVJUSUZJQ0FURS0tLS0tCk1JSUN5RENDQWJDZ0F3SUJBZ0lCQURBTkJna3Foa2lHOXcwQkFRc0ZBREFWTVJNd0VRWURWUVFERXdwcmRXSmwKY201bGRHVnpNQjRYRFRFNU1ESXdNakF4TXpBeU5Wb1hEVEk1TURFek1EQXhNekF5TlZvd0ZURVRNQkVHQTFVRQpBeE1LYTNWaVpYSnVaWFJsY3pDQ0FTSXdEUVlKS29aSWh2Y05BUUVCQlFBRGdnRVBBRENDQVFvQ2dnRUJBTUIrCmJ1ckREUGtBR1V5NmM3ZkY5VWI5UlMyRWVqdmx2aHRwRlBGVzVPRGtRL3ZsbFU5b05MR0txdXZjRVVFekJmTnkKcURzQzBsVktoTkFBMWl6TnplZVJEWlRDZ2ZFYitxa3Zib0xGd25hd1A0ZkRKKzVnUndxN0JEM0xYdWNsTFNmeApmczkwb05RaXdYL0hXSjBOUkJZRnN6Zk1iaXZaUSsrRDJjS0FOZm9qSGx2Rm9oU1BqZkVlWmp1NnBtTEhXNlMyCmY4NjJGcnhwSEdOWmhmR3JaTmd1YUFkK0tIM1BCc1IxTThpUFpiMnFjTEN0LzNmMHY2ejc4bUVoL294UC9oUjEKdWVGWmZJWCtpbmxzVXZDM2N3WXZ3VFd6ZnlOT0NSMUJCcUNHRmd4bmt0VVRJd0M3Szc3VHZUcGpnazd5NnAzSQpHMVd3SmVUUERYRXRleGhFTDQwQ0F3RUFBYU1qTUNFd0RnWURWUjBQQVFIL0JBUURBZ0trTUE4R0ExVWRFd0VCCi93UUZNQU1CQWY4d0RRWUpLb1pJaHZjTkFRRUxCUUFEZ2dFQkFId2NhY0RuK0ZhaldKekJpVExlZmxLeVRSNTgKVm9yKzluZmtMRkNrTWZBcC94b2pwekVKbEJEM0didmh5V21tS2tJNDgvSHZabml1c1g1THNRdXhzamV4bnZTYwppMG9keFZOMTdSOFFiWVovZ0hsREdRanlnYXhvUWN6M1J5MFU3NmJ0U0toQ1VTTko2NEZqeGp1bU9MemVYbkRLCjlsRElPZHZ4VXRXZDVaajc1YmZFRmNyNHJKbEJTK0dZRi9Da2RrdzZtUlpXNCsrYkNPd3RBUGVUemd6bEZtQ1EKZmptM28wQUlNSitvMk9YUjFrRXFlTXo2VDM3b2FsYWNNU1hEeHh1cjBZUmw3NUJ2M2lBOGk0NE5Oei9tNzhOdQpPaW1ONnBVMDFyUWJEVjVBRzJmbndwaURBcGxNbkQ2R0FyZ3R5b3VUREs2ZmlWOXpZaVdkQlBLeFQ5az0KLS0tLS1FTkQgQ0VSVElGSUNBVEUtLS0tLQo=
    server: https://192.168.178.43:6443
  name: kubernetes
contexts:
- context:
    cluster: kubernetes
    user: kubernetes-admin
  name: kubernetes-admin@kubernetes
current-context: kubernetes-admin@kubernetes
kind: Config
preferences: {}
users:
- name: kubernetes-admin
  user:
    client-certificate-data: LS0tLS1CRUdJTiBDRVJUSUZJQ0FURS0tLS0tCk1JSUM4akNDQWRxZ0F3SUJBZ0lJYk4rZTR0WFh4and3RFFZSktvWklodmNOQVFFTEJRQXdGVEVUTUJFR0ExVUUKQXhNS2EzVmlaWEp1WlhSbGN6QWVGdzB4T1RBeU1ESXdNVE13TWpWYUZ3MHlNREF5TURJd01UTXlNRFJhTURReApGekFWQmdOVkJBb1REbk41YzNSbGJUcHRZWE4wWlhKek1Sa3dGd1lEVlFRREV4QnJkV0psY201bGRHVnpMV0ZrCmJXbHVNSUlCSWpBTkJna3Foa2lHOXcwQkFRRUZBQU9DQVE4QU1JSUJDZ0tDQVFFQXBnMzBYR0h4U1IwMWdLMGEKc3FFb1hTanFJNXloZVErZ21YcWNJeDRMYUNIWVZJM2VZc29SbTVSTCtDYTNRblJ5aE4vSHVvMkJYUE1MdGlIZwpIR1BlL3VKRkRHOHJxa2xVbHZZSXZDMkE4QVpLVENEUzBFRmNoQ0RhOHhDMGVQUG9jbXdLWTdVRHFkWGIvY2RHCk8yZG9LaWJLeGtGM3dEWjVCUXR4VXgzTDB1bWZDVFFNOWlQYk00aHF3N3N0Rzc5SXE1dUZXU1VxMFNRb0tad0oKbDFzRXpCQ3kveGV2bWIvTG1jLzR5QTVRVGNPK09yejFTdUZReVRxN0NIb1g1T1ZadDRqbk9jQUZpdFhDbWFROAp3OW0wRERvanJLakVrWlZNL1M4aWY3T3hUZ1d5MDVkaGE4VWZ2TStHaFBmVTF1cEJqdFJGcXB2VUIySkp6UEFmCnl4cFJCUUlEQVFBQm95Y3dKVEFPQmdOVkhROEJBZjhFQkFNQ0JhQXdFd1lEVlIwbEJBd3dDZ1lJS3dZQkJRVUgKQXdJd0RRWUpLb1pJaHZjTkFRRUxCUUFEZ2dFQkFFcmRYbEhyb2w1a0NQR1UyYUNJOUE5a3NuL0k2a1l3RlR6RAo5NFprNVFSQVhUNjlqSy9namY3c3dNL1JxY1RpRnNFQnY0bXpzeGRjNnBydXdHbytab1o3V2VGTTAvNFJNcFZJCm1qVitKbWdlNk14WUkyMWhOZnMydjNNN2RnbVpMRjJsN25yRTNMTVpiMHZMdUJuN2ZKZWxXb0lGSDd3WWFnQnIKeFlWVzZjYzJtWkkzWHVxYTcraWpjNHpJdmpDSjR6cTFiRUdSUlNEQWNwbjhnQjFXWXRoUWd2cHV0cGZGTGlDTApIK0dya1ZCR3FEY3VVbFRJMkJlZXVMMUduRXJsQzYremhDZnY1VStGR2pwS2RwaVN6UkV4T1F4bEJJOEYzQnZVCnp0VTRVVkR3S24vYUFOUm01N3d6dHlTL0FyOGJQUlRrR0psbGpOZVE0bEd1cWtFQTJKaz0KLS0tLS1FTkQgQ0VSVElGSUNBVEUtLS0tLQo=
    client-key-data: LS0tLS1CRUdJTiBSU0EgUFJJVkFURSBLRVktLS0tLQpNSUlFcEFJQkFBS0NBUUVBcGczMFhHSHhTUjAxZ0swYXNxRW9YU2pxSTV5aGVRK2dtWHFjSXg0TGFDSFlWSTNlCllzb1JtNVJMK0NhM1FuUnloTi9IdW8yQlhQTUx0aUhnSEdQZS91SkZERzhycWtsVWx2WUl2QzJBOEFaS1RDRFMKMEVGY2hDRGE4eEMwZVBQb2Ntd0tZN1VEcWRYYi9jZEdPMmRvS2liS3hrRjN3RFo1QlF0eFV4M0wwdW1mQ1RRTQo5aVBiTTRocXc3c3RHNzlJcTV1RldTVXEwU1FvS1p3Smwxc0V6QkN5L3hldm1iL0xtYy80eUE1UVRjTytPcnoxClN1RlF5VHE3Q0hvWDVPVlp0NGpuT2NBRml0WENtYVE4dzltMEREb2pyS2pFa1pWTS9TOGlmN094VGdXeTA1ZGgKYThVZnZNK0doUGZVMXVwQmp0UkZxcHZVQjJKSnpQQWZ5eHBSQlFJREFRQUJBb0lCQURsWFlrV3drS2lsekg3MQp4OTFkWjFuY01oWkFGVVovemY2UjUyNzlCZ1ZjZ3A2WUt1NUVSeFpKZkg1aHFERHJrMHd0Rm9SbUx3RFE4UDlnCjdVb0FkdFhmZnVhUFVTM0ppc3RpaEp1dXZ2S2p5VzVHZTJYczNDeklSN05kMW1SYUhhKzlmVXozQ2gvUXVOb0cKd1Vyc0ozMCt6aER1TkpNTWZIZndmcDZzRUdGeE9yYnN5WWE3S0l1RWxuQ0FHWXQwalpjcmw2OENKcVJnZEhEbwpwRFZCL2Zub0ZBZi82Ym9Ga1JTckJkeUM5clpqYlZRbmtwT0VpQ0JONCtMS3RIRjlhUXhELytJWXRVeWFrb2tLClNJNWVTZEhhbkl0U2hxaTVCQmtjV3c5cmdhZDJjYWE5TjRMR1Q1N29LSFJmSFV2T1phTDlGZC9xbjJlb285RlAKTXplcVdCVUNnWUVBeGk5Y3FIOEo1eHFmWUlpRXU2Ly9HTjZETXo0ZWlaSUZNRGdzWjhRR21Fc0hhd1ZZWHlRRwpQNjVES0p1ZUI3SWk1eHV2Z2tGR2JJVnBBMmRleEo2YzhtQmo4N2Zoa2s5SGxDb1VpcDBEdU9uVnJJTVR5Uk02CkR5QWNQaUw2MEY4cGFoU2htZ21USHdXYS81N1Vscllxc1N6RW4vVDBuVFFwQ09uYVJFTlVvTzhDZ1lFQTFuOE8Kdkk1OGx4MzBOSTg3UXV3eVNoMmxKUG04bnJUc0ZBbXRNNXB6Z1ovaUc5TUVGU0V0RzZLbnBuNlRrQjR0YzEzQgpiN01SVWZWY0RIQTRwS09TNk1DZHVoTmJLN3pmNjNOMFpMeWtMdzN2aExRYlhrRlBScEtEQm0rc3J2M0V1MEVnCnQwODNSKzdaMjV1aGhYa2I4WU9kaTZpQXk1VytMS2FPRzh0OWhVc0NnWUVBc2dDeUdZalk3U0NsUzMveXI5ejQKbzI2ZnFyTzltOVJ5SW9naG9pV1h3c3VJNHgvTzZzMGhhNnJxR1J3RWlXYi9JRkptaGZoNDkxbXdJMldCNGRtUQpuOFhob0hKbEFST0I5OXIveml3T3Z0UVBuYjJ4VktXWFBTU2JHVmd6ckZuOGlaSDBQN1VmMWZvajZEblJPWGh1CnllbXF4UHl2aEU3b0dHQnFNV3ZFSkRNQ2dZQVYxV01ib0dsZ1BJVlNJRTVJOXAvNzJWNnBEOTY2VFBKRzYrRTgKZ25sRmRZL2ZnekJFTWxkVUc4OXk3Q2w3SHdkRFdnVEpxUEdYWlNGVWhzdk5QblZDeWZDRU0xb3hibzFnZXlVYQo1L1RTY1ZtektWNHJ6dndSMC9JUVlxZXlQRlNkTnZqc2o5eXhyc2R3U2p3N3lPTW1SMTV2Qzl6b1hEcTZjczIrCldJMVRWd0tCZ1FDbWRpeG9nTXM0WkR3dFl4c3BBZzRqYUdRMURJck52bWJEcEl4eFhYNXBqSmFSWXU2VnljZk0KQkZJZmFtTkJpODNadDBIWkdERkdOdUdCSEJFTys4L1k4NWZDNWgySlM0MTBjUGhoVkoxWSs5Q0NpOGgzREZ2Swo5SWRzNkR0MUlCRFlsejFuV2p4cVcyb01zaGxZSy9BSkpYbGxRVXR3ZEJhczc4bkRvdkplYWc9PQotLS0tLUVORCBSU0EgUFJJVkFURSBLRVktLS0tLQo=

@neolit123
Member

Any further ideas?

please add --v=1 to the last init call and show the output near:

[kubeconfig] Using kubeconfig folder "/etc/kubernetes"
[kubeconfig] Writing "admin.conf" kubeconfig file
[kubeconfig] Writing "kubelet.conf" kubeconfig file

and I am getting the same thing

this skipping of the CP phase used to work, so something broke it.


tcurdt commented Feb 2, 2019

$ sudo kubeadm init phase control-plane all --pod-network-cidr 10.244.0.0/16
[control-plane] Using manifest folder "/etc/kubernetes/manifests"
[control-plane] Creating static Pod manifest for "kube-apiserver"
[control-plane] Creating static Pod manifest for "kube-controller-manager"
[control-plane] Creating static Pod manifest for "kube-scheduler"
$ sudo sed -i 's/initialDelaySeconds: [0-9][0-9]/initialDelaySeconds: 240/g' /etc/kubernetes/manifests/kube-apiserver.yaml
$ sudo sed -i 's/failureThreshold: [0-9]/failureThreshold: 18/g'             /etc/kubernetes/manifests/kube-apiserver.yaml
$ sudo sed -i 's/timeoutSeconds: [0-9][0-9]/timeoutSeconds: 20/g'            /etc/kubernetes/manifests/kube-apiserver.yaml
$ sudo kubeadm init --v=1 --skip-phases=control-plane --ignore-preflight-errors=all --pod-network-cidr 10.244.0.0/16
I0202 02:56:47.690929   22985 feature_gate.go:206] feature gates: &{map[]}
[init] Using Kubernetes version: v1.13.3
[preflight] Running pre-flight checks
I0202 02:56:47.691923   22985 checks.go:572] validating Kubernetes and kubeadm version
I0202 02:56:47.692057   22985 checks.go:171] validating if the firewall is enabled and active
I0202 02:56:47.732287   22985 checks.go:208] validating availability of port 6443
I0202 02:56:47.732666   22985 checks.go:208] validating availability of port 10251
I0202 02:56:47.732815   22985 checks.go:208] validating availability of port 10252
I0202 02:56:47.732943   22985 checks.go:283] validating the existence of file /etc/kubernetes/manifests/kube-apiserver.yaml
  [WARNING FileAvailable--etc-kubernetes-manifests-kube-apiserver.yaml]: /etc/kubernetes/manifests/kube-apiserver.yaml already exists
I0202 02:56:47.733144   22985 checks.go:283] validating the existence of file /etc/kubernetes/manifests/kube-controller-manager.yaml
  [WARNING FileAvailable--etc-kubernetes-manifests-kube-controller-manager.yaml]: /etc/kubernetes/manifests/kube-controller-manager.yaml already exists
I0202 02:56:47.733268   22985 checks.go:283] validating the existence of file /etc/kubernetes/manifests/kube-scheduler.yaml
  [WARNING FileAvailable--etc-kubernetes-manifests-kube-scheduler.yaml]: /etc/kubernetes/manifests/kube-scheduler.yaml already exists
I0202 02:56:47.733363   22985 checks.go:283] validating the existence of file /etc/kubernetes/manifests/etcd.yaml
I0202 02:56:47.733401   22985 checks.go:430] validating if the connectivity type is via proxy or direct
I0202 02:56:47.733507   22985 checks.go:466] validating http connectivity to first IP address in the CIDR
I0202 02:56:47.733587   22985 checks.go:466] validating http connectivity to first IP address in the CIDR
I0202 02:56:47.733637   22985 checks.go:104] validating the container runtime
I0202 02:56:48.080363   22985 checks.go:130] validating if the service is enabled and active
I0202 02:56:48.147592   22985 checks.go:332] validating the contents of file /proc/sys/net/bridge/bridge-nf-call-iptables
I0202 02:56:48.147801   22985 checks.go:332] validating the contents of file /proc/sys/net/ipv4/ip_forward
I0202 02:56:48.147885   22985 checks.go:644] validating whether swap is enabled or not
I0202 02:56:48.148050   22985 checks.go:373] validating the presence of executable ip
I0202 02:56:48.148163   22985 checks.go:373] validating the presence of executable iptables
I0202 02:56:48.148235   22985 checks.go:373] validating the presence of executable mount
I0202 02:56:48.148307   22985 checks.go:373] validating the presence of executable nsenter
I0202 02:56:48.148370   22985 checks.go:373] validating the presence of executable ebtables
I0202 02:56:48.148450   22985 checks.go:373] validating the presence of executable ethtool
I0202 02:56:48.148521   22985 checks.go:373] validating the presence of executable socat
I0202 02:56:48.148585   22985 checks.go:373] validating the presence of executable tc
I0202 02:56:48.148649   22985 checks.go:373] validating the presence of executable touch
I0202 02:56:48.148714   22985 checks.go:515] running all checks
  [WARNING SystemVerification]: this Docker version is not on the list of validated versions: 18.04.0-ce. Latest validated version: 18.06
I0202 02:56:48.379479   22985 checks.go:403] checking whether the given node name is reachable using net.LookupHost
I0202 02:56:48.379549   22985 checks.go:613] validating kubelet version
I0202 02:56:48.609854   22985 checks.go:130] validating if the service is enabled and active
I0202 02:56:48.661929   22985 checks.go:208] validating availability of port 10250
I0202 02:56:48.662183   22985 checks.go:208] validating availability of port 2379
I0202 02:56:48.662279   22985 checks.go:208] validating availability of port 2380
I0202 02:56:48.662378   22985 checks.go:245] validating the existence and emptiness of directory /var/lib/etcd
[preflight] Pulling images required for setting up a Kubernetes cluster
[preflight] This might take a minute or two, depending on the speed of your internet connection
[preflight] You can also perform this action in beforehand using 'kubeadm config images pull'
I0202 02:56:49.188603   22985 checks.go:833] image exists: k8s.gcr.io/kube-apiserver:v1.13.3
I0202 02:56:49.918678   22985 checks.go:833] image exists: k8s.gcr.io/kube-controller-manager:v1.13.3
I0202 02:56:50.638009   22985 checks.go:833] image exists: k8s.gcr.io/kube-scheduler:v1.13.3
I0202 02:56:51.361751   22985 checks.go:833] image exists: k8s.gcr.io/kube-proxy:v1.13.3
I0202 02:56:52.059469   22985 checks.go:833] image exists: k8s.gcr.io/pause:3.1
I0202 02:56:52.375666   22985 checks.go:833] image exists: k8s.gcr.io/etcd:3.2.24
I0202 02:56:52.735005   22985 checks.go:833] image exists: k8s.gcr.io/coredns:1.2.6
I0202 02:56:52.735203   22985 kubelet.go:71] Stopping the kubelet
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
I0202 02:56:53.663319   22985 kubelet.go:89] Starting the kubelet
[kubelet-start] Activating the kubelet service
[certs] Using certificateDir folder "/etc/kubernetes/pki"
I0202 02:56:54.401537   22985 certs.go:113] creating a new certificate authority for front-proxy-ca
[certs] Generating "front-proxy-ca" certificate and key
[certs] Generating "front-proxy-client" certificate and key
I0202 02:57:04.582986   22985 certs.go:113] creating a new certificate authority for etcd-ca
[certs] Generating "etcd/ca" certificate and key
[certs] Generating "apiserver-etcd-client" certificate and key
[certs] Generating "etcd/server" certificate and key
[certs] etcd/server serving cert is signed for DNS names [km01 localhost] and IPs [192.168.178.43 127.0.0.1 ::1]
[certs] Generating "etcd/peer" certificate and key
[certs] etcd/peer serving cert is signed for DNS names [km01 localhost] and IPs [192.168.178.43 127.0.0.1 ::1]
[certs] Generating "etcd/healthcheck-client" certificate and key
I0202 02:58:01.814890   22985 certs.go:113] creating a new certificate authority for ca
[certs] Generating "ca" certificate and key
[certs] Generating "apiserver" certificate and key
[certs] apiserver serving cert is signed for DNS names [km01 kubernetes kubernetes.default kubernetes.default.svc kubernetes.default.svc.cluster.local] and IPs [10.96.0.1 192.168.178.43]
[certs] Generating "apiserver-kubelet-client" certificate and key
I0202 02:58:30.380212   22985 certs.go:72] creating a new public/private key files for signing service account users
[certs] Generating "sa" key and public key
[kubeconfig] Using kubeconfig folder "/etc/kubernetes"
I0202 02:58:39.411867   22985 kubeconfig.go:92] creating kubeconfig file for admin.conf
[kubeconfig] Writing "admin.conf" kubeconfig file
I0202 02:58:43.492318   22985 kubeconfig.go:92] creating kubeconfig file for kubelet.conf
[kubeconfig] Writing "kubelet.conf" kubeconfig file
I0202 02:58:50.508103   22985 kubeconfig.go:92] creating kubeconfig file for controller-manager.conf
[kubeconfig] Writing "controller-manager.conf" kubeconfig file
I0202 02:58:55.971676   22985 kubeconfig.go:92] creating kubeconfig file for scheduler.conf
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x8 pc=0xaab708]

goroutine 1 [running]:
k8s.io/kubernetes/cmd/kubeadm/app/phases/kubeconfig.validateKubeConfig(0xfb953a, 0xf, 0xfb8fec, 0xe, 0x24b0270, 0x68b, 0x7bc)
  /workspace/anago-v1.13.3-beta.0.37+721bfa751924da/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/cmd/kubeadm/app/phases/kubeconfig/kubeconfig.go:236 +0x120
k8s.io/kubernetes/cmd/kubeadm/app/phases/kubeconfig.createKubeConfigFileIfNotExists(0xfb953a, 0xf, 0xfb8fec, 0xe, 0x24b0270, 0x0, 0x2514aa0)
  /workspace/anago-v1.13.3-beta.0.37+721bfa751924da/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/cmd/kubeadm/app/phases/kubeconfig/kubeconfig.go:257 +0x90
k8s.io/kubernetes/cmd/kubeadm/app/phases/kubeconfig.createKubeConfigFiles(0xfb953a, 0xf, 0x256cb40, 0x2483c60, 0x1, 0x1, 0x0, 0x0)
  /workspace/anago-v1.13.3-beta.0.37+721bfa751924da/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/cmd/kubeadm/app/phases/kubeconfig/kubeconfig.go:120 +0xf4
k8s.io/kubernetes/cmd/kubeadm/app/phases/kubeconfig.CreateKubeConfigFile(0xfb8fec, 0xe, 0xfb953a, 0xf, 0x256cb40, 0x92184e01, 0xb9bfcc)
  /workspace/anago-v1.13.3-beta.0.37+721bfa751924da/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/cmd/kubeadm/app/phases/kubeconfig/kubeconfig.go:93 +0xe8
k8s.io/kubernetes/cmd/kubeadm/app/cmd/phases.runKubeConfigFile.func1(0xf76bc8, 0x2583130, 0x0, 0x0)
  /workspace/anago-v1.13.3-beta.0.37+721bfa751924da/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/cmd/kubeadm/app/cmd/phases/kubeconfig.go:155 +0x168
k8s.io/kubernetes/cmd/kubeadm/app/cmd/phases/workflow.(*Runner).Run.func1(0x2581180, 0x0, 0x0)
  /workspace/anago-v1.13.3-beta.0.37+721bfa751924da/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/cmd/kubeadm/app/cmd/phases/workflow/runner.go:235 +0x160
k8s.io/kubernetes/cmd/kubeadm/app/cmd/phases/workflow.(*Runner).visitAll(0x24d2dc0, 0x2483d68, 0x2583130, 0x0)
  /workspace/anago-v1.13.3-beta.0.37+721bfa751924da/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/cmd/kubeadm/app/cmd/phases/workflow/runner.go:416 +0x5c
k8s.io/kubernetes/cmd/kubeadm/app/cmd/phases/workflow.(*Runner).Run(0x24d2dc0, 0x24, 0x29dddb4)
  /workspace/anago-v1.13.3-beta.0.37+721bfa751924da/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/cmd/kubeadm/app/cmd/phases/workflow/runner.go:208 +0xc8
k8s.io/kubernetes/cmd/kubeadm/app/cmd.NewCmdInit.func1(0x2519400, 0x269c660, 0x0, 0x5)
  /workspace/anago-v1.13.3-beta.0.37+721bfa751924da/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/cmd/kubeadm/app/cmd/init.go:141 +0xfc
k8s.io/kubernetes/vendor/github.com/spf13/cobra.(*Command).execute(0x2519400, 0x269c630, 0x5, 0x6, 0x2519400, 0x269c630)
  /workspace/anago-v1.13.3-beta.0.37+721bfa751924da/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/github.com/spf13/cobra/command.go:760 +0x20c
k8s.io/kubernetes/vendor/github.com/spf13/cobra.(*Command).ExecuteC(0x2518000, 0x2519400, 0x25183c0, 0x240c1e8)
  /workspace/anago-v1.13.3-beta.0.37+721bfa751924da/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/github.com/spf13/cobra/command.go:846 +0x210
k8s.io/kubernetes/vendor/github.com/spf13/cobra.(*Command).Execute(0x2518000, 0x248a0c8, 0x117dec0)
  /workspace/anago-v1.13.3-beta.0.37+721bfa751924da/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/github.com/spf13/cobra/command.go:794 +0x1c
k8s.io/kubernetes/cmd/kubeadm/app.Run(0x24a2000, 0x0)
  /workspace/anago-v1.13.3-beta.0.37+721bfa751924da/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/cmd/kubeadm/app/kubeadm.go:48 +0x1b0
main.main()
  _output/dockerized/go/src/k8s.io/kubernetes/cmd/kubeadm/kubeadm.go:29 +0x20


neolit123 commented Feb 2, 2019

please show the contents of /etc/kubernetes/scheduler.conf


tcurdt commented Feb 2, 2019

That file has a length of 0.
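
That matches the panic above: the file exists, so kubeadm validates it instead of regenerating it, and an empty file parses to a config with no cluster context. A quick way to spot and clear such leftovers (paths are just the standard kubeadm locations):

sudo ls -l /etc/kubernetes/*.conf        # look for 0-byte kubeconfig files left behind by an interrupted run
sudo rm /etc/kubernetes/scheduler.conf   # removing the truncated file should let kubeadm regenerate it on the next run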

@neolit123
Member

posting in the panic ticket in a bit.

@neolit123
Member


tcurdt commented Feb 2, 2019

related discussion at reddit:

Cool - I didn't find that one. Sounds very familiar.

Too bad that changing the timeouts (for some reason) doesn't help in my setup :-/


neolit123 commented Feb 2, 2019

we have reports for production CNCF clusters running ARM64 where supposedly the architecture works fine.

oddly enough these problems are only manifesting on RPI.
could be related to a family of CPUs and/or go compiler problems.


tcurdt commented Feb 2, 2019

But then it would be odd that many people get it to work on RPis - at least eventually.


neolit123 commented Feb 2, 2019

you can also try to call the kubeconfig phase as a separate step:

sudo kubeadm reset
sudo kubeadm init phase control-plane all --pod-network-cidr 10.244.0.0/16
sudo kubeadm init phase kubeconfig all
....
sudo kubeadm init --skip-phases=control-plane,kubeconfig --ignore-preflight-errors=all --pod-network-cidr 10.244.0.0/16

untested, i think this might not work.
but just want to see if the panic persists.


tcurdt commented Feb 2, 2019

Unfortunately that did not work :-/

$ sudo kubeadm reset
...
$ sudo kubeadm init phase control-plane all --pod-network-cidr 10.244.0.0/16
[control-plane] Using manifest folder "/etc/kubernetes/manifests"
[control-plane] Creating static Pod manifest for "kube-apiserver"
[control-plane] Creating static Pod manifest for "kube-controller-manager"
[control-plane] Creating static Pod manifest for "kube-scheduler"
$ sudo sed -i 's/initialDelaySeconds: [0-9][0-9]/initialDelaySeconds: 240/g' /etc/kubernetes/manifests/kube-apiserver.yaml
$ sudo sed -i 's/failureThreshold: [0-9]/failureThreshold: 18/g'             /etc/kubernetes/manifests/kube-apiserver.yaml
$ sudo sed -i 's/timeoutSeconds: [0-9][0-9]/timeoutSeconds: 20/g'            /etc/kubernetes/manifests/kube-apiserver.yaml
$ sudo kubeadm init phase kubeconfig all
[kubeconfig] Using kubeconfig folder "/etc/kubernetes"
error execution phase kubeconfig/admin: couldn't create a kubeconfig; the CA files couldn't be loaded: failed to load certificate: couldn't load the certificate file /etc/kubernetes/pki/ca.crt: open /etc/kubernetes/pki/ca.crt: no such file or directory


neolit123 commented Feb 2, 2019

try adding the certs phases and executing in this order:

sudo kubeadm reset
sudo kubeadm init phase certs all
sudo kubeadm init phase kubeconfig all
sudo kubeadm init phase control-plane all --pod-network-cidr 10.244.0.0/16
....
sudo kubeadm init --skip-phases=certs,kubeconfig,control-plane --ignore-preflight-errors=all --pod-network-cidr 10.244.0.0/16

the docs for init are here:
https://kubernetes.io/docs/reference/setup-tools/kubeadm/kubeadm-init/
https://kubernetes.io/docs/reference/setup-tools/kubeadm/kubeadm-init-phase/


tcurdt commented Feb 2, 2019

Holy smokes - that worked! 🎉

sudo kubeadm reset
sudo kubeadm init phase certs all
sudo kubeadm init phase kubeconfig all
sudo kubeadm init phase control-plane all --pod-network-cidr 10.244.0.0/16
sudo sed -i 's/initialDelaySeconds: [0-9][0-9]/initialDelaySeconds: 240/g' /etc/kubernetes/manifests/kube-apiserver.yaml
sudo sed -i 's/failureThreshold: [0-9]/failureThreshold: 18/g'             /etc/kubernetes/manifests/kube-apiserver.yaml
sudo sed -i 's/timeoutSeconds: [0-9][0-9]/timeoutSeconds: 20/g'            /etc/kubernetes/manifests/kube-apiserver.yaml
sudo kubeadm init --v=1 --skip-phases=certs,kubeconfig,control-plane --ignore-preflight-errors=all --pod-network-cidr 10.244.0.0/16
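
An optional sanity check that the sed edits landed before the final init; the expected values are just the relaxed ones set above:

sudo grep -E 'initialDelaySeconds|failureThreshold|timeoutSeconds' /etc/kubernetes/manifests/kube-apiserver.yaml
# expected, roughly:
#   failureThreshold: 18
#   initialDelaySeconds: 240
#   timeoutSeconds: 20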


tcurdt commented Feb 2, 2019

I then installed flannel

kubectl apply -f https://raw.githubusercontent.com/coreos/flannel/v0.11.0/Documentation/kube-flannel.yml

and after a while - yay!

$ kubectl get nodes
NAME   STATUS   ROLES    AGE     VERSION
km01   Ready    master   3m43s   v1.13.3
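
While waiting for the node to go Ready, a simple way to watch the flannel and core pods come up (plain kubectl, nothing specific to this setup):

kubectl -n kube-system get pods -o wide --watch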

@neolit123
Member

glad it worked out.

i cannot explain why the panic happens and that needs investigation.
will now close this issue and track the other one.

as mentioned, at some point we need test signal to be able to catch problems on ARM.


tcurdt commented Feb 2, 2019

Yes, the panic is really odd - and still leaves me a little puzzled.
But well - for that discussion we got the other issue.
Thanks for the help!


gavD commented Jul 25, 2019

I am so unbelievably grateful for this thread - thank you @neolit123 and everyone else involved - I finally got a Raspberry Pi 3 B to run a K8s master after three SOLID days of trying!

What's weird is, the very first time I tried, I followed this tutorial and it worked great, I set up a 3 node cluster, no problem. I shut it down at night, the next day I started it up - and nothing worked!

So, I re-Etchered my SD cards and started afresh. Then the pain began. I tried every permutation of Docker version, Raspbian version, all sorts of flags and kernel versions, and even bought a network switch and started flailing at that. I must have read over 100 web pages and GitHub issues; nothing at all worked until I used the steps that @tcurdt used above.

Anyway, THANK YOU


vrer2 commented Nov 28, 2019

Thank you soooo much..been facing this issue for almost 3 days..

@aryansaraf018

Holy smokes - that worked! 🎉

sudo kubeadm reset
sudo kubeadm init phase certs all
sudo kubeadm init phase kubeconfig all
sudo kubeadm init phase control-plane all --pod-network-cidr 10.244.0.0/16
sudo sed -i 's/initialDelaySeconds: [0-9][0-9]/initialDelaySeconds: 240/g' /etc/kubernetes/manifests/kube-apiserver.yaml
sudo sed -i 's/failureThreshold: [0-9]/failureThreshold: 18/g'             /etc/kubernetes/manifests/kube-apiserver.yaml
sudo sed -i 's/timeoutSeconds: [0-9][0-9]/timeoutSeconds: 20/g'            /etc/kubernetes/manifests/kube-apiserver.yaml
sudo kubeadm init --v=1 --skip-phases=certs,kubeconfig,control-plane --ignore-preflight-errors=all --pod-network-cidr 10.244.0.0/16

Hey @tcurdt, do we need to execute the 3 sudo sed commands? I tried without them and it's still showing the same error. Help will be appreciated.


tcurdt commented Sep 22, 2021

@aryansaraf018 sorry, it's been quite a while since I looked into this. I suspected some kind of race condition, and adjusting the timings may or may not help. I was hoping this would no longer be necessary.


neolit123 commented Sep 22, 2021 via email

@Anjali24-54

Hello! I understand this is an old thread, but I am getting the same kind of issue. I am using WSL and trying to install the k8s components. I know I can use Docker Desktop, but I found its approach a little different, which is why I don't want to use Docker Desktop anymore. Every time I run kubeadm init I get this error:
Unfortunately, an error has occurred:
timed out waiting for the condition

This error is likely caused by:
- The kubelet is not running
- The kubelet is unhealthy due to a misconfiguration of the node in some way (required cgroups disabled)

If you are on a systemd-powered system, you can try to troubleshoot the error with the following commands:
- 'systemctl status kubelet'
- 'journalctl -xeu kubelet'

Additionally, a control plane component may have crashed or exited when started by the container runtime.
To troubleshoot, list all containers using your preferred container runtimes CLI.
Here is one example how you may list all running Kubernetes containers by using crictl:
- 'crictl --runtime-endpoint unix:///var/run/containerd/containerd.sock ps -a | grep kube | grep -v pause'
Once you have found the failing container, you can inspect its logs with:
- 'crictl --runtime-endpoint unix:///var/run/containerd/containerd.sock logs CONTAINERID'
error execution phase wait-control-plane: couldn't initialize a Kubernetes cluster
To see the stack trace of this error execute with --v=5 or higher

How can I fix this issue? Not to forget, I am in a WSL environment.
