Replies: 9 comments
-
Please attach a log bundle.
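For reference, the installer can collect one from the machine that ran the installation (a sketch, assuming the original install directory is still available):

$ openshift-install gather bootstrap --dir=<install_dir>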
-
I see three masters have requested a config, so the network is up there. The log bundle, however, doesn't have any info from the masters. Did they boot? Did you try to SSH into them?
-
Yes, the masters booted up, and I can access them via SSH. Is there a command I can run on the masters to extract any logs?
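For reference, logs can be read directly on a master with journalctl (a sketch, assuming a standard OKD node where kubelet runs as a systemd unit and crictl is available):

$ journalctl -b -u kubelet --no-pager | tail -n 200   # kubelet logs from the current boot
$ sudo crictl ps -a                                   # list all containers, including exited ones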
-
Journalctl on master 0 shows logs like below:

Jun 09 22:48:31 summit-odience-infra-xk4z8-master-0 hyperkube[2053]: I0609 22:48:31.977808 2053 scope.go:95] [topologymanager] RemoveContainer - Container ID: 3b6538fe524debcea6870c641e4a5ab74fa471de7734021eb3aecc277e22e11d
Jun 09 22:48:31 summit-odience-infra-xk4z8-master-0 hyperkube[2053]: E0609 22:48:31.980313 2053 pod_workers.go:191] Error syncing pod c15d60c2-ca2e-4e01-aca2-912347c6b9ac ("ingress-operator-ff794bdbb-ckb27_openshift-ingress-operator(>
Jun 09 22:48:32 summit-odience-infra-xk4z8-master-0 hyperkube[2053]: E0609 22:48:32.375884 2053 kubelet_node_status.go:470] Error updating node status, will retry: error getting node "summit-odience-infra-xk4z8-master-0": Get "https:>
Jun 09 22:48:32 summit-odience-infra-xk4z8-master-0 hyperkube[2053]: E0609 22:48:32.378079 2053 kubelet_node_status.go:470] Error updating node status, will retry: error getting node "summit-odience-infra-xk4z8-master-0": Get "https:>
Jun 09 22:48:32 summit-odience-infra-xk4z8-master-0 hyperkube[2053]: E0609 22:48:32.380138 2053 kubelet_node_status.go:470] Error updating node status, will retry: error getting node "summit-odience-infra-xk4z8-master-0": Get "https:>
Jun 09 22:48:32 summit-odience-infra-xk4z8-master-0 hyperkube[2053]: E0609 22:48:32.383109 2053 kubelet_node_status.go:470] Error updating node status, will retry: error getting node "summit-odience-infra-xk4z8-master-0": Get "https:>
Jun 09 22:48:32 summit-odience-infra-xk4z8-master-0 hyperkube[2053]: E0609 22:48:32.385486 2053 kubelet_node_status.go:470] Error updating node status, will retry: error getting node "summit-odience-infra-xk4z8-master-0": Get "https:>
Jun 09 22:48:32 summit-odience-infra-xk4z8-master-0 hyperkube[2053]: E0609 22:48:32.385836 2053 kubelet_node_status.go:457] Unable to update node status: update node status exceeds retry count
Jun 09 22:48:32 summit-odience-infra-xk4z8-master-0 hyperkube[2053]: E0609 22:48:32.791521 2053 controller.go:144] failed to ensure lease exists, will retry in 7s, error: Get "https://api-int.summit-odience-infra.okd.rcs.st:6443/apis>
Jun 09 22:48:33 summit-odience-infra-xk4z8-master-0 hyperkube[2053]: W0609 22:48:33.980228 2053 status_manager.go:550] Failed to get status for pod "dns-default-wwjz7_openshift-dns(e744d9f9-151a-45a8-a1e3-f73598088c77)": Get "https:/>
Jun 09 22:48:33 summit-odience-infra-xk4z8-master-0 hyperkube[2053]: W0609 22:48:33.983467 2053 status_manager.go:550] Failed to get status for pod "machine-config-server-rjz2m_openshift-machine-config-operator(4c7076f7-fdc0-4c0f-8ad>
Jun 09 22:48:33 summit-odience-infra-xk4z8-master-0 hyperkube[2053]: W0609 22:48:33.985810 2053 status_manager.go:550] Failed to get status for pod "openshift-kube-scheduler-operator-67f8c75f44-wd9dd_openshift-kube-scheduler-operator>
Jun 09 22:48:33 summit-odience-infra-xk4z8-master-0 hyperkube[2053]: W0609 22:48:33.987829 2053 status_manager.go:550] Failed to get status for pod "multus-admission-controller-t5cct_openshift-multus(125c0e5f-dd7b-4932-8e86-dff87298a>
Jun 09 22:48:33 summit-odience-infra-xk4z8-master-0 hyperkube[2053]: W0609 22:48:33.990597 2053 status_manager.go:550] Failed to get status for pod "kube-apiserver-operator-65b5d69986-kfkks_openshift-kube-apiserver-operator(f26d2bff->
Jun 09 22:48:33 summit-odience-infra-xk4z8-master-0 hyperkube[2053]: W0609 22:48:33.992843 2053 status_manager.go:550] Failed to get status for pod "service-ca-operator-69864f7977-cw6xv_openshift-service-ca-operator(faadebd1-9500-4b8>
Jun 09 22:48:33 summit-odience-infra-xk4z8-master-0 hyperkube[2053]: W0609 22:48:33.996035 2053 status_manager.go:550] Failed to get status for pod "insights-operator-648b8fcb86-tp7wt_openshift-insights(d78fc7e6-fec4-47cc-a8e6-814a04>
Jun 09 22:48:33 summit-odience-infra-xk4z8-master-0 hyperkube[2053]: W0609 22:48:33.997811 2053 status_manager.go:550] Failed to get status for pod "olm-operator-6cccdd75cf-zcb2c_openshift-operator-lifecycle-manager(e82d8e41-d1e6-43b>
Jun 09 22:48:33 summit-odience-infra-xk4z8-master-0 hyperkube[2053]: W0609 22:48:33.999600 2053 status_manager.go:550] Failed to get status for pod "openshift-config-operator-646647cc57-p8r7c_openshift-config-operator(5367da96-8a69-4>
Jun 09 22:48:34 summit-odience-infra-xk4z8-master-0 hyperkube[2053]: W0609 22:48:34.001302 2053 status_manager.go:550] Failed to get status for pod "catalog-operator-7fbd544bc4-n8vhf_openshift-operator-lifecycle-manager(6fb8001d-eab0>
Jun 09 22:48:34 summit-odience-infra-xk4z8-master-0 hyperkube[2053]: W0609 22:48:34.003028 2053 status_manager.go:550] Failed to get status for pod "cluster-image-registry-operator-6769dccbdc-fc2gx_openshift-image-registry(9b1fbf55-1>
Jun 09 22:48:34 summit-odience-infra-xk4z8-master-0 hyperkube[2053]: W0609 22:48:34.005271 2053 status_manager.go:550] Failed to get status for pod "openshift-kube-scheduler-summit-odience-infra-xk4z8-master-0_openshift-kube-schedule>
Jun 09 22:48:34 summit-odience-infra-xk4z8-master-0 hyperkube[2053]: W0609 22:48:34.006925 2053 status_manager.go:550] Failed to get status for pod "etcd-summit-odience-infra-xk4z8-master-0_openshift-etcd(8e74cae7-4afe-4941-8d7b-34bd>
Jun 09 22:48:34 summit-odience-infra-xk4z8-master-0 hyperkube[2053]: W0609 22:48:34.008436 2053 status_manager.go:550] Failed to get status for pod "ingress-operator-ff794bdbb-ckb27_openshift-ingress-operator(c15d60c2-ca2e-4e01-aca2->
Jun 09 22:48:34 summit-odience-infra-xk4z8-master-0 hyperkube[2053]: W0609 22:48:34.010585 2053 status_manager.go:550] Failed to get status for pod "service-ca-6756b64f77-wchfv_openshift-service-ca(386eee64-1dc8-471a-b443-343dedafcb8>
Jun 09 22:48:34 summit-odience-infra-xk4z8-master-0 hyperkube[2053]: I0609 22:48:34.061413 2053 prober.go:117] Readiness probe for "etcd-summit-odience-infra-xk4z8-master-0_openshift-etcd(8e74cae7-4afe-4941-8d7b-34bd5793696d):etcd" f>
Jun 09 22:48:39 summit-odience-infra-xk4z8-master-0 hyperkube[2053]: I0609 22:48:39.061477 2053 prober.go:117] Readiness probe for "etcd-summit-odience-infra-xk4z8-master-0_openshift-etcd(8e74cae7-4afe-4941-8d7b-34bd5793696d):etcd" f>
Jun 09 22:48:39 summit-odience-infra-xk4z8-master-0 hyperkube[2053]: E0609 22:48:39.794633 2053 controller.go:144] failed to ensure lease exists, will retry in 7s, error: Get "https://api-int.summit-odience-infra.okd.rcs.st:6443/apis>
Jun 09 22:48:40 summit-odience-infra-xk4z8-master-0 hyperkube[2053]: E0609 22:48:40.866498 2053 event.go:273] Unable to write event: '&v1.Event{TypeMeta:v1.TypeMeta{Kind:"", APIVersion:""}, ObjectMeta:v1.ObjectMeta{Name:"dns-default->
Jun 09 22:48:42 summit-odience-infra-xk4z8-master-0 hyperkube[2053]: E0609 22:48:42.388822 2053 kubelet_node_status.go:470] Error updating node status, will retry: error getting node "summit-odience-infra-xk4z8-master-0": Get "https:>
Jun 09 22:48:42 summit-odience-infra-xk4z8-master-0 hyperkube[2053]: E0609 22:48:42.390989 2053 kubelet_node_status.go:470] Error updating node status, will retry: error getting node "summit-odience-infra-xk4z8-master-0": Get "https:>
Jun 09 22:48:42 summit-odience-infra-xk4z8-master-0 hyperkube[2053]: E0609 22:48:42.393252 2053 kubelet_node_status.go:470] Error updating node status, will retry: error getting node "summit-odience-infra-xk4z8-master-0": Get "https:>
Jun 09 22:48:42 summit-odience-infra-xk4z8-master-0 hyperkube[2053]: E0609 22:48:42.395193 2053 kubelet_node_status.go:470] Error updating node status, will retry: error getting node "summit-odience-infra-xk4z8-master-0": Get "https:>
Jun 09 22:48:42 summit-odience-infra-xk4z8-master-0 hyperkube[2053]: E0609 22:48:42.397276 2053 kubelet_node_status.go:470] Error updating node status, will retry: error getting node "summit-odience-infra-xk4z8-master-0": Get "https:>
Jun 09 22:48:42 summit-odience-infra-xk4z8-master-0 hyperkube[2053]: E0609 22:48:42.397331 2053 kubelet_node_status.go:457] Unable to update node status: update node status exceeds retry count
Jun 09 22:48:43 summit-odience-infra-xk4z8-master-0 hyperkube[2053]: I0609 22:48:43.788727 2053 kubelet_getters.go:176] "Pod status updated" pod="openshift-vsphere-infra/haproxy-summit-odience-infra-xk4z8-master-0" status=Running
Jun 09 22:48:43 summit-odience-infra-xk4z8-master-0 hyperkube[2053]: I0609 22:48:43.788890 2053 kubelet_getters.go:176] "Pod status updated" pod="openshift-vsphere-infra/coredns-summit-odience-infra-xk4z8-master-0" status=Running
Jun 09 22:48:43 summit-odience-infra-xk4z8-master-0 hyperkube[2053]: I0609 22:48:43.788940 2053 kubelet_getters.go:176] "Pod status updated" pod="openshift-kube-scheduler/openshift-kube-scheduler-summit-odience-infra-xk4z8-master-0" >
Jun 09 22:48:43 summit-odience-infra-xk4z8-master-0 hyperkube[2053]: I0609 22:48:43.788975 2053 kubelet_getters.go:176] "Pod status updated" pod="openshift-etcd/etcd-summit-odience-infra-xk4z8-master-0" status=Running
Jun 09 22:48:43 summit-odience-infra-xk4z8-master-0 hyperkube[2053]: I0609 22:48:43.789005 2053 kubelet_getters.go:176] "Pod status updated" pod="openshift-vsphere-infra/mdns-publisher-summit-odience-infra-xk4z8-master-0" status=Runn>
Jun 09 22:48:43 summit-odience-infra-xk4z8-master-0 hyperkube[2053]: I0609 22:48:43.789041 2053 kubelet_getters.go:176] "Pod status updated" pod="openshift-vsphere-infra/keepalived-summit-odience-infra-xk4z8-master-0" status=Running
Jun 09 22:48:43 summit-odience-infra-xk4z8-master-0 hyperkube[2053]: W0609 22:48:43.981968 2053 status_manager.go:550] Failed to get status for pod "catalog-operator-7fbd544bc4-n8vhf_openshift-operator-lifecycle-manager(6fb8001d-eab0>
Jun 09 22:48:43 summit-odience-infra-xk4z8-master-0 hyperkube[2053]: I0609 22:48:43.982224 2053 scope.go:95] [topologymanager] RemoveContainer - Container ID: 3b6538fe524debcea6870c641e4a5ab74fa471de7734021eb3aecc277e22e11d
Jun 09 22:48:43 summit-odience-infra-xk4z8-master-0 hyperkube[2053]: E0609 22:48:43.983600 2053 pod_workers.go:191] Error syncing pod c15d60c2-ca2e-4e01-aca2-912347c6b9ac ("ingress-operator-ff794bdbb-ckb27_openshift-ingress-operator(>
Jun 09 22:48:43 summit-odience-infra-xk4z8-master-0 hyperkube[2053]: W0609 22:48:43.985039 2053 status_manager.go:550] Failed to get status for pod "cluster-image-registry-operator-6769dccbdc-fc2gx_openshift-image-registry(9b1fbf55-1>
Jun 09 22:48:43 summit-odience-infra-xk4z8-master-0 hyperkube[2053]: W0609 22:48:43.986857 2053 status_manager.go:550] Failed to get status for pod "openshift-kube-scheduler-summit-odience-infra-xk4z8-master-0_openshift-kube-schedule>
Jun 09 22:48:43 summit-odience-infra-xk4z8-master-0 hyperkube[2053]: W0609 22:48:43.988950 2053 status_manager.go:550] Failed to get status for pod "etcd-summit-odience-infra-xk4z8-master-0_openshift-etcd(8e74cae7-4afe-4941-8d7b-34bd>
Jun 09 22:48:43 summit-odience-infra-xk4z8-master-0 hyperkube[2053]: W0609 22:48:43.996688 2053 status_manager.go:550] Failed to get status for pod "ingress-operator-ff794bdbb-ckb27_openshift-ingress-operator(c15d60c2-ca2e-4e01-aca2->
Jun 09 22:48:43 summit-odience-infra-xk4z8-master-0 hyperkube[2053]: W0609 22:48:43.998540 2053 status_manager.go:550] Failed to get status for pod "service-ca-6756b64f77-wchfv_openshift-service-ca(386eee64-1dc8-471a-b443-343dedafcb8>
Jun 09 22:48:44 summit-odience-infra-xk4z8-master-0 hyperkube[2053]: W0609 22:48:44.000927 2053 status_manager.go:550] Failed to get status for pod "dns-default-wwjz7_openshift-dns(e744d9f9-151a-45a8-a1e3-f73598088c77)": Get "https:/>
Jun 09 22:48:44 summit-odience-infra-xk4z8-master-0 hyperkube[2053]: W0609 22:48:44.005534 2053 status_manager.go:550] Failed to get status for pod "machine-config-server-rjz2m_openshift-machine-config-operator(4c7076f7-fdc0-4c0f-8ad>
Jun 09 22:48:44 summit-odience-infra-xk4z8-master-0 hyperkube[2053]: W0609 22:48:44.010377 2053 status_manager.go:550] Failed to get status for pod "openshift-kube-scheduler-operator-67f8c75f44-wd9dd_openshift-kube-scheduler-operator>
Jun 09 22:48:44 summit-odience-infra-xk4z8-master-0 hyperkube[2053]: W0609 22:48:44.013218 2053 status_manager.go:550] Failed to get status for pod "multus-admission-controller-t5cct_openshift-multus(125c0e5f-dd7b-4932-8e86-dff87298a>
Jun 09 22:48:44 summit-odience-infra-xk4z8-master-0 hyperkube[2053]: W0609 22:48:44.014891 2053 status_manager.go:550] Failed to get status for pod "kube-apiserver-operator-65b5d69986-kfkks_openshift-kube-apiserver-operator(f26d2bff->
Jun 09 22:48:44 summit-odience-infra-xk4z8-master-0 hyperkube[2053]: W0609 22:48:44.016731 2053 status_manager.go:550] Failed to get status for pod "service-ca-operator-69864f7977-cw6xv_openshift-service-ca-operator(faadebd1-9500-4b8>
-
Looks like it did not pass:

$ systemctl status machine-config-daemon-firstboot
○ machine-config-daemon-firstboot.service - Machine Config Daemon Firstboot
     Loaded: loaded (/etc/systemd/system/machine-config-daemon-firstboot.service; enabled; vendor preset: disabled)
     Active: inactive (dead)
  Condition: start condition failed at Wed 2021-06-09 22:27:24 UTC; 17h ago

Jun 09 22:27:24 summit-odience-infra-xk4z8-master-0 systemd[1]: Condition check resulted in Machine Config Daemon Firstboot being skipped.
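To see which condition was checked, a sketch (the marker-file path is an assumption based on the machine-config-operator of this era; the file is removed once the firstboot service has run):

$ systemctl cat machine-config-daemon-firstboot   # shows the ConditionPathExists line
$ ls -l /etc/ignition-machine-config-encapsulated.json   # assumed firstboot marker file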
-
If kubelet is running, it means the firstboot service passed on a previous boot.
Kubelet can't reach api-int, which is the internal API load balancer. Seems similar to #665
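A quick check from the master itself (a sketch; the api-int hostname is taken from the kubelet log above, and /healthz is the standard API server health endpoint):

$ dig +short api-int.summit-odience-infra.okd.rcs.st
$ curl -k https://api-int.summit-odience-infra.okd.rcs.st:6443/healthz   # -k skips TLS verification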
-
We tried a deployment on the same infrastructure with a trial version of OpenShift, and that worked, so the cause is most likely a difference in the code. We are still investigating. Where is the API hosted, and which node is trying to access it? We are also able to deploy an older version of OKD (e.g., 4.4) successfully.
-
Each master and the bootstrap node run an API container. Did you get a chance to collect a log bundle with data from the masters?
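If the standard gather can't reach the masters automatically, their addresses and the SSH key can be passed explicitly (a sketch; the IPs and key path are placeholders):

$ openshift-install gather bootstrap --bootstrap <bootstrap_ip> \
    --master <master0_ip> --master <master1_ip> --master <master2_ip> \
    --key ~/.ssh/id_rsa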
-
This may be similar to issue #557, but this case seems to have a different root cause based on the logs observed.
Description
We are trying to create a basic IPI cluster as a test on VMware. Using a similar install-config.yaml on AWS worked fine, but once the same config is switched over to VMware 6.7, the installation fails consistently. The installation log from the console shows the following errors:
[Install log (shell output) omitted]
Version
This has been happening since OKD 4.5; it was retested yesterday with okd-2021-06-04-191031 and confirmed to still show the same issue.
How reproducible
This happens 100% of the time (attempted 3 times, each with a fresh installation of FC 33 on the machine running the installer). The install-config used can be found below:
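For orientation, a minimal vSphere IPI install-config has roughly this shape (a sketch with placeholder values only; this is not the actual configuration used here):

apiVersion: v1
baseDomain: example.com          # placeholder
metadata:
  name: testcluster              # placeholder
platform:
  vsphere:
    vCenter: vcenter.example.com # placeholder
    username: user@vsphere.local
    password: <password>
    datacenter: dc1
    defaultDatastore: datastore1
    cluster: cluster1
    network: VM Network
    apiVIP: 192.168.0.5          # placeholder VIP
    ingressVIP: 192.168.0.6      # placeholder VIP
pullSecret: '<pull-secret>'
sshKey: '<ssh-public-key>'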
Log bundle
The log bundle will take some time to sanitize. In the meantime, the bootkube log appears to have some indications of the failures that occurred.
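For reference, the bootkube log can be followed on the bootstrap node over SSH (a sketch; the bootstrap address is a placeholder):

$ ssh core@<bootstrap_ip> journalctl -b -f -u release-image.service -u bootkube.service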