Nodes become unhealthy after upgrading from 4.11 to 4.12 #2003
hamidostad started this conversation in General
Replies: 1 comment
-
Hi, we are not working on FCOS builds of OKD any more. Please see these documents: https://okd.io/blog/2024/06/01/okd-future-statement
We will be providing documentation on upgrading clusters from 4.15 FCOS to 4.16 SCOS. For clusters that are older, you may be able to get help from community members, so I'll convert this to a discussion to facilitate that.
Many thanks, Jaime
-
The previous cluster version was 4.11.0-0.okd-2022-12-02-145640. After upgrading the cluster to any 4.12 version, the nodes become unhealthy. When we check the nodes, we find that the underlying EC2 instances are unhealthy as well. Inspecting the instances and their services, we see a NetworkManager error: no IP is assigned to the instance, and the kubelet service is not running. The error ultimately points to ovsdb-server: the user and group "openvswitch:hugetlbfs" do not exist on the instance, which causes ovsdb-server and openvswitch to fail.
Creating the missing user and group solves the problem.
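For reference, a minimal sketch of that workaround, run as root on each affected node. The account names come from the ovsdb-server error; the system-account flags, nologin shell, and exact service unit names are assumptions on our part, not taken from the original report.
# hypothetical workaround sketch; adjust flags and unit names to your environment
groupadd -r hugetlbfs
useradd -r -g hugetlbfs -s /sbin/nologin openvswitch
systemctl restart ovsdb-server openvswitch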
The question is:
Why does upgrading to 4.12 cause this problem? The cluster does not have this issue when applying patch upgrades within 4.11.
ovsdb-server log (attachment not reproduced here)
Cluster upgrade history (attachment not reproduced here)
Version
from: 4.11.0-0.okd-2022-12-02-145640
to: 4.12.0-0.okd-2023-03-18-084815
How to reproduce
oc adm upgrade --to="4.12.0-0.okd-2023-03-18-084815"
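After the upgrade, the symptom can be confirmed on an affected node with checks along these lines (assumed diagnostic commands, not part of the original report):
# verify the missing account and the failing services
id openvswitch || echo "openvswitch user missing"
getent group hugetlbfs || echo "hugetlbfs group missing"
systemctl status ovsdb-server kubelet --no-pager
journalctl -u ovsdb-server --no-pager | tail -n 50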