OKD (kernel issues (6.x)) #1874
forzamehlano started this conversation in General
-
Correct, it needs to be fixed upstream. The fix lands in Fedora first, is then picked up by FCOS, and finally reaches OKD.
You can build your own OS with a downgraded kernel to mitigate this, but in general, if a kernel is deemed stable for FCOS and passes the k8s conformance tests, it is considered good enough for OKD. I suggest filing a Fedora bug so that the fix can be proposed as critical enough for FCOS.
-
We're running into https://bugzilla.kernel.org/show_bug.cgi?id=217572 and have been since OKD 4.12, I believe (currently running 4.14.0-0.okd-2023-11-14-101924).
We managed to mitigate the bug somewhat by migrating all of our persistent volumes away from xfs in favour of ext4. That move reduced the number of instances of this bug from 3 or 4 per day down to a handful per week. This helped, but clearly it is still a problem.
We're after ideas on how to either troubleshoot this issue or work around it. The idea of SCOS and a kernel more closely aligned with RHCOS makes sense, but a lot of the ecosystem around SCOS doesn't feel ready yet (NVIDIA GPU driver containers, for example).
The bug itself is almost certainly I/O related, but we're unable to reproduce it on demand.
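Since the bug can't be triggered on demand, one way to at least put numbers on it (for example, to compare before and after a filesystem or kernel change) might be to tally incidents per day from saved kernel logs. The following is only a rough sketch. It assumes the incident shows up as a kernel hung-task warning ("blocked for more than N seconds"), which may not match what your nodes actually log, and it assumes lines carry ISO timestamps such as those from `journalctl -k -o short-iso`. The names `SIGNATURE`, `DATE` and `incidents_per_day` are made up for illustration.

```python
import re
import sys
from collections import Counter

# Assumption: the incident appears as a kernel hung-task warning. Replace this
# pattern with whatever signature your nodes actually log.
SIGNATURE = re.compile(r"blocked for more than \d+ seconds")

# Captures the leading date of an ISO-style timestamp, e.g. the output of
# `journalctl -k -o short-iso` (2023-11-20T03:14:07+0000 host kernel: ...).
DATE = re.compile(r"^(\d{4}-\d{2}-\d{2})T")


def incidents_per_day(lines):
    """Count log lines matching SIGNATURE, grouped by calendar date."""
    counts = Counter()
    for line in lines:
        date = DATE.match(line)
        if date and SIGNATURE.search(line):
            counts[date.group(1)] += 1
    # Return dates in ascending order for easy reading.
    return dict(sorted(counts.items()))


if __name__ == "__main__":
    # Reads saved log lines from stdin and prints one "date count" pair per day.
    for day, n in incidents_per_day(sys.stdin).items():
        print(day, n)
```

Saved as, say, `incidents.py` (a hypothetical name), it could be run on a workstation against collected logs with `python3 incidents.py < kernel.log`. Note that one incident may log several matching lines, so the counts are an upper bound rather than an exact incident count.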
Appreciate this isn't directly an OKD issue, but we need some ideas from an OKD perspective on where we can go from here...
Any bright ideas/thoughts?