"Faults" view should show all Terminating pods #2738
Comments
The reason is that a pod in Terminating does not necessarily have unready containers, and container readiness is what k9s uses to classify faulty pods. You can see that here: Lines 168 to 177 in be1ec87
When an error is returned, it is propagated to k9s/internal/model1/table_data.go, Line 211 in be1ec87
In my experience, we had an EC2 instance fail, causing services to go down. The pods showed as Terminating but did not appear in the Faults view because their container statuses were still Ready. So I agree with @akatch. Most of the time the container-ready/container-total metric works, but there are cases where it doesn't apply. In addition, seeing all Terminating pods would help reveal pods that are stuck in Terminating. We've dealt with that problem extensively and had to search for them manually because they are filtered out of the Faults view.
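To make the distinction in these comments concrete, here is a minimal Go sketch, not the actual k9s code. The `Pod` struct and the `readinessOnlyFault` and `isFaulty` functions are hypothetical names invented for illustration. The sketch assumes a pod counts as Terminating when its `metadata.deletionTimestamp` is set, which is how `kubectl` derives that status.

```go
package main

import "fmt"

// Pod is a hypothetical, simplified stand-in for the pod fields that matter
// here. It is not a k9s or client-go type.
type Pod struct {
	Name            string
	Terminating     bool // assumed true when metadata.deletionTimestamp is set
	ReadyContainers int
	TotalContainers int
}

// readinessOnlyFault models the behavior described in the comments: a pod is
// faulty only when fewer containers are ready than the pod has in total.
func readinessOnlyFault(p Pod) bool {
	return p.ReadyContainers < p.TotalContainers
}

// isFaulty models the requested behavior: any Terminating pod is faulty,
// regardless of what its container statuses report.
func isFaulty(p Pod) bool {
	return p.Terminating || readinessOnlyFault(p)
}

func main() {
	pods := []Pod{
		{Name: "healthy", ReadyContainers: 2, TotalContainers: 2},
		{Name: "crashing", ReadyContainers: 0, TotalContainers: 1},
		// Node went away: containers still report Ready, pod is Terminating.
		{Name: "stuck", Terminating: true, ReadyContainers: 1, TotalContainers: 1},
	}
	for _, p := range pods {
		fmt.Printf("%-10s readinessOnly=%t proposed=%t\n",
			p.Name, readinessOnlyFault(p), isFaulty(p))
	}
}
```

The `stuck` pod is the case reported in this issue: a readiness-only check lets it through, while a check that also considers the Terminating state would flag it.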
Describe the bug
Enabling the "Toggle Faults" view shows some Terminating pods, but not all. Enabling this view should display all Terminating pods (and indeed all pods not in a Running and Ready state). However, it is unclear why some pods show up as Terminating in this view, but others do not. I did some brief digging in the code and it is not entirely clear how k9s determines which pods are considered faulty - it's possible that some Terminating pods meet these criteria but not all.
Further investigation shows that some Terminating pods with Events such as Node Not Ready (which I would absolutely expect to show up in Faults) do not appear in the Faults view. This is the case in the attached screenshots below.
To Reproduce
Steps to reproduce the behavior:
1. Open :pods [namespace] in a namespace where many pods are Terminating
2. Toggle the Faults view (ctrl+z by default)
Expected behavior
All pods not in a Running/Ready state should appear when Faults view is enabled.
Screenshots
I have had to heavily sanitize these but hopefully they help demonstrate the issue.
A view of all pods, in particular many that are Terminating
The same namespace captured moments later in the Faults view. No Terminating pods are seen.
Versions