[k8scluster] add k8s.container.status_last_terminated_reason metric #31282
Comments
Pinging code owners for receiver/k8scluster: @dmitryax @TylerHelmuth @povilasv. See Adding Labels via Comments if you do not have permissions to add labels yourself.
@povilasv
I've grepped through the Kubernetes code base and couldn't find a set of possible Reason values. It seems that it's set dynamically, and it is really hard to find the actual possible values. I think this is why kube-state-metrics also did it in a similar way -> https://github.com/kubernetes/kube-state-metrics/blob/122e5e899943eb78eaf3e366733d5dbec6613ac0/internal/store/pod.go#L339
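To illustrate where the value comes from: the reason string lives on the container status's `LastTerminationState`, and kubelet fills it in at runtime, which is why there is no fixed enum. A minimal sketch below mirrors the relevant `k8s.io/api/core/v1` struct shapes locally (the type definitions here are simplified stand-ins; real code would import the upstream package):

```go
package main

import "fmt"

// Simplified stand-ins for the k8s.io/api/core/v1 types involved.
// Real code would use v1.ContainerStatus from the upstream package.
type ContainerStateTerminated struct {
	Reason string // e.g. "OOMKilled", "Error", "Completed" — set dynamically by kubelet
}

type ContainerState struct {
	Terminated *ContainerStateTerminated
}

type ContainerStatus struct {
	Name                 string
	LastTerminationState ContainerState
}

// lastTerminatedReason returns the free-form Reason string from a
// container's last termination, or "" if the container has never
// terminated.
func lastTerminatedReason(cs ContainerStatus) string {
	if t := cs.LastTerminationState.Terminated; t != nil {
		return t.Reason
	}
	return ""
}

func main() {
	cs := ContainerStatus{
		Name: "app",
		LastTerminationState: ContainerState{
			Terminated: &ContainerStateTerminated{Reason: "OOMKilled"},
		},
	}
	fmt.Println(lastTerminatedReason(cs))
}
```

Because the string is free-form, consuming it as an attribute value (rather than enumerating per-reason metrics) avoids having to know every possible value up front, which is the same trade-off kube-state-metrics made.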
In that case an attribute makes sense to me
Opened a PR for it #31281 :)
Would adding a
@avanish-vaghela maybe. Please open another issue for that request.
I don't get why another metric is needed for this. Can it be an optional resource attribute instead?
@dmitryax I believe you meant a metric attribute, as the attribute's value would change based on the state when the metric value is recorded
Thanks for the feedback. I think you are right, this should be a resource attribute. I think I'm still used to modelling everything as a metric. Given OTel has a Resource model, this seems to fit more. Container is the resource and
… resource attribute (open-telemetry#31505) **Description:** Add k8s.container.status.last_terminated_reason resource attribute **Link to tracking Issue:** open-telemetry#31282
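For anyone landing here later: once the attribute exists in the receiver, it can typically be toggled in the collector configuration. A sketch, assuming the receiver follows the usual `resource_attributes` enablement pattern and that the attribute ships disabled by default (check the receiver's documentation for the exact key and default):

```yaml
receivers:
  k8s_cluster:
    # Hypothetical enablement snippet; verify the key name against the
    # receiver's generated documentation.
    resource_attributes:
      k8s.container.status.last_terminated_reason:
        enabled: true
```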
Can this issue be closed now that #31505 is merged?
@povilasv, thank you for enabling us to detect OOMKilled errors. However, which metric should we utilize for detecting CrashLoopBackOff? kube-state-metrics is using kube_pod_container_status_waiting_reason{reason="CrashLoopBackOff"}. Thank you.
@ElfoLiNk you need a different resource attribute (status_waiting_reason) for this. I suggest you file a new issue.
Component(s)
receiver/k8scluster
Is your feature request related to a problem? Please describe.
I would like to get some container state metrics about termination. One use case is to know whether the container was terminated due to an OOM kill or an application error.
Example happening in pod:
Kube State Metrics has this modelled as this Prometheus metric:
Ref: https://github.com/kubernetes/kube-state-metrics/blob/main/docs/pod-metrics.md
So it would be great to have a similar metric.
Describe the solution you'd like
Not sure how to model it in OTel correctly, but I'm thinking something like this:
Describe alternatives you've considered
No response
Additional context
No response