v2: selfReporting generates incomplete metric grafana_kubernetes_monitoring_build_info #875

Open
marcomusso opened this issue Nov 8, 2024 · 5 comments

Comments

@marcomusso

Starting from a vanilla values file, I enabled selfReporting to ensure all metrics checks are green in Grafana Cloud (Home => Infrastructure => Kubernetes => Configuration => Metrics status).

I defined 3 destinations (prometheus, loki and otlp) and deployed.
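
Roughly, the values I used look like this (the destination names and URLs are placeholders, and the keys are from memory, so they may not match the chart's values.yaml exactly):

  cluster:
    name: my-cluster

  destinations:
    - name: metricsService
      type: prometheus
      url: https://prometheus.example.com/api/prom/push
    - name: logsService
      type: loki
      url: https://loki.example.com/loki/api/v1/push
    - name: otlpService
      type: otlp
      url: https://otlp.example.com/otlp

  selfReporting:
    enabled: true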

In the resulting Alloy configs, only the singleton shows the reporting blocks added by the chart templates (which is probably correct since it's the $chosenCollector, even though I would expect the receiver to have higher priority if the "list" is ordered), and it specifically generates this metric:

# TYPE grafana_kubernetes_monitoring_build_info gauge
grafana_kubernetes_monitoring_build_info{version="2.0.0-rc.2", namespace="grafana-k8s-monitoring-v2", platform=""} 1

which lacks the cluster label (i.e. it isn't added even in later relabeling blocks), thus failing the check:

[Screenshot: failing Metrics status check in Grafana Cloud]

Here we can query for that metric and see its labels:

[Screenshot: querying the metric and inspecting its labels]

Please note: the cluster label is missing from all grafana_kubernetes.* metrics; is that expected/correct?

As a side note: the selfReporting.scrapeInterval key can be set, but it's overridden by the global one, so I don't see the point in setting it (and the comment/description in the values file doesn't help clarify how to use it).
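
For reference, this is roughly how I set it (assuming the key takes a duration string like the other interval settings):

  selfReporting:
    enabled: true
    scrapeInterval: 5m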

@petewall
Collaborator

petewall commented Nov 8, 2024

The scrape interval for self-reporting isn't overridden by the global one:

  scrape_interval = {{ .Values.selfReporting.scrapeInterval | default "1h" | quote}}

If you're seeing otherwise, let me know.

The cluster label should be set by the destination, not the data source (self-reporting). I'll investigate why it isn't showing up here.
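
In other words, once the destination attaches it, the sample should end up looking roughly like this (the cluster value below is just a placeholder):

  grafana_kubernetes_monitoring_build_info{cluster="my-cluster", version="2.0.0-rc.2", namespace="grafana-k8s-monitoring-v2", platform=""} 1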

@marcomusso
Author

marcomusso commented Nov 8, 2024

The fact is (what I tried): I set that scrape interval to 5m and yet got samples every minute; that's why I said it was overridden (maybe a poor choice of words; I didn't investigate much, but trusted the comment in the values file).

@petewall
Collaborator

petewall commented Nov 8, 2024

Cool. I'll try it out

@marcomusso
Author

marcomusso commented Nov 8, 2024

BTW: the cluster label is now present in all grafana_kubernetes_* metrics. It probably wasn't added as an "external label" before because it wasn't being sent to a prometheus destination (if I read that line correctly)?

PS: is it wrong to assume that an OTLP-only destination should be able to carry/relabel everything correctly?

@petewall
Collaborator

petewall commented Nov 8, 2024

Yeah, I just fixed an issue where it would request prometheus-ecosystem metrics destinations, but then try to use otlp-ecosystem metrics destinations. I fixed it to be consistent (prefer prometheus ecosystem).

I also need to fix the otlp destination to set cluster as well as k8s.cluster.name, which matches the behavior of the loki and prometheus destinations.
