Generated Prometheus metrics output not meet with the requirements #2366

Closed
kallaics opened this issue Apr 8, 2024 · 6 comments
Assignees
Labels
kind/bug Categorizes issue or PR as related to a bug. triage/accepted Indicates an issue or PR is ready to be actively worked on.

Comments

@kallaics

kallaics commented Apr 8, 2024

What happened:

The KSM configuration worked well up to and including KSM v2.10.1. After the upgrade to v2.11.0, Prometheus reported an "invalid metric type" error. The latest version, v2.12.0, fixed the "invalid metric type" error, but the output now contains only one resource type per metric name. The deployment and configuration did not change during this period.

The issue affects the "build_info" metric name.

What you expected to happen:

Prometheus output that exposes the same metric name for more than one resource type.

How to reproduce it (as minimally and precisely as possible):

  1. Deploy kube-state-metrics from the prometheus-community/kube-prometheus-stack Helm chart via FluxCD.
  2. Use the following kube-state-metrics configuration (YAML format):
kube-state-metrics:
  collectors: [ ]
  extraArgs:
    - --custom-resource-state-only=true
  rbac:
    extraRules:
      - apiGroups:
          - apps
        resources:
          - deployments
        verbs: 
          - list
          - watch
      - apiGroups:
          - source.toolkit.fluxcd.io
          - kustomize.toolkit.fluxcd.io
          - helm.toolkit.fluxcd.io
          - notification.toolkit.fluxcd.io
          - image.toolkit.fluxcd.io
        resources:
          - gitrepositories
          - buckets
          - helmrepositories
          - helmcharts
          - ocirepositories
          - kustomizations
          - helmreleases
          - alerts
          - providers
          - receivers
          - imagerepositories
          - imagepolicies
          - imageupdateautomations
        verbs: [ "list", "watch" ]
  customResourceState:
    enabled: true
    config:
      spec:
        resources:
          - groupVersionKind:
              group: apps
              version: v1
              kind: Deployment
            metricNamePrefix: gotk
            metrics:
              - name: "build_info"
                help: "The current state of a GitOps Toolkit resource."
                each:
                  type: Info
                  info:
                    labelsFromPath:
                      version: [metadata, labels, "app.kubernetes.io/version" ]
                      component: [metadata, labels, "app.kubernetes.io/component" ]
                      instance: [metadata, labels, "app.kubernetes.io/instance" ]
                labelsFromPath:
                  exported_namespace: [ metadata, namespace ]
          - groupVersionKind:
              group: kustomize.toolkit.fluxcd.io
              version: v1
              kind: Kustomization
            metricNamePrefix: gotk
            metrics:
              - name: "resource_info"
                help: "The current state of a GitOps Toolkit resource."
                each:
                  type: Info
                  info:
                    labelsFromPath:
                      name: [ metadata, name ]
                labelsFromPath:
                  exported_namespace: [ metadata, namespace ]
                  ready: [ status, conditions, "[type=Ready]", status ]
                  status: [ status, conditions, "[type=Ready]", reason ]
                  reconciling: [ status, conditions, "[type=Reconciling]", status ]
                  stalled: [ status, conditions, "[type=Stalled]", status ]
                  suspended: [ spec, suspend ]
                  source_name: [ spec, sourceRef, name ]
          - groupVersionKind:
              group: helm.toolkit.fluxcd.io
              version: v2beta2
              kind: HelmRelease
            metricNamePrefix: gotk
            metrics:
              - name: "resource_info"
                help: "The current state of a GitOps Toolkit resource."
                each:
                  type: Info
                  info:
                    labelsFromPath:
                      name: [ metadata, name ]
                labelsFromPath:
                  exported_namespace: [ metadata, namespace ]
                  ready: [ status, conditions, "[type=Ready]", status ]
                  status: [ status, conditions, "[type=Ready]", reason ]
                  reconciling: [ status, conditions, "[type=Reconciling]", status ]
                  stalled: [ status, conditions, "[type=Stalled]", status ]
                  released: [ status, conditions, "[type=Released]", status ]
                  suspended: [ spec, suspend ]
                  chart_name: [ spec, chart, spec, chart ]
                  chart_source_name: [ spec, chart, spec, sourceRef, name ]
          - groupVersionKind:
              group: source.toolkit.fluxcd.io
              version: v1
              kind: GitRepository
            metricNamePrefix: gotk
            metrics:
              - name: "resource_info"
                help: "The current state of a GitOps Toolkit resource."
                each:
                  type: Info
                  info:
                    labelsFromPath:
                      name: [ metadata, name ]
                labelsFromPath:
                  exported_namespace: [ metadata, namespace ]
                  ready: [ status, conditions, "[type=Ready]", status ]
                  status: [ status, conditions, "[type=Ready]", reason ]
                  reconciling: [ status, conditions, "[type=Reconciling]", status ]
                  stalled: [ status, conditions, "[type=Stalled]", status ]
                  suspended: [ spec, suspend ]
                  url: [ spec, url ]
          - groupVersionKind:
              group: source.toolkit.fluxcd.io
              version: v1beta2
              kind: Bucket
            metricNamePrefix: gotk
            metrics:
              - name: "resource_info"
                help: "The current state of a GitOps Toolkit resource."
                each:
                  type: Info
                  info:
                    labelsFromPath:
                      name: [ metadata, name ]
                labelsFromPath:
                  exported_namespace: [ metadata, namespace ]
                  ready: [ status, conditions, "[type=Ready]", status ]
                  status: [ status, conditions, "[type=Ready]", reason ]
                  reconciling: [ status, conditions, "[type=Reconciling]", status ]
                  stalled: [ status, conditions, "[type=Stalled]", status ]
                  suspended: [ spec, suspend ]
                  endpoint: [ spec, endpoint ]
                  bucket_name: [ spec, bucketName ]
          - groupVersionKind:
              group: source.toolkit.fluxcd.io
              version: v1beta2
              kind: HelmRepository
            metricNamePrefix: gotk
            metrics:
              - name: "resource_info"
                help: "The current state of a GitOps Toolkit resource."
                each:
                  type: Info
                  info:
                    labelsFromPath:
                      name: [ metadata, name ]
                labelsFromPath:
                  exported_namespace: [ metadata, namespace ]
                  ready: [ status, conditions, "[type=Ready]", status ]
                  status: [ status, conditions, "[type=Ready]", reason ]
                  reconciling: [ status, conditions, "[type=Reconciling]", status ]
                  stalled: [ status, conditions, "[type=Stalled]", status ]
                  suspended: [ spec, suspend ]
                  url: [ spec, url ]
          - groupVersionKind:
              group: source.toolkit.fluxcd.io
              version: v1beta2
              kind: HelmChart
            metricNamePrefix: gotk
            metrics:
              - name: "resource_info"
                help: "The current state of a GitOps Toolkit resource."
                each:
                  type: Info
                  info:
                    labelsFromPath:
                      name: [ metadata, name ]
                labelsFromPath:
                  exported_namespace: [ metadata, namespace ]
                  ready: [ status, conditions, "[type=Ready]", status ]
                  status: [ status, conditions, "[type=Ready]", reason ]
                  reconciling: [ status, conditions, "[type=Reconciling]", status ]
                  stalled: [ status, conditions, "[type=Stalled]", status ]
                  suspended: [ spec, suspend ]
                  chart_name: [ spec, chart ]
                  chart_version: [ spec, version ]
          - groupVersionKind:
              group: source.toolkit.fluxcd.io
              version: v1beta2
              kind: OCIRepository
            metricNamePrefix: gotk
            metrics:
              - name: "resource_info"
                help: "The current state of a GitOps Toolkit resource."
                each:
                  type: Info
                  info:
                    labelsFromPath:
                      name: [ metadata, name ]
                labelsFromPath:
                  exported_namespace: [ metadata, namespace ]
                  ready: [ status, conditions, "[type=Ready]", status ]
                  status: [ status, conditions, "[type=Ready]", reason ]
                  reconciling: [ status, conditions, "[type=Reconciling]", status ]
                  stalled: [ status, conditions, "[type=Stalled]", status ]
                  suspended: [ spec, suspend ]
                  url: [ spec, url ]
          - groupVersionKind:
              group: notification.toolkit.fluxcd.io
              version: v1beta3
              kind: Alert
            metricNamePrefix: gotk
            metrics:
              - name: "resource_info"
                help: "The current state of a GitOps Toolkit resource."
                each:
                  type: Info
                  info:
                    labelsFromPath:
                      name: [ metadata, name ]
                labelsFromPath:
                  exported_namespace: [ metadata, namespace ]
                  ready: [ status, conditions, "[type=Ready]", status ]
                  status: [ status, conditions, "[type=Ready]", reason ]
                  reconciling: [ status, conditions, "[type=Reconciling]", status ]
                  stalled: [ status, conditions, "[type=Stalled]", status ]
                  suspended: [ spec, suspend ]
          - groupVersionKind:
              group: notification.toolkit.fluxcd.io
              version: v1beta3
              kind: Provider
            metricNamePrefix: gotk
            metrics:
              - name: "resource_info"
                help: "The current state of a GitOps Toolkit resource."
                each:
                  type: Info
                  info:
                    labelsFromPath:
                      name: [ metadata, name ]
                labelsFromPath:
                  exported_namespace: [ metadata, namespace ]
                  ready: [ status, conditions, "[type=Ready]", status ]
                  status: [ status, conditions, "[type=Ready]", reason ]
                  reconciling: [ status, conditions, "[type=Reconciling]", status ]
                  stalled: [ status, conditions, "[type=Stalled]", status ]
                  suspended: [ spec, suspend ]
          - groupVersionKind:
              group: notification.toolkit.fluxcd.io
              version: v1
              kind: Receiver
            metricNamePrefix: gotk
            metrics:
              - name: "resource_info"
                help: "The current state of a GitOps Toolkit resource."
                each:
                  type: Info
                  info:
                    labelsFromPath:
                      name: [ metadata, name ]
                labelsFromPath:
                  exported_namespace: [ metadata, namespace ]
                  ready: [ status, conditions, "[type=Ready]", status ]
                  status: [ status, conditions, "[type=Ready]", reason ]
                  reconciling: [ status, conditions, "[type=Reconciling]", status ]
                  stalled: [ status, conditions, "[type=Stalled]", status ]
                  suspended: [ spec, suspend ]
                  webhook_path: [ status, webhookPath ]
          - groupVersionKind:
              group: image.toolkit.fluxcd.io
              version: v1beta2
              kind: ImageRepository
            metricNamePrefix: gotk
            metrics:
              - name: "resource_info"
                help: "The current state of a GitOps Toolkit resource."
                each:
                  type: Info
                  info:
                    labelsFromPath:
                      name: [ metadata, name ]
                labelsFromPath:
                  exported_namespace: [ metadata, namespace ]
                  ready: [ status, conditions, "[type=Ready]", status ]
                  status: [ status, conditions, "[type=Ready]", reason ]
                  reconciling: [ status, conditions, "[type=Reconciling]", status ]
                  stalled: [ status, conditions, "[type=Stalled]", status ]
                  suspended: [ spec, suspend ]
                  image: [ spec, image ]
          - groupVersionKind:
              group: image.toolkit.fluxcd.io
              version: v1beta2
              kind: ImagePolicy
            metricNamePrefix: gotk
            metrics:
              - name: "resource_info"
                help: "The current state of a GitOps Toolkit resource."
                each:
                  type: Info
                  info:
                    labelsFromPath:
                      name: [ metadata, name ]
                labelsFromPath:
                  exported_namespace: [ metadata, namespace ]
                  ready: [ status, conditions, "[type=Ready]", status ]
                  status: [ status, conditions, "[type=Ready]", reason ]
                  reconciling: [ status, conditions, "[type=Reconciling]", status ]
                  stalled: [ status, conditions, "[type=Stalled]", status ]
                  suspended: [ spec, suspend ]
                  source_name: [ spec, imageRepositoryRef, name ]
          - groupVersionKind:
              group: image.toolkit.fluxcd.io
              version: v1beta1
              kind: ImageUpdateAutomation
            metricNamePrefix: gotk
            metrics:
              - name: "resource_info"
                help: "The current state of a GitOps Toolkit resource."
                each:
                  type: Info
                  info:
                    labelsFromPath:
                      name: [ metadata, name ]
                labelsFromPath:
                  exported_namespace: [ metadata, namespace ]
                  ready: [ status, conditions, "[type=Ready]", status ]
                  status: [ status, conditions, "[type=Ready]", reason ]
                  reconciling: [ status, conditions, "[type=Reconciling]", status ]
                  stalled: [ status, conditions, "[type=Stalled]", status ]
                  suspended: [ spec, suspend ]
                  source_name: [ spec, sourceRef, name ]

Anything else we need to know?:

Environment:

  • kube-state-metrics version: v2.12.0
  • Kubernetes version (use kubectl version): 1.28.5
  • Cloud provider or hardware configuration: Azure Kubernetes Service
  • Other info: Deployed with Helm from kube-prometheus-stack Helm chart.
@kallaics kallaics added the kind/bug Categorizes issue or PR as related to a bug. label Apr 8, 2024
@k8s-ci-robot k8s-ci-robot added the needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. label Apr 8, 2024
@kallaics kallaics changed the title Metrics ouput not meet with the requirements Generated Prometheus metrics output not meet with the requirements Apr 8, 2024
@kingdonb

I've tested the flux2-monitoring-example and verified that we were using kube-state-metrics v2.12.0. It does not seem to resolve the issue completely, though some metrics came back. In fluxcd/flux2-monitoring-example#32 you can see that only the "HelmRelease" metrics were returned; the metrics for the other resource kinds did not come back.

@speer

speer commented Apr 16, 2024

I ran some tests and found that it is related to the change to the SanitizeHeaders function in #2270: https://github.com/kubernetes/kube-state-metrics/pull/2270/files#diff-60450a33adea08c953656dd1e78a80e9f3b279bbc7656dedf31fd1a0c7fc1196

The issue seems to be the shared help text, help: "The current state of a GitOps Toolkit resource." If you make it unique (e.g., a different one for HelmRelease, Kustomization, etc.), the metrics are no longer removed by the function mentioned above.
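
As a minimal sketch of the workaround described above, the fragment below trims the reporter's configuration to two kinds and gives each `resource_info` metric its own help text. The help wording is an assumption chosen for illustration, not text from the issue or from the Flux project; only the fact that the strings differ per kind matters here.

```yaml
kube-state-metrics:
  customResourceState:
    enabled: true
    config:
      spec:
        resources:
          - groupVersionKind:
              group: kustomize.toolkit.fluxcd.io
              version: v1
              kind: Kustomization
            metricNamePrefix: gotk
            metrics:
              - name: "resource_info"
                # Illustrative help text, unique to this kind (assumed wording).
                help: "The current state of a Flux Kustomization resource."
                each:
                  type: Info
                  info:
                    labelsFromPath:
                      name: [ metadata, name ]
                labelsFromPath:
                  exported_namespace: [ metadata, namespace ]
          - groupVersionKind:
              group: helm.toolkit.fluxcd.io
              version: v2beta2
              kind: HelmRelease
            metricNamePrefix: gotk
            metrics:
              - name: "resource_info"
                # Illustrative help text, unique to this kind (assumed wording).
                help: "The current state of a Flux HelmRelease resource."
                each:
                  type: Info
                  info:
                    labelsFromPath:
                      name: [ metadata, name ]
                labelsFromPath:
                  exported_namespace: [ metadata, namespace ]
```

The metric name and prefix stay identical across kinds, so existing queries on `gotk_resource_info` should not need to change; the remaining kinds from the original configuration would follow the same pattern.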

I am just not sure whether that's a bug or a feature. Maybe the author, @rexagod, knows?

@logicalhan
Member

/assign @CatherineF-dev
/triage accepted

@k8s-ci-robot k8s-ci-robot added triage/accepted Indicates an issue or PR is ready to be actively worked on. and removed needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. labels Apr 18, 2024
@kallaics
Author

I can confirm. After I changed the "help" fields, the metrics appeared in Prometheus and Grafana. Thanks, @speer!

@rexagod
Member

rexagod commented May 20, 2024

Hello, apologies for the late response. 👋🏼

Prometheus' protobuf machinery does not support all OpenMetrics types at the moment (#2248). To resolve this, #2270 was merged, which implicitly converts stateset and info metrics to gauge metrics before emitting them (PTAL at these test cases). This, in turn, gave rise to cases where metrics that previously did not appear to conflict could now conflict, which is why the patch had to include a deduplication step. The issue raised here is a side effect of that step.

fluxcd/flux2-monitoring-example#32 (comment) presents a take on this that has been the implicit sentiment for such configuration scenarios: if the use case warrants different groupVersionKind definitions, they should ideally be accompanied by different help texts to indicate what changed between them.

I'd be happy to follow this up by pointing out the caveat observed here in the documentation for future instances.

@kallaics
Author

Hello,

Many thanks for the detailed explanation. It's clear now why it caused the issue. 👍️

7 participants