
labels_allow_list and annotations_allow_list wildcards clobber resource specific configuration #2488

Open
ringerc opened this issue Aug 29, 2024 · 1 comment
Labels
kind/bug Categorizes issue or PR as related to a bug. triage/accepted Indicates an issue or PR is ready to be actively worked on.

ringerc commented Aug 29, 2024

What happened:

When configuring kube-state-metrics to emit kube_{resourcekind}_labels info metrics (for example, kube_node_labels), I found that a * entry in the label allow map clobbers the configuration for every other resource type.

For example, consider the configuration below, expressed in the kube-state-metrics YAML config format rather than CLI format for readability. Adding the wildcard entry shown here:

    labels_allow_list:
      "*":
        - some_custom_label
      nodes:
        # AWS EKS node pool
        - eks.amazonaws.com/nodegroup
        # Azure AKS node pool
        - agentpool
        # google GKE node pool
        - cloud.google.com/gke-nodepool

will cause kube_node_labels to have only label_some_custom_label.

The addition of the wildcard removes the previously present label_eks_amazonaws_com_nodegroup, label_agentpool and label_cloud_google_com_gke_nodepool.

So the net result of this configuration is equivalent to:

    labels_allow_list:
      nodes:
        - some_custom_label

What you expected to happen:

I expected the resource-specific configuration to either append to or override the wildcard configuration, so the net effective configuration would be either:

    labels_allow_list:
      nodes:
        # labels from "*"
        - some_custom_label
        # labels from "nodes"
        - eks.amazonaws.com/nodegroup
        - agentpool
        - cloud.google.com/gke-nodepool
      otherresource:
        # labels from "*"
        - some_custom_label
      # ...

or (if the node-specific configuration overrides * rather than appending to it):

    labels_allow_list:
      nodes:
        # labels from "nodes"
        - eks.amazonaws.com/nodegroup
        - agentpool
        - cloud.google.com/gke-nodepool
      otherresource:
        # labels from "*"
        - some_custom_label
      # ...

How to reproduce it (as minimally and precisely as possible):

Run kube-state-metrics with CLI arguments that include the wildcard:

"--metric-labels-allowlist=nodes=[kubernetes.io/arch],*=[somenonexistentlabel]"

Then query its metrics endpoint over a port-forward (or use Prometheus), e.g.:

curl -sSLf1 http://127.0.0.1:8080/metrics |grep ^kube_node_labels

You will see that the kube_node_labels metrics do not contain label_kubernetes_io_arch.

Now re-launch kube-state-metrics, but this time with CLI arguments omitting the wildcard:

"--metric-labels-allowlist=nodes=[kubernetes.io/arch]"

If you query the metrics endpoint again, label_kubernetes_io_arch now appears on the kube_node_labels metrics.

Anything else we need to know?:

This looks like it was intentional, per https://github.com/kubernetes/kube-state-metrics/blame/c864c93606db61e1c424b9313da03522f9f11adb/internal/store/builder.go#L235-L239

It was added in 0b76e7d#diff-a1639ee623bffb002ce1b1d3d18893f1d3ca6460a15d030cd272281f3126a7be

It doesn't make sense, though: there is no purpose in supporting both wildcard and non-wildcard entries when the wildcard unconditionally clobbers the non-wildcard entries.

It should either:

  • use the resource-kind-specific config if found, and fall back to the wildcard if not (recommended); or
  • append the wildcard config to the resource-specific config and do a unique sort

The former option is preferred because it allows the config author to say "add these labels to all resource kinds, except for this specific kind where I want to leave some of them out".

Environment:

  • kube-state-metrics version: registry.k8s.io/kube-state-metrics/kube-state-metrics:v2.13.0
  • Kubernetes version (use kubectl version): server v1.29.2
  • Cloud provider or hardware configuration: Repro'd on kind, but seen on CSP-managed k8s too
  • Other info:

ringerc added the kind/bug label Aug 29, 2024
k8s-ci-robot added the needs-triage label Aug 29, 2024

dashpole commented Sep 5, 2024

/assign @rexagod
/triage accepted

k8s-ci-robot added the triage/accepted label and removed the needs-triage label Sep 5, 2024