Prometheus: Unable to monitor kube-scheduler, kube-proxy and kube-controller-manager components #22

Closed
ricsanfre opened this issue Dec 3, 2021 · 2 comments · Fixed by #23

Issue Description

The default kube-prometheus-stack helm installation is unable to discover the targets of some of the Kubernetes components:

  • kube-scheduler
  • kube-proxy
  • kube-controller-manager
  • kube-etcd


The solution involves:

  1. Changing the K3S configuration so that the kube-scheduler, kube-proxy and kube-controller-manager processes bind to the IP address of the master node instead of the default 127.0.0.1.
  2. Modifying the kube-prometheus-stack chart configuration to set the IP address of the master node as the endpoint in the ServiceMonitor configuration of each component.

See how to solve this issue here:
k3s-io/k3s#3619

@ricsanfre ricsanfre added the bug label Dec 3, 2021
@ricsanfre ricsanfre added this to the release 1.1 milestone Dec 3, 2021
ricsanfre commented Dec 5, 2021

Worker node outage after applying the changes specified in k3s-io/k3s#3619

Changes applied:

  • Additional k3s server arguments so that the components' metrics endpoints listen on 0.0.0.0 (and are therefore reachable on the node IP address) instead of only on 127.0.0.1 (see the config.yaml sketch after this list):

     --kube-controller-manager-arg 'bind-address=0.0.0.0'
     --kube-controller-manager-arg 'address=0.0.0.0'
     --kube-proxy-arg 'metrics-bind-address=0.0.0.0'
     --kube-scheduler-arg 'bind-address=0.0.0.0'
     --kube-scheduler-arg 'address=0.0.0.0'
    
  • Adding endpoints to the Prometheus ServiceMonitor configuration of each kube component. Only the etcd metrics scrape is disabled (K3S uses an embedded SQLite database instead of etcd):

    kubeApiServer:
      enabled: true
    kubeControllerManager:
      enabled: true
      endpoints:
      - 10.0.0.11
    kubeScheduler:
      enabled: true
      endpoints:
      - 10.0.0.11
    kubeProxy:
      enabled: true
      endpoints:
      - 10.0.0.11
    kubeEtcd:
      enabled: false

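For reference, the same server arguments can also be expressed in a K3S configuration file. This is a minimal sketch, assuming the server is configured through /etc/rancher/k3s/config.yaml (the file location and the use of file-based configuration are assumptions about this particular setup, not part of the original changes):

    # /etc/rancher/k3s/config.yaml (sketch): file-based equivalent of the
    # --kube-*-arg CLI flags listed above
    kube-controller-manager-arg:
    - "bind-address=0.0.0.0"
    - "address=0.0.0.0"
    kube-proxy-arg:
    - "metrics-bind-address=0.0.0.0"
    kube-scheduler-arg:
    - "bind-address=0.0.0.0"
    - "address=0.0.0.0"
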
When the new targets are discovered, Prometheus memory consumption increases dramatically, eventually producing a node outage due to lack of memory.

There is a known issue in K3S, k3s-io/k3s#2262: since all the components run embedded in a single K3S process, the three components (kube-proxy, kube-scheduler and kube-controller-manager) emit duplicated metrics on each of their metrics endpoints.

This behaviour can be checked by scraping each component's metrics endpoint with curl:

10249 - kube-proxy
10251 - kube-scheduler
10252 - kube-controller-manager

curl localhost:10249/metrics

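A quick way to confirm the duplication is to compare the metric names exposed by two different endpoints. This is a rough sketch (run on the master node, assuming the ports listed above are listening); a large overlap indicates duplicated metrics:

    # Extract the metric names (ignoring comment lines and labels) from two endpoints
    curl -s localhost:10249/metrics | grep -v '^#' | cut -d'{' -f1 | cut -d' ' -f1 | sort -u > /tmp/kube-proxy-metrics.txt
    curl -s localhost:10251/metrics | grep -v '^#' | cut -d'{' -f1 | cut -d' ' -f1 | sort -u > /tmp/kube-scheduler-metrics.txt
    # Count the metric names that appear in both endpoints
    comm -12 /tmp/kube-proxy-metrics.txt /tmp/kube-scheduler-metrics.txt | wc -l
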
Duplicated metrics might cause the higher memory consumption in Prometheus. Following the solution proposed in k3s-io/k3s#2262 for Rancher Monitoring, the way to avoid scraping duplicated metrics is to activate the service monitoring of only one of the components (e.g. kube-proxy).

ricsanfre commented Dec 5, 2021

Changing the helm chart configuration to enable only one of the services avoids the Prometheus memory increase, but the resulting configuration is not complete (only the Grafana dashboards of the enabled component are created).

For example, applying:

      kubeControllerManager:
        enabled: false
      kubeScheduler:
        enabled: false
      kubeProxy:
        enabled: true

The kube-prometheus-stack helm chart creates all the required resources (headless service, ServiceMonitor and Grafana dashboards) for monitoring kube-proxy, but not the resources for monitoring the rest of the components.

Solution: keep the monitoring of these kube components out of kube-prometheus-stack control by disabling their monitoring in the chart configuration, and create the required resources (headless service, ServiceMonitor and Grafana dashboards) manually: one single headless service (K3S metrics endpoint), one single ServiceMonitor, and three ConfigMaps containing the Grafana dashboards (one per component). A sketch of the headless service and ServiceMonitor is shown below.
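This is a minimal sketch of what those manually-created resources could look like. The resource names, namespace, labels and the use of port 10249 are illustrative assumptions, not the final manifests of the fix (see #23 for the actual implementation):

    # Headless service exposing the K3S metrics endpoint of the master node
    apiVersion: v1
    kind: Service
    metadata:
      name: k3s-metrics-service          # hypothetical name
      namespace: kube-system
      labels:
        app: k3s-metrics
    spec:
      clusterIP: None                    # headless service
      ports:
      - name: http-metrics
        port: 10249
        protocol: TCP
        targetPort: 10249
    ---
    # Endpoints object pointing the headless service at the master node IP
    # (needed because the K3S components do not run as pods)
    apiVersion: v1
    kind: Endpoints
    metadata:
      name: k3s-metrics-service          # must match the Service name
      namespace: kube-system
    subsets:
    - addresses:
      - ip: 10.0.0.11                    # master node IP used in the chart values above
      ports:
      - name: http-metrics
        port: 10249
        protocol: TCP
    ---
    # Single ServiceMonitor scraping that endpoint
    apiVersion: monitoring.coreos.com/v1
    kind: ServiceMonitor
    metadata:
      name: k3s-metrics                  # hypothetical name
      namespace: kube-system
      labels:
        release: kube-prometheus-stack   # assumption: label matching the Prometheus serviceMonitorSelector
    spec:
      endpoints:
      - port: http-metrics
        path: /metrics
      selector:
        matchLabels:
          app: k3s-metrics
      namespaceSelector:
        matchNames:
        - kube-system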
