Prometheus: Unable to monitor kube-scheduler, kube-proxy and kube-controller-manager components #22

Closed
ricsanfre opened this issue Dec 3, 2021 · 2 comments · Fixed by #23

Issue Description

The default kube-prometheus-stack helm installation is unable to discover the targets of some of the Kubernetes components:

  • kube-scheduler
  • kube-proxy
  • kube-controller-manager
  • kube-etcd


The solution involves:

  1. Changing the K3S configuration so that the kube-scheduler, kube-proxy and kube-controller-manager processes bind to the IP address of the master node instead of the default 127.0.0.1.
  2. Modifying the kube-prometheus-stack chart configuration to set the IP address of the master node as the endpoint in the ServiceMonitor configuration of each component.

See how to solve this issue here:
k3s-io/k3s#3619

@ricsanfre ricsanfre added the bug label Dec 3, 2021
@ricsanfre ricsanfre added this to the release 1.1 milestone Dec 3, 2021
ricsanfre commented Dec 5, 2021

Worker node outage after applying the changes specified in k3s-io/k3s#3619

Changes applied:

  • Additional k3s server arguments so that the components' metrics endpoints listen on 0.0.0.0 (and are therefore reachable on the node IP address) instead of only on 127.0.0.1 (see the config.yaml sketch after this list):

     --kube-controller-manager-arg 'bind-address=0.0.0.0'
     --kube-controller-manager-arg 'address=0.0.0.0'
     --kube-proxy-arg 'metrics-bind-address=0.0.0.0'
     --kube-scheduler-arg 'bind-address=0.0.0.0'
     --kube-scheduler-arg 'address=0.0.0.0'
    
  • Adding endpoints to the Prometheus ServiceMonitor configuration of each kube component. Only the etcd metrics scrape is disabled (K3S uses an embedded SQLite database instead of etcd):

    kubeApiServer:
      enabled: true
    kubeControllerManager:
      enabled: true
      endpoints:
      - 10.0.0.11
    kubeScheduler:
      enabled: true
      endpoints:
      - 10.0.0.11
    kubeProxy:
      enabled: true
      endpoints:
      - 10.0.0.11
    kubeEtcd:
      enabled: false

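For reference, the same server arguments can also be expressed in a K3S configuration file. This is a minimal sketch, assuming the server is configured through /etc/rancher/k3s/config.yaml (the file location and the use of file-based configuration are assumptions about this particular setup, not part of the original changes):

    # /etc/rancher/k3s/config.yaml (sketch): file-based equivalent of the
    # --kube-*-arg CLI flags listed above
    kube-controller-manager-arg:
    - "bind-address=0.0.0.0"
    - "address=0.0.0.0"
    kube-proxy-arg:
    - "metrics-bind-address=0.0.0.0"
    kube-scheduler-arg:
    - "bind-address=0.0.0.0"
    - "address=0.0.0.0"
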
When the new targets are discovered, Prometheus memory consumption increases dramatically, eventually producing a node outage due to lack of memory.

There is a known issue in K3S, k3s-io/k3s#2262: since all the components run embedded in a single K3S process, the three components (kube-proxy, kube-scheduler and kube-controller-manager) emit duplicated metrics on each of their metrics endpoints.

This behaviour can be checked by scraping each component's metrics endpoint with curl:

10249 - kube-proxy
10251 - kube-scheduler
10252 - kube-controller-manager

curl localhost:10249/metrics

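A quick way to confirm the duplication is to compare the metric names exposed by two different endpoints. This is a rough sketch (run on the master node, assuming the ports listed above are listening); a large overlap indicates duplicated metrics:

    # Extract the metric names (ignoring comment lines and labels) from two endpoints
    curl -s localhost:10249/metrics | grep -v '^#' | cut -d'{' -f1 | cut -d' ' -f1 | sort -u > /tmp/kube-proxy-metrics.txt
    curl -s localhost:10251/metrics | grep -v '^#' | cut -d'{' -f1 | cut -d' ' -f1 | sort -u > /tmp/kube-scheduler-metrics.txt
    # Count the metric names that appear in both endpoints
    comm -12 /tmp/kube-proxy-metrics.txt /tmp/kube-scheduler-metrics.txt | wc -l
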
Duplicated metrics might cause the higher memory consumption in Prometheus. Following the solution proposed in k3s-io/k3s#2262 for Rancher Monitoring, the way to avoid scraping duplicated metrics is to activate the service monitoring of only one of the components (e.g. kube-proxy).

ricsanfre commented Dec 5, 2021

Changing the helm chart configuration to enable only one of the services avoids the Prometheus memory increase, but the resulting configuration is not complete (only the Grafana dashboards of the enabled component are created).

For example, applying:

      kubeControllerManager:
        enabled: false
      kubeScheduler:
        enabled: false
      kubeProxy:
        enabled: true

The kube-prometheus-stack helm chart creates all the required resources (headless service, ServiceMonitor and Grafana dashboards) for monitoring kube-proxy, but not the resources for monitoring the rest of the components.

Solution: keep the monitoring of these kube components out of kube-prometheus-stack control by disabling their monitoring in the chart configuration, and create the required resources (headless service, ServiceMonitor and Grafana dashboards) manually: one single headless service (K3S metrics endpoint), one single ServiceMonitor, and three ConfigMaps containing the Grafana dashboards (one per component). A sketch of the headless service and ServiceMonitor is shown below.
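This is a minimal sketch of what those manually-created resources could look like. The resource names, namespace, labels and the use of port 10249 are illustrative assumptions, not the final manifests of the fix (see #23 for the actual implementation):

    # Headless service exposing the K3S metrics endpoint of the master node
    apiVersion: v1
    kind: Service
    metadata:
      name: k3s-metrics-service          # hypothetical name
      namespace: kube-system
      labels:
        app: k3s-metrics
    spec:
      clusterIP: None                    # headless service
      ports:
      - name: http-metrics
        port: 10249
        protocol: TCP
        targetPort: 10249
    ---
    # Endpoints object pointing the headless service at the master node IP
    # (needed because the K3S components do not run as pods)
    apiVersion: v1
    kind: Endpoints
    metadata:
      name: k3s-metrics-service          # must match the Service name
      namespace: kube-system
    subsets:
    - addresses:
      - ip: 10.0.0.11                    # master node IP used in the chart values above
      ports:
      - name: http-metrics
        port: 10249
        protocol: TCP
    ---
    # Single ServiceMonitor scraping that endpoint
    apiVersion: monitoring.coreos.com/v1
    kind: ServiceMonitor
    metadata:
      name: k3s-metrics                  # hypothetical name
      namespace: kube-system
      labels:
        release: kube-prometheus-stack   # assumption: label matching the Prometheus serviceMonitorSelector
    spec:
      endpoints:
      - port: http-metrics
        path: /metrics
      selector:
        matchLabels:
          app: k3s-metrics
      namespaceSelector:
        matchNames:
        - kube-system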
