Bind controller-manager and scheduler to node IP address to expose /metrics endpoint #2388
Comments
binding to the node IP is intuitive, but here is my argument against this:
these options are unlikely to happen, given that a number of workarounds exist, including phases and patches.
the positional argument problem is not real: you can append additional arguments with a patch, and the probe host is a key, so there is no problem there.
EDIT: i've commented on the prometheus-operator ticket.
EDIT: added the startup probe.
Thanks. Confirmed the suggested kubeadm patches work. My slightly modified version:

# kubeadm-patches/kube-controller-manager+json.yaml
- op: add
  path: /spec/containers/0/command/-
  value: --bind-address=SOME_IP
- op: replace
  path: /spec/containers/0/livenessProbe/httpGet/host
  value: SOME_IP
- op: replace
  path: /spec/containers/0/startupProbe/httpGet/host
  value: SOME_IP

# kubeadm-patches/kube-scheduler+json.yaml
- op: add
  path: /spec/containers/0/command/-
  value: --bind-address=SOME_IP
- op: replace
  path: /spec/containers/0/livenessProbe/httpGet/host
  value: SOME_IP
- op: replace
  path: /spec/containers/0/startupProbe/httpGet/host
  value: SOME_IP

My small concern about patches was that they are applied to the manifests. If the manifest structure's stability is not guaranteed, the patches may become wrong or non-applicable. However, in this particular case a breakage is unlikely, so patches are good enough for me.
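For reference, a minimal sketch of how a patch directory like this is consumed, assuming the two files above live in ./kubeadm-patches and kubeadm 1.20, where the flag is still named --experimental-patches (later releases rename it to --patches):

# apply the JSON patches above while generating the static pod manifests
kubeadm init --experimental-patches ./kubeadm-patches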
i brought this topic up for discussion in the SIG Cluster Lifecycle meeting today and we agreed that this change is not something we see as necessary, since the user base consuming metrics for these components is not big and it would be a topology change. some alternatives were discussed; see:
https://docs.google.com/document/d/1Gmc7LyCIL_148a9Tft7pdhdee0NBHdOfHS1SAF0duI4/edit#
How can I figure out which instance of kube-scheduler or kube-controller-manager is the leader without the /metrics endpoint?
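A minimal sketch of one way to check this without the /metrics endpoint, assuming the components use the default Lease-based leader election: the holder identity can be read off the leader-election Leases in kube-system.

# the holderIdentity field names the instance currently holding the lock
kubectl -n kube-system get lease kube-controller-manager -o jsonpath='{.spec.holderIdentity}{"\n"}'
kubectl -n kube-system get lease kube-scheduler -o jsonpath='{.spec.holderIdentity}{"\n"}'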
I'm not saying I disagree with your conclusions, but this seems like an opinionated take with bad results for UX. I was just able to configure the kube-prometheus-stack operator Helm chart with alertmanager on my cluster successfully for the first time, and I spent about two hours tracking down why these four important Kubernetes services read as down.

It would be great if there were a way to centrally configure kubeadm's child components and opt into this. Even if we are not in the majority, I think there are enough people out there using prometheus and alertmanager that they would care about monitoring these kubeadm components. (I am not a kube-prometheus maintainer, but from my perspective as a user figuring things out for the first time, it sure seems they cared enough about it to put them in the default configuration!)

Admittedly I have not read all the kubeadm docs, and it might be that a page I'm unaware of addresses this use case specifically. I think that Prometheus integration should get special treatment: it should be straightforward to configure all of these components at once for metrics, or at least there should be one straightforward way to configure all four that is the same for all four.

The least I can say is that, based on the steps I describe taking here (which were the least I could do to turn my running Kubernetes cluster into one with a functioning alertmanager without silences, based on the default configuration from kube-prometheus-stack), it's not straightforward to monitor kubeadm-derived Kubernetes installations right now.

Maybe this was the document that I needed to find, and maybe there are fixes that could be added to it which would address all of my concerns (I've only just found this doc for the first time upon hypothesizing that it might exist). It looks like there are still some gaps; I think there's a bit more needed to make this run smoothly. Comparing this with my comment on the Helm chart, I can see there are a few things I needed that aren't mentioned in this doc. Maybe we can take care of it all here. The parts that are missing are for ...

The more I go back through these docs and find things that are just a bit too outdated (like references to CoreOS that seem to date the documents and explain why there might be a difference between my experience and the documented state of the art here), the more I think my issue needs to go to the kube-prometheus community.

Anyway, I think the pretty low traffic on this issue indicates this probably wasn't a very popular option, but I hope that if we make the UX good, it will be more popular in the future. It seems like besides scheduler and controller-manager, there might also be etcd and kube-proxy that are ripe for configuration, and I'm afraid they might not all be configurable straightforwardly through options in the kubeadm cluster configuration (or else someone might have documented this already, since it would be easy).
For folks wanting a quick fix:

ssh $MASTER_NODE_1
vi /etc/kubernetes/manifests/kube-controller-manager.yaml

- - --bind-address=127.0.0.1
+ - --bind-address=0.0.0.0

This may well be blown away by a subsequent kubeadm upgrade.
FEATURE REQUEST
Versions

kubeadm version (kubeadm version): 1.20.2

Environment:
- Kubernetes version (kubectl version): v1.20.2
- Kernel (uname -a): 5.10.4-gentoo

What happened?
TL;DR: The controller manager's and scheduler's metrics endpoints are inaccessible with the default kubeadm setup. The proposal is to use the node IP address instead of 127.0.0.1 for kube-controller-manager's (KCM) and kube-scheduler's (KS) --bind-address argument and probe configs.

Both KCM and KS bind to 127.0.0.1, on ports 10257 and 10259 respectively, to expose the /healthz and /metrics endpoints. /healthz is exposed mostly for local consumption (probes), whereas /metrics is expected to be scraped by a central metrics scraper (Prometheus or similar), which is not expected to run as a daemon set on control plane nodes.

The current workaround is to use --bind-address=0.0.0.0, which may overexpose metrics to unwanted interfaces and effectively to the internet. This can be mitigated by applying firewall rules (e.g. iptables) that drop all traffic coming to these ports unless it is addressed to the node IP.
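A minimal sketch of that iptables mitigation, assuming the default secure ports (10257 for KCM, 10259 for KS) and a hypothetical node IP; rule ordering relative to existing chains is not handled here:

NODE_IP=10.0.0.11   # hypothetical node IP, substitute your own
for port in 10257 10259; do
  # keep loopback traffic so the default 127.0.0.1 probes still work
  iptables -A INPUT -p tcp --dport "$port" -i lo -j ACCEPT
  # drop anything on these ports that is not addressed to the node IP
  iptables -A INPUT -p tcp --dport "$port" ! -d "$NODE_IP" -j DROP
done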
Another solution would be patching the kubeadm.yml kubeadm config file with sections like the ones below, right before applying kubeadm init phase control-plane controller-manager --config kubeadm.yml (and the same for KS):
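A sketch of what such sections might look like, assuming kubeadm's v1beta2 ClusterConfiguration and using SOME_IP as a placeholder for the node IP (kubeadm should also derive the probe host from this argument, but that is worth verifying for your version):

# kubeadm.yml fragment (hypothetical reconstruction)
apiVersion: kubeadm.k8s.io/v1beta2
kind: ClusterConfiguration
controllerManager:
  extraArgs:
    bind-address: SOME_IP
scheduler:
  extraArgs:
    bind-address: SOME_IP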
One more workaround is to use kubeadm's --experimental-patches argument. The downside is that it patches the static manifests instead of the kubeadm config, it depends on the --bind-address argument's position in the argument list, and it requires patching multiple places (the command line and two probes for each component).
/metrics
endpoint be set up in a useful way by default and without such an effort.The proposed solutions are:
api-server
andetcd