GCE health check does not pick up changes to pod readinessProbe #582

ConradIrwin · 2017-04-11T04:22:06Z

Importing this issue from kubernetes/kubernetes#43773 as requested.

Kubernetes version (use kubectl version): 1.4.9

Environment:

Cloud provider or hardware configuration: GKE
OS (e.g. from /etc/os-release):
Kernel (e.g. uname -a):
Install tools:
Others:

What happened:

I created an ingress with type GCE

apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: admin-proxy
  labels:
    name: admin-proxy
  annotations:
    kubernetes.io/ingress.allow-http: "false"
spec:
  tls:
    - secretName: ingress-admin-proxy
  backend:
    serviceName: admin-proxy
    servicePort: 80

This set up the health-check on the google cloud backend endpoint to call "/" as documented.

Unfortunately my service doesn't return 200 on "/" and so I added a readinessCheck to the pod as suggested by the documentation.

What you expected to happen:

I expected the health check to be automatically updated.

Instead I had to delete the ingress and re-create it in order for the health-check to update.

How to reproduce it (as minimally and precisely as possible):

Create a deployment with no readiness probe.
Create an ingress pointing to the pods created by that deployment.
Add a readiness probe to the deployment.

Anything else we need to know:

The text was updated successfully, but these errors were encountered:

ConradIrwin · 2017-04-11T04:22:33Z

Comment by @nicksardo:

There's actually a TODO for deciding what to do in this case https://github.com/kubernetes/ingress/blob/6ed6475e87588494dd29ba5d33bac167ae8acfdd/controllers/gce/healthchecks/healthchecks.go#L71 The controller doesn't want to overwrite handmade changes. We could probably update the settings if they have the default value.

porridge · 2017-04-11T07:45:10Z

But updating from default value would lead to confusing behavior where only the first change is propagated, and it's not even possible to go back to default, by changing the ingress, right? I think it's reasonable to assume that if the ingress controller knows how to manage a setting, it will overwrite manual changes.

nicksardo · 2017-04-11T16:23:19Z

I agree that it's reasonable and would be an easy solution. However, I worry about unsuspecting users who upgrade their master and find their load balancer throwing 502s. Would creating an event on health check update be sufficient? Thoughts from other users?

ConradIrwin · 2017-04-11T16:31:03Z

Is there any way that the "ingress controller"? can track whether the value was set automatically?

nicksardo · 2017-04-11T16:56:01Z

The controller does not have persistent storage. One option would be to convert the health check settings to JSON and store the state in the HealthCheck's optional description field for tracking diffs, but this seems very ugly.

nicksardo · 2017-04-11T16:56:42Z

cc @ibawt

porridge · 2017-04-11T16:57:47Z

What do you mean by "creating an event on health check update"?

nicksardo · 2017-04-11T17:10:36Z

If the controller sees a health check with settings different from default/readinessProbe, it updates the check then could create a K8s event for each parent ingress. If readiness probe exists, event could say, "Health Check for port 3XXXX updated to readinessProbe settings". If probe doesn't exist, "Health check for port 3XXXX updated to default settings (see ingress docs for defining settings via readinessProbe)".
There's already an event being created at HC creation, but it doesn't seem very useful to me: https://github.com/kubernetes/ingress/blob/987540f8f61f86d07314cd776fe32e150724b48c/controllers/gce/controller/utils.go#L603

ConradIrwin · 2017-04-11T17:21:11Z

Just my 2c: As a kubernetes user, I have no idea how to use events — they seem too transient to be useful (i.e. it's impossible to see when they actually happened, and I've never seen an event that's helped me to run my cluster better). It might just be a UI problem — if there was a way of just tailing the event logs, then I might comprehend them better, so maybe it's ok to add one and assume that they'll be fixed upstream Sent via Superhuman <https://sprh.mn/[email protected]>

…

On Tue, Apr 11, 2017 at 10:10 AM, Nick ***@***.***>wrote: If the controller sees a health check with settings different from default/readinessProbe, it updates the check then could create a K8s event for each parent ingress. If readiness probe exists, event could say, "Health Check for port 3XXXX updated to readinessProbe settings". If probe doesn't exist, "Health check for port 3XXXX updated to default settings (see ingress docs for defining settings via readinessProbe)". There's already an event being created at HC creation, but it doesn't seem very useful to me: https://github.com/kubernetes/ingress/blob/ 987540f/controllers/gce/controller/utils. go#L603 — You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub <#582 (comment)>, or mute the thread <https://github.com/notifications/unsubscribe-auth/AAFwQP-D0IkTqISjn40vKGBNilNOffRjks5ru7QNgaJpZM4M5mnb> .

kilianc · 2017-05-14T21:36:20Z

Is there any workaround to make it pick up the changes, even manually?

kilianc · 2017-05-14T21:50:58Z

To answer my own question and for everybody landing here with the same problem, rotate the name of the service in your rule or backend and the controller will pick up the path correctly. I assume this could be done with 0 downtime creating a new rule and deleting the old one.

bviolier · 2017-06-08T07:26:37Z

We actually encountered an issue yesterday in regards to readinessProbes and health checks on GCE. A while ago we had some difficulties with setting up the health-check correctly through Kubernetes, adding that it doesn't pick up changes, we decided to manually update the health-check on the GCE side, everything worked.
Yesterday we updated our master to 1.6.4, and suddenly all manual set health-check updates were reset to the defaults, probably because Kubernetes (re)set them again.

But, reading this issue, it should never update health-checks. I cannot re-trigger an update now, so it doesn't seem that this functionality has been added. But for sure the master upgrade did trigger this, is that expected behavior?

bviolier · 2017-06-08T07:34:42Z

Another gotcha, when using the example provided, https://github.com/kubernetes/ingress/blob/6ed6475e87588494dd29ba5d33bac167ae8acfdd/controllers/gce/examples/health_checks/health_check_app.yaml, if you remove the ports.containerPort, the GCE ingress health check sets defaults, and can't process the readinessProbe. This might be expected behavior, but at least for us, it wasn't.

All our reset health-checks didn't have a ports.containerPort set.

nicksardo · 2017-06-08T15:34:50Z

@bviolier

Regarding your last comment, containerPort is a required attribute for defining a port of a container. How would you expect to serve network traffic without defining it?

Regarding your loss of manual health check settings, the latest version of the controller creates a newer-type of health check (compute.HealthCheck instead of the legacy compute.HTTPHealthCheck). This chance was made to make HTTP and HTTPS health checks more manageable by the controller and to make future features of the load balancer easier to consume.

What settings are you trying to override with the readiness probe? Just the path?

bviolier · 2017-06-08T15:52:26Z

@nicksardo Thanks for the information!

I was not aware that the containerPort was a required attribute. But, the pod/services also worked fine without it. The health-check was also working (although after manual editing the path), so traffic could get into the pods through the port specified in the readinessProbe, and without the containerPort.

Makes sense if it created a new type of health-check, thanks for the explanation. I do think more people that manually edited their health checks will run into this issue, but, manually editing health checks is also not the best thing to do ;-)

The most important changes we make with readinessProbe is indeed the path. But occasionally we also change the timeout, interval, and failureThreshold.

nicksardo · 2017-06-09T20:56:41Z

@bviolier
In case you aren't aware, if your service is using externalTraffic=Global (default), then the other settings aren't very useful. Traffic from the load balancer hits any of the nodes and kube-proxy will route the traffic to a random node. This makes health checking close to useless as you won't have consistent data for a node. readinessProbes/livenessProbes are the way to go for pod health.

bowei · 2017-10-11T17:50:43Z

This issue was moved to kubernetes/ingress-gce#39

nicksardo added backend/gce labels Apr 11, 2017

nicksardo self-assigned this Apr 11, 2017

freehan mentioned this issue May 15, 2017

Ingress Healthcheck Configuration #720

Closed

abstrctn mentioned this issue Jun 12, 2017

GCP HealthCheck customizations lost upgrading to GLBC 0.9.3 #842

Closed

bowei mentioned this issue Oct 11, 2017

GCE health check does not pick up changes to pod readinessProbe kubernetes/ingress-gce#39

Closed

bowei closed this as completed Oct 11, 2017

bowei mentioned this issue Oct 11, 2017

Ingress Healthcheck Configuration kubernetes/ingress-gce#42

Closed

sebglon mentioned this issue Jan 29, 2018

Helm Grafana - ingress fail with anonymous disabled prometheus-operator/prometheus-operator#934

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

GCE health check does not pick up changes to pod readinessProbe #582

GCE health check does not pick up changes to pod readinessProbe #582

ConradIrwin commented Apr 11, 2017

ConradIrwin commented Apr 11, 2017

porridge commented Apr 11, 2017 via email

nicksardo commented Apr 11, 2017

ConradIrwin commented Apr 11, 2017

nicksardo commented Apr 11, 2017

nicksardo commented Apr 11, 2017

porridge commented Apr 11, 2017 via email

nicksardo commented Apr 11, 2017

ConradIrwin commented Apr 11, 2017 via email

kilianc commented May 14, 2017

kilianc commented May 14, 2017

bviolier commented Jun 8, 2017 •

edited

Loading

bviolier commented Jun 8, 2017

nicksardo commented Jun 8, 2017

bviolier commented Jun 8, 2017

nicksardo commented Jun 9, 2017

bowei commented Oct 11, 2017

GCE health check does not pick up changes to pod readinessProbe #582

GCE health check does not pick up changes to pod readinessProbe #582

Comments

ConradIrwin commented Apr 11, 2017

ConradIrwin commented Apr 11, 2017

porridge commented Apr 11, 2017 via email

nicksardo commented Apr 11, 2017

ConradIrwin commented Apr 11, 2017

nicksardo commented Apr 11, 2017

nicksardo commented Apr 11, 2017

porridge commented Apr 11, 2017 via email

nicksardo commented Apr 11, 2017

ConradIrwin commented Apr 11, 2017 via email

kilianc commented May 14, 2017

kilianc commented May 14, 2017

bviolier commented Jun 8, 2017 • edited Loading

bviolier commented Jun 8, 2017

nicksardo commented Jun 8, 2017

bviolier commented Jun 8, 2017

nicksardo commented Jun 9, 2017

bowei commented Oct 11, 2017

bviolier commented Jun 8, 2017 •

edited

Loading