GCE load balancer health check does not match k8s pod health #1656

Closed
scarby opened this issue Jan 21, 2022 · 16 comments
Labels
kind/support Categorizes issue or PR as a support question. lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed.

Comments

@scarby

scarby commented Jan 21, 2022

Issue

It would appear that there is zero connection between kubernetes' concept of when a pod is healthy and the GCE load balancer's concept of the same.

As such, when a deployment is updating:

  • Kubernetes spins up new pods,
  • the new pods pass their health checks and Kubernetes considers them ready,
  • at this point the GCE load balancer's health check is not guaranteed to have passed,
  • Kubernetes may then terminate the old pods before the new pods are considered healthy by the GCE load balancer (and the old pods are instantly dropped from the NEG).

The only 'solution' we have found to this is to add a significant initial delay to the Kubernetes health checks. Not only is this hacky, it doesn't guarantee that there are actually pods able to serve traffic when the old pods are removed (we're just hoping).
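
A minimal sketch of that workaround, assuming a hypothetical /healthz endpoint on port 8080; the 90-second delay is an illustrative value, not a recommendation:

```yaml
# Deployment container fragment: delay the Kubernetes readiness probe so the
# GCE load balancer health check has (hopefully) had time to pass first.
readinessProbe:
  httpGet:
    path: /healthz              # hypothetical readiness endpoint
    port: 8080
  initialDelaySeconds: 90       # the "significant initial delay" described above
  periodSeconds: 10
  failureThreshold: 3
```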

Expected behaviour

I would expect k8s not to terminate an old pod until the load balancer has a new pod ready to replace it.

Is there any way to tie these two together so we avoid a situation where there are no pods available?

@freehan
Contributor

freehan commented Jan 25, 2022

When NEG is enabled, LB health checks are fed back into pod readiness: https://cloud.google.com/kubernetes-engine/docs/concepts/container-native-load-balancing#pod_readiness

To configure a custom LB health check, use a BackendConfig.
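
For reference, a minimal sketch of such a BackendConfig; the name is a placeholder, the /healthz path and port 8080 are assumed to be served by the application, and the timing values are illustrative rather than recommendations:

```yaml
apiVersion: cloud.google.com/v1
kind: BackendConfig
metadata:
  name: my-backendconfig        # placeholder name
spec:
  healthCheck:
    type: HTTP
    requestPath: /healthz       # assumed application health endpoint
    port: 8080                  # container port the check should target
    checkIntervalSec: 15
    timeoutSec: 5
    healthyThreshold: 1
    unhealthyThreshold: 2
```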

@scarby
Author

scarby commented Jan 26, 2022

OK, I was mistaken about there being no connection; however, I'm not sure this is fit for purpose.

GKE sets the value of cloud.google.com/load-balancer-neg-ready for a Pod to True if any of the following conditions are met:

One or more of the Pod's IP addresses are endpoints in a GCE_VM_IP_PORT NEG managed by the GKE control plane. The NEG is attached to a backend service. The load balancer health check for the backend service times out.

Which is likely what is happening in my case. If my health check times out I clearly don't want my pod to be considered ready?

So going back to my original point there appears to be no way to ensure there is actually a pod ready to serve traffic.
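
For illustration, this is roughly what the injected readiness gate looks like on an affected Pod (abbreviated `kubectl get pod <name> -o yaml` output; shown only to make the mechanism concrete, not something you apply yourself):

```yaml
# Abbreviated Pod as seen on GKE with container-native load balancing.
# GKE injects the readiness gate; the condition flips to True once the NEG
# endpoint is healthy in the load balancer, or once the health check times out.
spec:
  readinessGates:
  - conditionType: cloud.google.com/load-balancer-neg-ready
status:
  conditions:
  - type: cloud.google.com/load-balancer-neg-ready
    status: "True"
```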

@kundan2707
Contributor

/kind support

@k8s-ci-robot k8s-ci-robot added the kind/support Categorizes issue or PR as a support question. label Jan 27, 2022
@dry4ng

dry4ng commented Feb 23, 2022

It appears that the GCP load balancer creates the health check once, when the ingress is created, and then never updates it, at least from what I have observed. From there on, there is no connection between the pod state and the GCP load balancer.
I have different health checks for startup and liveness. I don't want the GCP load balancer to hit the startup health check, as it's quite heavy.

@jmcarp

jmcarp commented Apr 25, 2022

Does this controller intentionally not update backend health checks? Changing readiness probes doesn't seem to change health checks on the backend.

@swetharepakula
Member

The ingress controller waits, on pod startup, for the load balancer to consider the pod healthy before setting the readiness gate on the pod. If, after 15 minutes, the load balancer still does not consider the pod healthy, the readiness gate is set to ready anyway. The idea is to only let Kubernetes consider the pod ready once the load balancer considers the pod ready. If you require a different health check for the load balancer, it can be specified using the BackendConfig CRD.

Are your pods taking longer than 15 minutes to pass the load balancer health check?

@goobysnack

The pods in our deployment can take up to 90s to fully initialize and pass the readiness probe (yay Java!). The load balancer health check is just hitting the Tomcat listener. THIS ALONE passes before pod readiness passes, and marks the NEGs as ready. It seems that the load balancer backend shouldn't forward traffic to a pod unless both the backend health check and the pod readiness are in a good state.

@goobysnack

I opened a Google case; their response was "by design", and they offered to open a feature request. That seems more like a bug than a feature.

My response:

This seems like a bug to fix, not a feature request. Why would you bypass k8s readiness probes just because the ingress check passes? That makes zero sense and undermines the purpose of readiness probes.

@thomas-riccardi

thomas-riccardi commented Jun 6, 2022

@goobysnack same story here, the GCP support ended up opening this feature request: https://issuetracker.google.com/230729446 for us.

After reading the code, issues, and design docs for the readiness gates and this ingress-gce controller, I believe this is a non-trivial issue to fix, because the whole design of the readiness gates relies on transmitting the GCLB programming success to other components via the Pod Ready condition.
We are at a deadlock:

  • for proper rolling update, Deployment & co use the Pod Readiness condition to know when the new pods actually receive traffic from GCLB: gce-ingress-controller marks the Pod Ready (via the readiness gates) after it has successfully added it to the GCLB; that's the whole goal of the Readiness Gate feature.
  • We would like the gce-ingress-controller to ignore Pods that are not Ready (yet?)

Maybe a way forward would be for the gce-ingress-controller to use the Pod's readiness minus its own gclb-readiness-gate; but that information is not exposed in Endpoints/EndpointSlices (we only have ready).

In the meantime, a possible solution would be a sidecar container which computes that value by self-inspection (probably asking the k8s API for its own pod status to get the individual container conditions; that doesn't seem ideal though), and exposes it as an HTTP endpoint to be configured as the GCLB health check for that Pod/Service.
I am not aware of any existing implementation of this idea, though.

For now, we have forced the old Instance Group mode everywhere (vs NEG), where traffic actually goes through Kubernetes Services, which respect the Pod Ready condition, and accepted all the limitations of this old way.
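
For completeness, opting a Service out of NEGs is a per-Service annotation; a sketch under the assumption that your GKE version still honours the opt-out (worth verifying against the GKE docs for your version):

```yaml
apiVersion: v1
kind: Service
metadata:
  name: my-service              # placeholder
  annotations:
    # Assumption: explicitly disabling ingress NEGs keeps this Service on the
    # legacy instance-group data path, which respects pod readiness via kube-proxy.
    cloud.google.com/neg: '{"ingress": false}'
spec:
  selector:
    app: my-app                 # placeholder
  ports:
  - port: 80
    targetPort: 8080
```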

@swetharepakula
Member

Thanks @thomas-riccardi for the great explanation!

Currently the load balancer health checks are the only signal we can provide to the load balancer that the pod is ready to receive traffic. We do not have a solution at this time for making the load balancer Kubernetes aware.

For those affected by this now, my recommendation is to make sure that the health check on the application only passes once the application is ready to accept traffic.

@thomas-riccardi

Thanks @swetharepakula

Are there plans to improve the situation in GKE? Discussions in upstream Kubernetes, like there were for the introduction of the readiness gates?
Because otherwise it seems we will be stuck with the old Ingress+IG, also losing the much-awaited Gateway API with all the new features (plus #33, #109, ...).

@k8s-triage-robot

The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue or PR as fresh with /remove-lifecycle stale
  • Mark this issue or PR as rotten with /lifecycle rotten
  • Close this issue or PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Sep 13, 2022
@goobysnack

I learned that the BackendConfig isn't attached via an annotation on the Ingress; it's attached via an annotation on the workload's Service. Once you do that, it all works like magic. That was the fine print in the documentation I missed.
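
A sketch of that attachment, reusing the hypothetical BackendConfig name from the earlier example and placeholder Service/port values:

```yaml
apiVersion: v1
kind: Service
metadata:
  name: my-service                  # placeholder
  annotations:
    # The BackendConfig is referenced from the Service, not from the Ingress.
    cloud.google.com/backend-config: '{"default": "my-backendconfig"}'
    # Container-native load balancing (NEGs) for this Service's Ingress backends.
    cloud.google.com/neg: '{"ingress": true}'
spec:
  selector:
    app: my-app                     # placeholder
  ports:
  - port: 80
    targetPort: 8080                # should match the port the LB health check targets
```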

@k8s-triage-robot

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue or PR as fresh with /remove-lifecycle rotten
  • Close this issue or PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle rotten

@k8s-ci-robot k8s-ci-robot added lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels Oct 13, 2022
@k8s-triage-robot

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Reopen this issue with /reopen
  • Mark this issue as fresh with /remove-lifecycle rotten
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/close not-planned

@k8s-ci-robot
Contributor

@k8s-triage-robot: Closing this issue, marking it as "Not Planned".

In response to this:

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Reopen this issue with /reopen
  • Mark this issue as fresh with /remove-lifecycle rotten
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/close not-planned

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@k8s-ci-robot k8s-ci-robot closed this as not planned Won't fix, can't repro, duplicate, stale Nov 12, 2022