
AKS load balancer health check fails with ingress-nginx v1.9.5 (but works with versions up to and including v1.8.4) #10869

Closed
617m4rc opened this issue Jan 18, 2024 · 12 comments
Labels: kind/bug, needs-priority, needs-triage

@617m4rc

617m4rc commented Jan 18, 2024

What happened:

The Azure load balancer stops working after upgrading the ingress-nginx Helm chart from v1.8.4 to v1.9.5. Azure LB health checks fail permanently. There is no error message in the ingress-nginx or Azure LB logs.

What you expected to happen:

Azure LB continues to work as before.

NGINX Ingress controller version (exec into the pod and run nginx-ingress-controller --version.):


NGINX Ingress controller
Release: v1.9.5
Build: f503c4b
Repository: https://github.com/kubernetes/ingress-nginx
nginx version: nginx/1.21.6


Kubernetes version (use kubectl version):

Client Version: v1.27.3
Kustomize Version: v5.0.1
Server Version: v1.28.3

Environment:

  • Cloud provider or hardware configuration:

Azure AKS

  • OS (e.g. from /etc/os-release):
PRETTY_NAME="Ubuntu 22.04.3 LTS"
NAME="Ubuntu"
VERSION_ID="22.04"
VERSION="22.04.3 LTS (Jammy Jellyfish)"
VERSION_CODENAME=jammy
ID=ubuntu
ID_LIKE=debian
HOME_URL="https://www.ubuntu.com/"
SUPPORT_URL="https://help.ubuntu.com/"
BUG_REPORT_URL="https://bugs.launchpad.net/ubuntu/"
PRIVACY_POLICY_URL="https://www.ubuntu.com/legal/terms-and-policies/privacy-policy"
UBUNTU_CODENAME=jammy
  • Kernel (e.g. uname -a):

Linux version 5.15.0-1053-azure (buildd@bos03-amd64-012) (gcc (Ubuntu 11.4.0-1ubuntu1~22.04) 11.4.0, GNU ld (GNU Binutils for Ubuntu) 2.38) #61-Ubuntu SMP Tue Nov 21 14:16:01 UTC 2023

  • Install tools:

Terraform

  • Basic cluster related info:
kubectl get nodes -o wide
NAME                              STATUS   ROLES   AGE    VERSION   INTERNAL-IP   EXTERNAL-IP   OS-IMAGE             KERNEL-VERSION      CONTAINER-RUNTIME
aks-ngspark-40415517-vmss000000   Ready    agent   172m   v1.28.3   10.10.0.5     <none>        Ubuntu 22.04.3 LTS   5.15.0-1053-azure   containerd://1.7.5-1
aks-ngzone1-40415517-vmss000000   Ready    agent   172m   v1.28.3   10.10.0.103   <none>        Ubuntu 22.04.3 LTS   5.15.0-1053-azure   containerd://1.7.5-1
aks-ngzone1-40415517-vmss000001   Ready    agent   170m   v1.28.3   10.10.0.201   <none>        Ubuntu 22.04.3 LTS   5.15.0-1053-azure   containerd://1.7.5-1
aks-ngzone1-40415517-vmss000002   Ready    agent   169m   v1.28.3   10.10.0.251   <none>        Ubuntu 22.04.3 LTS   5.15.0-1053-azure   containerd://1.7.5-1
aks-ngzone2-40415517-vmss000000   Ready    agent   172m   v1.28.3   10.10.0.54    <none>        Ubuntu 22.04.3 LTS   5.15.0-1053-azure   containerd://1.7.5-1
aks-ngzone3-40415517-vmss000000   Ready    agent   172m   v1.28.3   10.10.0.152   <none>        Ubuntu 22.04.3 LTS   5.15.0-1053-azure   containerd://1.7.5-1

  • How was the ingress-nginx-controller installed:

Helm

helm ls -A
NAME                            NAMESPACE       REVISION        UPDATED                                 STATUS         CHART                                            APP VERSION
ingress-nginx-gh                management      3               2024-01-18 11:07:18.153458257 +0000 UTC deployed       ingress-nginx-4.9.0                              1.9.5
helm -n management get values ingress-nginx-gh
USER-SUPPLIED VALUES:
controller:
  admissionWebhooks:
    networkPolicyEnabled: true
    patch:
      image:
        digest: ""
        pullPolicy: IfNotPresent
  allowSnippetAnnotations: false
  annotations:
    fluentbit.io/parser: k8s-nginx-ingress
  config:
    enable-modsecurity: "true"
    enable-ocsp: true
    enable-owasp-modsecurity-crs: "true"
    enable-real-ip: true
    generate-request-id: true
    hsts-max-age: "31536000"
    modsecurity-snippet: |
      SecAction "id:900200,phase:1,nolog,pass,t:none,setvar:tx.allowed_methods=GET HEAD POST OPTIONS PUT PATCH DELETE"
      SecAuditEngine RelevantOnly
      SecAuditLog /dev/stdout
      SecAuditLogFormat JSON
      SecRequestBodyAccess On
      SecRequestBodyLimitAction ProcessPartial
      SecRuleEngine DetectionOnly
      SecStatusEngine Off
    proxy-connect-timeout: "60"
    proxy-read-timeout: "1800"
    proxy-send-timeout: "1800"
    real-ip-recursive: "on"
    server-snippet: |
      location /.well-known/security.txt {
        return 200 'Contact: mailto:[email protected]\nPreferred-Languages: en,de\nExpires: 2026-01-01T00:00:00.000Z';
      }
    ssl-session-cache: "true"
    ssl-session-cache-size: 10m
    upstream-keepalive-requests: "1000"
    upstream-keepalive-timeout: "55"
    use-http2: true
    worker-shutdown-timeout: 60s
  extraArgs:
    default-ssl-certificate: management/tls-secret
  image:
    digest: ""
    pullPolicy: IfNotPresent
  ingressClass: XXX-nginx
  ingressClassResource:
    controllerValue: k8s.io/XXX-ingress-nginx
    default: false
    enabled: false
    name: XXX-nginx
  metrics:
    enabled: true
    serviceMonitor:
      additionalLabels:
        release: kube-prometheus-stack
      enabled: true
  minAvailable: 1
  podAntiAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
    - labelSelector:
        matchExpressions:
        - key: app.kubernetes.io/name
          operator: In
          values:
          - ingress-nginx
        - key: app.kubernetes.io/instance
          operator: In
          values:
          - XXX-ingress-nginx
        - key: app.kubernetes.io/component
          operator: In
          values:
          - controller
      topologyKey: kubernetes.io/hostname
  replicaCount: 3
  resources:
    requests:
      memory: 300Mi
  service:
    annotations:
      service.beta.kubernetes.io/aws-load-balancer-access-log-emit-interval: "60"
      service.beta.kubernetes.io/aws-load-balancer-access-log-enabled: "true"
      service.beta.kubernetes.io/aws-load-balancer-backend-protocol: tcp
      service.beta.kubernetes.io/aws-load-balancer-cross-zone-load-balancing-enabled: "false"
      service.beta.kubernetes.io/aws-load-balancer-nlb-target-type: ip
      service.beta.kubernetes.io/aws-load-balancer-proxy-protocol: '*'
      service.beta.kubernetes.io/aws-load-balancer-scheme: internet-facing
      service.beta.kubernetes.io/aws-load-balancer-type: external
      service.beta.kubernetes.io/azure-dns-label-name: XXX-YYY
      service.beta.kubernetes.io/azure-load-balancer-tcp-idle-timeout: "30"
    externalTrafficPolicy: Local
  tolerations:
  - effect: NoSchedule
    key: kubernetes.azure.com/scalesetpriority
    operator: Equal
    value: spot
  topologySpreadConstraints:
  - labelSelector:
      matchLabels:
        app.kubernetes.io/instance: XXX-nginx-ingress
    maxSkew: 1
    minDomains: 3
    topologyKey: topology.kubernetes.io/zone
    whenUnsatisfiable: DoNotSchedule
defaultBackend:
  enabled: false
  image:
    digest: ""
    pullPolicy: IfNotPresent
  replicaCount: 2
  • Current State of the controller:
kubectl describe ingressclasses
Name:         XXX-nginx
Labels:       k8smgmt.io/project=dev              
Controller:   k8s.io/aris-ingress-nginx
Events:       <none>
kubectl -n management describe pod ingress-nginx-gh-controller-7c89d8c746-46rkf
Name:             ingress-nginx-gh-controller-7c89d8c746-46rkf
Namespace:        management
Priority:         0
Service Account:  ingress-nginx-gh
Node:             aks-ngzone3-40415517-vmss000000/10.10.0.152
Start Time:       Thu, 18 Jan 2024 12:08:12 +0100
Labels:           app.kubernetes.io/component=controller
                  app.kubernetes.io/instance=ingress-nginx-gh
                  app.kubernetes.io/managed-by=Helm
                  app.kubernetes.io/name=ingress-nginx
                  app.kubernetes.io/part-of=ingress-nginx
                  app.kubernetes.io/version=1.9.5
                  helm.sh/chart=ingress-nginx-4.9.0
                  k8smgmt.io/project=dev
                  pod-template-hash=7c89d8c746
                  rep-addon=ingress-nginx-gh
                  rep-cluster=2l877nk
                  rep-cluster-name=az-sc-s-azure-vnet-1
                  rep-drift-reconcillation=enabled
                  rep-organization=72d74e2
                  rep-partner=rx28oml
                  rep-placement=k05vxyl
                  rep-project=kgxwgem
                  rep-project-name=dev
                  rep-workloadid=kv6jwxy
Annotations:      <none>
Status:           Running
IP:               10.10.0.173
IPs:
  IP:           10.10.0.173
Controlled By:  ReplicaSet/ingress-nginx-gh-controller-7c89d8c746
Containers:
  controller:
    Container ID:    containerd://aa9632781deef10e44cf77f984baf92ce0f684d51fb49089812e65f4cd8987a7
    Image:           registry.k8s.io/ingress-nginx/controller:v1.9.5
    Image ID:        registry.k8s.io/ingress-nginx/controller@sha256:b3aba22b1da80e7acfc52b115cae1d4c687172cbf2b742d5b502419c25ff340e
    Ports:           80/TCP, 443/TCP, 10254/TCP, 8443/TCP
    Host Ports:      0/TCP, 0/TCP, 0/TCP, 0/TCP
    SeccompProfile:  RuntimeDefault
    Args:
      /nginx-ingress-controller
      --publish-service=$(POD_NAMESPACE)/ingress-nginx-gh-controller
      --election-id=ingress-nginx-gh-leader
      --controller-class=k8s.io/XXX-ingress-nginx
      --ingress-class=XXX-nginx
      --configmap=$(POD_NAMESPACE)/ingress-nginx-gh-controller
      --validating-webhook=:8443
      --validating-webhook-certificate=/usr/local/certificates/cert
      --validating-webhook-key=/usr/local/certificates/key
      --default-ssl-certificate=management/tls-secret
    State:          Running
      Started:      Thu, 18 Jan 2024 12:08:13 +0100
    Ready:          True
    Restart Count:  0
    Requests:
      cpu:      100m
      memory:   300Mi
    Liveness:   http-get http://:10254/healthz delay=10s timeout=1s period=10s #success=1 #failure=5
    Readiness:  http-get http://:10254/healthz delay=10s timeout=1s period=10s #success=1 #failure=3
    Environment:
      POD_NAME:       ingress-nginx-gh-controller-7c89d8c746-46rkf (v1:metadata.name)
      POD_NAMESPACE:  management (v1:metadata.namespace)
      LD_PRELOAD:     /usr/local/lib/libmimalloc.so
    Mounts:
      /usr/local/certificates/ from webhook-cert (ro)
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-4r7np (ro)
Conditions:
  Type              Status
  Initialized       True
  Ready             True
  ContainersReady   True
  PodScheduled      True
Volumes:
  webhook-cert:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  ingress-nginx-gh-admission
    Optional:    false
  kube-api-access-4r7np:
    Type:                     Projected (a volume that contains injected data from multiple sources)
    TokenExpirationSeconds:   3607
    ConfigMapName:            kube-root-ca.crt
    ConfigMapOptional:        <nil>
    DownwardAPI:              true
QoS Class:                    Burstable
Node-Selectors:               kubernetes.io/os=linux
Tolerations:                  kubernetes.azure.com/scalesetpriority=spot:NoSchedule
                              node.kubernetes.io/memory-pressure:NoSchedule op=Exists
                              node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                              node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Topology Spread Constraints:  topology.kubernetes.io/zone:DoNotSchedule when max skew 1 is exceeded for selector app.kubernetes.io/instance=XXX-nginx-ingress
Events:
  Type    Reason     Age   From                      Message
  ----    ------     ----  ----                      -------
  Normal  Scheduled  19m   default-scheduler         Successfully assigned management/ingress-nginx-gh-controller-7c89d8c746-46rkf to aks-ngzone3-40415517-vmss000000
  Normal  Pulled     19m   kubelet                   Container image "registry.k8s.io/ingress-nginx/controller:v1.9.5" already present on machine
  Normal  Created    19m   kubelet                   Created container controller
  Normal  Started    19m   kubelet                   Started container controller
  Normal  RELOAD     19m   nginx-ingress-controller  NGINX reload triggered due to a change in configuration
kubectl -n management describe svc ingress-nginx-gh-controller
Name:                     ingress-nginx-gh-controller
Namespace:                management
Labels:                   app.kubernetes.io/component=controller
                          app.kubernetes.io/instance=ingress-nginx-gh
                          app.kubernetes.io/managed-by=Helm
                          app.kubernetes.io/name=ingress-nginx
                          app.kubernetes.io/part-of=ingress-nginx
                          app.kubernetes.io/version=1.9.5
                          helm.sh/chart=ingress-nginx-4.9.0
                          k8smgmt.io/project=dev
Annotations:              meta.helm.sh/release-name: ingress-nginx-gh
                          meta.helm.sh/release-namespace: management
                          rep-drift-action: notify
                          service.beta.kubernetes.io/azure-dns-label-name: XXX
                          service.beta.kubernetes.io/azure-load-balancer-health-probe-request-path: /healthz
                          service.beta.kubernetes.io/azure-load-balancer-tcp-idle-timeout: 30
Selector:                 app.kubernetes.io/component=controller,app.kubernetes.io/instance=ingress-nginx-gh,app.kubernetes.io/name=ingress-nginx
Type:                     LoadBalancer
IP Family Policy:         SingleStack
IP Families:              IPv4
IP:                       10.0.229.32
IPs:                      10.0.229.32
LoadBalancer Ingress:     4.225.31.13
Port:                     http  80/TCP
TargetPort:               http/TCP
NodePort:                 http  31095/TCP
Endpoints:                10.10.0.162:80,10.10.0.173:80,10.10.0.190:80
Port:                     https  443/TCP
TargetPort:               https/TCP
NodePort:                 https  32420/TCP
Endpoints:                10.10.0.162:443,10.10.0.173:443,10.10.0.190:443
Session Affinity:         None
External Traffic Policy:  Local
HealthCheck NodePort:     32061
Events:                   <none>
kubectl -n management describe ingress argocd-ingress
Name:             argocd-ingress
Labels:           k8smgmt.io/project=dev
Namespace:        management
Address:          x.x.x.x
Ingress Class:    aris-nginx
Default backend:  <default>
TLS:
  management-tls-secret terminates x.x.x
Rules:
  Host                                         Path  Backends
  ----                                         ----  --------
  x.x.x
                                               /   argocd-gh-server:80 (10.10.1.10:8080)
Annotations:                                   argocd.argoproj.io/sync-wave: 1
                                               external-dns.alpha.kubernetes.io/create-dns-record: true
                                               kubernetes.io/tls-acme: true
                                               nginx.ingress.kubernetes.io/rewrite-target: /
                                               nginx.ingress.kubernetes.io/ssl-redirect: true
                                               nginx.ingress.kubernetes.io/whitelist-source-range: x.x.x/32
                                               rep-drift-action: notify
Events:                                        <none>

Anything else we need to know:

Seems to be related to #10863

@617m4rc added the kind/bug label on Jan 18, 2024
@k8s-ci-robot added the needs-triage label on Jan 18, 2024
@k8s-ci-robot
Contributor

This issue is currently awaiting triage.

If Ingress contributors determine this is a relevant issue, they will accept it by applying the triage/accepted label and provide further guidance.

The triage/accepted label can be added by org members by writing /triage accepted in a comment.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@strongjz
Member

Why are there both AWS and Azure annotations in `helm -n management get values ingress-nginx-gh`?

      service.beta.kubernetes.io/aws-load-balancer-access-log-emit-interval: "60"
      service.beta.kubernetes.io/aws-load-balancer-access-log-enabled: "true"
      service.beta.kubernetes.io/aws-load-balancer-backend-protocol: tcp
      service.beta.kubernetes.io/aws-load-balancer-cross-zone-load-balancing-enabled: "false"
      service.beta.kubernetes.io/aws-load-balancer-nlb-target-type: ip
      service.beta.kubernetes.io/aws-load-balancer-proxy-protocol: '*'
      service.beta.kubernetes.io/aws-load-balancer-scheme: internet-facing
      service.beta.kubernetes.io/aws-load-balancer-type: external
      service.beta.kubernetes.io/azure-dns-label-name: XXX-YYY
      service.beta.kubernetes.io/azure-load-balancer-tcp-idle-timeout: "30"

The controller pod looks healthy

Can you reach the service for the controller? The endpoints? Is there a firewall or network policy in place?

IP:                       10.0.229.32
IPs:                      10.0.229.32
LoadBalancer Ingress:     4.225.31.13
Port:                     http  80/TCP
TargetPort:               http/TCP
NodePort:                 http  31095/TCP
Endpoints:                10.10.0.162:80,10.10.0.173:80,10.10.0.190:80
Port:                     https  443/TCP
TargetPort:               https/TCP
NodePort:                 https  32420/TCP
Endpoints:                10.10.0.162:443,10.10.0.173:443,10.10.0.190:443
Session Affinity:         None
External Traffic Policy:  Local
HealthCheck NodePort:     32061

We implemented annotation validation in 1.9.0. Are you using any annotations in the ingress objects?

@longwuyuan
Contributor

@617m4rc We also added instructions for some basic testing, so that status can be reported with data: https://kubernetes.github.io/ingress-nginx/troubleshooting/#a-simple-test-of-the-basic-ingress-controller-routing. Please try them and add more information about your use of the latest version of the ingress-nginx controller (a sketch of such a test follows below).
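
A minimal sketch of the kind of routing test that page describes. The deployment name and hostname are placeholders; the ingress class and load balancer IP are taken from the report above:

# Create a simple backend and expose it inside the cluster:
kubectl create deployment demo --image=httpd --port=80
kubectl expose deployment demo --port=80
# Route the test hostname to it via the custom ingress class:
kubectl create ingress demo --class=XXX-nginx --rule="demo.example.com/*=demo:80"
# Send a request through the Azure LB with the test host header:
curl --header "Host: demo.example.com" http://4.225.31.13/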

@Gacko
Member

Gacko commented Jan 19, 2024

/assign

@Gacko
Member

Gacko commented Jan 27, 2024

Hello!

So, I just created a fresh AKS cluster with the "Production Standard" preset, but without Azure Policy & Monitor. I was able to deploy Ingress NGINX and successfully send requests through it using the following command and the following two values examples:

helm install --namespace ingress-nginx ingress-nginx https://github.com/kubernetes/ingress-nginx/releases/download/helm-chart-4.9.1/ingress-nginx-4.9.1.tgz --values values.yaml

controller:
  service:
    annotations:
      service.beta.kubernetes.io/azure-load-balancer-health-probe-request-path: /healthz

controller:
  service:
    externalTrafficPolicy: Local

Some notes on what's actually happening when setting the above values:

By default the Ingress NGINX chart comes with externalTrafficPolicy: Cluster. The Azure Cloud Controller Manager therefore creates two health probes, one of them pointing to the HTTP node port, the other pointing to the HTTPS node port. Azure then requests / to check if a node in your backend pool is healthy and ready to serve traffic.

Assuming your node's IP is 10.224.0.222 and the HTTP node port of the Ingress NGINX Controller service is 32465, then Azure calls http://10.224.0.222:32465/ and expects HTTP/1.1 200 OK as response.

Internally, requests sent to the HTTP node port are forwarded to the Ingress NGINX Controller pod's HTTP port. Since Azure requests /, which is not the registered health check path, and the Ingress NGINX Controller does not know that path in combination with a plain IP, it responds with HTTP/1.1 404 Not Found, so Azure marks the particular node as unhealthy.

This is why you're setting service.beta.kubernetes.io/azure-load-balancer-health-probe-request-path: /healthz. With this annotation, the Azure Cloud Controller Manager sets up health checks with /healthz as request path. The Ingress NGINX Controller knows this path and answers HTTP/1.1 200 OK. The nodes in your backend pool therefore get marked healthy.
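
To illustrate, this is roughly what the two probe requests look like from inside the cluster network, using the hypothetical node IP and node port from above:

# Default probe path: not a health check path the controller knows, so it answers 404.
curl -i http://10.224.0.222:32465/
# HTTP/1.1 404 Not Found

# Probe path set via the annotation: a known health endpoint, so it answers 200.
curl -i http://10.224.0.222:32465/healthz
# HTTP/1.1 200 OK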

For the second values example, things are a bit easier. externalTrafficPolicy: Local tells Kubernetes to only accept traffic on nodes which actually run a pod of your service. Additionally, the health check is no longer handled by your workload itself but by kube-proxy or its replacement. Kubernetes therefore allocates a separate health check node port.

At least on Azure, the little piece of software backing this health check node port answers requests on / with HTTP/1.1 200 OK and a JSON body, unlike the Ingress NGINX Controller. It also answers /healthz with HTTP/1.1 200 OK, even though the body is empty there.
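
A sketch of probing that port, using the HealthCheck NodePort (32061) from the service description above and one of the node IPs listed earlier:

# The health check node port answers both paths with 200:
curl -i http://10.10.0.152:32061/         # HTTP/1.1 200 OK, JSON body
curl -i http://10.10.0.152:32061/healthz  # HTTP/1.1 200 OK, empty body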

So ideally you set both of them if you do not want traffic forwarded to nodes that are not running a pod of your workload and also want to use the proper health check request path.
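
Combined, that would look like the following values (simply merging the two examples from above):

controller:
  service:
    externalTrafficPolicy: Local
    annotations:
      service.beta.kubernetes.io/azure-load-balancer-health-probe-request-path: /healthz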

From your service manifest I can see you are already using externalTrafficPolicy: Local and azure-load-balancer-health-probe-request-path: /healthz. So theoretically and according to my tests, it should work.

The only other differences I can spot between chart version 4.7.3 (the one you're using according to the image version you're mentioning in the title) and 4.9.0 (the one you're using according to helm ls) are the tightened security context settings and changes in network policies.

As you're actively enabling network policies for admission webhooks, I assume your environment is relying on network policies. In chart version 4.7.3, we had a network policy which was intended for the admission webhook port of the Ingress NGINX Controller:

https://github.com/kubernetes/ingress-nginx/blob/helm-chart-4.7.3/charts/ingress-nginx/templates/controller-webhooks-networkpolicy.yaml

Unfortunately and due to how this was defined, it simply allowed all ingress traffic to the Ingress NGINX Controller pods. Luckily you're enabling this network policy by setting controller.admissionWebhooks.networkPolicyEnabled: true.

In chart version 4.9.0, this value no longer affects the creation of the now-fixed network policy. Using your values and diffing the resulting templates of both chart versions, you can see that this "allow all ingress" network policy is gone. You need to actively set controller.networkPolicy.enabled: true in this version. We put some work into aligning values there.
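
To verify which policies actually end up in the release namespace, a quick check like this helps:

kubectl -n management get networkpolicy
# In 4.7.3 with your values you should see the (allow-all) webhook policy;
# in 4.9.0 the controller policy is only created when
# controller.networkPolicy.enabled is set to true.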

Could you please check my suggestions? As stated before, I was able to get everything up and running on AKS, so I guess it's more related to your setup and the actual changes made between 4.7.3 and 4.9.0.

Hope to hear from you
Marco

@Gacko
Member

Gacko commented Jan 27, 2024

I'm just wondering why your Azure Load Balancer health checks are failing with the new chart. There's no change related to that and network policies should also not affect them.

Could you please set the following values? They are the new ones for what you want to achieve:

controller:
  networkPolicy:
    enabled: true
  admissionWebhooks:
    patch:
      networkPolicy:
        enabled: true
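
Applied to your existing release, that could look like this; the values file name is a placeholder:

helm upgrade --namespace management ingress-nginx-gh \
  https://github.com/kubernetes/ingress-nginx/releases/download/helm-chart-4.9.0/ingress-nginx-4.9.0.tgz \
  --reuse-values --values networkpolicy-values.yaml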

In the meantime I will further investigate the possible root cause.

@Gacko
Member

Gacko commented Jan 28, 2024

btw, if those values are the ones you usually use, I think you have a typo:

  topologySpreadConstraints:
  - labelSelector:
      matchLabels:
        app.kubernetes.io/instance: XXX-nginx-ingress # <-- Shouldn't this be "XXX-ingress-nginx"?
    maxSkew: 1
    minDomains: 3
    topologyKey: topology.kubernetes.io/zone
    whenUnsatisfiable: DoNotSchedule

@Gacko
Member

Gacko commented Jan 28, 2024

I created another AKS cluster and used the values you provided, with a few exceptions:

% diff *.yaml
41,42c41,42
<   extraArgs:
<     default-ssl-certificate: management/tls-secret
---
>   # extraArgs:
>   #   default-ssl-certificate: management/tls-secret
57c57
<       enabled: true
---
>       # enabled: true
70c70
<           - XXX-ingress-nginx
---
>           - ingress-nginx
90c90
<       service.beta.kubernetes.io/azure-dns-label-name: XXX-YYY
---
>       # service.beta.kubernetes.io/azure-dns-label-name: XXX-YYY
91a92
>       service.beta.kubernetes.io/azure-load-balancer-health-probe-request-path: /healthz
98,105c99,106
<   topologySpreadConstraints:
<   - labelSelector:
<       matchLabels:
<         app.kubernetes.io/instance: XXX-nginx-ingress
<     maxSkew: 1
<     minDomains: 3
<     topologyKey: topology.kubernetes.io/zone
<     whenUnsatisfiable: DoNotSchedule
---
>   # topologySpreadConstraints:
>   # - labelSelector:
>   #     matchLabels:
>   #       app.kubernetes.io/instance: XXX-nginx-ingress
>   #   maxSkew: 1
>   #   minDomains: 3
>   #   topologyKey: topology.kubernetes.io/zone
>   #   whenUnsatisfiable: DoNotSchedule

I needed to disable the topology spread constraints, as my cluster didn't have multiple availability zones, and the service monitor, as I did not install the Prometheus CRDs. I also commented out the default SSL certificate and the DNS label name annotations, as they do not apply to my setup. Last but not least, I added the azure-load-balancer-health-probe-request-path annotation so my service looks the same as yours.

In the end my Ingress NGINX and the AKS load balancer health checks were working perfectly fine.

As stated above, with your values the two chart versions only differ in network policies and pod security stuff. The former can be mitigated by setting the new values as mentioned above. With them in place, the only real diff remaining is the PSS stuff. You can compare them on your own using this command:

diff \
  <(helm template --namespace ingress-nginx ingress-nginx https://github.com/kubernetes/ingress-nginx/releases/download/helm-chart-4.7.3/ingress-nginx-4.7.3.tgz --values provided.yaml \
    | grep --invert-match --extended-regexp "(app.kubernetes.io/version|helm.sh/chart):") \
  <(helm template --namespace ingress-nginx ingress-nginx https://github.com/kubernetes/ingress-nginx/releases/download/helm-chart-4.9.1/ingress-nginx-4.9.1.tgz --values new.yaml \
    | grep --invert-match --extended-regexp "(app.kubernetes.io/version|helm.sh/chart):")

@Gacko
Member

Gacko commented Feb 1, 2024

I'm closing this since we did not receive any feedback and verified the chart is working with the provided documentation. Feel free to reopen if you have further questions or information about your use case.

/close

@k8s-ci-robot
Contributor

@Gacko: Closing this issue.

In response to this:

I'm closing this since we did not receive any feedback and verified the chart is working with the provided documentation. Feel free to reopen if you have further questions or information about your use case.

/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@ppawiggers

ppawiggers commented Aug 15, 2024

I experienced the same issue. Thanks @Gacko, your solution (setting the health probe path) fixed it.
