
Server scaling causes ingress TCP retransmissions, and the client cannot access the server properly. #11508

Closed
ZhengXinwei-F opened this issue Jun 26, 2024 · 23 comments
Labels
kind/support Categorizes issue or PR as a support question. needs-priority needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. triage/needs-information Indicates an issue needs more information in order to work on it.

Comments

@ZhengXinwei-F

ZhengXinwei-F commented Jun 26, 2024

What happened:
I have a server providing list-watch capabilities exposed to the outside via an ingress-controller. When there are only 1-2 instances of the server, scaling operations cause client requests to fail but they recover quickly. However, when there are three instances, the client (using client-go) does not break its TCP connection with the ingress (this is issue 1). Additionally, the ingress continues to send TCP SYN packets to the already dead pod, resulting in continuous TCP retransmissions. As a result, the client cannot properly access the ingress.

For example:

[screenshot]

Ingress-controller logs:

2024/06/26 07:44:16 [error] 4441#4441: *4049959 upstream prematurely closed connection while reading upstream, client: 172.18.0.1, server: local-server.com, request: "GET /?watch HTTP/1.1", upstream: "http://10.244.1.170:80/?watch", host: "local-server.com:80"
2024/06/26 07:44:16 [error] 4617#4617: *4049960 upstream prematurely closed connection while reading upstream, client: 172.18.0.1, server: local-server.com, request: "GET /?watch HTTP/1.1", upstream: "http://10.244.1.170:80/?watch", host: "local-server.com:80"
172.18.0.1 - - [26/Jun/2024:07:44:16 +0000] "GET /?watch HTTP/1.1" 200 170 "-" "Go-http-client/1.1" 106 5.168 [default-server-80] [] 10.244.1.170:80 170 5.169 200 6072e3f8859cfc330e1b6742700e1205
172.18.0.1 - - [26/Jun/2024:07:44:16 +0000] "GET /?watch HTTP/1.1" 200 170 "-" "Go-http-client/1.1" 106 5.168 [default-server-80] [] 10.244.1.170:80 170 5.169 200 ac5c49504b6751d2574355d5c4fd554c
2024/06/26 07:44:16 [error] 4666#4666: *4049958 upstream prematurely closed connection while reading upstream, client: 172.18.0.1, server: local-server.com, request: "GET /?watch HTTP/1.1", upstream: "http://10.244.1.169:80/?watch", host: "local-server.com:80"
172.18.0.1 - - [26/Jun/2024:07:44:16 +0000] "GET /?watch HTTP/1.1" 200 170 "-" "Go-http-client/1.1" 106 5.173 [default-server-80] [] 10.244.1.169:80 170 5.174 200 abaa63361000599d0fbc60bac0961194
W0626 07:44:20.334066      10 controller.go:1163] Service "default/server" does not have any active Endpoint.
2024/06/26 07:45:17 [error] 4568#4568: *4050262 upstream timed out (110: Operation timed out) while connecting to upstream, client: 172.18.0.1, server: local-server.com, request: "GET /?watch HTTP/1.1", upstream: "http://10.244.1.171:80/?watch", host: "local-server.com:80"
2024/06/26 07:45:17 [error] 4443#4443: *4050261 upstream timed out (110: Operation timed out) while connecting to upstream, client: 172.18.0.1, server: local-server.com, request: "GET /?watch HTTP/1.1", upstream: "http://10.244.1.170:80/?watch", host: "local-server.com:80"
2024/06/26 07:45:17 [error] 4443#4443: *4050265 upstream timed out (110: Operation timed out) while connecting to upstream, client: 172.18.0.1, server: local-server.com, request: "GET /?watch HTTP/1.1", upstream: "http://10.244.1.169:80/?watch", host: "local-server.com:80"
2024/06/26 07:46:17 [error] 4568#4568: *4050262 upstream timed out (110: Operation timed out) while connecting to upstream, client: 172.18.0.1, server: local-server.com, request: "GET /?watch HTTP/1.1", upstream: "http://10.244.1.169:80/?watch", host: "local-server.com:80"
2024/06/26 07:46:17 [error] 4443#4443: *4050261 upstream timed out (110: Operation timed out) while connecting to upstream, client: 172.18.0.1, server: local-server.com, request: "GET /?watch HTTP/1.1", upstream: "http://10.244.1.171:80/?watch", host: "local-server.com:80"
2024/06/26 07:46:17 [error] 4443#4443: *4050265 upstream timed out (110: Operation timed out) while connecting to upstream, client: 172.18.0.1, server: local-server.com, request: "GET /?watch HTTP/1.1", upstream: "http://10.244.1.170:80/?watch", host: "local-server.com:80"
2024/06/26 07:47:17 [error] 4568#4568: *4050262 upstream timed out (110: Operation timed out) while connecting to upstream, client: 172.18.0.1, server: local-server.com, request: "GET /?watch HTTP/1.1", upstream: "http://10.244.1.170:80/?watch", host: "local-server.com:80"
172.18.0.1 - - [26/Jun/2024:07:47:17 +0000] "GET /?watch HTTP/1.1" 504 160 "-" "Go-http-client/1.1" 106 180.002 [default-server-80] [] 10.244.1.171:80, 10.244.1.169:80, 10.244.1.170:80 0, 0, 0 60.001, 60.000, 60.000 504, 504, 504 e19af9f4f40a060271392f1c90d24247
2024/06/26 07:47:17 [error] 4443#4443: *4050261 upstream timed out (110: Operation timed out) while connecting to upstream, client: 172.18.0.1, server: local-server.com, request: "GET /?watch HTTP/1.1", upstream: "http://10.244.1.169:80/?watch", host: "local-server.com:80"
172.18.0.1 - - [26/Jun/2024:07:47:17 +0000] "GET /?watch HTTP/1.1" 504 160 "-" "Go-http-client/1.1" 106 180.002 [default-server-80] [] 10.244.1.170:80, 10.244.1.171:80, 10.244.1.169:80 0, 0, 0 60.001, 60.001, 60.000 504, 504, 504 53c8f7c1fd45dd30187dd24824eeb1ec
2024/06/26 07:47:17 [error] 4443#4443: *4050265 upstream timed out (110: Operation timed out) while connecting to upstream, client: 172.18.0.1, server: local-server.com, request: "GET /?watch HTTP/1.1", upstream: "http://10.244.1.171:80/?watch", host: "local-server.com:80"
172.18.0.1 - - [26/Jun/2024:07:47:17 +0000] "GET /?watch HTTP/1.1" 504 160 "-" "Go-http-client/1.1" 106 180.002 [default-server-80] [] 10.244.1.169:80, 10.244.1.170:80, 10.244.1.171:80 0, 0, 0 60.001, 60.000, 60.000 504, 504, 504 b1e69853960f121cf045c910664e4956
172.18.0.1 - - [26/Jun/2024:07:50:06 +0000] "GET /?watch HTTP/1.1" 200 5678 "-" "Go-http-client/1.1" 106 167.703 [default-server-80] [] 10.244.1.175:80 5678 167.703 200 9eb6ce390a6ec009b46e668aba44f181
172.18.0.1 - - [26/Jun/2024:07:50:06 +0000] "GET /?watch HTTP/1.1" 200 5678 "-" "Go-http-client/1.1" 106 167.708 [default-server-80] [] 10.244.1.175:80 5678 167.708 200 fd3243efecdc005c0fdbee007772a1b1
172.18.0.1 - - [26/Jun/2024:07:50:06 +0000] "GET /?watch HTTP/1.1" 200 5678 "-" "Go-http-client/1.1" 106 167.708 [default-server-80] [] 10.244.1.173:80 5678 167.708 200 7a388eef75f9d99a22a63798782f65e8

Service "default/server" does not have any active Endpoint.

This indicates that the ingress recognizes there are no active endpoints after a failed request, yet it still attempts to connect to the already-dead pod. A packet capture shows TCP retransmissions with increasing backoff intervals.

[packet capture screenshot]

What you expected to happen:

The client can properly access the ingress.

This issue seems to be caused by some form of endpoint caching in the ingress. It does not occur when there are only 1-2 server instances, but it does occur with 3 or more. Could you please help me analyze this issue? Thank you very much.

NGINX Ingress controller version (exec into the pod and run nginx-ingress-controller --version.):


NGINX Ingress controller
Release: v1.6.3
Build: 7ae9ca2
Repository: https://github.com/kubernetes/ingress-nginx
nginx version: nginx/1.21.6


&


NGINX Ingress controller
Release: v0.49.3
Build: git-0a2ec01eb
Repository: https://github.com/kubernetes/ingress-nginx.git
nginx version: nginx/1.20.1


Kubernetes version (use kubectl version):

Client Version: v1.30.2
Kustomize Version: v5.0.4-0.20230601165947-6ce0bf390ce3
Server Version: v1.21.5
WARNING: version difference between client (1.30) and server (1.21) exceeds the supported minor version skew of +/-1

Environment:

  • Cloud provider or hardware configuration: NONE
  • OS (e.g. from /etc/os-release):
    ubuntu
  • Kernel (e.g. uname -a):
    Linux CN0314000682W 5.15.153.1-microsoft-standard-WSL2 #1 SMP Fri Mar 29 23:14:13 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux
  • Install tools:
    • Please mention how/where was the cluster created like kubeadm/kops/minikube/kind etc.
      kind
  • Basic cluster related info:
    • kubectl version
    • kubectl get nodes -o wide
Client Version: v1.30.2
Kustomize Version: v5.0.4-0.20230601165947-6ce0bf390ce3
Server Version: v1.21.5
WARNING: version difference between client (1.30) and server (1.21) exceeds the supported minor version skew of +/-1
  • How was the ingress-nginx-controller installed:
kubectl apply -f https://raw.githubusercontent.com/kubernetes/ingress-nginx/controller-v1.6.4/deploy/static/provider/kind/deploy.yaml
  • Current State of the controller:
    • kubectl describe ingressclasses
Name:         nginx
Labels:       app.kubernetes.io/component=controller
             app.kubernetes.io/instance=ingress-nginx
             app.kubernetes.io/name=ingress-nginx
             app.kubernetes.io/part-of=ingress-nginx
             app.kubernetes.io/version=1.6.3
Annotations:  <none>
Controller:   k8s.io/ingress-nginx
Events:       <none>
  • kubectl -n <ingresscontrollernamespace> get all -A -o wide
➜  ~ kubectl get all -n ingress-nginx
NAME                                           READY   STATUS      RESTARTS   AGE
pod/ingress-nginx-admission-create-x2jf2       0/1     Completed   0          27h
pod/ingress-nginx-admission-patch-528bn        0/1     Completed   0          27h
pod/ingress-nginx-controller-67d59d7bc-rt9wm   1/1     Running     0          24h

NAME                                         TYPE        CLUSTER-IP     EXTERNAL-IP   PORT(S)                      AGE
service/ingress-nginx-controller             NodePort    10.96.252.74   <none>        80:31852/TCP,443:32612/TCP   27h
service/ingress-nginx-controller-admission   ClusterIP   10.96.82.89    <none>        443/TCP                      27h

NAME                                       READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/ingress-nginx-controller   1/1     1            1           27h

NAME                                                 DESIRED   CURRENT   READY   AGE
replicaset.apps/ingress-nginx-controller-67d59d7bc   1         1         1       27h

NAME                                       COMPLETIONS   DURATION   AGE
job.batch/ingress-nginx-admission-create   1/1           24s        27h
job.batch/ingress-nginx-admission-patch    1/1           26s        27h

  • kubectl -n <ingresscontrollernamespace> describe po <ingresscontrollerpodname>
➜  ~ kubectl describe pods -n ingress-nginx ingress-nginx-controller-67d59d7bc-rt9wm
Name:             ingress-nginx-controller-67d59d7bc-rt9wm
Namespace:        ingress-nginx
Priority:         0
Service Account:  ingress-nginx
Node:             local-control-plane/172.18.0.2
Start Time:       Tue, 25 Jun 2024 17:01:51 +0800
Labels:           app.kubernetes.io/component=controller
                  app.kubernetes.io/instance=ingress-nginx
                  app.kubernetes.io/name=ingress-nginx
                  pod-template-hash=67d59d7bc
Annotations:      <none>
Status:           Running
IP:               10.244.0.10
IPs:
  IP:           10.244.0.10
Controlled By:  ReplicaSet/ingress-nginx-controller-67d59d7bc
Containers:
  controller:
    Container ID:  containerd://e47ac4872bbca44f589f2c1aef386139cda86b616618b8342871f177c89ad96b
    Image:         registry.k8s.io/ingress-nginx/controller:v1.6.3@sha256:b92667e0afde1103b736e6a3f00dd75ae66eec4e71827d19f19f471699e909d2
    Image ID:      registry.k8s.io/ingress-nginx/controller@sha256:b92667e0afde1103b736e6a3f00dd75ae66eec4e71827d19f19f471699e909d2
    Ports:         80/TCP, 443/TCP, 8443/TCP
    Host Ports:    80/TCP, 443/TCP, 0/TCP
    Args:
      /nginx-ingress-controller
      --election-id=ingress-nginx-leader
      --controller-class=k8s.io/ingress-nginx
      --ingress-class=nginx
      --configmap=$(POD_NAMESPACE)/ingress-nginx-controller
      --validating-webhook=:8443
      --validating-webhook-certificate=/usr/local/certificates/cert
      --validating-webhook-key=/usr/local/certificates/key
      --watch-ingress-without-class=true
      --publish-status-address=localhost
    State:          Running
      Started:      Tue, 25 Jun 2024 17:01:52 +0800
    Ready:          True
    Restart Count:  0
    Requests:
      cpu:      100m
      memory:   90Mi
    Liveness:   http-get http://:10254/healthz delay=10s timeout=1s period=10s #success=1 #failure=5
    Readiness:  http-get http://:10254/healthz delay=10s timeout=1s period=10s #success=1 #failure=3
    Environment:
      POD_NAME:       ingress-nginx-controller-67d59d7bc-rt9wm (v1:metadata.name)
      POD_NAMESPACE:  ingress-nginx (v1:metadata.namespace)
      LD_PRELOAD:     /usr/local/lib/libmimalloc.so
    Mounts:
      /usr/local/certificates/ from webhook-cert (ro)
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-zljhb (ro)
Conditions:
  Type              Status
  Initialized       True 
  Ready             True 
  ContainersReady   True 
  PodScheduled      True 
Volumes:
  webhook-cert:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  ingress-nginx-admission
    Optional:    false
  kube-api-access-zljhb:
    Type:                    Projected (a volume that contains injected data from multiple sources)
    TokenExpirationSeconds:  3607
    ConfigMapName:           kube-root-ca.crt
    ConfigMapOptional:       <nil>
    DownwardAPI:             true
QoS Class:                   Burstable
Node-Selectors:              ingress-ready=true
                             kubernetes.io/os=linux
Tolerations:                 node-role.kubernetes.io/control-plane:NoSchedule
                             node-role.kubernetes.io/master:NoSchedule
                             node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                             node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:                      <none>

  • kubectl -n <ingresscontrollernamespace> describe svc <ingresscontrollerservicename>
➜  ~ kubectl describe svc -n ingress-nginx  ingress-nginx-controller
Name:                     ingress-nginx-controller
Namespace:                ingress-nginx
Labels:                   app.kubernetes.io/component=controller
                          app.kubernetes.io/instance=ingress-nginx
                          app.kubernetes.io/name=ingress-nginx
                          app.kubernetes.io/part-of=ingress-nginx
                          app.kubernetes.io/version=1.6.3
Annotations:              <none>
Selector:                 app.kubernetes.io/component=controller,app.kubernetes.io/instance=ingress-nginx,app.kubernetes.io/name=ingress-nginx
Type:                     NodePort
IP Family Policy:         SingleStack
IP Families:              IPv4
IP:                       10.96.252.74
IPs:                      10.96.252.74
Port:                     http  80/TCP
TargetPort:               http/TCP
NodePort:                 http  31852/TCP
Endpoints:                10.244.0.10:80
Port:                     https  443/TCP
TargetPort:               https/TCP
NodePort:                 https  32612/TCP
Endpoints:                10.244.0.10:443
Session Affinity:         None
External Traffic Policy:  Cluster
Events:                   <none>

@ZhengXinwei-F ZhengXinwei-F added the kind/bug Categorizes issue or PR as related to a bug. label Jun 26, 2024
@k8s-ci-robot k8s-ci-robot added the needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. label Jun 26, 2024
@k8s-ci-robot
Contributor

This issue is currently awaiting triage.

If Ingress contributors determines this is a relevant issue, they will accept it by applying the triage/accepted label and provide further guidance.

The triage/accepted label can be added by org members by writing /triage accepted in a comment.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@ZhengXinwei-F
Author

Currently, I'm using nginx.ingress.kubernetes.io/service-upstream: "true" to avoid this issue by utilizing kube-proxy to forward traffic to the pods.

@longwuyuan
Contributor

@ZhengXinwei-F the information you have provided is useful for awareness, but not enough to do much analysis or debugging.

  • Click on the button to create a new bug report
  • Look at the template
  • Look at the questions asked in the template
  • Edit this issue description and answer all the questions asked in the template
  • A diagram of the architecture and concept would be great
  • Screenshots of info like logs and kubectl state are not useful. Replace the screenshots of logs and K8S state with the commands and their outputs, e.g. kubectl describe and kubectl logs
  • Explain the relationship between your app scaling and the error message in the ingress-controller log
  • If your app needs affinity and you have not configured affinity to a pod, the controller is not going to automatically send all traffic to the same backend pod. Load balancing is built into the controller and backend pods are expected to be idempotent by default. This only matters if your app depends on a persistent connection to the same pod, and therefore the same process in that pod
  • Many users run many replicas of backend pods and don't have problems, so there is no bug as such in the controller with multiple backend replicas

@ZhengXinwei-F
Author

Here is my simple demo code:

server code:

package main

import (
	"fmt"
	"net/http"
	"strings"
	"time"

	"github.com/gin-gonic/gin"
)

func watchHandler(c *gin.Context) {
	if strings.Contains(c.Request.URL.RawQuery, "watch") {
		c.Header("Transfer-Encoding", "chunked")
		c.Header("Content-Type", "text/plain")

		w := c.Writer
		ctx := c.Request.Context()

		ticker := time.NewTicker(1 * time.Second)
		defer ticker.Stop()
		for {
			select {
			case <-ticker.C:
				now := time.Now().Format(time.RFC3339)
				// Manually write chunked framing (size in hex, then payload)
				// on top of the response writer.
				fmt.Fprintf(w, "%x\r\n%s\r\n\r\n", len(now), now)
				flusher, ok := w.(http.Flusher)
				if ok {
					flusher.Flush()
				}

			case <-ctx.Done():
				fmt.Println("Client disconnected")
				return
			}
		}
	} else {
		c.String(http.StatusOK, "Not watching...")
	}
}

func main() {
	r := gin.Default()

	r.GET("/", watchHandler)

	fmt.Println("Listening on http://127.0.0.1:8080/")
	err := r.Run(":80")
	if err != nil {
		panic(err)
	}
}

client:

package main

import (
	"bufio"
	"fmt"
	"io"
	"net/http"
	"strings"
	"time"
)

func main() {
	url := "http://local-server.com:80/?watch"
	for i := 0; i < 3; i++ {
		go func() {
			for {
				err := watch(url)
				if err != nil {
					fmt.Printf("Error watching %s: %v\n", url, err)
				}
				time.Sleep(1 * time.Second)
			}
		}()
	}
	time.Sleep(1 * time.Hour)
}

func watch(url string) error {
	fmt.Println("=== start watching at ",time.Now())
	resp, err := http.Get(url)
	if err != nil {
		return err
	}
	defer resp.Body.Close()

	if resp.StatusCode != http.StatusOK {
		return fmt.Errorf("[%s] unexpected status code: %d", time.Now(), resp.StatusCode)
	}

	reader := bufio.NewReader(resp.Body)
	for {
		chunk, err := reader.ReadBytes('\n')
		if err != nil {
			if err == io.EOF {
				return nil
			}
			fmt.Println(err.Error())
			return err
		}

		fmt.Println(strings.TrimSpace(string(chunk)))
	}
}

server.yaml:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: server
spec:
  replicas: 3
  selector:
    matchLabels:
      app: server
  template:
    metadata:
      labels:
        app: server
    spec:
      terminationGracePeriodSeconds: 30
      containers:
      - name: server-container
        image: server:1.0.5 
        ports:
        - containerPort: 80
---
apiVersion: v1
kind: Service
metadata:
  name: server
spec:
  selector:
    app: server
  ports:
    - protocol: TCP
      port: 80  
      targetPort: 80 
---
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  annotations:
    nginx.ingress.kubernetes.io/cors-allow-origin: '*'
    nginx.ingress.kubernetes.io/enable-cors: "true"
    nginx.ingress.kubernetes.io/proxy-body-size: 1024m
    nginx.ingress.kubernetes.io/proxy-connect-timeout: "60"
    nginx.ingress.kubernetes.io/proxy-read-timeout: "2592000"
    nginx.ingress.kubernetes.io/proxy-send-timeout: "2592000"
      #nginx.ingress.kubernetes.io/websocket-services: "server"
  name: server-ingress
spec:
  ingressClassName: nginx
  rules:
    - host: local-server.com 
      http:
        paths:
          - path: /  
            pathType: Prefix
            backend:
              service:
                name: server
                port:
                  number: 80

hosts:
172.18.0.2 local-server.com (kind control-plane nodes)
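
As a side note on the demo client above, here is a hedged sketch of a variant whose HTTP transport sets explicit dial, keep-alive, and response-header timeouts, so that a dead or unreachable upstream surfaces as an error quickly instead of the request hanging. The timeout values and the watchClient name are illustrative assumptions, not something prescribed in this issue:

package main

import (
	"bufio"
	"fmt"
	"io"
	"net"
	"net/http"
	"time"
)

// watchClient is a shared HTTP client with explicit limits (values are
// illustrative assumptions):
//   - Dialer.Timeout bounds how long connection attempts (unanswered SYNs) may take,
//   - Dialer.KeepAlive enables TCP keep-alive probes on established connections,
//   - ResponseHeaderTimeout bounds how long the upstream may take to start responding.
var watchClient = &http.Client{
	Transport: &http.Transport{
		DialContext: (&net.Dialer{
			Timeout:   5 * time.Second,
			KeepAlive: 15 * time.Second,
		}).DialContext,
		ResponseHeaderTimeout: 10 * time.Second,
	},
}

// watch streams the chunked response line by line, returning on EOF or error.
func watch(url string) error {
	fmt.Println("=== start watching at", time.Now())
	resp, err := watchClient.Get(url)
	if err != nil {
		return err
	}
	defer resp.Body.Close()

	if resp.StatusCode != http.StatusOK {
		return fmt.Errorf("unexpected status code: %d", resp.StatusCode)
	}

	reader := bufio.NewReader(resp.Body)
	for {
		line, err := reader.ReadString('\n')
		if err != nil {
			if err == io.EOF {
				return nil
			}
			return err
		}
		fmt.Print(line)
	}
}

func main() {
	for {
		if err := watch("http://local-server.com:80/?watch"); err != nil {
			fmt.Printf("Error watching: %v\n", err)
		}
		time.Sleep(1 * time.Second)
	}
}

Here watch keeps the same signature as the demo's watch function, so it could also be dropped into the original main's goroutine loop unchanged.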

@ZhengXinwei-F
Author

@longwuyuan

  • If your app needs affinity and you have not configured affinity to a pod, the controller is not going to automatically send all traffic to the same backend pod. Load balancing is built into the controller and backend pods are expected to be idempotent by default. This only matters if your app depends on a persistent connection to the same pod, and therefore the same process in that pod
  • Many users run many replicas of backend pods and don't have problems, so there is no bug as such in the controller with multiple backend replicas

Thank you so much for your response. I will pay attention to my information format in future issues. However, I'm particularly curious about why the ingress continues to send TCP requests to the removed pod even after all pods have been deleted.

@longwuyuan
Contributor

At a glance, you did NOT test with a non-Golang HTTP client, for example:

  • kubectl create deploy test0 --image nginx:alpine --port 80 --replicas 3
  • kubectl expose deploy test0 --port 80
  • kubectl create ing test0 --class nginx --rule test0.mydomain.com/"*"=test0:80
  • curl test0.mydomain.com --resolve test0.mydomain.com:80:$EXTERNAL_IP

So there is no data about established, un-terminated connections to a backend.

Secondly, I can't find this annotation "websocket-services" anywhere on this page https://kubernetes.github.io/ingress-nginx/user-guide/nginx-configuration/annotations/ .

This means that the data and information you are presenting here is not as reliable as you would like it to be.

I think you are experiencing a genuine problem, so it's fair to assume that you want to ask about the persistent connections and the continued requests here. But you are not putting together the impact of factors like:

  • the application profile (the http.Get(url) call)
  • the pod termination specs for scaling down or up
  • the timeout configurations
  • the invalid annotation "websocket-services"

in a way that other readers here would need for analysis.

EndpointSlices are used to maintain the list of backend pods, so a lot of intricate debugging steps are involved in answering your question.

Please wait for comments from others as well. I have no data to comment on your question.

@longwuyuan
Contributor

/remove-kind bug
/kind support
/triage needs-information

@k8s-ci-robot k8s-ci-robot added kind/support Categorizes issue or PR as a support question. triage/needs-information Indicates an issue needs more information in order to work on it. and removed kind/bug Categorizes issue or PR as related to a bug. labels Jun 26, 2024
@ZhengXinwei-F
Author

At a glance, you did NOT test with a non-Golang HTTP client, for example:

  • kubectl create deploy test0 --image nginx:alpine --port 80 --replicas 3
  • kubectl expose deploy test0 --port 80
  • kubectl create ing test0 --class nginx --rule test0.mydomain.com/"*"=test0:80
  • curl test0.mydomain.com --resolve test0.mydomain.com:80:$EXTERNAL_IP

So there is no data about established, un-terminated connections to a backend.

Secondly, I can't find this annotation "websocket-services" anywhere on this page https://kubernetes.github.io/ingress-nginx/user-guide/nginx-configuration/annotations/ .

This means that the data and information you are presenting here is not as reliable as you would like it to be.

I think you are experiencing a genuine problem, so it's fair to assume that you want to ask about the persistent connections and the continued requests here. But you are not putting together the impact of factors like:

  • the application profile (the http.Get(url) call)
  • the pod termination specs for scaling down or up
  • the timeout configurations
  • the invalid annotation "websocket-services"

in a way that other readers here would need for analysis.

EndpointSlices are used to maintain the list of backend pods, so a lot of intricate debugging steps are involved in answering your question.

Please wait for comments from others as well. I have no data to comment on your question.

I think your response is hasty. The websocket-services annotation does not affect functionality. From the logs, it is clear that the ingress attempted to access an already deleted pod. I believe this is a bug, not a support issue.

/remove-kind support
/kind bug

@k8s-ci-robot k8s-ci-robot added kind/bug Categorizes issue or PR as related to a bug. and removed kind/support Categorizes issue or PR as a support question. labels Jun 26, 2024
@ZhengXinwei-F
Author

What happened: I have a server providing list-watch capabilities exposed to the outside via an ingress-controller. When there are only 1-2 instances of the server, scaling operations cause client requests to fail but they recover quickly. However, when there are three instances, the client (using client-go) does not break its TCP connection with the ingress (this is issue 1). Additionally, the ingress continues to send TCP SYN packets to the already dead pod, resulting in continuous TCP retransmissions. As a result, the client cannot properly access the ingress.

It's important to note that both the connection between the client and the ingress and the connection between the ingress and the server are long-lived HTTP connections that use chunked transfer encoding to stream data continuously.

@ZhengXinwei-F
Author

The core issues are as follows:

  • When there are 1-2 instances, the ingress does not attempt to access deleted pods.
  • When there are 3 or more instances, the client successfully establishes a TCP connection with the ingress, but the ingress attempts to access deleted pods.

@longwuyuan
Contributor

ok sorry. wait for comments from others who can reproduce the problem.

@ZhengXinwei-F
Author

ok sorry. wait for comments from others who can reproduce the problem.

Thank you again for your prompt response.

@longwuyuan
Contributor

I tried to reproduce with the nginx:alpine image and I am not able to reproduce that error message, so I don't think this can be classified as a bug, because there is no reproduction procedure.

@ZhengXinwei-F
Author

I tried to reproduce with the nginx:alpine image and I am not able to reproduce that error message, so I don't think this can be classified as a bug, because there is no reproduction procedure.

I apologize for the incomplete reproduction information. I will provide a more detailed reproduction scenario shortly. Currently, I suspect it might be related to endpointSlice informer delays. I am also attempting to read through the ingress-controller code, but it is not a simple task.

@longwuyuan
Contributor

Tagging this issue as a bug is invalid because the basic requirements of a bug report are unmet:

  • The controller versions you posted are no longer supported, simply because there have been too many changes to the code. How can it be a bug when the version of the controller is not supported?

  • Many users scale up and down 24/7/365, yet there are hardly any other reports of scaling being broken. How can scaling be broken only for you? The thinking is that the cause is an impact of your use case and your config. Until scaling can be seen to be broken with a simple test like:

kubectl create deploy test0 --image nginx:alpine --port 80 --replicas 5
kubectl scale deploy test0 --replicas 1

there is no data for a developer to accept the triage that scaling is broken in the controller.

@ZhengXinwei-F
Author

Tagging this issue as a bug is invalid because the basic requirements of a bug report are unmet:

  • The controller versions you posted are no longer supported, simply because there have been too many changes to the code. How can it be a bug when the version of the controller is not supported?

  • Many users scale up and down 24/7/365, yet there are hardly any other reports of scaling being broken. How can scaling be broken only for you? The thinking is that the cause is an impact of your use case and your config. Until scaling can be seen to be broken with a simple test like:

kubectl create deploy test0 --image nginx:alpine --port 80 --replicas 5
kubectl scale deploy test0 --replicas 1

there is no data for a developer to accept the triage that scaling is broken in the controller.
I will upgrade the ingress-controller to the latest version to see if this resolves the issue. Additionally, since I am using HTTP chunked transfer over long-lived TCP connections, the scenario is a little more complex.

@longwuyuan
Contributor

longwuyuan commented Jun 27, 2024

I wanted to get to it later, if the data showed any relevance. But since you have repeatedly mentioned chunking, I have to make it clear that setting chunking in app code is a big problem.

That is because v1.10.1 of the controller enables chunking out of the box, as per the latest nginx design. Another user removed the code that set the chunking header from his app in #11162.

But let's not go there aimlessly. Just try to scale down from 5 to 1, with a vanilla nginx:alpine image, and gather the same debugging data, using controller v1.10.1. Do it on a kind cluster or a minikube cluster so others can replicate your test and reproduce the problem you are reporting, because the controller code will work the same regardless of whether the client is your Golang app or a simple curl client.
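
For context on the chunking point above, here is a minimal sketch (an assumption about how the demo server could be adapted, not code from this issue or from #11162) of the watch handler with the manual chunking removed. An HTTP/1.1 response that is flushed without a Content-Length is chunk-encoded by net/http automatically, so the handler only needs to write the payload and flush:

package main

import (
	"fmt"
	"net/http"
	"time"

	"github.com/gin-gonic/gin"
)

// watchHandler streams one timestamp per second. It does not set a
// Transfer-Encoding header and does not write chunk framing by hand;
// net/http chunk-encodes the streamed response on its own.
func watchHandler(c *gin.Context) {
	c.Header("Content-Type", "text/plain")

	w := c.Writer
	ctx := c.Request.Context()

	ticker := time.NewTicker(1 * time.Second)
	defer ticker.Stop()
	for {
		select {
		case <-ticker.C:
			fmt.Fprintln(w, time.Now().Format(time.RFC3339))
			if f, ok := w.(http.Flusher); ok {
				f.Flush()
			}
		case <-ctx.Done():
			return
		}
	}
}

func main() {
	r := gin.Default()
	r.GET("/", watchHandler)
	if err := r.Run(":80"); err != nil {
		panic(err)
	}
}

Compared with the demo server posted earlier, this drops the hand-set Transfer-Encoding header and the hand-written chunk framing, which is the kind of app-side change referenced above.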

@ZhengXinwei-F
Author

I've been researching this issue for the past few days. During a deployment rollout of the backend server, the ingress indeed continues to route traffic to deleted pods. From code analysis, this appears to be related to the ingress controller list-watching ConfigMaps, Ingresses, EndpointSlices, and other resources: each event triggers a complete upstream refresh, and this process is serial. Reducing proxy-connect-timeout can effectively minimize the time it takes to close the long-lived connections. This is my preliminary understanding.

Additionally, there are other issues I'm encountering. It seems that the ingress's TCP keep-alive uses the kernel configuration within the container, causing it to wait a long time without sending TCP keep-alive packets. Only the server's TCP keep-alive is functioning, which is ineffective when the server restarts.

Furthermore, Kubernetes's client-go watchHandler can hang due to this, and despite waiting well beyond the watchHandler timeout, it still doesn't trigger a relist. This issue is quite complex and cannot be easily reproduced. It's evidently not just a problem within the ingress-nginx alone.
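
For the hang described above, one application-level guard (a sketch that assumes a plain net/http watcher like the demo client, not client-go itself; readWithIdleTimeout and the 30-second interval are hypothetical) is an idle-read watchdog: if no data arrives within the expected interval, close the response body so the blocked Read returns an error and the caller can re-establish the watch.

package main

import (
	"bufio"
	"fmt"
	"io"
	"net/http"
	"time"
)

// readWithIdleTimeout reads a streamed body line by line and closes it if no
// data arrives for maxIdle, so a silently dead connection surfaces as a read
// error instead of blocking forever.
func readWithIdleTimeout(body io.ReadCloser, maxIdle time.Duration) error {
	// If the timer fires, closing the body unblocks the pending Read.
	timer := time.AfterFunc(maxIdle, func() { body.Close() })
	defer timer.Stop()
	defer body.Close()

	reader := bufio.NewReader(body)
	for {
		line, err := reader.ReadString('\n')
		if err != nil {
			return fmt.Errorf("stream ended or stalled: %w", err)
		}
		timer.Reset(maxIdle)
		fmt.Print(line)
	}
}

func main() {
	resp, err := http.Get("http://local-server.com:80/?watch")
	if err != nil {
		panic(err)
	}
	// The demo server emits one line per second; 30s without data means the
	// stream has stalled.
	if err := readWithIdleTimeout(resp.Body, 30*time.Second); err != nil {
		fmt.Println(err)
	}
}

Closing the body from the timer callback is what unblocks the pending Read; the caller then treats the returned error as a signal to reconnect.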

Thanks to @longwuyuan for the assistance.

I will change this issue to support. If anyone has encountered similar problems, feel free to discuss.

/remove-kind bug
/kind support

@k8s-ci-robot k8s-ci-robot added kind/support Categorizes issue or PR as a support question. and removed kind/bug Categorizes issue or PR as related to a bug. labels Jul 2, 2024
@longwuyuan
Contributor

Thank you for the update @ZhengXinwei-F

Slowly, the requirement is shifting toward apps being completely agnostic to platform, infra, and network, simply because app code needs to be portable and flexible.

You can close the issue, as there are no more questions for the project, and continue the discussion in the closed issue. Open issues just add to the tally. You can reopen it if an action item for the project comes up.

@ZhengXinwei-F
Author

Thank you for the update @ZhengXinwei-F

Slowly, the requirement is shifting toward apps being completely agnostic to platform, infra, and network, simply because app code needs to be portable and flexible.

You can close the issue, as there are no more questions for the project, and continue the discussion in the closed issue. Open issues just add to the tally. You can reopen it if an action item for the project comes up.

My issue has not been fully resolved, so I need to keep this issue open for a while. I plan to close it in about a week.

@longwuyuan
Contributor

ok thanks for updating that. cheers.

@ZhengXinwei-F
Author

/close

@k8s-ci-robot
Contributor

@ZhengXinwei-F: Closing this issue.

In response to this:

/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.
