proxy-read-timeout annotations getting ignored after v1.11.1 upgrade from 1.10.1 #11850

varunthakur2480 · 2024-08-22T11:54:30Z

What happened:

2024/08/22 10:51:43 [error] 41#41: *3030 upstream timed out (110: Operation timed out) while reading response header from upstream, client: 10.124.70.10, server: xxxx-gateway-xxxx.l7.dev2.xx.gcp.xxx.net, request: "POST /rbs.gbm.xxx.web_service_core.gateway.structured_document.MdxStructuredDocumentService/QueryViewPaginated HTTP/2.0", upstream: "grpc://100.71.1.170:5000", host: "xxx-gateway-xxx.l7.dev2.xxx.gcp.xxx.net:443"

Application logs - https://sxxxxxl/KkgESm6Lp3cdbWJDA
Retrying client request due to: [Status(StatusCode="Unknown", Detail="Stream removed", DebugException="Grpc.Core.Internal.CoreErrorDetailException: {"created":"@1724323903.302000000","description":"Error received from peer ipv4:10.124.66.63:443","file":"......\src\core\lib\surface\call.cc","file_line":953,"grpc_message":"Stream removed","grpc_status":2}")]. Retry number [1/10]

What you expected to happen:

Client should not have timed out

It looks like something changed between 1.10.1 and v1.11.1, after which client-side annotations are not being honoured:
nginx.ingress.kubernetes.io/ssl-redirect: "true"
nginx.ingress.kubernetes.io/proxy-body-size: "500m"
nginx.ingress.kubernetes.io/proxy-buffer-size: "16k"
nginx.ingress.kubernetes.io/proxy-connect-timeout: 600s
nginx.ingress.kubernetes.io/proxy-read-timeout: 600s

NGINX Ingress controller version (exec into the pod and run nginx-ingress-controller --version.): 1.11.1

Kubernetes version (use kubectl version): 1.28

Environment:

Cloud provider or hardware configuration: GCP
OS (e.g. from /etc/os-release): Container-Optimized OS
Kernel (e.g. uname -a): 6.1.85
Install tools:
- Please mention how/where was the cluster created like kubeadm/kops/minikube/kind etc. Terraform + kustomization + helm
Basic cluster related info:
- kubectl version Client Version: v1.29.3
  Kustomize Version: v5.0.4-0.20230601165947-6ce0bf390ce3
  Server Version: v1.29.7-gke.100800
- kubectl get nodes -o wide
- NAME STATUS ROLES AGE VERSION INTERNAL-IP EXTERNAL-IP OS-IMAGE KERNEL-VERSION CONTAINER-RUNTIME
  gke-xxx-xxx-xxx-6-n2-16-2023071905-0e2b7d00-y2lc Ready 2d22h v1.29.7-gke.1008000 10.124.64.160 Container-Optimized OS from Google 6.1.85+ containerd://1.7.15
How was the ingress-nginx-controller installed:
- helm template --values values.yaml --namespace nwm-ingress-nginx --version $chart_version ingress-nginx ingress-nginx/ingress-nginx > manifests.yaml
An additional ConfigMap has been added to work around "Alpine 3.17 images causes SSL Error 'unsafe legacy renegotiation disabled'" (dotnet/dotnet-docker#4332); the config is up to date with Alpine 3.20.
Current State of the controller:
- kubectl describe ingressclasses
  Name: nginx
  Labels: app.kubernetes.io/component=controller
  app.kubernetes.io/instance=ingress-nginx
  app.kubernetes.io/managed-by=Helm
  app.kubernetes.io/name=ingress-nginx
  app.kubernetes.io/part-of=ingress-nginx
  app.kubernetes.io/version=1.10.1
  helm.sh/chart=ingress-nginx-4.10.1
  kustomize.toolkit.fluxcd.io/name=gke-cluster-services
  kustomize.toolkit.fluxcd.io/namespace=ddd-flux-system
  Annotations: ingressclass.kubernetes.io/is-default-class: true
  nwm.io/contact: *[email protected]
  Controller: k8s.io/ingress-nginx
  Events:
Current state of ingress object, if applicable:
- kubectl -n <appnamespace> get all,ing -o wide
- kubectl -n <appnamespace> describe ing <ingressname>
- If applicable, then, your complete and exact curl/grpcurl command (redacted if required) and the response to the curl/grpcurl command with the -v flag
Others:
- Any other related information like ;
  - copy/paste of the snippet (if applicable)
  - kubectl describe ... of any custom configmap(s) created and in use
  - Any other related information that may help

How to reproduce this issue:
Deploy an Ingress with the following annotations:
metadata:
  annotations:
    kubernetes.io/ingress.class: nginx
    meta.helm.sh/release-name: mdx
    meta.helm.sh/release-namespace: dev2-e2-tst1-mdx-mdx-demo2
    nginx.ingress.kubernetes.io/backend-protocol: GRPC
    nginx.ingress.kubernetes.io/limit-connections: "1000"
    nginx.ingress.kubernetes.io/proxy-body-size: 500m
    nginx.ingress.kubernetes.io/proxy-buffer-size: 16k
    nginx.ingress.kubernetes.io/proxy-connect-timeout: 600s
    nginx.ingress.kubernetes.io/proxy-next-upstream-timeout: 600s
    nginx.ingress.kubernetes.io/proxy-read-timeout: 600s
    nginx.ingress.kubernetes.io/proxy-send-timeout: 600s

Run regression tasks and long-running queries.

Anything else we need to know:

No issues are reported in version 1.10.1, whereas 1.11.1 consistently times out at 60 seconds.

k8s-ci-robot · 2024-08-22T11:54:38Z

This issue is currently awaiting triage.

If Ingress contributors determine this is a relevant issue, they will accept it by applying the triage/accepted label and provide further guidance.

The triage/accepted label can be added by org members by writing /triage accepted in a comment.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

longwuyuan · 2024-08-22T13:53:32Z

The information you have provided is incomplete as most of the important questions from the template are not answered.

The little information you provided cannot be used to analyze any problem, as no reader would be able to recreate your environment or the tests you performed. (The information is also not formatted in markdown.)

You can help by answering the questions asked in the new bug report template. Then add complete, detailed, precise, real-use-as-is information from your tests, such as the output of the kubectl describe command for all the resources related to the issue in your test cluster. Add the logs and the curl command with -v, including the curl response, so that a reader can copy/paste it all from your tests.

Once triage produces data here that shows the bug details, we can re-apply the bug label.

You can also check the changelog and release notes for relevance to your use-case.

/remove-kind bug
/kind support
/triage needs-information

longwuyuan · 2024-08-22T14:02:55Z

This PR is related to the grpc timeouts #11258

Anddd7 · 2024-08-26T01:35:48Z

Hi @varunthakur2480, could you try to get the following:

- nginx.conf: k exec -it nginx-controller-z2hp4 -- cat /etc/nginx/nginx.conf > nginx.conf (mask any credential urls)
- nginx log: k logs nginx-controller-z2hp4 --tail 100, after timeout
- grpcbin test: use the same ingress config and deploy a test server grpcbin to verify the connection
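
Once the rendered nginx.conf has been saved as requested above, the effective timeouts can be pulled out without reading the whole file. The following is a minimal Python sketch, not part of ingress-nginx. The function name `find_timeouts` and the sample text are hypothetical, and it assumes the relevant settings are the standard nginx `proxy_*_timeout` and `grpc_*_timeout` directives, each written on its own line.

```python
import re
from collections import Counter

# Standard nginx timeout directives for proxied and gRPC upstreams.
TIMEOUT_DIRECTIVE = re.compile(
    r"^\s*((?:proxy|grpc)_(?:connect|read|send)_timeout)\s+([^;\s]+)\s*;",
    re.MULTILINE,
)


def find_timeouts(conf_text: str) -> Counter:
    """Count each (directive, value) pair found in an nginx.conf dump."""
    return Counter(TIMEOUT_DIRECTIVE.findall(conf_text))


# Hypothetical excerpt for illustration only (not taken from a real cluster).
sample = """
location / {
    grpc_connect_timeout 5s;
    grpc_read_timeout 60s;
    proxy_read_timeout 60s;
}
"""

for (directive, value), count in sorted(find_timeouts(sample).items()):
    print(f"{directive} {value} (x{count})")
```

If the location block for the affected Ingress shows a 60s read timeout rather than 600s, that would be consistent with the 60-second timeout reported in this issue, though the config alone does not say why the annotation was not applied.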

varunthakur2480 · 2024-08-27T03:34:43Z

This seems to be sorted after we changed

nginx.ingress.kubernetes.io/ssl-redirect: "true"
nginx.ingress.kubernetes.io/proxy-body-size: "500m"
nginx.ingress.kubernetes.io/proxy-buffer-size: "16k"
nginx.ingress.kubernetes.io/proxy-connect-timeout: 600s
nginx.ingress.kubernetes.io/proxy-read-timeout: 600s

to

nginx.ingress.kubernetes.io/ssl-redirect: "true"
nginx.ingress.kubernetes.io/proxy-body-size: "500m"
nginx.ingress.kubernetes.io/proxy-buffer-size: "16k"
nginx.ingress.kubernetes.io/proxy-connect-timeout: "600"
nginx.ingress.kubernetes.io/proxy-read-timeout: "600"

We removed the "s" and added quotes, as per the documentation here: https://github.com/kubernetes/ingress-nginx/blob/main/docs/user-guide/nginx-configuration/annotations.md

See the tip there: Annotation keys and values can only be strings. Other types, such as boolean or numeric values must be quoted, i.e. "true", "false", "100".

It is still worth highlighting that the old annotations, unquoted and with the "s" suffix, still work in the older version.
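
The change described above can be checked mechanically before deploying. This is a minimal Python sketch, assuming (based only on this thread and the linked tip, not on the controller's source) that the timeout annotations should be strings holding a plain number of seconds. `check_timeout_annotations` is a hypothetical helper, and the dictionaries stand in for the annotations of an already-parsed Ingress manifest.

```python
import re

PREFIX = "nginx.ingress.kubernetes.io/"

# Timeout annotations that appear in this issue.
TIMEOUT_KEYS = (
    "proxy-connect-timeout",
    "proxy-read-timeout",
    "proxy-send-timeout",
    "proxy-next-upstream-timeout",
)


def check_timeout_annotations(annotations: dict) -> list:
    """Return one message per timeout annotation that is not a quoted plain number."""
    problems = []
    for name in TIMEOUT_KEYS:
        key = PREFIX + name
        if key not in annotations:
            continue
        value = annotations[key]
        if not isinstance(value, str):
            # An unquoted 600 in YAML is parsed as an integer, not a string.
            problems.append(f"{key}: {value!r} is not a string; quote it")
        elif not re.fullmatch(r"[0-9]+", value):
            # Assumption: a unit suffix such as "s" is what broke after the upgrade.
            problems.append(f"{key}: {value!r} is not a plain number of seconds")
    return problems


before = {PREFIX + "proxy-read-timeout": "600s", PREFIX + "proxy-connect-timeout": "600s"}
after = {PREFIX + "proxy-read-timeout": "600", PREFIX + "proxy-connect-timeout": "600"}

for message in check_timeout_annotations(before):
    print(message)
print(check_timeout_annotations(after))
```

The `before` values mirror the annotations that timed out at 60 seconds on 1.11.1, and the `after` values mirror the ones reported to work.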

longwuyuan · 2024-09-07T02:28:23Z

Closing the issue as seems resolved
/close

k8s-ci-robot · 2024-09-07T02:28:29Z

@longwuyuan: Closing this issue.

In response to this:

Closing the issue as seems resolved
/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

varunthakur2480 added the kind/bug Categorizes issue or PR as related to a bug. label Aug 22, 2024

k8s-ci-robot added the needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. label Aug 22, 2024

k8s-ci-robot added the needs-priority label Aug 22, 2024

k8s-ci-robot added kind/support Categorizes issue or PR as a support question. triage/needs-information Indicates an issue needs more information in order to work on it. and removed kind/bug Categorizes issue or PR as related to a bug. labels Aug 22, 2024

k8s-ci-robot closed this as completed Sep 7, 2024
