Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support grpc_read_timeout and grpc_send_timeout in annotations #11250

Closed
Anddd7 opened this issue Apr 11, 2024 · 17 comments · Fixed by #11258
Closed

Support grpc_read_timeout and grpc_send_timeout in annotations #11250

Anddd7 opened this issue Apr 11, 2024 · 17 comments · Fixed by #11258
Assignees
Labels
kind/feature Categorizes issue or PR as related to a new feature. lifecycle/frozen Indicates that an issue or PR should not be auto-closed due to staleness. priority/backlog Higher priority than priority/awaiting-more-evidence. triage/accepted Indicates an issue or PR is ready to be actively worked on.

Comments

@Anddd7
Copy link
Contributor

Anddd7 commented Apr 11, 2024

What do you want to happen?

We want to configure the grpc_read_timeout and grpc_send_timeout with annotations on ingress.

Since the server-snippet has some potential risk, we have to migrate all snippets to common annotations.
https://github.com/search?q=repo%3Akubernetes%2Fingress-nginx+path%3A%2F%5Einternal%5C%2Fingress%5C%2Fannotations%5C%2F%2F+parser.AnnotationRiskCritical&type=code

Is there currently another issue associated with this?

#2475, but it's inactive and closed.

Does it require a particular kubernetes version?

No

Solutions

1. Add new annotations, e.g.

...
annotations:
  nginx.ingress.kubernetes.io/grpc_read_timeout: 3600
  nginx.ingress.kubernetes.io/grpc_send_timeout: 3600
...

2. Set grpc_read_timeout and grpc_send_timeout with the same value in proxy_read_timeout

Similar like nginx-ingress-controller, just set grpc with proxy settings by default

@Anddd7 Anddd7 added the kind/feature Categorizes issue or PR as related to a new feature. label Apr 11, 2024
@k8s-ci-robot k8s-ci-robot added needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. needs-priority labels Apr 11, 2024
@Anddd7
Copy link
Contributor Author

Anddd7 commented Apr 11, 2024

i'd like to contribute if it's ready to implement

@longwuyuan
Copy link
Contributor

/triage accepted

  • I think a PR will be reviewed because it will offset using snippets
  • But need to improve he accuracy of reference because that link is a nginx webserver and reverseproxy link and this is a go+lua implementation of ingress API. But yeah it is obviously originating in nginx upstream
  • Do you want to submit a PR
  • There is even a video on how to create new annotation in the docs on the developer's help page

@k8s-ci-robot k8s-ci-robot added triage/accepted Indicates an issue or PR is ready to be actively worked on. and removed needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. labels Apr 11, 2024
@longwuyuan
Copy link
Contributor

/assign

@Anddd7
Copy link
Contributor Author

Anddd7 commented Apr 13, 2024

@longwuyuan yes, i have read the docs and read to create a PR.

I refer to exsiting proxy settings to create 3 annotations for grpc timeout, please take a look, #11258

I'll refine the documentation later.

the upstream nginx is already support this, https://nginx.org/en/docs/http/ngx_http_grpc_module.html#grpc_connect_timeout.

@strongjz
Copy link
Member

/priority backlog

@k8s-ci-robot k8s-ci-robot added priority/backlog Higher priority than priority/awaiting-more-evidence. and removed needs-priority labels Apr 25, 2024
Copy link

This is stale, but we won't close it automatically, just bare in mind the maintainers may be busy with other tasks and will reach your issue ASAP. If you have any question or request to prioritize this, please reach #ingress-nginx-dev on Kubernetes Slack.

@github-actions github-actions bot added the lifecycle/frozen Indicates that an issue or PR should not be auto-closed due to staleness. label May 26, 2024
@flphvlck
Copy link
Contributor

flphvlck commented Aug 19, 2024

Hi @strongjz , maybe this should be noted as breaking change in release notes for 1.11
Because we are using:

      nginx.ingress.kubernetes.io/configuration-snippet: |
        grpc_next_upstream_tries X;
        grpc_read_timeout Xs;
        grpc_send_timeout Xs;

and after upgrade from 1.10 to 1.11 we ended up with these errors:

Error: exit status 1
2024/08/19 12:10:49 [emerg] 60#60: "grpc_read_timeout" directive is duplicate in /tmp/nginx/nginx-cfg658058193:2367
nginx: [emerg] "grpc_read_timeout" directive is duplicate in /tmp/nginx/nginx-cfg658058193:2367
nginx: configuration file /tmp/nginx/nginx-cfg658058193 test failed

In the /tmp/nginx/nginx-cfg658058193 there really are duplicated directives for grpc (default values from annotations and our values from configuration-snippet

			# Grpc settings
			grpc_connect_timeout                    5s;
			grpc_send_timeout                       60s;
			grpc_read_timeout                       60s;

			grpc_next_upstream error timeout http_502 http_503 http_504 non_idempotent;
			grpc_next_upstream_tries X;
			grpc_read_timeout Xs;
			grpc_send_timeout Xs;

We will switch from snippet to annotations to fix this, but I think it's better to know about this from release notes.

@Anddd7
Copy link
Contributor Author

Anddd7 commented Aug 21, 2024

So we may need a 'merge' mechanism to ensure the annotations and snippets can work together, if there is a duplicate config. Then, we can migrate the snippets to annotations. (if we create more annotations)

@flphvlck
Copy link
Contributor

flphvlck commented Aug 21, 2024

Yeah, that's another option how to deal with it (except "breaking change" note)
I'm in the beggining of the upgrade process in our clusters and I will investigate how big problem is this for us. But it will deffinitely make the whole process harder

Possible merge mechanism - just don't apply default values from annotations when GRPC is used?

@Anddd7
Copy link
Contributor Author

Anddd7 commented Aug 22, 2024

Yeah, that's another option how to deal with it (except "breaking change" note) I'm in the beggining of the upgrade process in our clusters and I will investigate how big problem is this for us. But it will deffinitely make the whole process harder

Possible merge mechanism - just don't apply default values from annotations when GRPC is used?

ok, the solution 1 would be safer -- add new annotations, it doesn't break the existing "configurations", you can add new annotations and remove snippets at the same time.

how do you think? @tao12345666333 @longwuyuan

@sepich
Copy link
Contributor

sepich commented Aug 26, 2024

Sorry guys, but that is a really breaking change not mentioned in any changelogs.
Could you please at least add this to Release and https://github.com/kubernetes/ingress-nginx/tree/main/changelog ?
What happens on upgrade now is that Ingress completely stops serving any sites, as it has some typo in some single config.
Is it possible to have nginx_ingress_controller_config_last_reload_successful as a readinessCheck, so that new version with broken config cannot even be rolled out?

@flphvlck
Copy link
Contributor

@sepich it's because of duplicities in nginx.conf - in case you set grpc timeouts in nginx.ingress.kubernetes.io/configuration-snippet

@sepich
Copy link
Contributor

sepich commented Aug 26, 2024

Thank you, I know why it happend.
Question is how should I know it before the upgrade?
No any mentions to "check all your snippets or everything would be down" in changelogs.

@flphvlck
Copy link
Contributor

I raised issue #11866

@flphvlck
Copy link
Contributor

ok, the solution 1 would be safer -- add new annotations, it doesn't break the existing "configurations", you can add new annotations and remove snippets at the same time.

@Anddd7 that would be absolutely marvellous

@jon-rei
Copy link

jon-rei commented Sep 26, 2024

We are in the process of upgrading our clusters to the latest version and have encountered this problem. For each cluster we have about 50 ingress resources that have grpc_read_timeout configured as a snippet.

Is there a recommended strategy for handling the upgrade to 1.11 for a large number of ingress resources?

@flphvlck
Copy link
Contributor

@jon-rei we are also in the process of upgrading
I have to change settings for the 3 affected options manually in each ingress resource which uses GRPC

worse thing is that the default for grpc_connect_timeout was changed from 60s to 5s - as I understood this change
it's not unusual that you don't even know that you rely on default value

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
kind/feature Categorizes issue or PR as related to a new feature. lifecycle/frozen Indicates that an issue or PR should not be auto-closed due to staleness. priority/backlog Higher priority than priority/awaiting-more-evidence. triage/accepted Indicates an issue or PR is ready to be actively worked on.
Projects
Development

Successfully merging a pull request may close this issue.

7 participants