gep: add GEP-1731 Configurable Retries #3199
LGTM if some of the pending TODOs are finished up. I really, really like the background detail here; it is a great (if disheartening) read.
/approve
but we definitely need some more eyes here for further LGTM.
Nicely done @mikemorris! Very thorough and well-written GEP. It seems like the most complex part will be portability around which status codes can be retried; I left some comments/ideas around that.
@kate-osborn Do you have a suggestion for the language you would want around behavior with singleton backends?
Co-authored-by: Flynn <[email protected]>
- change connection error and backend timeout retry from MUST to SHOULD
- add conformance flags for connection errors and backend timeouts
- add UNRESOLVED warning
Status Update: I believe we should merge this GEP for v1.2. Despite this being just past the extension for the GEP deadline, this one was largely good to go yesterday, with only formatting bits to clean up. We've had multiple review cycles with a variety of reviewers, and the resulting API change is fairly safe and contained. I'll defer to @shaneutt or @mlavacca for the final call here. Thanks to @mikemorris for all the work on this one!
/approve
Thank you for all the work on this, @mikemorris!
/lgtm
/unhold
[APPROVALNOTIFIER] This PR is APPROVED.
This pull request has been approved by: mikemorris, mlavacca, robscott, youngnick
* gep: add GEP-1731 Configurable Retries
* minor updates to NGINX summary
* add clarification around retry on connection errors
* fixup details formatting for implementation background
* add Linkerd retry summary
* Update geps/gep-1731/index.md
* Update geps/gep-1731/index.md
* website: add GEP-1731 to mkdocs.yml index
* update status code rules for retries
* update conformance details for 5xx
* add note on streams out of scope as non-goal
* add note on mesh consumer vs product routes
* add introduction
* Fixup YAML to nest retry stanza under HTTPRouteRule, not BackendRefs
* add note on HTTPRouteFilter alternative
* remove allowance for [1-9]xx range values other than 5xx
* Update geps/gep-1731/index.md
  Co-authored-by: Kate Osborn <[email protected]>
* Update geps/gep-1731/index.md
  Co-authored-by: Kate Osborn <[email protected]>
* update GEP-1731 metadata
* rename GEP-1731
* add SupportHTTPRouteRetryBackoff conformance details
* split out Other Considerations section, update retry budget future-proofing question
* fixup nested list formatting in conformance details
* add future excludeRetryOnTimeout consideration
* move gRPC retries to non-goals, remove Envoy gRPC retry details
* clarify streaming support non-goal as needing a separate GEP
* update Linkerd retry background section
  Co-authored-by: Flynn <[email protected]>
* update streaming language in non-goals
  Co-authored-by: Flynn <[email protected]>
* update connection error and timeout, add warning
  - change connection error and backend timeout retry from MUST to SHOULD
  - add conformance flags for connection errors and backend timeouts
  - add UNRESOLVED warning
* remove 5xx shorthand, switch HTTPRouteStatusCode to int
* Update geps/gep-1731/index.md

---------

Co-authored-by: Mike Morris <[email protected]>
Co-authored-by: Kate Osborn <[email protected]>
Co-authored-by: Flynn <[email protected]>
What type of PR is this?
/kind gep
What this PR does / why we need it:
Proposes configuration within HTTPRoute to retry unsuccessful requests to backends before sending a response to a client.
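For orientation, a rough sketch of what such a retry stanza might look like on an HTTPRoute. The commit notes in this PR only confirm that the stanza nests under HTTPRouteRule (not BackendRefs) and that status codes are plain integers with no `5xx` shorthand. The field names `retry`, `codes`, `attempts`, and `backoff`, along with all resource names and values, are assumptions for illustration; geps/gep-1731/index.md is the authoritative definition.

```yaml
# Illustrative only: field names under `retry` are assumed,
# not confirmed by this PR page. See geps/gep-1731/index.md.
apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  name: retry-example            # hypothetical route name
spec:
  parentRefs:
  - name: example-gateway        # hypothetical Gateway
  rules:
  - backendRefs:
    - name: example-service      # hypothetical backend Service
      port: 8080
    # Retry stanza sits on the rule, not on individual backendRefs.
    retry:
      codes:                     # integer status codes, no "5xx" shorthand
      - 502
      - 503
      attempts: 2                # assumed field: maximum retry attempts
      backoff: 100ms             # assumed field: delay between attempts
```

Placing the stanza at the rule level means one retry policy applies to every backend the rule forwards to, which matches the "nest retry stanza under HTTPRouteRule" fixup listed in the commit history.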
Which issue(s) this PR fixes:
Fixes #1731
Does this PR introduce a user-facing change?: