Add the ability to retry on reset connection to service-routers #12890
Thanks for the first-time contribution (hope to see many more 👀)! Generally we expose Envoy config on a per-case basis, but at first glance this use case seems pretty solid. Since this wasn't made off an associated GitHub issue, I'll talk this over with the team and let you know tomorrow if we'd like to expose this config to everyone. Then hopefully we'll get this reviewed and merged shortly after 👍
Blake just let me know that this change came from a conversation on Gitter, so you're all good to go 👍. We'll get this reviewed shortly, and thank you again for the contribution!
Awesome. Thanks! |
Thank you for this contribution. I had an opportunity to discuss this PR with the team. Here's the feedback from our review. Envoy supports a number of retry conditions. At the time of writing, there are ten (10) Would you be willing to modify your PR to accommodate this alternative UX? If so, we can share a few UX options we are considering and get your feedback on them prior to you making any changes to the PR. |
Sure, I can modify the PR. What UX options are you considering? |
Hi @aoskotsky-amplify, here's the UX we were considering. It would probably be beneficial to have input validation on the array to limit the retry conditions that can be specified. After looking at Envoy's docs, I believe we'd only want to allow the following conditions to be set:
gRPC retry conditions
Special handling for certain conditions
|
Hi @blake, that UX makes sense to me. I'll start working on it. I have a couple of questions though:
|
Service routers do support routing to gRPC services. The beginning of the doc implies under requirements that service routers can only be used with services using the
Yes, we would like to support this feature as well. I think it makes sense to add it in a separate PR. |
Oh I see. Do you think there should be separate options like |
This pull request has been automatically flagged for inactivity because it has not been acted upon in the last 60 days. It will be closed if no new activity occurs in the next 30 days. Please feel free to re-open to resurrect the change if you feel this has happened by mistake. Thank you for your contributions. |
Add a RetryOn option instead of RetryOnReset so that we can support the remaining envoy retry conditions
@blake I updated this PR with your suggestions. Can you give this another review? Thanks |
@mkeeler could someone from HashiCorp review this PR? Thanks!
The code looks good to me. In addition to the one request I added a comment for, I would also ask that the ServiceRouteDestination type in api/config_entry_discoverychain.go be updated to reflect the RetryOn field (and a test case added to TestAPI_ConfigEntry_DiscoveryChain in the corresponding test file).
As for your question about whether we should have a RetryOnGRPC field separate from the main RetryOn field: my opinion is that since Envoy makes no distinction in the routing policy and can do the right thing, it's best to keep it as one field in Consul.
@@ -251,6 +257,26 @@ func isValidHTTPMethod(method string) bool {
	}
}

func isValidRetryCondition(retryOn string) bool {
It would be good to add a test for this validation function in agent/structs/config_entry_discoverychain_test.go to enumerate all the valid retry conditions. It is obvious that it works today, but if we refactor things in the future it would be great to know if we break something.
Additionally, it would be great if you could add two more cases to TestServiceRouterConfigEntry in that same file: one to ensure that a router with this field configured with correct entries is accepted, and one to test the validation error.
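As a rough illustration of the kind of coverage requested here, the following is a minimal, self-contained sketch rather than the PR's actual code. The body of isValidRetryCondition is not shown in this thread, so the allow-list below is an assumption based on retry-on values documented by Envoy; validateRetryOn and the error text are hypothetical names introduced only for this example.

```go
package main

import "fmt"

// validRetryConditions is an ASSUMED allow-list taken from Envoy's documented
// retry-on values; the list accepted by the actual PR may differ.
var validRetryConditions = map[string]struct{}{
	// HTTP retry conditions
	"5xx":               {},
	"gateway-error":     {},
	"reset":             {},
	"connect-failure":   {},
	"envoy-ratelimited": {},
	"retriable-4xx":     {},
	"refused-stream":    {},
	// gRPC retry conditions
	"cancelled":          {},
	"deadline-exceeded":  {},
	"internal":           {},
	"resource-exhausted": {},
	"unavailable":        {},
}

// isValidRetryCondition reports whether retryOn is in the assumed allow-list.
func isValidRetryCondition(retryOn string) bool {
	_, ok := validRetryConditions[retryOn]
	return ok
}

// validateRetryOn is a hypothetical helper that rejects the first invalid
// entry in a RetryOn list, mirroring the validation error case described above.
func validateRetryOn(conditions []string) error {
	for _, c := range conditions {
		if !isValidRetryCondition(c) {
			return fmt.Errorf("RetryOn contains an invalid retry condition: %q", c)
		}
	}
	return nil
}

func main() {
	// Table-driven cases: one accepted configuration and one rejected one.
	cases := []struct {
		name    string
		retryOn []string
		wantErr bool
	}{
		{name: "valid conditions", retryOn: []string{"reset", "connect-failure"}, wantErr: false},
		{name: "invalid condition", retryOn: []string{"reset", "not-a-condition"}, wantErr: true},
	}
	for _, tc := range cases {
		err := validateRetryOn(tc.retryOn)
		fmt.Printf("%s: gotErr=%v wantErr=%v\n", tc.name, err != nil, tc.wantErr)
	}
}
```

In the real test file these cases would presumably live in a standard Go test function and iterate over every accepted condition, so that a future refactor of the allow-list fails loudly.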
Updated. Let me know if the tests look fine now
Just a clarification to the docs
{
  name: 'RetryOn',
  type: 'array<string>',
  description: `Allows retrying requests for a list of conditions. One of:
description: `Allows retrying requests for a list of conditions. One of:
description: `Allows Consul to retry requests when the requests meet a specific set of conditions. You can specify one of the following lists:
Is that what this means?
Yes. How about this? Allows Consul to retry requests when they meet one of a set of conditions. The available conditions are:
That's pretty good. We can probably even trim it down a bit more and improve consistency with the style guide:
Allows Consul to retry requests when they meet one of the following sets of conditions:
Updated
LGTM
Description
This adds a RetryOn option (originally proposed as RetryOnReset) to service-router configs.
My apps are getting an error intermittently when using Consul Connect: upstream connect error or disconnect/reset before headers. reset reason: connection termination. Envoy has an option to automatically retry in case of this kind of error. Consul does not support this option at the moment, so I'm implementing it here.
Testing & Reproduction steps
Create a service-router with the RetryOn option like below:
Links
Issue #10274
PR Checklist