Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: add basic retry middleware #283

Merged
merged 1 commit into from
Oct 7, 2022
Merged

feat: add basic retry middleware #283

merged 1 commit into from
Oct 7, 2022

Conversation

cprice404
Copy link
Contributor

This commit adds an initial implementation of a retry middleware, along with fleshing out the FixedCountRetryStrategy for compatibility with the new middleware. As of this commit, any Get/Set requests that fail with an Unavailable or Internal error gRPC status code will be retried up to 3 attempts.

There is no backoff or jitter yet; we should implement that in the future.

The determination of which gRPC status codes and request types are eligible for retries is implemented via a new IRetryEligibilityStrategy interface. This way, individual retry strategies have the ability to override the behavior, but in the 90% case they can just re-use the default eligibility strategy.

NOTE: we implement retries as a bespoke middleware rather than using the retry configuration that is built into the .NET gRPC library because the built-in implementation has some strict restrictions that aren't compatible with our goals here (e.g. no response that includes any headers is eligible for retry in the built-in implementation, and we always get headers back from envoy.)

@cprice404
Copy link
Contributor Author

i am planning to add another commit that adds some unit tests, but wanted to get this up for review since I believe the bulk of the work is done.

@cprice404 cprice404 force-pushed the migrate-header-interceptor-to-middleware branch from 3d5ef7c to 3e10ed2 Compare October 7, 2022 18:58
Base automatically changed from migrate-header-interceptor-to-middleware to main October 7, 2022 20:24
pgautier404
pgautier404 previously approved these changes Oct 7, 2022
Copy link
Contributor

@pgautier404 pgautier404 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

This commit adds an initial implementation of a retry middleware,
along with fleshing out the FixedCountRetryStrategy for compatibility
with the new middleware.  As of this commit, any Get/Set requests
that fail with an Unavailable or Internal error gRPC status code
will be retried up to 3 attempts.

There is no backoff or jitter yet; we should implement that in the
future.

The determination of which gRPC status codes and request types are
eligible for retries is implemented via a new IRetryEligibilityStrategy
interface.  This way, individual retry strategies have the ability
to override the behavior, but in the 90% case they can just re-use
the default eligibility strategy.

NOTE: we implement retries as a bespoke middleware rather than using
the retry configuration that is built into the .NET gRPC library
because the built-in implementation has some strict restrictions
that aren't compatible with our goals here (e.g. no response that
includes any headers is eligible for retry in the built-in implementation,
and we always get headers back from envoy.)
@cprice404
Copy link
Contributor Author

rebased

@cprice404 cprice404 merged commit 3bf6938 into main Oct 7, 2022
@cprice404 cprice404 deleted the implement-retries branch October 7, 2022 21:38
{
_logger.LogDebug($"Determining whether request is eligible for retry; status code: {grpcStatus.StatusCode}, request type: {grpcRequest.GetType()}, attemptNumber: {attemptNumber}, maxAttempts: {MaxAttempts}");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here and elsewhere I notice you're charging the users' runtimes string interpolation allocations unconditionally. I think you should rewrite all these LogDebugs like:

Suggested change
_logger.LogDebug($"Determining whether request is eligible for retry; status code: {grpcStatus.StatusCode}, request type: {grpcRequest.GetType()}, attemptNumber: {attemptNumber}, maxAttempts: {MaxAttempts}");
_logger.LogDebug(
"Determining whether request is eligible for retry; status code: {StatusCode}, request type: {RequestType}, attemptNumber: {AttemptNumber}, maxAttempts: {MaxAttempts}",
grpcStatus.StatusCode,
grpcRequest.GetType(),
attemptNumber,
MaxAttempts
);

see for more examples and detail: https://learn.microsoft.com/en-us/dotnet/api/microsoft.extensions.logging.loggerextensions.logdebug?view=dotnet-plat-ext-6.0

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good call, agreed, it was my intent to write them that way but I always forget. thx

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

kotlin-structured-logging is an addictive library. It's so nice to not have to care about anything w.r.t. logging invocation cost inside of those logging blocks. I catch myself writing logs like this too so I look out for them

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i hope we will have a rust-structured-logging (or maybe one already exists in the rust universe)

Func<TRequest, CallOptions, Task<MiddlewareResponseState<TResponse>>> continuation
) where TRequest : class where TResponse : class
{
var foo = request.GetType();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this should be removed

Comment on lines +53 to +54
attemptNumber++;
nextState = await continuation(request, callOptions);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure I see how this will disambiguate between the likes of ListPushBack() or DictionaryIncrement() (non-idempotent methods) and Get() or Set() (idempotent methods). What's your plan there?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See the code in DefaultRetryEligibilityStrategy, which is called as part of the FixedCountRetryStrategy.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I completely missed _retryableRequestTypes, thanks.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants