feedback

Signed-off-by: Alex Leong <[email protected]>
linkerd · Aug 6, 2024 · dcdcd82
1 parent 024bfbb
commit dcdcd82
Showing 6 changed files with 47 additions and 30 deletions.
diff --git a/linkerd.io/content/2.16/features/retries-and-timeouts.md b/linkerd.io/content/2.16/features/retries-and-timeouts.md
@@ -4,10 +4,11 @@ description = "Linkerd can perform service-specific retries and timeouts."
 weight = 3
 +++
 
-Automatic retries are one the most powerful and useful mechanisms a service mesh
-has for gracefully handling partial or transient application failures.
+Timeouts and automatic retries are two of the most powerful and useful
+mechanisms a service mesh has for gracefully handling partial or transient
+application failures.
 
-Timeouts and retries can be configured using [HTTPRoute], GrpcRoute, or Service
+Timeouts and retries can be configured using [HTTPRoute], GRPCRoute, or Service
 resources. Retries and timeouts are always performed on the *outbound* (client)
 side.
 

diff --git a/linkerd.io/content/2.16/reference/retries.md b/linkerd.io/content/2.16/reference/retries.md
@@ -10,14 +10,14 @@ failures.
 
 Retries are a client-side behavior, and are therefore performed by the
 outbound side of the Linkerd proxy.[^1] If retries are configured on an
-HttpRoute or GrpcRoute with multiple backends, each retry of a request can
+HTTPRoute or GRPCRoute with multiple backends, each retry of a request can
 potentially get sent to a different backend. If a request has a body larger than
 64KiB then it will not be retried.
 
 ## Configuring Retries
 
 Retries are configured by a set of annotations which can be set on a Kubernetes
-Service resource or on a HttpRoute or GrpcRoute which has a Service as a parent.
+Service resource or on a HTTPRoute or GRPCRoute which has a Service as a parent.
 Client proxies will then retry failed requests to that Service or route. If any
 retry configuration annotations are present on a route resource, they override
 all retry configuration annotations on the parent Service.
@@ -29,15 +29,22 @@ proxies will use the ServiceProfile retry configuration and ignore any retry
 annotations.
 {{< /warning >}}
 
-+ `retry.linkerd.io/http`: A comma seperated list of HTTP response codes which
-should be retried. Valid values include `5xx` to retry all 5XX response codes,
-`gateway-error` to retry response codes 502-504, or a range in the form
-`xxx-yyy` (for example, `500-504`). This annotation is not valid on GrpcRoute
-resources.
++ `retry.linkerd.io/http`: A comma separated list of HTTP response codes which
+should be retried. Each element of the list may be
+  + `xxx` to retry a single response code (for example, `"504"` -- remember,
+    annotation values must be strings!);
+  + `xxx-yyy` to retry a range of response codes (for example, `500-504`);
+  + `gateway-error` to retry response codes 502-504; or
+  + `5xx` to retry all 5XX response codes.
+This annotation is not valid on GRPCRoute resources.
 + `retry.linkerd.io/grpc`: A comma seperated list of gRPC status codes which
-should be retried. Valid values include: `cancelled`, `deadline-exceeded`,
-`internal`, `resource-exhausted`, and `unavailable`. This annotation is not
-valid on HttpRoute resources.
+should be retried. Each element of the list may be
+  + `cancelled`
+  + `deadline-exceeded`
+  + `internal`
+  + `resource-exhausted`
+  + `unavailable`
+This annotation is not valid on HTTPRoute resources.
 + `retry.linkerd.io/limit`: The maximum number of times a request can be
 retried. If unspecified, the default is `1`.
 + `retry.linkerd.io/timeout`: A retry timeout after which a request is cancelled
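
Note on the hunk above: a minimal sketch of how the listed retry annotations might be combined on a Service. The `books` Service, the `booksapp` namespace and port 7002 are borrowed from the books task elsewhere in this commit; the `app: books` selector is an assumption for illustration and is not part of this change.

```yaml
apiVersion: v1
kind: Service
metadata:
  name: books
  namespace: booksapp
  annotations:
    # Comma separated list: a named class (502-504) plus one explicit code.
    retry.linkerd.io/http: "gateway-error,500"
    # Annotation values must be strings, so the limit is quoted.
    retry.linkerd.io/limit: "2"
spec:
  selector:
    app: books   # assumed label, illustrative only
  ports:
    - port: 7002
```

Per the text above, the same annotations placed on an HTTPRoute with this Service as its parent would override the Service-level values.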

diff --git a/linkerd.io/content/2.16/reference/timeouts.md b/linkerd.io/content/2.16/reference/timeouts.md
@@ -7,14 +7,15 @@ Linkerd can be configured with timeouts to limit the maximum amount of time on
 a request before aborting.
 
 Timeouts are a client-side behavior, and are therefore performed by the
-outbound side of the Linkerd proxy.[^1] Note that if these timeouts are reached,
-the request will not be retried. Retry timeouts can be configured as part of
+outbound side of the Linkerd proxy.[^1] Note that timeouts configured in this
+way are not retryable -- if these timeouts are reached, the request will not be
+retried. Retryable timeouts can be configured as part of
 [retry configuration](../retries/).
 
 ## Configuring Timeouts
 
 Timeous are configured by a set of annotations which can be set on a Kubernetes
-Service resource or on a HttpRoute or GrpcRoute which has a Service as a parent.
+Service resource or on a HTTPRoute or GRPCRoute which has a Service as a parent.
 Client proxies will then fail requests to that Service or route once they exceed
 the timeout. If any timeout configuration annotations are present on a route
 resource, they override all timeout configuration annotations on the parent
@@ -34,6 +35,11 @@ may be in-flight.
 + `timeout.linkerd.io/idle`: The maximum amount of time a stream may be idle,
 regardless of its state.
 
+If the [request timeout](https://gateway-api.sigs.k8s.io/api-types/httproute/#timeouts-optional)
+field is set on an HTTPRoute resource, it will be used as the
+`timeout.linkerd.io/request` timeout. However, if both the field and the
+annotation are specified, the annotation will take priority.
+
 ## Examples
 
 ```yaml

diff --git a/linkerd.io/content/2.16/tasks/books.md b/linkerd.io/content/2.16/tasks/books.md
@@ -147,8 +147,9 @@ responses from the `books` Service on port 7002.
 
 We know that the webapp component is getting 500s from the books component, but
 it would be great to narrow this down further and get per route metrics. To do
-this, we leverage the Gateway API and define a set of HTTPRoute resources, each
-attached to the `books` Service by specifying it as their `parent_ref`.
+this, we take advantage of the Gateway API and define a set of HTTPRoute
+resources, each attached to the `books` Service by specifying it as their
+`parent_ref`.
 
 ```bash
 kubectl apply -f - <<EOF
@@ -207,7 +208,7 @@ spec:
 EOF
 ```
 
-We can then check that these HTTPRoute have been accepted by their parent
+We can then check that these HTTPRoutes have been accepted by their parent
 Service by checking their status subresource:
 
 ```bash
@@ -299,10 +300,12 @@ outbound_http_route_retry_requests_total{...} 469
 outbound_http_route_retry_successes_total{...} 247
 ```
 
-This tells us that Linkerd make a total of 469 retry requests and 247 of those
-were successful and the other 222 were not and hit the default retry limit of
-`1`. We can improve this further by increasing this limit to allow more than
-1 retry per request:
+This tells us that Linkerd made a total of 469 retry requests, of which 247 were
+successful. The remaining 222 failed and could not be retried again, since we
+didn't raise the retry limit from its default of 1.
+
+We can improve this further by increasing this limit to allow more than 1 retry
+per request:
 
 ```bash
 kubectl -n booksapp annotate httproutes.gateway.networking.k8s.io/books-create \

diff --git a/linkerd.io/content/2.16/tasks/configuring-retries.md b/linkerd.io/content/2.16/tasks/configuring-retries.md
@@ -9,8 +9,8 @@ questions that need to be answered:
 - Which requests should be retried?
 - How many times should the requests be retried?
 
-Both of these questions can be answered by adding annotations to the Service
-or HttpRoute resource you're sending requests to.
+Both of these questions can be answered by adding annotations to the Service,
+HTTPRoute, or GRPCRoute resource you're sending requests to.
 
 The reason why these pieces of configuration are required is because retries can
 potentially be dangerous. Automatically retrying a request that changes state
@@ -32,7 +32,7 @@ annotations.
 
 ## Retries
 
-For HttpRoutes that are idempotent, you can add the `retry.linkerd.io/http: 5xx`
+For HTTPRoutes that are idempotent, you can add the `retry.linkerd.io/http: 5xx`
 annotation which instructs Linkerd to retry any requests which fail with an HTTP
 response status in the 500s.
 
@@ -43,9 +43,9 @@ Note that requests will not be retried if the body exceeds 64KiB.
 You can also add the `retry.linkerd.io/limit` annotation to specify the maximum
 number of times a request may be retried. By default, this limit is `1`.
 
-## Grpc Retries
+## gRPC Retries
 
 Retries can also be configured for gRPC traffic by adding the
-`retry.linkerd.io/grpc` annotation to a GrpcRoute or Service resource. The value
+`retry.linkerd.io/grpc` annotation to a GRPCRoute or Service resource. The value
 of this annotation is a comma seperated list of gRPC status codes that should
 be retried.
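
Note on the hunk above: a minimal sketch of the gRPC annotation on a Service. The Service name, namespace, selector and port are hypothetical; the status codes are taken from the list in the retries reference changed in this commit.

```yaml
apiVersion: v1
kind: Service
metadata:
  name: example-grpc        # hypothetical name
  namespace: example        # hypothetical namespace
  annotations:
    # Comma separated list of gRPC status codes to retry.
    retry.linkerd.io/grpc: "unavailable,deadline-exceeded"
    # Optional: allow up to two retries instead of the default of one.
    retry.linkerd.io/limit: "2"
spec:
  selector:
    app: example-grpc       # assumed label, illustrative only
  ports:
    - port: 8080            # assumed port
```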
diff --git a/linkerd.io/content/2.16/tasks/getting-per-route-metrics.md b/linkerd.io/content/2.16/tasks/getting-per-route-metrics.md
@@ -4,9 +4,9 @@ description = "Configure per-route metrics for your application."
 +++
 
 To get per-route metrics, you must create [HTTPRoute] resources. If a route has
-a `parent_ref` which points to a Service resource, Linkerd will generate
+a `parent_ref` which points to a **Service** resource, Linkerd will generate
 outbound per-route traffic metrics for all HTTP traffic that it sends to that
-Service. If a route has a `parent_ref` which points to a Server resource,
+Service. If a route has a `parent_ref` which points to a **Server** resource,
 Linkerd will generate inbound per-route traffic metrcs for all HTTP traffic that
 it receives on that Server. Note that an [HTTPRoute] can have multiple
 `parent_ref`s which means that the same [HTTPRoute] resource can be used to