Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Updating HTTP status code documentation to use 404 instead of 503 #1151

Merged
merged 1 commit into from
May 12, 2022

Conversation

robscott
Copy link
Member

@robscott robscott commented May 6, 2022

What type of PR is this?
/kind documentation
/kind api-change
/area conformance

What this PR does / why we need it:
As suggested by @michaelbeaumont, this updates our HTTP status code guidance to require 404 status codes and removes any recommendations to return 503 status codes.

Does this PR introduce a user-facing change?:

HTTP 404 responses should be returned whenever a matching route rule does not exist or a backend reference can not be resolved.

Adding a hold for consensus.
/hold

@k8s-ci-robot k8s-ci-robot added do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. kind/documentation Categorizes issue or PR as related to documentation. kind/api-change Categorizes issue or PR as related to adding, removing, or otherwise changing an API area/conformance cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. size/S Denotes a PR that changes 10-29 lines, ignoring generated files. labels May 6, 2022
@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: robscott

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label May 6, 2022
@robscott
Copy link
Member Author

robscott commented May 6, 2022

cc @mikemorris

@youngnick
Copy link
Contributor

youngnick commented May 9, 2022

I thought the reason we specified a 503 originally is that it allows users to tell that the error is originating from something in the path rather than the end app? Personally, I would expect a 404 to mean that the request got to the end app and then it wasn't found, not that the proxy is misconfigured.

I guess another way to put this is: If we're sending 404s for misconfigurations, how will the app dev know that those 404s are not supposed to happen? 503s are used because they are a signal that something is wrong, and someone needs to take some action.

@robscott
Copy link
Member Author

robscott commented May 9, 2022

Meant to link the context from Slack.

If we're sending 404s for misconfigurations, how will the app dev know that those 404s are not supposed to happen? 503s are used because they are a signal that something is wrong, and someone needs to take some action.

I think it's useful to think about what an implementation should do when a matching route rule does not exist. I think that a 404 is the most appropriate response in that case. The question then becomes if we should return different HTTP status code if someone has tried and failed to attach a Route. My thought is that failed attempts to attach a Route should be treated the same as if the Route was not attached to the Gateway. I think the same logic applies when a BackendRef is not valid/resolvable.

For reference here are the RFC 7231 definitions of the HTTP status codes:

The 503 (Service Unavailable) status code indicates that the server
is currently unable to handle the request due to a temporary overload
or scheduled maintenance, which will likely be alleviated after some
delay. The server MAY send a Retry-After header field
(Section 7.1.3) to suggest an appropriate amount of time for the
client to wait before retrying the request.

The 404 (Not Found) status code indicates that the origin server did
not find a current representation for the target resource or is not
willing to disclose that one exists. A 404 status code does not
indicate whether this lack of representation is temporary or
permanent; the 410 (Gone) status code is preferred over 404 if the
origin server knows, presumably through some configurable means, that
the condition is likely to be permanent.

Although neither seem like a perfect fit here, 404 seems more accurate to me.

@robscott
Copy link
Member Author

robscott commented May 9, 2022

Another use case that we should document - Service exists and is valid to reference but no healthy endpoints.

@robscott
Copy link
Member Author

robscott commented May 9, 2022

Also cover unsupported types.

@k8s-ci-robot k8s-ci-robot added size/M Denotes a PR that changes 30-99 lines, ignoring generated files. and removed size/S Denotes a PR that changes 10-29 lines, ignoring generated files. labels May 10, 2022
@robscott
Copy link
Member Author

Updated to account for feedback from today's community meeting.

@youngnick
Copy link
Contributor

After the discussion in our meeting, I'm behind this change.

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label May 10, 2022
@robscott
Copy link
Member Author

Seems like we had consensus at community meeting.

/hold cancel

@k8s-ci-robot k8s-ci-robot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label May 12, 2022
@k8s-ci-robot k8s-ci-robot merged commit c7da26b into kubernetes-sigs:master May 12, 2022
@mikemorris
Copy link
Contributor

mikemorris commented May 20, 2022

Sorry for not looking at this closely earlier - the part that feels a bit strange to me with this change is

// If multiple backends are specified, and some are invalid, the proportion of
// requests that would otherwise have been routed to an invalid backend
// MUST receive a 404 status code.

Because (I think?) some sort of route would have been matched for the logic to reach the backend weighting, and if some backends are valid, retrying the request could potentialy reach a live backend (closer to 503 logic).


This could be tricky for some implementations to differentiate (as it might require some sort of health monitoring of backends), but I think what I would expect is:

  • no matching route (no match for path, pathprefix, regex, or filters = things that the gateway would have programmed/known) -> 404
    • this also covers any routes not accepted due to an issue like mismatched supportedKinds
    • I believe this was the intended focus of the linked Slack context
  • attempted routing to "bad" backend -> 503
    • a request was matched to a configuration of an accepted route from the gateway perspective, and retrying it may be successful
    • for the "no healthy endpoints" case, a gateway may not know in advance if this request would fail, i think the current 503 recommendation (but not MUST) here makes sense
    • the tricky parts here are if "none" vs "at least one" valid backends should be handled differently when deciding if a route should be accepted by a listener (and corresponding lifecycle watch, if all backends become invalid due to revoking a referencepolicy can a route be deatched, if one backend becomes valid does it attach automatically?), refs Route handling of invalid BackendRefs needs clarification #1112
    • some failure cases for "known" invalid backends may be possible (unpermitted, nonexistant) if we decide to have these routes accepted by the gateway listeners - I think a 503 may be appropriate for these if the route configuration has been accepted/matched but the backend is known to be unavailable/invalid?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. area/conformance cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/api-change Categorizes issue or PR as related to adding, removing, or otherwise changing an API kind/documentation Categorizes issue or PR as related to documentation. lgtm "Looks good to me", indicates that a PR is ready to be merged. size/M Denotes a PR that changes 30-99 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants