Federation: Undefined behaviour and INTERNAL_SERVER_ERROR when returning external types that do not exist #376
Comments
Hey @nihalgonsalves, thanks for reporting this and apologies for the delay. I agree the expected behavior should certainly be documented and well-defined by test cases. I'm having a bit of an internal struggle deciding what I think the expected behavior should be and I'd love to hear your thoughts. Here's the direction my mind has gone while thinking about the implications introduced in this issue and apollographql/apollo-server#3914: Should the gateway be tolerant to data inconsistencies between services? In its current form, I think we can say there's a strict contract via the Unfortunately I feel like there's no "just graphql" equivalent to analogize with since the joins via
@trevor-scheer Yeah, I agree that there isn't an easy solution to this. There are two common examples (not necessarily exhaustive) where this is problematic and could cause situations that are hard to work around:
Both of these can be worked around with some effort, and perhaps improving the documentation around the contract and making this explicit would prevent users from getting into such a situation, but they can cause nasty production bugs that cannot be immediately resolved.

Regarding the general question about the gateway being tolerant to data inconsistencies: when another error occurs, it currently is tolerant. Normally, if a nullable field comes from another service and that service fails to resolve (for whatever reason: permissions, HTTP failure, service down, DB down, etc.), the gateway returns partial data and also the error (even though the field is null because of the failure, not because it was explicitly set to null; within the limitations of the current GraphQL primitives, I think null + error is acceptable). It's only this special case that kills the entire query.

In general, especially when dealing with eventually consistent systems, I believe that the gateway should be as fault-tolerant as possible while also reporting errors. Otherwise it requires coordination between systems. I feel it should be up to the application to decide, through a combination of nullability in schema design and the error policy on the client (i.e. a view that is critical can choose to discard any partial data in Apollo Client).

P.S.: The current situation is that the entire query fails with an internal server error. I think that the minimum we could agree on, regardless of whether it's decided that it is a fault-tolerant error or a query-failing error, is that the error should be well-defined and descriptive.
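To make the "let the application decide" point concrete, here is a minimal TypeScript sketch of a client-side error policy applied to a partial result. The names `applyErrorPolicy`, `GraphQLResult` and `PolicyResult` are hypothetical and written for this illustration only; the three policy values are loosely modelled on Apollo Client's `errorPolicy` option, but this is not Apollo Client's implementation.

```typescript
// Hypothetical helper for illustration; not part of Apollo Client or the gateway.
type ErrorPolicy = "none" | "ignore" | "all";

interface GraphQLResult<T> {
  data: T | null;
  errors?: { message: string }[];
}

interface PolicyResult<T> {
  data: T | null;
  errors?: { message: string }[];
}

function applyErrorPolicy<T>(
  result: GraphQLResult<T>,
  policy: ErrorPolicy,
): PolicyResult<T> {
  const errors = result.errors === undefined ? [] : result.errors;
  if (errors.length === 0) {
    return { data: result.data };
  }
  if (policy === "none") {
    // A critical view discards partial data and treats any error as fatal.
    throw new Error(errors.map((e) => e.message).join("; "));
  }
  if (policy === "ignore") {
    // Keep the partial data and drop the errors.
    return { data: result.data };
  }
  // "all": surface both the partial data and the errors to the caller.
  return { data: result.data, errors };
}
```

With a policy like this on the client, a gateway that returns null plus a descriptive error leaves the final decision to each view instead of failing every query in the request.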
@trevor-scheer I agree with what Nihal said about the gateway being tolerant in other cases. So I think the solution I proposed in apollographql/apollo-server#3914 aligns nicely with what the gateway is doing in other cases.
I think at the moment what you get is an unexpectedly partial result. The gateway stops resolving data when it cannot resolve an expected entity and returns the data it has retrieved up to this point. See the example I described here: apollographql/apollo-server#3914 (comment)
Thank you both for the input! I did some more thinking after I posted my response and came to the conclusion that my initial argument about data consistency across services is too strict and unrealistic. The scenario I played through my head was as simple as this:

{
  user {
    reviews {
      content
    }
  }
}

It's silly to expect the Furthermore, I agree with what you're saying @mduesterhoeft about what's currently happening, and that this is an edge case for I'll take another look at the associated PR!
@nihalgonsalves This is the exact same issue we are facing right now. Were you able to find a stop-gap solution for this? @mduesterhoeft Thanks for the PR that addresses this issue. I can see that the PR was closed and put "below the line". I am gonna be selfish here and ask if we could please re-open that PR @trevor-scheer @abernix 😄?
I am also running into this issue. It seems strange that an entity from an external service would still try to resolve its fields if the returned value is

For example, using the example in the docs for resolving in federation:

Product Service

type Product @key(fields: "upc") @key(fields: "sku") {
  upc: String!
  sku: String!
  price: String!
}

Review Service

type Review {
  product: Product
}
# This is a "stub" of the Product entity (see below)
extend type Product @key(fields: "upc") {
  upc: String! @external
}

{
  Review: {
    product(review) {
      return { __typename: "Product", upc: review.upc };
    }
  }
}

{
  Product: {
    __resolveReference(reference) {
      return fetchProductByUPC(reference.upc);
    }
  }
}

{
  reviews {
    product {
      name
      price
    }
  }
}

In the above query, if the product is Why would the resolving of the fields continue if the base entity If you short circuit on the

{
  Product: {
    __resolveReference(reference) {
      return null;
    }
  }
}
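The short-circuit described in this comment can be sketched in TypeScript as follows. This is only an illustration under assumptions: `resolveEntity`, `Representation` and `FieldResolvers` are hypothetical names invented here, and the code does not reflect how @apollo/gateway or a subgraph server is actually implemented.

```typescript
// Hypothetical shapes for illustration; not the gateway's or subgraph's internals.
interface Representation {
  __typename: string;
  [key: string]: unknown;
}
type Entity = Record<string, unknown>;
type FieldResolvers = Record<string, (entity: Entity) => unknown>;

function resolveEntity(
  representation: Representation,
  resolveReference: (ref: Representation) => Entity | null | undefined,
  fieldResolvers: FieldResolvers,
): Record<string, unknown> | null {
  const entity = resolveReference(representation);
  if (entity === null || entity === undefined) {
    // Short circuit: the base entity does not exist, so no field resolver runs
    // and the whole entity resolves to null.
    return null;
  }
  const resolved: Record<string, unknown> = {};
  Object.keys(fieldResolvers).forEach((fieldName) => {
    const resolveField = fieldResolvers[fieldName];
    if (resolveField !== undefined) {
      resolved[fieldName] = resolveField(entity);
    }
  });
  return resolved;
}
```

Under this sketch, a reference resolver that returns null yields a null `product` on the review without any `Product` field resolver being invoked.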
I agree we should fix this and want to try getting a fix committed. Now, while I agree apollographql/apollo-server#3914 is targeting the right part of the code, I do think it should be slightly modified. Anyway, as said PR couldn't be re-opened per se (since it is against the wrong repository), I took the liberty of creating a new PR, #1305 (there is one for
Sometimes, when a @requires is just after a key, we can simply collect the required field before taking the key. Other times, we encounter a @requires in a type T of a subgraph A, and have to jump to subgraph B to get the required field. In that case, once we've collected the required fields, we need to "resume" the query on T in A, and that means using a key there. The code dealing with that latter post-require "return key" case was incorrect, essentially using a key on subgraph B instead of one on subgraph A. As a consequence, subgraphs that shouldn't have been allowed to compose (because they were missing the needed key on A) were allowed to compose, and the code later failed during query planning. This commit fixes this issue, and adds tests for that case. One of the tests introduced _fails_ as of this commit, but that is due to the problem described in apollographql#376 and the fix will be in a followup commit.
Fixes apollographql#376 Co-authored-by: Sylvain Lebresne <[email protected]>
@epitaphmike I'm having the exact same problem as you had. Did you ever find a solution to this?
howdy @johnciprian! Are you able to use
@benweatherman I just tried it with v2 and it works as expected. Thank you!
If I return a Federated representation (i.e. the fields defined in @key) of another service, the behaviour is undefined when the other service returns null or throws an error in __resolveReference.

Reproduction Repo: https://github.com/nihalgonsalves/apollo-server-issue3859 - this includes the schema I will use as an example here. You can try out queries with different selection sets and using the IDs should-exist, should-not-exist and should-throw, where service B returns something, null, or throws an error respectively in its __resolveReference resolver.

Service A defines this schema:

Note that it @requires a field from Service B to extend a type, but also returns this type from its own query.

Service B defines this type:

Now, when Service B doesn't need to be called or only has to be called once, this fails gracefully:

However, if another service (in this case, when querying extendedByServiceA) requires a field from Service B, the null is sent to Service A, causing an internal server error: Field "someFieldFromB" was not found in response:

Expected Behaviour:

serviceAQueryReturningNullableBType in the bug reproduction repo), you still get an internal server error

ids: ["should-exist", "should-not-exist"], both objects come back as null due to the internal server error, even though one of the objects can be resolved on its own.

undefined to the next service (the error is thrown from here: https://github.com/apollographql/apollo-server/blob/master/packages/apollo-gateway/src/executeQueryPlan.ts#L399-L406). In general it should not throw an internal server error as even non-affected queries in the request return no data when this happens.

I believe that there should also be defined behaviour/documentation about exactly what happens when __resolveReference throws an error or returns null.

Versions:
@apollo/federation and @apollo/gateway version 0.13.2
apollo-server version 2.10.
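One possible reading of the expected behaviour can be sketched in TypeScript. Everything here is an assumption made for illustration: `resolveReferenceInB`, `resolveEntities`, `representationsForServiceA` and the `BEntity` shape are hypothetical, the generated field values are made up, and none of this is the gateway's actual `executeQueryPlan` logic. The sketch resolves each ID independently, records an error for a reference that throws, and forwards only entities that actually resolved to the service that @requires a field from Service B.

```typescript
// Hypothetical sketch; names and shapes are assumptions, not @apollo/gateway code.
interface BEntity {
  id: string;
  someFieldFromB: string;
}
interface EntityError {
  message: string;
  path: (string | number)[];
}

// Stand-in for Service B's __resolveReference, keyed on the reproduction repo's IDs.
function resolveReferenceInB(id: string): BEntity | null {
  if (id === "should-throw") {
    throw new Error(`Service B failed to resolve "${id}"`);
  }
  if (id === "should-not-exist") {
    return null;
  }
  return { id, someFieldFromB: `value-for-${id}` };
}

// Resolve each reference on its own so one failure does not null out the others.
function resolveEntities(ids: string[]): {
  entities: (BEntity | null)[];
  errors: EntityError[];
} {
  const entities: (BEntity | null)[] = [];
  const errors: EntityError[] = [];
  ids.forEach((id, index) => {
    try {
      entities.push(resolveReferenceInB(id));
    } catch (error) {
      entities.push(null);
      errors.push({
        message: error instanceof Error ? error.message : String(error),
        path: ["_entities", index],
      });
    }
  });
  return { entities, errors };
}

// Only entities that resolved are sent on to the service that @requires their fields.
function representationsForServiceA(entities: (BEntity | null)[]): BEntity[] {
  return entities.filter((entity): entity is BEntity => entity !== null);
}
```

In this sketch, a request for should-exist and should-not-exist returns one resolved object and one null, and Service A never receives a representation that is missing someFieldFromB.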