-
Notifications
You must be signed in to change notification settings - Fork 2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Apollo engine reporting blocks data and error propagation in '@apollo/gateway' managed federation deployment #3052
Comments
Here is a bare bones example. I'm excited to push this further after the Apollo Day in Seattle (I'm the one who asked about errors lolol). |
Hey awesome @jacob-ebey - apologies for no reproduction, I was using an ancient errors handling setup and was making sure it wasn't just due to that. Throwing standard In my example above, the error url field suggests engage.xxx.xxx is generating the problem. However, engage is playing nice and passing a well formatted Apollo error. |
It's a strange repro case because if you use the gateway and specify the services manually the errors are relayed properly. It's only when the gateway is retrieving it's list of services from the managed apollo platform. |
Yes, I think the error passing is only interuptted by the Apollo engine reporting mechanism. When you don't provide a key for the managed federation, no reporting occurs. The Apollo engine reporting is stumbling when it reports errors to the engine API, and its own error is preventing propagation of both the resolved data and the error to the user. I've encountered another error that oddly prevents more than one successful resolved query - after which it throws with I think a production Apollo engine reporting solution will need to gracefully handle errors within itself without blocking graphql resolvers. |
@heysailor I'm taking a look into this now, and following the reproduction case supplied by @jacob-ebey , I actually found that the same error of I'd also love to investigate the issue that you reported in a comment @heysailor, where the resolver is blocked(!!!) by the federatedExtension. That is certainly not the intended behavior, and a reproduction would help further investigate there, too! |
I opened up a PR that I believe fixes the problem. I'll check on getting an alpha build out of apollo-engine-reporting so that you can verify that it fixes the repro! :) |
Cheers @zionts - I see you've solved the second error. Looking forward to the PR merging, federation works a treat otherwise. |
The extension stack executionDidEnd method gets called before didEncounterErrors for GraphQL errors returned from execution (although confusingly the plugin executionDidEnd method gets called after), which caused the assertion that nothing gets added to the EngineReportingTreeBuilder after stopTiming to fail. Fix by moving stopTiming to the last possible moment: format(). Actually test error reporting, including both kinds of rewriting. Add a comment noting that backend parse and validation errors don't get reported. Fixes #3052.
The extension stack executionDidEnd method gets called before didEncounterErrors for GraphQL errors returned from execution (although confusingly the plugin executionDidEnd method gets called after), which caused the assertion that nothing gets added to the EngineReportingTreeBuilder after stopTiming to fail. Fix by moving stopTiming to the last possible moment: format(). Actually test error reporting, including both kinds of rewriting. Add a comment noting that backend parse and validation errors don't get reported. Fixes #3052.
I've rewritten @zionts' PR (with tests this time) to fix the blatant brokenness of error reporting from federated backends. I don't really understand what you've written here so I'm not sure if this fix addresses it as well:
|
The extension stack executionDidEnd method gets called before didEncounterErrors for GraphQL errors returned from execution (although confusingly the plugin executionDidEnd method gets called after), which caused the assertion that nothing gets added to the EngineReportingTreeBuilder after stopTiming to fail. Fix by moving stopTiming to the last possible moment: format(). Actually test error reporting, including both kinds of rewriting. Add a comment noting that backend parse and validation errors don't get reported. Fixes #3052.
The extension stack executionDidEnd method gets called before didEncounterErrors for GraphQL errors returned from execution (although confusingly the plugin executionDidEnd method gets called after), which caused the assertion that nothing gets added to the EngineReportingTreeBuilder after stopTiming to fail. Fix by moving stopTiming to the last possible moment: format(). Actually test error reporting, including both kinds of rewriting. Add a comment noting that backend parse and validation errors don't get reported. Fixes #3052.
The extension stack executionDidEnd method gets called before didEncounterErrors for GraphQL errors returned from execution (although confusingly the plugin executionDidEnd method gets called after), which caused the assertion that nothing gets added to the EngineReportingTreeBuilder after stopTiming to fail. Fix by moving stopTiming to the last possible moment: format(). Actually test error reporting, including both kinds of rewriting. Add a comment noting that backend parse and validation errors don't get reported. Fixes #3052.
This should be fixed in |
Thanks @glasser - should fix it all up - the second error I referred to looked like it was coming from a code issue in apollo-engine-reporting, as was the original error I referred to. Will report back, if further problems, appreciate all the work 👍🏻👍🏻 |
Using versions:
@apollo/gateway 0.7.1
apollo-server-express 2.7.0
node 10.15.3
Setup:
Apollo managed federation gateway server. Local development on Mac, deployment to zeit/now v2.
Expected:
Apollo server produces a synthesised federated schema, passing/reformatting errors as produced by child services.
Actual:
If a graphql error is produced by a child service, the error message below is produced.
The query generating this error spans two services
Searching the Apollo Server codebase, the error message originates from the
addProtobufError
method ofEngineReportingTreeBuilder
fromapollo-engine-reporting/src/treeBuilder.ts
.addProtobufError
appears to be called when an error occurs in a child service; but I cannot see why it is afterstopTiming
has been called - perhaps someone familiar with the package may be able to see the issue.The text was updated successfully, but these errors were encountered: