Gateway fetching uplink indefinitely #1405

Nicoowr · 2022-01-19T16:07:12Z

Latest versions where it didn't occur

    "@apollo/federation": "0.29.0",
    "@apollo/gateway": "0.38.0",

Current versions

    "@apollo/gateway": "0.45.1",
    "@apollo/subgraph": "0.1.5",

Set-up

Everything is hosted on AWS Lambda (Gateway & Subservices)

Expected behavior

Setting schemaConfigDeliveryEndpoint: undefined in gateway config should keep the old behavior, namely not fetching the supergraph from uplink.

Actual behavior

Setting schemaConfigDeliveryEndpoint: undefined does not always prevent the gateway from fetching the supergraph from uplink. When it does fetch, the POST requests do not succeed and the gateway keeps fetching until it times out.

This might be linked to #949

glasser · 2022-01-26T19:29:49Z

Just to confirm: you are trying to use the old Google Cloud Storage-based system for getting schemas in your server instead of the newer Uplink system? This behavior was removed in @apollo/[email protected], as mentioned in the changelog.

Can you help us understand why Uplink doesn't work for you? The Uplink system gives us the ability to manage permissions on your graphs in a more reliable and dynamic way, and has allowed us to provide multi-cloud support so that Uplink continues to work even when one of our cloud vendors has a global failure.

Nicoowr · 2022-01-27T09:33:58Z

Hi @glasser
Actually we'd like to use the old behavior of the gateway, namely when it does the composition by itself. But setting schemaConfigDeliveryEndpoint: undefined does not seem to work anymore.

We'd be glad to use Uplink, but using the new uplink endpoints (or even the former one) led to timeout problems like the one mentioned in this issue, or the one here: #949

I have no idea why it behaves like this; perhaps it's due to the Lambda runtime, but it's hard to say...

trevor-scheer · 2022-01-27T19:40:05Z

@Nicoowr that's very surprising to hear. In its former mode of behavior the gateway had to perform a series of fetches (literally one fetch waiting for the next) to the network along with the actual composition before it was ready to serve requests. With uplink it gets to skip all of that and perform just one fetch.

If you can provide us with some additional details, i.e. where time is being lost when using Uplink, that could be helpful. Do you know what your current Lambda timeout is?
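To make the difference described above concrete, here is a minimal sketch that counts network round trips at startup in each mode. It is not the gateway's actual code: the `Fetcher` type, both `startWith...` functions, the placeholder URLs, and the string join standing in for composition are all hypothetical.

```typescript
// Hypothetical model: a Fetcher performs one network round trip.
type Fetcher = (url: string) => Promise<string>;

// Former mode, as described above: a chain of fetches, each waiting
// for the previous one, followed by composition.
async function startWithSequentialFetches(
  urls: string[],
  fetcher: Fetcher,
): Promise<string> {
  const parts: string[] = [];
  for (const url of urls) {
    // The next fetch does not start until this one has resolved.
    parts.push(await fetcher(url));
  }
  // Stand-in for composition; the real step is far more involved.
  return parts.join("\n");
}

// Uplink mode, as described above: one fetch returns the finished result.
async function startWithSingleFetch(
  url: string,
  fetcher: Fetcher,
): Promise<string> {
  return fetcher(url);
}

// Fake fetcher that records how many round trips were made.
function makeCountingFetcher(): { fetcher: Fetcher; calls: () => number } {
  let count = 0;
  return {
    fetcher: async (url: string) => {
      count += 1;
      return `payload from ${url}`;
    },
    calls: () => count,
  };
}
```

With three placeholder URLs the sequential path makes three round trips before it can serve anything, while the single-fetch path makes one. That is why a timeout appearing only in the single-fetch mode is surprising.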

Nicoowr · 2022-01-28T08:20:22Z

@trevor-scheer Our lambda timeout is set to 28s, for both gateway and subservices.
AFAIK, the gateway does not even call the subservice since it's trapped in the loop you can see on the screenshot. It's very weird because every call to the uplink endpoint has a 200 status, but the gateway keeps polling 🤔

trevor-scheer · 2022-01-28T17:25:58Z

@Nicoowr thanks for the extra info. I don't think a 200 is conclusive, but this does seem to be an issue for you and others so I think we have some more digging to do. Is there a way for you to share what's in those responses from Uplink?

My first suspicion is that there might be some actual errors preventing the gateway from successfully starting.
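As an illustration of why a 200 is not conclusive: a GraphQL endpoint commonly answers with HTTP 200 and reports failures in an `errors` array in the response body. The sketch below is a generic check under that assumption. The `interpretResponse` name and the `Outcome` type are hypothetical, and the body shape is the generic GraphQL one, not a confirmed description of what Uplink returns.

```typescript
// Generic GraphQL-over-HTTP response body (assumed shape).
interface GraphQLBody {
  data?: unknown;
  errors?: { message: string }[];
}

type Outcome =
  | { ok: true; data: unknown }
  | { ok: false; reason: string };

// Decide whether a response is usable by looking at both the HTTP
// status and the body, rather than the status alone.
function interpretResponse(status: number, body: GraphQLBody): Outcome {
  if (status < 200 || status >= 300) {
    return { ok: false, reason: `HTTP ${status}` };
  }
  if (body.errors && body.errors.length > 0) {
    // A 200 can still carry errors in the body.
    return { ok: false, reason: body.errors.map((e) => e.message).join("; ") };
  }
  if (body.data === undefined || body.data === null) {
    return { ok: false, reason: "response contained no data" };
  }
  return { ok: true, data: body.data };
}
```

Logging the `reason` for each poll would show whether the 200 responses actually contain a usable result.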

trevor-scheer · 2022-02-10T21:56:06Z

This should be resolved via #1503 / #1504 (releasing @apollo/[email protected] as we speak)

trevor-scheer · 2022-02-10T22:59:02Z

I should backpedal a bit here - #1503 does resolve an issue that's demonstrated in your screenshot (gateway shouldn't send 7x requests per cycle when it's getting 200s). #949 seems like a completely different problem set that might still be blocking successfully using Uplink.

In any case, I hope you try out the new version and report back here with results (good or bad!).
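For readers following along, the intended behavior (no extra requests in a cycle once a request has succeeded) can be sketched as a retry loop that stops on the first success. This is not the code from #1503: `runPollCycle`, the `Attempt` type, and the `maxAttempts` parameter are hypothetical, and success is simplified to "2xx status".

```typescript
// Hypothetical: one attempt performs a request and resolves to its HTTP status.
type Attempt = () => Promise<number>;

// Run one polling cycle. Retry only while attempts fail, up to maxAttempts.
// Resolves to the number of requests sent during this cycle.
async function runPollCycle(
  attempt: Attempt,
  maxAttempts: number,
): Promise<number> {
  let sent = 0;
  while (sent < maxAttempts) {
    sent += 1;
    const status = await attempt();
    if (status >= 200 && status < 300) {
      // Success: stop here instead of using up the remaining attempts.
      break;
    }
  }
  return sent;
}
```

With an endpoint that always returns 200, this sends exactly one request per cycle. Only an endpoint that keeps failing uses the full `maxAttempts`.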

Nicoowr · 2022-02-11T09:06:56Z

@trevor-scheer Thank you very much for your responsiveness! I'll try it asap and tell you how it goes :)

Nicoowr · 2022-02-21T18:13:59Z

@trevor-scheer It seems like the new version causes a non-reproducible error: Cannot convert undefined or null to object.
It does not happen every time, which is really weird. I'll provide more information ASAP.

glasser · 2022-02-21T20:35:13Z

@Nicoowr Do those come with stack traces?

Nicoowr · 2022-02-21T21:04:24Z

Well, it's actually the client (an SNS subscriber in this case) making the request that sometimes throws this kind of error.

I've checked the gateway logs and it's hard to find anything relevant. What I do know, though, is that if I revert to a non-managed schema, everything works.

It's not much information, I'll try to provide more asap.

Nicoowr mentioned this issue Jan 26, 2022

Gateway fetching uplink indefinitely apollographql/apollo-server#6036

Closed

glasser closed this as completed Jan 26, 2022
