Using a dataloader inside `resolve_reference` does not work #205

col · 2022-08-16T05:01:19Z

It's important that when a subgraph service receives an _entities query from the gateway that it can resolve all of the references efficiently. You would expect that using a data loader inside the resolve_reference method for each entity would achieve this but unfortunately it does not work as expected. Currently the data loader is being invoked for each individual reference rather than as a single batch.

I've created a small example application here: https://github.com/col/apollo-federation-example

Expected Output:

Product.resolve_reference reference: {:__typename=>"Product", :id=>"1"}
Product.resolve_reference reference: {:__typename=>"Product", :id=>"2"}
ProductsByUpc.fetch ["1", "2"]

Actual Output:

Product.resolve_reference reference: {:__typename=>"Product", :id=>"1"}
ProductsByUpc.fetch ["1"]
Product.resolve_reference reference: {:__typename=>"Product", :id=>"2"}
ProductsByUpc.fetch ["2"]

At scale this becomes a significant problem as we can receive hundreds of references in a single _entities query.

The text was updated successfully, but these errors were encountered:

geshwho · 2022-08-16T17:55:30Z

I'm not quite as well versed on this issue as @daemonsy, but I'll dig into it this week (@daemonsy is currently OOO)

@col Am I correct that this is the same issue (or very similar to) as #149?

col · 2022-08-17T02:18:44Z

@geshwho Yes you are correct, this is the same issue as #149. Sorry I really should have added this as a comment on that issue.

daemonsy · 2022-08-17T07:35:07Z

Hey @col, should have documented the other issue better, a little hazy now. I believe I didn't actually find a direct solution to the problem, it just wasn't a problem anymore on the project that's using it.

Had a window to take a look but couldn't get the example to run. I 'd take a look at def _entities and the after_lazy block. Are we doing something that causes execution right away?

rmosolgo · 2022-08-19T16:16:56Z

I think this basically boils down the the difference between GraphQL-Batch and Dataloader:

With GraphQL-Batch, user code would return a Promise to GraphQL-Ruby (or an Array of Promises), then GraphQL-Ruby would resolve them at some point
With Dataloader, .load(...) calls actually pause execution until the data is available, then it resumes.

So inside a .map { ... } (like this one), Dataloader doesn't Just Work ™️ -- each call to .load causes the fiber to pause until data is fetched, so each iteration of the loop requires its own data fetching.

To address this, I can think of a couple possibilities:

Support resolve_references (plural) which gives all references to the user-defined method. Then, the user-defined method can use Dataloader's .load_all method to do its own batching.

Support returning Dataloader::Requests (from .request) from that method, for example:

dataloader_requests = {} 
loaded_refs = representations.each_with_index.map do |(reference, ref_idx)|
   # ... 
   if type_class.respond_to?(:resolve_reference)
      result = type_class.resolve_reference(reference, context)
      if result.is_a?(GraphQL::Dataloader::Request) 
        dataloader_requests[ref_idx] = result 
      end 
      result 
    else
      result = reference
    end
    # ... 
  end
  
  
  # Now that all loads have been requested, tell Dataloader to start fetching things:
  dataloader_requests.each do |(ref_idx, request)| 
    loaded_refs[ref_idx] = request.load 
  end

There might also be a way to call dataloader methods like .append_job, then gather all the results, which is how GraphQL-Ruby handles this situation when executing queries, but honestly, I'd recommend one of those above.

What do you think of those approaches?

col · 2022-08-20T02:24:40Z

Thanks for your input @rmosolgo!

I've been trying to get this to work using append_job but have been unsuccessful.

Both the approaches you've suggested sound like they'd work.

Option 1 - resolve_references
I think this would be my preferred approach. From the users perspective it's easy for them to reason about. They don't have to understand the implementation for them to work with it. I think we could also provide backwards compatibility and deprecate the singular resolve_reference method.

Option 2 - return data loader requests
It's not going to be intuitive to the user that they need to call request instead of load. Of course it can be documented but I fear most people wouldn't read the docs until after they discover there is an issue.

@daemonsy Let me know what you think. I'm happy to put a PR together for Option 1 sometime next week.

daemonsy · 2022-08-31T14:12:20Z

Apologies from the late reply, just returned from a holiday in Singapore :)

Hey @rmosolgo it's been a while, hope you're doing well. Thanks for the clear explanation, that rang a big bell. I've personally bumped into surprises with pausing execution.

@col I'm going to do a spike on the suggestions on our codebase. One consideration is that we've kept to an implicit convention keeping it close to Apollo's reference implementation to avoid creating new concepts/APIs.

col · 2022-08-31T23:52:59Z

Hey @daemonsy, live in Singapore, I could have taken you out for a drink! Maybe next time!

Anyway, I'd still be very happy to see a better solution for this that keeps the original API intact. I had tried a few things but none of the approaches I tried worked.

I'm wondering if it would be better to lazily resolve the references while resolving the the Entity type rather than trying to do it as a loop in _entities. This would fit better into the existing DataLoader pattern and allow all the Entity resolvers to be paused until the data it fetched. Just a thought.

daemonsy · 2022-09-02T18:09:21Z

Hey @col 😱 definitely my loss. I'm from Singapore but reside in NY with my family. It was good to be back home for a while, despite the weather :)

it would be better to lazily resolve the references while resolving the the Entity type rather than trying to do it as a loop in _entities.

Do you have an example or pseudo code for how this might look?

col · 2023-02-10T03:11:17Z

I'd suggest we close this issue at this point. I wasn't able to find a fix that didn't require adding the resolve_references method. With the resolve_references workaround in place I would consider this closed.

kamal · 2023-09-18T17:54:36Z

What makes it slightly tricky is if you support multiple keys, for example, id and email. Some references will have { "_typename": "User", "id": 123 } while others might have { "__typename": "User", "email": "[email protected]" }. Now you're going to have to make two dataloader calls, effectively things batching yourself, and return the results in the same order it came in.

With the 2nd option proposed by @rmosolgo, you can let dataloader handle the batching by using batch_key_for and fetch(keys).

col mentioned this issue Aug 21, 2022

feat: introduce optional resolve_references method #206

Merged

utay mentioned this issue Apr 23, 2024

Deprecate returning .resolve(...) dataloader requests rmosolgo/graphql-ruby#4807

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Using a dataloader inside `resolve_reference` does not work #205

Using a dataloader inside `resolve_reference` does not work #205

col commented Aug 16, 2022 •

edited

Loading

geshwho commented Aug 16, 2022

col commented Aug 17, 2022 •

edited

Loading

daemonsy commented Aug 17, 2022 •

edited

Loading

rmosolgo commented Aug 19, 2022

col commented Aug 20, 2022

daemonsy commented Aug 31, 2022

col commented Aug 31, 2022

daemonsy commented Sep 2, 2022

col commented Feb 10, 2023

kamal commented Sep 18, 2023

Using a dataloader inside resolve_reference does not work #205

Using a dataloader inside resolve_reference does not work #205

Comments

col commented Aug 16, 2022 • edited Loading

geshwho commented Aug 16, 2022

col commented Aug 17, 2022 • edited Loading

daemonsy commented Aug 17, 2022 • edited Loading

rmosolgo commented Aug 19, 2022

col commented Aug 20, 2022

daemonsy commented Aug 31, 2022

col commented Aug 31, 2022

daemonsy commented Sep 2, 2022

col commented Feb 10, 2023

kamal commented Sep 18, 2023

Using a dataloader inside `resolve_reference` does not work #205

Using a dataloader inside `resolve_reference` does not work #205

col commented Aug 16, 2022 •

edited

Loading

col commented Aug 17, 2022 •

edited

Loading

daemonsy commented Aug 17, 2022 •

edited

Loading