Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve generation of plans once all path options are computed #2316

Merged
merged 2 commits into from
Jan 10, 2023

Conversation

pcmanus
Copy link
Contributor

@pcmanus pcmanus commented Jan 6, 2023

At a very high level, query planning works in 2 main steps:

  1. for every "leaf" of the query, we compute which possible options (paths) can lead to it.
  2. we take all those options and generates the corresponding plans, which means we compute all the (non trivially inefficient) possible plan that would be valid for the query. For each of those plan, we compute the cost of the plan and keep the best plan.

In some cases, the 1st step gives quite a few possible options, and generating all the corresponding plans is costly.

This patch attempts to first generate plans that are a bit more likely to be the most efficient, or at least fairly good. When then use the cost of those plans to something cut the generation of other plans early. Essentially, as we generate a plan corresponding to a set of options, if we notice that the cost gets higher than the best we've found so far before we've handled all the paths, then we can give up on that set of options without generating everything.

Doing so have little to no impact on very simple plans, but for more complex ones it can drastically reduce the generation time (anecdotically, some real world plan gets more than a 10x improvement on generation speed).

@netlify
Copy link

netlify bot commented Jan 6, 2023

👷 Deploy request for apollo-federation-docs pending review.

Visit the deploys page to approve it

Name Link
🔨 Latest commit 1844846

@pcmanus pcmanus self-assigned this Jan 6, 2023
@codesandbox-ci
Copy link

codesandbox-ci bot commented Jan 6, 2023

This pull request is automatically built and testable in CodeSandbox.

To see build info of the built libraries, click here or the icon next to each commit SHA.

@korinne korinne added this to the 2.3.0 milestone Jan 6, 2023
* @param onPlan - an optional method called on every _complete_ plan generated by this method, with both the cost of that plan and
* the best cost we have generated thus far (if that's not the first plan generated). This mostly exists to allow some debugging.
*/
export function generateAllPlansAndFindBest<P, E>(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we make these named arguments?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

const updatedTree = p.tree.merge(t);
return { graph: updatedDependencyGraph, tree: updatedTree };
},
(p) => this.cost(p.graph),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if these should be the defaults, but I guess since we're only calling it once outside of tests it doesn't matter for now.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Left it as it for now, but fwiw, I'm not entirely sure to understand the suggestion. That is, if the suggestion is to make those concrete function the defaults in the declaration of generateAllPlansAndFindBest, then I'm not sure how to do that easily given that the method is generic and so the argument of any default would have the generic as type (plus, the cost function references this too).

Sylvain Lebresne added 2 commits January 10, 2023 10:20
At a very high level, query planning works in 2 main steps:
1. for every "leaf" of the query, we compute which possible options
   (paths) can lead to it.
2. we take all those options and generates the corresponding plans,
  which means we compute all the (non trivially inefficient) possible plan
  that would be valid for the query. For each of those plan, we compute
  the cost of the plan and keep the best plan.

In some cases, the 1st step gives quite a few possible options, and
generating all the corresponding plans is costly.

This patch attempts to first generate plans that are a bit more likely
to be the most efficient, or at least fairly good. When then use
the cost of those plans to something cut the generation of other plans
early. Essentially, as we generate a plan corresponding to a set of
options, if we notice that the cost gets higher than the best we've
found so far before we've handled all the paths, then we can give up
on that set of options without generating everything.

Doing so have little to no impact on very simple plans, but for more
complex ones it can drastically reduce the generation time
(anecdotically, some real world plan gets more than a 10x improvement on
generation speed).
@pcmanus pcmanus merged commit 88ae498 into apollographql:main Jan 10, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants