Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Over-aggressive optimisation can lead to generating non-optimal query plans #2623

Merged
merged 4 commits into from
Jun 13, 2023

Conversation

pcmanus
Copy link
Contributor

@pcmanus pcmanus commented Jun 9, 2023

The query planner has a number of heuristic it applies when computing possible paths to avoid generating options that can be proved early to never be better than other ones (as a basic example, if we have the same key between A -> B, A -> C and B -> C, and we're trying to go from A to C, then no point in considering doing A -> B -> C since we can do A -> C directly). Anyway, the point is that if we're not a bit smart in eliminating such "unnecessary" options, the number of final plans to evaluate can quickly explode (and make query planning long, up to the point where the planner stops degrading its output).

But one of the heuristic we're applying is unfortunately a bit too aggressive: it does remove options that are unnecessary, but was also removing options that should have been kept. The result is that the planner will sometime return plans that are not optimal (see the added test for an example).

This commit fix that problem by making the heuristic more precise and only remove the option that are safe to remove.

Side-note: the first commit of this PR is actually slightly unrelated to the description above, it is a small optimisation when generating plans. But I discovered both at the same time, and that optimisation was fairly simple, so took the liberty to include.

pcmanus added 2 commits June 9, 2023 15:21
When there is multiple valid choices for multiple paths of a query, the
number of theoretically possible ends up a cartesian product so it
explodes fairly quickly, and generating too many plans can be very
costly. To manage this in general, we generate the possible plans
incrementally and in an order that is likely to yield "good" plans
early, and use the cost of previously computed plan to possibly
cut early further computation.

The code doing this was however not optimal as we weren't using that
potential of early exit as aggressively as we could. This commit
fixes that.
@pcmanus pcmanus requested a review from a team as a code owner June 9, 2023 14:15
@netlify
Copy link

netlify bot commented Jun 9, 2023

👷 Deploy request for apollo-federation-docs pending review.

Visit the deploys page to approve it

Name Link
🔨 Latest commit 982979e

@changeset-bot
Copy link

changeset-bot bot commented Jun 9, 2023

🦋 Changeset detected

Latest commit: 982979e

The changes in this PR will be included in the next version bump.

This PR includes changesets to release 7 packages
Name Type
@apollo/query-planner Patch
@apollo/query-graphs Patch
@apollo/gateway Patch
@apollo/composition Patch
@apollo/federation-internals Patch
@apollo/subgraph Patch
apollo-federation-integration-testsuite Patch

Not sure what this means? Click here to learn what changesets are.

Click here if you're a maintainer who wants to add another changeset to this PR

@codesandbox-ci
Copy link

codesandbox-ci bot commented Jun 9, 2023

This pull request is automatically built and testable in CodeSandbox.

To see build info of the built libraries, click here or the icon next to each commit SHA.

When building plans, after we've computed all the possible options to
get to a particular leaf of the query being plan, we're applying an
heuristic that checks if an option is guaranteed to be always better
than another one (and remove the latter one if so), which the goal
of reducing the number of actual full plan we need to evaluate.

But the implementation of that heuristic was incorrect: while it was
correctly removing options that we did meant to remove, it was also
sometimes removing options that really should be kept.

This resulted in planning sometimes not produce the most optimal
possible plan.

This commit fixes that heuristic to only exclude what can be safely
excluded.
@pcmanus pcmanus force-pushed the fix-faulty-optimization branch from eaf5c0c to c0ab4b1 Compare June 9, 2023 14:20
Copy link
Contributor

@clenfest clenfest left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just as a general note, it would be great if we could get to the point where we have unit tests for each of these individual functions we're creating.

*
* Please note that this method assumes that the 2 paths have the same root, and will fail if that's not the case.
*/
findBiggestCommonPrefixAndCountMainSubgraphJumpsAfterIt(that: GraphPath<TTrigger, RV, TNullEdge>): {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a mouthful. How about something like countSubgraphJumpsAfterCommon or something?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Renamed to countSubgraphJumpsAfterLastCommonVertex to match the point below.

};
}

private findBiggestCommonPrefixEndIndex(that: GraphPath<TTrigger, RV, TNullEdge>): { vertex: Vertex, index: number } {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe findLastCommonVertexIndex instead?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, make sense, renamed to findLastCommonVertex (we return both the vertex and index, so figure the Index part was unnecessary).

// must be equal again. So check that it's true, and if it is, we're good.
// Note that for `this`, the last edge we looked at was `i`, so the next is `i+1`. And
// for `that`, we've skipped over one more edge, so need to use `j+1`.
for (let j = i + 1; j < this.size; j++) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's a little ugly to have a loop inside the previous loop that is really just a continuation of the first loop. Would be great if you could move this outside the enclosing for loop.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that one might be a personal preference: as it is a continuation in only a specific case, moving that inner loop out means some sort of state tracking that I don't find more obvious.

@pcmanus pcmanus merged commit 2a97f37 into apollographql:main Jun 13, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants