Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Query: Provide a way to translate aggregate methods on top level #28102

Merged
1 commit merged into from
Jun 14, 2022
Merged

Conversation

smitpatel
Copy link
Member

To translate an aggregate operation on top level, provider would get a SelectExpression which would have facets applied and the method call with additional arguments. They should wrap the selector into EnumerableExpression (also add other facets from SelectExpression if needed), wrap into AsQueryable (if queryable method else optional) and the aggregate method over it and use SqlTranslator to translate it.
Also exposed TranslateExpression & TranslateLambdaExpression in QueryableMethodTranslator for relational

Resolves #28097

@smitpatel smitpatel requested a review from a team May 25, 2022 21:35
@ghost
Copy link

ghost commented May 25, 2022

Hello @smitpatel!

Because this pull request has the auto-merge label, I will be glad to assist with helping to merge this pull request once all check-in policies pass.

p.s. you can customize the way I help with merging this pull request, such as holding this pull request until a specific person approves. Simply @mention me (@msftbot) and give me an instruction to get started! Learn more here.

@smitpatel smitpatel force-pushed the smit/abc branch 2 times, most recently from d6406ea to 7c070a8 Compare May 25, 2022 21:53
@ghost
Copy link

ghost commented May 25, 2022

Apologies, while this PR appears ready to be merged, I've been configured to only merge when all checks have explicitly passed. The following integrations have not reported any progress on their checks and are blocking auto-merge:

  1. Azure Pipelines

These integrations are possibly never going to report a check, and unblocking auto-merge likely requires a human being to update my configuration to exempt these integrations from requiring a passing check.

Give feedback on this
From the bot dev team

We've tried to tune the bot such that it posts a comment like this only when auto-merge is blocked for exceptional, non-intuitive reasons. When the bot's auto-merge capability is properly configured, auto-merge should operate as you would intuitively expect and you should not see any spurious comments.

Please reach out to us at [email protected] to provide feedback if you believe you're seeing this comment appear spuriously. Please note that we usually are unable to update your bot configuration on your team's behalf, but we're happy to help you identify your bot admin.

@smitpatel
Copy link
Member Author

/azp run

@azure-pipelines
Copy link

Azure Pipelines successfully started running 1 pipeline(s).

@smitpatel
Copy link
Member Author

/azp run

@azure-pipelines
Copy link

Azure Pipelines successfully started running 1 pipeline(s).

@ghost
Copy link

ghost commented May 29, 2022

Apologies, while this PR appears ready to be merged, I've been configured to only merge when all checks have explicitly passed. The following integrations have not reported any progress on their checks and are blocking auto-merge:

  1. Azure Pipelines

These integrations are possibly never going to report a check, and unblocking auto-merge likely requires a human being to update my configuration to exempt these integrations from requiring a passing check.

Give feedback on this
From the bot dev team

We've tried to tune the bot such that it posts a comment like this only when auto-merge is blocked for exceptional, non-intuitive reasons. When the bot's auto-merge capability is properly configured, auto-merge should operate as you would intuitively expect and you should not see any spurious comments.

Please reach out to us at [email protected] to provide feedback if you believe you're seeing this comment appear spuriously. Please note that we usually are unable to update your bot configuration on your team's behalf, but we're happy to help you identify your bot admin.

@roji
Copy link
Member

roji commented May 29, 2022

@smitpatel to be sure I understand...

I'm still seeing special logic for Count/Max/... in RelationalQueryableMethodTranslatingEV. If the idea is to implement top-level aggregates as above, should these "built-in" ones be re-implemented in the same way? Or conversely, could a provider simply do the same thing for custom aggregate as is currently done for Count/Max (i.e. implement additional methods in their implementation of QueryableMethodTranslatingEV)? Just because ideally we wouldn't special-case some aggregate operators because they happen to be implemented in relational.

Also, ideally a provider wouldn't need to do anything beyond adding an aggregate method call translator, and that would work both on top-level and 2nd-level - but I guess that's not possible?

@smitpatel
Copy link
Member Author

Also, ideally a provider wouldn't need to do anything beyond adding an aggregate method call translator, and that would work both on top-level and 2nd-level - but I guess that's not possible?

That is not possible. Basically something like collection.Where(blah).Select(blah).Aggregate() could translate to 2 different things based on what is collection.
When it is grouping element then it would be AGGREGATE(CASE WHEN predicate THEN selector END) which would go in select expression which has group by but when it is top level (so db set or equivalent) then it would translate to following

SELECT AGGREGATE(selector)
FROM collection
WHERE predicate

Now we can certainly condense everything into projection element like group by case but probably that could surprise some users.

Other blocking side is that we don't know if certain provider defined aggregate method extension on IQueryable (top-level one) is an aggregate function and we need to call into aggregatemethodcalltranslatorprovider. Provider defined queryable methods are opaque to us. So provider would need to intercept that either way, (even if we provide a function to just call into aggregate translation pipeline).

3rd point which makes them separate is client side materialization. As in case with Min/Max, it throws exception when sequence is empty. We don't care about that when translating on group by case because it would be in subquery scenario where we don't throw such exceptions but on top-level we align behavior with LINQ (because we can) and add additional client side processing code to throw exception (when result of aggregate from server is null). This requires to generate a custom shaper which again requires intercepting the method, use aggregate pipeline to get the sql for translation but process the shaper on top of it.

@roji
Copy link
Member

roji commented Jun 2, 2022

Basically something like collection.Where(blah).Select(blah).Aggregate() could translate to 2 different things based on what is collection.
[...]
Now we can certainly condense everything into projection element like group by case but probably that could surprise some users.

Makes sense, missed that. Maybe the top-level infra logic in QueryableMethodTranslatingEV could take care of extracting the predicate before handing it off to the aggregate translator, and integrating it into the SelectExpression predicate instead (same with distinct/ordering, and maybe even skip/take, since we can actually do it for top-level)?

3rd point which makes them separate is client side materialization. [...]

Interesting too. That's going to be the case for some functions (e.g. Min/Max) and not for others: string.Join can probably work fine without any materialization tweaking (return empty string), same for standard deviation (which can just return nullable double as in https://github.com/dotnet/efcore/pull/28110/files#diff-f820f0461f6d16b0941352910f8b9c1e7a12ca57725b7b3168c1a8cbc1591ca6R1497 to correspond directly to the server-side function). So obviously cases where materialization tweaks are needed would need to do special stuff, but ideally those which don't wouldn't.

But I understand the point about the IQueryable (top-level) and IEnumerable (non-top-level) variations. Assuming what I wrote above is possible (extract predice/distinct/ordering in QueryableMethodTranslatingEV infra and apply to SelectExpression), maybe QueryableMethodTranslatingEV could end up calling aggregate translators in VisitMethodCall, and the translators would be responsible for identifying both versions (IQueryable and IEnumerable).

But I don't have the full picture, you know best.

@roji roji removed the auto-merge label Jun 14, 2022
Copy link
Member

@roji roji left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks OK @smitpatel, see comments.

For my understanding... A provider-specific top-level aggregate translation would need to duplicate some logic from TranslateAggregateWithSelector, right? Like the thrownWhenEmpty wouldn't (typically) be needed, but a lot of the method would be need to be duplicated to support a function, whether with or without a selector, right (for the selector, e.g. context.Products.StandardDeviation(p => p.UnitPrice) as in #28104 (comment))?

I guess it's OK to do it like this (definitely as a start).. As we discussed, ideally if we could provide a bit more support to providers so they didn't have to duplicate this stuff, that would be great. We can revisit this later too.

@@ -2051,7 +2051,7 @@ public override async Task GroupJoin_in_subquery_with_client_projection_nested1(
WHERE (
SELECT COUNT(*)
FROM (
SELECT TOP(10) [l0].[Id], [l0].[Date], [l0].[Name], [l0].[OneToMany_Optional_Self_Inverse1Id], [l0].[OneToMany_Required_Self_Inverse1Id], [l0].[OneToOne_Optional_Self1Id], [l1].[Id] AS [Id0], [l1].[Date] AS [Date0], [l1].[Level1_Optional_Id], [l1].[Level1_Required_Id], [l1].[Name] AS [Name0], [l1].[OneToMany_Optional_Inverse2Id], [l1].[OneToMany_Optional_Self_Inverse2Id], [l1].[OneToMany_Required_Inverse2Id], [l1].[OneToMany_Required_Self_Inverse2Id], [l1].[OneToOne_Optional_PK_Inverse2Id], [l1].[OneToOne_Optional_Self2Id]
SELECT TOP(10) [l0].[Id], [l1].[Id] AS [Id0]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice SQL simplification!

@smitpatel
Copy link
Member Author

Ideally, yes. But coming up with exact form of abstraction to provide is somewhat hard. Like for a fact we know that throwIfEmpty may not make sense for a lot of them. We don't know if custom ones will need a customized shaper or may need to look at the SelectExpression being translated and lift certain parts of it into the aggregate operator rather than leaving in SelectExpression. There are various unknowns in the process. So to avoid crystal balling and future breaks, on very conservative side, we made public what is minimum required to work so no one is blocked. Once multiple providers have methods and we can figure out abstraction to provide in relational we can update the code.

@roji
Copy link
Member

roji commented Jun 14, 2022

OK @smitpatel, am OK to merge. We can revisit in the future if we want to improve here.

@ghost
Copy link

ghost commented Jun 14, 2022

Apologies, while this PR appears ready to be merged, I've been configured to only merge when all checks have explicitly passed. The following integrations have not reported any progress on their checks and are blocking auto-merge:

  1. Azure Pipelines

These integrations are possibly never going to report a check, and unblocking auto-merge likely requires a human being to update my configuration to exempt these integrations from requiring a passing check.

Give feedback on this
From the bot dev team

We've tried to tune the bot such that it posts a comment like this only when auto-merge is blocked for exceptional, non-intuitive reasons. When the bot's auto-merge capability is properly configured, auto-merge should operate as you would intuitively expect and you should not see any spurious comments.

Please reach out to us at [email protected] to provide feedback if you believe you're seeing this comment appear spuriously. Please note that we usually are unable to update your bot configuration on your team's behalf, but we're happy to help you identify your bot admin.

To translate an aggregate operation on top level, provider would get a SelectExpression which would have facets applied and the method call with additional arguments. They should wrap the selector into EnumerableExpression (also add other facets from SelectExpression if needed), wrap into AsQueryable (if queryable method else optional) and the aggregate method over it and use SqlTranslator to translate it.
Also exposed TranslateExpression & TranslateLambdaExpression in QueryableMethodTranslator for relational

Resolves #28097
@ghost
Copy link

ghost commented Jun 14, 2022

Apologies, while this PR appears ready to be merged, I've been configured to only merge when all checks have explicitly passed. The following integrations have not reported any progress on their checks and are blocking auto-merge:

  1. Azure Pipelines

These integrations are possibly never going to report a check, and unblocking auto-merge likely requires a human being to update my configuration to exempt these integrations from requiring a passing check.

Give feedback on this
From the bot dev team

We've tried to tune the bot such that it posts a comment like this only when auto-merge is blocked for exceptional, non-intuitive reasons. When the bot's auto-merge capability is properly configured, auto-merge should operate as you would intuitively expect and you should not see any spurious comments.

Please reach out to us at [email protected] to provide feedback if you believe you're seeing this comment appear spuriously. Please note that we usually are unable to update your bot configuration on your team's behalf, but we're happy to help you identify your bot admin.

This pull request was closed.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Query: Provide a good story for calling into aggregate method call translator from QueryableMethodTranslator
2 participants