physicalplan: add support for multi-stage execution of aggregate func… #59174
Piao ZhiHuan seems not to be a GitHub user. You need a GitHub account to be able to sign the CLA. If you already have a GitHub account, please add the email address used for this commit to your account. You have signed the CLA already but the status is still pending? Let us recheck it.
Thank you for contributing to CockroachDB. Please ensure you have followed the guidelines for creating a PR. Before a member of our team reviews your PR, I have some potential action items for you:
I have added a few people who may be able to assist in reviewing:
🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is otan.
This is great, thanks for the contribution! We will, however, need some logictests for all these functions. The logictests are powerful because they run in many configurations, including "local" (where multi-stage won't be planned) and "fakedist" (which fakes a random distribution of table data in each run), so they automatically cross-check the results with multi-stage planning against those without it.
Hi, thanks for your review!
Those kinds of differences are expected, since the result is calculated through a different sequence of operations. The logictests have a tolerance for float results (https://github.com/cockroachdb/cockroach/blob/master/pkg/sql/logictest/logic.go#L2881).
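To make the point about operation order concrete, here is a minimal Go sketch. It is not the logictest implementation: the helper names (sum, twoStageSum, approxEqual) and the tolerance value are assumptions chosen for illustration. It sums the same floats once in a single pass and once as two partial sums, then compares the results with a relative tolerance rather than exact equality.

```go
package main

import (
	"fmt"
	"math"
)

// sum adds the values left to right, as a single-stage aggregation would.
func sum(xs []float64) float64 {
	total := 0.0
	for _, x := range xs {
		total += x
	}
	return total
}

// twoStageSum sums each chunk on its own (the "local" stage) and then
// adds the partial results together (the "final" stage).
func twoStageSum(chunks [][]float64) float64 {
	total := 0.0
	for _, c := range chunks {
		total += sum(c)
	}
	return total
}

// approxEqual reports whether a and b agree within the relative
// tolerance relTol. The tolerance is a hypothetical parameter, not the
// value used by the logictests.
func approxEqual(a, b, relTol float64) bool {
	if a == b {
		return true
	}
	scale := math.Max(math.Abs(a), math.Abs(b))
	return math.Abs(a-b) <= relTol*scale
}

func main() {
	xs := []float64{0.1, 0.2, 0.3, 0.4, 0.5, 0.6}
	single := sum(xs)
	staged := twoStageSum([][]float64{xs[:3], xs[3:]})
	// The two results may differ in the last bits, but they should
	// agree within a small relative tolerance.
	fmt.Println(single, staged, approxEqual(single, staged, 1e-9))
}
```

Comparing with a tolerance is what lets the same expected output serve both the single-stage and the multi-stage plans.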
Thank you for updating your pull request. Before a member of our team reviews your PR, I have some potential action items for you:
🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is otan.
There are a lot of TestDistAggregationTable failures; perhaps the tolerated errors in those tests need to be relaxed a bit (though they've been sufficient so far).
Please bump the Version in execinfra/version.go (and add a note in the comment).
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @piao88 and @yuzefovich)
pkg/sql/logictest/testdata/logic_test/distsql_agg, line 660 at r4 (raw file):
# Test muti-stage aggregate functions for #58347 (add support for multi-stage execution of aggregate functions).
statement ok
CREATE TABLE t (a INT, b INT, c FLOAT, d DECIMAL, PRIMARY KEY (a, b, c, d))
This test would be better in the aggregate file. We don't need the part starting from ALTER TABLE; that file runs under the fakedist configuration, which pretends there is a random multi-node distribution of data each time it is run. By the way, some existing tests in that file now fail.
pkg/sql/sem/builtins/aggregate_builtins.go, line 1957 at r4 (raw file):
}
func newFloatFinalRegressionSyyAggregate(
Why do we need these two variants if they're the same thing?
pkg/sql/sem/builtins/aggregate_builtins.go, line 3590 at r4 (raw file):
}
func newFloatFinalStddevPopAggregate(
Isn't this exactly the same as newDecimalFinalStdDevAggregate? Why not use the existing one? I would review all the new ones and keep only those for which we can't use an existing built-in.
#58347 has been fixed on master.
Needed for #58347
This commit adds support for multi-stage execution of the following aggregate functions:
sqrdiff
stddev_pop
var_pop
regr_count
regr_avgx
regr_avgy
regr_sxx
regr_syy
In addition, it adds the aggregate functions regr_sx (the sum of the independent variable) and regr_sy (the sum of the dependent variable), which are needed to build the multi-stage versions of regr_sxx and regr_syy. The remaining functions (corr, covar_pop, covar_samp, regr_intercept, regr_r2, regr_slope, regr_sxy) are not finished yet.