Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

physicalplan: add support for multi-stage execution of aggregate func… #59174

Closed
wants to merge 4 commits into from

Conversation

piao88
Copy link

@piao88 piao88 commented Jan 20, 2021

Needed for #58347

This commit adds support for muti-stage execution of the following aggregate functions , sqrdiff stddev_pop var_pop regr_count regr_avgx regr_avgy regr_sxx regr_syy. besides , I add the aggregate functions regr_sx( Calculates sum of the independent variable) and regr_sy( Calculates sum of the dependent variable.) in support of building regr_sxx and regr_syy muti-stage function. The rest functions (corr ,covar_pop, covar_samp,regr_intercept,regr_r2,regr_slope,regr_sxy ) are not finished yet.

@piao88 piao88 requested a review from a team as a code owner January 20, 2021 01:39
@cockroach-teamcity
Copy link
Member

cockroach-teamcity commented Jan 20, 2021

CLA assistant check
Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you all sign our Contributor License Agreement before we can accept your contribution.
0 out of 2 committers have signed the CLA.

❌ Piao ZhiHuan
❌ piao88


Piao ZhiHuan seems not to be a GitHub user. You need a GitHub account to be able to sign the CLA. If you have already a GitHub account, please add the email address used for this commit to your account.
You have signed the CLA already but the status is still pending? Let us recheck it.

@blathers-crl
Copy link

blathers-crl bot commented Jan 20, 2021

Thank you for contributing to CockroachDB. Please ensure you have followed the guidelines for creating a PR.

Before a member of our team reviews your PR, I have some potential action items for you:

  • Please ensure your git commit message contains a release note.
  • When CI has completed, please ensure no errors have appeared.

I have added a few people who may be able to assist in reviewing:

🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is otan.

@blathers-crl blathers-crl bot added O-community Originated from the community X-blathers-triaged blathers was able to find an owner labels Jan 20, 2021
@cockroach-teamcity
Copy link
Member

This change is Reviewable

@RaduBerinde
Copy link
Member

This is great, thanks for the contribution! We will, however need some logictests for all these functions. The logictests are powerful because they run in many configurations, including "local" (where multi-stage won't be planned) and "fakedist" (which fakes a random distribution of table data in each run); so they would automatically cross-check the results with multi-stage planning against those without it.

@piao88
Copy link
Author

piao88 commented Jan 25, 2021

hi,thanks for your review !
I find some bugs about decimal precision in this commit . It has a little different between the result of miti-stage aggregate function and "local" function . (e.g. 82499.99999999999999999965 is the result of sqrdiff in "muti-stage" and 82499.99999999999999999985 is the result in "local" ) . I will take some days to fix it .

@RaduBerinde
Copy link
Member

Those kind of differences are expected since the result is calculated through a different sequence of operations. The logictests have a tolerance for float results (https://github.com/cockroachdb/cockroach/blob/master/pkg/sql/logictest/logic.go#L2881).

@blathers-crl
Copy link

blathers-crl bot commented Jan 25, 2021

Thank you for updating your pull request.

Before a member of our team reviews your PR, I have some potential action items for you:

  • We notice you have more than one commit in your PR. We try break logical changes into separate commits, but commits such as "fix typo" or "address review commits" should be squashed into one commit and pushed with --force
  • Please ensure your git commit message contains a release note.
  • When CI has completed, please ensure no errors have appeared.

🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is otan.

@blathers-crl
Copy link

blathers-crl bot commented Jan 26, 2021

Thank you for updating your pull request.

Before a member of our team reviews your PR, I have some potential action items for you:

  • We notice you have more than one commit in your PR. We try break logical changes into separate commits, but commits such as "fix typo" or "address review commits" should be squashed into one commit and pushed with --force
  • Please ensure your git commit message contains a release note.
  • When CI has completed, please ensure no errors have appeared.

🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is otan.

@blathers-crl
Copy link

blathers-crl bot commented Jan 26, 2021

Thank you for updating your pull request.

Before a member of our team reviews your PR, I have some potential action items for you:

  • We notice you have more than one commit in your PR. We try break logical changes into separate commits, but commits such as "fix typo" or "address review commits" should be squashed into one commit and pushed with --force
  • Please ensure your git commit message contains a release note.
  • When CI has completed, please ensure no errors have appeared.

🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is otan.

Copy link
Member

@RaduBerinde RaduBerinde left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are a lot of TestDistAggregationTable failures; perhaps the tolerated errors in those tests need to be relaxed a bit (though they've been sufficient so far).

Please bump the Version in execinfra/version.go (and add a note in the comment).

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @piao88 and @yuzefovich)


pkg/sql/logictest/testdata/logic_test/distsql_agg, line 660 at r4 (raw file):

# Test muti-stage aggregate functions for #58347 (add support for multi-stage execution of aggregate functions).
statement ok
CREATE TABLE t (a INT, b INT, c FLOAT, d DECIMAL, PRIMARY KEY (a, b, c, d))

This test would be better in the aggregate file. We don't need the part starting from ALTER TABLE - that file runs under fakedist configuration which will pretend there is a random multi-node distribution of data each time it is run. By the way, some existing tests in that file now fail.


pkg/sql/sem/builtins/aggregate_builtins.go, line 1957 at r4 (raw file):

}

func newFloatFinalRegressionSyyAggregate(

Why do we need these two variants if they're the same thing?


pkg/sql/sem/builtins/aggregate_builtins.go, line 3590 at r4 (raw file):

}

func newFloatFinalStddevPopAggregate(

Isn't this exactly the same with newDecimalFinalStdDevAggregate? Why not use the existing one? I would review all the new ones and keep only those for which we can't use an existing built-in.

@yuzefovich
Copy link
Member

#58347 has been fixed on master.

@yuzefovich yuzefovich closed this Jul 14, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
O-community Originated from the community X-blathers-triaged blathers was able to find an owner
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants