Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

physicalplan: add support for multi-stage execution of aggregate functions #58347

Closed
15 tasks done
yuzefovich opened this issue Dec 29, 2020 · 15 comments · Fixed by #76754
Closed
15 tasks done

physicalplan: add support for multi-stage execution of aggregate functions #58347

yuzefovich opened this issue Dec 29, 2020 · 15 comments · Fixed by #76754
Labels
C-enhancement Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception) good first issue meta-issue Contains a list of several other issues. T-sql-queries SQL Queries Team

Comments

@yuzefovich
Copy link
Member

yuzefovich commented Dec 29, 2020

Most aggregate functions support multi-stage distributed execution in which on every node we perform a local aggregation, then we route all the intermediate results to all of the nodes (according to the grouping columns), and then we perform a final aggregation that gives us the result. For some aggregate functions the decomposition into local and final stages is trivial - e.g. in order to calculate sum we need to perform local sum and then the final sum; for other functions it is a bit more involved - e.g. in order to compute avg we calculate partial sum and count locally, then we finalize sum and count at the final stage and we render the result as final_sum / final_count.

(I believe Postgres calls the multi-stage execution a partial mode.)

The decomposition of supported functions is defined here

var DistAggregationTable = map[execinfrapb.AggregatorSpec_Func]DistAggregationInfo{

We are currently missing the following functions:

  • sqrdiff (I think it should be possible to decompose it)
  • corr
  • var_pop
  • stddev_pop
  • covar_pop
  • covar_samp
  • regr_intercept
  • regr_r2
  • regr_slope
  • regr_sxx
  • regr_syy
  • regr_sxy
  • regr_count
  • regr_avgx
  • regr_avgy

avg and variance are good examples of how more complicated decompositions are handled.

cc @mneverov in case you're interested

Jira issue: CRDB-3407

@yuzefovich yuzefovich added good first issue meta-issue Contains a list of several other issues. labels Dec 29, 2020
@blathers-crl blathers-crl bot added the C-enhancement Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception) label Dec 29, 2020
@cockroachdb cockroachdb deleted a comment from blathers-crl bot Dec 29, 2020
@mneverov
Copy link
Contributor

@yuzefovich, definitely, that would be awesome to work on this, thanks!
Although, from the conversation:

This doesn't look right. I think this entire thing only makes sense for single-arg operators anyway,

I got the impression that it is impossible to have merge operation for multi-arg operators.

I'll take a look next year, so if anyone can start working immediately - please go ahead.

@yuzefovich
Copy link
Member Author

I think that conversation is not directly applicable to this issue since here we are deciding on how to execute a particular logical plan (meaning after the optimizer has already chosen the plan) whereas that convo was about some "transformations" of the logical plan in order to get an equivalent one that has supposedly less cost.

Possibly, my usage of a word "decomposition" is a bit misleading since AggregatesCanMerge also talks about "decomposition".

@mneverov
Copy link
Contributor

Ok, need to dig into the code :).

@piao88
Copy link

piao88 commented Dec 30, 2020

Hi @yuzefovich
I'm new to the community and would like to pick this up.

@mneverov
Copy link
Contributor

mneverov commented Jan 5, 2021

hi @piao88, have you already started working on this? May I take it?

@piao88
Copy link

piao88 commented Jan 8, 2021

hi @mneverov , I have already started.

@yuzefovich
Copy link
Member Author

@piao88 @mneverov I think this issue is big enough for both of you to work on it, just make sure to communicate which functions you are working on.

@piao88
Copy link

piao88 commented Jan 11, 2021

hi @mneverov @yuzefovich I have already finished var_pop and stddev_pop

@henrjan
Copy link

henrjan commented Mar 17, 2021

hi, is this issue still ongoing? can i work on this?

@yuzefovich
Copy link
Member Author

Hi @henrjan, thanks for the interest! Let's check in with @piao88 who have already opened up a PR (#59174) that addresses this issue.

@piao88 are you still interested in pushing #59174 forward or would you be ok if someone took over this issue and possibly your PR?

@piao88
Copy link

piao88 commented Mar 18, 2021

Hi@henrjan ,Of course you can .

@alimoosavi
Copy link

@yuzefovich
I am new to cockroachdb project and this issue seems attractive to me , please assign this issue to me.

@yuzefovich
Copy link
Member Author

@mneverov are you still interested in working on this issue? Looks like @piao88 made some progress in #59174, but that PR seems abandoned, so I'd assign the issue to @mneverov if they are still interested.

@piao88
Copy link

piao88 commented Nov 19, 2021

@yuzefovich Sorry, I am no longer interested in this issue, you can give it to someone else.

@mneverov
Copy link
Contributor

thanks @yuzefovich , i'll take a look

mneverov added a commit to mneverov/cockroach that referenced this issue Nov 22, 2021
See cockroachdb#58347.

Release note (performance improvement): `covar_pop` aggregate function is now
evaluated more efficiently in a distributed setting
mneverov added a commit to mneverov/cockroach that referenced this issue Jan 18, 2022
See cockroachdb#58347.

Release note (performance improvement): `covar_pop` aggregate function is now
evaluated more efficiently in a distributed setting.
mneverov added a commit to mneverov/cockroach that referenced this issue Jan 19, 2022
See cockroachdb#58347.

Release note (performance improvement): `covar_pop` aggregate function is now
evaluated more efficiently in a distributed setting.
craig bot pushed a commit that referenced this issue Jan 19, 2022
73062: physicalplan: add support for multi-stage execution of covar_pop. r=yuzefovich a=mneverov

physicalplan: add support for multi-stage execution of covar_pop.

See #58347.

Release note (performance improvement): `covar_pop` aggregate function is now
evaluated more efficiently in a distributed setting.

75148: sql/rowenc: avoid panic when encoding delete preserving index r=rhu713 a=stevendanna

It appears, we can occasionally be asked to encode empty KVs. Notably,
this seems to happen along the delete path when multiple column
families are in play.

The included test case mirrors the case that caught this bug on an
in-progress branch. Namely, a "temporary" index used by the new index
backfiller during a primary key change.

Release note: None

75160: build: add utility function to merge test xml's r=rail a=rickystewart

The `stress` utility runs tests many times to accumulate results, so
we'd like some code to merge multiple `test.xml` files into a single
one.

Release note: None

75165: kv: fix GenerateForcedRetryableError to return a bumped epoch r=lidorcarmel a=lidorcarmel

This is needed for PR #74563, where we change how txn is reset.
Without this change GenerateForcedRetryableError returns an
error with an inner txn that has an epoch 0. With this change
the epoch is copied from the original txn.

Release note: None

75169: vendor: pull in latest version of `stress` r=rail a=rickystewart

Pull in the latest version of `stress` including these changes:

```
43d99a9 Merge pull request #13 from cockroachdb/bazelsharding
01690a1 stress: add `-bazel` support, support for sharding artifacts
```

Release note: None

Co-authored-by: Max Neverov <[email protected]>
Co-authored-by: Steven Danna <[email protected]>
Co-authored-by: Ricky Stewart <[email protected]>
Co-authored-by: Lidor Carmel <[email protected]>
craig bot pushed a commit that referenced this issue Jan 20, 2022
73062: physicalplan: add support for multi-stage execution of covar_pop. r=yuzefovich a=mneverov

physicalplan: add support for multi-stage execution of covar_pop.

See #58347.

Release note (performance improvement): `covar_pop` aggregate function is now
evaluated more efficiently in a distributed setting.

Co-authored-by: Max Neverov <[email protected]>
gtr pushed a commit to gtr/cockroach that referenced this issue Jan 24, 2022
See cockroachdb#58347.

Release note (performance improvement): `covar_pop` aggregate function is now
evaluated more efficiently in a distributed setting.
mneverov added a commit to mneverov/cockroach that referenced this issue Jan 27, 2022
…_sxy,

regr_syy.

See cockroachdb#58347.

Release note (performance improvement): regr_sxx, regr_sxy, regr_syy aggregate
functions are now evaluated more efficiently in a distributed setting.
craig bot pushed a commit that referenced this issue Jan 27, 2022
75619: physicalplan: multi-stage for regr_sxx, regr_sxy, regr_syy r=yuzefovich a=mneverov

physicalplan: add support for multi-stage execution of regr_sxx, regr_sxy, regr_syy.

See #58347.

Release note (performance improvement): regr_sxx, regr_sxy, regr_syy aggregate
functions are now evaluated more efficiently in a distributed setting.

75626: sql: skip TestDistSQLFlowsVirtualTables r=lidorcarmel a=lidorcarmel

Refs: #75208

Reason: flaky test

Generated by bin/skip-test.

Release justification: non-production code changes

Release note: None

75629: batcheval: ignore `MVCCStats.LastUpdateNanos` in `AddSSTable` assertions r=dt a=erikgrinaker

Release note: None

Co-authored-by: Max Neverov <[email protected]>
Co-authored-by: Lidor Carmel <[email protected]>
Co-authored-by: Erik Grinaker <[email protected]>
mneverov added a commit to mneverov/cockroach that referenced this issue Feb 4, 2022
…r_avgy.

See cockroachdb#58347.

Release note (performance improvement): regr_avgx and regr_avgy aggregate
functions are now evaluated more efficiently in a distributed setting
mneverov added a commit to mneverov/cockroach that referenced this issue Feb 9, 2022
…r_avgy,

regr_intercept, regr_r2, and regr_slope aggregate functions.

See cockroachdb#58347.

Release note (performance improvement): regr_avgx, regr_avgy, regr_intercept,
regr_r2, and regr_slope aggregate functions are now evaluated more efficiently
in a distributed setting
mneverov added a commit to mneverov/cockroach that referenced this issue Feb 9, 2022
…r_avgy,

regr_intercept, regr_r2, and regr_slope aggregate functions.

See cockroachdb#58347.

Release note (performance improvement): regr_avgx, regr_avgy, regr_intercept,
regr_r2, and regr_slope aggregate functions are now evaluated more efficiently
in a distributed setting
craig bot pushed a commit that referenced this issue Feb 9, 2022
75870: rfc: pgwire-compatible query cancellation r=otan,knz a=rafiss

refs #41335

Release note: None

76007: physicalplan: add support for multi-stage execution of regr_avgx, regr_avgy, regr_intercept, regr_r2, and regr_slope aggregate functions. r=yuzefovich a=mneverov

physicalplan: add support for multi-stage execution of regr_avgx, regr_avgy,
regr_intercept, regr_r2, and regr_slope aggregate functions.

See #58347.

Release note (performance improvement): regr_avgx, regr_avgy, regr_intercept,
regr_r2, and regr_slope aggregate functions are now evaluated more efficiently
in a distributed setting

76115: sql: use 16 as default bucket count for hash index r=chengxiong-ruan a=chengxiong-ruan

for some reason I forgot to modify it in my previous pr :(

Release note (sql change): 16 is used as the default bucket count
for hash sharded index.

76262: server: remove DeprecateBaseEncryptionRegistry r=jbowens a=jbowens

Remove the migration server DeprecateBaseEncryptionRegistry RPC. It was
used for the migration of the encryption-at-rest registry format in
21.2 and is no longer relevant.

Missed this in #74314.

Release note: None

Co-authored-by: Rafi Shamim <[email protected]>
Co-authored-by: Max Neverov <[email protected]>
Co-authored-by: Chengxiong Ruan <[email protected]>
Co-authored-by: Jackson Owens <[email protected]>
mneverov added a commit to mneverov/cockroach that referenced this issue Feb 17, 2022
sqrdiff, and regr_count aggregate functions.

See cockroachdb#58347.

Release note (performance improvement): corr, covar_samp, sqrdiff, and
regr_count aggregate functions are now evaluated more efficiently in a
distributed setting
mneverov added a commit to mneverov/cockroach that referenced this issue Feb 17, 2022
sqrdiff, and regr_count aggregate functions.

See cockroachdb#58347.

Release note (performance improvement): corr, covar_samp, sqrdiff, and
regr_count aggregate functions are now evaluated more efficiently in a
distributed setting
mneverov added a commit to mneverov/cockroach that referenced this issue Feb 18, 2022
sqrdiff, and regr_count aggregate functions.

Fixes cockroachdb#58347.

Release note (performance improvement): corr, covar_samp, sqrdiff, and
regr_count aggregate functions are now evaluated more efficiently in a
distributed setting
mneverov added a commit to mneverov/cockroach that referenced this issue Feb 24, 2022
sqrdiff, and regr_count aggregate functions.

Fixes cockroachdb#58347.

Release note (performance improvement): corr, covar_samp, sqrdiff, and
regr_count aggregate functions are now evaluated more efficiently in a
distributed setting
craig bot pushed a commit that referenced this issue Feb 25, 2022
…77019 #77045 #77047 #77049

72925: sql, cli: support basic auto complete for sql keywords r=rafiss a=RichardJCai

sql: add SHOW COMPLETIONS AT offset FOR syntax 

Release note (sql change): Support
SHOW COMPLETIONS AT OFFSET <offset> FOR <stmt> syntax that
returns a set of SQL keywords that can complete the keyword at
<offset> in the given <stmt>.

If the offset is in the middle of a word, then it returns the
full word.
For example SHOW COMPLETIONS AT OFFSET 1 FOR "SELECT" returns select.

cli: support autocomplete 

Release note (cli change): CLI now auto completes on tab
by using `SHOW COMPLETIONS AT OFFSET`.

76539: cli: Enable profiles and other debug info for tenants r=rimadeodhar a=rimadeodhar

This PR updates debug.zip functionality to
collect goroutine stacks and profiles for all
active SQL instances for a tenant.
This PR also addresses a bug where the nodes.json
and status.json data was not getting populated
correctly due to the switch to the `NodesList` API.
This bug has been addressed by using the p;d
`Nodes` API when the debug zip command is run against
a storage server.

Release note: None

76676: telemetry,sql: remove redaction from operational sql data r=abarganier a=dhartunian

Previously, when redaction was introduced into CRDB, all unidentifiable
strings were marked as redacted since that was the safer approach. We
expected to later return with a closer look and differentiate more
carefully between what should be redacted and what shouldn't.

SQL names have been identified as operational-sensitive data that should
not be redacted since it provides very useful debugging information and,
while user-controlled, do not typically contain user-data since those
would be stored in a Datum. This commit marks names as safe from
redaction for telemetry logging in cases where the
`sql.telemetry.query_sampling.enabled` cluster setting is enabled.

Additionally, some log tags such as client IP addresses are not to be
considered sensitive and are critical to debugging operational issues.
They have also been marked as safe.

In order to help with documenting these cases, a helper
`SafeOperational()` has been added to the `log` package. This helps us
mark strings as safe while documenting *why* we're doing so.

Resolves #76595

Release note (security update, ops change): When the
`sql.telemetry.query_sampling.enabled` cluster setting is enabled, SQL
names and client IPs are no longer redacted in telemetry logs.

76754: physicalplan: add support for multi-stage execution of corr, covar_samp, sqrdiff, and regr_count aggregate functions. r=yuzefovich a=mneverov

Fixes: #58347.

Release note (performance improvement): corr, covar_samp, sqrdiff, and
regr_count aggregate functions are now evaluated more efficiently in a
distributed setting

76908: roachtest: update 22.1 version map to v21.2.6 r=bananabrick a=bananabrick

Release note: None

76948: spanconfigreconciler{ccl}: apply system span config diffs to the store r=arulajmani a=adityamaru

This change teaches the reconciler about system span configs. Concretely,
we make the following changes:

- A full reconciliation when checking for existing span configurations now
asks for SpanConfigs corresponding to the SystemTargets relevant to the tenant.

For the host tenant this includes the SystemTarget for the `entire-keyspace` as
well as the SystemTarget for span configs installed by the host tenant on its
tenant keyspace, and on other secondary tenant keyspaces.

For secondary tenants this only includes the SystemTarget for span configs installed
by it on its own tenant keyspace.

- During incremental reconciliation, before applying our updates to the Store,
we now also check for "missing protected timestamp system targets". These correspond
to protected timestamp records that target a `Cluster` or a `Tenant` but no longer
exist in the system.protected_ts_records table as they have been released by the client.
For every such unique missing system target we apply a spanconfig.Deletion to the Store.

In order to make the above possible, this change moves the ptsStateReader from the
`spanconfigsqltranslator` package, to the top level `spanconfig` package.

Informs: #73727

Release note: None

76990: sql, geo: Fix upper case geohash parsing and allow NULL arguments r=otan a=RichardJCai

Release note (sql change): ST_Box2DFromGeoHash now accepts NULL arguments,
the precision being NULL is the same as no precision being passed in at all.

Upper case characters are now parsed as lower case characters for geohash,
this matches Postgis behaviour.

Resolves #75537

77006: changefeedccl: Fix data race in a test. r=miretskiy a=miretskiy

Fix data race in TestChangefeedSendError.

Release Notes: None

77014: log: Set content type header for http sink r=dhartunian a=rimadeodhar

The content type header for the output of HTTP log sink
is always set to text/plain irrespective of the log format.
If the log format is JSON, we should set the content
type to be application/json.

Release note (bug fix): The content type header for the
HTTP log sink is set to application/json if the format of
the log output is JSON.

77019: sql: update tpcc and tpch stats r=rharding6373 a=rharding6373

This change rewrites the stats for tpcc and tpch to include the new
table statistic avg_size.

Fixes: #72332

Release note: None

77045: bazel: bump `rules_go` to pick up cockroachdb/rules_go#4 r=irfansharif a=rickystewart

Closes #77037.

Release note: None

77047: dev: make sure we inherit `stdout` and `stderr` when appropriate r=postamar,irfansharif a=rickystewart

Otherwise `bazel build`/`test` output will be ugly (won't be colored/
use `ncurses`/etc.)

Release note: None

77049: sql: skip flaky schema_changer/drop_database_cascade test r=postamar a=RichardJCai

Release note: None

Co-authored-by: richardjcai <[email protected]>
Co-authored-by: rimadeodhar <[email protected]>
Co-authored-by: David Hartunian <[email protected]>
Co-authored-by: Max Neverov <[email protected]>
Co-authored-by: Arjun Nair <[email protected]>
Co-authored-by: Aditya Maru <[email protected]>
Co-authored-by: Yevgeniy Miretskiy <[email protected]>
Co-authored-by: rharding6373 <[email protected]>
Co-authored-by: Ricky Stewart <[email protected]>
@craig craig bot closed this as completed in 6fb88ef Feb 25, 2022
RajivTS pushed a commit to RajivTS/cockroach that referenced this issue Mar 6, 2022
…r_avgy,

regr_intercept, regr_r2, and regr_slope aggregate functions.

See cockroachdb#58347.

Release note (performance improvement): regr_avgx, regr_avgy, regr_intercept,
regr_r2, and regr_slope aggregate functions are now evaluated more efficiently
in a distributed setting
RajivTS pushed a commit to RajivTS/cockroach that referenced this issue Mar 6, 2022
sqrdiff, and regr_count aggregate functions.

Fixes cockroachdb#58347.

Release note (performance improvement): corr, covar_samp, sqrdiff, and
regr_count aggregate functions are now evaluated more efficiently in a
distributed setting
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
C-enhancement Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception) good first issue meta-issue Contains a list of several other issues. T-sql-queries SQL Queries Team
Projects
None yet
7 participants
@jlinder @mneverov @yuzefovich @alimoosavi @piao88 @henrjan and others