*: only add default value for final aggregation to fix the aggregate push down (partition) union case #35443

Merged
merged 6 commits into from
Jun 28, 2022

tiancaiamao
Contributor

What problem does this PR solve?

Issue Number: close #35295

Problem Summary:

When table t is empty, this query returns an empty result:

select avg(id) from t group by id;

While this one returns NULL:

select avg(id) from t;
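
The difference between the two queries can be sketched in Go. This is only a toy model of the SQL semantics, not TiDB code: `avgScalar` and `avgGroupByID` are hypothetical helpers, and a nil pointer stands in for SQL NULL.

```go
package main

import "fmt"

// avgScalar models `select avg(id) from t`: a scalar aggregate always yields
// exactly one row. A nil pointer stands for SQL NULL (empty input).
func avgScalar(ids []int64) *float64 {
	if len(ids) == 0 {
		return nil
	}
	var sum int64
	for _, id := range ids {
		sum += id
	}
	avg := float64(sum) / float64(len(ids))
	return &avg
}

// avgGroupByID models `select avg(id) from t group by id`: one row per group,
// so an empty input produces zero rows.
func avgGroupByID(ids []int64) []float64 {
	seen := make(map[int64]bool)
	var out []float64
	for _, id := range ids {
		if !seen[id] {
			seen[id] = true
			// AVG over a group of identical ids is the id itself.
			out = append(out, float64(id))
		}
	}
	return out
}

func main() {
	var empty []int64
	fmt.Println(avgScalar(empty) == nil)  // true: one row holding NULL
	fmt.Println(len(avgGroupByID(empty))) // 0: no rows at all
}
```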

We add NULL (default value) to the empty result here:

tidb/executor/aggregate.go

Lines 1368 to 1379 in 2de01b4

if e.childResult.NumRows() == 0 {
	if !e.isChildReturnEmpty {
		err = e.appendResult2Chunk(chk)
	} else if e.defaultVal != nil {
		chk.Append(e.defaultVal, 0, 1)
	}
	e.executed = true
	return err
}
// Reach here, "e.childrenResults[0].NumRows() > 0" is guaranteed.
e.isChildReturnEmpty = false
e.inputRow = e.inputIter.Begin()

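Whether the NULL row is added depends on two conditions: the child has never returned any row (isChildReturnEmpty is still true) and a default value is set (defaultVal != nil).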
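
The branch quoted above can be written out as a small decision function. This is a simplified sketch for illustration only: `onEmptyChild` and the `emptyChildAction` values are hypothetical names, and the two booleans stand in for `e.isChildReturnEmpty` and `e.defaultVal != nil`.

```go
package main

import "fmt"

// emptyChildAction names what the executor does once its child has no more rows.
type emptyChildAction string

const (
	emitAccumulated emptyChildAction = "emit accumulated result"
	emitDefault     emptyChildAction = "emit default (NULL) row"
	emitNothing     emptyChildAction = "emit nothing"
)

// onEmptyChild mirrors the quoted branch: if rows were seen earlier, the
// accumulated group is flushed; otherwise a default row is emitted only when
// a default value was configured for this aggregation.
func onEmptyChild(isChildReturnEmpty, hasDefaultVal bool) emptyChildAction {
	switch {
	case !isChildReturnEmpty:
		return emitAccumulated
	case hasDefaultVal:
		return emitDefault
	default:
		return emitNothing
	}
}

func main() {
	for _, seenNoRows := range []bool{false, true} {
		for _, hasDefault := range []bool{false, true} {
			fmt.Printf("isChildReturnEmpty=%v hasDefault=%v -> %s\n",
				seenNoRows, hasDefault, onEmptyChild(seenNoRows, hasDefault))
		}
	}
}
```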

Note that in this case, the added NULL row makes the final result wrong:

image

What is changed and how it works?

Only add NULL for the final aggregation.
Otherwise the add-NULL-row operation would run several times and send a wrong NULL row to the final aggregation.

  • In the aggregation push-down optimization rule, I set the pushed-down aggregation to non-final
  • In the physical plan, when an aggregation is pushed down to the cop phase, I no longer blindly change the upper aggregation to final/complete, because its parent might be the final one and it might itself be a partial
  • In the executor builder phase, only set the default value for the final aggregation
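
The effect of the rule can be sketched with a toy MAX aggregation over partitions. This is an illustrative model under stated assumptions, not TiDB's executor: `partialMax`, `unionPartials`, `finalMax` and `countNulls` are hypothetical helpers, the `addDefault` flag stands for "a default value is set on a non-final aggregation", and nil stands for SQL NULL.

```go
package main

import "fmt"

// partialMax computes MAX over one partition. With addDefault set it mimics a
// non-final aggregation that carries a default value and therefore emits a
// NULL row for an empty partition.
func partialMax(part []int64, addDefault bool) []*int64 {
	if len(part) == 0 {
		if addDefault {
			return []*int64{nil}
		}
		return nil
	}
	m := part[0]
	for _, v := range part[1:] {
		if v > m {
			m = v
		}
	}
	return []*int64{&m}
}

// unionPartials concatenates the partial results of all partitions.
func unionPartials(parts [][]int64, addDefault bool) []*int64 {
	var out []*int64
	for _, p := range parts {
		out = append(out, partialMax(p, addDefault)...)
	}
	return out
}

// finalMax merges partial rows. Only this stage adds the default NULL row,
// and only when it received no input at all.
func finalMax(rows []*int64) []*int64 {
	if len(rows) == 0 {
		return []*int64{nil}
	}
	var best *int64
	for _, r := range rows {
		if r != nil && (best == nil || *r > *best) {
			best = r
		}
	}
	return []*int64{best}
}

// countNulls counts the NULL rows that reach the final aggregation.
func countNulls(rows []*int64) int {
	n := 0
	for _, r := range rows {
		if r == nil {
			n++
		}
	}
	return n
}

func main() {
	parts := [][]int64{{1, 5}, {}, {}}
	fmt.Println(countNulls(unionPartials(parts, true)))  // 2 spurious NULL rows
	fmt.Println(countNulls(unionPartials(parts, false))) // 0
	fmt.Println(*finalMax(unionPartials(parts, false))[0]) // 5
}
```

With the default value restricted to the final stage, an all-empty table still yields exactly one NULL row, because `finalMax` adds it once when it receives no input.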

Check List

Tests

  • Unit test
  • Integration test
  • Manual test (add detailed scripts or steps below)
  • No code

Side effects

  • Performance regression: Consumes more CPU
  • Performance regression: Consumes more Memory
  • Breaking backward compatibility

Documentation

  • Affects user behaviors
  • Contains syntax changes
  • Contains variable changes
  • Contains experimental features
  • Changes MySQL compatibility

Release note

Please refer to Release Notes Language Style Guide to write a quality release note.

None

@ti-chi-bot
Member

ti-chi-bot commented Jun 16, 2022

[REVIEW NOTIFICATION]

This pull request has been approved by:

  • qw4990
  • winoros

To complete the pull request process, please ask the reviewers in the list to review by filling /cc @reviewer in the comment.
After your PR has acquired the required number of LGTMs, you can assign this pull request to the committer in the list by filling /assign @committer in the comment to help you merge this pull request.

The full list of commands accepted by this bot can be found here.

Reviewer can indicate their review by submitting an approval review.
Reviewer can cancel approval by submitting a request changes review.

@ti-chi-bot ti-chi-bot added do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. release-note-none Denotes a PR that doesn't merit a release note. needs-cherry-pick-4.0 needs-cherry-pick-release-5.0 needs-cherry-pick-release-5.1 needs-cherry-pick-release-5.2 needs-cherry-pick-release-5.3 Type: Need cherry pick to release-5.3 needs-cherry-pick-release-5.4 Should cherry pick this PR to release-5.4 branch. needs-cherry-pick-release-6.1 Should cherry pick this PR to release-6.1 branch. size/M Denotes a PR that changes 30-99 lines, ignoring generated files. labels Jun 16, 2022
@tiancaiamao tiancaiamao marked this pull request as ready for review June 16, 2022 08:03
@ti-chi-bot ti-chi-bot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Jun 16, 2022
@tiancaiamao
Contributor Author

/run-unit-test

@sre-bot
Contributor

sre-bot commented Jun 20, 2022

@tiancaiamao
Contributor Author

PTAL @XuHuaiyu @qw4990

@@ -123,8 +123,6 @@ func (a *AggFuncDesc) Split(ordinal []int) (partialAggDesc, finalAggDesc *AggFun
	partialAggDesc.Mode = Partial1Mode
} else if a.Mode == FinalMode {
	partialAggDesc.Mode = Partial2Mode
} else {
Contributor

Why remove this check?

Contributor Author

Under the previous assumption, no partial-mode aggregation could exist before the coprocessor push-down.

Now that I change the logical-plan aggregation to partial, this line may encounter a partial-mode aggregation during the coprocessor push-down.

planner/core/physical_plans.go
@@ -1417,7 +1421,11 @@ func BuildFinalModeAggregation(
}
}

finalAggFunc.Mode = aggregation.FinalMode
if aggFunc.Mode == aggregation.CompleteMode {
Contributor

I do not understand what this code block means.

Contributor Author

Under the old assumption, the agg mode before the coprocessor push-down was either Complete or Final, so after the push-down the parent agg became Final mode.

Under the new assumption, the agg mode before the coprocessor push-down can also be partial, so the parent agg mode is set to a different value accordingly.

If it was Final before, we set it to Final.
If it was partial, we set it to Partial2.
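
The mapping described in this reply can be sketched as follows. This is a minimal illustration, not the actual `BuildFinalModeAggregation` code: the `AggMode` type is redefined locally with names borrowed from the diff, and `parentModeAfterCopPushDown` is a hypothetical helper.

```go
package main

import "fmt"

// AggMode is a local stand-in for the aggregation modes named in the diff.
type AggMode int

const (
	CompleteMode AggMode = iota
	FinalMode
	Partial1Mode
	Partial2Mode
)

func (m AggMode) String() string {
	return [...]string{"Complete", "Final", "Partial1", "Partial2"}[m]
}

// parentModeAfterCopPushDown returns the mode of the root-side aggregation
// after its lower half has been pushed down to the coprocessor.
// Complete/Final inputs keep producing final results; a partial input must
// stay partial, because some aggregation above it is the real final one.
func parentModeAfterCopPushDown(before AggMode) AggMode {
	switch before {
	case CompleteMode, FinalMode:
		return FinalMode
	default:
		return Partial2Mode
	}
}

func main() {
	for _, m := range []AggMode{CompleteMode, FinalMode, Partial1Mode, Partial2Mode} {
		fmt.Printf("%s -> %s\n", m, parentModeAfterCopPushDown(m))
	}
}
```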

@@ -1417,7 +1421,11 @@ func BuildFinalModeAggregation(
}
}

finalAggFunc.Mode = aggregation.FinalMode
if aggFunc.Mode == aggregation.CompleteMode {
Member

I think you mean that in partition union mode, Agg->PartitionUnion->TableReader may become Agg->PartitionUnion->Agg->TableReader before we do the push-down, so we need a new if-condition check.

You could add some comments here.

Contributor Author

@tiancaiamao tiancaiamao Jun 23, 2022

Agg1->PartitionUnion->Agg2->TableReader

In the coprocessor push-down, Agg2->TableReader becomes Agg3(root)->TableReader->Agg4(cop)->TableScan(cop), and the final plan becomes Agg1->PartitionUnion->Agg3(root)->TableReader->Agg4(cop)->TableScan(cop)

In the past, Agg2 was always Complete or Final, but with this PR Agg2 can be Partial,
so BuildFinalModeAggregation needs to account for the difference.

Contributor Author

I added some comments in the code so that reviewers know what is happening.

@ti-chi-bot ti-chi-bot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. and removed size/M Denotes a PR that changes 30-99 lines, ignoring generated files. labels Jun 23, 2022
@tiancaiamao tiancaiamao requested a review from winoros June 23, 2022 03:38
@ti-chi-bot ti-chi-bot merged commit d99b358 into pingcap:master Jun 28, 2022
ti-srebot pushed a commit to ti-srebot/tidb that referenced this pull request Jun 28, 2022
@ti-srebot
Contributor

cherry pick to release-4.0 in PR #35765

@ti-srebot
Contributor

cherry pick to release-5.0 in PR #35766

@ti-srebot
Contributor

cherry pick to release-5.1 in PR #35767

@ti-srebot
Contributor

cherry pick to release-5.2 in PR #35768

@tiancaiamao tiancaiamao deleted the issue-35295 branch June 28, 2022 04:14
ti-srebot pushed a commit to ti-srebot/tidb that referenced this pull request Jun 28, 2022
@ti-srebot
Contributor

cherry pick to release-5.3 in PR #35769

@ti-srebot
Contributor

cherry pick to release-5.4 in PR #35770

@ti-srebot
Contributor

cherry pick to release-6.0 in PR #35771

@ti-srebot
Contributor

cherry pick to release-6.1 in PR #35772

@sre-bot
Contributor

sre-bot commented Jun 28, 2022

TiDB MergeCI notify

🔴 Bad News! [1] CI still failing after this pr merged.
These failed integration tests don't seem to be introduced by the current PR.

| CI Name | Result | Duration | Compare with Parent commit |
|---|---|---|---|
| idc-jenkins-ci-tidb/integration-common-test | 🔴 failed 2, success 9, total 11 | 21 min | Existing failure |
| idc-jenkins-ci/integration-cdc-test | 🟢 all 35 tests passed | 25 min | Existing passed |
| idc-jenkins-ci-tidb/common-test | 🟢 all 12 tests passed | 11 min | Existing passed |
| idc-jenkins-ci-tidb/integration-ddl-test | 🟢 all 6 tests passed | 8 min 48 sec | Existing passed |
| idc-jenkins-ci-tidb/sqllogic-test-2 | 🟢 all 28 tests passed | 7 min 23 sec | Existing passed |
| idc-jenkins-ci-tidb/sqllogic-test-1 | 🟢 all 26 tests passed | 6 min 54 sec | Existing passed |
| idc-jenkins-ci-tidb/tics-test | 🟢 all 1 tests passed | 6 min 7 sec | Existing passed |
| idc-jenkins-ci-tidb/integration-compatibility-test | 🟢 all 1 tests passed | 4 min 44 sec | Existing passed |
| idc-jenkins-ci-tidb/mybatis-test | 🟢 all 1 tests passed | 3 min 35 sec | Existing passed |
| idc-jenkins-ci-tidb/plugin-test | 🟢 build success, plugin test success | 4 min | Existing passed |

morgo added a commit to morgo/tidb that referenced this pull request Jun 28, 2022
* upstream/master: (57 commits)
  types: fix incompatible implementation of jsonpath extraction (pingcap#35320)
  planner: fix TRACE PLAN TARGET = 'estimation' panic when meeting partition table (pingcap#35743)
  *: Add `testfork.RunTest` to run multiple tests in one function (pingcap#35746)
  sessionctx/variable: add tests to ensure skipInit can be removed (pingcap#35703)
  helper: request another PD if one of them is unavailable (pingcap#35750)
  metrics: add cached table related metrics to grafana panel (pingcap#34718)
  expression: use cloned RetType at `evaluateExprWithNull` when it may be changed. (pingcap#35759)
  executor: fix left join on partition table generate invalid lock key (pingcap#35732)
  readme: remove adopters (pingcap/docs#8725) (pingcap#35124)
  *: only add default value for final aggregation to fix the aggregate push down (partition) union case (pingcap#35443)
  planner: fix the wrong cost formula of MPPExchanger on cost model ver2 (pingcap#35718)
  sessionctx: support encoding and decoding statement context (pingcap#35688)
  txn: refactor ts acquisition within build and execute phases (pingcap#35376)
  ddl: for schema-level DDL method parameter is now XXXStmt (pingcap#35722)
  *: enable gofmt (pingcap#35721)
  planner: disable collate clause support for enum or set column (pingcap#35684)
  *: Provide a util to "pause" session in uint test (pingcap#35529)
  ddl: implement the core for multi-schema change (pingcap#35429)
  parser: XXXDatabaseStmt now use CIStr for DB name (pingcap#35668)
  *: remove real tikv test on github actions (pingcap#35710)
  ...
ti-chi-bot pushed a commit that referenced this pull request Aug 22, 2022
ti-chi-bot pushed a commit that referenced this pull request Sep 14, 2022
Labels
needs-cherry-pick-release-5.0 needs-cherry-pick-release-5.1 needs-cherry-pick-release-5.2 needs-cherry-pick-release-5.3 Type: Need cherry pick to release-5.3 needs-cherry-pick-release-5.4 Should cherry pick this PR to release-5.4 branch. needs-cherry-pick-release-6.1 Should cherry pick this PR to release-6.1 branch. release-note-none Denotes a PR that doesn't merit a release note. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. status/can-merge Indicates a PR has been approved by a committer. status/LGT2 Indicates that a PR has LGTM 2.
Development

Successfully merging this pull request may close these issues.

static pruning mode gets wrong results
8 participants