planner: rewrite count star to count not null column #39197

elsa0520 · 2022-11-16T10:40:39Z

What problem does this PR solve?

Issue Number: close #37165

Problem Summary:

What is changed and how it works?

The main purpose of this PR is to speed up count(*). At the planning layer, count(*) is rewritten to count(the narrowest non-null column).

Added a countStarRewriter logical rule responsible for rewriting count(*) to count(the narrowest non-null column).
When the datasource has no columns left, column pruning now supplies the narrowest non-null column instead of the row ID column (in the case of TiFlash).

For the detailed rewriting steps, see the comments on countStarRewriter.
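
The rewrite described above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: the `Column` type, its `Width` field, and the functions `narrowestNotNull` and `rewriteCountStar` are hypothetical names made up for the example. The column has to be NOT NULL because count(col) skips NULL values, so only a non-null column is guaranteed to give the same result as count(*).

```go
package main

import "fmt"

// Column is a hypothetical, simplified stand-in for a table column.
// It is not TiDB's real column type.
type Column struct {
	Name    string
	NotNull bool
	Width   int // assumed storage width in bytes
}

// narrowestNotNull returns the NOT NULL column with the smallest width.
// The boolean is false when no NOT NULL column exists.
func narrowestNotNull(cols []Column) (Column, bool) {
	var best Column
	found := false
	for _, c := range cols {
		if !c.NotNull {
			continue // count(col) would skip NULLs, so nullable columns are unsafe
		}
		if !found || c.Width < best.Width {
			best, found = c, true
		}
	}
	return best, found
}

// rewriteCountStar returns the aggregate expression to use in place of count(*).
// It falls back to count(*) when there is no suitable column.
func rewriteCountStar(cols []Column) string {
	if c, ok := narrowestNotNull(cols); ok {
		return fmt.Sprintf("count(%s)", c.Name)
	}
	return "count(*)"
}

func main() {
	cols := []Column{
		{Name: "id", NotNull: true, Width: 8},
		{Name: "k1", NotNull: true, Width: 1},
		{Name: "note", NotNull: false, Width: 1},
	}
	fmt.Println(rewriteCountStar(cols))
}
```

In this sketch a tie on width keeps the first candidate; how the real rule breaks ties is not stated in the description.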

Check List

Tests

Unit test
Integration test
Manual test (add detailed scripts or steps below)
No code

Side effects

Performance regression: Consumes more CPU
Performance regression: Consumes more Memory
Breaking backward compatibility

Documentation

Release note

Please refer to Release Notes Language Style Guide to write a quality release note.

None

ti-chi-bot · 2022-11-16T10:40:41Z

[REVIEW NOTIFICATION]

This pull request has been approved by:

AilinKid
winoros

To complete the pull request process, please ask the reviewers in the list to review by filling /cc @reviewer in the comment.
After your PR has acquired the required number of LGTMs, you can assign this pull request to the committer in the list by filling /assign @committer in the comment to help you merge this pull request.

The full list of commands accepted by this bot can be found here.

Reviewer can indicate their review by submitting an approval review.
Reviewer can cancel approval by submitting a request changes review.

AilinKid

Rest LGTM

planner/core/rule_count_star_rewriter.go

planner/core/physical_plan_test.go

parser/types/field_type.go

planner/core/rule_column_pruning.go

planner/core/testdata/plan_suite_in.json

elsa0520 · 2022-11-28T09:04:30Z

/run-build

elsa0520 · 2022-11-28T09:30:18Z

/run-check_dev

elsa0520 · 2022-11-28T09:37:03Z

/run-check_dev_2

elsa0520 · 2022-11-28T09:37:29Z

/run-mysql-test

elsa0520 · 2022-11-28T09:37:42Z

/run-unit-test

elsa0520 · 2022-11-28T10:09:54Z

/run-all-tests

wuhuizuo · 2022-11-28T10:20:54Z

/run-all-tests

elsa0520 · 2022-11-28T12:07:27Z

/run-mysql-test tidb-test=2026

elsa0520 · 2022-11-28T12:19:41Z

/run-mysql-test tidb-test=2026

elsa0520 · 2022-11-28T12:22:50Z

/run-mysql-test tidb-test=pr/2026

elsa0520 · 2022-11-28T12:31:32Z

/run-mysql-test tidb-test=pr/2026

elsa0520 · 2022-11-28T12:39:30Z

/run-mysql-test tidb-test=pr/2026

elsa0520 · 2022-11-28T13:51:38Z

/run-mysql-test tidb-test=pr/2026

elsa0520 · 2022-11-28T15:09:54Z

/run-mysql-test tidb-test=pr/2026

winoros · 2022-11-28T15:28:45Z

/merge

ti-chi-bot · 2022-11-28T15:28:51Z

This pull request has been accepted and is ready to merge.

Commit hash: 83dd8a3

elsa0520 · 2022-11-28T15:47:18Z

/run-mysql-test tidb-test=pr/2026

sre-bot · 2022-11-28T16:24:46Z

TiDB MergeCI notify

🔴 Bad News! New failing [2] after this pr merged.
These new failed integration tests seem to be caused by the current PR, please try to fix these new failed integration tests, thanks!

CI Name	Result	Duration	Compare with Parent commit
idc-jenkins-ci-tidb/integration-common-test	🟥 failed 8, success 9, total 17	17 min	New failing
idc-jenkins-ci-tidb/common-test	🟥 failed 2, success 9, total 11	10 min	New failing
idc-jenkins-ci/integration-cdc-test	🟢 all 40 tests passed	21 min	Existing passed
idc-jenkins-ci-tidb/integration-ddl-test	🟢 all 6 tests passed	12 min	Existing passed
idc-jenkins-ci-tidb/mybatis-test	🟢 all 1 tests passed	6 min 15 sec	Existing passed
idc-jenkins-ci-tidb/sqllogic-test-1	🟢 all 26 tests passed	6 min 1 sec	Existing passed
idc-jenkins-ci-tidb/tics-test	🟢 all 1 tests passed	5 min 54 sec	Existing passed
idc-jenkins-ci-tidb/sqllogic-test-2	🟢 all 28 tests passed	4 min 44 sec	Existing passed
idc-jenkins-ci-tidb/integration-compatibility-test	🟢 all 1 tests passed	3 min 21 sec	Existing passed
idc-jenkins-ci-tidb/plugin-test	🟢 build success, plugin test success	4min	Existing passed

Lloyd-Pottiger · 2022-11-29T01:57:56Z

planner/core/rule_count_star_rewriter.go

+Case2 there is no columns from datasource
+Query: select count(*) from table
+ColumnPruningRule: pick k1 as the narrowest not null column from origin table @Function.preferNotNullColumnFromTable
+                   datasource.columns: k1
+CountStarRewriterRule: rewrite count(*) -> count(k1)
+Rewritten Query: select count(k1) from table


Is there any benchmark for this case? TiFlash keeps three extra columns, <row_id, version, delmark>, which are used for MVCC. When multiple versions of the data exist, every query has to scan <row_id, version, delmark> in addition to the columns it asks for. For count(1) we need to scan <row_id, version, delmark> and return row_id to the aggregation, but for count(k1) we need to scan <row_id, version, delmark, k1> and return k1 to the aggregation. So performance may degrade when TiFlash holds multiple versions of the data and k1 is wide.

elsa0520 · 2022-11-29

I am running performance tests.
Do you mean that with multiple versions, even if count(*) does not select the row_id column, the final read still reads the row_id column?

Lloyd-Pottiger · 2022-11-29

Yes, we need it to filter out rows that belong to other versions.
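
The concern in this thread can be illustrated with a small back-of-the-envelope calculation. This is only a sketch: the byte widths below are assumptions chosen for illustration, not measured TiFlash values, and `mvccCols` and `bytesPerRow` are hypothetical names.

```go
package main

import "fmt"

// mvccCols lists the three hidden columns named in the review comment.
// The widths are illustrative assumptions, not real TiFlash figures.
var mvccCols = map[string]int{"row_id": 8, "version": 8, "delmark": 1}

// bytesPerRow sums the assumed widths of the MVCC columns plus any extra
// columns that the aggregation needs to read.
func bytesPerRow(extra map[string]int) int {
	total := 0
	for _, w := range mvccCols {
		total += w
	}
	for _, w := range extra {
		total += w
	}
	return total
}

func main() {
	// count(1): only the MVCC columns are scanned; row_id feeds the aggregation.
	countOne := bytesPerRow(nil)
	// count(k1): k1 is scanned on top of the MVCC columns (assumed 32 bytes wide).
	countK1 := bytesPerRow(map[string]int{"k1": 32})
	fmt.Printf("count(1): %d bytes/row, count(k1): %d bytes/row\n", countOne, countK1)
}
```

Under these assumptions the extra cost of count(k1) grows with the width of k1, which is why a benchmark with multi-version data is the deciding evidence.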

ti-chi-bot added the release-note-none, do-not-merge/work-in-progress, and size/L labels Nov 16, 2022

elsa0520 force-pushed the count_star branch from 43de4e2 to 8b03667 November 18, 2022 10:04

elsa0520 marked this pull request as ready for review November 24, 2022 02:40

ti-chi-bot removed the do-not-merge/work-in-progress label Nov 24, 2022

qw4990 added the type/enhancement and sig/planner labels Nov 24, 2022

AilinKid reviewed Nov 24, 2022

planner/core/rule_count_star_rewriter.go (resolved)

planner/core/physical_plan_test.go (outdated, resolved)

elsa0520 added 4 commits November 24, 2022 15:06

planner: rewrite count star to count not null column

ea58edc

change the empty columns rule

be601fa

Supply origin column name and remove var type

d63db6b

add ut

7557ab3

elsa0520 force-pushed the count_star branch from 571fa5d to 7557ab3 November 24, 2022 07:06

ti-chi-bot added the size/XL label and removed the size/L label Nov 24, 2022

elsa0520 added 4 commits November 24, 2022 15:11

fix import group

b08485d

fix build and ut error

e96cab4

fix ut

916e0f7

fix ut

6b72996

fixdb reviewed Nov 24, 2022

add ut and make row_id hidden

da695d2

fixdb reviewed Nov 25, 2022

planner/core/rule_count_star_rewriter.go (outdated, resolved)

fixdb reviewed Nov 25, 2022

planner/core/rule_count_star_rewriter.go (outdated, resolved)

fixdb reviewed Nov 25, 2022

planner/core/testdata/plan_suite_in.json (outdated, resolved)

fixdb reviewed Nov 25, 2022

planner/core/physical_plan_test.go (outdated, resolved)

change to count constant rule

4220271

fixdb reviewed Nov 25, 2022

planner/core/testdata/plan_suite_in.json (resolved)

fixdb reviewed Nov 25, 2022

planner/core/testdata/plan_suite_in.json (outdated, resolved)

fix bazel

608b56b

fix switch case

fd677ef

fix ut

83dd8a3

winoros approved these changes Nov 28, 2022

ti-chi-bot added the status/LGT2 label and removed the status/LGT1 label Nov 28, 2022

ti-chi-bot added the status/can-merge label Nov 28, 2022

Merge branch 'master' into count_star

0aa20b4

AilinKid approved these changes Nov 28, 2022

ti-chi-bot merged commit 37bd052 into pingcap:master Nov 28, 2022

Lloyd-Pottiger reviewed Nov 29, 2022

elsa0520 mentioned this pull request Nov 30, 2022

Can't find column when count star rewriter is enable #39506

Closed
