planner: add heuristic rules for index selection #26304

xuyifangreeneyes · 2021-07-16T10:08:11Z

What problem does this PR solve?

What is changed and how it works?

In (*DataSource).DeriveStats, we first try heuristic rules to pick one path from the possible access paths. Once a path matches a heuristic, we keep only that path in ds.possibleAccessPaths, remove all the others, and add a NOTE-level warning indicating that the path was selected by heuristics.

If the query ranges of a unique index are all points and it needs only a single scan, we use that unique index directly.
If the query ranges of a unique index are all points but it needs a double scan, we collect it into uniqueIdxsWithDoubleScan. Among the indexes in uniqueIdxsWithDoubleScan, we record the one with the fewest ranges as uniqueBest.
If the query ranges of an index are not all points but it needs only a single scan, we collect it into singleScanIdxs. For each index in singleScanIdxs, if it is better than some index in uniqueIdxsWithDoubleScan (that is, there is a subset relation between the two accessCondsColSet sets), we call it a refined index. We select the refined index with the fewest ranges as refinedBest.
Finally, we compare uniqueBest and refinedBest and choose the better one.
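
The rules above can be sketched in a few lines of Go. This is only a simplified illustration, not the code in planner/core/stats.go: the accessPath struct, its fields, and the helpers properSubset and selectByHeuristics are hypothetical names made up for this sketch. Two points are assumptions rather than statements from the description: the "subset relation" is read as "the unique index's access-condition columns are a proper subset of the single-scan index's", and "better" in the last rule is read as "fewer ranges, with ties going to uniqueBest".

```go
package main

import "fmt"

// accessPath is a hypothetical, simplified stand-in for a planner access path.
type accessPath struct {
	Name       string
	Unique     bool            // the path is a unique index
	AllPoints  bool            // every query range is a point range
	SingleScan bool            // no table lookup is needed after the index scan
	NumRanges  int             // number of query ranges
	AccessCols map[string]bool // columns used by the access conditions
}

// properSubset reports whether a is a proper subset of b.
func properSubset(a, b map[string]bool) bool {
	if len(a) >= len(b) {
		return false
	}
	for col := range a {
		if !b[col] {
			return false
		}
	}
	return true
}

// selectByHeuristics returns the path chosen by the rules, or nil if none match.
func selectByHeuristics(paths []*accessPath) *accessPath {
	var uniqueIdxsWithDoubleScan, singleScanIdxs []*accessPath
	var uniqueBest, refinedBest *accessPath
	for _, p := range paths {
		switch {
		case p.Unique && p.AllPoints && p.SingleScan:
			return p // rule 1: point ranges on a unique index, single scan
		case p.Unique && p.AllPoints:
			// rule 2: point ranges on a unique index, double scan
			uniqueIdxsWithDoubleScan = append(uniqueIdxsWithDoubleScan, p)
			if uniqueBest == nil || p.NumRanges < uniqueBest.NumRanges {
				uniqueBest = p
			}
		case !p.AllPoints && p.SingleScan:
			// rule 3 candidate: not all points, single scan
			singleScanIdxs = append(singleScanIdxs, p)
		}
	}
	for _, s := range singleScanIdxs {
		for _, u := range uniqueIdxsWithDoubleScan {
			// Assumption: s is "refined" if it covers strictly more access columns than u.
			if properSubset(u.AccessCols, s.AccessCols) {
				if refinedBest == nil || s.NumRanges < refinedBest.NumRanges {
					refinedBest = s
				}
				break
			}
		}
	}
	// Assumption: fewer ranges wins; a tie keeps uniqueBest.
	if refinedBest != nil && (uniqueBest == nil || refinedBest.NumRanges < uniqueBest.NumRanges) {
		return refinedBest
	}
	return uniqueBest
}

func main() {
	paths := []*accessPath{
		{Name: "idx_b_c", Unique: true, AllPoints: true, NumRanges: 2,
			AccessCols: map[string]bool{"b": true, "c": true}},
		{Name: "idx_b_c_a_d", Unique: true, SingleScan: true, NumRanges: 1,
			AccessCols: map[string]bool{"b": true, "c": true, "a": true}},
	}
	if selected := selectByHeuristics(paths); selected != nil {
		// Mirror the PR: keep only the selected path and note that heuristics chose it.
		paths = []*accessPath{selected}
		fmt.Printf("Note: %s is selected by heuristics\n", selected.Name)
	}
}
```

If no path matches any rule, the sketch returns nil, which corresponds to leaving the candidate paths untouched so the normal cost-based selection can proceed.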

Check List

Tests

Unit test
Integration test
Manual test (add detailed scripts or steps below)
No code

Side effects

Performance regression: Consumes more CPU
Performance regression: Consumes more Memory
Breaking backward compatibility

Documentation

Release note

add heuristic rules for index selection

ti-chi-bot · 2021-07-16T10:08:12Z

[REVIEW NOTIFICATION]

This pull request has been approved by:

qw4990
winoros

To complete the pull request process, please ask the reviewers in the list to review by filling /cc @reviewer in the comment.
After your PR has acquired the required number of LGTMs, you can assign this pull request to the committer in the list by filling /assign @committer in the comment to help you merge this pull request.

The full list of commands accepted by this bot can be found here.

Reviewer can indicate their review by submitting an approval review.
Reviewer can cancel approval by submitting a request changes review.

xuyifangreeneyes · 2021-07-16T10:08:36Z

/sig planner

xuyifangreeneyes · 2021-07-16T10:10:36Z

/cc @winoros

xuyifangreeneyes · 2021-07-21T03:29:50Z

/cc @qw4990 @time-and-fate

ti-chi-bot · 2021-08-02T09:18:30Z

@xuyifangreeneyes: Adding the "do-not-merge/release-note-label-needed" label because no release-note block was detected, please follow our release note process to remove it.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

planner/core/stats.go

qw4990

The function DataSource.DeriveStats is too large. Could we split it into multiple functions and put all the heuristic-rule logic in one of them?

planner/core/stats.go

Co-authored-by: Yuanjia Zhang

qw4990 · 2021-08-04T11:01:06Z

planner/core/integration_test.go

+	tk.MustExec("drop table if exists t1, t2")
+	tk.MustExec("create table t1(a int, b int, c int, d int, e int, f int, g int, primary key (a), unique key c_d_e (c, d, e), unique key f (f), unique key f_g (f, g), key g (g))")
+	tk.MustExec("create table t2(a int, b int, c int, d int, unique index idx_a (a), unique index idx_b_c (b, c), unique index idx_b_c_a_d (b, c, a, d))")


Please add some test cases for clustered indexes.

qw4990

LGTM

winoros · 2021-08-04T12:00:22Z

/merge

ti-chi-bot · 2021-08-04T12:00:24Z

This pull request has been accepted and is ready to merge.

Commit hash: 1eddf71

ti-chi-bot · 2021-08-04T12:00:34Z

@xuyifangreeneyes: Your PR was out of date, I have automatically updated it for you.

At the same time I will also trigger all tests for you:

/run-all-tests

If the CI test fails, you just re-trigger the test that failed and the bot will merge the PR for you after the CI passes.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the ti-community-infra/tichi repository.

xuyifangreeneyes added 6 commits July 14, 2021 16:32

refine index back factor of skyline pruning

d9d8ce2

fix test case

b45895a

enhance isMatchProp

25b39f5

fix ut

6ad7d5c

add test for isMatchProp

d246484

fmt

0685f15

ti-chi-bot added the size/XXL label Jul 16, 2021

ti-chi-bot added the sig/planner label Jul 16, 2021

ti-chi-bot requested a review from winoros July 16, 2021 10:10

add comment

486fdc7

ti-chi-bot requested review from qw4990 and time-and-fate July 21, 2021 03:29

ti-chi-bot added the needs-rebase label Jul 21, 2021

xuyifangreeneyes and others added 5 commits July 23, 2021 18:01

enhance detection of constant columns

7decc45

fix ut & add comment

b2d975c

Merge branch 'master' into improve-skyline-pruning-2

116e60b

minor fix

aadd749

add heuristics in DataSource.DeriveStats

3e73ab9

xuyifangreeneyes force-pushed the add-heuristics branch from 77cce9f to 3e73ab9 Compare July 28, 2021 15:23

ti-chi-bot removed the needs-rebase label Jul 28, 2021

xuyifangreeneyes and others added 5 commits July 29, 2021 13:07

append warning about heuristic index selection

ff458ef

add test for heuristics

c7ab877

add test

4f84651

fmt

ded7ec8

Merge branch 'master' into add-heuristics

26655cd

ti-chi-bot removed the size/XXL label Aug 2, 2021

Merge branch 'master' into add-heuristics

e126e61

ti-chi-bot added the do-not-merge/release-note-label-needed label Aug 2, 2021

zhouqiang-cl removed the do-not-merge/release-note-label-needed label Aug 4, 2021

Merge branch 'master' into add-heuristics

5519965