*: restrict index range mem usage #37754

xuyifangreeneyes · 2022-09-10T10:27:45Z

What problem does this PR solve?

Issue Number: ref #37176

Problem Summary:

What is changed and how it works?

Restrict index range mem usage. Part 3 of #37160.

Check List

Tests

Unit test
Integration test
Manual test (add detailed scripts or steps below)
No code

Side effects

Performance regression: Consumes more CPU
Performance regression: Consumes more Memory
Breaking backward compatibility

Documentation

Release note

Please refer to Release Notes Language Style Guide to write a quality release note.

None

ti-chi-bot · 2022-09-10T10:27:46Z

[REVIEW NOTIFICATION]

This pull request has been approved by:

Yisaer
time-and-fate

To complete the pull request process, please ask the reviewers in the list to review by filling /cc @reviewer in the comment.
After your PR has acquired the required number of LGTMs, you can assign this pull request to the committer in the list by filling /assign @committer in the comment to help you merge this pull request.

The full list of commands accepted by this bot can be found here.

Reviewer can indicate their review by submitting an approval review.
Reviewer can cancel approval by submitting a request changes review.

Yisaer

rest lgtm

Yisaer · 2022-09-14T05:31:16Z

util/ranger/ranger.go

+	// Estimate whether rangeMaxSize will be exceeded first before appending points to ranges.
+	if rangeMaxSize > 0 && len(origin) > 0 && len(rangePoints) > 0 {
+		startPoint, err := convertPoint(sctx, rangePoints[0], ft)
+		if err != nil {
+			return nil, false, errors.Trace(err)
+		}
+		endPoint, err := convertPoint(sctx, rangePoints[1], ft)
+		if err != nil {
+			return nil, false, errors.Trace(err)
+		}
+		ran := appendPoint2Range(origin[0], startPoint, endPoint, ft)
+		if ran.MemUsage()*int64(len(origin))*int64(len(rangePoints))/2 > rangeMaxSize {
+			return origin, true, nil
+		}
+	}


What if the first rangePoint holds unusually large or tiny data, so that the estimate deviates too much from the actual memory usage?

I changed how the memory usage of ranges is estimated. We now iterate over origin and rangePoints and sum up the memory usage of their datums.
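The revised estimate can be sketched roughly as follows. This is a minimal illustration under simplified assumptions, not the code from util/ranger/ranger.go: the types datum, rng and point, the constant emptyRangeSize, and the function names are all hypothetical stand-ins. The idea is that appending len(points)/2 intervals to every origin range copies each origin range's datums once per interval and each interval's datums once per origin range, so summing all datums avoids relying on the first range alone.

```go
package main

import "fmt"

// datum is a simplified stand-in for a typed value; size is its footprint in bytes.
type datum struct{ size int64 }

// rng is a simplified range with multi-column low and high bounds.
type rng struct{ low, high []datum }

// point is one endpoint of an interval on the next index column.
// Points come in pairs: (start, end), (start, end), ...
type point struct{ value datum }

// emptyRangeSize is an assumed fixed per-range overhead, chosen for illustration.
const emptyRangeSize int64 = 64

// rangesDatumSize sums the sizes of all datums held by the given ranges.
func rangesDatumSize(rs []rng) int64 {
	var sum int64
	for _, r := range rs {
		for _, d := range r.low {
			sum += d.size
		}
		for _, d := range r.high {
			sum += d.size
		}
	}
	return sum
}

// pointsDatumSize sums the sizes of all datums held by the given points.
func pointsDatumSize(ps []point) int64 {
	var sum int64
	for _, p := range ps {
		sum += p.value.size
	}
	return sum
}

// estimateAppendPoints estimates the memory needed after appending every
// interval in points to every range in origin.
func estimateAppendPoints(origin []rng, points []point) int64 {
	if len(origin) == 0 || len(points) == 0 {
		return 0
	}
	n1 := int64(len(origin))
	n2 := int64(len(points)) / 2 // number of intervals
	return emptyRangeSize*n1*n2 + rangesDatumSize(origin)*n2 + pointsDatumSize(points)*n1
}

func main() {
	origin := []rng{
		{low: []datum{{8}}, high: []datum{{8}}},       // a small range
		{low: []datum{{1000}}, high: []datum{{1000}}}, // a much larger range
	}
	points := []point{{datum{8}}, {datum{8}}, {datum{8}}, {datum{8}}}
	fmt.Println(estimateAppendPoints(origin, points)) // prints 4352
}
```

In this example an estimate based only on the first (small) origin range would miss the 1000-byte datums entirely, which is the kind of skew the review question was about.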

time-and-fate · 2022-09-16T12:36:06Z

planner/util/path.go

+			}
+			colEqConstant := isColEqConstant(filter, path.IdxCols[i])
+			if i == eqOrInCount && colEqConstant {
+				// If there is a col-eq-constant condition for path.IdxCols[eqOrInCount], it means that range fallback happens


I can't understand this. Why does a col = constant condition mean fallback happened?

If range fallback doesn't happen, DetachCondAndBuildRangeForIndex must put col = constant into path.AccessConds rather than path.TableFilters.

The if i == eqOrInCount && colEqConstant check is there to prevent the following case:

create table t (a int, b int, c int, index idx_a_b(a, b));
set @@tidb_opt_range_max_size=1000;
explain format='brief' select /*+ use_index(t, idx_a_b) */ * from t where a in (1, 3, 5) and b = 2;

DetachCondAndBuildRangeForIndex puts a in (1, 3, 5) into path.AccessConds and puts b = 2 into path.TableFilters. Without the i == eqOrInCount && colEqConstant check, SplitCorColAccessCondFromFilters adds b = 2 back into path.AccessConds but doesn't rebuild the ranges, which leads to wrong results.

Got it.
As you said, I think another reason we need to do this here is that, if the additional AccessConds added here are not caused by correlated columns, we will not try to build ranges from the additional conditions, which leads to incomplete ranges and therefore wrong results.
I think we can clarify it in the comments.

Yes. The if branch ensures that access contains some col-eq-corcol condition whenever len(access) > 0. In fact, the first condition in access must be a col-eq-corcol condition. I have added this to the code comment.
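The guard discussed in this thread can be modelled in a few lines. The sketch below is a simplified assumption-laden model, not the actual SplitCorColAccessCondFromFilters: the cond type, the splitCorColAccess function and its string-based columns are hypothetical. It only shows the control flow: if the first index column after the eq/in prefix has a col = constant filter left over, range fallback must have happened, so nothing is moved back into the access conditions.

```go
package main

import "fmt"

// cond is a simplified equality filter of the form "col = <rhs>".
type cond struct {
	col      string
	isCorCol bool // true: rhs is a correlated column; false: rhs is a constant
}

// splitCorColAccess is a hypothetical model of moving equality filters back
// into access conditions for the index columns after the first eqOrInCount.
func splitCorColAccess(idxCols []string, eqOrInCount int, filters []cond) (access, remained []cond) {
	used := make([]bool, len(filters))
	for i := eqOrInCount; i < len(idxCols); i++ {
		matched := false
		for j, f := range filters {
			if used[j] || f.col != idxCols[i] {
				continue
			}
			if i == eqOrInCount && !f.isCorCol {
				// A col = constant filter on this column can only be left over
				// when range building fell back, so keep all filters as they are.
				return nil, filters
			}
			access = append(access, f)
			used[j] = true
			matched = true
			break
		}
		if !matched {
			break // access conditions must cover a contiguous column prefix
		}
	}
	for j, f := range filters {
		if !used[j] {
			remained = append(remained, f)
		}
	}
	return access, remained
}

func main() {
	idxCols := []string{"a", "b"}

	// a in (1, 3, 5) was used for ranges (eqOrInCount = 1); b = 2 was left
	// in the filters because of range fallback.
	access, remained := splitCorColAccess(idxCols, 1, []cond{{col: "b"}})
	fmt.Println(len(access), len(remained)) // prints 0 1

	// b = <correlated column> is the case the split is meant for.
	access, remained = splitCorColAccess(idxCols, 1, []cond{{col: "b", isCorCol: true}})
	fmt.Println(len(access), len(remained)) // prints 1 0
}
```

Because the early return fires only at i == eqOrInCount, any non-empty access list in this model starts with a col-eq-corcol condition, matching the invariant stated in the reply above.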

time-and-fate

We can add a test case with expressions like (a,b) in ((1,2),(3,4)) and c = 5 to cover more of the considerDNF code path in detachCNFCondAndBuildRangeForIndex().

time-and-fate · 2022-09-18T20:21:36Z

planner/util/path.go

+			}
+			colEqConstant := isColEqConstant(filter, path.IdxCols[i])
+			if i == eqOrInCount && colEqConstant {
+				// If there is a col-eq-constant condition for path.IdxCols[eqOrInCount], it means that range fallback happens


Got it.
As you said, I think another reason we need to do this here is that, if the additional AccessConds added here are not caused by correlated columns, we will not try to build ranges from the additional conditions, which leads to incomplete ranges and therefore wrong results.
I think we can clarify it in the comments.

time-and-fate · 2022-09-18T20:23:43Z

util/ranger/ranger.go

+	}
+	// Estimate whether rangeMaxSize will be exceeded first before appending ranges to point ranges.
+	if rangeMaxSize > 0 && estimateMemUsageForAppendRanges2PointRanges(pointRanges, ranges) > rangeMaxSize {
+		return ranges, true


It seems we should return pointRanges here to be consistent with other places.

Good catch! Fixed.
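The return convention being fixed here can be illustrated with a small sketch. It is a hypothetical simplification, not the real function in util/ranger/ranger.go: the rng type with a precomputed size, the function name appendRangesToPointRanges, and the naive estimate are all assumptions. The point is only that, when the estimate exceeds rangeMaxSize, the function hands back the unchanged pointRanges together with a fallback flag, presumably because the prefix ranges are still valid on their own while the remaining conditions stay as filters.

```go
package main

import "fmt"

// rng is a simplified range; size stands in for its memory footprint in bytes.
type rng struct {
	desc string
	size int64
}

// appendRangesToPointRanges combines every point range with every range on the
// next column. If the estimated result would exceed rangeMaxSize (0 means no
// limit), it returns the original pointRanges and true to signal fallback.
func appendRangesToPointRanges(pointRanges, ranges []rng, rangeMaxSize int64) ([]rng, bool) {
	if len(ranges) == 0 {
		return pointRanges, false
	}
	// Estimate the size first so nothing is allocated when the limit is exceeded.
	var estimate int64
	for _, p := range pointRanges {
		for _, r := range ranges {
			estimate += p.size + r.size
		}
	}
	if rangeMaxSize > 0 && estimate > rangeMaxSize {
		return pointRanges, true // fall back to the prefix ranges, not to ranges
	}
	result := make([]rng, 0, len(pointRanges)*len(ranges))
	for _, p := range pointRanges {
		for _, r := range ranges {
			result = append(result, rng{desc: p.desc + "," + r.desc, size: p.size + r.size})
		}
	}
	return result, false
}

func main() {
	pointRanges := []rng{{"a=1", 10}, {"a=3", 10}}
	ranges := []rng{{"b=2", 20}, {"b=4", 20}}

	res, fallback := appendRangesToPointRanges(pointRanges, ranges, 100)
	fmt.Println(len(res), fallback) // prints 2 true

	res, fallback = appendRangesToPointRanges(pointRanges, ranges, 0)
	fmt.Println(len(res), fallback) // prints 4 false
}
```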

time-and-fate · 2022-09-18T20:33:51Z

util/ranger/ranger.go

+	if len1 == 0 || len2 == 0 {
+		return 0
+	}
+	getDatumSize := func(rs Ranges) int64 {


It seems we repeat the "add up the memory usage of all Datums in a Range" logic in several places (including the one in MemUsage()). I think we could unify them at some point.

I now use getRangesTotalDatumSize and getPointsTotalDatumSize to share the common logic.

xuyifangreeneyes · 2022-09-21T03:09:51Z

We can add a test case with expressions like (a,b) in ((1,2),(3,4)) and c = 5 to cover more of the considerDNF code path in detachCNFCondAndBuildRangeForIndex().

I added some test cases to cover the considerDNF code path.

xuyifangreeneyes · 2022-09-21T12:15:59Z

/run-unit-test

xuyifangreeneyes · 2022-09-21T12:40:40Z

/run-unit-test

xuyifangreeneyes · 2022-09-21T12:58:36Z

/run-unit-test

xuyifangreeneyes · 2022-09-21T13:06:03Z

/build

xuyifangreeneyes · 2022-09-21T13:06:49Z

/rebuild

xuyifangreeneyes · 2022-09-21T13:12:56Z

/build

xuyifangreeneyes · 2022-09-21T13:13:45Z

/run-build

xuyifangreeneyes · 2022-09-21T13:31:26Z

/run-build

xuyifangreeneyes · 2022-09-21T13:31:40Z

/run-unit-test

xuyifangreeneyes · 2022-09-21T14:20:49Z

/run-build

xuyifangreeneyes · 2022-09-21T14:35:03Z

/run-build

xuyifangreeneyes · 2022-09-21T14:43:52Z

/run-mysql-test

xuyifangreeneyes · 2022-09-21T14:57:21Z

/run-build

xuyifangreeneyes · 2022-09-21T15:13:36Z

/run-build

xuyifangreeneyes · 2022-09-21T15:13:51Z

/run-unit-test

xuyifangreeneyes · 2022-09-21T15:43:35Z

/run-unit-test

xuyifangreeneyes · 2022-09-21T16:09:09Z

/run-unit-test

xuyifangreeneyes · 2022-09-21T16:14:22Z

/run-build

sre-bot · 2022-09-21T17:27:38Z

TiDB MergeCI notify

✅ Well Done! New fixed [1] after this pr merged.

CI Name	Result	Duration	Compare with Parent commit
idc-jenkins-ci-tidb/common-test	🔴 failed 1, success 10, total 11	52 min	Existing failure
idc-jenkins-ci-tidb/integration-ddl-test	🔴 failed 2, success 4, total 6	23 min	Existing failure
idc-jenkins-ci-tidb/tics-test	🔴 failed 1, success 0, total 1	4 min 12 sec	Existing failure
idc-jenkins-ci/integration-cdc-test	✅ all 37 tests passed	26 min	Fixed
idc-jenkins-ci-tidb/integration-common-test	🟢 all 17 tests passed	10 min	Existing passed
idc-jenkins-ci-tidb/sqllogic-test-1	🟢 all 26 tests passed	5 min 23 sec	Existing passed
idc-jenkins-ci-tidb/sqllogic-test-2	🟢 all 28 tests passed	4 min 42 sec	Existing passed
idc-jenkins-ci-tidb/mybatis-test	🟢 all 1 tests passed	3 min 4 sec	Existing passed
idc-jenkins-ci-tidb/integration-compatibility-test	🟢 all 1 tests passed	2 min 32 sec	Existing passed
idc-jenkins-ci-tidb/plugin-test	🟢 build success, plugin test success	4min	Existing passed

restrict index range mem usage

7c423af

ti-chi-bot added release-note-none Denotes a PR that doesn't merit a release note. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. labels Sep 10, 2022

xuyifangreeneyes added 3 commits September 11, 2022 16:15

add more tests

05d9f4b

fix bug and refine tests

9e660cf

add test on correlated column in ranges

605b41d

xuyifangreeneyes requested review from Yisaer, Reminiscent and winoros September 13, 2022 01:25

Merge branch 'master' into range-mem-3

62f1a25

Yisaer reviewed Sep 14, 2022

xuyifangreeneyes mentioned this pull request Sep 15, 2022

Refine building ranges logic #37176

Open

14 tasks

use avg range size to estimate mem usage

6b1c548

ti-chi-bot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Sep 16, 2022

Merge branch 'master' into range-mem-3

63e55db

ti-chi-bot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Sep 16, 2022

xuyifangreeneyes and others added 2 commits September 16, 2022 16:26

refine range mem usage estimation

1e26570

Merge branch 'master' into range-mem-3

c731f14

xuyifangreeneyes requested a review from time-and-fate September 16, 2022 08:31

time-and-fate reviewed Sep 16, 2022

time-and-fate reviewed Sep 19, 2022

xuyifangreeneyes and others added 6 commits September 19, 2022 11:04

add comment

ac08e4c

add considerDNF test

eb8fff3

add more test

94160d3

fix

a6b3c9e

address comment

f866dfe

Merge branch 'master' into range-mem-3

0b3d297

Merge branch 'master' into range-mem-3

839e845

Merge branch 'master' into range-mem-3

05aee60

xuyifangreeneyes added 2 commits September 21, 2022 20:40

Merge branch 'master' into range-mem-3

317050f

Merge branch 'master' into range-mem-3

b118c5b

Merge branch 'master' into range-mem-3

9fd7130

xuyifangreeneyes and others added 2 commits September 21, 2022 21:41

Merge branch 'master' into range-mem-3

62f2803

Merge branch 'master' into range-mem-3

08ec13d

Merge branch 'master' into range-mem-3

acd323d

Merge branch 'master' into range-mem-3

5b59314

ti-chi-bot merged commit da3dab1 into pingcap:master Sep 21, 2022

xuyifangreeneyes deleted the range-mem-3 branch September 21, 2022 16:30
