Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

executor: support fast analyze. #9973

Closed
wants to merge 72 commits into from

Conversation

lzmhhh123
Copy link
Contributor

What problem does this PR solve?

Support fast analyze.

What is changed and how it works?

We random the keys in each region to get samples instead of scanning all regions.

Check List

Tests

  • Unit test
  • Integration test
  • Manual test (add detailed scripts or steps below)

Code changes

  • Has exported function/method change
  • Has exported variable/fields change
  • Has persistent data change

Side effects

  • Possible performance regression
  • Increased code complexity

Related changes

  • Need to update the documentation
  • Need to be included in the release note

@codecov
Copy link

codecov bot commented Apr 2, 2019

Codecov Report

❗ No coverage uploaded for pull request base (master@abeddab). Click here to learn what that means.
The diff coverage is 60.4863%.

@@             Coverage Diff             @@
##             master      #9973   +/-   ##
===========================================
  Coverage          ?   77.9428%           
===========================================
  Files             ?        405           
  Lines             ?      82540           
  Branches          ?          0           
===========================================
  Hits              ?      64334           
  Misses            ?      13445           
  Partials          ?       4761

@lzmhhh123 lzmhhh123 marked this pull request as ready for review April 2, 2019 05:00
store/tikv/tikvrpc/tikvrpc.go Outdated Show resolved Hide resolved
store/tikv/client.go Outdated Show resolved Hide resolved
planner/core/planbuilder.go Show resolved Hide resolved
executor/analyze.go Outdated Show resolved Hide resolved
executor/analyze.go Outdated Show resolved Hide resolved
executor/analyze.go Outdated Show resolved Hide resolved
executor/analyze.go Outdated Show resolved Hide resolved
executor/analyze.go Outdated Show resolved Hide resolved
executor/analyze.go Outdated Show resolved Hide resolved
executor/analyze.go Outdated Show resolved Hide resolved
executor/analyze.go Outdated Show resolved Hide resolved
executor/analyze.go Outdated Show resolved Hide resolved
@lzmhhh123
Copy link
Contributor Author

lzmhhh123 commented Apr 4, 2019

This PR is too large to review. I'll split it by:

executor/analyze.go Outdated Show resolved Hide resolved
@lzmhhh123
Copy link
Contributor Author

lzmhhh123 commented Apr 18, 2019

Test by TPC-H factor=50. Table lineitem. 300M rows in total.

Normal analyze tasks 7min 54sec.
Fast analyze tasks 6sec.

@lzmhhh123
Copy link
Contributor Author

PTAL. @qw4990 @winoros

@winoros
Copy link
Member

winoros commented Apr 18, 2019

Wonderful result!
300G rows is a little confusing. I think 60 million rows is easier to understand.
BTW what's the quality of the histogram result compared with the normal analyze's?

@lzmhhh123
Copy link
Contributor Author

@winoros Now, the number of rows regards the same rows with a different snapshot as the different rows. So the total count for a column is not collect. The problem will be fixed at the next PR——build stats info for fast analyze.

executor/analyze.go Outdated Show resolved Hide resolved

keys := make([]kv.Key, 0, task.SampSize)
for i := 0; i < int(task.SampSize); i++ {
randKey := rander.Int63n(maxRowID-minRowID) + minRowID
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since rand.Intn(0) will result in panic, do we need to check maxRowID-minRowID > 0 here?

if collectors[0].Samples[samplePos] == nil {
collectors[0].Samples[samplePos] = &statistics.SampleItem{}
}
collectors[0].Samples[samplePos].Ordinal = int(samplePos)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ordinal represents this item's relative order of physical position among samples.
For example, if there are two regions, they have condition that region1 < region2.
And you sample item1 from region1 and item2, item3 from region2 and they have condition that item1.key < item2.key < item3.key.
Then you have to meet condition that item1.Ordinal < item2.Ordinal < item3.Ordinal.
But in this implementation, item2.Ordinal < item3.Ordinal < item1.Ordinal can happen.
@eurekaka PTAL and please help to confirm if this problem exist?

for buildCnt := 0; buildCnt < 5; buildCnt++ {
needRebuild, err := e.buildSampTask()
if err != nil {
return nil, nil, errors.Trace(err)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no need to call errors.Trace() anymore.

go e.getSampRegionsRowCount(bo, &needRebuildForRoutine[i], &errs[i], &sampTasksForRoutine[i])
}

store, _ := e.ctx.GetStore().(tikv.Storage)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should we return error if the store is not tikv.Storage?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This check has been done at the planner.

scanTasks []*tikv.KeyLocation
}

func (e *AnalyzeFastExec) getSampRegionsRowCount(bo *tikv.Backoffer, needRebuild *bool, err *error, sampTasks *[]*AnalyzeFastTask) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we use another function signature? for example:

func (e *AnalyzeFastExec) getSampRegionsRowCount(bo *tikv.Backoffer) (needRebuild bool, sampTasks []*AnalyzeFastTask, err error) {

@lzmhhh123
Copy link
Contributor Author

All the sub-PRs have been merged.

@lzmhhh123 lzmhhh123 closed this Apr 28, 2019
@lzmhhh123 lzmhhh123 deleted the dev/fast_analyze branch July 24, 2019 05:54
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
sig/execution SIG execution status/LGT1 Indicates that a PR has LGTM 1. type/new-feature
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants