Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

stats: auto analyze on certain period of a day #7570

Merged
merged 7 commits into from
Sep 5, 2018
Merged

Conversation

alivxxx
Copy link
Contributor

@alivxxx alivxxx commented Aug 31, 2018

What problem does this PR solve?

Some application doesn't want auto-analyze to be executed in the hours when the cluster is busy, we need to provide an option to only execute auto-analyze in idle hours.
Fix #7072

What is changed and how it works?

Add two global variables to control the analyze start time and end time. But when the table is never analyzed, the time limit is ignored.

Check List

Tests

  • Unit test

Code changes

  • Has exported variable/fields change

Side effects

  • None

Related changes

  • Need to update the documentation

PTAL @coocood @zz-jason @winoros

@alivxxx alivxxx added component/statistics type/enhancement The issue or PR belongs to an enhancement. labels Aug 31, 2018
@alivxxx alivxxx force-pushed the time branch 2 times, most recently from a36c514 to b1e8881 Compare August 31, 2018 11:54
@winoros
Copy link
Member

winoros commented Aug 31, 2018

Can we let it auto analyze at 11:00pm~6:00am by set this two variable?

@alivxxx
Copy link
Contributor Author

alivxxx commented Aug 31, 2018

@winoros Yes, you can set it to 23:00 CST and 06:00 CST like the case in test.

@shenli
Copy link
Member

shenli commented Sep 2, 2018

Could we guarantee that the EndTime is latter than the StartTime?

tbl *statistics.Table
ratio float64
limit time.Duration
timeStr []string
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would easier to read if we split timeStr slice to start, end and now.

@coocood
Copy link
Member

coocood commented Sep 2, 2018

@lamxTyler
I think using the local timezone is easier to use, and we don't need to specify the timezone part.

@alivxxx
Copy link
Contributor Author

alivxxx commented Sep 3, 2018

@shenli Then it would need another field to note that they are not on the same day, I think it is more complicated.

@alivxxx
Copy link
Contributor Author

alivxxx commented Sep 3, 2018

@coocood Stats owner could on any server, so it may introduce unexpected behavior when the stats owner is on a different timezone.

@winoros
Copy link
Member

winoros commented Sep 3, 2018

Whether analyze it or not is still controlled by the existing variable TiDBAutoAnalyeRatio?

Copy link
Member

@winoros winoros left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

@alivxxx
Copy link
Contributor Author

alivxxx commented Sep 3, 2018

@winoros Yes, TiDBAutoAnalyeRatio decides if we need to analyze the table.

@alivxxx alivxxx added the status/LGT1 Indicates that a PR has LGTM 1. label Sep 3, 2018
@coocood
Copy link
Member

coocood commented Sep 3, 2018

@lamxTyler
The time zone of stats owner doesn't matter, we use the local timezone on the server that executes the set statement.
We can use UTC internally, but accept user input in local timezone.

return now.Sub(end) <= 0 || now.Sub(start) >= 0
}

// NeedAnalyzeTable checks if we need to analyze the table:
// 1. If the table has never been analyzed, we need to analyze it when it has
// not been modified for a time.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s/ not been modified for a time./ not been modified for a while

statistics/update.go Outdated Show resolved Hide resolved
now: "00:02 CST",
result: false,
},
// table already analyzed and but not within time period
Copy link
Contributor

@zhexuany zhexuany Sep 4, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s/table already analyzed and but not within time period/table was already analyzed and but not within time period

and for the rest of the comment, you also need to fix them.

@alivxxx
Copy link
Contributor Author

alivxxx commented Sep 4, 2018

@coocood How can the stats owner know the timezone of the server that sets the variable?

Copy link
Member

@zz-jason zz-jason left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@coocood
Copy link
Member

coocood commented Sep 5, 2018

LGTM

@alivxxx
Copy link
Contributor Author

alivxxx commented Sep 5, 2018

/run-all-tests

@alivxxx alivxxx added status/all tests passed status/LGT3 The PR has already had 3 LGTM. and removed status/LGT1 Indicates that a PR has LGTM 1. labels Sep 5, 2018
@alivxxx alivxxx merged commit 74a6ddd into pingcap:master Sep 5, 2018
@alivxxx alivxxx deleted the time branch September 5, 2018 08:37
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
component/statistics status/LGT3 The PR has already had 3 LGTM. type/enhancement The issue or PR belongs to an enhancement.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants