schedule: support patrol region concurrency #8094
Conversation
[REVIEW NOTIFICATION] This pull request has not been approved. To complete the pull request process, please ask the reviewers in the list to review. The full list of commands accepted by this bot can be found here. Reviewers can indicate their review by submitting an approval review.
Skipping CI for Draft Pull Request.
Signed-off-by: lhy1024 <[email protected]>
Force-pushed from 4c59016 to d1f4b8a
Signed-off-by: lhy1024 <[email protected]>
Signed-off-by: lhy1024 <[email protected]>
Signed-off-by: lhy1024 <[email protected]>
Force-pushed from 2e61d30 to 948dc77
overall lgtm, is this PR ready?
I am preparing some tests for different scenarios.
Signed-off-by: lhy1024 <[email protected]>
Signed-off-by: lhy1024 <[email protected]>
Signed-off-by: lhy1024 <[email protected]>
Force-pushed from 948dc77 to c198b08
pkg/schedule/config/config.go
Outdated
@@ -63,6 +63,8 @@ const (
 	defaultRegionScoreFormulaVersion = "v2"
 	defaultLeaderSchedulePolicy      = "count"
 	defaultStoreLimitVersion         = "v1"
+	defaultPatrolRegionConcurrency   = 1
+	defaultPatrolRegionBatchLimit    = 128
Maybe we can use max(128, region_count/1024)?
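As a sketch of that suggestion in Go (the helper name and parameter are hypothetical; the PR as written keeps the fixed default of 128):

package config

// patrolRegionBatchLimit illustrates the max(128, region_count/1024)
// suggestion above; it is not part of the PR, which uses the fixed
// defaultPatrolRegionBatchLimit of 128.
func patrolRegionBatchLimit(regionCount int) int {
	if limit := regionCount / 1024; limit > 128 {
		return limit
	}
	return 128
}

Under this formula the limit stays at 128 for clusters below 128*1024 = 131,072 regions and grows linearly with region count beyond that.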
Signed-off-by: lhy1024 <[email protected]>
Force-pushed from e71f635 to a0ec33d
Codecov Report
Attention: Patch coverage is
Additional details and impacted files
@@ Coverage Diff @@
## master #8094 +/- ##
==========================================
+ Coverage 77.29% 77.36% +0.07%
==========================================
Files 471 471
Lines 61445 61515 +70
==========================================
+ Hits 47491 47590 +99
+ Misses 10395 10362 -33
- Partials 3559 3563 +4
Flags with carried forward coverage won't be shown.
Signed-off-by: lhy1024 <[email protected]>
// Stop the old workers and start the new workers.
c.patrolRegionContext.workersCancel()
c.patrolRegionContext.wg.Wait()
c.patrolRegionContext.workersCtx, c.patrolRegionContext.workersCancel = context.WithCancel(c.ctx)
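For context, a minimal self-contained sketch of the stop-all/start-all pattern in the diff above; only workersCtx, workersCancel, and wg appear in the actual change, while the coordinator shape, the region channel, and the worker body are assumptions made for illustration:

package patrol

import (
	"context"
	"fmt"
	"sync"
)

// patrolRegionContext mirrors the fields referenced in the diff;
// regionChan is an assumed stand-in for the real work source.
type patrolRegionContext struct {
	workersCtx    context.Context
	workersCancel context.CancelFunc
	wg            sync.WaitGroup
	regionChan    chan uint64
}

type coordinator struct {
	ctx                 context.Context
	patrolRegionContext *patrolRegionContext
}

// restartWorkers assumes an earlier worker generation exists (so
// workersCancel is non-nil). It stops every worker, waits for them to
// exit, then starts count new workers under a fresh cancelable context.
func (c *coordinator) restartWorkers(count int) {
	c.patrolRegionContext.workersCancel()
	c.patrolRegionContext.wg.Wait()
	c.patrolRegionContext.workersCtx, c.patrolRegionContext.workersCancel = context.WithCancel(c.ctx)
	for i := 0; i < count; i++ {
		c.patrolRegionContext.wg.Add(1)
		go func(id int) {
			defer c.patrolRegionContext.wg.Done()
			for {
				select {
				case <-c.patrolRegionContext.workersCtx.Done():
					return
				case regionID := <-c.patrolRegionContext.regionChan:
					fmt.Printf("worker %d checks region %d\n", id, regionID)
				}
			}
		}(i)
	}
}

Because all workers share one context, any change to the worker count drains the whole pool before refilling it, which is what the next comment questions.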
Can we adjust the workers more gracefully? For example, if the new worker count is more than the current one, we can scale out more workers instead of rebuilding all of them.
I'm not sure it's necessary, and generally speaking we don't change this configuration very often.
I'm afraid it may take some time to stop all and restart all workers.
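A hedged sketch of the incremental adjustment being discussed; the per-worker cancel functions are an assumption here, since the PR keeps a single shared workers context:

package patrol

import (
	"context"
	"sync"
)

// workerPool is a hypothetical structure holding one CancelFunc per
// worker, which is what makes scale-in without a full restart possible.
type workerPool struct {
	ctx     context.Context
	wg      sync.WaitGroup
	cancels []context.CancelFunc
	run     func(ctx context.Context) // worker loop, supplied by the caller
}

// adjust scales the pool to target workers, starting or stopping only
// the difference instead of rebuilding the whole pool.
func (p *workerPool) adjust(target int) {
	// Scale out: spawn only the missing workers.
	for len(p.cancels) < target {
		ctx, cancel := context.WithCancel(p.ctx)
		p.cancels = append(p.cancels, cancel)
		p.wg.Add(1)
		go func() {
			defer p.wg.Done()
			p.run(ctx)
		}()
	}
	// Scale in: cancel only the surplus workers.
	for len(p.cancels) > target {
		last := len(p.cancels) - 1
		p.cancels[last]()
		p.cancels = p.cancels[:last]
	}
}

The trade-off matches the thread above: per-worker contexts avoid ever draining the whole pool, at the cost of extra bookkeeping for a setting that rarely changes.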
Signed-off-by: lhy1024 <[email protected]>
Signed-off-by: lhy1024 <[email protected]>
Signed-off-by: lhy1024 <[email protected]>
Signed-off-by: lhy1024 <[email protected]>
PTAL @bufferflies @rleungx
@@ -67,6 +67,9 @@ const (
 	defaultRegionScoreFormulaVersion = "v2"
 	defaultLeaderSchedulePolicy      = "count"
 	defaultStoreLimitVersion         = "v1"
+	defaultPatrolRegionWorkerCount   = 1
+	maxPatrolRegionWorkerCount       = 8
Maybe it's too small and can't be changed? How about using the core count as the max limit?
Current tests show that 8 is enough; if needed in the future, it can be increased or the core count can be used.
OK
rest lgtm
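For reference, a sketch of the core-count ceiling discussed above; this is the reviewer's alternative, not what the PR merged (the merged change keeps maxPatrolRegionWorkerCount = 8):

package config

import "runtime"

// clampPatrolWorkerCount is a hypothetical validator that uses the CPU
// core count as the ceiling; the PR instead uses a fixed max of 8.
func clampPatrolWorkerCount(configured int) int {
	if configured < 1 {
		return 1
	}
	if maxWorkers := runtime.NumCPU(); configured > maxWorkers {
		return maxWorkers
	}
	return configured
}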
[LGTM Timeline notifier] Timeline:
Signed-off-by: lhy1024 <[email protected]>
Force-pushed from 02bfc3f to 7e5813d
/test pull-integration-realcluster-test
[APPROVALNOTIFIER] This PR is APPROVED. This pull request has been approved by: bufferflies, niubell, nolouch, okJiang. The full list of commands accepted by this bot can be found here. The pull request process is described here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing
What problem does this PR solve?
Issue Number: Close #7963 #7706
What is changed and how does it work?
Check List
Tests
Release note