-
Notifications
You must be signed in to change notification settings - Fork 39.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add a scheduler profile level parameter percentageOfNodesToScore #112521
Conversation
/cc @Huang-Wei @ahg-g |
@Huang-Wei got a bunch of errors. Should I make changes to
|
This PR may require API review. If so, when the changes are ready, complete the pre-review checklist and request an API review. Status of requested reviews is tracked in the API Review project. |
@yuanchen8911 you need to update the spec in versioned types.go as well: staging/src/k8s.io/kube-scheduler/config/v1/types.go. Then rerun codegen script, and then optionally, fill the conversion logic if the auto-generated is not smart enough to cover it for you. |
74fdb48
to
aa80bc4
Compare
aa80bc4
to
2586d82
Compare
4d791f4
to
fb58e03
Compare
Fixed manual conversions. |
@yuanchen8911: The
The following commands are available to trigger optional jobs:
Use
In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
/test pull-kubernetes-integration |
// nodes will be scored. | ||
PercentageOfNodesToScore int32 | ||
// nodes will be scored. It is overridden by profile level PercentageOfNodesToScore. | ||
PercentageOfNodesToScore *int32 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we want to keep the API stable, why can we change this? Because no difference to end-users?
So the conclusion is if we can keep the consistent experience to users, no breaking change, then we can make some modifications, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Keep it consistent with the profile percentage. We need pointer to tell the difference between 0 and not set (nil) for profile percentage. If it's the latter, the global percentage will be used.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
just nits.
SchedulerName: "default-scheduler", | ||
PercentageOfNodesToScore: nil, | ||
Plugins: defaults.PluginsV1beta3, | ||
PluginConfig: []config.PluginConfig{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
defaults.PluginConfigV1beta3
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's PluginsV1beta3
.
var PluginsV1beta3 = &config.Plugins{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm referring to this one
var PluginConfigsV1beta3 = []config.PluginConfig{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, replaced PluginConfig: ...
with PluginConfig: defaults.PluginConfigsV1beta3
pkg/scheduler/scheduler_test.go
Outdated
tests := []struct { | ||
name string | ||
percentageOfNodesToScore *int32 | ||
option *schedulerOptions |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
create this object inside the loop
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually, forget about this suggestion.
Can you instead call New(..., WithPercentageOfNodesToScore)
and then verify that the scheduler is created with the appropriate value?
schedulerOptions
is an internal object, so it's not much worth testing at that level.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Rewrote the test function to create a scheduler using WithPercentageOfNodesToScore
and to verify its percentageOfNodesToScore is set correctly.
13d3b8a
to
148aa16
Compare
There are a couple of suggestions that are still not addressed |
148aa16
to
d533d77
Compare
Made the suggested changes. Please take another look. Thanks. |
SchedulerName: "default-scheduler", | ||
PercentageOfNodesToScore: nil, | ||
Plugins: defaults.PluginsV1beta2, | ||
PluginConfig: []config.PluginConfig{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
defaults.PluginConfigsV1beta2
here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Replaced it here. Other PluginConfigs
cannot use defaults.DefaultPluginConfigs
unless we modify minCandidateNodesPercentage: 50
to 10
(default value) and/or bindTimeoutSeconds: 300
to 600
(default value).
There are a few places that can be improved. How about a separate PR for a complete cleanup?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes, those sound tangential
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, added it to TODO list.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
missed one comment :)
Fix conversion errors Changed the order update update fix manaul coversions keep the global parameter for backward compatibility Address Wei's comments Fix an error Fix issues Add unit tests for validation Fix a comment Address comments Update comments fix verifiation errors Add tests for scheme_test.go Convert percentageOfNodesToScore to pointer Fix errors Resolve conflicts Fix testing errors Address Wei's comments Revert IntPtr to Int changes Address comments Not overrite percentageOfNodesToScore Fix a bug Fix a bug change errs to err Fix a nit Remove duplication Address comments Fix lint warning Fix an issue Update comments Clean up Address comments Revert changes to defaults fix unit test error Update Fix tests Use default PluginConfigs
d533d77
to
7297f48
Compare
/lgtm |
/assign @liggitt |
/approve |
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: alculquicondor, Huang-Wei, liggitt, yuanchen8911 The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
/unhold |
Hi @alculquicondor, the PR was merged, but there's a red X with it. Is it related to pending/failed pull-kubernetes-e2e-gce-100-performance test. Just wanted to make sure everything is fine. |
/test pull-kubernetes-e2e-gce-100-performance |
That means the CI job passed on a batch merge that included this commit, and failed on this individual PR. That happens occasionally with flaky jobs |
Thanks! So we don't need to rerun the test since it passed all tests as part of the batch merge. |
Right |
What type of PR is this?
/kind feature
/kind api-change
What this PR does / why we need it:
Add
percentageOfNodesToScore
as a per scheduler profile parameter in API version v1.The PR makes the following changes:
Add profile level
PercentageOfNodesToScore
to API Versionv1
.v1beta2
andv1beta3
will still use the old way (global value). Inv1
, if a profile levelpercentageOfNodesToScore
is set, it will override the global (scheduler level) value. If the profilepercentageOfNodesToScore
is not set and the globalpercentageOfNodesToScore
is set, the global value will be used. Otherwise, the defaultpercentageOfNodesToScore
is used.Change both scheduler and profile level
PercentageOfNodesToScore
to pointers.The goal is to support fine-grained per profile/workload scheduling customization and better balance scheduling performance and quality in multi-tenant Kubernetes clusters.
Which issue(s) this PR fixes:
Fixes # #93270
Reference:#95709 #95823
Previous PR: #97263 (incomplete)
Special notes for your reviewer:
We'll review and evaluate if the current global
percentageOfNodesToScore
should be deprecated and removed in the future versions.TODO (future PRs)
pointer.Int32Ptr/Int64Ptr
withpointer.Int32/Int64
inv1
.errorString
byfield.ErrorList
so the coverage is more comprehensive.PluginConfig
variable withdefaults.PluginConfigs
when it applies inapis/config/scheme/scheme_test.go
.Does this PR introduce a user-facing change?
Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.: