Add pod label selector to cli and strategies #202
Conversation
Thanks for your pull request. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA). 📝 Please follow instructions at https://git.k8s.io/community/CLA.md#the-contributor-license-agreement to sign the CLA. It may take a couple minutes for the CLA signature to be fully registered; after that, please reply here with a new comment and we'll verify. Thanks.
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.
Welcome @killianmuldoon!
[APPROVALNOTIFIER] This PR is NOT APPROVED. This pull request has been approved by: killianmuldoon. The full list of commands accepted by this bot can be found here.
Needs approval from an approver in each of these files.
Approvers can indicate their approval by writing /approve in a comment.
for _, node := range nodes {
-    pods, err := podutil.ListPodsOnANode(client, node)
+    pods, err := podutil.ListPodsOnANode(client, selector, node)
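For readers of this thread, here is a rough sketch (not the exact descheduler code, and assuming a recent client-go where List takes a context) of how a selector-aware ListPodsOnANode could pass the label selector to the API server alongside the existing field selector:

```go
package podutil

import (
	"context"

	v1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/apimachinery/pkg/fields"
	clientset "k8s.io/client-go/kubernetes"
)

// ListPodsOnANode returns the non-terminated pods bound to the given node,
// optionally narrowed by a label selector string such as "app=test".
// An empty selector preserves the previous behaviour (all pods on the node).
func ListPodsOnANode(client clientset.Interface, selector string, node *v1.Node) ([]*v1.Pod, error) {
	fieldSelector, err := fields.ParseSelector(
		"spec.nodeName=" + node.Name +
			",status.phase!=" + string(v1.PodSucceeded) +
			",status.phase!=" + string(v1.PodFailed))
	if err != nil {
		return nil, err
	}

	podList, err := client.CoreV1().Pods(v1.NamespaceAll).List(context.TODO(), metav1.ListOptions{
		FieldSelector: fieldSelector.String(),
		LabelSelector: selector, // label filtering happens server-side
	})
	if err != nil {
		return nil, err
	}

	pods := make([]*v1.Pod, 0, len(podList.Items))
	for i := range podList.Items {
		pods = append(pods, &podList.Items[i])
	}
	return pods, nil
}
```

Filtering in the list call keeps the change small, but as the discussion below notes, it also hides non-matching pods from strategies that need to see every pod on a node.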
Pod selectors might not work correctly for the LowNodeUtilization strategy this way, because in that strategy a node's load is computed by considering all pods. Computing the load on each node should still consider all pods; perhaps the pod selector should only be applied during eviction.
Would it be better if I moved the implementation of this into podutil.IsEvictable rather than podutil.ListPodsOnANode? I considered this, but it means a label check on each pod, which slows things down as the number of pods grows. Would it work to make LowNodeUtilization a special case? It already works quite a bit differently from the other strategies.
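As a hypothetical illustration of that alternative (the helper name below is made up, not an existing descheduler function): the strategies would keep listing every pod, and the selector would only gate which pods are considered for eviction, for example:

```go
package podutil

import (
	v1 "k8s.io/api/core/v1"
	"k8s.io/apimachinery/pkg/labels"
)

// matchesPodSelector gates eviction on the optional pod label selector.
// Because every pod is still listed, LowNodeUtilization can keep computing
// node load over all pods; only the eviction candidates are narrowed.
func matchesPodSelector(pod *v1.Pod, selector labels.Selector) bool {
	// A nil or empty selector matches everything, preserving current behaviour.
	if selector == nil || selector.Empty() {
		return true
	}
	return selector.Matches(labels.Set(pod.Labels))
}
```

The trade-off mentioned above still applies: this runs a label match per pod rather than filtering once in the list request.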
IMHO, I am not sure adding pod selectors is a good idea in general, or at least it needs more thought than just introducing them like this. kube-scheduler makes its decisions by considering all pods in a cluster, and descheduler strategies must do the same. I am not sure how evicting only selected pods can help with balancing a cluster, which is the main goal of the descheduler. In any case, even if we would like to have pod selectors, that must be done after careful review and by considering real use cases. So please feel free to write down concrete use cases and explain how pod selectors would help with balancing the whole cluster.
@aveshagarwal It's understandable that the goal of the descheduler is to balance the cluster, but there are scenarios/use cases where the descheduler is used for other purposes. For example, the use case we are targeting involves descheduling a set of pods in response to an event. This would be useful for Telemetry Aware Scheduling, where the TAS controller ensures that the telemetry policy corresponding to a pod is not violated; on violation, the controller labels the affected pods so that they can then be descheduled by the descheduler. Beyond that, this could be very useful for slicing the cluster for testing purposes, and for scenarios where you would like to prevent certain workloads from being descheduled. Regardless, the pod label selector is provided as an optional argument, so the current behavior of the descheduler is not altered. Also, if you have suggestions to improve the design or implementation of this feature, we would be more than happy to make changes.
+1 to @swatisehgal here - Telemetry Aware Scheduling is the main use case I was considering with this PR. More generally, I'd add that slicing the cluster for descheduling in this way would still move the cluster toward a more balanced state while allowing a step-based approach.
/kind feature
@killianmuldoon: PR needs rebase. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
I think the best way to move this forward would be to write a Google doc, share it with the sig-scheduling mailing list, and ask for feedback. If there are lots of people with use cases like this, we can consider it.
@ravisantoshgudimetla I agree that submitting a proposal to the sig-scheduling mailing list would be a good next step. @killianmuldoon is this something you would be able to do?
@ravisantoshgudimetla @seanmalloy I'll get working on this and submit it soon. Thanks for the direction!
Any update on this? I need this in order to run the descheduler scoped to a namespace.
@killianmuldoon are you still planning on creating a design proposal and sharing it with sig-scheduling? Is there anything we can do to help?
@seanmalloy Sorry - this fell by the wayside a little bit! I'll put this on the agenda for an upcoming sig-scheduling meeting. Thanks for the reminder!
@killianmuldoon I created a Google Doc proposal and shared it with the sig-scheduling mailing list. I've also added this to the sig-scheduling meeting agenda for July 2nd.
@swatisehgal and @killianmuldoon are there specific descheduler strategies you would like to use with this feature? Any additional details you can provide about real-world use cases would be useful as I build out the proposal document for review by SIG Scheduling. Thanks!
@seanmalloy Thanks for following up on this - it completely fell off my plate. I updated the doc with some details of our use case, and I'll join the meeting later on in case there are any questions I can help answer.
/close
@killianmuldoon after review by SIG Scheduling we would like to implement this as a descheduler strategy parameter instead of a CLI option, so I'm going to close this PR. Additional details can be found in this comment: #195 (comment).
@seanmalloy: Closed this PR.
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
This PR was created to address #195 and #163.
It adds "pod-selector", a Kubernetes label selector, as a command-line argument and propagates it through the existing strategies. This allows slicing the cluster into the pods the descheduler is interested in and those it isn't, e.g. "app=test".
This PR mirrors the implementation of the node selector arg.
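As a minimal, hypothetical sketch of how such a flag value could be validated before being handed to the strategies (the flag name comes from this PR's description; the wiring below is illustrative, not the merged descheduler code):

```go
package main

import (
	"fmt"
	"os"

	"k8s.io/apimachinery/pkg/labels"
)

func main() {
	// Value that would arrive via the proposed --pod-selector flag.
	podSelector := "app=test"

	// labels.Parse rejects malformed selector expressions up front, so a bad
	// value fails fast instead of silently matching nothing at runtime.
	selector, err := labels.Parse(podSelector)
	if err != nil {
		fmt.Fprintf(os.Stderr, "invalid pod selector %q: %v\n", podSelector, err)
		os.Exit(1)
	}
	fmt.Printf("descheduler would only consider pods matching %q\n", selector.String())
}
```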