Consider pod image in duplicates strategy #275
Conversation
This strategy should probably be refactored a bit, but for a quick and dirty fix I think this will tighten up the duplicates edge case
// So any non-unique Namespace/Kind/Name/Image pattern is a duplicate pod.
// Note that this doesn't consider all containers in the pod when calculating a duplicate, just any match
s := strings.Join([]string{pod.ObjectMeta.Namespace, ownerRef.Kind, ownerRef.Name, container.Image}, "/")
dpm[s] = append(dpm[s], pod)
In case a pod has 2 or more containers, the strategy will evict the same pod more than once, failing on the second and following attempts. Also, `listDuplicatePodsOnANode` no longer properly reflects the intention; it's more like `listDuplicatePodsContainersOnANode`.
What about refactoring `listDuplicatePodsOnANode` to return only a list of duplicated pods without exposing the `duplicatePodsMap` data type? I.e. `func listDuplicatePodsOnANode(client clientset.Interface, node *v1.Node, evictLocalStoragePods bool) (duplicatedPods []*v1.Pod)`
Or, keeping what you have now but adding another post-filter step:
dupPodsMap := map[string]*v1.Pod{}
dpm := duplicatePodsMap{}
// Ignoring the error here as in the ListDuplicatePodsOnNode function we call ListEvictablePodsOnNode which checks for error.
for _, pod := range pods {
	ownerRefList := podutil.OwnerRef(pod)
	for _, ownerRef := range ownerRefList {
		for _, container := range pod.Spec.Containers {
			// Namespace/Kind/Name should be unique for the cluster.
			// We also consider the image, as 2 pods could have the same owner but serve different purposes.
			// So any non-unique Namespace/Kind/Name/Image pattern is a duplicate pod.
			// Note that this doesn't consider all containers in the pod when calculating a duplicate, just any match.
			s := strings.Join([]string{pod.ObjectMeta.Namespace, ownerRef.Kind, ownerRef.Name, container.Image}, "/")
			dpm[s] = append(dpm[s], pod)
			// Post-filter: record each duplicated pod (keyed by namespace/name) at most once,
			// no matter how many of its containers matched.
			key := pod.ObjectMeta.Namespace + "/" + pod.ObjectMeta.Name
			if len(dpm[s]) > 1 && dupPodsMap[key] == nil {
				dupPodsMap[key] = pod
			}
		}
	}
}
for _, pod := range dupPodsMap {
	duplicatedPods = append(duplicatedPods, pod)
}
Possibly, changing `duplicatePodsMap` to just a yes/no switch:
type duplicatePodsMap map[string]*v1.Pod
...
key := pod.ObjectMeta.Namespace + "/" + pod.ObjectMeta.Name
if dpm[s] != nil && dupPodsMap[key] == nil {
	dupPodsMap[key] = pod
}
dpm[s] = pod
@ingvagabund thanks, you are right that it would lead to multiple attempts to delete the same pod, so I added your changes; please take a look. I think I was actually able to simplify the code you suggested, so if you could verify that I understood your point, that would help.
Actually, I think my previous changes would have caused pods with more than 1 container to automatically match on themselves, so I've made the following logic changes:
I also added tests to confirm this works as expected. @aveshagarwal could you take a look as well?
[APPROVALNOTIFIER] This PR is APPROVED. This pull-request has been approved by: damemi. The full list of commands accepted by this bot can be found here. The pull request process is described here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing `/approve` in a comment.
cc @ingvagabund could you take another look at the solution we came to here? I slightly modified your suggestions (explained above) and added a test which looks like it confirms this is now working as expected.
// Namespace/Kind/Name should be unique for the cluster.
s := strings.Join([]string{pod.ObjectMeta.Namespace, ownerRef.Kind, ownerRef.Name}, "/")
dpm[s] = append(dpm[s], pod)
for _, container := range pod.Spec.Containers {
There's a strong assumption that the owner is an RC, RS, SS or alike, i.e. one template for all the instances. Also, there's no policy disallowing the creation of an owner that's not template based, i.e. all pods owned by the same owner kind/name but each pod created from a different template. As you now mention in the strategy description: "A pod is said to be a duplicate of other if both of them are from same creator, kind and are within the same namespace, and have at least one container with the same image."
It's perfectly valid for template based owners. However, for non-template based owners, one might face the following situation:
pod1:
- container: image1
- container: image2
pod2:
- container: image1
- container: image3
Which, based on the current definition of a duplicate, makes pod1 a duplicate of pod2. What about extending the definition to say:
A pod is a duplicate of another pod if both pods have:
- the same owner kind/name
- the same namespace
- the same list of containers (the same list of names, where any two containers with the same name have the same image)?
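A minimal sketch of that proposed check, with a hypothetical `samePodTemplate` helper that models a pod's containers as a name → image map (this is not code from this PR, just an illustration of the definition):

```go
// samePodTemplate is a hypothetical illustration of the proposed duplicate definition;
// it is not part of the descheduler codebase. Containers are modeled as name -> image.
func samePodTemplate(namespaceA, kindA, nameA string, containersA map[string]string,
	namespaceB, kindB, nameB string, containersB map[string]string) bool {
	// Same owner kind/name and the same namespace.
	if namespaceA != namespaceB || kindA != kindB || nameA != nameB {
		return false
	}
	// Same list of containers: same set of names, and equal names map to the same image.
	if len(containersA) != len(containersB) {
		return false
	}
	for name, image := range containersA {
		if containersB[name] != image {
			return false
		}
	}
	return true
}
```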
@ingvagabund This was actually a problem I considered, whether we want to consider duplicates when there is an exact matching list of images or if just one match is enough. I originally went with the latter, but can see that there is a case for matching exact lists.
So I have pushed 365e355, which updates the strategy to the strictest definition, where pods are only duplicates if they have the exact same list of owners and images.
But because now we're checking uniqueness against a list of strings (instead of just one string), the map we had before wouldn't work (unless we came up with a way to represent the entire list as a unique string, which could get long and messy).
So now, I'm building the list of all the ownerref/image keys for a pod, then sorting that list. I'm keying the duplicates map on the first entry of that list (because if another pod doesn't have the same first entry, it's not a duplicate). The value for that key in the map is a list of all the lists which contain the same first key. If we find a matching list, the pod is a duplicate.
I was going to just make a simple loop using the existing map, where we check if all the key entries have >1 existing match like before. But this could give a false positive for pods with shorter lists.
Please take a look at the new commit (I've kept it separate for now to review easier)
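To make that approach concrete, here is a minimal, self-contained sketch of the idea as described above (the `fakePod` type, `buildPodContainerKeys` helper, and hard-coded pods are illustrative stand-ins, not the actual descheduler code): build each pod's sorted owner/image key list, bucket lists by their first entry, and treat an exact list match as a duplicate.

```go
package main

import (
	"fmt"
	"reflect"
	"sort"
)

// fakePod is a stand-in for the fields the strategy actually looks at.
type fakePod struct {
	namespace, ownerKind, ownerName string
	images                          []string
}

// buildPodContainerKeys returns the sorted namespace/kind/name/image keys for one pod.
func buildPodContainerKeys(p fakePod) []string {
	keys := make([]string, 0, len(p.images))
	for _, img := range p.images {
		keys = append(keys, p.namespace+"/"+p.ownerKind+"/"+p.ownerName+"/"+img)
	}
	sort.Strings(keys)
	return keys
}

func main() {
	pods := []fakePod{
		{"ns1", "ReplicaSet", "rs1", []string{"image1", "image2"}},
		{"ns1", "ReplicaSet", "rs1", []string{"image2", "image1"}}, // duplicate: same sorted key list
		{"ns1", "ReplicaSet", "rs1", []string{"image1", "image3"}}, // not a duplicate: lists differ
	}
	// duplicateKeysMap is keyed on the first (smallest) key of each pod's sorted key list;
	// the value holds every key list seen so far that starts with that key.
	duplicateKeysMap := map[string][][]string{}
	for i, pod := range pods {
		keys := buildPodContainerKeys(pod)
		duplicate := false
		for _, existing := range duplicateKeysMap[keys[0]] {
			if reflect.DeepEqual(existing, keys) {
				duplicate = true
				break
			}
		}
		if duplicate {
			fmt.Printf("pod %d is a duplicate\n", i)
		} else {
			duplicateKeysMap[keys[0]] = append(duplicateKeysMap[keys[0]], keys)
		}
	}
}
```

Note how the third pod shares its first key with the first pod but is still not reported, which is the false-positive case a simple per-key count would get wrong.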
// the same first entry is clearly not a duplicate. This makes lookup quick and minimizes storage needed).
// If any of the existing lists for that first key matches the current pod's list, the current pod is a duplicate.
// If not, then we add this pod's list to the list of lists for that key.
duplicateKeysMap := map[string][][]string{}
for _, pod := range pods {
	ownerRefList := podutil.OwnerRef(pod)
	podKeys := make([]string, 0, len(ownerRefList)*len(pod.Spec.Containers))
s/podKeys/containerKeys
I want to make it clear that this is a collection of container keys essentially representing a pod, so I went with podContainerKeys
@@ -75,18 +89,21 @@ func listDuplicatePodsOnANode(ctx context.Context, client clientset.Interface, n
				podKeys = append(podKeys, s)
			}
		}
		sort.Strings(podKeys)
Assuming the following pods owned by the same (namespace, kind, owner) triple:
- pod1:
- image1
- image2
- pod2:
- image2
- image1
Do we want to consider both pods to be duplicates of each other?
Also, consider the following case, independent of ordering, where one image contains multiple binaries:
- pod1:
- image1 (containername=withbinary1)
- image1 (containername=withbinary2)
- pod2:
- image1 (containername=withbinary3)
- image1 (containername=withbinary4)
Each binary is responsible for a different aspect of a customer solution. We could keep going and discuss the case where two pods have the exact same list of containers (wrt. image and container name), yet both pods run different binaries, which would make the logic unnecessarily complicated.
This can eventually lead to customers reporting strange behavior for their custom controllers. As long as the owner kind is a known controller (RC, RS, SS, Deployment, ...), we are safe. However, we need to stress in the strategy description that it's meant to be run only over template based owners, or allow explicitly skipping a specific list of kinds. @damemi what do you think about adding a new `ignoreOwnerKinds` param to the strategy?
We talked about this a little offline so for reference here, I think that those 2 cases are reasonably out of the scope for what we should take into consideration for a duplicate pod.
The ordering of containers in a pod doesn't have any effect on their actual output, so I think it's safe to ignore that. And we can't really know what binaries are in the image, but we could look at the commands that are being run in the container. I think that's excessive though, but if there ends up being demand for that level of specificity I'm open to revisiting it.
Ultimately with these changes we have already made the strategy much stricter than it was before, which is a big improvement. I also pushed commit c11cb4f, which adds a parameter to exclude certain kinds of ownerrefs as you suggested (and also addresses the request in #243).
One caveat to note about this parameter is that it becomes all-or-nothing for considering a pod a duplicate. That is, if the pod has multiple ownerrefs and one of them is an excluded kind, we have to consider the whole pod ineligible for eviction. For example, if we just skipped adding the keys for that particular ownerref, it would actually make the pod more likely to match another pod (as it would have fewer keys identifying its uniqueness).
Please let me know if the new parameter matches what you had in mind, and if so I can squash the commits down.
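A minimal sketch of that all-or-nothing exclusion check; the `hasExcludedOwnerRefKind` name and package are hypothetical and not the exact code added in this PR:

```go
package strategies

import (
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

// hasExcludedOwnerRefKind reports whether any of the pod's owner references has a kind
// from the strategy's exclude list. If it does, the whole pod is skipped, rather than
// just dropping that ownerref's keys (which would make the pod look less unique).
func hasExcludedOwnerRefKind(ownerRefs []metav1.OwnerReference, excludeOwnerKinds []string) bool {
	exclude := make(map[string]struct{}, len(excludeOwnerKinds))
	for _, kind := range excludeOwnerKinds {
		exclude[kind] = struct{}{}
	}
	for _, ownerRef := range ownerRefs {
		if _, ok := exclude[ownerRef.Kind]; ok {
			return true
		}
	}
	return false
}
```

In the node loop, a pod for which this returns true would be skipped before any of its container keys are built, which is what makes the parameter all-or-nothing.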
Looks good!!! Please squash, I will lgtm then.
Squashed, and updated with a better commit message to describe the changes.
This increases the specificity of the RemoveDuplicates strategy by removing pods which not only have the same owner, but also have the same list of container images. This also adds a parameter, `ExcludeOwnerKinds`, to the RemoveDuplicates strategy which accepts a list of Kinds. If a pod has any of these Kinds as an owner, that pod is not considered for eviction.
/lgtm
/kind bug
Consider pod image in duplicates strategy
Issue: #230
This adds the images in a pod to the key for determining duplicates along with the ownerref. The intended effect of this is that now instead of just checking if 2 pods have the same owner, they will also need to both have a container with the same image.