Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Consider pod image in duplicates strategy #275

Merged
merged 1 commit into from
May 15, 2020

Conversation

damemi
Copy link
Contributor

@damemi damemi commented May 1, 2020

Issue: #230

This adds the images in a pod to the key for determining duplicates along with the ownerref. The intended effect of this is that now instead of just checking if 2 pods have the same owner, they will also need to both have a container with the same image.

@k8s-ci-robot k8s-ci-robot added cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. size/S Denotes a PR that changes 10-29 lines, ignoring generated files. labels May 1, 2020
@damemi
Copy link
Contributor Author

damemi commented May 1, 2020

This strategy should probably be refactored a bit, but for a quick and dirty fix I think this will tighten up the duplicates edge case

// So any non-unique Namespace/Kind/Name/Image pattern is a duplicate pod.
// Note that this doesn't consider all containers in the pod when calculating a duplicate, just any match
s := strings.Join([]string{pod.ObjectMeta.Namespace, ownerRef.Kind, ownerRef.Name, container.Image}, "/")
dpm[s] = append(dpm[s], pod)
Copy link
Contributor

@ingvagabund ingvagabund May 4, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In case a pod has 2 or more containers, the strategy will evict the same pod more than once, failing in second and following attempts. Also, listDuplicatePodsOnANode no longer properly reflects the intention. It's more like listDuplicatePodsContainersOnANode.

What about to refactor listDuplicatePodsOnANode to return only a list of duplicated pods without returning duplicatePodsMap data type? I.e. func listDuplicatePodsOnANode(client clientset.Interface, node *v1.Node, evictLocalStoragePods bool) (duplicatedPods []*v1.Pod)

Keeping, what you have now. But adding another post-filter step:

var dupPodsMap [string]*v1.Pod

dpm := duplicatePodsMap{}
// Ignoring the error here as in the ListDuplicatePodsOnNode function we call ListEvictablePodsOnNode which checks for error.
for _, pod := range pods {
	ownerRefList := podutil.OwnerRef(pod)
	for _, ownerRef := range ownerRefList {
		for _, container := range pod.Spec.Containers {
			// Namespace/Kind/Name should be unique for the cluster.
			// We also consider the image, as 2 pods could have the same owner but serve different purposes
			// So any non-unique Namespace/Kind/Name/Image pattern is a duplicate pod.
			// Note that this doesn't consider all containers in the pod when calculating a duplicate, just any match
			s := strings.Join([]string{pod.ObjectMeta.Namespace, ownerRef.Kind, ownerRef.Name, container.Image}, "/")
			dpm[s] = append(dpm[s], pod)
			if len(dpm[s]) > 1 && dupPodsMap[key] != nil {
				key := pod.ObjectMeta.Namespace + "/" + pod.ObjectMeta.Name
				dupPodsMap[key] = pod
			}
		}
	}
}

for _, pod := range dupPodsMap {
	duplicatedPods = append(duplicatedPods, pod)
}

Possibly, changing duplicatePodsMap to just a yes/no switch:

type duplicatePodsMap map[string]*v1.Pod
...
if dpm[s] != nil &&  dupPodsMap[key] != nil {
	key := pod.ObjectMeta.Namespace + "/" + pod.ObjectMeta.Name
	dupPodsMap[key] = pod
}
dpm[s] = pod

@k8s-ci-robot k8s-ci-robot added size/M Denotes a PR that changes 30-99 lines, ignoring generated files. and removed size/S Denotes a PR that changes 10-29 lines, ignoring generated files. labels May 6, 2020
@damemi
Copy link
Contributor Author

damemi commented May 6, 2020

@ingvagabund thanks, you are right that it would lead to multiple attempts to delete the same pod, so I added your changes, please take a look. I think I was actually able to simplify from the code you suggested, so if you can verify that I understood your point that would help

@damemi damemi force-pushed the image-in-duplicates branch 3 times, most recently from 2cc9453 to ed43f65 Compare May 6, 2020 18:41
@damemi
Copy link
Contributor Author

damemi commented May 6, 2020

Actually, I think my previous changes would have caused pods with more than 1 container to automatically match on themselves, so I've made the following logic changes:

  • within the containers loop, build a podKeys list of fully-qualified owner/image keys that will be matched in dpm
  • at the end of the pod loop, add the pod to dpm for each key, as before (but now only in the pod loop, not the containers loop)
  • if we find a match, add this pod to the list of duplicates (this will ensure that the pod only matches on other pods). Then continue adding the rest of the keys (in case future pods may match on them) but we don't need to append this pod anymore

@damemi
Copy link
Contributor Author

damemi commented May 6, 2020

I also added tests to confirm this works as expected

@aveshagarwal could you take a look as well?

@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: damemi

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label May 12, 2020
@damemi
Copy link
Contributor Author

damemi commented May 12, 2020

cc @ingvagabund could you take another look at the solution we came to here? I slightly modified your suggestions (explained above) and added a test which looks like it confirms this is now working as expected

// Namespace/Kind/Name should be unique for the cluster.
s := strings.Join([]string{pod.ObjectMeta.Namespace, ownerRef.Kind, ownerRef.Name}, "/")
dpm[s] = append(dpm[s], pod)
for _, container := range pod.Spec.Containers {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's a strong assumption the owner is either RC, RS, SS or alike. I.e. one template for all the instances. Also, there's no policy disallowing to create an owner that's not template based. I.e. all pods owned by the same owner kind/name but each pod created from a different template. As you now mention in the strategy description: A pod is said to be a duplicate of other if both of them are from same creator, kind and are within the same namespace, and have at least one container with the same image. It's perfectly valid for template based owners. However, for non-template based owners, one might face the following situation:
pod1:

  • container: image1
  • container: image2
    pod2:
  • container: image1
  • container: image3
    Which, based on the current definition of a duplicate, pod1 is a duplicate of pod2. What about to extend the definition to say:
A pod is a duplicate of another pod if both pods have:
- the same owner kind/name
- the same namespace
- the same list of containers (the same list of names, each two equal names have the same image).

?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ingvagabund This was actually a problem I considered, whether we want to consider duplicates when there is an exact matching list of images or if just one match is enough. I originally went with the latter, but can see that there is a case for matching exact lists.

So I have pushed 365e355, which updates the strategy to the strictest definition, where pods are only duplicates if they have the exact same list of owners and images.

But because now we're checking uniqueness against a list of strings (instead of just one string), the map we had before wouldn't work (unless we came up with a way to represent the entire list as a unique string, which could get long and messy).

So now, I'm building the list of all the ownerref/image keys for a pod, then sorting that list. I'm keying the duplicates map on the first entry of that list (because if another pod doesn't have the same first entry, it's not a duplicate). The value for that key in the map is a list of all the lists which contain the same first key. If we find a matching list, the pod is a duplicate.

I was going to just make a simple loop using the existing map, where we check if all the key entries have >1 existing match like before. But this could give a false positive for pods with shorter lists.

Please take a look at the new commit (I've kept it separate for now to review easier)

@k8s-ci-robot k8s-ci-robot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. and removed size/M Denotes a PR that changes 30-99 lines, ignoring generated files. labels May 13, 2020
// the same first entry is clearly not a duplicate. This makes lookup quick and minimizes storage needed).
// If any of the existing lists for that first key matches the current pod's list, the current pod is a duplicate.
// If not, then we add this pod's list to the list of lists for that key.
duplicateKeysMap := map[string][][]string{}
for _, pod := range pods {
ownerRefList := podutil.OwnerRef(pod)
podKeys := make([]string, 0, len(ownerRefList)*len(pod.Spec.Containers))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s/podKeys/containerKeys

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I want to make it clear that this is a collection of container keys essentially representing a pod, so I went with podContainerKeys

@@ -75,18 +89,21 @@ func listDuplicatePodsOnANode(ctx context.Context, client clientset.Interface, n
podKeys = append(podKeys, s)
}
}
sort.Strings(podKeys)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Assuming the following pods owned by the same (namespace, kind, owner) triple:

- pod1:
  - image1
  - image2
- pod2:
  - image2
  - image1

Do we want to consider both pods to be duplicates of each other?

Also, the following case independent of ordering where one image contains multiple binaries:

- pod1:
  - image1 (containername=withbinary1)
  - image1 (containername=withbinary2)
- pod2:
  - image1 (containername=withbinary3)
  - image1 (containername=withbinary4)

Each binary responsible for a different aspect of a customer solution. We could keep going and discuss the case where two pods have the exact same list of containers (wrt. image and container name), yet both pods running different binaries. Which will make the logic unnecessarily complicated.

This can eventually lead to customers reporting strange behavior for their custom controllers. As long as the owner kind is a known controller (RC, RS, SS, Deployment, ..), we are safe. However, we need to stress in the strategy description it's meant to be run only over template based owners. Or, allow to explicitly skip specific list of kinds. @damemi what do you think about adding a new ignoreOwnerKinds param to the strategy?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We talked about this a little offline so for reference here, I think that those 2 cases are reasonably out of the scope for what we should take into consideration for a duplicate pod.

The ordering of containers in a pod doesn't have any effect on their actual output, so I think it's safe to ignore that. And we can't really know what binaries are in the image, but we could look at the commands that are being run in the container. I think that's excessive though, but if there ends up being demand for that level of specificity I'm open to revisiting it.

Ultimately with these changes we have already made the strategy much stricter than it was before, which is a big improvement. I also pushed commit c11cb4f, which adds a parameter to exclude certain kinds of ownerrefs as you suggested (and also addresses the request in #243).

One caveat to note about this parameter is that it becomes all-or-nothing for considering a pod a duplicate. That is, if the pod has multiple ownerrefs and one of them is an excluded kind, we have to consider the whole pod ineligible for eviction. Because for example, if we just skipped adding the keys for that particular ownerref, it would actually make the pod more likely to match another pod (as it would have less keys identifying its uniqueness)

Please let me know if the new parameter matches what you had in mind, and if so I can squash the commits down.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good!!! Please squash, I will lgtm then.

@k8s-ci-robot k8s-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label May 14, 2020
@k8s-ci-robot k8s-ci-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label May 14, 2020
@damemi damemi force-pushed the image-in-duplicates branch 3 times, most recently from f9c4499 to 7890f3a Compare May 15, 2020 13:21
@damemi
Copy link
Contributor Author

damemi commented May 15, 2020

Squashed, and updated with a better commit message to describe the changes.

@damemi damemi force-pushed the image-in-duplicates branch 2 times, most recently from c7778d5 to 633fe6d Compare May 15, 2020 13:34
This increases the specificity of the RemoveDuplicates strategy by
removing pods which not only have the same owner, but who also
must have the same list of container images. This also adds a
parameter, `ExcludeOwnerKinds` to the RemoveDuplicates strategy
which accepts a list of Kinds. If a pod has any of these Kinds as
an owner, that pod is not considered for eviction.
@ingvagabund
Copy link
Contributor

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label May 15, 2020
@k8s-ci-robot k8s-ci-robot merged commit 34550d4 into kubernetes-sigs:master May 15, 2020
@seanmalloy
Copy link
Member

/kind bug

@k8s-ci-robot k8s-ci-robot added the kind/bug Categorizes issue or PR as related to a bug. label May 20, 2020
@seanmalloy seanmalloy mentioned this pull request May 20, 2020
4 tasks
ingvagabund added a commit to ingvagabund/descheduler that referenced this pull request Jun 4, 2020
Based on https://github.com/kubernetes/community/blob/master/community-membership.md#requirements-1:

The following apply to the part of codebase for which one would be a reviewer in an OWNERS file (for repos using the bot).

> member for at least 3 months

For a couple of years now

> Primary reviewer for at least 5 PRs to the codebase

kubernetes-sigs#285
kubernetes-sigs#275
kubernetes-sigs#267
kubernetes-sigs#254
kubernetes-sigs#181

> Reviewed or merged at least 20 substantial PRs to the codebase

https://github.com/kubernetes-sigs/descheduler/pulls?q=is%3Apr+is%3Aclosed+assignee%3Aingvagabund

> Knowledgeable about the codebase

yes

> Sponsored by a subproject approver
> With no objections from other approvers
> Done through PR to update the OWNERS file

this PR

> May either self-nominate, be nominated by an approver in this subproject, or be nominated by a robot

self-nominating
briend pushed a commit to briend/descheduler that referenced this pull request Feb 11, 2022
Consider pod image in duplicates strategy
briend pushed a commit to briend/descheduler that referenced this pull request Feb 11, 2022
Based on https://github.com/kubernetes/community/blob/master/community-membership.md#requirements-1:

The following apply to the part of codebase for which one would be a reviewer in an OWNERS file (for repos using the bot).

> member for at least 3 months

For a couple of years now

> Primary reviewer for at least 5 PRs to the codebase

kubernetes-sigs#285
kubernetes-sigs#275
kubernetes-sigs#267
kubernetes-sigs#254
kubernetes-sigs#181

> Reviewed or merged at least 20 substantial PRs to the codebase

https://github.com/kubernetes-sigs/descheduler/pulls?q=is%3Apr+is%3Aclosed+assignee%3Aingvagabund

> Knowledgeable about the codebase

yes

> Sponsored by a subproject approver
> With no objections from other approvers
> Done through PR to update the OWNERS file

this PR

> May either self-nominate, be nominated by an approver in this subproject, or be nominated by a robot

self-nominating
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/bug Categorizes issue or PR as related to a bug. lgtm "Looks good to me", indicates that a PR is ready to be merged. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants