Eviction of pods with the safe-to-evict: false annotation - scale-down-delay-after-add issue #7269

Open
leonelvargas opened this issue Sep 9, 2024 · 1 comment
Labels: area/cluster-autoscaler, kind/bug

Comments

@leonelvargas

Which component are you using?:
Cluster autoscaler

What version of the component are you using?:
v1.30.0

Component version:
v1.30.0

What k8s version are you using (kubectl version)?:

kubectl version Output
$ kubectl version
Client Version: v1.30.0
Kustomize Version: v5.0.4-0.20230601165947-6ce0bf390ce3
Server Version: v1.30.3-eks-2f46c53

What environment is this in?:
EKS - AWS

What did you expect to happen?:

Once the configured scale-down-delay-after-add time expires, the autoscaler should mark the node and scale it down if it is no longer needed, unless it is running pods with the “safe-to-evict: false” annotation.
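
For reference, a minimal sketch of how the annotation is set on the affected pods (the pod name, container, and image below are placeholders; the annotation key is the one documented in the cluster-autoscaler FAQ):

apiVersion: v1
kind: Pod
metadata:
  name: example-worker                  # placeholder name
  annotations:
    # Tells cluster-autoscaler that this pod must not be evicted during scale-down
    cluster-autoscaler.kubernetes.io/safe-to-evict: "false"
spec:
  containers:
    - name: worker
      image: example/worker:latest      # placeholder image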

What happened instead?:

Once this delay expires, even if new pods are running on that node, the autoscaler proceeds to drain the node via the cluster API, evicting those pods.
Note that all of these pods carry the “safe-to-evict: false” annotation, yet they are drained anyway.

How to reproduce it (as minimally and precisely as possible):

Reproducing it reliably is my problem. I can't determine the cause, but in our environments, if we configure the autoscaler as follows:

./cluster-autoscaler
--v=4
--stderrthreshold=info
--cloud-provider=aws
--skip-nodes-with-local-storage=false
--expander=least-waste
--node-group-auto-discovery=asg:tag=k8s.io/cluster-autoscaler/enabled,k8s.io/cluster-autoscaler/{{ .Values.aws.clusterName }}
--balance-similar-node-groups
--skip-nodes-with-system-pods=false
--ignore-daemonsets-utilization=true
--scale-down-delay-after-add=30m
--scale-down-utilization-threshold=0.01

the number of evicted pods increases. But if we configure the autoscaler with a larger delay, evictions are very markedly reduced:

./cluster-autoscaler
--v=4
--stderrthreshold=info
--cloud-provider=aws
--skip-nodes-with-local-storage=false
--expander=least-waste
--node-group-auto-discovery=asg:tag=k8s.io/cluster-autoscaler/enabled,k8s.io/cluster-autoscaler/{{ .Values.aws.clusterName }}
--balance-similar-node-groups
--skip-nodes-with-system-pods=false
--ignore-daemonsets-utilization=true
--scale-down-delay-after-add=4h
--scale-down-utilization-threshold=0.01

Anything else we need to know?:

I'm attaching an analysis I did, because this error only appears when the autoscaler is enabled. If I instead leave enough fixed nodes in the cluster for my pods' workloads, eviction problems never occur, hence my concern and this ticket.
Example case:
At 15:56 (UTC-3) the node has no pods with the “safe-to-evict: false” annotation, because pod “5bc..” has finished its work (no eviction occurs). Four minutes later the cluster schedules 2 pods with the annotation on that node, but by then the node's scale-down-delay-after-add period has already expired.
The strange thing is that the node is subsequently drained anyway.

[screenshot]

Drained pods:
[screenshots]

Autoscaler logs:
I'm also attaching the autoscaler logs:
autoscaler-logs.txt

Pod with the “safe-to-evict: false” annotation:
[screenshot]

Hypothesis

The autoscaler does not take the “safe-to-evict: false” annotation into account once scale-down-delay-after-add has expired.
I'm also linking a ticket I found describing similar behavior caused by a problem with this annotation; they may be related:
#7244

Thanks.

@adrianmoisey (Member)

/area cluster-autoscaler
