Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

prometheus-node-exporter: declare cpu/mem requests and limits #2450

Merged

Conversation

consideRatio
Copy link
Member

@consideRatio consideRatio commented Mar 31, 2023

We've seen the node-exporter pod be evicted because it has no resource requests/limits when what needs to be evicted are other pods to resolve the situation. Let's not allow that to happen!

CPU (leap, 30 last days)

Note that the use dropped at some point. That was me upgrading the GKE based k8s cluster in #2337.

image

Memory (leap, 30 last days)

image

Why I choose the requests and limits

  1. We don't want this pod to starve other pods of CPU if it bursts, but it seems it can only burst to ~3m CPU. Setting a limit on 5m CPU seemed fine.
  2. We want to ensure the typical operation is guaranteed CPU wise, which seem to be ~0-3m CPU. Setting a request of 5m CPU seemed fine.
  3. We want requests/limits of memory to be the same to avoid getting evicted by memory, and crashing reliably if this is too low instead. I saw that it has peaked at ~21Mi, so I went for 30Mi to be safe, knowing there is a few hundred Mi of memory available still to not cause issues with planned requests/limits in New default machine types and profile list options - sharing nodes is great! #2121.

@consideRatio consideRatio self-assigned this Mar 31, 2023
Copy link
Contributor

@pnasrat pnasrat left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor changes for comments/docs

This generally makes sense though.

helm-charts/support/values.yaml Outdated Show resolved Hide resolved
helm-charts/support/values.yaml Outdated Show resolved Hide resolved
helm-charts/support/values.yaml Outdated Show resolved Hide resolved
@consideRatio
Copy link
Member Author

Thank you for reviewing @pnasrat! I realized that I had copy pasted incorrect prometheus queries so I drew the wrong conclusions. It turns out this pod is peaking at very low CPU (~3m) and memory (~21Mi), so I've made updates on queries, screenshots, and requests/limits etc.

Copy link
Contributor

@pnasrat pnasrat left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

@consideRatio consideRatio merged commit 113baf9 into 2i2c-org:master Apr 4, 2023
@consideRatio
Copy link
Member Author

Thank you @pnasrat for reviewing!!

@github-actions
Copy link

github-actions bot commented Apr 4, 2023

🎉🎉🎉🎉

Monitor the deployment of the hubs here 👉 https://github.com/2i2c-org/infrastructure/actions/runs/4608120591

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
No open projects
Archived in project
Development

Successfully merging this pull request may close these issues.

2 participants