Increase memory & disk size limit for pangeo prometheus #1906

Merged 1 commit on Nov 14, 2022

Commits on Nov 11, 2022

  1. Increase memory & disk size limit for pangeo prometheus

    While investigating why the pangeo hubs' prometheus was dead,
    I discovered prometheus/prometheus#6934, which describes a big
    temporary *spike* in memory usage when prometheus is recovering
    from a restart. It reads the WAL (the write-ahead log) to make
    sure it hasn't lost any data during the restart process itself.
    The details of the WAL are unimportant in this specific case;
    what matters is that prometheus memory usage spikes on restarts!
    
    I hand-edited the prometheus deployment with
    `kubectl -n support edit deployment support-prometheus-server`
    and gave it a higher memory limit (8G). Then I watched actual memory
    usage with `watch kubectl -n support top pod`. It momentarily
    spiked to almost 5G before settling back to about 1.5G. The old memory
    limit was 4G, so during the spike the server gets killed! It then
    enters CrashLoopBackOff, as it can never survive a restart.
    
    This commit raises the memory limit, so it won't keep crashing :)
    
    I also manually increased the size of the disk (with
    `kubectl -n support edit pvc`), although disk space wasn't the
    problem. We still need to persist that change, so it is included here.
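
    For orientation, the kind of values change described above might look
    roughly like the sketch below. This is an illustration, not the actual
    diff: the key paths assume the upstream prometheus-community chart
    nested under a `prometheus` key, the 8Gi figure simply mirrors the
    limit that was tested by hand, and the disk size is a placeholder
    because the new size is not stated in this message.

    ```yaml
    # Hypothetical Helm values override (assumed layout, not the real diff).
    prometheus:
      server:
        resources:
          limits:
            # Assumption: mirrors the hand-tested 8G limit. It must sit above
            # the observed restart spike of almost 5G; the old limit was 4G.
            memory: 8Gi
        persistentVolume:
          # Placeholder: match whatever size the PVC was manually resized to.
          size: "REPLACE_WITH_RESIZED_PVC_SIZE"
    ```

    Keeping the limit well above the observed spike leaves headroom for
    the WAL to grow between restarts.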
    
    Hopefully this will fix 2i2c-org#1843
    yuvipanda committed Nov 11, 2022
    Commit e3bd48f