Steady memory consumption increase of about 2MB/h #859
Comments
I have encountered this problem too. Memory leak?
What do you get for the metric? It would help to get a pprof memory allocation graph.
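For reference, the exporter's self-reported memory metrics can be checked directly on its /metrics endpoint; a quick sketch, assuming the default listen address of localhost:9100:

```sh
# Self-reported memory and goroutine metrics from node_exporter's own /metrics endpoint.
curl -s http://localhost:9100/metrics \
  | grep -E '^(process_resident_memory_bytes|go_memstats_heap_inuse_bytes|go_goroutines)'
```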
Our memory consumption with version 0.15.3 is stable, but ranges from 7MB minimum to 60MB maximum across thousands of nodes. We disable the ipvs, xfs and zfs collectors.
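For reference, disabling default-enabled collectors uses the --no-collector.&lt;name&gt; flag form; a sketch, assuming the kingpin-style flags used since 0.15:

```sh
# Disable the ipvs, xfs and zfs collectors (all enabled by default).
node_exporter --no-collector.ipvs --no-collector.xfs --no-collector.zfs
```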
@xguerin Your screenshot cuts off the Y-axis values, so I have no idea what it means.
Apologies, I did not realize that. I have updated the post.
Those numbers seem reasonable. As @grobie said, the amount of memory used by the exporter depends on which collectors you're using and how much data they need to gather. There is probably some optimization we could do, but this doesn't look like a clear memory leak. As you can see in the graph, the Go GC is doing some work and reducing the process size from time to time. pprof samples are what we would need to debug further and see whether any specific collectors are using up or leaking memory.
/usr/sbin/node_exporter --version
@csawyerYumaed We really need a pprof dump in order to figure out what's going on. You can follow the same basic steps as in this blog post, but gather the data from node_exporter rather than from Prometheus itself.
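A minimal sketch of that workflow, assuming node_exporter listens on the default localhost:9100 and serves the Go /debug/pprof endpoints on the same port:

```sh
# Fetch a heap profile from the running exporter and render it as an SVG call graph.
# Requires graphviz for the SVG output; the address is an assumption.
go tool pprof -svg http://localhost:9100/debug/pprof/heap > heap.svg

# Or open the interactive pprof shell against the same endpoint:
go tool pprof http://localhost:9100/debug/pprof/heap
```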
Do I run that go tool pprof command in the source directory of node_exporter?
You don't need the source to run it, so you can run it from anywhere.
Oh, duh. Sorry, I was asleep at the wheel: it takes the URL as the argument to know where to profile. I get it. I'll work on it the next time it happens (I've already killed that process as it's super annoying).
I can email the raw SVG somewhere if you want, but GitHub won't let me post it here, and imgur and friends apparently don't accept them either.

$ ps auxww | grep node
I've upgraded to 0.15.2 and 0.15.1 on these nodes to see if they also leak; if they do, I'll post the same information for those versions, respectively.
It would also be useful to test the 0.16.0 release candidate.
OK, I didn't see the RC release. I've now put the 0.16.0 RC on a node that leaked before as well. If it leaks, it will probably do so overnight; I'll check on them in ~16 hours from now and report back.
@daimon99 Have you upgraded to the latest version?
I'm not seeing anything akin to a memory leak on 0.16.0-rc.0. On my machine it builds up to about 18MB of memory usage and is completely steady after that. I haven't disabled any collectors on 0.16; there didn't seem to be any need for it.
The same here. Below is the graph of the exporter's memory usage. The corresponding pprof heap dump is here: https://www.dropbox.com/s/d48l0rdco8cl4t7/heap.svg?dl=0. The RSS of the process and the command line we run are included as well.
And this memory usage pattern is not unique: there are a few other instances of the exporter showing the same behaviour. It looks like it's caused by the supervisord collector.
@zerkms Interesting, thanks for the quality bug report. Yes, it looks like the xmlrpc library is leaky, or the supervisord collector doesn't close the connections properly and causes a leak.
One thing I've discovered so far is that the supervisord collector leaks a goroutine on every scrape. I've tried adding some explicit RPC close calls.
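One way to observe the per-scrape goroutine growth from the outside is via the exporter's own go_goroutines metric; a sketch, assuming the default listen address of localhost:9100 and exposed /debug/pprof endpoints:

```sh
# Each curl is itself a scrape; with the leaky collector enabled,
# go_goroutines climbs by roughly one per scrape instead of staying flat.
for i in 1 2 3; do
  curl -s http://localhost:9100/metrics | grep '^go_goroutines'
  sleep 15
done

# A goroutine profile shows where the leaked goroutines are blocked.
go tool pprof http://localhost:9100/debug/pprof/goroutine
```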
Host operating system: output of uname -a
node_exporter version: output of node_exporter --version
node_exporter command line flags
Are you running node_exporter in Docker?
Yes, through the kube-prometheus operator.
What did you do that produced an error?
Just left the node_exporter running.
What did you expect to see?
Memory consumption stabilizing around 50MB.
What did you see instead?
Memory consumption increasing by about 2MB/h.
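For reference, one rough way to confirm that growth rate outside of Prometheus is to sample the process RSS over time; a minimal sketch, assuming the binary is named node_exporter and pgrep/ps are available:

```sh
# Print the exporter's RSS (in KB) once an hour; a steady increase of
# roughly 2048 KB per sample matches the behaviour described above.
while true; do
  date
  ps -o rss= -p "$(pgrep -f node_exporter | head -n1)"
  sleep 3600
done
```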