-
Notifications
You must be signed in to change notification settings - Fork 7.3k
Gaurd against infinite memory leak in dgram #8705
Conversation
This gaurds against a memory leak with an unbounded queue. We've seen this queue have 250k items in production.
self.send.apply(self, self._sendQueue[i]); | ||
self._sendQueue = undefined; | ||
}); | ||
self.on('listening', onListening); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should this be once
, since onListening
blows away the ._sendQueue
property after it executes (making subsequent calls to onListening
throw an exception)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Inside onListening
I remove the listener manually. I should probably remove the listener as the first line to avoid a sync race condition.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I noticed a flaw here. I need to not add an 'error' listener as thats a major breaking semantic change.
I'll refactor this to once and deal with the error case differently.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added another commit to address this.
@@ -110,6 +110,7 @@ function Socket(type, listener) { | |||
this._bindState = BIND_STATE_UNBOUND; | |||
this.type = type; | |||
this.fd = null; // compatibility hack | |||
this.highWaterMark = 100; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm curious -- why 100
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pretty arbitrary. I really don't know what a good default is.
@Raynos ... still an issue for you? This would likely need to be updated and retargeted at master http://github.com/nodejs/node if you'd like to pursue. |
@mcollina could you check to see if this might still be an issue? |
@Fishrock123 I believe I can artificially reproduce the problem on master: 'use strict';
const dns = require('dns');
dns.lookup = function(address, family, callback) {
callback(new Error('foo'));
};
const dgram = require('dgram');
const socket = dgram.createSocket('udp4');
socket.on('error', (err) => {});
setInterval(() => {
for (let i = 0; i < 100000; ++i) {
socket.send('foobar', 80, 'google.com');
}
console.log(`queue length: ${socket._queue ? socket._queue.length : 0}`);
}, 1000); The queue grows on every |
Sounds like it's still an issue. At uber we only have one UDP use case and it has a manual gaurd ( https://github.com/uber/node-statsd-client/blob/master/lib/ephemeral-socket.js#L136-L150 ) in it work around this bug. So, funny story. Because of the "nodejs dns bug" we've literally stopped using DNS in our uber data centers and only use IP addresses. |
When send() triggers an implicit bind, the send operation is added to an internal queue. If a DNS error occurs during the bind, there is currently no mechanism for clearing the queue other than sending more data. If DNS errors keep occurring, the queue will continue to grow with no upper bound. This commit reports errors with implicit binds, and clears the queue. This should be fine, given the nature of UDP. Refs: nodejs/node-v0.x-archive#8705 Refs: nodejs#10902 PR-URL: nodejs#11036 Reviewed-By: Matteo Collina <[email protected]> Reviewed-By: James M Snell <[email protected]>
Closed in nodejs/node#11036. |
There is a unbounded array in the dgram library. We've seen this array have over 250 thousands items in production heap dumps.
We put about a huge amount of traffic through our statsd UDP socket. When we get DNS failures this unbounded queue buffers
all statsd data which causes a huge memory leak.
cc @tjfontaine @indutny