[cluster.js] removeHandlesForWorker() on both 'exit' and 'disconnect' events. #9418
[cluster.js][worker] Calls removeHandlesForWorker(worker) on both the 'exit' and 'disconnect' events instead of just 'disconnect', because sometimes the 'disconnect' event isn't emitted at all. In that case some handles are left behind, and once all workers are dead an exception is raised: "AssertionError: Resource leak detected".
@misterdjules @sam-github @tjfontaine: Any thoughts on this?
@jshkurti Thank you for the pull request! For now, most of the team is focused on releasing v0.12.1, which is why it's going to take some time to review it. It's at the top of my review list though, so I'll get there soon.
A couple comments.
Figuring out why disconnect isn't occurring should be the focus, I think, but I'll look carefully at your patch to see if it is safe.
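One way to narrow down whether 'disconnect' is really missing for some workers is to record the order in which the two events arrive. The following is only a sketch: `trackWorkerEvents` and `fakeCluster` are hypothetical names, not part of the cluster module, and the fake emitter stands in for the real `cluster` object so the snippet runs without forking anything.

```javascript
var EventEmitter = require('events').EventEmitter;

// Hypothetical helper: records the order in which 'disconnect' and 'exit'
// arrive for each worker id, and reports any 'exit' that was not preceded
// by a 'disconnect' for the same worker.
function trackWorkerEvents(source, report) {
  var seen = {}; // worker id -> list of event names, in arrival order

  function record(eventName, worker) {
    var events = seen[worker.id] || (seen[worker.id] = []);
    events.push(eventName);
    if (eventName === 'exit' && events.indexOf('disconnect') === -1) {
      report(worker.id, events.slice());
    }
  }

  source.on('disconnect', function (worker) { record('disconnect', worker); });
  source.on('exit', function (worker) { record('exit', worker); });
  return seen;
}

// Stand-in for the cluster object (assumption: same event names and a
// worker object with an `id` as the first argument).
var fakeCluster = new EventEmitter();
var suspicious = [];
trackWorkerEvents(fakeCluster, function (id, events) {
  suspicious.push({ id: id, events: events });
});

fakeCluster.emit('disconnect', { id: 1 });
fakeCluster.emit('exit', { id: 1 }); // normal order, not reported
fakeCluster.emit('exit', { id: 2 }); // 'exit' with no prior 'disconnect'
```

In a real master, passing `require('cluster')` as `source` should behave the same way, since the cluster object emits both events with the worker as the first argument.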
Thank you for the quick response :)
I agree, it was the only way to trigger the exception on purpose though.
No, I double-checked everything. As I said above, it happens quite randomly, let's say once in 50 restarts. For some reason the worker dies without firing 'disconnect', or maybe 'disconnect' is fired after 'exit'. I updated the code:

    worker.process.once('exit', function(exitCode, signalCode) {
      /*
       * Remove the handles associated with this
       * worker a second time just in case 'disconnect'
       * event isn't emitted.
       */
      if (worker.state != 'disconnected')
        removeHandlesForWorker(worker);
      // ...
    });

This avoids redundant calls to removeHandlesForWorker(): it is now only called from the 'exit' handler if 'exit' fires before 'disconnect'.
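The effect of that guard can be modeled outside of cluster.js. This is a simplified sketch, not the actual patch: the `handles` table, `watch`, and `makeWorker` are hypothetical, and both events are emitted on a plain EventEmitter rather than on `worker.process`.

```javascript
var EventEmitter = require('events').EventEmitter;

// Simplified stand-in for the master's handle table: handle key -> worker ids.
// The real cluster.js structures differ; this only models the bookkeeping.
var handles = { 'tcp:3000': [1, 2] };
var removalCalls = 0;

function removeHandlesForWorker(worker) {
  removalCalls++;
  Object.keys(handles).forEach(function (key) {
    handles[key] = handles[key].filter(function (id) {
      return id !== worker.id;
    });
    if (handles[key].length === 0) delete handles[key];
  });
}

function watch(worker) {
  worker.once('disconnect', function () {
    worker.state = 'disconnected';
    removeHandlesForWorker(worker);
  });
  worker.once('exit', function () {
    // Only clean up here if 'disconnect' never ran for this worker.
    if (worker.state !== 'disconnected') removeHandlesForWorker(worker);
    worker.state = 'dead';
  });
}

function makeWorker(id) {
  var worker = new EventEmitter();
  worker.id = id;
  worker.state = 'online';
  watch(worker);
  return worker;
}

var w1 = makeWorker(1);
var w2 = makeWorker(2);
w1.emit('disconnect');
w1.emit('exit'); // normal path: cleaned up once, by 'disconnect'
w2.emit('exit'); // 'disconnect' skipped: cleaned up by 'exit' instead
```

In this model each worker triggers exactly one cleanup, and the handle table is empty once both workers are gone, whichever event arrives first.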
Any update on this? We also use clusters, and every few hours or so all Node processes are down because of this bug. We did not have this issue last week with 0.10.
+1. You can use this workaround for now:
Thanks, I will look at this today. I was also considering using PM2 and no longer managing my own clusters, but I'd rather not add another dependency on my servers.
@jshkurti @gzurbach ... how would you like to proceed on this one? If this is something that needs to be fixed in v0.12, then opening a new PR targeted there would be best. Otherwise, this may need to be moved over to http://github.com/nodejs/node master.
I am still seeing this on 4.1.2
os: Linux VM_56_101_centos 3.10.0-123.el7.x86_64 #1 SMP Mon Jun 30 12:09:22 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux

Each time I kill a worker, the master exits along with it. The cluster file is simple:

    var cluster = require('cluster');
    var http = require('http');
    var app = require('../app');
    //var numCPUs = require('os').cpus().length;
    var numCPUs = 1; // raise the core count again once the offline statistics are split out
    var port = process.env.PORT || '3000';
    if (cluster.isMaster) {
      // Fork workers.
      for (var i = 0; i < numCPUs; i++) {
        cluster.fork();
      }
      cluster.on('fork', function (worker) {
        console.log('[%s] [worker:%d] new worker start', Date(), worker.process.pid);
      });
      // Log the dead worker and fork a replacement after one second.
      cluster.on('exit', function (worker) {
        console.error('[%s] [master:%s] worker:%s disconnect, suicide: %s, state: %s.',
          Date(), process.pid, worker.process.pid, worker.suicide, worker.state);
        setTimeout(function () {
          console.log('restart worker...');
          cluster.fork();
        }, 1000);
      });
    } else {
      http.createServer(app).listen(port);
    }
Hello, I have previously posted this issue : #9409
I finally came up with this piece of code, which intentionally triggers the exception:
I then found the solution: double-check that the handles are removed when a worker dies, even if it emits only the 'exit' event and not 'disconnect', as sometimes happens.
Calling removeHandlesForWorker() twice isn't much of a problem because it does nothing if the worker has already been removed from that handle.
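To illustrate why a second call is harmless, here is a minimal model of a handle that tracks the workers attached to it. `SharedHandle` is a hypothetical name and the real handle objects in cluster.js are more involved; the sketch only assumes that removal is a no-op for a worker that is no longer attached.

```javascript
// Hypothetical model of one shared handle: it tracks which worker ids use it.
function SharedHandle() {
  this.workerIds = [];
}

SharedHandle.prototype.add = function (id) {
  if (this.workerIds.indexOf(id) === -1) this.workerIds.push(id);
};

// Returns true only when this call removed the last attached worker,
// meaning the handle can now be closed.
SharedHandle.prototype.remove = function (id) {
  var index = this.workerIds.indexOf(id);
  if (index === -1) return false; // already removed: nothing to do
  this.workerIds.splice(index, 1);
  return this.workerIds.length === 0;
};

var handle = new SharedHandle();
handle.add(1);
handle.add(2);
var first = handle.remove(1);  // worker 1 removed, worker 2 still attached
var second = handle.remove(1); // worker 1 already gone: no change
var third = handle.remove(2);  // last worker removed: handle can be closed
var fourth = handle.remove(2); // repeated call: no change, no second close
```

Because a repeated removal returns early, running the cleanup from both 'disconnect' and 'exit' cannot release the same handle twice in this model.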
Thanks.