Cluster workers not sharing ports after reopening a listener #6693

gpkeene · 2016-05-11T14:31:44Z

Version: 6.0.0
Platform: Windows 7 64-bit

Based on the responses to this question, I'm trying to figure out why multiple workers can call server.listen() on the same port/address without any issues, while an existing worker that calls server.close() and then server.listen() on the same port repeatedly gets the error EADDRINUSE.

It does not seem to be a case of the listener not closing correctly, as a close event is emitted, and that is when I attempt to set up the new listener. While this worker is getting EADDRINUSE, newly spawned workers are able to call server.listen() with no issues.

Here is a simple test that will demonstrate the problem. As workers are forked every 100ms, they will establish a listener on port 16000. When worker 10 is forked, it will establish a timeout to tear down its listener after 1s. Once a close event is emitted, it will attempt to call server.listen() on port 16000 again and get the EADDRINUSE error. For consistency, this test explicitly provides the same address during binding to avoid any potential issues with core modules dealing with a null address.

This particular implementation will cause worker 10 to then take up all cycles once it hits the error during binding, thereby keeping the master process from forking new workers. If a delay is added before calling server.listen(), worker 10 will still continue to hit EADDRINUSE while the master continually forks new workers that are capable of establishing listeners.

var cluster = require('cluster');
var net     = require('net');

if (cluster.isMaster) {
    // Master: fork a new worker every 100ms.
    setInterval(function(){cluster.fork()},100);
} else {
    var workerID = cluster.worker.id;
    var server;
    // Create a fresh server and listen on 127.0.0.1:16000.
    var setup = function() {
        console.log('Worker ' + workerID + ' setting up listener');
        server = net.createServer(function(stream) {});
        server.on('error', function(err) {
            console.log('Error on worker ' + workerID, err);
            // On any error (including EADDRINUSE), close and retry immediately.
            teardown();
        });
        if (workerID == 10) {
            // Only worker 10 tears down its listener, 1s after it is established.
            server.listen(16000, '127.0.0.1', function() {
                console.log('Worker ' + workerID + ' listener established');
                setTimeout(teardown, 1000);
            });
        } else {
            server.listen(16000, '127.0.0.1', function() {
                console.log('Worker ' + workerID + ' listener established');
            });
        }
    };
    // Close the listener; once the close event fires, set up a new one.
    var teardown = function() {
        console.log('Worker ' + workerID + ' closing listener');
        server.close(setup);
    };
    setup();
}

Initial output from this test case:

Worker 1 setting up listener
Worker 1 listener established
Worker 2 setting up listener
Worker 2 listener established
Worker 3 setting up listener
Worker 3 listener established
Worker 4 setting up listener
Worker 4 listener established
Worker 5 setting up listener
Worker 5 listener established
Worker 6 setting up listener
Worker 6 listener established
Worker 7 setting up listener
Worker 7 listener established
Worker 8 setting up listener
Worker 8 listener established
Worker 9 setting up listener
Worker 9 listener established
Worker 10 setting up listener
Worker 10 listener established
Worker 11 setting up listener
Worker 11 listener established
Worker 12 setting up listener
Worker 12 listener established
Worker 13 setting up listener
Worker 13 listener established
Worker 14 setting up listener
Worker 14 listener established
Worker 15 setting up listener
Worker 15 listener established
Worker 16 setting up listener
Worker 16 listener established
Worker 17 setting up listener
Worker 17 listener established
Worker 18 setting up listener
Worker 18 listener established
Worker 19 setting up listener
Worker 19 listener established
Worker 10 closing listener
Worker 10 setting up listener
Error on worker 10 { [Error: bind EADDRINUSE 127.0.0.1:16000]
  code: 'EADDRINUSE',
  errno: 'EADDRINUSE',
  syscall: 'bind',
  address: '127.0.0.1',
  port: 16000 }
Worker 10 closing listener
Worker 10 setting up listener
Error on worker 10 { [Error: bind EADDRINUSE 127.0.0.1:16000]
  code: 'EADDRINUSE',
  errno: 'EADDRINUSE',
  syscall: 'bind',
  address: '127.0.0.1',
  port: 16000 }
Worker 10 closing listener
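
The delayed variant mentioned above (waiting before calling server.listen() again) could look roughly like the sketch below. It is not code from the report: listenWithRetry and its options (delayMs, maxAttempts, schedule, onGiveUp) are hypothetical names introduced for illustration. It does not make EADDRINUSE go away on 6.0.0. It only keeps the worker from retrying in a tight loop and stops after a fixed number of attempts.

```javascript
'use strict';

// Hypothetical helper, not part of the original report: retry listen() after a
// fixed delay and give up after maxAttempts instead of retrying forever.
function listenWithRetry(server, port, host, options) {
  const opts = options || {};
  const delayMs = opts.delayMs === undefined ? 500 : opts.delayMs;
  const maxAttempts = opts.maxAttempts === undefined ? 5 : opts.maxAttempts;
  const schedule = opts.schedule || setTimeout; // injectable, e.g. for tests
  const onGiveUp = opts.onGiveUp || function () {};
  let attempts = 0;

  function attempt() {
    attempts += 1;
    server.listen(port, host);
  }

  server.on('error', function (err) {
    // Stop on unrelated errors or once the attempt budget is used up.
    if (err.code !== 'EADDRINUSE' || attempts >= maxAttempts) {
      onGiveUp(err, attempts);
      return;
    }
    // Defer the retry so the worker does not spin in a tight loop.
    schedule(function () {
      server.close(); // make sure the server is closed before listening again
      attempt();
    }, delayMs);
  });

  attempt();
}

// Usage inside a worker (port and address taken from the report):
//   const net = require('net');
//   const server = net.createServer(function (stream) {});
//   listenWithRetry(server, 16000, '127.0.0.1', {
//     onGiveUp: function (err, n) {
//       console.log('Gave up after ' + n + ' attempts', err.code);
//     }
//   });
```

The schedule option exists only so the retry logic can be exercised without real timers; in a worker the default setTimeout is what provides the delay.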

(This issue has all the same information as this StackOverflow post that I posted a couple days back.)


It allows reopening a server after it has been closed.

Fixes: nodejs#6693
PR-URL: nodejs#6981
Reviewed-By: Ben Noordhuis
Reviewed-By: Colin Ihrig
Reviewed-By: Ron Korving
Reviewed-By: James M Snell

santigimeno · 2016-05-27T17:03:02Z

Fixed by 0c29436


addaleax added the cluster and net labels May 11, 2016

santigimeno mentioned this issue May 25, 2016 in cluster: reset handle index on close #6981 (Merged, 4 tasks)

santigimeno closed this as completed in #6981 May 27, 2016
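
The thread does not explain the mechanism beyond the PR title, "cluster: reset handle index on close". The toy model below is one reading of that title, under two assumptions that are not stated in the thread: the master caches listening handles under a key that includes a per-worker index, and a worker increments that index each time it calls listen() for the same address. MasterModel, WorkerModel, and reopen are hypothetical names and do not mirror the real cluster internals.

```javascript
'use strict';

// Toy model only: an illustration of the assumed mechanism, not Node.js code.
class MasterModel {
  constructor() {
    this.handles = new Map(); // "address:port:index" -> shared handle
    this.bound = new Set();   // endpoints that already have a bound socket
  }

  // Reuse the handle stored under this key, or try to bind a new one.
  queryServer(address, port, index) {
    const key = [address, port, index].join(':');
    if (!this.handles.has(key)) {
      const endpoint = address + ':' + port;
      if (this.bound.has(endpoint)) return 'EADDRINUSE';
      this.bound.add(endpoint);
      this.handles.set(key, { endpoint: endpoint });
    }
    return null; // no error
  }
}

class WorkerModel {
  constructor(master, resetIndexOnClose) {
    this.master = master;
    this.resetIndexOnClose = resetIndexOnClose;
    this.indexes = new Map(); // "address:port" -> index used by the last listen()
  }

  // First listen() uses index 0; each later listen() bumps the index.
  listen(address, port) {
    const key = address + ':' + port;
    const index = this.indexes.has(key) ? this.indexes.get(key) + 1 : 0;
    this.indexes.set(key, index);
    return this.master.queryServer(address, port, index);
  }

  // With the reset enabled, the next listen() starts again at index 0.
  close(address, port) {
    if (this.resetIndexOnClose) this.indexes.delete(address + ':' + port);
  }
}

// Two workers share a port, then the first one closes and listens again.
function reopen(resetIndexOnClose) {
  const master = new MasterModel();
  const a = new WorkerModel(master, resetIndexOnClose);
  const b = new WorkerModel(master, resetIndexOnClose);
  a.listen('127.0.0.1', 16000); // index 0: master binds the port
  b.listen('127.0.0.1', 16000); // index 0: shares the existing handle
  a.close('127.0.0.1', 16000);
  return a.listen('127.0.0.1', 16000);
}

console.log(reopen(false)); // 'EADDRINUSE': index 1 maps to a new key, port is taken
console.log(reopen(true));  // null: index 0 again, existing handle is reused
```

In this model, fresh workers always start at index 0 and so keep sharing the original handle, which matches the observation that only the reopening worker fails. The model never releases the bound endpoint, which is a simplification.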
