Skip to content
This repository has been archived by the owner on Apr 22, 2023. It is now read-only.

Not using HTTP connection pool when also using TCP #4362

Closed
kenperkins opened this issue Dec 4, 2012 · 10 comments
Closed

Not using HTTP connection pool when also using TCP #4362

kenperkins opened this issue Dec 4, 2012 · 10 comments

Comments

@kenperkins
Copy link

I've got an isolated test that, when using TCP connections simultaneously with HTTP requests, appears to make node.js not use the connection pool (i.e. creating more connections than globalAgent.maxSockets.

Tests were created with 2 node apps listening on different machines. Source for HTTP Server and TCP Server.

The first test creates a global socket which is reused, and should use node's built in agent with connection keep-alive.

var async = require('async'),
    net = require('net'),
    request = require('request');

var listHttp = [];

// Create our global socket
var socket = net.createConnection(6379, '192.168.10.2');

var doHttpCall = function(callback) {

// Do three tcp calls in parallel before doing our http call
async.parallel([ 
      doTcpCall,
      doTcpCall,
      doTcpCall
    ], function() {
        request.get({
            url: 'http://192.168.10.1:8098',
            headers: {
                connection: 'keep-alive'
            }
        }, callback);
    });
};

var doTcpCall = function(callback) {
   socket.write('Hello!\n');
   callback(); 
};

// push 1000 functions into our array for being called 
for (var i = 0; i < 1000; i++) {
   listHttp.push(doHttpCall);
}

// after we connect, do 10 concurrent calls doHttpCall
socket.on('connect', function() {
     async.forEachLimit(listHttp, 10, function(item, next) {
        item(next);
    },
    function() {
        process.exit();
    });
});

After running this test, netstat --all shows exactly what we expect, with only a single TIME_WAIT connection to port 6379, and 5 to 8098.

Contrast that with this test:

var async = require('async'),
    net = require('net'),
    request = require('request');

var listHttp = [];

// For each HTTP call, first do a single TCP call
var doHttpCall = function(callback) {
    doTcpCall(function() {
        request.get({
            url: 'http://192.168.10.1:8098',
            headers: {
                Connection: 'keep-alive'
            }
        }, callback);
    });
};

// create a new socket for each tcp call
var doTcpCall = function(callback) {
    var socket = net.createConnection(6379, '192.168.10.2');

    socket.on('data',function(data) {
        callback();
        socket.end();
    }).on('connect', function() {
        socket.write('Hello!\n');
    });
};

for (var i = 0; i < 1000; i++) {
    listHttp.push(doHttpCall);
}

// do 10 concurrent calls doHttpCall
async.forEachLimit(listHttp, 10, function(item, next) {
    item(next);
},
function() {
    process.exit();
});

After running this test, we see as we expect ~1000 TCP TIME_WAIT connections, but we see ~40+ HTTP connections in the TIME_WAIT status. I can't explain why this is happening.

@bnoordhuis
Copy link
Member

Sorry, you'll have to take it up with the request or async guys. I can only take test cases that use core modules only.

@kenperkins
Copy link
Author

Should I open a new issue with the third party modules removed, or update thus one?

Sent from my iPhone

On Dec 4, 2012, at 6:56 PM, Ben Noordhuis [email protected] wrote:

Sorry, you'll have to take it up with the request or async guys. I can only take test cases that use core modules only.


Reply to this email directly or view it on GitHub.

@bnoordhuis
Copy link
Member

Update this one and I'll reopen it.

@kenperkins
Copy link
Author

So I did a significant amount of research today trying to better understand what's going on, and I think I've been able to nail it.

The default behavior of the globalAgent is to immediately close a connection when there are no pending connections is confusing to me at least, and seemingly wrong at worst. Allow me to explain.

var http = require('http');

var count = 0;

// Although using connection: keep-alive, as the default agent does,
// this test, by having a delay, causes each connection to be closed since there
// are no immediately pending connections.

function doCall() {
   count++;
   console.log(count);
   http.get('http://my-web-server/', function(response) {
      setTimeout(doCall, 0);
   });
}

doCall();

This code creates as many unique connections as the machine can handle (provided you let the test run long enough) and doesn't re-use any of them. I found this confusing as you would expect connections to be reused within a resonable time frame such that you amortize the cost of creating a connection across numbers of requests. Obviously, this is the case if you have a high number of concurrent requests, but I have a number of apps that do significant amounts of work between HTTP, TCP, and local execution that end up CPU limited far before having enough concurrent connections to not end up with them being closed.

Can you help explain this behavior?

@kenperkins
Copy link
Author

Did a bit more digging and found the code in question and I think it sheds some light:

function Agent(options) {
  EventEmitter.call(this);

  var self = this;
  self.options = options || {};
  self.requests = {};
  self.sockets = {};
  self.maxSockets = self.options.maxSockets || Agent.defaultMaxSockets;
  self.on('free', function(socket, host, port, localAddress) {
    var name = host + ':' + port;
    if (localAddress) {
      name += ':' + localAddress;
    }

    if (self.requests[name] && self.requests[name].length) {
      self.requests[name].shift().onSocket(socket);
      if (self.requests[name].length === 0) {
        // don't leak
        delete self.requests[name];
      }
    } else {
      // If there are no pending requests just destroy the
      // socket and it will get removed from the pool. This
      // gets us out of timeout issues and allows us to
      // default to Connection:keep-alive.
      socket.destroy();
    }
  });
  self.createConnection = net.createConnection;
}

Again, given the situations like above, why is it bad to have the connection closed only if it's not in use within some time frame?

@bnoordhuis
Copy link
Member

Again, given the situations like above, why is it bad to have the connection closed only if it's not in use within some time frame?

It's a question that comes up from time to time. The primary reason is that it's bad Internet citizenship to keep unused connections open.

There's been discussion of adding an x seconds delay but nothing has come of it yet. Speaking for myself, I wouldn't be supportive of such a change.

@eklitzke
Copy link

With respect to it being "bad Internet citizenship to keep unused connections open", I think that this is a bit misguided for a couple of reasons:

  • If a server thinks that an idle connection is being held open for too long, it can close the socket on its end. All major HTTP servers that implement keep-alive semantics will do this -- they'll close idle connections after N seconds of inactivity. This is normal -- the operator of the remote server knows best what the idle timeout should be, so it's incumbent on them to close idle persistent connections (or disable keep-alive altogether).
  • Most major web browsers have extremely long timeouts; for instance, Chrome will keep idle connections open for up to 10 minutes(!), and the deployment and usage of Chrome on the internet is pervasive.
  • In general, creating short-lived TCP sockets and closing them is bad, because the TCP socket will be assigned an ephemeral port that cannot be immediately re-used; the port has to enter the TIME_WAIT state for some amount of time (two minutes on Linux) before the ephemeral port can be re-used; if a lot of short-lived sockets are being created in this way, one can quickly run out of ephemeral ports.
  • My suspicion is that a common use case for people using node, is that they're using it for some sort of "backend" service and making requests to other "backend" services over a local network. For instance, a node application connecting to a local database/proxy/cache/etc. Assuming these services respond quickly, even a very modest request rate can fill up the number of ephemeral ports quickly.

@isaacs
Copy link

isaacs commented Dec 27, 2012

In 0.12 we're going to explore adding some reasonable keepalive behavior to the http client by default. @bnoordhuis's objections are noted, and will be taken into account -- he has earned a much larger vote than the average node user, for sure -- but this is a common and reasonable request. It's not easy to do right now with the current state of the http client, which is something of a mess, and it was impossible to do properly prior to 0.6, since the open sockets would have kept the node process open until they all disconnected.

In the meantime, you can use/copy the "forever" agent in the request library, and I believe @indutny is working on a standalone module to use reuse sockets until the server closes them.

Node's http client should behave by default similarly to how browsers behave.

@bnoordhuis
Copy link
Member

Okay, let me clarify my opinion. I'm not against keeping connections open per se, I just don't want it to be the default behavior.

Two changes I can see us making to the current default agent are:

  1. Keeping the connection open for one more tick. Currently, a new request needs to be queued when the current request finishes in order for the connection to be reused. This is somewhat inconvenient because you may not have everything you need to start a new request until the current one is fully finished.
  2. Add an option that says: "Keep connections open for X amount of time.". To be clear, I want it to be opt-in, not opt-out. People that need that kind of behavior should have motivation enough to consult the documentation and find out it's there. For everyone else, the agent will keep on behaving in the most unobtrusive way possible.

In general, creating short-lived TCP sockets and closing them is bad, because the TCP socket will be assigned an ephemeral port that cannot be immediately re-used; the port has to enter the TIME_WAIT state for some amount of time (two minutes on Linux) before the ephemeral port can be re-used; if a lot of short-lived sockets are being created in this way, one can quickly run out of ephemeral ports.

I would have agreed if you mentioned TCP setup/teardown time but I'm not that worried about ephemeral port exhaustion (and I say that as someone who deals with bogus bug reports caused by same on an almost daily basis.)

On properly configured systems, the tuple space for TCP sockets is so mind boggingly big that it needs no consideration. Even in a stock two-system configuration, it's on the order of 32768^2 (about one billion). In the real world, it's more like 64,000^2.

Most major web browsers have extremely long timeouts; for instance, Chrome will keep idle connections open for up to 10 minutes(!), and the deployment and usage of Chrome on the internet is pervasive.

That's not a valid argument for me for two reasons:

  1. node.js is not a browser (sorry @isaacs, but it's not.). The workload a typical node.js applications deals with is not remotely similar to that of a browser. Apples and oranges.
  2. Chrome got - and gets - a lot of flack exactly because it's so anti-social. I don't want node.js to end up on the system admin shit list. (Full disclosure: one of my hats is the admin one and this behavior is something I really don't like about Chrome.)

@eklitzke
Copy link

To be clear, the issue with ephemeral ports that I was referencing was just for the ephemeral ports on the local system (presumably the server has a fixed port that it's using), e.g. the remote address is example.com port 80, and each new connection on my machine has to find an available ephemeral port. On my computer, out-of-the-box this is 28232 ports:

evan@loonix ~ $ cat /proc/sys/net/ipv4/ip_local_port_range 
32768   61000

although I'm sure this varies by distribution (and obviously it can be increased).

gibfahn pushed a commit to ibmruntimes/node that referenced this issue Apr 26, 2016
The following significant (semver-major) changes have been made since the
previous Node v5.0.0 release.

* Buffer
  * New Buffer constructors have been added
    [nodejs#4682](nodejs/node#4682)
  * Previously deprecated Buffer APIs are removed
    [nodejs#5048](nodejs/node#5048),
    [nodejs#4594](nodejs/node#4594)
  * Improved error handling [nodejs#4514](nodejs/node#4514)
* Cluster
  * Worker emitted as first argument in 'message' event
    [nodejs#5361](nodejs/node#5361).
* Crypto
  * Improved error handling [nodejs#3100](nodejs/node#3100),
    [nodejs#5611](nodejs/node#5611)
  * Simplified Certificate class bindings
    [nodejs#5382](nodejs/node#5382)
  * Improved control over FIPS mode
    [nodejs#5181](nodejs/node#5181)
  * pbkdf2 digest overloading is deprecated
    [nodejs#4047](nodejs/node#4047)
* Dependencies
  * Reintroduce shared c-ares build support
    [nodejs#5775](nodejs/node#5775).
  * V8 updated to 5.0.71.31 [nodejs#6111](nodejs/node#6111).
* DNS
  * Add resolvePtr API to query plain DNS PTR records
    [nodejs#4921](nodejs/node#4921).
* Domains
  * Clear stack when no error handler
  [nodejs#4659](nodejs/node#4659).
* File System
  * The `fs.realpath()` and `fs.realpathSync()` methods have been updated
    to use a more efficient libuv implementation. This change includes the
    removal of the `cache` argument and the method can throw new errors
    [nodejs#3594](nodejs/node#3594)
  * FS apis can now accept and return paths as Buffers
    [nodejs#5616](nodejs/node#5616).
  * Error handling and type checking improvements
    [nodejs#5616](nodejs/node#5616),
    [nodejs#5590](nodejs/node#5590),
    [nodejs#4518](nodejs/node#4518),
    [nodejs#3917](nodejs/node#3917).
  * fs.read's string interface is deprecated
    [nodejs#4525](nodejs/node#4525)
* HTTP
  * 'clientError' can now be used to return custom errors from an
    HTTP server [nodejs#4557](nodejs/node#4557).
* Modules
  * Current directory is now prioritized for local lookups
    [nodejs#5689](nodejs/node#5689)
  * Symbolic links are preserved when requiring modules
    [nodejs#5950](nodejs/node#5950)
* Net
  * DNS hints no longer implicitly set
    [nodejs#6021](nodejs/node#6021).
  * Improved error handling and type checking
    [nodejs#5981](nodejs/node#5981),
    [nodejs#5733](nodejs/node#5733),
    [nodejs#2904](nodejs/node#2904)
* Path
  * Improved type checking [nodejs#5348](nodejs/node#5348).
* Process
  * Introduce process warnings API
    [nodejs#4782](nodejs/node#4782).
  * Throw exception when non-function passed to nextTick
    [nodejs#3860](nodejs/node#3860).
* Readline
  * Emit key info unconditionally
    [nodejs#6024](nodejs/node#6024)
* REPL
  * Assignment to `_` will emit a warning.
    [nodejs#5535](nodejs/node#5535)
* Timers
  * Fail early when callback is not a function
    [nodejs#4362](nodejs/node#4362)
* TLS
  * Rename 'clientError' to 'tlsClientError'
    [nodejs#4557](nodejs/node#4557)
  * SHA1 used for sessionIdContext
    [nodejs#3866](nodejs/node#3866)
* TTY
  * Previously deprecated setRawMode wrapper is removed
    [nodejs#2528](nodejs/node#2528).
* Util
  * Changes to Error object formatting
    [nodejs#4582](nodejs/node#4582).
* Windows
  * Windows XP and Vista are no longer supported
    [nodejs#5167](nodejs/node#5167),
    [nodejs#5167](nodejs/node#5167).
gibfahn pushed a commit to ibmruntimes/node that referenced this issue Apr 27, 2016
The following significant (semver-major) changes have been made since the
previous Node v5.0.0 release.

* Buffer
  * New Buffer constructors have been added
    [nodejs#4682](nodejs/node#4682)
  * Previously deprecated Buffer APIs are removed
    [nodejs#5048](nodejs/node#5048),
    [nodejs#4594](nodejs/node#4594)
  * Improved error handling [nodejs#4514](nodejs/node#4514)
* Cluster
  * Worker emitted as first argument in 'message' event
    [nodejs#5361](nodejs/node#5361).
* Crypto
  * Improved error handling [nodejs#3100](nodejs/node#3100),
    [nodejs#5611](nodejs/node#5611)
  * Simplified Certificate class bindings
    [nodejs#5382](nodejs/node#5382)
  * Improved control over FIPS mode
    [nodejs#5181](nodejs/node#5181)
  * pbkdf2 digest overloading is deprecated
    [nodejs#4047](nodejs/node#4047)
* Dependencies
  * Reintroduce shared c-ares build support
    [nodejs#5775](nodejs/node#5775).
  * V8 updated to 5.0.71.31 [nodejs#6111](nodejs/node#6111).
* DNS
  * Add resolvePtr API to query plain DNS PTR records
    [nodejs#4921](nodejs/node#4921).
* Domains
  * Clear stack when no error handler
  [nodejs#4659](nodejs/node#4659).
* File System
  * The `fs.realpath()` and `fs.realpathSync()` methods have been updated
    to use a more efficient libuv implementation. This change includes the
    removal of the `cache` argument and the method can throw new errors
    [nodejs#3594](nodejs/node#3594)
  * FS apis can now accept and return paths as Buffers
    [nodejs#5616](nodejs/node#5616).
  * Error handling and type checking improvements
    [nodejs#5616](nodejs/node#5616),
    [nodejs#5590](nodejs/node#5590),
    [nodejs#4518](nodejs/node#4518),
    [nodejs#3917](nodejs/node#3917).
  * fs.read's string interface is deprecated
    [nodejs#4525](nodejs/node#4525)
* HTTP
  * 'clientError' can now be used to return custom errors from an
    HTTP server [nodejs#4557](nodejs/node#4557).
* Modules
  * Current directory is now prioritized for local lookups
    [nodejs#5689](nodejs/node#5689)
  * Symbolic links are preserved when requiring modules
    [nodejs#5950](nodejs/node#5950)
* Net
  * DNS hints no longer implicitly set
    [nodejs#6021](nodejs/node#6021).
  * Improved error handling and type checking
    [nodejs#5981](nodejs/node#5981),
    [nodejs#5733](nodejs/node#5733),
    [nodejs#2904](nodejs/node#2904)
* OS X
  * MACOSX_DEPLOYMENT_TARGET has been bumped up to 10.7
    [nodejs#6402](nodejs/node#6402).
* Path
  * Improved type checking [nodejs#5348](nodejs/node#5348).
* Process
  * Introduce process warnings API
    [nodejs#4782](nodejs/node#4782).
  * Throw exception when non-function passed to nextTick
    [nodejs#3860](nodejs/node#3860).
* Readline
  * Emit key info unconditionally
    [nodejs#6024](nodejs/node#6024)
* REPL
  * Assignment to `_` will emit a warning.
    [nodejs#5535](nodejs/node#5535)
* Timers
  * Fail early when callback is not a function
    [nodejs#4362](nodejs/node#4362)
* TLS
  * Rename 'clientError' to 'tlsClientError'
    [nodejs#4557](nodejs/node#4557)
  * SHA1 used for sessionIdContext
    [nodejs#3866](nodejs/node#3866)
* TTY
  * Previously deprecated setRawMode wrapper is removed
    [nodejs#2528](nodejs/node#2528).
* Util
  * Changes to Error object formatting
    [nodejs#4582](nodejs/node#4582).
* Windows
  * Windows XP and Vista are no longer supported
    [nodejs#5167](nodejs/node#5167),
    [nodejs#5167](nodejs/node#5167).
gibfahn pushed a commit to ibmruntimes/node that referenced this issue May 6, 2016
The following significant (semver-major) changes have been made since the
previous Node v5.0.0 release.

* Buffer
  * New Buffer constructors have been added
    [nodejs#4682](nodejs/node#4682)
  * Previously deprecated Buffer APIs are removed
    [nodejs#5048](nodejs/node#5048),
    [nodejs#4594](nodejs/node#4594)
  * Improved error handling [nodejs#4514](nodejs/node#4514)
* Cluster
  * Worker emitted as first argument in 'message' event
    [nodejs#5361](nodejs/node#5361).
* Crypto
  * Improved error handling [nodejs#3100](nodejs/node#3100),
    [nodejs#5611](nodejs/node#5611)
  * Simplified Certificate class bindings
    [nodejs#5382](nodejs/node#5382)
  * Improved control over FIPS mode
    [nodejs#5181](nodejs/node#5181)
  * pbkdf2 digest overloading is deprecated
    [nodejs#4047](nodejs/node#4047)
* Dependencies
  * Reintroduce shared c-ares build support
    [nodejs#5775](nodejs/node#5775).
  * V8 updated to 5.0.71.31 [nodejs#6111](nodejs/node#6111).
* DNS
  * Add resolvePtr API to query plain DNS PTR records
    [nodejs#4921](nodejs/node#4921).
* Domains
  * Clear stack when no error handler
  [nodejs#4659](nodejs/node#4659).
* File System
  * The `fs.realpath()` and `fs.realpathSync()` methods have been updated
    to use a more efficient libuv implementation. This change includes the
    removal of the `cache` argument and the method can throw new errors
    [nodejs#3594](nodejs/node#3594)
  * FS apis can now accept and return paths as Buffers
    [nodejs#5616](nodejs/node#5616).
  * Error handling and type checking improvements
    [nodejs#5616](nodejs/node#5616),
    [nodejs#5590](nodejs/node#5590),
    [nodejs#4518](nodejs/node#4518),
    [nodejs#3917](nodejs/node#3917).
  * fs.read's string interface is deprecated
    [nodejs#4525](nodejs/node#4525)
* HTTP
  * 'clientError' can now be used to return custom errors from an
    HTTP server [nodejs#4557](nodejs/node#4557).
* Modules
  * Current directory is now prioritized for local lookups
    [nodejs#5689](nodejs/node#5689)
  * Symbolic links are preserved when requiring modules
    [nodejs#5950](nodejs/node#5950)
* Net
  * DNS hints no longer implicitly set
    [nodejs#6021](nodejs/node#6021).
  * Improved error handling and type checking
    [nodejs#5981](nodejs/node#5981),
    [nodejs#5733](nodejs/node#5733),
    [nodejs#2904](nodejs/node#2904)
* OS X
  * MACOSX_DEPLOYMENT_TARGET has been bumped up to 10.7
    [nodejs#6402](nodejs/node#6402).
* Path
  * Improved type checking [nodejs#5348](nodejs/node#5348).
* Process
  * Introduce process warnings API
    [nodejs#4782](nodejs/node#4782).
  * Throw exception when non-function passed to nextTick
    [nodejs#3860](nodejs/node#3860).
* Readline
  * Emit key info unconditionally
    [nodejs#6024](nodejs/node#6024)
* REPL
  * Assignment to `_` will emit a warning.
    [nodejs#5535](nodejs/node#5535)
* Timers
  * Fail early when callback is not a function
    [nodejs#4362](nodejs/node#4362)
* TLS
  * Rename 'clientError' to 'tlsClientError'
    [nodejs#4557](nodejs/node#4557)
  * SHA1 used for sessionIdContext
    [nodejs#3866](nodejs/node#3866)
* TTY
  * Previously deprecated setRawMode wrapper is removed
    [nodejs#2528](nodejs/node#2528).
* Util
  * Changes to Error object formatting
    [nodejs#4582](nodejs/node#4582).
* Windows
  * Windows XP and Vista are no longer supported
    [nodejs#5167](nodejs/node#5167),
    [nodejs#5167](nodejs/node#5167).
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants