Skip to content
This repository has been archived by the owner on Apr 22, 2023. It is now read-only.

Buffer Memory Leak #4525

Closed
trevnorris opened this issue Jan 6, 2013 · 6 comments
Closed

Buffer Memory Leak #4525

trevnorris opened this issue Jan 6, 2013 · 6 comments

Comments

@trevnorris
Copy link

I realized this while reviewing changes I want to make discussed in the mailing list.

Fast Buffers have a performance advantage since they are allocated from a single larger buffer. But that means if there is even one byte hanging around, the entire SlowBuffer must hang around in memory. The following is a script that will prove the point:

// WARNING: This script will crash your computer!
// ran this with `./out/Release/node --expose-gc mem-test.js
const SIZE = Buffer.poolSize - 2;
// if you want to test this w/o crashing your computer, reduce ITTR
const ITTR = 1e6;

var SlowBuffer = require('buffer').SlowBuffer;
var store = [];
var rss, tmp, i;

rss = process.memoryUsage().rss;

for (i = 0; i < ITTR; i++) {
// was using this as a control
    //tmp = SlowBuffer(SIZE);

    store.push(Buffer(1));
// comment out the following and you'll see that the problem no
// longer exists
    tmp = Buffer(SIZE);

    if (i % 1e3 === 0) gc();
}


setTimeout(function() {
    var nr = process.memoryUsage().rss
    console.log(((nr - rss) / 1024 / 1024).toFixed(2));
}, 1000);

Basically every loop will allocate
Buffer.poolSize
memory instead of just 1. I'll write an http example later that exhibits the same behavior.

@bnoordhuis
Copy link
Member

I appreciate that you bring it up but I'm well aware of that. It's not a bug, it's a conscious time/space trade-off.

Malloc'ing memory for each buffer independently has a massive performance impact. It's exactly the reason why the current scheme was implemented.

@trevnorris
Copy link
Author

@bnoordhuis I understand the implications of malloc'ing for each smaller buffer, but since Buffers are a major part of node, it seems there is a better memory allocation algorithm. You understand how memory pools are currently created, and that something like the following would leave allocated but unused memory hanging around:

var a = Buffer(1);
var b = Buffer(Buffer.poolSize / 2);
// here a new SlowBuffer is allocated and the old pool is forgotten about,
// but only about half of it was used.
var c = Buffer(Buffer.poolSize / 2);

I'm not sure it will help, but I'm investigating how v8 creates and tracks Strings. I want to understand how v8 can track many small strings, and still gc them when no longer used. Thinking is this could help the issue.

@bnoordhuis
Copy link
Member

I'm not sure it will help, but I'm investigating how v8 creates and tracks Strings.

With the same garbage collector that tracks buffers. But strings and buffers are different beasts. Strings don't have backlinks to other strings, buffers do.

@piscisaureus
Copy link

With the same garbage collector that tracks buffers. But strings and buffers are different beasts. Strings don't have backlinks to other strings, buffers do.

Well, eh... sliced strings :-)

@trevnorris
Copy link
Author

@bnoordhuis I wrote a benchmark test to try and accurately determine memory requirements and initialization speeds between strings and Buffers. Each one of the for() loops in the gist was run separately to get accurate measurements.

$ ./out/Release/node --expose-gc str-test.js
SlowBuffer: 4063ms
329.24 MB
SlowBuffer SIZE: 1754ms
299.32 MB
Buffer: 1856ms
137.12 MB
Buffer SIZE: 453ms
83.75 MB
String: 562ms
122.24 MB
Native: 407ms
92.84 MB

The two I feel are most relevant are Buffer and String. While Buffer SIZE is slightly faster and uses less memory than String it only allocates the memory and doesn't actually fill it. Also the poolSize for Buffer was pre-allocated to the exact size of the incoming data.

Basically what I'm getting at is there must be a faster and garbage collectible way to allocate memory for Buffers. As far as the problem of tracking memory pointers of Buffer slices, that could be mitigated down to the buf.slice() command. Instead of happening on all Buffers that are created within the poolSize.

I'm still investigating exactly how v8 does it for strings, and hoping to come across something we can use.

@piscisaureus Following through the logic in v8/src/heap.cc#3455-3546, FLAG_string_slices is only available when staring with the option --string_slices. Memory is copied otherwise.

@bnoordhuis
Copy link
Member

For posterity: I do have some ideas on how to deal with the issue of buffer retention.

To recap, the reason we have Buffers and SlowBuffers is that it's:

  1. Expensive to malloc() memory, and
  2. Expensive to have many objects that need finalization (i.e. have WeakCallbacks).

That's why we create a few big SlowBuffers (expensive) and then slice them up in many small Buffers (cheap because they're regular JS objects).

What to me seems like the most straightforward solution, is to extend V8 to support buffers natively. V8 has a compacting garbage collector so you probably don't want to allocate the buffer's memory on the JS heap (because objects get copied around all the time) but either malloc() it or allocate it in a separate zone (i.e. heap) that is not compacted.

The disadvantage of malloc() is that it can be slow. The disadvantage of a separate zone is that you need to implement something like malloc() on top in order to reclaim memory.

It's probably easiest to start with malloc() and if that doesn't pan out, switch to the zone-based solution.

gibfahn pushed a commit to ibmruntimes/node that referenced this issue Apr 26, 2016
The following significant (semver-major) changes have been made since the
previous Node v5.0.0 release.

* Buffer
  * New Buffer constructors have been added
    [nodejs#4682](nodejs/node#4682)
  * Previously deprecated Buffer APIs are removed
    [nodejs#5048](nodejs/node#5048),
    [nodejs#4594](nodejs/node#4594)
  * Improved error handling [nodejs#4514](nodejs/node#4514)
* Cluster
  * Worker emitted as first argument in 'message' event
    [nodejs#5361](nodejs/node#5361).
* Crypto
  * Improved error handling [nodejs#3100](nodejs/node#3100),
    [nodejs#5611](nodejs/node#5611)
  * Simplified Certificate class bindings
    [nodejs#5382](nodejs/node#5382)
  * Improved control over FIPS mode
    [nodejs#5181](nodejs/node#5181)
  * pbkdf2 digest overloading is deprecated
    [nodejs#4047](nodejs/node#4047)
* Dependencies
  * Reintroduce shared c-ares build support
    [nodejs#5775](nodejs/node#5775).
  * V8 updated to 5.0.71.31 [nodejs#6111](nodejs/node#6111).
* DNS
  * Add resolvePtr API to query plain DNS PTR records
    [nodejs#4921](nodejs/node#4921).
* Domains
  * Clear stack when no error handler
  [nodejs#4659](nodejs/node#4659).
* File System
  * The `fs.realpath()` and `fs.realpathSync()` methods have been updated
    to use a more efficient libuv implementation. This change includes the
    removal of the `cache` argument and the method can throw new errors
    [nodejs#3594](nodejs/node#3594)
  * FS apis can now accept and return paths as Buffers
    [nodejs#5616](nodejs/node#5616).
  * Error handling and type checking improvements
    [nodejs#5616](nodejs/node#5616),
    [nodejs#5590](nodejs/node#5590),
    [nodejs#4518](nodejs/node#4518),
    [nodejs#3917](nodejs/node#3917).
  * fs.read's string interface is deprecated
    [nodejs#4525](nodejs/node#4525)
* HTTP
  * 'clientError' can now be used to return custom errors from an
    HTTP server [nodejs#4557](nodejs/node#4557).
* Modules
  * Current directory is now prioritized for local lookups
    [nodejs#5689](nodejs/node#5689)
  * Symbolic links are preserved when requiring modules
    [nodejs#5950](nodejs/node#5950)
* Net
  * DNS hints no longer implicitly set
    [nodejs#6021](nodejs/node#6021).
  * Improved error handling and type checking
    [nodejs#5981](nodejs/node#5981),
    [nodejs#5733](nodejs/node#5733),
    [nodejs#2904](nodejs/node#2904)
* Path
  * Improved type checking [nodejs#5348](nodejs/node#5348).
* Process
  * Introduce process warnings API
    [nodejs#4782](nodejs/node#4782).
  * Throw exception when non-function passed to nextTick
    [nodejs#3860](nodejs/node#3860).
* Readline
  * Emit key info unconditionally
    [nodejs#6024](nodejs/node#6024)
* REPL
  * Assignment to `_` will emit a warning.
    [nodejs#5535](nodejs/node#5535)
* Timers
  * Fail early when callback is not a function
    [nodejs#4362](nodejs/node#4362)
* TLS
  * Rename 'clientError' to 'tlsClientError'
    [nodejs#4557](nodejs/node#4557)
  * SHA1 used for sessionIdContext
    [nodejs#3866](nodejs/node#3866)
* TTY
  * Previously deprecated setRawMode wrapper is removed
    [nodejs#2528](nodejs/node#2528).
* Util
  * Changes to Error object formatting
    [nodejs#4582](nodejs/node#4582).
* Windows
  * Windows XP and Vista are no longer supported
    [nodejs#5167](nodejs/node#5167),
    [nodejs#5167](nodejs/node#5167).
gibfahn pushed a commit to ibmruntimes/node that referenced this issue Apr 27, 2016
The following significant (semver-major) changes have been made since the
previous Node v5.0.0 release.

* Buffer
  * New Buffer constructors have been added
    [nodejs#4682](nodejs/node#4682)
  * Previously deprecated Buffer APIs are removed
    [nodejs#5048](nodejs/node#5048),
    [nodejs#4594](nodejs/node#4594)
  * Improved error handling [nodejs#4514](nodejs/node#4514)
* Cluster
  * Worker emitted as first argument in 'message' event
    [nodejs#5361](nodejs/node#5361).
* Crypto
  * Improved error handling [nodejs#3100](nodejs/node#3100),
    [nodejs#5611](nodejs/node#5611)
  * Simplified Certificate class bindings
    [nodejs#5382](nodejs/node#5382)
  * Improved control over FIPS mode
    [nodejs#5181](nodejs/node#5181)
  * pbkdf2 digest overloading is deprecated
    [nodejs#4047](nodejs/node#4047)
* Dependencies
  * Reintroduce shared c-ares build support
    [nodejs#5775](nodejs/node#5775).
  * V8 updated to 5.0.71.31 [nodejs#6111](nodejs/node#6111).
* DNS
  * Add resolvePtr API to query plain DNS PTR records
    [nodejs#4921](nodejs/node#4921).
* Domains
  * Clear stack when no error handler
  [nodejs#4659](nodejs/node#4659).
* File System
  * The `fs.realpath()` and `fs.realpathSync()` methods have been updated
    to use a more efficient libuv implementation. This change includes the
    removal of the `cache` argument and the method can throw new errors
    [nodejs#3594](nodejs/node#3594)
  * FS apis can now accept and return paths as Buffers
    [nodejs#5616](nodejs/node#5616).
  * Error handling and type checking improvements
    [nodejs#5616](nodejs/node#5616),
    [nodejs#5590](nodejs/node#5590),
    [nodejs#4518](nodejs/node#4518),
    [nodejs#3917](nodejs/node#3917).
  * fs.read's string interface is deprecated
    [nodejs#4525](nodejs/node#4525)
* HTTP
  * 'clientError' can now be used to return custom errors from an
    HTTP server [nodejs#4557](nodejs/node#4557).
* Modules
  * Current directory is now prioritized for local lookups
    [nodejs#5689](nodejs/node#5689)
  * Symbolic links are preserved when requiring modules
    [nodejs#5950](nodejs/node#5950)
* Net
  * DNS hints no longer implicitly set
    [nodejs#6021](nodejs/node#6021).
  * Improved error handling and type checking
    [nodejs#5981](nodejs/node#5981),
    [nodejs#5733](nodejs/node#5733),
    [nodejs#2904](nodejs/node#2904)
* OS X
  * MACOSX_DEPLOYMENT_TARGET has been bumped up to 10.7
    [nodejs#6402](nodejs/node#6402).
* Path
  * Improved type checking [nodejs#5348](nodejs/node#5348).
* Process
  * Introduce process warnings API
    [nodejs#4782](nodejs/node#4782).
  * Throw exception when non-function passed to nextTick
    [nodejs#3860](nodejs/node#3860).
* Readline
  * Emit key info unconditionally
    [nodejs#6024](nodejs/node#6024)
* REPL
  * Assignment to `_` will emit a warning.
    [nodejs#5535](nodejs/node#5535)
* Timers
  * Fail early when callback is not a function
    [nodejs#4362](nodejs/node#4362)
* TLS
  * Rename 'clientError' to 'tlsClientError'
    [nodejs#4557](nodejs/node#4557)
  * SHA1 used for sessionIdContext
    [nodejs#3866](nodejs/node#3866)
* TTY
  * Previously deprecated setRawMode wrapper is removed
    [nodejs#2528](nodejs/node#2528).
* Util
  * Changes to Error object formatting
    [nodejs#4582](nodejs/node#4582).
* Windows
  * Windows XP and Vista are no longer supported
    [nodejs#5167](nodejs/node#5167),
    [nodejs#5167](nodejs/node#5167).
gibfahn pushed a commit to ibmruntimes/node that referenced this issue May 6, 2016
The following significant (semver-major) changes have been made since the
previous Node v5.0.0 release.

* Buffer
  * New Buffer constructors have been added
    [nodejs#4682](nodejs/node#4682)
  * Previously deprecated Buffer APIs are removed
    [nodejs#5048](nodejs/node#5048),
    [nodejs#4594](nodejs/node#4594)
  * Improved error handling [nodejs#4514](nodejs/node#4514)
* Cluster
  * Worker emitted as first argument in 'message' event
    [nodejs#5361](nodejs/node#5361).
* Crypto
  * Improved error handling [nodejs#3100](nodejs/node#3100),
    [nodejs#5611](nodejs/node#5611)
  * Simplified Certificate class bindings
    [nodejs#5382](nodejs/node#5382)
  * Improved control over FIPS mode
    [nodejs#5181](nodejs/node#5181)
  * pbkdf2 digest overloading is deprecated
    [nodejs#4047](nodejs/node#4047)
* Dependencies
  * Reintroduce shared c-ares build support
    [nodejs#5775](nodejs/node#5775).
  * V8 updated to 5.0.71.31 [nodejs#6111](nodejs/node#6111).
* DNS
  * Add resolvePtr API to query plain DNS PTR records
    [nodejs#4921](nodejs/node#4921).
* Domains
  * Clear stack when no error handler
  [nodejs#4659](nodejs/node#4659).
* File System
  * The `fs.realpath()` and `fs.realpathSync()` methods have been updated
    to use a more efficient libuv implementation. This change includes the
    removal of the `cache` argument and the method can throw new errors
    [nodejs#3594](nodejs/node#3594)
  * FS apis can now accept and return paths as Buffers
    [nodejs#5616](nodejs/node#5616).
  * Error handling and type checking improvements
    [nodejs#5616](nodejs/node#5616),
    [nodejs#5590](nodejs/node#5590),
    [nodejs#4518](nodejs/node#4518),
    [nodejs#3917](nodejs/node#3917).
  * fs.read's string interface is deprecated
    [nodejs#4525](nodejs/node#4525)
* HTTP
  * 'clientError' can now be used to return custom errors from an
    HTTP server [nodejs#4557](nodejs/node#4557).
* Modules
  * Current directory is now prioritized for local lookups
    [nodejs#5689](nodejs/node#5689)
  * Symbolic links are preserved when requiring modules
    [nodejs#5950](nodejs/node#5950)
* Net
  * DNS hints no longer implicitly set
    [nodejs#6021](nodejs/node#6021).
  * Improved error handling and type checking
    [nodejs#5981](nodejs/node#5981),
    [nodejs#5733](nodejs/node#5733),
    [nodejs#2904](nodejs/node#2904)
* OS X
  * MACOSX_DEPLOYMENT_TARGET has been bumped up to 10.7
    [nodejs#6402](nodejs/node#6402).
* Path
  * Improved type checking [nodejs#5348](nodejs/node#5348).
* Process
  * Introduce process warnings API
    [nodejs#4782](nodejs/node#4782).
  * Throw exception when non-function passed to nextTick
    [nodejs#3860](nodejs/node#3860).
* Readline
  * Emit key info unconditionally
    [nodejs#6024](nodejs/node#6024)
* REPL
  * Assignment to `_` will emit a warning.
    [nodejs#5535](nodejs/node#5535)
* Timers
  * Fail early when callback is not a function
    [nodejs#4362](nodejs/node#4362)
* TLS
  * Rename 'clientError' to 'tlsClientError'
    [nodejs#4557](nodejs/node#4557)
  * SHA1 used for sessionIdContext
    [nodejs#3866](nodejs/node#3866)
* TTY
  * Previously deprecated setRawMode wrapper is removed
    [nodejs#2528](nodejs/node#2528).
* Util
  * Changes to Error object formatting
    [nodejs#4582](nodejs/node#4582).
* Windows
  * Windows XP and Vista are no longer supported
    [nodejs#5167](nodejs/node#5167),
    [nodejs#5167](nodejs/node#5167).
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants