Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

JSON.stringify on process.nodeModules can result in an OOM when using custom serializer #27536

Closed
coopa opened this issue May 2, 2019 · 5 comments
Labels
help wanted Issues that need assistance from volunteers or PRs that need help to proceed. module Issues and PRs related to the module subsystem.

Comments

@coopa
Copy link

coopa commented May 2, 2019

  • Version: 8+
  • Platform: MacOS 18.5.0, Ubuntu 16.04
  • Subsystem:

I was working through an upgrade from Node 7.9 to the latest LTS at Node 10.15 when I was met with OOM errors that suddenly started happening post-upgrade. After some digging I discovered that the culprit was a call to JSON.stringify on an object that happened to contain a reference to NodeJS.Process (current process). The call to stringify process tries to enumerate process.mainModule NodeModule object and OOMs while trying to do so.

Each NodeModule has two properties that point to the hierarchy of requires: parent and children. parent points to the current module's requirer and children points to all modules this one requires. When running JSON.stringify on mainModule with default JSON.stringify settings it quickly falls out with a circular reference due to a child pointing to a parent. We are using a library called json-stringify-safe (https://github.com/moll/json-stringify-safe) that makes it so stringify does not throw on circular references. This library adds a circular marker instead of the would-be value into the JSON output without throwing, and runs until it exhausts every non-circular path. I have found that there was a change in behavior after Node 8 that caused this number of non-circular paths to explode exponentially in Node modules.

From what I can tell in Node 7 each module is loaded and pointed to only once via the NodeModule.children property: only by the first parent to require the module. The child in turn points to that first parent via NodeModule.parent property. However, in Node 8+ every parent module holds children references to every module this parent requires instead of just those it was first to require, while the child's parent property points to the first parent to have loaded this child. This means that every package to require some child module that it was not the first one to require will also end up traversing the entire tree of the first module to have loaded it when stringified.

For a more illustrative example, imagine the following scenario:

Module Foo requires modules A and B.
Module Bar requires modules A and D.

In Node 7 the following relationship would be created:
Foo.children = [A, B]
A.parent = Foo
B.parent = Foo
Bar.children = [D] // A is not part of this collection
D.parent = Bar

In Node 8+ the following relationships are created:
Foo.children = [A, B]
A.parent = Foo
B.parent = Foo
Bar.children = [A, D]
// A's parent is Foo
D.parent = Bar

When Node 7 stringify executes the following sequence of keys will be visited:

  • Foo
    • Foo.children
      • Foo.children.A
        • Foo.children.A.parent (circular, stop)
      • Foo.children.B
        • Foo.children.B.parent (circular, stop)
  • Bar
    • Bar.children
      • Bar.children.D
        • Bar.children.D.parent (circular, stop)

When Node 10 stringify runs the same thing will happen for Foo, however Bar gets more interesting:

  • Bar
    • Bar.children
      • Bar.children.A
        • Bar.children.A.parent (Foo)
          • Bar.children.A.parent.children
            • Bar.children.A.parent.children.A (circular, stop)
            • Bar.children.A.parent.children.B
              • Bar.children.A.parent.children.B.parent (circular, stop)
                // D runs fine

This behavior multiplied out to dozens (hundreds?) of packages explodes exponentially easily chugging through gigs of RAM. The following script can be run to easily reproduce this issue on any version of Node after 8. You'll have to install the winston package for this example but there are others that will do it too: winston was just an easy one with lots of monthly downloads.

// Load the winston module
// Source: https://www.npmjs.com/package/winston
// Install:
// > npm i winston
require("winston");

//// External module: json-stringify-safe
// I left it copy-pasted here for easier experimentation.
// Source: https://github.com/moll/json-stringify-safe
function stringify(obj, replacer, spaces, cycleReplacer) {
  return JSON.stringify(obj, serializer(replacer, cycleReplacer), spaces)
}

function serializer(replacer, cycleReplacer) {
  var stack = [], keys = []

  if (cycleReplacer == null) cycleReplacer = function(key, value) {
    if (stack[0] === value) return "[Circular ~]"
    return "[Circular ~." + keys.slice(0, stack.indexOf(value)).join(".") + "]"
  }

  return function(key, value) {
    if (stack.length > 0) {
      var thisPos = stack.indexOf(this)
      ~thisPos ? stack.splice(thisPos + 1) : stack.push(this)
      ~thisPos ? keys.splice(thisPos, Infinity, key) : keys.push(key)
      if (~stack.indexOf(value)) value = cycleReplacer.call(this, key, value)
    }
    else stack.push(value)

    return replacer == null ? value : replacer.call(this, key, value)
  }
}
//// End json-stringify-safe

// Engage!
stringify(process);

I ended up figuring out a simple fix for this in our code base by declaring the property pointing to Process as non-enumerable. This way for-in, stringify, or Object.keys cannot see it and won't try to delve in.

I wanted to bring this up as a potential issue. The module hierarchy is obviously working as intended, but the hierarchy does have this potential of blowing out heap on attempts to traverse it. My gut reaction suggests making parent non-enumerable is easiest, but I also don't know what the intent or direction might be from Node maintainers or if this is even seen as a problem.

@bnoordhuis bnoordhuis added the module Issues and PRs related to the module subsystem. label May 2, 2019
@bnoordhuis
Copy link
Member

Thanks for the good bug report! What you're describing sounds like the change from #14132. That PR makes it possible to have cycles, see #7131 for why that is necessary.

No strong opinion on whether it's a good idea to make some properties non-enumerable but CITGM should hopefully catch ecosystem regressions.

@BridgeAR
Copy link
Member

BridgeAR commented May 2, 2019

Shall we really hide the children parent? We could instead add a toJSON function and also a custom inspection function for util.inspect on the prototype. That way the inspection should be fine while still showing most properties as they are.

@coopa
Copy link
Author

coopa commented May 6, 2019

Either direction would work. Overriding the serialization methods leaves custom serializers able to enumerate the parent property. I actually ended up going in that direction with my code as well instead of making the properties non-enumerable. This limits the scope of the change to just the problematic bit.

@jasnell jasnell added the help wanted Issues that need assistance from volunteers or PRs that need help to proceed. label Jun 26, 2020
@jasnell
Copy link
Member

jasnell commented Jun 26, 2020

It's not clear what the next step on this one should be.

@marco-ippolito
Copy link
Member

this issue doesnt seem to affect node 20

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
help wanted Issues that need assistance from volunteers or PRs that need help to proceed. module Issues and PRs related to the module subsystem.
Projects
None yet
Development

No branches or pull requests

5 participants