Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

lib: refactor prototype and Function.call.bind usage #18773

Closed
wants to merge 1 commit into from

Conversation

devsnek
Copy link
Member

@devsnek devsnek commented Feb 14, 2018

Moves usage of Object.prototype.* and methods saved with Function.call.bind to a single "primordials" "meta module" which ensures the safety of the method, cleans up the lib a bit, and i think also plays nicely with when we start doing snapshots in the future if @hashseed can speak to that.

This is intended to sit on or maybe replace #18750

Fixes: #17434

cc @apapirovski @hashseed @BridgeAR

Checklist
  • make -j4 test (UNIX), or vcbuild test (Windows) passes
  • commit message follows commit guidelines
Affected core subsystem(s)

crypto, assert, async_hooks, lib, bootstrap

@nodejs-github-bot nodejs-github-bot added the lib / src Issues and PRs related to general changes in the lib or src directory. label Feb 14, 2018
@devsnek
Copy link
Member Author

devsnek commented Feb 14, 2018

Copy link
Member

@apapirovski apapirovski left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm still -0 on this approach. It's a pain to maintain, it's a pain to use, it's a pain to introduce new contributors to it. It also does nothing for dependencies like acorn...

But to the PR itself: I think this doesn't fix all of the current usage, right? For example, EventEmitter is littered with various Object.XYZ calls (incl getPrototypeOf, create, etc.).

This would also likely be semver-major because we have no clue who (/ if anyone) relies on monkey-patching these currently.

@@ -250,7 +252,7 @@ function newUid() {
}

function getOrSetAsyncId(object) {
if (object.hasOwnProperty(async_id_symbol)) {
if (ReflectHas(object, async_id_symbol)) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This isn't semantically equivalent. It includes the prototype properties.

Copy link
Member Author

@devsnek devsnek Feb 14, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

are there random symbols on object prototype chains? the performance difference (according to jsperf) seems well worth the switch.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's more that if we don't care about prototype then we should just have this be an undefined check. I assume this was done for a specific reason but I wasn't the one that wrote that code so I don't know...

Copy link
Member

@hashseed hashseed Feb 14, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's not change the semantics here. You just need to set the prototype of the object to another object that already has an async id to make this behave in an unintended way. Besides, this PR should be a mechanical refactoring, not also change semantics if possible.

@devsnek
Copy link
Member Author

devsnek commented Feb 14, 2018

@apapirovski its only intended to be used moving forward, also since we already do things like const ArrayMap = Function.call.bind(Array.prototype.map) in core at this very moment this can be seen as tidying up if you like. in any event it is definitely not semver major, thats specifically why i didn't go back and replace existing usage in events and such.

Copy link
Member

@benjamingr benjamingr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

-1 I think this makes the code harder to read and most of these were only bound once anyway. I'm definitely willing to reconsider if we see significant numbers in benchmarks.

@hashseed
Copy link
Member

hashseed commented Feb 14, 2018

I'm +1 on this general approach. Internal implementation should not be monkey-patchable. That's just as bad, if not worse, than userland modules using internal APIs.

This pattern is also used in the JavaScript spec. Functions used to implement JS builtins are original, untouched internals unless explicitly specified via monkey-patchable APIs.

@benjamingr
Copy link
Member

I can totally get why this is done in the JavaScript spec - but I don't think we should be defensive with this in Node.js

@hashseed
Copy link
Member

Why not? You run the risk of not being able to change internal implementation because it is exposed via monkey-patching. And the risk of accidental breakages due to monkey-patching.

This is the same argument as moving internal APIs away from underscore properties.


ObjectIsPrototypeOf: uncurryThis(Object.prototype.isPrototypeOf),
ObjectHasOwnProperty: uncurryThis(Object.prototype.hasOwnProperty),
ObjectKeys: Object.keys,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess if we go this way, we might want to add all Object properties?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is more or less why I'm not enthusiastic about this work.

@BridgeAR BridgeAR added the semver-major PRs that contain breaking changes and should be released in the next major version. label Feb 14, 2018
@BridgeAR
Copy link
Member

I still have to think about this. If we add it, I would like to expose it as a module though. The reason is that users could use those as well, so it would actually have a benefit.

@hashseed
Copy link
Member

That is actually a good point. Users may want to be able to use untampered builtins too.

So it would be nicer to extract these builtins programmatically rather than hardcode a list.

@benjamingr
Copy link
Member

Who are we protecting ourselves from though?

Users can override a lot of built ins at the moment (not just there) - Node has never made any guarantee about what happens if you override Function.prototype.apply and this PR doesn't guard all built ins anyway.

I'm afraid that if we guard against this people will expect it to be... guarded - and that would be a lot more work and a maintenance burden.

@hashseed
Copy link
Member

Like I said, same arguments apply to underscored internal APIs. It would be consistent to make the same decision.

I think one of the reasons is that it's often opaque what the dependency tree of npm modules do, and impossible to police.

/cc @addaleax

@benjamingr
Copy link
Member

@hashseed I'm not sure I agree

The problem with underscored properties is that we need to maintain APIs that are scoped to an underscore.

I don't think anyone would expect Node to work if you override Object.prototype.toString.

@hashseed
Copy link
Member

I don't think anyone would expect Node to work if you override Object.prototype.toString.

You would expect a browser to keep working correctly if user code did that. Why should Node.js have a lower bar wrt robustness? When we had similar issues in V8 it was not a question that it should be fixed.

And where do you draw the line? Is String.prototype.replace fine to override? How about String.prototype.repeat?

Maybe this is just something to discuss for the TSC? I don't think this is going anywhere.

@devsnek
Copy link
Member Author

devsnek commented Feb 15, 2018

@hashseed if generating these programmatically, would it make more sense to wait until we have snapshots for perf reasons? i don't want to delay startup with a bunch of loops over all the well known globals. also if we expose them i would like them to be Object.freezed otherwise it kind of defeats the purpose of it since someone could swap out a method, although i realize freezing exports of builtins is an unpopular idea.

as a fuller reference impl: https://gist.github.com/devsnek/76aab1dcd169c96b952fbe8c74404475
line 27 would be what decides if this is exposed to users. imo we should not expose this

@jasnell jasnell requested a review from a team February 15, 2018 04:05
@jasnell
Copy link
Member

jasnell commented Feb 15, 2018

Flagged this tsc-review and requested a review from tsc members.

For my part, the primordials really should be an internal module rather than a new top level module. Beyond that, I'm +0 on this.

@benjamingr
Copy link
Member

benjamingr commented Feb 15, 2018

You would expect a browser to keep working correctly if user code did that. Why should Node.js have a lower bar wrt robustness?

I'm not sure what you mean, if you do Promise.prototype.then = () => {} you break any API that returns promises (like fetch for example).

Maybe this is just something to discuss for the TSC? I don't think this is going anywhere.

We've literally been discussing this here for less than 2 days - I don't think my opinion is set in stone (and I hope yours isn't either). The pull request isn't ready yet - I'm not sure why we'd want to involve the TSC at this stage (but feel free to do so if you feel differently).

@bnoordhuis
Copy link
Member

I'm +1 on the concept. Node.js fails in inscrutable ways when built-ins are monkey-patched poorly; reducing user confusion outweighs new contributor confusion (and I don't think it's going to be that bad, not with a lint rule.)

Copy link
Member

@mcollina mcollina left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see the rationale of having a public non-monkeypatchable primitives. But this might be a different discussion and not something that is part of a mechanical refactoring. If that is the end goal, it would be better to add the new module/public functions and then refactor the internals. If the end goal is an internal refactoring, then let's do that and talk about exposing this.
(Using the big red cross to talk about it explicitly).

In any case, whatever we add should have tests of its own.

@hashseed @devsnek can you please explain how this would help the snapshots? IMHO that is the goal that makes this worth doing.

// Refs: https://git.io/vAOyO
function uncurryThis(func) {
return (thisArg, ...args) => ReflectApply(func, thisArg, args);
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@hashseed @bmeurer @devsnek can you confirm that this has zero overhead? Some of these calls sits in hot paths and I do not want to add an utility that we need to be careful in using.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see a 17% performance regression with this benchmark on Node 8, but with newest V8 there is none.

const ReflectApply = Reflect.apply;

function uncurryThis(func) {
  return (thisArg, ...args) => ReflectApply(func, thisArg, args);
}
var StringStartsWith = uncurryThis(String.prototype.startsWith);

var s = "Hi Matteo";
function test1() {
  var r = 0;
  console.time("monkey-patchable");
  for (var i = 0; i < 1E7; i++) {
    if (s.startsWith(i)) r++;
  }
  console.timeEnd("monkey-patchable");
  return r;
}

function test2() {
  var r = 0;
  console.time("cached original");
  for (var i = 0; i < 1E7; i++) {
    if (StringStartsWith(s, i)) r++;
  }
  console.timeEnd("cached original");
  return r;
}

test1();
test2();

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

wow! Good to know!
Which version of V8 did you check? what we have on master? What we plan to have for Node 10?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also ran with current Node master. There is no regression there either.


ObjectIsPrototypeOf: uncurryThis(Object.prototype.isPrototypeOf),
ObjectHasOwnProperty: uncurryThis(Object.prototype.hasOwnProperty),
ObjectKeys: Object.keys,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is more or less why I'm not enthusiastic about this work.

@hashseed
Copy link
Member

hashseed commented Feb 15, 2018

I'm not sure what you mean, if you do Promise.prototype.then = () => {} you break any API that returns promises (like fetch for example).

While I'm not familiar with fetch, it definitely sounds like a bug. I filed one in Chrome.

We've literally been discussing this here for less than 2 days - I don't think my opinion is set in stone (and I hope yours isn't either). The pull request isn't ready yet - I'm not sure why we'd want to involve the TSC at this stage (but feel free to do so if you feel differently).

I think it makes sense to know whether this should advance before spending more time in making this PR ready?

can you please explain how this would help the snapshots? IMHO that is the goal that makes this worth doing.

This is off-topic, as you already called out before. But just to explain... if during bootstrapping we run code to iterate all builtin objects to collect the original builtins, that would slow down startup. This issue disappears if Node.js supports startup snapshots. This PR does not help snapshot. Snapshot helps this PR.

@benjamingr
Copy link
Member

@hashseed

While I'm not familiar with fetch, it definitely sounds like a bug. I filed one in Chrome.

Overriding then in Promise should cause this behavior, I'm pretty sure the spec even requires it.

fetch is not isolated here - there are a ton of cases in browsers where this is the case (if reporting them all to crbug is useful in your opinion let me know). There is tremendous value in being able to override certain things in primitives (like XMLHttpRequest for instance). Although wouldn't expect the browser to "behave well" if someone overrides getOwnPropertyNames.

In addition, overriding builtins is actually very important for some polyfills - for example the WeakMap polyfill overrode getOwnPropertyNames and Object.keys for a good reason in order to polyfill a language feature that was missing.

I'm not sure we should guard against this - and I think there might be legitimate use cases for extending or overriding primitives in JavaScript.

I think it makes sense to know whether this should advance before spending more time in making this PR ready?

Right, but involving the TSC explicitly in an issue is typically done when the consensus seeking process has exhausted itself in core issues. (Pinging the TSC in is pretty taxing for the TSC who have to learn an issue). Again - you were fully within your right to escalate the issue to the TSC, I was just confused by this.

It seems like there are still pros/cons being discussed and I haven't 100% made up my mind yet.

I think that we should move this discussion to an issue more likely - as this PR doesn't cover all the usages of built-ins anyway - as Node uses built ins in an unguarded way much more often than this PR fixes. For example in the first 450 lines of fs.js (which this PR doesn't touch):

So if we decide to guard against this - I think it should be a more complete solution.

@hashseed
Copy link
Member

hashseed commented Feb 15, 2018

Overriding then in Promise should cause this behavior, I'm pretty sure the spec even requires it.

That's fine then. Like I mentioned before, JavaScript spec also has explicitly defined points for monkey-patching. RegExp.prototype.exec can be overwritten, which, by design, affect other RegExp builtins.

But Node.js builtins are not spec'ed. Whether a particular part of the internal implementation is monkey-patchable should be a conscious design choice, not by accident or due to implementation detail, and not the default.

Although wouldn't expect the browser to "behave well" if someone overrides getOwnPropertyNames.

All APIs that V8 offer in C++ use the original builtin. Including v8::Object::GetOwnPropertyNames. Let's say, just hypothetically, Node.js core wants to move the implementation of some functions from JS to C++, it then must not use v8::Object::GetOwnPropertyNames, but instead load and call global.Object.getOwnPropertyNames. Otherwise it would observably change behavior.

If some feature in Chrome is indeed affected by overriding getOwnPropertyNames, it would have been a conscious choice or a bug.

Node uses built ins in an unguarded way much more often than this PR fixes.

That's true. I was expecting this to be the first of many PRs.

Copy link
Member

@mcollina mcollina left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Making my -1 explicit. I don't think this needs to go to the TSC right now. Let's bake it for a bit.
The TSC will likely "send it back to github" at this stage.

@benjamingr
Copy link
Member

All APIs that V8 offer in C++ use the original builtin. Including v8::Object::GetOwnPropertyNames. Let's say, just hypothetically, Node.js core wants to move the implementation of some functions from JS to C++, it then must not use v8::Object::GetOwnPropertyNames, but instead load and call global.Object.getOwnPropertyNames. Otherwise it would observably change behavior.

You're right, at the moment Node makes no guarantees about the behavior when you override a built in - if we move code from/to C++ we are effectively breaking possible implicit assumptions that users might make with those APIs.

On the other hand unlike our own host objects - the community has never abused this fact (yet) and Node has never made any guarantees about it (of course, if a popular library decided to change Object.keys tomorrow - that changes - but that's also heavily frowned upon).

That's true. I was expecting this to be the first of many PRs.

This sounds like an interesting project of defining explicitly how the JavaScript runtime should interact with Node.js APIs themselves.

I definitely think we should explore (and specify) the behavior of what happens in Node when built ins are changed. If you'd like to open a discussion in the tracker about this I'm willing to gladly collaborate and work with you to bring a concrete proposal to the TSC about what the behavior should be.

@hashseed
Copy link
Member

Sounds like a plan! Let's open an issue on this.

@BridgeAR BridgeAR added the blocked PRs that are blocked by other issues or PRs. label Mar 6, 2018
@fhinkel
Copy link
Member

fhinkel commented Mar 17, 2018

Closing this as we need to figure out a solution in #18795 first. Feel free to reopen if needed.

@fhinkel fhinkel closed this Mar 17, 2018
@devsnek devsnek deleted the refactor/primordials branch April 1, 2018 00:52
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
blocked PRs that are blocked by other issues or PRs. lib / src Issues and PRs related to general changes in the lib or src directory. semver-major PRs that contain breaking changes and should be released in the next major version.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

10 participants