
Discuss: Expose Structured Clone on v8 Module #34355

Closed
amiller-gh opened this issue Jul 14, 2020 · 7 comments

Comments

@amiller-gh
Member

amiller-gh commented Jul 14, 2020

Hi all! This is a relatively niche feature idea, but it will likely be useful for a lot of lower-level state management libraries. I'm a little out of my expertise with the C++ implementation here, so please excuse me if I'm missing something that is already possible using existing APIs.

Context

Deep cloning objects in JavaScript is notoriously hard – and even harder if you want it to be performant. Luckily, we now have the Structured Clone Algorithm as a native option.

The native V8 implementation was exposed in the Node.js runtime (as of Node.js 8) as the Serialization API. Discussion happened in #6300.

Problem

Using the V8 serialization API, Node.js apps can run a structured clone like so:

const v8 = require('v8');
const structuredClone = (o) => v8.deserialize(v8.serialize(o));

However, for large objects this is actually still fairly slow! I assume this is because calling v8.serialize/deserialize shuttles data back and forth over the JS/C++ boundary twice. In my use case, I've found that a rather naive recursive clone function can actually out-perform it! Not ideal.

Instead, I've discovered that taking the rather roundabout route of leveraging a MessageChannel can give me the native performance gains I'm expecting:

const { MessageChannel } = require('worker_threads');
function structuredClone (o) {
  const { port1, port2 } = new MessageChannel();
  return new Promise((resolve) => {
    port2.on('message', resolve);
    port2.on('close', port2.close);
    port1.postMessage(o);
    port1.close();
  });
}
const clone = await structuredClone({ foo: 'bar' });

MessageChannel also uses the structured clone algorithm to pass data from one port to the next. However, it runs faster than v8.serialize/deserialize for this use case since it doesn't unnecessarily send data back and forth in order to clone the object.

Note: I have validated the performance differences in my current project, but neglected to take screenshots of the flame graphs! If there is interest in exploring this proposal I'm happy to come up with a contrived perf test comparing v8.serialize/deserialize, a simple recursive clone function, and MessageChannel.
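
For illustration, a contrived test along those lines could look roughly like the sketch below – the sample payload, iteration count, and the JSON round-trip (standing in for a naive recursive clone) are all placeholders, so the absolute numbers would only be meaningful with a representative object:

const v8 = require('v8');
const { MessageChannel } = require('worker_threads');
const { performance } = require('perf_hooks');

// Placeholder payload – swap in something representative of real app state.
const sample = {
  items: Array.from({ length: 10000 }, (_, i) => ({
    id: i,
    name: `item-${i}`,
    nested: { flags: [true, false], meta: { index: i } },
  })),
};

// The MessageChannel-based clone from above.
function channelClone(o) {
  const { port1, port2 } = new MessageChannel();
  return new Promise((resolve) => {
    port2.on('message', resolve);
    port2.on('close', () => port2.close());
    port1.postMessage(o);
    port1.close();
  });
}

async function bench(label, fn, iterations = 100) {
  const start = performance.now();
  for (let i = 0; i < iterations; i += 1) await fn(sample);
  console.log(`${label}: ${(performance.now() - start).toFixed(2)}ms`);
}

(async () => {
  await bench('v8.serialize/deserialize', (o) => v8.deserialize(v8.serialize(o)));
  await bench('MessageChannel clone', channelClone);
  // JSON round-trip as a rough stand-in for a naive recursive clone.
  await bench('JSON round-trip', (o) => JSON.parse(JSON.stringify(o)));
})();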

However, using MessageChannel is also not ideal:

  1. It requires creating a new message channel for each clone, or maintaining a shared message channel with some method of discerning between cloned responses. This adds runtime overhead and code complexity.
  2. It forces us to use an async API for cloning objects. Often fine, but not ideal for some implementations.
  3. It requires a good degree of boilerplate just to access a native algorithm that the project already intends to expose.

Proposed Solution

Put simply, we could choose to expose sync and async APIs for V8's structured clone:

const v8 = require('v8');
const syncClone = v8.structuredCloneSync({ foo: 'bar' });
const asyncClone = await v8.structuredClone({ foo: 'bar' });

This should out-perform both v8.serialize/deserialize and MessageChannel since it avoids the overhead of excess data shuttling and MessagePort creation, while also enabling a fully synchronous API.

Alternatives

  • Continue to use MessageChannel and publish as a user-land module, forgoing a synchronous, performant API
  • ???
@amiller-gh changed the title from "Discuss: Expose Structure Clone on v8 Module" to "Discuss: Expose Structured Clone on v8 Module" on Jul 14, 2020
@targos
Member

targos commented Jul 14, 2020

Note that you can make the MessagePort version sync with the receiveMessageOnPort function:

const { MessageChannel, receiveMessageOnPort } = require('worker_threads');
const { port1, port2 } = new MessageChannel();
function structuredClone (o) {
  port1.postMessage(o);
  return receiveMessageOnPort(port2).message;
}
const clone = structuredClone({ foo: 'bar' });

@amiller-gh
Member Author

Found a similar discussion going on over in the whatwg repo: whatwg/html#793

There seems to be general consensus on the utility of a structuredClone function.

Great to know about receiveMessageOnPort, @targos. Given that, do you think there is value in exposing a standalone method for this, or should using MessageChannel be the recommended way of pulling this off?

@targos
Member

targos commented Jul 14, 2020

If we can make it faster as a standalone function, that seems valuable.

@amiller-gh
Member Author

amiller-gh commented Jul 14, 2020

Here are flame charts for MessageChannel used both synchronously and asynchronously for structured clone in my project. The save call has to call deepClone three times (for assorted reasons). This specific call is working with a particularly large object.

Interestingly – but probably as expected – they take approximately the same total time, but the synchronous version shuttles the copied object to Node.js while blocking the main thread, whereas the async version does it in the background, allowing Node.js to do other work.

Async

[flame chart screenshot]

The async version creates and destroys a new MessageChannel for each call. It does not appear to have a significant effect at runtime.

Sync

[flame chart screenshot]

I was hoping that the synchronous call to receiveMessageOnPort would end up being faster, but it appears to just unnecessarily block the process!

@addaleax, since you implemented the original serialization bindings you may have some context on whether there is extra speed to be squeezed out of a standalone function – though to me this is looking more like just the cost of shuttling data around ¯\_(ツ)_/¯

If you have any top-of-mind ideas for where a standalone function might improve perf, I'm happy to brush off my C++ skills and take a stab at testing them out when I have time; otherwise, I'm happy to keep this in user-land for now.

@addaleax
Member

@amiller-gh The serialization/deserialization steps create an intermediate Buffer that all strings and other data structures need to be written into and read back from. If all you want is structured clone, and you want it to be fast, then I think you can get a significant perf boost by implementing that in JS only – I think that's basically what you're looking for?

@amiller-gh
Member Author

amiller-gh commented Jul 14, 2020

I started with that approach, but this ended up being faster. I experimented using this function to clone objects:

function deepClone<T>(o: T): T {
  if (typeof o !== 'object') { return o; }
  if (!o) { return o; }

  // https://jsperf.com/deep-copy-vs-json-stringify-json-parse/25
  if (Array.isArray(o)) {
    const newO = [] as unknown as T;
    for (let i = 0; i < o.length; i += 1) {
      const val = (!o[i] || typeof o[i] !== 'object') ? o[i] : deepClone(o[i]);
      newO[i] = val === undefined ? null : val;
    }
    return newO;
  }

  const newO = {} as unknown as T;
  for (const i of Object.keys(o)) {
    const val = (!o[i] || typeof o[i] !== 'object') ? o[i] : deepClone(o[i]);
    if (val === undefined) { continue; }
    newO[i] = val;
  }
  return newO;
}

It's fast – certainly faster than JSON.parse(JSON.stringify(o)), and faster than v8.deserialize(v8.serialize(o)) for the reasons described above – but because of the repeated type checking and object creation (and, I'd assume, non-standard function shape triggering de-opts), it turns out that going through the MessageChannel still shaves a few milliseconds off each clone for large objects. The async version also has the added benefit of taking a lot of the work off the main thread.

This version of recursive deep clone is about as simple as you can make it – it doesn't handle complex objects or cycles like structured clone does – so it was largely just a proof of concept as I was exploring options. Adding those extra features would only slow it down further. In addition, the objects I'm cloning eventually get sent over IPC anyway, so it's convenient to use the same algorithm.

Ideally I'd statically build cloning helper functions from my TypeScript types like I'm doing elsewhere, but unfortunately that's not possible here!
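
For reference, handling cycles in a hand-rolled clone mostly comes down to tracking already-visited objects. A rough sketch of that idea (deepCloneWithCycles is just an illustrative name, and it still ignores Dates, Maps, typed arrays, and the other types structured clone handles):

// Sketch only: a cycle-aware recursive clone. A Map records each source
// object's copy so repeated references (including cycles) resolve to the
// same cloned object instead of recursing forever.
function deepCloneWithCycles(o, seen = new Map()) {
  if (o === null || typeof o !== 'object') { return o; }
  if (seen.has(o)) { return seen.get(o); }

  const copy = Array.isArray(o) ? [] : {};
  seen.set(o, copy);
  for (const key of Object.keys(o)) {
    copy[key] = deepCloneWithCycles(o[key], seen);
  }
  return copy;
}

// A self-referencing object no longer causes infinite recursion:
const a = { name: 'a' };
a.self = a;
const cloned = deepCloneWithCycles(a);
console.log(cloned.self === cloned); // true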

@amiller-gh
Member Author

Going to go ahead and self-close this one to help keep the backlog tidy :)

I'm fairly convinced this is just the overhead cost of deep cloning, and the sync option outlined above means a standalone function is just syntactic sugar.

If whatwg/html#793 progresses, Node.js may want to consider the extra v8 method to keep a degree of parity, but for now this can live in user-land.

Thank you both for the input! Hopefully this issue will help some intrepid developer Googling for Node.js deep clone optimization one day.
