module: initialize hook returns load, resolve #50044

ggoodman · 2023-10-04T19:41:34Z

This commit allows the initialize() hook to optionally return an object having the resolve() and load() hooks as properties. This allows state passed into initialize() to be shared with the resolve() and load() hooks either via closure or class instance.

In addition to developer ergonomics, supporting this model will make it easier to write tests against a loader module. The existing design
forces state to be shared at the module level which puts the burden of invalidating the ESM module cache on anyone hoping to write isolated
tests against a loader module.
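
The closure form mentioned in the description could look roughly like the following. This is a minimal sketch of the proposed behavior, not a shipped API: it assumes `initialize()` may return an object with `resolve` and `load`, that `data` has a `port` property, and that the function would be an `export` in a real loader module. `resolveCount` is a hypothetical piece of per-instance state added for illustration.

```javascript
// Sketch only: relies on the behavior proposed in this PR.
// In a real loader module this function would be exported.
function initialize(data) {
  // State lives in this closure instead of in module-level variables.
  const { port } = data;
  let resolveCount = 0;

  return {
    resolve(specifier, context, nextResolve) {
      resolveCount += 1;
      port.postMessage(['resolve', specifier, resolveCount]);
      // Defer to the next hook in the chain.
      return nextResolve(specifier, context);
    },
    load(url, context, nextLoad) {
      port.postMessage(['load', url]);
      // Defer to the next hook in the chain.
      return nextLoad(url, context);
    },
  };
}
```

Each call to `initialize()` yields hooks with their own `port` and `resolveCount`, which is the isolation property the description is after.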

TODO:

Obtain consent / consensus to explore this API.
Write tests.

Fixes: #50042

nodejs-github-bot · 2023-10-04T19:41:38Z

Review requested:

@nodejs/loaders

lib/internal/modules/esm/hooks.js

GeoffreyBooth · 2023-10-04T20:32:49Z

It would help in understanding your PR if you could update the docs too. That would make it very clear what your intended API is, with examples. (Obviously it’ll also need tests and so on, but having at least docs would help at this stage.)

This might be related to #49159; the deregister API from that PR would allow you to register and deregister hooks over and over for testing.

ggoodman · 2023-10-04T22:52:21Z

Hi @GeoffreyBooth I think the deregister function described in that issue is complementary.

> allow hook methods to receive a fourth state argument – so instead of returning the result of initialize and discard it, we now keep the result attached to the loader instance. this value is then fed back to hook invocations. after thinking about the comments in nodejs/loaders#147 this actually seems necessary if there can be an arbitrary number of "copies" of any given loader; how can loaders differentiate their internal state from one another? there isn't a good way without something like this.

I think my pull request addresses this use-case in a more elegant way. Instead of adding new arguments to loader hook methods, the design in this PR means that data passed into initialize is now trivially available by the returned hooks.

> It would help in understanding your PR if you could update the docs too. That would make it very clear what your intended API is, with examples.

I'll attach the example from the issue I originally created that I think illustrates the consumer-facing benefits of the design.

register.js:

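import { register } from 'node:module';

const mc = new MessageChannel();

mc.port2.onmessage = (msg) => {
  console.debug(msg);
};

// `data` is handed to initialize(); a MessagePort must also be in `transferList`.
register('./loader.js', {
  parentURL: import.meta.url,
  data: { port: mc.port1 },
  transferList: [mc.port1],
});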

loader.js:

Here's an example of how we can implement a loader as a class that holds instance-level state in a private field. This avoids needing to have mutable module-level state for the message port.

// Here is the initialize function that accepts the data passed into `register()`.
export async function initialize(data) {
  // We don't need to store `data` in module-level state anymore.
  return new MyLoader(data);
}

// The actual loader can now be a class instance.
class MyLoader {
  #port;
  
  constructor(data) {
    this.#port = data.port;
  }
  
  // The class instance implements the resolve contract
  resolve(specifier, context, nextResolve) {
    this.#port.postMessage(['resolve', specifier, context]);
    // No-op, defer to built-in
    return nextResolve(specifier, context);
  }
  
  // The class instance implements the load contract
  load(url, context, nextLoad) {
    this.#port.postMessage(['load', url, context]);
    // No-op, defer to built-in
    return nextLoad(url, context);
  } 
}
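
One detail a class-based loader raises: if the runtime pulls `resolve` and `load` off the returned instance and calls them detached, `this.#port` would throw. The sketch below shows one way a hook runner could normalize the result of `initialize()`. It is purely illustrative; `pickHooks` is a hypothetical name, this is not the code in lib/internal/modules/esm/hooks.js, and falling back to the module's own exports per hook is an assumption.

```javascript
// Hypothetical helper, not Node.js internals.
// `moduleExports` is the loader module's namespace; `initResult` is whatever
// initialize() returned (possibly undefined).
function pickHooks(moduleExports, initResult) {
  const hooks = {};
  for (const name of ['resolve', 'load']) {
    if (typeof initResult?.[name] === 'function') {
      // Bind so class instances keep access to `this` and private fields.
      hooks[name] = initResult[name].bind(initResult);
    } else if (typeof moduleExports[name] === 'function') {
      // Assumed fallback: use the module-level export when none is returned.
      hooks[name] = moduleExports[name];
    }
  }
  return hooks;
}
```

With binding in place, an instance like `MyLoader` can be passed around as plain functions without losing its private state.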

test.js:

Here's an example of how it is easier to write a unit test against the proposed API. The key is that ./loader.js isn't stateful; it doesn't have module-level hidden state that needs to be reset between tests. The loader is the resolved value of the initialize() function meaning that it can be instantiated more than once with different options.

import test from 'node:test';

import { initialize } from './loader.js';

test('My custom loader', async (t) => {
  await t.test('resolve()', async (t) => {
    await t.test('does useful stuff', async (t) => {
      const mc = new MessageChannel();
      const loader = await initialize({ port: mc.port1 });

      // Example of using this _instance_ of the loader to test it.
      await runFirstTestOnLoaderInstance(t, loader);
    });

    await t.test('does other useful stuff', async (t) => {
      const mc = new MessageChannel();
      const loader = await initialize({ port: mc.port1 });

      // Example of using another _instance_ of the loader to test it.
      await runSecondTestOnLoaderInstance(t, loader);
    });
  });
});
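
The helpers called in the tests above are not defined in this thread. As a rough idea of what such a helper might do, here is a sketch that exercises `resolve()` on a single loader instance with a stubbed `nextResolve`. The names `createRecordingPort` and `exerciseResolve` are hypothetical, and the recording port is a plain object standing in for a `MessagePort`, on the assumption that the loader only calls `postMessage` on it.

```javascript
// A stand-in for a MessagePort that records what the loader posts.
function createRecordingPort() {
  const messages = [];
  return { messages, postMessage: (message) => { messages.push(message); } };
}

// Hypothetical helper: calls resolve() on one loader instance with a stubbed
// nextResolve and reports what happened, without touching the module cache.
async function exerciseResolve(loader, specifier) {
  const forwarded = [];
  const nextResolve = async (nextSpecifier, context) => {
    forwarded.push(nextSpecifier);
    return { url: `file:///stub/${nextSpecifier}`, shortCircuit: true };
  };
  const result = await loader.resolve(specifier, { conditions: [] }, nextResolve);
  return { result, forwarded };
}
```

A test could then build a loader with `initialize({ port: createRecordingPort() })`, call `exerciseResolve(loader, 'pkg')`, and assert on both the returned URL and the recorded messages.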

ggoodman · 2023-10-18T15:12:12Z

I know that @GeoffreyBooth suggested that I flesh out docs for this to better illustrate the changes in API but I would prefer to hold off on that until I get some signals of interest from the @nodejs/modules / @nodejs/loaders folks.

This change is proposed as a small incremental improvement. If the team prefers, I can propose a more dramatic change to the loaders API with ideas taken from esbuild's plugin API that seem relevant to the problem being solved.

GeoffreyBooth · 2023-10-18T23:42:52Z

> If the team prefers, I can propose a more dramatic change to the loaders API with ideas taken from esbuild's plugin API that seem relevant to the problem being solved.

What change did you have in mind? We recently marked the overall API as “release candidate,” meaning we don’t anticipate any further changes before marking it stable, so it’s not really a good time for breaking design changes unless there’s a strong motivator.

The proposal about initialize returning the other hooks seems more or less fine to me; an earlier version of initialize had its return value passed along into the return value of register, so I’m hesitant to repurpose the return for something else in case we need to go back to that earlier design. Though I guess it could always become a new property on the return object, like initialize could return { resolve, load, registerReturnValue } or whatever.
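
To make the `{ resolve, load, registerReturnValue }` idea concrete, a returned object might be shaped like the sketch below. Nothing here is implemented in Node.js: `registerReturnValue` is only the placeholder name floated above, and `data.name` is a hypothetical field used for illustration.

```javascript
// Sketch of the return shape floated above; not an implemented API.
// In a real loader module this function would be exported.
function initialize(data) {
  const seen = [];
  return {
    resolve(specifier, context, nextResolve) {
      seen.push(specifier);
      return nextResolve(specifier, context);
    },
    load(url, context, nextLoad) {
      return nextLoad(url, context);
    },
    // Hypothetical: a value that register() could hand back to its caller,
    // kept separate from the hooks so both uses of the return can coexist.
    registerReturnValue: { loaderName: data.name, seen },
  };
}
```

Keeping the hooks and the pass-through value under distinct property names would leave room to restore the earlier `register()` return design later.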

ggoodman · 2023-10-19T18:11:59Z

I'll share my general feedback on the loaders API in #50042 so that we can leave this PR to discuss the incremental 'improvement' proposed here.


JakobJingleheimer · 2023-10-24T17:51:40Z

I'm also leaning in favour of this for the communication aspect—message channel seems a heavy-handed solution for a simple problem (and a simple problem ought to have a simple solution), not to mention the latency. I'm also generally in favour of OOP.

What is/are the specific scenario(s) that need addressing? Are multiple workers the only one where this surfaces? There is overhead to a class that's otherwise not paid with functional exports, and from what I can think of, module-level state generally fits the bill. I'm looking to ensure we address a problem people will actually have—if only 1 person will be in this scenario, maybe the overhead etc this would impose on everyone isn't worth the non-ideal message channel design.

If it's merely to facilitate testing, that is easily and simply addressed with a generation query param on the import specifier.
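
For reference, the query-parameter technique works because Node.js loads a module again when the import specifier has a different query string. A minimal sketch, with `freshSpecifier` and `importFresh` as hypothetical helper names:

```javascript
let generation = 0;

// Returns a URL string for `path` (resolved against `base`) carrying a unique
// `generation` query parameter, so each call names a distinct module instance.
function freshSpecifier(path, base) {
  const url = new URL(path, base);
  generation += 1;
  url.searchParams.set('generation', String(generation));
  return url.href;
}

// Imports a freshly evaluated copy of the module at `path`.
async function importFresh(path, base) {
  return import(freshSpecifier(path, base));
}
```

In a test file this could be called as `await importFresh('./loader.js', import.meta.url)`. As far as I know, earlier generations stay in the module cache for the life of the process, so this trades memory for isolation.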

I see @giltayar's #50042 (comment), and he authors multiple major libraries affected by this (so that gives me a fair amount of confidence there is a wide(r) issue this will address). But I think I'm missing a piece of context that seems very important to understanding the problem.

nodejs-github-bot added esm Issues and PRs related to the ECMAScript Modules implementation. needs-ci PRs that need a full CI run. labels Oct 4, 2023

ggoodman force-pushed the instanced-loader-hooks branch from 3c73118 to 565a5e3 Compare October 4, 2023 19:44


ggoodman force-pushed the instanced-loader-hooks branch from 565a5e3 to 74c5ec9 Compare October 4, 2023 20:21

ggoodman mentioned this pull request Oct 4, 2023

ESM Loader Hooks (20.8) should support a more testable, less stateful design #50042

Closed

ggoodman force-pushed the instanced-loader-hooks branch from 74c5ec9 to 28d6893 Compare October 4, 2023 20:36

GeoffreyBooth added the loaders Issues and PRs related to ES module loaders label Oct 4, 2023

GeoffreyBooth mentioned this pull request Oct 9, 2023

Node.js Loaders Team Meeting 2023-10-10 nodejs/loaders#165

Closed

GeoffreyBooth mentioned this pull request Oct 21, 2023

Node.js Loaders Team Meeting 2023-10-24 nodejs/loaders#167

Closed

ggoodman force-pushed the instanced-loader-hooks branch from 8069d62 to c035843 Compare October 24, 2023 00:30

ggoodman force-pushed the instanced-loader-hooks branch from c035843 to d27cce8 Compare October 24, 2023 00:38
