Skip to content
This repository has been archived by the owner on Sep 2, 2023. It is now read-only.

File extension/directory index resolution in ESM #268

Closed
GeoffreyBooth opened this issue Feb 19, 2019 · 35 comments
Closed

File extension/directory index resolution in ESM #268

GeoffreyBooth opened this issue Feb 19, 2019 · 35 comments

Comments

@GeoffreyBooth
Copy link
Member

GeoffreyBooth commented Feb 19, 2019

The topic of file extension and directory index resolution (whether automatic or opt-in) is on the agenda for this week, but I can’t seem to find an issue where it has been discussed on its own. It’s not one of the features on the list in our README, though it was mentioned in our road map as being a feature removed as part of Phase 1. It was mentioned by @devsnek and @ljharb in #253 as features that they would want before the new implementation is released. I thought this topic should get its own issue so that the broader group’s attention is drawn to it, since the entire group is significantly larger than the people who attend the meetings.

I’m going to try to keep this post neutral, and I’ll update this top post as needed when people point out things I haven’t thought of.

What file extension/directory index resolution is

In CommonJS, it’s idiomatic to write code like this:

require('./startup');

Where ./startup is resolved as either ./startup.js or ./startup/index.js, or potentially other file extensions.

Why users like it

Aside from saving users from typing three or nine characters, there’s a big usability benefit in saving refactoring effort when a file grows to the point where it needs to be refactored from one into several. In the example above, your project could start with a startup.js file and when it grows unwieldy you could refactor into startup/index.js which in turn requires several other files, and you wouldn’t need to update any of the files that already require('./startup').

What we need to decide

Since we’re planning on building loaders, and since it seems fair to assume that loaders will be capable of rewriting specifiers, file extension/directory index resolution will be possible one way or another in ESM. So the question isn’t whether it should be possible, but rather how easy it should be: enabled by default, enabled via opt-in package.json configuration, or enabled via a package-level loader. Or looked at another way, do we want to encourage the practice, slightly discourage it, or strongly discourage it? The arguments for not enabling it automatically are usually the same, so I’m grouping those together.

Arguments for supporting file extension/directory index resolution automatically

  • This is already supported automatically by require in CommonJS. That’s what users are familiar with.

  • It’s idiomatic to leave off extensions in CommonJS; actually typing them is unusual.

  • It’s also idiomatic to leave off extensions in ESM JavaScript written to be transpiled by Babel or esm. There are countless projects out there written in ESM syntax that assume automatic file extension/directory index resolution, and many of those projects might very well work out-of-the-box if we continued to support automatic file extension/directory index resolution.

Arguments against supporting file extension/directory index resolution automatically

  • Browsers don’t automatically resolve file extensions or directory index files (unless a server is explicitly configured to do so, but this is practically unheard of), and we’re aiming to be as equivalent with browsers as possible. As the import maps proposal puts it:

    Note how unlike some Node.js usages, we include the ending .js here. File extensions are required in browsers; unlike in Node, we do not have the luxury of trying multiple file extensions until we find a good match. Fortunately, including file extensions also works in Node.js; that is, if everyone uses file extensions for submodules, their code will work in both environments. . . .

    Unlike in Node.js, in the browser we don’t have the luxury of a reasonably-fast file system that we can crawl looking for modules. Thus, we cannot implement the Node module resolution algorithm directly; it would require performing multiple server round-trips for every import statement, wasting bandwidth and time as we continue to get 404s. We need to ensure that every import statement causes only one HTTP request; this necessitates some measure of precomputation.

    By making file extension/directory index resolution opt-in, we encourage people to write packages that are compatible with browsers by default. This is especially important because leaving off the extension is currently idiomatic; we’re encouraging the establishment of a new idiom, where code is idiomatically capable of running in either environment (at least, with regards to this; obviously there are lots of other ways people can create packages that are incapable of running in browsers).

  • Node may someday support import of network path URLs, like import Three from 'https://unpkg.com/[email protected]/src/Three.js'. Presumably extension searching will not apply to such URLs. It would be inconsistent for automatic file extension/directory index resolution to work for file:// URLs but not https:// URLs. Such a system would mean that a package relying on automatic file extension/directory index resolution would work when installed locally but fail when loaded from a network URL.

  • There’s a performance cost to automatic file extension/directory index resolution, that continues to grow as more supported file extensions are added. It’s one thing to search for startup.js, and then when it’s not found to search for startup/index.js; it’s another to search for startup.js, startup.mjs, startup.wasm (etc.) and then startup/index.js, startup/index.mjs, startup/index.wasm etc. This performance hit is a big reason that require.extensions was deprecated. From Node’s docs:

    Note that the number of file system operations that the module system has to perform in order to resolve a require(…) statement to a filename scales linearly with the number of registered extensions. In other words, adding extensions slows down the module loader and should be discouraged.

    By requiring this to be opt-in via configuration or a loader, that configuration or loader can specify how this feature should behave. For example, maybe .mjs is the only supported extension automatically resolved, and the performance cost is reduced.

  • The UX becomes more complicated now that we have multiple supported extensions. When a folder contains file.js, file.mjs and file.wasm and you import ‘./file’, which file gets loaded? When support for new extensions is added to core, do they always get added last in precedence?

  • We can release our new implementation unflagged without automatic file extension/directory index resolution and choose to add it later, but we can’t do the reverse.

Arguments for supporting it via package-level configuration

  • As part of Node proper, performance would be as optimized as it could be.

  • This would standardize users’ use of file extension/directory index resolution in a way that supporting it via third-party loaders would not.

Arguments for supporting it via a package-level loader but not via configuration

  • This would send a strong signal from Node that leaving off extensions is discouraged. Getting the feature would require either the performance hit of adding a loader, or the complexity of transpilation/building.

If people offer new arguments (or correct these) I’ll update this post.

Background: https://youtu.be/M3BM9TB-8yA?t=835

@GeoffreyBooth GeoffreyBooth added modules-agenda To be discussed in a meeting discussion features labels Feb 19, 2019
@devsnek
Copy link
Member

devsnek commented Feb 19, 2019

Thank you for opening this issue and listing everything out 😄

I'd like to pick the option that obstructs the fewest use cases. Additionally, I don't think that node should push things like browser conformance on users, given that it isn't a browser and that a huge section of the ecosystem feels this is a useful and valid feature, that that it's been part of the ecosystem for ten years.

It is absolutely valid to want to continue to use resolution, and if resolution exists by default it doesn't stop those who don't want to use it from not using it. If you put full filenames in, you can be browser compatible and avoid what you judge to be performance issues, all without needlessly blocking other people's preferences.

My point in all the above is that I don't think either side of this disagreement is hurt by leaving this in. I'd be very open to adding options to disable the feature, but I really think everyone gets what exactly what they want by leaving it on by default. I'm hopeful we can compromise :)

@robpalme
Copy link
Contributor

I think the primary argument in favour of extension-probing is continuity with the old module system. It preserves those conventions.

The main consequence of this decision will be to influence the conventions used by the modules written and published to npm after we release/unflag. Defaults are very powerful, so we need to ask ourselves not "what did we do before?" but "what do we want to encourage in the future?".

On this topic, my opinion is that we ought to promote convergence with Browsers (and other emergent JS platforms). I accept nothing compels us to strive for this, and this is a subjective viewpoint, and that others are happy for Node and Browser to diverge.

We've covered this extension-probing a few times over the years without killer-arguments and without concensus. Maybe we're due a binding vote on this? I think a decision (any decision) will be healthy for the working group and the ecosystem.

@GeoffreyBooth
Copy link
Member Author

Maybe we're due a binding vote on this? I think a decision (any decision) will be healthy for the working group and the ecosystem.

Shipping ESM without automatic resolution isn't necessarily making a decision to not support automatic resolution in ESM ever; support can always be added at any time via PR. Indeed that might be preferable, as then a much broader constituency can debate the merits of encouraging browser compatibility versus providing the convenience of automatic resolution.

In contrast, shipping with automatic resolution would be making a decision, as later removing it would mean a breaking change that would affect a lot of users. ESM would be stuck with it forever like CommonJS is.

@ljharb
Copy link
Member

ljharb commented Feb 19, 2019

Yes, it is - shipping it later is a breaking change. For defaults, at least, we largely have only one shot at this - everything we want needs to be included at the outset, or it likely won’t ever be able to be added.

I’ll also point out; without extension resolution, coffeescript could never have taken off, as migrating a file to or from coffeescript would have been a breaking change. CJS isn’t “stuck” with it, it has massively benefited from it, and continues to do so.

@zenparsing
Copy link

Thanks @GeoffreyBooth for this review. The only thing I would add is that I found the two questions presented here to be highly coupled: if we are going to allow directory imports, then we will probably want to do so using some kind of index plus file extension search, as with require. (The other option would be to use/abuse package.json for directory imports.) If we decided to support file extension lookup for directory imports, then it we might as well support it for non-directory lookups as well.

@MylesBorins
Copy link
Contributor

MylesBorins commented Feb 19, 2019 via email

@ljharb
Copy link
Member

ljharb commented Feb 19, 2019

With extension lookup matching cjs, an extensionless file next to an extensioned file of the same name matches the extensioned version first; without it, it matches only the extensionless file. Adding extension lookup breaks this case, unless I’m misunderstanding something.

We could of course deviate from how cjs does it and be nonbreaking, but that wipes out many of the reasons to do it (altho not all).

@jkrems
Copy link
Contributor

jkrems commented Feb 19, 2019

@ljharb You mean require('./foo') where a literal 'foo' exists? I'm pretty sure that would take the literal 'foo' first:

> echo 'console.log("foo");' > foo

> echo 'console.log("foo.js");' > foo.js

> node -e 'require("./foo")'
foo

@GeoffreyBooth
Copy link
Member Author

GeoffreyBooth commented Feb 19, 2019

With extension lookup matching cjs, an extensionless file next to an extensioned file of the same name matches the extensioned version first; without it, it matches only the extensionless file. Adding extension lookup breaks this case, unless I’m misunderstanding something.

As far as I’m aware, and someone please correct me if I’m wrong: In ESM, import only works for recognized extensions. You can’t import an extensionless file.

This was done to avoid breaking changes as new extensions are added in the future. For example in CommonJS I think require('./file.wasm') today imports file.wasm as CommonJS JavaScript, the same as if it were file.foo, which is a problem whenever we add support for WASM someday. In ESM, unrecognized extensions throw so that we can add extensions like .wasm in the future without those additions being breaking changes.

So import './file' in the current implementation should throw, even if an extensionless file named file exists. Therefore we can add automatic resolution later and it’s not a breaking change, any more than adding new syntax causes previously throwing things to no longer error.

Edit: I’m referring to ESM mode in our current new implementation, not the ESM spec itself.

@bmeck

This comment has been minimized.

@GeoffreyBooth

This comment has been minimized.

@bmeck

This comment has been minimized.

@MylesBorins
Copy link
Contributor

@ljharb to just confirm what @jkrems said... the behavior you specified is not accurate. If a file with no extension is available that is the first one that is executed. I still am of the belief that adding extension searching at a later point would be Semver-Minor

screen shot 2019-02-20 at 1 10 57 pm

@jkrems
Copy link
Contributor

jkrems commented Feb 20, 2019

I'll also highlight that this is one of the reasons why I'd love to see the extension-magic go away. Not even the members of this WG can say with certainty how it works (at least not without looking things up).

@GeoffreyBooth
Copy link
Member Author

Just to build off of what @MylesBorins wrote, in our current ecmascript-modules implementation, neither import of extensionless files nor import of unknown extensions are supported:

✦ ls -1
extensionless
test1.mjs
test2.mjs
unknown.ext

✦ cat test1.mjs
import './extensionless'

✦ cat test2.mjs
import './unknown.ext'~/Sites/ecmascript-modules/node --experimental-modules ./test1.mjs
(node:79626) ExperimentalWarning: The ESM module loader is experimental.
internal/assert.js:5
    require('assert')(value, message);
                     ^

AssertionError [ERR_ASSERTION]: Code: ERR_UNKNOWN_FILE_EXTENSION; The provided arguments length (2) does not match the required ones (1).
    at assert (internal/assert.js:5:22)
    at getMessage (internal/errors.js:223:3)
    at new NodeError (internal/errors.js:153:13)
    at Loader.resolve [as _resolve] (internal/modules/esm/default_resolve.js:70:13)
    at Loader.resolve (internal/modules/esm/loader.js:65:33)
    at Loader.getModuleJob (internal/modules/esm/loader.js:136:40)
    at ModuleWrap.promises.module.link (internal/modules/esm/module_job.js:38:40)
    at link (internal/modules/esm/module_job.js:37:36)

✦ ~/Sites/ecmascript-modules/node --experimental-modules ./test2.mjs
(node:79639) ExperimentalWarning: The ESM module loader is experimental.
internal/assert.js:5
    require('assert')(value, message);
                     ^

AssertionError [ERR_ASSERTION]: Code: ERR_UNKNOWN_FILE_EXTENSION; The provided arguments length (2) does not match the required ones (1).
    at assert (internal/assert.js:5:22)
    at getMessage (internal/errors.js:223:3)
    at new NodeError (internal/errors.js:153:13)
    at Loader.resolve [as _resolve] (internal/modules/esm/default_resolve.js:70:13)
    at Loader.resolve (internal/modules/esm/loader.js:65:33)
    at Loader.getModuleJob (internal/modules/esm/loader.js:136:40)
    at ModuleWrap.promises.module.link (internal/modules/esm/module_job.js:38:40)
    at link (internal/modules/esm/module_job.js:37:36)

This could probably use a better error message, but it seems to me that Myles is right, that adding extension searching later would not be a breaking change.

@MylesBorins
Copy link
Contributor

@GeoffreyBooth we should potentially move forward in our implementation of support files without file extensions at all (longer term)... but at least in the short term making extensions required is a good enough first step to leaving the design space open.

This could potentially break some workflows that currently exist for scripts that rely on a shebang... specifically that many of these files often do not have an extension as they want to look like a binary

@GeoffreyBooth
Copy link
Member Author

Also the language in the Node docs on require.extensions is strikingly forceful:

Deprecated. In the past, this list has been used to load non-JavaScript modules into Node.js by compiling them on-demand. However, in practice, there are much better ways to do this, such as loading modules via some other Node.js program, or compiling them to JavaScript ahead of time.

Since the module system is locked, this feature will probably never go away. However, it may have subtle bugs and complexities that are best left untouched.

Note that the number of file system operations that the module system has to perform in order to resolve a require(…) statement to a filename scales linearly with the number of registered extensions.

In other words, adding extensions slows down the module loader and should be discouraged.

“Since the module system is locked, this feature will probably never go away” sounds to me like this author, at least, wishes it could go away; presumably because of the performance cost that it puts on Node. I know that performance is a very high priority of the Node project overall, so it wouldn’t surprise me if the Node core team would choose better performance over developer convenience, irregardless of any browser equivalence goals. If given a chance to weigh in, they might welcome the chance to make extension searching “go away” as part of the transition to ESM.

And if not, it can be added to ESM later. But I think we should punt this decision to the larger group. There’s nothing overly technical about this that makes modules specialists better equipped to make this decision than core Node contributors.

@GeoffreyBooth
Copy link
Member Author

This could potentially break some workflows that currently exist for scripts that rely on a shebang… specifically that many of these files often do not have an extension as they want to look like a binary

Per the proposal, at least, extensionless files would still be supported as initial entry points, just not as the specifier of an import statement. So you could run them from a command line, but not import them.

@ljharb
Copy link
Member

ljharb commented Feb 20, 2019

@GeoffreyBooth note that those docs aren't suggesting extension lookup should go away, they're suggesting allowing users to compile on demand should go away - a feature which esm relies upon, and which loaders will provide.

@GeoffreyBooth
Copy link
Member Author

note that those docs aren’t suggesting extension lookup should go away, they’re suggesting allowing users to compile on demand should go away

That’s not how I read it. In “this feature will probably never go away,” “this feature” is require.extensions. The author is wishing that require.extensions would go away, because “adding extensions slows down the module loader and should be discouraged”. The module loader is slowed down because “the number of file system operations that the module system has to perform in order to resolve a require(…) statement to a filename scales linearly with the number of registered extensions.” The author is specifically citing the extension searching itself as the performance problem.

@ljharb
Copy link
Member

ljharb commented Feb 20, 2019

imo usability and refactorability far outweigh any performance considerations.

@bmeck
Copy link
Member

bmeck commented Feb 20, 2019

So you could run them from a command line, but not import them.

This was not my understanding of nodejs/ecmascript-modules#32 . Is it intended that a file could have a known format but not act like it has that format?

@MylesBorins
Copy link
Contributor

@GeoffreyBooth require.extensions and adding extensions is specifically talking about that particular API and is not related to the resolution algorithm at all afaict

@devsnek
Copy link
Member

devsnek commented Feb 20, 2019

just to be really clear about require.extensions, the reason the docs are worded like that is because we wanted to bring in a different hook system which was more granular than the extension of the file.

@GeoffreyBooth
Copy link
Member Author

GeoffreyBooth commented Feb 20, 2019

@devsnek and others who want this enabled by default, do you mind opening a new issue where you propose how it should work in our new implementation as things stand now? What would take precedence between .mjs and .js and .cjs files when all three are side-by-side (e.g. file.mjs, file.js, file.cjs)? Regardless of whether this is enabled or not by default, it could be configurable; what would such a configuration block look like?

There might very well be consensus for enabling extension searching via configuration, since it clearly will be possible via loaders (I think). So we might as well sketch out what that would look like, especially since configurability arguably improves the feature even if it’s on by default, since configuration improves performance by limiting the number of extensions that Node needs to try.

@devsnek
Copy link
Member

devsnek commented Feb 21, 2019

@GeoffreyBooth i made an implementation locally where it goes mjs -> cjs -> the rest. i haven't made a pr yet because it doesn't include the mode flag, and i'd rather not put work in for something that's going to be replaced, so i'm waiting to see how i can layer this on the changes brad is making for loaders.

As for the configuration, webpack has option to configure these extensions that also happens to be ordered, so i think we have something to work from there. Overall, i've never actually seen someone have an issue with the ordering, so i'm not that worried, but it would definitely make a good addition. people not using esm would benefit from that option too, so i don't think we need to tie it to our implementation.

@ljharb
Copy link
Member

ljharb commented Feb 21, 2019

We didn't get time to discuss .cjs in today's meeting; I still feel like it kind of got slipped in - I did not realize that consensus on that PR would include that extension, and had I, I wouldn't have agreed to it.

@GeoffreyBooth
Copy link
Member Author

GeoffreyBooth commented Feb 21, 2019

@devsnek I think the purpose of the configuration wouldn’t only be to specify the search order, but also to specify which extensions are searched; and which directory indexes are searched (if any). For example if a user has no .node files in their project, configuration can tell Node not to try foo.node for './foo'. It might also be nice to be able to specify that a directory index might not necessarily be index.*, but rather main.* or whatever else people might want.

@ljharb You’re welcome to open an issue to propose removing it from the new implementation. I’m sorry you didn’t realize it was part of the PR. The explanation for why we felt it was necessary is here.

@ljharb
Copy link
Member

ljharb commented Feb 21, 2019

Proposing removal would be premature without at least openly discussing it in a single meeting, so I can try to understand why it's needed.

That explanation, fwiw, doesn't seem to justify it when a more general "parse extension X as if it's extension Y" exists. .js already means CJS.

In a CommonJS package, one can deep import an .mjs file to load it as ESM despite the package’s CommonJS scope

You can't import anything in CJS; only in ESM, and I don't know what a "commonJS package" is - "commonJS" and "ESM" are adjectives that describe a file, not a package.

@GeoffreyBooth
Copy link
Member Author

@ljharb Please open a new issue.

@ljharb

This comment has been minimized.

@weswigham
Copy link
Contributor

That explanation, fwiw, doesn't seem to justify it when a more general "parse extension X as if it's extension Y" exists. .js already means CJS.

IIRC the primary reason for a .cjs extension would be to have an extension-based disambiguator in the cjs direction (since it's supposed to guess if .js means cjs or esm based on the surrounding package context). This way even in a package with type: "esm" (so all .js files are modules) you can still include cjs-type files if need be.

@devsnek
Copy link
Member

devsnek commented Feb 21, 2019

@GeoffreyBooth i'm agreed there should be lots of config here, i just don't really want to get into the design. i think some magic combo of loaders and copying from https://webpack.js.org/configuration/resolve/#resolveextensions will be enough. I'm just not sure what that is because we don't have loaders finalized yet.

Even without the above, i think searching stands as a fantastic feature on its own and as there are no good technical arguments against turning it on by default[1] i'm hopeful we can move at a good pace.

1: in the current impl there is a chance of a performance hit if your dependencies use searching that hits a lot of extensions further down the list. This is possible to fix by performing all existence checks in parallel, but this has not been done yet. given that this is possible, i wouldn't consider performance a blocker.

@zenparsing
Copy link

zenparsing commented Feb 21, 2019

One other point: the extension-searching feature is also highly coupled with our strategy for resolving bare-specifier imports. If we use "index" or "main", as in the CJS implementation, then presumably we'd need to search over extensions in a like manner. "Deep package imports" are subject to the same considerations.

On the other hand, if we are committed to not searching over extensions, then that seems to imply that we will need a different mechanism (e.g. the exports map proposal or something else) for resolving bare imports or deep bare imports.

@GeoffreyBooth
Copy link
Member Author

Resolved per nodejs/ecmascript-modules#48

@GeoffreyBooth GeoffreyBooth removed modules-agenda To be discussed in a meeting labels May 9, 2019
johndaniels added a commit to johndaniels/es-interpreter that referenced this issue Jun 6, 2021
It looks like per nodejs/modules#268
that node's resolution for modules requires a file extension
by default. Typescript wants this to be included by adding "js"
to files that are imported.
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

No branches or pull requests

9 participants