fix(commonjs): node_modules lookup depth limit to avoid infinite loops #985
Conversation
```diff
@@ -172,7 +172,8 @@ export function commonjsResolveImpl (path, originalModuleDir, testCache) {
   if (path[0] === '/') {
     originalModuleDir = '/';
   }
+  let depth = 0;
   while (true) {
```
```diff
- let depth = 0;
+ var depth = 0;
```

Shouldn't this be ES5?
Although actually I already see another `let`, and it's already a template string 🤔
It's inside the generated code, not the plugin's code, should be backwards compatible :-)
Yeah, so Rollup itself tries to be compatible with ES3 by default (see rollup/rollup#4215), so I think ideally this should be `var`, and we shouldn't be using template strings (edit: the template string is not in the output).
Can you help me understand this better? How does resolving node_modules hierarchically lead to a performance cliff? From an abstract perspective, "check to root" lookups on trees are always fast as they have convergent bounds. Are you sure there isn't something else going on here that this isn't just patching over?
Sure!
In a virtual fs, you don't know where the root is.
You'd think, right?
This is the part that seems very odd to me. Are you telling me you have ids in your resolution system that are resolved to the final id of

Is this a custom resolver or the default one? If a custom one, I would suggest seeing if you can use better ids in the architecture. If a default one, that sounds like a serious bug to me.

If we really wanted to make this accepted behaviour, then if the path were
The case I presented you with was an easy one, as it's a common request (look at the issues and StackOverflow...) to require from a different "main folder", or even a different drive.

The piece of code that raised this issue to my awareness was an annoying one: a hack used in some packages to avoid detection of the

I first tried to tackle these two different scenarios exactly, but then realized that this would be code written for very specific scenarios and I would possibly be missing others. The loop there did not have a clear stopping condition, and it needs one.
Can you please just clarify again for me what these virtual paths look like? Do they really start with

Perhaps you can also explain why you could not just implement the stop check in your own algorithm as
The paths are all relative. In the beginning we put full paths in the system (each virtual require call gets two parameters: one is the path to require, the second is the path of the current module). But if you have such a structure:

And if someone works around the commonjs plugin (taking a ref to

We could theoretically find the lowest common root for all paths required in the project, but then we would have two main problems to deal with:
Currently the max-depth thing works and solves the issue.
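For illustration, the "lowest common root" computation mentioned above could be sketched roughly like this (the function name and approach are hypothetical, not from the plugin):

```javascript
// Hypothetical sketch: find the longest shared path prefix of a set of
// absolute, /-separated paths. Purely illustrative, not the plugin's code.
function lowestCommonRoot(paths) {
  if (paths.length === 0) return '';
  let prefix = paths[0].split('/');
  for (const p of paths.slice(1)) {
    const parts = p.split('/');
    let i = 0;
    while (i < prefix.length && i < parts.length && prefix[i] === parts[i]) i++;
    prefix = prefix.slice(0, i);
  }
  return prefix.join('/') || '/';
}
```

As noted above, though, this requires knowing every required path up front, which is exactly what a virtual fs with relative ids cannot guarantee.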
I see the situation, and it's definitely a big problem that internal paths so easily get exposed.

Another angle here is: how are those internal ids being exposed in the first place? RollupJS doesn't expose its absolute ids in builds so long as the source maps root is configured correctly. It should also be possible to hook the source map naming specifically for masking exposure. It's just a fact of life at this point that all JS build environments build URLs or paths that contain internal info. Defining the public/private aspects of how this is exposed is how we shift the context, rather than trying to operate on "opaque paths", I think.

So I'd really still advise against any root backtracking scheme as the id scheme where possible. But even if you must use such a backtracking scheme, as mentioned, a root check is still possible. You shouldn't need to collect the list of roots, but instead just resolve the path to its absolute file path and check if that absolute path is a root path. In Windows and POSIX the definition of a root path is a well-defined concept, without needing to touch the file system or check over files, whether it is

Both conversations are interesting, happy to continue both.
I think you are missing something here :-) The only way for rollup to generate a product without internal path info is if we strip that info, but the paths are needed for resolving at runtime. So either we put the full paths, including the root "C:" or the root "/var/whatever", in the final product js file so we can test for a root, or we just decide when to stop looking for one. We can't resolve those requires at the rollup stage; that's just the opposite of what this specific feature is accomplishing.
Ah, I missed this embedded runtime resolver part. I will abstain from reviewing further then.
No matter, I missed the conversations in this repo 🤣
I'd be happy to dig into a separate conversation elsewhere about why
In general I prefer to make the un-analyzable analyzable, as opposed to implementing mock runtime layers. And where mock runtime layers are implemented, separating them from general approaches like this plugin. I think the dynamic require stuff is a huge maintenance burden for this plugin.
The problem is that most projects can't rollup without it. And the commonjs plugin would actually break it in a way that a separate plugin could not undo. Unfortunately
This project not supporting cycles is something that has been an issue from the start, and it can be solved separately via execution wrapping when CJS executions exist in cycles, to match CJS execution ordering semantics. I was never able to work on this myself unfortunately, but it was always a hope. So the root solution to cycles is very much via wrapping approaches that can ensure the CJS execution invariants; dynamic layers are more ways to work around that.

For highly dynamic requires, extending the ability for these requires to be analyzed is always preferable, because that optimization can then feed back into the tree shaking and bundling layers. In the extreme cases of completely non-analyzable requires, maintaining a dynamic require registry could be done more simply by allowing, e.g., a hook from this plugin itself that allows substituting a require implementation, and then have that require implementation use an analysis layer to point to the right module. Note this type of referencing also relates to ideas in rollup/rollup#4167.
Alternatively, configuring the runtime such that dynamic requires will be hit is a good solution, e.g. via a simple dynamic require remapping table like:

```js
const requireRemaps = {
  'buffer': '#buffer',
  'fs/': '#fs/'
};

// was: require(unknown)
require(remap(unknown));
```

where the
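One possible (hypothetical) shape for the `remap` helper in the table above, matching exact ids and `/`-terminated prefixes; this is a guess at the truncated explanation, not code from the thread:

```javascript
// Hypothetical remap helper for the remapping table; illustrative only.
const requireRemaps = {
  'buffer': '#buffer', // exact id match
  'fs/': '#fs/'        // prefix match for subpaths
};

function remap(id) {
  for (const [key, target] of Object.entries(requireRemaps)) {
    if (id === key) return target;
    if (key.endsWith('/') && id.startsWith(key)) {
      return target + id.slice(key.length);
    }
  }
  return id; // unmapped ids pass through untouched
}
```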
@guybedford can we merge this?
I'm not blocking this landing, I'm just not explicitly approving, so @shellscape you are welcome to land this. I was more just expressing concern about the long-term maintenance requirements of this feature.
Looking at this again, I think I am no longer happy with this solution. The reason is that it introduces a new parameter to manage a case where we cannot make an informed decision about what is imported anyway, because we "lost" the original file path. And to me, this just means unnecessary configuration complexity, especially since hardly anyone will ever touch that option.

For now, I see two different approaches:

- Just detect cases where the injected second require parameter is missing and always return that we did not find a file. Simple but effective. Or:
- Fix it. One approach could be to use `this` for the second parameter instead, by replacing dynamic requires with `require.bind('/path/to/module')`. Then you can pass around that require as much as you like and it will still work.
I can integrate one of the solutions in #1038.
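The `require.bind` idea above can be sketched as follows. This is a hedged illustration: the `commonjsRequire` internals here are invented, and note that in sloppy mode a bound primitive gets boxed, hence the `String(this)` normalization.

```javascript
// Illustrative only: a require shim that reads its module dir from `this`.
function commonjsRequire(id) {
  // A bound primitive may arrive boxed as a String object, so normalize it.
  const moduleDir = this == null ? undefined : String(this);
  return { id, moduleDir }; // a real implementation would resolve here
}

// Injected at the call site instead of a bare dynamic require:
const boundRequire = commonjsRequire.bind('/path/to/module');
```

The bound function can now be passed around freely without losing track of the directory it was created for.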
I think you misunderstand the actual issue. But I can think of another possible solution: calculate the max depth level at compile time and embed it in the runtime. Then we'll know when we have reached the virtual root.
I only have the example you provide in your test, and that can indeed be fixed by replacing a standalone
@danielgindi I added my solution to #1038 and also published it as
@danielgindi gentle ping
I would prefer if we could close this one in favor of #1038, which also fixes the issue; otherwise it will just be a difficult merge conflict.
Closing in favor of #1038 per request. Thank you to everyone who worked on this.
Rollup Plugin Name: commonjs

This PR contains:
Are tests included?
Breaking Changes?

Description
For some freaking reason, I stumbled upon a few packages (like js-yaml etc.) that include a "hack" like this:

In this case, rollup currently does not keep it as a native `require`, but goes through `commonjsRequire`. And in dynamic mode, this means that it may try forever to resolve the upper-level `node_modules`, as it has no knowledge of the possible depth or the physical file system.

For this I have introduced a barrier, `nodeModulesLookupDepth`, that limits the number of lookup loops. The default for this is `15`, which I found to be reasonable and allowing. It will only need to hit that barrier in extreme circumstances anyway.

This replaces the previous barrier, which only detected when user input allowed it (`endsWith('../')`) but could theoretically fail too early.
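To illustrate the barrier described above: a depth-limited upward `node_modules` walk might look roughly like this. Function and variable names here are invented, and the real resolver consults the generated module registry rather than a plain `Set`.

```javascript
// Hedged sketch of a depth-limited node_modules walk; not the plugin's code.
function findInNodeModules(registry, name, fromDir, maxDepth = 15) {
  let dir = fromDir;
  let depth = 0;
  while (true) {
    if (depth++ >= maxDepth) return null; // barrier: assume we passed the virtual root
    const candidate = (dir === '/' ? '' : dir) + '/node_modules/' + name;
    if (registry.has(candidate)) return candidate;
    const idx = dir.lastIndexOf('/');
    if (idx < 0) return null;        // relative path exhausted
    const parent = dir.slice(0, idx) || '/';
    if (parent === dir) return null; // reached the real root
    dir = parent;
  }
}

const registry = new Set(['/app/node_modules/js-yaml']);
```

Without the `maxDepth` guard, a virtual file system with no recognizable root could keep walking "up" forever.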