Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Globbing a symlink cycle never returns #86

Closed
vojtajina opened this issue Jan 31, 2014 · 12 comments
Closed

Globbing a symlink cycle never returns #86

vojtajina opened this issue Jan 31, 2014 · 12 comments

Comments

@vojtajina
Copy link

keeps globbing...

@despairblue
Copy link

👍

@isaacs
Copy link
Owner

isaacs commented Oct 14, 2014

Works for me.

$ ls -laF foo
total 8
drwxr-xr-x+  3 isaacs  staff  102 Oct 14 14:22 ./
drwxrwxrwx+ 15 isaacs  staff  510 Oct 14 14:21 ../
lrwxr-xr-x   1 isaacs  staff    6 Oct 14 14:22 bar@ -> ../bar

$ ls -laF bar
total 8
drwxr-xr-x+  3 isaacs  staff  102 Oct 14 14:22 ./
drwxrwxrwx+ 15 isaacs  staff  510 Oct 14 14:21 ../
lrwxr-xr-x   1 isaacs  staff    6 Oct 14 14:22 foo@ -> ../foo

$ ls -laF bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo
lrwxr-xr-x  1 isaacs  staff  6 Oct 14 14:22 bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo@ -> ../foo

$ echo $NODE_PATH


$ export NODE_PATH=$PWD

$ node -e "console.log(require('glob').sync('foo/**'))"
[ 'foo',
  'foo/bar',
  'foo/bar/foo',
  'foo/bar/foo/bar',
  'foo/bar/foo/bar/foo',
  'foo/bar/foo/bar/foo/bar',
  'foo/bar/foo/bar/foo/bar/foo',
  'foo/bar/foo/bar/foo/bar/foo/bar',
  'foo/bar/foo/bar/foo/bar/foo/bar/foo',
  'foo/bar/foo/bar/foo/bar/foo/bar/foo/bar',
  'foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo',
  'foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar',
  'foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo',
  'foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar',
  'foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo',
  'foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar',
  'foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo',
  'foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar',
  'foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo',
  'foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar',
  'foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo',
  'foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar',
  'foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo',
  'foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar',
  'foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo',
  'foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar',
  'foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo',
  'foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar',
  'foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo',
  'foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar',
  'foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo',
  'foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar',
  'foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo',
  'foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar' ]

$ node -e "require('glob')('foo/**', console.log)"
null [ 'foo',
  'foo/bar',
  'foo/bar/foo',
  'foo/bar/foo/bar',
  'foo/bar/foo/bar/foo',
  'foo/bar/foo/bar/foo/bar',
  'foo/bar/foo/bar/foo/bar/foo',
  'foo/bar/foo/bar/foo/bar/foo/bar',
  'foo/bar/foo/bar/foo/bar/foo/bar/foo',
  'foo/bar/foo/bar/foo/bar/foo/bar/foo/bar',
  'foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo',
  'foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar',
  'foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo',
  'foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar',
  'foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo',
  'foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar',
  'foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo',
  'foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar',
  'foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo',
  'foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar',
  'foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo',
  'foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar',
  'foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo',
  'foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar',
  'foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo',
  'foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar',
  'foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo',
  'foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar',
  'foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo',
  'foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar',
  'foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo',
  'foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar',
  'foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo',
  'foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar' ]

Do you have a test case?

@isaacs isaacs closed this as completed Oct 14, 2014
@despairblue
Copy link

Any puppet module with tests, for example https://github.com/kietsy/kietsy-tinc

In that repo after initializing the test infrastructure

  • bundle install --path vendor/bundle
  • bundle exec rake spec_prep
    issuing node -e "console.log(require('glob').sync('spec/**/*spec.rb'))" > results will result in a 5.6mb big results file. Doing this for a puppet module with like 5 dependencies will crash node after it consumed almost 2 gig of ram:
node -e "console.log(require('glob').sync('spec/**/*_spec.rb'))" > result
FATAL ERROR: JS Allocation failed - process out of memory
[1]    14906 abort (core dumped)  node -e "console.log(require('glob').sync('spec/**/*_spec.rb'))" > result

So the problem is not an infinite loop, but rather that node-glob should stop at foo/bar, just like the shell globbing works (which only gives you unique files). Another solution would be to be able to tell node-glob not to follow symlinks at all.

@isaacs
Copy link
Owner

isaacs commented Oct 15, 2014

Hm. So, shell globbing doesn't "only give you unique files", nor does it "not follow symlinks".

Example:

$ tree
.
├── bar/
│   ├── baz -> ../baz/
│   ├── foo -> ../foo/
│   ├── barfile
│   └── lnsrcfile
├── baz/
│   ├── bar -> ../bar/
│   ├── foo -> ../foo/
│   └── bazfile
└── foo/
    ├── bar -> ../bar/
    ├── baz -> ../baz/
    ├── dir/
    │   ├── a/
    │   │   └── b/
    │   │       └── c/
    │   │           └── d/
    │   │               ├── foolink -> ../../../../../../bar/foo/bar/foo/
    │   │               ├── recurselink -> ../../../../../../
    │   │               ├── weirdlink -> ../../../../../../bar/foo/bar/
    │   │               └── deepfile
    │   └── dirfile
    ├── foofile
    └── lnfile -> ../bar/lnsrcfile

17 directories, 7 files

$ shopt -s extglob; shopt -s globstar

$ for i in +(foo|bar|baz)/**/*file; do echo $i; done
bar/barfile
bar/baz/bazfile
bar/foo/foofile
bar/foo/lnfile
bar/lnsrcfile
baz/bar/barfile
baz/bar/lnsrcfile
baz/bazfile
baz/foo/foofile
baz/foo/lnfile
foo/bar/barfile
foo/bar/lnsrcfile
foo/baz/bazfile
foo/dir/a/b/c/d/deepfile
foo/dir/a/b/c/d/foolink/foofile
foo/dir/a/b/c/d/foolink/lnfile
foo/dir/a/b/c/d/weirdlink/barfile
foo/dir/a/b/c/d/weirdlink/lnsrcfile
foo/dir/dirfile
foo/foofile
foo/lnfile

Note that symbolic links ARE being followed by **, and the "same" files DO appear multiple times (eg bar/foo/lnfile, baz/foo/lnfile).

It seems, however, that it does not explore symbolic links that reference a path portion of the current path to the symbolic link itself. So, even though foo/bar/foo/foofile would match, it detects that foo/bar/foo refers to the foo/ part of itself, so it doesn't explore that bit.

Furthermore, this ** behavior appears to have changed between Bash 4.1 and Bash 4.3! Probably because it is unwieldy to deal with a gigantic mess of results when there are recursive symbolic links.

The crash that you're seeing is happening because there are a large number of recursive symbolic links, rather than a particularly deep set of them, so you're hitting the memory limit before hitting the ECYCLE error from the fs. Updating to Bash 4.3-style behavior would fix the problem.

@isaacs isaacs reopened this Oct 15, 2014
@despairblue
Copy link

It's actually crashing because of the large set of redundant results. There is just one symlink.

$ ll spec/fixtures/modules
total 4.0K
drwxr-xr-x 7 despairblue users 4.0K Oct 15 00:44 stdlib
lrwxrwxrwx 1 despairblue users   37 Oct 15 00:44 tinc -> /home/despairblue/vcs/git/kietsy-tinc

I used this script

#/usr/bin/bash

echo $*

and the tab completion of either bash or zsh as a reference and compared it to node-globs behaviour. This gist illustrates two cases, yours above and mine for a puppet module with one dependency. As one can see, the current shell globbing outputs each file only once, if there is another path to the same file it's omitted in the results. Don't know if it's always been like this, I can only test it with zsh 5.0.7 and bash 4.3.030 (oh and I just noticed that the script does not work with bash since tab globbing does not seem to be supported for bash, it just replaces the spec/**/*_spec.rb with the first match, instead of all matches, but tree works as expected and won't follow the symlink)

@isaacs
Copy link
Owner

isaacs commented Oct 15, 2014

Your test is interesting, but non-authoritative. The reference implementation that node-glob targets is bash 4's globstar and extglob behavior. The example I posted above does show that Bash 4.1 and Bash 4.3 have different behavior, and it seems that Bash 4.3's globstar behavior is superior, so that's what we should target.

It doesn't matter what zsh or tab-completion do, ultimately, though this is one example of a use-case where the more "conservative" results are much more useful.

@despairblue
Copy link

You're right, I didn't grasp everything in your comment at first. Adopting the bash4.3 globstar behavior would fix this (and would actually matches the one from zsh). Just to be sure, the command for i in +(foo|bar|baz)/**/*file; do echo $i; done in your example illustrates the bash4.3 behavior, right? And if so, what was the bash4.1 behavior? I simply don't see two different outputs for the same command, but your comment suggests that you demonstrated both behaviors.

@isaacs
Copy link
Owner

isaacs commented Oct 15, 2014

No, I don't have bash 4.1 installed any longer (upgraded due to shellshock some time ago).

But if I re-generate the test fixtures with bash 4.3, I see different results. So, yeah, we're doing the wrong thing now (but only because "the right thing" changed!)

@isaacs
Copy link
Owner

isaacs commented Oct 15, 2014

Here's the diff in the test fixture results from bash 4.1 (used to generate them originally) and bash 4.3 (which I have now) https://gist.github.com/isaacs/f748618f744f61756f55

@isaacs
Copy link
Owner

isaacs commented Oct 15, 2014

So, it seems like the behavior can be summed up as: "While expanding matches for a ** section, do not explore any directories already part of that particular ** expansion".

This, of course, is after consecutive ** segments are collapsed.

@despairblue
Copy link

👍 Yep, if adopting bash's new behavior is OK for you that would be great. It would suffice my use case and probably the other two's referenced above as well.

@isaacs
Copy link
Owner

isaacs commented Nov 17, 2014

Landed in 4.0, made MUCH faster in 4.1.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants