Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

canonicalizePath behaves strangely with paths that do not exist #23

Closed
ekmett opened this issue Feb 22, 2015 · 12 comments
Closed

canonicalizePath behaves strangely with paths that do not exist #23

ekmett opened this issue Feb 22, 2015 · 12 comments
Labels
type: a-bug The described behavior is not working as intended.
Milestone

Comments

@ekmett
Copy link
Member

ekmett commented Feb 22, 2015

https://ghc.haskell.org/trac/ghc/ticket/4215

@ekmett
Copy link
Member Author

ekmett commented Feb 22, 2015

@Rufflewind
Copy link
Member

Seeing as canonicalizePath is primarily for dereferencing symbolic links, I'm not sure why there is a need for it to work on non-existent paths at all. I think it should simply result in an error if the path doesn't exist, just like how realpath works on Linux. Plus, there are many reasons it can fail, not simply because it doesn't exist, but also due to permission problems, network issues, symbolic-link loops, etc. There's really no guarantee.

The alternative is to have it fallback to pure path normalization whenever it fails. I'm not a fan of this, because it means errors are silently swallowed.

However, I understand that:

  • Someone might have a use case that I couldn't think of ­– in particular, I'd be interested to hear how others use this function.
  • For backward-compatibility it might be better to stick with the lenient behavior (of course, it can still break existing code …).

So perhaps we could add another function (e.g. realPath) that has the strict behavior while canonicalizePath becomes the lenient option.

Hence, the choice is between:

  • Make canonicalizePath strict and always raise errors if the path doesn't exist.
  • Make canonicalizePath lenient and add a stricter version realPath that raises errors if the path doesn't exist or is inaccessible.

Comments?

@snoyberg
Copy link
Contributor

I very regularly want to have absolute paths to simplify other parts of the code, such as when a subprocess will be run from a different directory. I will generally use canonicalizePath in that case, even in the absence of a symlink. It would be convenient for that use case if the function worked on non-existent paths.

I think I prefer the idea of keeping the current function with the current behavior, and adding a new function with the new behavior. That would prevent accidentally writing code that compiles with both an old and new version of directory, but has buggy runtime behavior on an old version.

@Rufflewind
Copy link
Member

I very regularly want to have absolute paths to simplify other parts of the code, such as when a subprocess will be run from a different directory.

Ah, that's a good point. But do you care about whether it dereferences symbolic links?

@snoyberg
Copy link
Contributor

It's actually irrelevant to me in that use case; I just want an absolute path. I could generally just append the relative path to the current working directory, but canonicalizePath feels more correct, so I reach for it first.

@Rufflewind
Copy link
Member

That's kind of my argument: I think canonicalizePath should be reserved strictly for dereferencing symbolic links, not to absolutize a path. The latter operation can be made to work all the time and completely predictably.

Perhaps we can add another convenience function to directory:

makeAbsolute :: FilePath -> IO FilePath
makeAbsolute path = fmap (</> path) getCurrentDirectory

and update the documentation in System.FilePath.makeRelative to reflect this.

@argiopetech
Copy link
Contributor

One small quibble. realpath (which is seemingly implemented inconsistently across a variety of platforms per T4215) is perfectly fine with some paths which don't exist. For example:

$ ls /tmp/bar
ls: cannot access /tmp/bar: No such file or directory
$ realpath /tmp/bar
/tmp/bar
$ realpath /tmp/bar/baz
realpath: ‘/tmp/bar/baz’: No such file or directory

@Rufflewind
Copy link
Member

@argiopetech realpath (shell utility) and realpath (POSIX function), despite the names, can do very different things. The former doesn't even exist on some systems.

@argiopetech
Copy link
Contributor

I think it should simply result in an error if the path doesn't exist, just like how realpath works on Linux

I inferred the command line tool from this statement. Apologies for the misunderstanding.

I'm in favor of not silently swallowing errors. I do like the idea of a makeAbsolute function which would replace canonicalizePath for absolutizing paths, leaving canonicalizePath to potentially fail due to permission/existence/etc. errors (thereby maintaining backward compatibility).

Potentially this would allow us to close T5014 as WillNotFix with a reference to the new function and leave us to tackle T4215.

Fortunately, T4215 may have been fixed for POSIX systems when T4113 was fixed in 96d3478. It now throws errors as in realpath() on Linux and FreeBSD. It will need to be checked on an OS X machine to be certain (I don't have one). Windows effectively returns an escaped version of whatever you pass it, regardless of existence, so we'll need to decide whether to leave that as is to avoid breaking backward compatibility or to add some checks to maintain cross-platform compatibility.

@Rufflewind
Copy link
Member

The latter operation can be made to work all the time and completely predictably.

I retract that somewhat. Turns out it's possible for getCurrentDirectory to fail if the working directory is inaccessible. This is because getCurrentDirectory always returns the physical path, which requires dereferencing symlinks. (But this should be a rare and unusual occurrence.)

@Rufflewind Rufflewind added type: a-bug The described behavior is not working as intended. type: a-feature-request This is a request for a new feature. type: b-discussion This is a discussion with unclear objectives. labels Mar 3, 2015
@Rufflewind Rufflewind added this to the 1.2.2.1 milestone Mar 3, 2015
@Rufflewind
Copy link
Member

Edit: actually, disregard this post. Neil pointed out that this doesn't really work for symlinks.


Besides makeAbsolute, I think another really useful function would be something like canonicalizePath but without symbolic link dereferencing. Call it, say, simplifyPath:

simplifyPath "./foo" == "foo"
simplifyPath "/foo/../bar" == "/bar"

However, such a function would be entirely pure, so it's not really within the scope of directory. It should belong in filepath instead.

@Rufflewind
Copy link
Member

  • Leave canonicalizePath as is and deprecate it. Rationale: backward compatibility; inconsistent behavior (aside from the aforementioned problems, it also fails to resolve symbolic links on Windows). [Edit: changed to return reasonable results even for nonexistent paths.]
  • Add realPath, which correctly resolves symbolic links and always requires the target to exist. [Edit: skipped as this seems redundant; just use canonicalizePath while verifying existence]
  • Add makeAbsolute, which prepends the current working directory to a relative path (does nothing for absolute paths).
  • Update the docs in makeRelative to refer to makeAbsolute instead of canonicalizePath.

@Rufflewind Rufflewind modified the milestones: 1.2.2.2, 1.2.2.1 Apr 11, 2015
@Rufflewind Rufflewind modified the milestones: 1.2.3.0, 1.2.2.2 Apr 24, 2015
@Rufflewind Rufflewind removed type: b-discussion This is a discussion with unclear objectives. type: a-feature-request This is a request for a new feature. labels May 30, 2015
Rufflewind added a commit to Rufflewind/directory that referenced this issue May 31, 2015
Apply realpath to the largest prefix of the given path that is still
accessible (that can be stat successfully) so that canonicalizePath will
return reasonable results even for non-existent paths.  It should very
rarely throw exceptions now.

Fixes haskell#23.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
type: a-bug The described behavior is not working as intended.
Projects
None yet
Development

No branches or pull requests

4 participants