Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Set up ripgrep for compilation on non-unix, non-windows platforms #2787

Merged
merged 7 commits into from
Apr 23, 2024
6 changes: 2 additions & 4 deletions crates/cli/src/hostname.rs
Original file line number Diff line number Diff line change
Expand Up @@ -25,10 +25,8 @@ pub fn hostname() -> io::Result<OsString> {
}
#[cfg(not(any(windows, unix)))]
{
io::Error::new(
io::ErrorKind::Other,
"hostname could not be found on unsupported platform",
)
// backup solution for systems that do not have gethostname():
Ok(OsString::from("localhost"))
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not a huge fan of giving wrong answers like this. And it seems like WASI can't really support hyperlinks anyway, due to the ban on absolute paths. So I think this change should probably be reverted.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The problem I was trying to fix here, and that's also a question I have about setting up CI, is that the current version of the code causes a compilation error:

Compiling grep-cli v0.1.10 (/Users/holzschu/src/Xcode_iPad/ripgrep_clone/crates/cli)
error[E0308]: mismatched types
  --> crates/cli/src/hostname.rs:28:9
   |
16 |   pub fn hostname() -> io::Result<OsString> {
   |                        -------------------- expected `Result<OsString, std::io::Error>` because of return type
...
28 | /         io::Error::new(
29 | |             io::ErrorKind::Other,
30 | |             "hostname could not be found on unsupported platform",
31 | |         )
   | |_________^ expected `Result<OsString, Error>`, found `Error`
   |
   = note: expected enum `Result<OsString, std::io::Error>`
            found struct `std::io::Error`
help: try wrapping the expression in `Err`
   |
28 ~         Err(io::Error::new(
29 |             io::ErrorKind::Other,
30 |             "hostname could not be found on unsupported platform",
31 ~         ))
   |

For more information about this error, try `rustc --explain E0308`.

So if the goal of setting up CI is to get the same result after the PR as before, that's not going to happen.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh I see. I think you just want Err(io::Error::new(...)), as suggested in the error message you're showing. This particular code path is probably entirely untested in CI at present, and thus the fact that it is wrong now and has always been wrong was never noticed.

My main objection was to silently using a potentially incorrect value. I'm open to doing that in the future to get hyperlinks working in WASI after we've considered the possible alternatives. But since std::fs::canonicalize isn't going to work on WASI right now anyway, we should just avoid changing the semantic intent of this code.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've done the changes for hostname.rs. But then, for symmetry, I think I should have a branch #[cfg(not(any(windows, unix)))] instead of #[cfg(target_os = "wasi")] in hyperlink.rs. But what should from_path() return in that case?

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

None

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With a DEBUG log message. Mentioned here: #2787 (comment)

}
}

Expand Down
51 changes: 51 additions & 0 deletions crates/printer/src/hyperlink.rs
Original file line number Diff line number Diff line change
Expand Up @@ -853,6 +853,57 @@ impl HyperlinkPath {
}
HyperlinkPath(result)
}
#[cfg(target_os = "wasi")]
pub(crate) fn from_path(original_path: &Path) -> Option<HyperlinkPath> {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The WASI spec seems to ban absolute paths and the stdlib doesn't implement canonicalize. I don't know how black-and-white this is (especially in the long term) but right now this might as well be stubbed out with None.

When running on top of Windows I imagine you'd want to return a Windows path here, even though WASI is more Unix-like internally.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm curious how std::path::absolute will work here. It seems to depend on what std::env::current_dir will return, and that does seem to have an implementation on WASI: https://github.com/rust-lang/rust/blob/eff958c59e8c07ba0515e164b825c9001b242294/library/std/src/sys/pal/wasi/os.rs#L71-L95

But I agree, for now, this should just always return None and emit a DEBUG log that hyperlinks aren't supported on WASI.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

AFAICT it works like this:

  • WASI syscalls take a directory descriptor and a path relative to that directory descriptor.
  • The WASI runtime gives the WASM binary a list of directory descriptors plus their actual original paths (preopens). So the runtime grants selective access. It could easily pretend the filesystem is smaller than it really is, chroot-style, but it doesn't have to: it can grant a /home/user descriptor with the /home/user path.
  • wasi-libc takes absolute paths and resolves them to a directory descriptor + relative path based on the longest matching prefix from the preopens, if any. (This is inspired by libpreopen.)
  • wasi-libc also emulates a working directory in WASM-space.

So it's not as bad as I thought. Thanks to the emulation absolute paths should for the most part just work as long as access to that part of the filesystem is granted, particularly on Unix. (Except for missing syscalls I guess.)

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interesting. So std::env::current_dir() works, and thus std::path::absolute might in turn work here as well.

use std::os::wasi::ffi::OsStrExt;

// We canonicalize the path in order to get an absolute version of it
// without any `.` or `..` or superfluous separators. Unfortunately,
// this does also remove symlinks, and in theory, it would be nice to
// retain them. Perhaps even simpler, we could just join the current
// working directory with the path and be done with it. There was
// some discussion about this on PR#2483, and there generally appears
// to be some uncertainty about the extent to which hyperlinks with
// things like `..` in them actually work. So for now, we do the safest
// thing possible even though I think it can result in worse user
// experience. (Because it means the path you click on and the actual
// path that gets followed are different, even though they ostensibly
// refer to the same file.)
//
// There's also the potential issue that path canonicalization is
// expensive since it can touch the file system. That is probably
// less of an issue since hyperlinks are only created when they're
// supported, i.e., when writing to a tty.
//
// [1]: https://github.com/BurntSushi/ripgrep/pull/2483
let path = match original_path.canonicalize() {
Ok(path) => path,
Err(err) => {
log::debug!(
"hyperlink creation for {:?} failed, error occurred \
during path canonicalization: {}",
original_path,
err,
);
return None;
}
};
let bytes = path.as_os_str().as_bytes();
// This should not be possible since one imagines that canonicalization
// should always return an absolute path. But it doesn't actually
// appear guaranteed by POSIX, so we check whether it's true or not and
// refuse to create a hyperlink from a relative path if it isn't.
if !bytes.starts_with(b"/") {
log::debug!(
"hyperlink creation for {:?} failed, canonicalization \
returned {:?}, which does not start with a slash",
original_path,
path,
);
return None;
}
Some(HyperlinkPath::encode(bytes))
}
}

#[cfg(test)]
Expand Down