Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Port from libc to rustix #401

Closed
notgull opened this issue Feb 26, 2024 · 17 comments · Fixed by #520
Closed

Port from libc to rustix #401

notgull opened this issue Feb 26, 2024 · 17 comments · Fixed by #520

Comments

@notgull
Copy link

notgull commented Feb 26, 2024

rustix is a crate which provides I/O-safe wrappers around raw system calls on Linux and libc on other Unixes. By using it we can use raw syscalls instead of libc calls, which has a number of performance, safety and code size advantages.

I am wiling to write a PR for this; however there are a couple of items of discussions we have to go over first.

  • This will bump this crate's MSRV to v1.63. Is this acceptable?
  • It will also add a couple of (mostly inconsequential) dependencies. Is this acceptable?
@newpavlov
Copy link
Member

newpavlov commented Feb 26, 2024

I considered it (personally, I like rustix and use it instead of libc in my projects), but there are several issues:

  • MSRV 1.63 is a bit too high for us. rustix has a laxer MSRV policy, so we probably would have to wait until MSRV-aware resolver will be adopted widely enough (or until next breaking getrandom release which may have sufficiently high MSRV).
  • Some users are quite sensitive to additional dependencies, regardless whether they are "mostly inconsequential" or not.
  • rustix::rand::getrandom accepts &mut [u8], while we provide API which works over &mut [MaybeUninit<u8>].

So we probably will not be migrating to rustix at the very least until hypothetical getrandom v0.3.

@notgull
Copy link
Author

notgull commented Feb 26, 2024

For your last point, getrandom_uninit is exposed in rustix now.

For the rest, I agree. We’ll have to wait.

@newpavlov
Copy link
Member

getrandom_uninit is exposed in rustix now.

Ah, good to know!

@josephlr
Copy link
Member

josephlr commented May 2, 2024

@notgull is it the intent of rustix to support all target_family = unix Rust targets? We try to support every rust target, and I would be a pain to deal with both rustix and libc crates.

We could maybe switch just the use_file.rs code to use rustix, which would require Linux, Android, Haiku, Redox, NTO and AIX support.

@newpavlov
Copy link
Member

newpavlov commented May 2, 2024

I think it makes sense to use rustix only on Linux and Android for the raw syscall feature. For other targets it's essentially a glorified libc wrapper. We probably can limit its use only to the getrandom path and make it a (disabled by default) crate feature.

@notgull
Copy link
Author

notgull commented May 3, 2024

@notgull is it the intent of rustix to support all target_family = unix Rust targets? We try to support every rust target, and I would be a pain to deal with both rustix and libc crates.

rustix supports all of the Unix targets, as well as some of the other targets, like Windows (through winsock).

We could maybe switch just the use_file.rs code to use rustix, which would require Linux, Android, Haiku, Redox, NTO and AIX support.

Yes, I think this would work.

I think it makes sense to use rustix only on Linux and Android for the raw syscall feature. For other targets it's essentially a glorified libc wrapper. We probably can limit its use only to the getrandom path and make it a (disabled by default) crate feature.

This sounds like a pain to manage, especially when rustix is present enough in the ecosystem that you're probably going to import it anyways.

@newpavlov
Copy link
Member

This sounds like a pain to manage, especially when rustix is present enough in the ecosystem that you're probably going to import it anyways.

We are unlikely to use rustix by default and, as noted above, I don't see reasons to use it on non-Linux platforms in our case. So rustix-based code is likely to be behind a feature flag. Enabling the hypothetical linux-raw feature will be no more painful than enabling optional rustix dependency.

@briansmith
Copy link
Contributor

briansmith commented Jun 7, 2024

I looked more into rustix:

  • Rustix enables its file I/O functionality by default and it can't be disabled. It has a default implementation of getauxval() that on Linux kernels < 6.4, will do file I/O on "/proc/self/auxv. (Linux 6.4 introduced a syscall to get the auxv vector). Otherwise, it has an option to use libc::getauxval()instead to avoid that file I/O, and it has another option to provide the auxv vector in aninit()call instead. It doesn't matter for us as we don't usegetauxval()` for anything, but it means that we need to be careful about
  • I did not see any sanitizer support, especially msan, in rustix or linux-raw-sys, so I think the unpoison in Android/Linux: Support msan; unpoison output of getrandom syscall. #463 would still be necessary. Perhaps we should file a bug (against rustix? and/or against linux-raw-sys?) for them to add it. LLVM's interception support for its libc::getrandom may be useful as a reference: llvm/llvm-project@ac9ee01
  • I noticed https://github.com/llvm/llvm-project/blob/bd6e324b67cdadde2593327753e99782146d9bf8/libc/startup/linux/x86_64/tls.cpp#L68-L74 which points out that in general using errno in the implementation of getrandom limits the contexts that getrandom can be used in, though this is mitigated somewhat because Rust doesn't have "life before main" and I guess TLS is always set up before any Rust code can run. Regardless, Rustix doesn't use thread-local errno (not sure if it uses thread-local storage for anything), so that's a benefit of Rustix.
  • Rustix MSRV policy is "This crate currently works on the version of Rust on Debian stable, which is currently Rust 1.63. This policy may change in the future, in minor version releases, so users using a fixed version of Rust should pin to a specific version of this crate." The first part seems pretty reasonable. I am not sure why 1.63 would be a problematic MSRV for getrandom but I do see the concern raised above.

@notgull
Copy link
Author

notgull commented Jun 8, 2024

It doesn't matter for us as we don't usegetauxval() for anything, but it means that we need to be careful about

Any particular reason for this?

  • Perhaps we should file a bug (against rustix? and/or against linux-raw-sys?) for them to add it.

There is already a common function for initializing a written-to buffer. So hopefully we can just add the "unpoison" call there.

  • The first part seems pretty reasonable. I am not sure why 1.63 would be a problematic MSRV for getrandom but I do see the concern raised above.

IIRC The main reason is because Debian Stable is considered the "low water mark" for what we reasonably need to support. I'm not sure what the lowest version this crate needs to support it.

It should also be mentioned that the MSRV is bumped to v1.64 if the "std" feature on Rustix is disabled.

notgull added a commit to forkgull/getrandom that referenced this issue Jun 8, 2024
This commit adds a backend based on the rustix crate. The main advantage
of rustix, in addition to greater safety, is that it uses raw syscalls
instead of going through libc.

I've tried to modify the existing libc code as little as possible, in
order to make this change as auditable as possible. I haven't touched
the pthreads code yet, as that's a little tricky.

This exists for discussion purposes.

cc rust-random#401

Signed-off-by: John Nunley <[email protected]>
@josephlr
Copy link
Member

@notgull thanks for demoing how an rustix implementation might work (#470). From what I understand, this would still use libc on non aarch64/arm/x86_64/x86/riscv targets. This is understandable, and is related to #424 (comment) where I noted we couldn't (yet) give up our use of libc::syscall.

Looking at that implementation, I think a lot of the complexity comes from the rustix dependency being optional. If we were going to use rustix, it wouldn't be an optional dependancy; it would entirely replace our use of libc on Linux. As for the use_file implementation, I think a better approach is to use the standard library to provide such an implementation (see #453).

However, I don't think its very likely we would use the rustix backend (at least for now). Given that we would still need to depend on libc for almost all targets (due to libc::syscall), it seems easier for us and our users to depend on libc directly.

@notgull
Copy link
Author

notgull commented Jun 10, 2024

However, I don't think its very likely we would use the rustix backend (at least for now). Given that we would still need to depend on libc for almost all targets

I've modified #470 to completely remove libc from the Linux backend. It also makes rustix non-optional on Linux and Linux only.

@briansmith
Copy link
Contributor

See also
https://github.com/bytecodealliance/rustix/blob/3346526711e885611c770ad09764b8ff8d2924ce/build.rs#L22-L27, which gives some ideas about what targets are not supported by Rustix but which are (perhaps accidentally) already supported by getrandom.

@briansmith
Copy link
Contributor

As for the use_file implementation, I think a better approach is to use the standard library to provide such an implementation (see #453 for the use_file implementation, I think a better approach is to use the standard library to provide such an implementation (see #453).

We really should stop forcing users to depend on a pthreads mutex implementation. @notgull did add a vendored implementation from libstd of a futex-based Mutex in their PR, rewritten to use Rustix, which is awsome. OTOH, IDK that getrandom wants to maintain a fork/vendoring of that Mutex.

I think there is tension between getrandom, which wants to be as narrowly-scoped as possible (basically one syscall or equivalent) across every OS, and Rustix, which wants to have a very broad scope (the entire kernel<->userspace API, basically), across a small range of targets.

I think the intersection of those interests that isn't currently being met is in supporting --linux-none and similar targets without a libc. We already have a PR in #461 that addresses that for x86_64-*-linux-none, which is the only such target we know about. That PR is easy to "scale" across target architectures, even ones that rustix doesn't support, because we basically just need to gather all the architecture -> syscall number mappings from the libc crate. OTOH, it seems difficult (for us) to port Rustix to every architecture that Rust supports for Linux.

@notgull
Copy link
Author

notgull commented Jun 11, 2024

OTOH, IDK that getrandom wants to maintain a fork/vendoring of that Mutex.

That implementation is mostly taken from rustix-futex-sync. Unfortunately that crate has dependencies other than Rustix at the moment, which would make me hesitate to include it here. See sunfishcode/rustix-futex-sync#10.

I think there is tension between getrandom, which wants to be as narrowly-scoped as possible (basically one syscall or equivalent) across every OS, and Rustix, which wants to have a very broad scope (the entire kernel<->userspace API, basically), across a small range of targets.

This tension exists in libc as well, I'd say.

We already have a PR in #461 that addresses that for x86_64-*-linux-none, which is the only such target we know about. That PR is easy to "scale" across target architectures

There's a lot of subtlety in the details when it comes to syscalls, especially across multiple architectures. For instance, there's different conventions on x86 depending on the host CPU, not to mention the whole vDSO thing. It gets tricky once you get down in the weeds.

As mentioned here, it is tricky to get system calls right, especially across architectures. While @rust-random might have the resources to maintain this, other crates will not. It would be a bad precedent to set, especially since getrandom is one of the most downloaded crates and therefore has high visibility.

OTOH, it seems difficult (for us) to port Rustix to every architecture that Rust supports for Linux.

rustix is in use by many different parties, so it's not like @rust-random will be solely on the hook for porting rustix to new architectures. I'm involved in rustix's maintenance as @smol-rs and @rust-windowing depend upon it. Many other parties depend on rustix as well.

@newpavlov
Copy link
Member

newpavlov commented Jun 11, 2024

This tension exists in libc as well, I'd say.

libc provides only raw bindings to the functions without any wrapping on top. Effectively, it's just a glorified header file. And even then, on many targets we do our own extern definitions without relying on "platform" crates.

There's a lot of subtlety in the details when it comes to syscalls, especially across multiple architectures.

I wouldn't say it's "a lot". The x86 case is the main stumbling block. And if we are fine with the subpar performance of int 0x80, then even it becomes relatively straightforward.

Also, this is why I strongly argue that raw syscall functions should be exposed by linux-raw-sys or a similar crate. If it will not be done by rustix developers, we can do it either as part of getrandom or in our own crate.

I think a good way forward may be to unconditionally use raw syscalls on Linux for the getrandom syscall, while the fallback code will continue to use libc. By enabling the linux-disable-fallback crate feature getrandom will be libc-free. Technically, the crate itself will still depend on libc, but resulting binary code will not be linked to it.

@notgull
Copy link
Author

notgull commented Jun 16, 2024

I think a good way forward may be to unconditionally use raw syscalls on Linux for the getrandom syscall, while the fallback code will continue to use libc. By enabling the linux-disable-fallback crate feature getrandom will be libc-free. Technically, the crate itself will still depend on libc, but resulting binary code will not be linked to it.

As I've demonstrated, rustix is usable in this position as well.

I wouldn't say it's "a lot". The x86 case is the main stumbling block. And if we are fine with the subpar performance of int 0x80, then even it becomes relatively straightforward.

My main issue here, as I've stated above, is that this sets an awful precedent. @rust-random may be able to properly maintain syscall bindings, but other, smaller crates won't be able to.

@notgull
Copy link
Author

notgull commented Jun 16, 2024

That being said, pragmatically I would prefer to have a "raw syscall" backend for getrandom than being locked into libc. So I would be reluctantly willing to investigate a "raw syscall" crate for this reason.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants