rewrite wasi-common in terms of cap-std #2487

Merged
merged 267 commits into from
Feb 5, 2021

Contributor

@pchickey pchickey commented Dec 8, 2020

This PR is a complete rewrite of the wasi-common crate. It uses @sunfishcode's new cap-std family of crates to provide a sandboxed implementation of WASI on the local filesystem.

Note that this is a breaking change for all users of WASI with Wasmtime, except for the C API.

Re-architecting wasi-common

Over the past year or so, we've run up against many design problems in wasi-common. Many of these problems ended up being so fundamental that a rewrite, as prolonged as it was, ended up being more tractable than incremental changes.

The Table

wasi-common now has a Table type that is designed to map u32 handles to resources. The table is now part of the public interface to a WasiCtx - it is reference counted so that it can be shared beyond a WasiCtx with other WASI proposals (e.g. wasi-crypto and wasi-nn) to manage their resources. Elements in the Table are Any typed.

The Table type is intended to model how the Interface Types concept of Resources is shaping up. Right now it is just an approximation.
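To make the idea concrete, here is a minimal sketch of a handle table, written against std only. It is not the Table type from this PR: the method names (push, get, delete) and the key allocation strategy are assumptions chosen for illustration.

```rust
use std::any::Any;
use std::collections::HashMap;

/// Hypothetical stand-in for a table mapping u32 handles to
/// dynamically typed resources. Not the actual wasi-common type.
pub struct Table {
    map: HashMap<u32, Box<dyn Any>>,
    next_key: u32,
}

impl Table {
    pub fn new() -> Self {
        Table { map: HashMap::new(), next_key: 0 }
    }

    /// Store a resource and return the fresh handle that refers to it.
    /// (A real table would also need to handle key exhaustion.)
    pub fn push(&mut self, resource: Box<dyn Any>) -> u32 {
        let key = self.next_key;
        self.next_key += 1;
        self.map.insert(key, resource);
        key
    }

    /// Borrow the resource behind `key` if it exists and has type `T`.
    pub fn get<T: Any>(&self, key: u32) -> Option<&T> {
        self.map.get(&key)?.downcast_ref::<T>()
    }

    /// Remove the resource behind `key`, whatever its type.
    pub fn delete(&mut self, key: u32) -> Option<Box<dyn Any>> {
        self.map.remove(&key)
    }
}
```

Wrapping a table like this in an Rc<RefCell<...>> is one way to share it between a context and other proposals that want to store their own resource types in it.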

The WasiFile and WasiDir traits

The WASI specification only defines one handle type, fd, on which all operations on both files and directories (aka dirfds) are defined. We believe this is a design mistake, and are architecting wasi-common to make this straightforward to correct in future snapshots of WASI. Wasi-common internally treats files and directories as two distinct resource types in the table - Box<dyn WasiFile> and Box<dyn WasiDir>. The snapshot 0 and 1 interfaces via fd will attempt to downcast a table element to one or both of these interfaces depending on what is appropriate - e.g. fd_close operates on both files and directories, fd_read only operates on files, and fd_readdir only operates on directories.
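The downcasting described above might look roughly like the following sketch. The traits are reduced to one method each, MemFile and MemDir are hypothetical in-memory impls, and returning Errno::Badf on a type mismatch is an assumption rather than a statement about what this PR does.

```rust
use std::any::Any;
use std::collections::HashMap;

// Heavily reduced, hypothetical versions of the two resource traits.
pub trait WasiFile {
    fn read(&self) -> Vec<u8>;
}
pub trait WasiDir {
    fn readdir(&self) -> Vec<String>;
}

// Hypothetical in-memory implementations, for illustration only.
pub struct MemFile(pub Vec<u8>);
impl WasiFile for MemFile {
    fn read(&self) -> Vec<u8> {
        self.0.clone()
    }
}
pub struct MemDir(pub Vec<String>);
impl WasiDir for MemDir {
    fn readdir(&self) -> Vec<String> {
        self.0.clone()
    }
}

#[derive(Debug, PartialEq)]
pub enum Errno {
    Badf,
}

/// Table elements are Any typed; files and directories are stored as
/// Box<dyn WasiFile> and Box<dyn WasiDir> respectively.
pub type Table = HashMap<u32, Box<dyn Any>>;

/// Only valid on files: the element must downcast to Box<dyn WasiFile>.
pub fn fd_read(table: &Table, fd: u32) -> Result<Vec<u8>, Errno> {
    let entry = table.get(&fd).ok_or(Errno::Badf)?;
    let file = entry.downcast_ref::<Box<dyn WasiFile>>().ok_or(Errno::Badf)?;
    Ok(file.read())
}

/// Only valid on directories: the element must downcast to Box<dyn WasiDir>.
pub fn fd_readdir(table: &Table, fd: u32) -> Result<Vec<String>, Errno> {
    let entry = table.get(&fd).ok_or(Errno::Badf)?;
    let dir = entry.downcast_ref::<Box<dyn WasiDir>>().ok_or(Errno::Badf)?;
    Ok(dir.readdir())
}

/// Valid on both kinds of resource.
pub fn fd_close(table: &mut Table, fd: u32) -> Result<(), Errno> {
    let entry = table.get(&fd).ok_or(Errno::Badf)?;
    if entry.is::<Box<dyn WasiFile>>() || entry.is::<Box<dyn WasiDir>>() {
        table.remove(&fd);
        Ok(())
    } else {
        Err(Errno::Badf)
    }
}
```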

The WasiFile and WasiDir traits are defined by wasi-common in terms of types defined directly in the crate's source code (I decided it should NOT use those generated by the wiggle proc macros; see the snapshot architecture section below), as well as the cap_std::time family of types. Importantly, wasi-common itself provides no implementation of WasiDir, and only two trivial implementations of WasiFile, on the crate::pipe::{ReadPipe, WritePipe} types, which in turn just delegate to std::io::{Read, Write}. For wasi-common to access the local filesystem at all, you need to provide WasiFile and WasiDir impls, either through the new wasi-cap-std-sync crate found at crates/wasi-common/cap-std-sync (see the section on that crate below) or by providing your own implementation from elsewhere.

This design makes it possible for wasi-common embedders to statically reason about access to the local filesystem by examining what impls are linked into an application. We found that this separation of concerns also makes it pretty enjoyable to write alternative implementations, e.g. a virtual filesystem (which will land in a future PR).

Traits for the rest of WASI's features

Other aspects of a WASI implementation are not yet considered resources and accessed by handle. We plan to correct this design deficiency in WASI in the future, but for now we have designed the following traits to provide embedders with the same sort of implementation flexibility they get with WasiFile/WasiDir:

  • Timekeeping: clocks::WasiSystemClock and clocks::WasiMonotonicClock provide the two interfaces for a clock. WasiSystemClock represents time as a cap_std::time::SystemTime, and WasiMonotonicClock represents time as a cap_std::time::Instant.
  • Randomness: we re-use the cap_rand::RngCore trait to represent a randomness source. A trivial Deterministic impl is provided.
  • Scheduling: The WasiSched trait abstracts over the sched_yield and poll_oneoff functions.

Users can provide implementations of each of these interfaces to the WasiCtx::builder(...) function. The wasi_cap_std_sync::WasiCtxBuilder::new() function uses this public interface to plug in its own implementations of each of these resources.
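As a rough illustration of this plug-in style, the sketch below wires a fixed clock and a deterministic byte source into a context. Everything here is a simplified assumption: std::time::SystemTime stands in for the cap_std type, RandomSource stands in for cap_rand::RngCore, and Ctx, FixedClock, and this Deterministic are hypothetical types, not the ones in wasi-common.

```rust
use std::time::SystemTime;

/// Reduced stand-in for the system clock trait (std's SystemTime is
/// used here in place of cap_std::time::SystemTime).
pub trait WasiSystemClock {
    fn now(&self) -> SystemTime;
}

/// Hypothetical stand-in for a randomness source trait.
pub trait RandomSource {
    fn fill_bytes(&mut self, dest: &mut [u8]);
}

/// A clock that always reports the same instant, useful in tests.
pub struct FixedClock(pub SystemTime);
impl WasiSystemClock for FixedClock {
    fn now(&self) -> SystemTime {
        self.0
    }
}

/// A "random" source that cycles through a fixed byte sequence.
pub struct Deterministic {
    bytes: Vec<u8>,
    pos: usize,
}
impl Deterministic {
    pub fn new(bytes: Vec<u8>) -> Self {
        assert!(!bytes.is_empty(), "need at least one byte to repeat");
        Deterministic { bytes, pos: 0 }
    }
}
impl RandomSource for Deterministic {
    fn fill_bytes(&mut self, dest: &mut [u8]) {
        for d in dest.iter_mut() {
            *d = self.bytes[self.pos];
            self.pos = (self.pos + 1) % self.bytes.len();
        }
    }
}

/// Hypothetical context: the embedder chooses every implementation.
pub struct Ctx {
    pub system_clock: Box<dyn WasiSystemClock>,
    pub random: Box<dyn RandomSource>,
}
impl Ctx {
    pub fn new(system_clock: Box<dyn WasiSystemClock>, random: Box<dyn RandomSource>) -> Self {
        Ctx { system_clock, random }
    }
}
```

An embedder that wants reproducible runs could pass impls like these, while a host-backed builder would pass impls that consult the operating system.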

Snapshot architecture

One goal we've had for a while, but not quite met, is for multiple WASI snapshots to provide an interface to the same underlying WasiCtx. This provides us a path to evolve WASI by allowing the same WASI Command to import functions from different snapshots - e.g. the user could use Rust's std which imports snapshot 1, but also depend directly on the wasi crate which imports some future snapshot 2. Right now, this amounts to supporting snapshot 1 and "snapshot 0" aka wasi_unstable at once.

The architectural rules for snapshots are:

  • Snapshots are arranged into modules under crate::snapshots::.
  • Each snapshot should invoke wiggle::from_witx! with ctx: crate::WasiCtx in its module, and impl all of the required traits.
  • Snapshots can be implemented in terms of other snapshots. For example, snapshot 0 is mostly implemented by calling the snapshot 1 implementation, and converting its own types back and forth with the snapshot 1 types. In a few cases, that is not feasible, so snapshot 0 carries its own implementations in terms of the WasiFile and WasiSched traits.
  • Snapshots can be implemented in terms of the Wasi* traits given by WasiCtx. No further downcasting via the as_any escape hatch is permitted.
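The "snapshot 0 in terms of snapshot 1" rule can be sketched as follows. This is a toy model, not the generated wiggle code: the WasiCtx fields, the fd_filesize function, and both Errno enums are hypothetical, and only the shape (each snapshot owns its types, the older one converts and delegates) reflects the rules above.

```rust
/// Hypothetical shared context: maps an fd (the index) to a file size.
pub struct WasiCtx {
    pub file_sizes: Vec<u64>,
}

pub mod preview_1 {
    use super::WasiCtx;

    /// Snapshot 1's own error type.
    #[derive(Debug, PartialEq)]
    pub enum Errno {
        Badf,
    }

    /// The "real" implementation lives in the newer snapshot.
    pub fn fd_filesize(ctx: &WasiCtx, fd: u32) -> Result<u64, Errno> {
        ctx.file_sizes.get(fd as usize).copied().ok_or(Errno::Badf)
    }
}

pub mod preview_0 {
    use super::{preview_1, WasiCtx};

    /// Snapshot 0 has its own, separately defined error type.
    #[derive(Debug, PartialEq)]
    pub enum Errno {
        Badf,
    }

    /// Conversion from the snapshot 1 type to the snapshot 0 type.
    impl From<preview_1::Errno> for Errno {
        fn from(e: preview_1::Errno) -> Self {
            match e {
                preview_1::Errno::Badf => Errno::Badf,
            }
        }
    }

    /// Snapshot 0 delegates to snapshot 1 and converts the result.
    pub fn fd_filesize(ctx: &WasiCtx, fd: u32) -> Result<u64, Errno> {
        preview_1::fd_filesize(ctx, fd).map_err(Errno::from)
    }
}
```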

The wasi_common::Error type

wasi_common::Error is now anyhow::Error. wasi_common::snapshots::preview_1 contains all of the logic for transforming an Error into an Errno, by downcasting the error into any of

  • std::io::Error - these are thrown by std, cap_std, etc for most of the operations WASI is concerned with.
  • wasi_common::ErrorKind - these are a subset of the Errnos, and are constructed directly by wasi-common or an impl rather than coming from the OS or some library which doesn't know about WASI.
  • wiggle::GuestError
  • std::num::TryFromIntError
  • std::str::Utf8Error
and then applying specialized logic to translate each of those into Errnos.

The wasi_common::ErrorExt trait provides human-friendly constructors for the wasi_common::ErrorKind variants.

If you throw an error that does not downcast to one of those, it will turn into a wiggle::Trap and terminate execution.

The real value of using anyhow::Error here is being able to use anyhow::Result::context to aid in debugging of errors.
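The downcast-then-translate step can be approximated with std alone. In this sketch a `dyn std::error::Error` stands in for anyhow::Error, the Errno and ErrorKind enums are cut down to a few hypothetical variants, and the specific mappings are assumptions for illustration. Returning None models the "becomes a trap" case.

```rust
use std::error::Error;
use std::fmt;
use std::io;

/// Cut-down, hypothetical errno set.
#[derive(Debug, PartialEq)]
pub enum Errno {
    Noent,
    Acces,
    Badf,
    Notsup,
    Overflow,
    Ilseq,
    Io,
}

/// Cut-down stand-in for an ErrorKind constructed by WASI-aware code.
#[derive(Debug)]
pub enum ErrorKind {
    Badf,
    Notsup,
}
impl fmt::Display for ErrorKind {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "{:?}", self)
    }
}
impl Error for ErrorKind {}

/// Try each known concrete error type in turn. None means the error is
/// not one we know how to report to the guest.
pub fn to_errno(err: &(dyn Error + 'static)) -> Option<Errno> {
    if let Some(e) = err.downcast_ref::<io::Error>() {
        // Assumed mapping; a real one would cover many more kinds.
        return Some(match e.kind() {
            io::ErrorKind::NotFound => Errno::Noent,
            io::ErrorKind::PermissionDenied => Errno::Acces,
            _ => Errno::Io,
        });
    }
    if let Some(k) = err.downcast_ref::<ErrorKind>() {
        return Some(match k {
            ErrorKind::Badf => Errno::Badf,
            ErrorKind::Notsup => Errno::Notsup,
        });
    }
    if err.is::<std::num::TryFromIntError>() {
        return Some(Errno::Overflow);
    }
    if err.is::<std::str::Utf8Error>() {
        return Some(Errno::Ilseq);
    }
    None
}
```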

The wasi-cap-std-sync implementation of the WASI traits

The wasi-cap-std-sync crate provides impls of WasiFile and WasiDir in terms of cap_std::fs::{File, Dir}. These types provide sandboxed access to the local filesystem on both Unix and Windows.

The entire design story of cap-std is much bigger than we can address here, but in short, its functionality replaces all of the wasi_common::sys hierarchy, as well as the yanix / winx crates. All syscalls are hidden behind the cap-std hierarchy, with the lone exception of the sched implementation, which is provided for both unix and windows in separate modules.

Any wasi_common::{WasiCtx, WasiCtxBuilder} is interoperable with the wasi-cap-std-sync crate. However, for convenience, wasi-cap-std-sync provides its own WasiCtxBuilder that hooks up to all of the crate's components, i.e. it fills in all of the arguments to WasiCtx::builder(...), presents preopen_dir in terms of cap_std::fs::Dir, and provides convenience methods for inheriting the parent process's stdio, args, and env.

The only place we expect to run into long-term compatibility issues between wasi-cap-std-sync and the other impl crates that will come later is in the Sched abstraction. Once we can build an async scheduler based on Rust Futures, async impls will be able to interoperate, but the synchronous scheduler depends on downcasting the WasiFile type down to concrete types it knows about (which in turn impl AsRawFd for passing to unix poll, or the analogous traits on windows).

Why is this impl suffixed with -sync? Because async is coming soon (see #2434)! The async impl may end up depending on tokio or other relatively heavy deps, so we will retain a sync implementation so that wasi-common users have the option of not pulling in an async runtime.

Improving the WASI test suite

The bulk of wasi-common's test suite lives at crates/test-programs. This test suite was extremely useful for guiding this rewrite. We improved the test suite in numerous ways, including

  • Breaking tests out into smaller units, so that failing behavior can be isolated more easily
  • Introducing a scheme which allows the test runner to describe to the wasm code what behaviors to expect of the embedding. Behaviors which vary between platforms (e.g. fd_allocate is impossible to faithfully implement on Windows and is not provided by macOS) are communicated to the guest by environment variables, which in turn are available in the guest behind the global TESTCONFIG struct.
  • Introducing the assert_errno! macro, which pretty-prints errnos by name rather than number in the error message. assert_errno! integrates with TESTCONFIG to specify which errno is expected on which platform. This allows the test source to describe the full set of acceptable errnos a call may return (all possible errnos will be acceptable if no ERRNO_MODE_* env var is set), and also detect regressions on any given platform.
  • We still use crates/test-programs/build.rs to generate the test suite, and use functions directly in build.rs to describe the TESTCONFIG expectations, as well as which tests are to be ignored (expected to fail due to regressions) on what platforms.
  • The test suite has been generalized to support multiple backends to the wasi-common crate. This PR will only land with support for testing wasi-cap-std-sync, but it will be easy to add more runtimes under tests/wasm_tests/runtime/ as more impls are created, with a minimum of boilerplate in build.rs.
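For a feel of what an assert_errno!-style check does, here is a simplified sketch. It is not the macro from the test suite: it ignores TESTCONFIG and platform selection entirely, the Errno enum and the check_errno helper are hypothetical, and it only demonstrates accepting a set of errnos and reporting them by name.

```rust
/// Cut-down, hypothetical errno set; Debug gives us names for messages.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Errno {
    Noent,
    Notdir,
    Perm,
}

/// Ok if `actual` is one of the acceptable errnos, otherwise a message
/// that names the errnos instead of printing raw numbers.
pub fn check_errno(actual: Errno, expected: &[Errno]) -> Result<(), String> {
    if expected.contains(&actual) {
        Ok(())
    } else {
        Err(format!("expected errno in {:?}, got {:?}", expected, actual))
    }
}

/// Panics with a readable message unless the first argument matches one
/// of the remaining arguments.
macro_rules! assert_errno {
    ($actual:expr, $($expected:expr),+ $(,)?) => {
        if let Err(msg) = check_errno($actual, &[$($expected),+]) {
            panic!("{}", msg);
        }
    };
}
```

A call such as `assert_errno!(err, Errno::Noent, Errno::Notdir)` then passes if either errno was returned and otherwise panics with a message that lists both names.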

Changes to wiggle-wasmtime and dependents

You used to pass a WasiCtx (or whatever ctx type) directly to the Wasi struct generated by wiggle-wasmtime, and the generated code would wrap the ctx into a Rc<RefCell<ctx>>. This got in the way of supporting multiple snapshots simultaneously, so now callers of the generated code have to pass in an Rc<RefCell<ctx>>.
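The reason a caller-supplied Rc<RefCell<ctx>> helps can be shown with a small sketch. The WasiCtx, Snapshot0, and Snapshot1 types below are hypothetical stand-ins for the generated structs; the point is only that two front ends holding clones of the same Rc observe the same state.

```rust
use std::cell::RefCell;
use std::rc::Rc;

/// Hypothetical context with a single piece of state.
pub struct WasiCtx {
    pub args: Vec<String>,
}

/// Two hypothetical snapshot front ends sharing one context.
pub struct Snapshot0 {
    ctx: Rc<RefCell<WasiCtx>>,
}
pub struct Snapshot1 {
    ctx: Rc<RefCell<WasiCtx>>,
}

impl Snapshot0 {
    pub fn new(ctx: Rc<RefCell<WasiCtx>>) -> Self {
        Snapshot0 { ctx }
    }
    /// Reads through the shared context.
    pub fn args_len(&self) -> usize {
        self.ctx.borrow().args.len()
    }
}

impl Snapshot1 {
    pub fn new(ctx: Rc<RefCell<WasiCtx>>) -> Self {
        Snapshot1 { ctx }
    }
    /// Mutates the shared context.
    pub fn push_arg(&self, arg: &str) {
        self.ctx.borrow_mut().args.push(arg.to_string());
    }
}
```

If each generated struct wrapped its own ctx internally, the two snapshots would end up with separate state, which is why the caller now constructs the Rc<RefCell<...>> once and hands out clones.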

I made mechanical changes to wasi-nn (cc @abrown) and wasi-crypto (cc @jedisct1) to accommodate these changes. This was of particular help to the wasi-crypto code - I was able to erase an entire Rc<> indirection by getting wiggle out of our way. Sorry that I did not take the time to break this out into a standalone PR.

Changes to wasmtime-wasi

wasmtime-wasi now supports using multiple snapshots to interface to the same WasiCtx!

  • wasmtime_wasi::Wasi is now a struct, constructed with wasmtime_wasi::Wasi::new(&Store, WasiCtx), which owns your WasiCtx and provides linkage to every available snapshot.
  • Individual snapshots are available through wasmtime_wasi::snapshots::preview_{0, 1}::Wasi::new(&Store, Rc<RefCell<WasiCtx>>).

Everyone should use wasmtime_wasi::Wasi unless you have a really good reason not to. The C API is the only spot that I didn't port to use this.

Regressions

Some behavior around trailing slashes in paths has changed relative to what the test suite previously expected. We expect there are some improvements we'll make to these corner cases after this PR lands.

For now, the path_rename_file_trailing_slashes and remove_directory_trailing_slashes tests are #[ignore]'d on all platforms, and additionally the interesting_paths test is ignored on Windows.

Additionally, some behavior around the FdFlags::{SYNC, DSYNC, RSYNC} flags has changed. For now, cap-std does not support opening a file with these flags. Using these flags will result in an Errno::NOTSUP (Not supported).

Outstanding issues required to merge this PR:

@github-actions github-actions bot added the wasi Issues pertaining to WASI label Dec 16, 2020

Member

@sunfishcode sunfishcode left a comment

Looks great!

Labels
wasi Issues pertaining to WASI wasmtime:c-api Issues pertaining to the C API.

5 participants