From 139d7f0121bd95dec5bf75d615c8dcd2cb6c602f Mon Sep 17 00:00:00 2001 From: Andy Wang Date: Mon, 24 May 2021 00:21:37 +0100 Subject: [PATCH 01/32] Initial version of trim-path RFC --- text/3127-trim-path.md | 192 +++++++++++++++++++++++++++++++++++++++++ 1 file changed, 192 insertions(+) create mode 100644 text/3127-trim-path.md diff --git a/text/3127-trim-path.md b/text/3127-trim-path.md new file mode 100644 index 00000000000..31e567f661d --- /dev/null +++ b/text/3127-trim-path.md @@ -0,0 +1,192 @@ +- Feature Name: trim-path +- Start Date: 2021-05-24 +- RFC PR: [rust-lang/rfcs#3127](https://github.com/rust-lang/rfcs/pull/3127) +- Rust Issue: N/A + +# Summary +[summary]: #summary + +Cargo should have a [profile setting](https://doc.rust-lang.org/cargo/reference/profiles.html#profile-settings) named `trim-path` +to sanitise absolute paths introduced during compilation that may be embedded in the compilation output. This should be enabled by default for +`release` profile. + +# Motivation +[motivation]: #motivation + +## Sanitising local paths that are currently embedded +Currently, executables and libraies built by Cargo have a lot of embedded absolute paths. They most frequently appear in debug information and +panic messages (pointing to the panic location source file). 
As an example, consider the following package: + +`Cargo.toml`: + +```toml +[package] +name = "rfc" +version = "0.1.0" +edition = "2018" + +[dependencies] +rand = "0.8.0" +``` +`src/main.rs` + +```rust +use rand::prelude::*; + +fn main() { + let r: f64 = rand::thread_rng().gen(); + println!("{}", r); +} +``` + +Then run + +```bash +$ cargo build --release +$ strings target/release/rfc | grep $HOME +``` + +We will see some absolute paths pointing to dependency crates downloaded by Cargo, containing our username: + +``` +could not initialize thread_rng: /home/username/.cargo/registry/src/github.com-1ecc6299db9ec823/rand-0.8.3/src/rngs/thread.rs +/home/username/.cargo/registry/src/github.com-1ecc6299db9ec823/rand_chacha-0.3.0/src/guts.rsdescription() is deprecated; use Display +/home/username/.cargo/registry/src/github.com-1ecc6299db9ec823/getrandom-0.2.2/src/util_libc.rs +``` + +This is undesirable for the following reasons: + +1. **Privacy**. `release` binaries may be distributed, and anyone could then see the builder's local OS account username. + Additionally, some CI (such as [GitLab CI](https://docs.gitlab.com/runner/best_practice/#build-directory)) checks out the repo under a path where + it may include things that really aren't meant to be public. Without sanitising the path by default, this may be inadvertently leaked. +2. **Build reproducibility**. We would like to make it easier to reproduce binary equivalent builds. While it is not required to maintain + reproducibility across different environments, removing environment-sensitive information from the build will increase tolerance on the inevitable + environment differences when trying to verify builds. + +## Handling sysroot paths +At the moment, paths to the source files of standard and core libraries, even when they are present, always begin with a virtual prefix in the form +of `/rustc/[SHA1 hash]/library`. This is not an issue when the source files are not present (i.e. 
when `rust-src` component is not installed), but +when a user installs `rust-src` they expect the path to their local copy of source files to be visible. Hence the user should be given an option for +the local paths to show up in panic messages and backtraces. + +# Guide-level explanation +[guide-level-explanation]: #guide-level-explanation + +`trim-path` is a profile setting which can be set to either `true` or `false`. This is enabled by default when you do a release build, +such as via `cargo build --release`. You can also manually override it by specifying this option in `Cargo.toml`: +```toml +[profile.dev] +trim-path = true + +[profile.release] +trim-path = false +``` + +With `trim-path` option enabled, the compilation process will not introduce any absolute paths into the build output. Instead, paths containing +certain prefixes will be replaced with something stable by the following rules: + +1. Path to the source files of the standard and core library will begin with `/rustc/[rustc version]`. + E.g. `/home/username/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/core/src/result.rs` -> + `/rustc/1.52.1/library/core/src/result.rs` +2. Path to the working directory will be replaced with `.`. E.g. `/home/username/crate/src/lib.rs` -> `./src/lib.rs`. +3. Path to packages outside of the working directory will be replaced with `[package name]-[version]`. E.g. `/home/username/deps/foo/src/lib.rs` -> `foo-0.1.0/src/lib.rs` + +If using MSVC toolchain, path to the .pdb file containing debug information are be embedded as the file name of the .pdb file only, wihtout any path +information. + +With `trim-path` option disabled, the embedding of path to the source files of the standard and core library will depend on if `rust-src` component is present. 
If it is, then the real path pointing to a copy of the source files on your file system will be embedded; if it isn't, then they will +show up as `/rustc/[rustc version]/library/...` (just like when `trim-path` is enabled). Path to all other source files will not be affected. + +Note that this will not affect any hard-coded paths in the source code. + +# Reference-level explanation +[reference-level-explanation]: #reference-level-explanation + +## `trim-path` implementation in Cargo +We only need to change the behaviour for `Test` and `Build` compile modes. + +If `trim-path` is enabled, Cargo will emit two `--remap-path-prefix` arguments to `rustc` for each compilation unit. One mapping is from the path of +the local sysroot to `/rustc/[rust version]`. The other mapping depends on if the package containing the compilation unit is under the working +directory. If it is, then the mapping is from the absolute path to the working directory to `.`. If it's outside the working directory, then the +mapping is from the absolute path of the package root to `[package name]-[package version]`. + +Some interactions with compiler-intrinstic macros need to be considered, though these are entirely down to `rustc`'s implementation of +`--remap-path-prefix`: +1. Path (of the current file) introduced by [`file!()`](https://doc.rust-lang.org/std/macro.file.html) *will* be remapped. **Things may break** if + the code interacts with its own source file at runtime by using this macro. +2. Path introduced by [`include!()`](https://doc.rust-lang.org/std/macro.include.html) *will* be remapped, given that the included file is under + the current working directory or a dependency package. + +If the user further supplies custom `--remap-path-prefix` arguments via `RUSTFLAGS` or similar mechanisms, they will take precedence over the one +supplied by `trim-path`. This means that the user-defined `--remap-path-prefix`s must be supplied *after* Cargo's own remapping. 
+ +Additionally, when using MSVC linker, Cargo should emit `/PDBALTPATH:%_PDB%` to the linker via `-C link-arg`. This makes the linker embed +only the file name of the .pdb file without the path to it. + +## Changing handling of sysroot path +The virtualisation of sysroot files to `/rustc/[SHA1 hash]/library/...` was done at compiler bootstraping, specifically when +`remap-debuginfo = true` in `config.toml`. This is done for Rust distribution on all channels. + +At `rustc` runtime (i.e. compiling some code), we try to correlate this virtual path to a real path pointing to the file on the local file system. +Currently the result is represented internally as if the path was remapped by `--remap-path-prefix`, holding both the virtual name and local path. +Only the virtual name is ever emitted for metadata or codegen. We want to change this behaviour such that, when `rust-src` source files can be +discovered, the virutal path is discarded and therefore will be embedded unless being remapped by `--remap-path-prefix` in the usual way. The relevant part of the code is here: +https://github.com/rust-lang/rust/blob/d8af907491e20339e41d048d6a32b41ddfa91dfe/compiler/rustc_metadata/src/rmeta/decoder.rs#L1637-L1765 + +We would also like to change the virtualisation of sysroot to `/rustc/[rustc version]/library/...`, instead of the rustc commit hash. This is shorter and more helpful as an identifier, and makes `trim-path` easier to implement: to make the embedded path the same whether or not `rust-src` is installed, we need to emit the same sysroot virutalisation as was done during bootstrapping. Getting the version number is easier than getting the commit hash. 
The relevant part of the code is here: https://github.com/rust-lang/rust/blob/d8af907491e20339e41d048d6a32b41ddfa91dfe/src/bootstrap/lib.rs#L831-L834 + +# Drawbacks +[drawbacks]: #drawbacks + +With `trim-path` enabled, if the `debug` option is simultaneously not `false` (it is turned off by default under `release` profile), paths in +debuginfo will also be remapped. Debuggers will no longer be able to automatically discover and load source files outside of the working directory. +This can be remidated by [debugger features](https://lldb.llvm.org/use/map.html#miscellaneous) remapping the path back to a filesystem path. + +The user also will not be able to `Ctrl+click` on any paths provided in panic messages or backtraces outside of the working directory. But +there shouldn't be any confusion as the combination of pacakge name and version can be used to pinpoint the file. + +As mentioned above, `trim-path` may break code that relies on `file!()` to evaluate to an accessible path to the file. Hence enabling +it by default for release builds may be a technically breaking change. Occurances of such use should be extremely rare but should be investigated +via a Crater run. In case this breakage is unacceptable, `trim-path` can be made an opt-in option rather than default in any build profile. + +# Rationale and alternatives +[rationale-and-alternatives]: #rationale-and-alternatives + +There has been an issue (https://github.com/rust-lang/rust/issues/40552) asking for path sanitisation to be implemented and enabled by default for +release builds. It has, over the past 4 years, gained a decent amount of popular support. The remapping rule proposed here is very simple to +implement. + +Path to sysroot crates are specially handled by `rustc`. Due to this, the behaviour we currently have is that all such paths are virtualised. +Although good for privacy and reproducibility, some people find it a hinderance for debugging: https://github.com/rust-lang/rust/issues/85463. 
+Hence the user should be given control on if they want the virtual or local path. + +One alternative for the sysroot handling is to keep the logic in `rustc` largely the same, always emitting the virutalised path by default, and +then introduce an extra option named `--embed-local-sysroot` to embed the local paths if the source files can be found. This inovles adding an extra +option to `rustc` and prevents any uniformity in `--remap-path-prefix`'s handling over sysroot paths, compared to other paths (it currently doesn't +affect sysroot paths at all). + +# Prior art +[prior-art]: #prior-art + +The name `trim-path` came from the [similar feature](https://golang.org/cmd/go/#hdr-Compile_packages_and_dependencies) in Go. An alternative name +`sanitize-paths` was first considered but the spelling of "sanitise" differs across the pond and down under. It is also not as short and concise. + +Go does not enable this by default. Since Go does not differ between debug and release builds, removing absolute paths for all build would be +a hassle for debugging. However this is not an issue for Rust as we have separate debug build profile. + +# Unresolved questions +[unresolved-questions]: #unresolved-questions + +- Should the option be called `trim-paths` (plural) instead of `trim-path`? Quite a few other option names are plural, such as `debug-assertions` + and `overflow-checks`. +- Should we treat the current working directory the same as other packages? We could have one fewer remapping rule by remapping all + package roots to `[package name]-[version]`. A minor downside to this is not being able to `Ctrl+click` on paths to files the user is working + on from panic messages. +- Should we use a slightly more complex remapping rule, like distinguishing packages from registry, git and path, as mentioned in https://github.com/rust-lang/rust/issues/40552? +- Will these cover all potentially embedded paths? Have we missed anything? 
+- Should we make this affect more `CompileMode`s, such as `Check`, where the emitted `rmeta` file will also contain absolute paths? + +# Future possibilities +[future-possibilities]: #future-possibilities + +N/A \ No newline at end of file From bcbd1318950fa5333f6867390518c5a74633d31e Mon Sep 17 00:00:00 2001 From: Andy Wang Date: Sun, 30 May 2021 10:42:22 +0100 Subject: [PATCH 02/32] Update text/3127-trim-path.md Co-authored-by: teor --- text/3127-trim-path.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/text/3127-trim-path.md b/text/3127-trim-path.md index 31e567f661d..92c15f1dda8 100644 --- a/text/3127-trim-path.md +++ b/text/3127-trim-path.md @@ -95,7 +95,7 @@ If using MSVC toolchain, path to the .pdb file containing debug information are information. With `trim-path` option disabled, the embedding of path to the source files of the standard and core library will depend on if `rust-src` component is present. If it is, then the real path pointing to a copy of the source files on your file system will be embedded; if it isn't, then they will -show up as `/rustc/[rustc version]/library/...` (just like when `trim-path` is enabled). Path to all other source files will not be affected. +show up as `/rustc/[rustc version]/library/...` (just like when `trim-path` is enabled). Paths to all other source files will not be affected. Note that this will not affect any hard-coded paths in the source code. @@ -189,4 +189,4 @@ a hassle for debugging. 
However this is not an issue for Rust as we have separat # Future possibilities [future-possibilities]: #future-possibilities -N/A \ No newline at end of file +N/A From 97e410490c0ee339392c4f4b6303333ba16c0d74 Mon Sep 17 00:00:00 2001 From: Andy Wang Date: Tue, 1 Jun 2021 21:06:02 +0100 Subject: [PATCH 03/32] Update text/3127-trim-path.md Co-authored-by: Josh Triplett --- text/3127-trim-path.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/text/3127-trim-path.md b/text/3127-trim-path.md index 92c15f1dda8..aa43d8134c6 100644 --- a/text/3127-trim-path.md +++ b/text/3127-trim-path.md @@ -130,7 +130,7 @@ The virtualisation of sysroot files to `/rustc/[SHA1 hash]/library/...` was done At `rustc` runtime (i.e. compiling some code), we try to correlate this virtual path to a real path pointing to the file on the local file system. Currently the result is represented internally as if the path was remapped by `--remap-path-prefix`, holding both the virtual name and local path. Only the virtual name is ever emitted for metadata or codegen. We want to change this behaviour such that, when `rust-src` source files can be -discovered, the virutal path is discarded and therefore will be embedded unless being remapped by `--remap-path-prefix` in the usual way. The relevant part of the code is here: +discovered, the virtual path is discarded and therefore will be embedded unless being remapped by `--remap-path-prefix` in the usual way. The relevant part of the code is here: https://github.com/rust-lang/rust/blob/d8af907491e20339e41d048d6a32b41ddfa91dfe/compiler/rustc_metadata/src/rmeta/decoder.rs#L1637-L1765 We would also like to change the virtualisation of sysroot to `/rustc/[rustc version]/library/...`, instead of the rustc commit hash. 
This is shorter and more helpful as an identifier, and makes `trim-path` easier to implement: to make the embedded path the same whether or not `rust-src` is installed, we need to emit the same sysroot virutalisation as was done during bootstrapping. Getting the version number is easier than getting the commit hash. The relevant part of the code is here: https://github.com/rust-lang/rust/blob/d8af907491e20339e41d048d6a32b41ddfa91dfe/src/bootstrap/lib.rs#L831-L834 From d8344ef268701c8decc1a8ae1fc07df42d7084f3 Mon Sep 17 00:00:00 2001 From: Andy Wang Date: Tue, 1 Jun 2021 21:10:16 +0100 Subject: [PATCH 04/32] Use plural --- .../{3127-trim-path.md => 3127-trim-paths.md} | 34 +++++++++---------- 1 file changed, 16 insertions(+), 18 deletions(-) rename text/{3127-trim-path.md => 3127-trim-paths.md} (82%) diff --git a/text/3127-trim-path.md b/text/3127-trim-paths.md similarity index 82% rename from text/3127-trim-path.md rename to text/3127-trim-paths.md index aa43d8134c6..410053ee798 100644 --- a/text/3127-trim-path.md +++ b/text/3127-trim-paths.md @@ -1,4 +1,4 @@ -- Feature Name: trim-path +- Feature Name: trim-paths - Start Date: 2021-05-24 - RFC PR: [rust-lang/rfcs#3127](https://github.com/rust-lang/rfcs/pull/3127) - Rust Issue: N/A @@ -6,7 +6,7 @@ # Summary [summary]: #summary -Cargo should have a [profile setting](https://doc.rust-lang.org/cargo/reference/profiles.html#profile-settings) named `trim-path` +Cargo should have a [profile setting](https://doc.rust-lang.org/cargo/reference/profiles.html#profile-settings) named `trim-paths` to sanitise absolute paths introduced during compilation that may be embedded in the compilation output. This should be enabled by default for `release` profile. @@ -72,17 +72,17 @@ the local paths to show up in panic messages and backtraces. # Guide-level explanation [guide-level-explanation]: #guide-level-explanation -`trim-path` is a profile setting which can be set to either `true` or `false`. 
This is enabled by default when you do a release build, +`trim-paths` is a profile setting which can be set to either `true` or `false`. This is enabled by default when you do a release build, such as via `cargo build --release`. You can also manually override it by specifying this option in `Cargo.toml`: ```toml [profile.dev] -trim-path = true +trim-paths = true [profile.release] -trim-path = false +trim-paths = false ``` -With `trim-path` option enabled, the compilation process will not introduce any absolute paths into the build output. Instead, paths containing +With `trim-paths` option enabled, the compilation process will not introduce any absolute paths into the build output. Instead, paths containing certain prefixes will be replaced with something stable by the following rules: 1. Path to the source files of the standard and core library will begin with `/rustc/[rustc version]`. @@ -94,18 +94,18 @@ certain prefixes will be replaced with something stable by the following rules: If using MSVC toolchain, path to the .pdb file containing debug information are be embedded as the file name of the .pdb file only, wihtout any path information. -With `trim-path` option disabled, the embedding of path to the source files of the standard and core library will depend on if `rust-src` component is present. If it is, then the real path pointing to a copy of the source files on your file system will be embedded; if it isn't, then they will -show up as `/rustc/[rustc version]/library/...` (just like when `trim-path` is enabled). Paths to all other source files will not be affected. +With `trim-paths` option disabled, the embedding of path to the source files of the standard and core library will depend on if `rust-src` component is present. If it is, then the real path pointing to a copy of the source files on your file system will be embedded; if it isn't, then they will +show up as `/rustc/[rustc version]/library/...` (just like when `trim-paths` is enabled). 
Paths to all other source files will not be affected. Note that this will not affect any hard-coded paths in the source code. # Reference-level explanation [reference-level-explanation]: #reference-level-explanation -## `trim-path` implementation in Cargo +## `trim-paths` implementation in Cargo We only need to change the behaviour for `Test` and `Build` compile modes. -If `trim-path` is enabled, Cargo will emit two `--remap-path-prefix` arguments to `rustc` for each compilation unit. One mapping is from the path of +If `trim-paths` is enabled, Cargo will emit two `--remap-path-prefix` arguments to `rustc` for each compilation unit. One mapping is from the path of the local sysroot to `/rustc/[rust version]`. The other mapping depends on if the package containing the compilation unit is under the working directory. If it is, then the mapping is from the absolute path to the working directory to `.`. If it's outside the working directory, then the mapping is from the absolute path of the package root to `[package name]-[package version]`. @@ -118,7 +118,7 @@ Some interactions with compiler-intrinstic macros need to be considered, though the current working directory or a dependency package. If the user further supplies custom `--remap-path-prefix` arguments via `RUSTFLAGS` or similar mechanisms, they will take precedence over the one -supplied by `trim-path`. This means that the user-defined `--remap-path-prefix`s must be supplied *after* Cargo's own remapping. +supplied by `trim-paths`. This means that the user-defined `--remap-path-prefix`s must be supplied *after* Cargo's own remapping. Additionally, when using MSVC linker, Cargo should emit `/PDBALTPATH:%_PDB%` to the linker via `-C link-arg`. This makes the linker embed only the file name of the .pdb file without the path to it. @@ -133,21 +133,21 @@ Only the virtual name is ever emitted for metadata or codegen. 
We want to change discovered, the virtual path is discarded and therefore will be embedded unless being remapped by `--remap-path-prefix` in the usual way. The relevant part of the code is here: https://github.com/rust-lang/rust/blob/d8af907491e20339e41d048d6a32b41ddfa91dfe/compiler/rustc_metadata/src/rmeta/decoder.rs#L1637-L1765 -We would also like to change the virtualisation of sysroot to `/rustc/[rustc version]/library/...`, instead of the rustc commit hash. This is shorter and more helpful as an identifier, and makes `trim-path` easier to implement: to make the embedded path the same whether or not `rust-src` is installed, we need to emit the same sysroot virutalisation as was done during bootstrapping. Getting the version number is easier than getting the commit hash. The relevant part of the code is here: https://github.com/rust-lang/rust/blob/d8af907491e20339e41d048d6a32b41ddfa91dfe/src/bootstrap/lib.rs#L831-L834 +We would also like to change the virtualisation of sysroot to `/rustc/[rustc version]/library/...`, instead of the rustc commit hash. This is shorter and more helpful as an identifier, and makes `trim-paths` easier to implement: to make the embedded path the same whether or not `rust-src` is installed, we need to emit the same sysroot virutalisation as was done during bootstrapping. Getting the version number is easier than getting the commit hash. The relevant part of the code is here: https://github.com/rust-lang/rust/blob/d8af907491e20339e41d048d6a32b41ddfa91dfe/src/bootstrap/lib.rs#L831-L834 # Drawbacks [drawbacks]: #drawbacks -With `trim-path` enabled, if the `debug` option is simultaneously not `false` (it is turned off by default under `release` profile), paths in +With `trim-paths` enabled, if the `debug` option is simultaneously not `false` (it is turned off by default under `release` profile), paths in debuginfo will also be remapped. 
Debuggers will no longer be able to automatically discover and load source files outside of the working directory. This can be remidated by [debugger features](https://lldb.llvm.org/use/map.html#miscellaneous) remapping the path back to a filesystem path. The user also will not be able to `Ctrl+click` on any paths provided in panic messages or backtraces outside of the working directory. But there shouldn't be any confusion as the combination of pacakge name and version can be used to pinpoint the file. -As mentioned above, `trim-path` may break code that relies on `file!()` to evaluate to an accessible path to the file. Hence enabling +As mentioned above, `trim-paths` may break code that relies on `file!()` to evaluate to an accessible path to the file. Hence enabling it by default for release builds may be a technically breaking change. Occurances of such use should be extremely rare but should be investigated -via a Crater run. In case this breakage is unacceptable, `trim-path` can be made an opt-in option rather than default in any build profile. +via a Crater run. In case this breakage is unacceptable, `trim-paths` can be made an opt-in option rather than default in any build profile. # Rationale and alternatives [rationale-and-alternatives]: #rationale-and-alternatives @@ -168,7 +168,7 @@ affect sysroot paths at all). # Prior art [prior-art]: #prior-art -The name `trim-path` came from the [similar feature](https://golang.org/cmd/go/#hdr-Compile_packages_and_dependencies) in Go. An alternative name +The name `trim-paths` came from the [similar feature](https://golang.org/cmd/go/#hdr-Compile_packages_and_dependencies) in Go. An alternative name `sanitize-paths` was first considered but the spelling of "sanitise" differs across the pond and down under. It is also not as short and concise. Go does not enable this by default. 
Since Go does not differ between debug and release builds, removing absolute paths for all build would be @@ -177,8 +177,6 @@ a hassle for debugging. However this is not an issue for Rust as we have separat # Unresolved questions [unresolved-questions]: #unresolved-questions -- Should the option be called `trim-paths` (plural) instead of `trim-path`? Quite a few other option names are plural, such as `debug-assertions` - and `overflow-checks`. - Should we treat the current working directory the same as other packages? We could have one fewer remapping rule by remapping all package roots to `[package name]-[version]`. A minor downside to this is not being able to `Ctrl+click` on paths to files the user is working on from panic messages. From 408dc507c55103a7174146df2d68ede5cb727888 Mon Sep 17 00:00:00 2001 From: Andy Wang Date: Tue, 31 Aug 2021 11:11:50 +0100 Subject: [PATCH 05/32] Add `--remap-scope` proposal --- text/3127-trim-paths.md | 132 ++++++++++++++++++++++++++-------------- 1 file changed, 88 insertions(+), 44 deletions(-) diff --git a/text/3127-trim-paths.md b/text/3127-trim-paths.md index 410053ee798..76fa3304b1d 100644 --- a/text/3127-trim-paths.md +++ b/text/3127-trim-paths.md @@ -7,8 +7,15 @@ [summary]: #summary Cargo should have a [profile setting](https://doc.rust-lang.org/cargo/reference/profiles.html#profile-settings) named `trim-paths` -to sanitise absolute paths introduced during compilation that may be embedded in the compilation output. This should be enabled by default for -`release` profile. +to sanitise absolute paths introduced during compilation that may be embedded in the compiled binary executable or library, and optionally in +the separate debug symbols file (depending on `split-debuginfo` settings). + +`cargo build` with the default `release` profile should not produce any host filesystem dependent paths into binary executable or library. 
But +it will retain the paths in separate debug symbols file, if one exists, to help debuggers and profilers locate the source files. + +To facilitate this, a new flag named `--remap-scope` should be added to `rustc` controlling the behaviour of `--remap-path-prefix`, allowing us to fine +tune the scope of remapping, specifying paths under which context (in macro expansion, in debuginfo or in diagnostics) +should or shouldn't be remapped. # Motivation [motivation]: #motivation @@ -66,86 +73,119 @@ This is undesirable for the following reasons: ## Handling sysroot paths At the moment, paths to the source files of standard and core libraries, even when they are present, always begin with a virtual prefix in the form of `/rustc/[SHA1 hash]/library`. This is not an issue when the source files are not present (i.e. when `rust-src` component is not installed), but -when a user installs `rust-src` they expect the path to their local copy of source files to be visible. Hence the user should be given an option for -the local paths to show up in panic messages and backtraces. +when a user installs `rust-src` they may want the path to their local copy of source files to be visible. Hence the default behaviour when `rust-src` +is installed should be to embed the local path. These local paths should then be affected by path remappings in the usual way. + +## Preserving debuginfo to help debuggers +At the moment, `--remap-path-prefix` will cause paths to source files in debuginfo to be remapped. On platforms where the debuginfo resides in a +separate file from the distributable binary, this may be unnecessary and it prevents debuggers from being able to find the source. Hence `rustc` +should support finer grained control over paths in which contexts should be remapped. # Guide-level explanation [guide-level-explanation]: #guide-level-explanation -`trim-paths` is a profile setting which can be set to either `true` or `false`. 
This is enabled by default when you do a release build, -such as via `cargo build --release`. You can also manually override it by specifying this option in `Cargo.toml`: +## The rustc book: Command-line arguments + +### `--remap-scope`: configure the scope of path remapping + +When the `--remap-path-prefix` option is passed to rustc then source path prefixes in all output will be affected. +The `--remap-scope` argument can be used in conjunction with `--remap-path-prefix` to determine paths in which output context should be affected. +This flag accepts a comma-separated list of values and may be specified multiple times. The valid scopes are: + +- `macro` - apply remappings to the expansion of `std::file!()` macro. This is where paths in embedded panic messages come from +- `debuginfo` - apply remappings to debug information +- `diagnostics` - apply remappings to printed compiler diagnostics + +## Cargo + +`trim-paths` is a profile setting which controls the sanitisation of file paths in compilation outputs. It has three valid options: +- `0` or `false`: no sanitisation at all +- `1`: sanitise only the paths in emitted executable or library binaries. It always affects paths from macros such as panic messages, and in debug information + only if they will be embedded together with the binary (the default on platforms with ELF binaries, such as Linux and windows-gnu), + but will not touch them if they are in a separate symbols file (the default on Windows MSVC and macOS) +- `2` or `true`: sanitise paths in all compilation outputs, including compiled executable/library, separate symbols file (if one exists), and compiler diagnostics. + +The default release profile uses option `1`. 
You can also manually override it by specifying this option in `Cargo.toml`: ```toml [profile.dev] -trim-paths = true +trim-paths = 2 [profile.release] -trim-paths = false +trim-paths = 0 ``` -With `trim-paths` option enabled, the compilation process will not introduce any absolute paths into the build output. Instead, paths containing -certain prefixes will be replaced with something stable by the following rules: +When a path is in scope for sanitisation, it is replaced with the following rules: -1. Path to the source files of the standard and core library will begin with `/rustc/[rustc version]`. +1. Path to the source files of the standard and core library (sysroot) will begin with `/rustc/[rustc commit hash]`. E.g. `/home/username/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/core/src/result.rs` -> - `/rustc/1.52.1/library/core/src/result.rs` + `/rustc/fe72845f7bb6a77b9e671e6a4f32fe714962cec4/library/core/src/result.rs` 2. Path to the working directory will be replaced with `.`. E.g. `/home/username/crate/src/lib.rs` -> `./src/lib.rs`. 3. Path to packages outside of the working directory will be replaced with `[package name]-[version]`. E.g. `/home/username/deps/foo/src/lib.rs` -> `foo-0.1.0/src/lib.rs` -If using MSVC toolchain, path to the .pdb file containing debug information are be embedded as the file name of the .pdb file only, wihtout any path -information. - -With `trim-paths` option disabled, the embedding of path to the source files of the standard and core library will depend on if `rust-src` component is present. If it is, then the real path pointing to a copy of the source files on your file system will be embedded; if it isn't, then they will -show up as `/rustc/[rustc version]/library/...` (just like when `trim-paths` is enabled). Paths to all other source files will not be affected. 
+When a path to the source files of the standard and core library is *not* in scope for sanitisation, the emitted path will depend on if `rust-src` component +is present. If it is, then the real path pointing to a copy of the source files on your file system will be emitted; if it isn't, then they will +show up as `/rustc/[rustc commit hash]/library/...` (just like when it is selected for sanitisation). Paths to all other source files will not be affected. -Note that this will not affect any hard-coded paths in the source code. +This will not affect any hard-coded paths in the source code. # Reference-level explanation [reference-level-explanation]: #reference-level-explanation ## `trim-paths` implementation in Cargo + We only need to change the behaviour for `Test` and `Build` compile modes. -If `trim-paths` is enabled, Cargo will emit two `--remap-path-prefix` arguments to `rustc` for each compilation unit. One mapping is from the path of -the local sysroot to `/rustc/[rust version]`. The other mapping depends on if the package containing the compilation unit is under the working -directory. If it is, then the mapping is from the absolute path to the working directory to `.`. If it's outside the working directory, then the -mapping is from the absolute path of the package root to `[package name]-[package version]`. +If `trim-paths` is `0` (`false`), no extra flag is supplied to `rustc`. + +If `trip-paths` is `1` or `2` (`true`), then two `--remap-path-prefix` arguments are supplied to `rustc`: +- From the path of the local sysroot to `/rustc/[commit hash]`. +- If the compilation unit is under the working directory, from the absolute path to the working directory to `.`. + If it's outside the working directory, from the absolute path of the package root to `[package name]-[package version]`. 
+ +A further `--remap-scope` is also supplied for options `1` and `2`: + +If `trim-path` is `1`, then it depends on the setting of `split-debuginfo` (whether the setting is explicitly supplied or from the default) +- If `split-debuginfo` is `off`, then `--remap-scope=macro,debuginfo`. +- If `split-debuginfo` is `packed` or `unpacked`, then `--remap-scope=macro` +This is because we always want to remap panic messages as they will always be embedded in executable/library, but we don't need to touch the separate +symbols file + +If `trim-path` is `2` (`true`), all paths will be affected, equivalent to `--remap-scope=macro,debuginfo,diagnostics` -Some interactions with compiler-intrinstic macros need to be considered, though these are entirely down to `rustc`'s implementation of -`--remap-path-prefix`: + +Some interactions with compiler-intrinstic macros need to be considered: 1. Path (of the current file) introduced by [`file!()`](https://doc.rust-lang.org/std/macro.file.html) *will* be remapped. **Things may break** if the code interacts with its own source file at runtime by using this macro. 2. Path introduced by [`include!()`](https://doc.rust-lang.org/std/macro.include.html) *will* be remapped, given that the included file is under the current working directory or a dependency package. -If the user further supplies custom `--remap-path-prefix` arguments via `RUSTFLAGS` or similar mechanisms, they will take precedence over the one -supplied by `trim-paths`. This means that the user-defined `--remap-path-prefix`s must be supplied *after* Cargo's own remapping. +If the user further supplies custom `--remap-path-prefix` arguments via `RUSTFLAGS` +or similar mechanisms, they will take precedence over the one supplied by `trim-paths`. This means that the user-defined remapping arguments must be +supplied *after* Cargo's own remapping. + Additionally, when using MSVC linker, Cargo should emit `/PDBALTPATH:%_PDB%` to the linker via `-C link-arg`. 
This makes the linker embed only the file name of the .pdb file without the path to it. -## Changing handling of sysroot path -The virtualisation of sysroot files to `/rustc/[SHA1 hash]/library/...` was done at compiler bootstraping, specifically when +## Changing handling of sysroot path in `rustc` + +The virtualisation of sysroot files to `/rustc/[commit hash]/library/...` was done at compiler bootstraping, specifically when `remap-debuginfo = true` in `config.toml`. This is done for Rust distribution on all channels. -At `rustc` runtime (i.e. compiling some code), we try to correlate this virtual path to a real path pointing to the file on the local file system. -Currently the result is represented internally as if the path was remapped by `--remap-path-prefix`, holding both the virtual name and local path. +At `rustc` runtime (i.e. compiling some code), we try to correlate this virtual path to a real path pointing to the file on the local file system +Currently the result is represented internally as if the path was remapped by a `--remap-path-prefix`, from local `rust-src` path to the virtual path. Only the virtual name is ever emitted for metadata or codegen. We want to change this behaviour such that, when `rust-src` source files can be -discovered, the virtual path is discarded and therefore will be embedded unless being remapped by `--remap-path-prefix` in the usual way. The relevant part of the code is here: -https://github.com/rust-lang/rust/blob/d8af907491e20339e41d048d6a32b41ddfa91dfe/compiler/rustc_metadata/src/rmeta/decoder.rs#L1637-L1765 +discovered, the virtual path is discarded and therefore the local path will be embedded, unless there is a `--remap-path-prefix` that causes this +local path to be remapped in the usual way. -We would also like to change the virtualisation of sysroot to `/rustc/[rustc version]/library/...`, instead of the rustc commit hash. 
This is shorter and more helpful as an identifier, and makes `trim-paths` easier to implement: to make the embedded path the same whether or not `rust-src` is installed, we need to emit the same sysroot virutalisation as was done during bootstrapping. Getting the version number is easier than getting the commit hash. The relevant part of the code is here: https://github.com/rust-lang/rust/blob/d8af907491e20339e41d048d6a32b41ddfa91dfe/src/bootstrap/lib.rs#L831-L834 # Drawbacks [drawbacks]: #drawbacks -With `trim-paths` enabled, if the `debug` option is simultaneously not `false` (it is turned off by default under `release` profile), paths in -debuginfo will also be remapped. Debuggers will no longer be able to automatically discover and load source files outside of the working directory. -This can be remidated by [debugger features](https://lldb.llvm.org/use/map.html#miscellaneous) remapping the path back to a filesystem path. - -The user also will not be able to `Ctrl+click` on any paths provided in panic messages or backtraces outside of the working directory. But +The user will not be able to `Ctrl+click` on any paths provided in panic messages or backtraces outside of the working directory. But there shouldn't be any confusion as the combination of pacakge name and version can be used to pinpoint the file. -As mentioned above, `trim-paths` may break code that relies on `file!()` to evaluate to an accessible path to the file. Hence enabling +As mentioned above, `trim-paths` may break code that relies on `std::file!()` to evaluate to an accessible path to the file. Hence enabling it by default for release builds may be a technically breaking change. Occurances of such use should be extremely rare but should be investigated via a Crater run. In case this breakage is unacceptable, `trim-paths` can be made an opt-in option rather than default in any build profile. @@ -160,10 +200,10 @@ Path to sysroot crates are specially handled by `rustc`. 
Due to this, the behavi Although good for privacy and reproducibility, some people find it a hinderance for debugging: https://github.com/rust-lang/rust/issues/85463. Hence the user should be given control on if they want the virtual or local path. -One alternative for the sysroot handling is to keep the logic in `rustc` largely the same, always emitting the virutalised path by default, and -then introduce an extra option named `--embed-local-sysroot` to embed the local paths if the source files can be found. This inovles adding an extra -option to `rustc` and prevents any uniformity in `--remap-path-prefix`'s handling over sysroot paths, compared to other paths (it currently doesn't -affect sysroot paths at all). +An alternative to `--remap-scope` is to have individual `--remap-path-prefxi`-like flags, one each for macro, debuginfo and diagnostics, requiring +the full mapping to be given for each context. This is similar to what GCC and Clang does as described below, but we have added a third context +for diagnostics. This technically enables for even finer grained control, allowing different paths in different +contexts to be remapped differently. However it will cause the command line to be very verbose under most normal use cases. # Prior art [prior-art]: #prior-art @@ -174,13 +214,17 @@ The name `trim-paths` came from the [similar feature](https://golang.org/cmd/go/ Go does not enable this by default. Since Go does not differ between debug and release builds, removing absolute paths for all build would be a hassle for debugging. However this is not an issue for Rust as we have separate debug build profile. +GCC and Clang both have a flag equivalent to `--remap-path-prefix`, but they also both have two separate flags one for only macro expansion and +the other for only debuginfo: https://reproducible-builds.org/docs/build-path/. This is the origin of the `--remap-scope` idea. 
+ # Unresolved questions [unresolved-questions]: #unresolved-questions - Should we treat the current working directory the same as other packages? We could have one fewer remapping rule by remapping all package roots to `[package name]-[version]`. A minor downside to this is not being able to `Ctrl+click` on paths to files the user is working on from panic messages. -- Should we use a slightly more complex remapping rule, like distinguishing packages from registry, git and path, as mentioned in https://github.com/rust-lang/rust/issues/40552? +- Should we use a slightly more complex remapping rule, like distinguishing packages from registry, git and path, as mentioned in + https://github.com/rust-lang/rust/issues/40552? - Will these cover all potentially embedded paths? Have we missed anything? - Should we make this affect more `CompileMode`s, such as `Check`, where the emitted `rmeta` file will also contain absolute paths? From f92a321768e273ed14cbf3ab1f709838fae41ab6 Mon Sep 17 00:00:00 2001 From: Andy Wang Date: Sun, 3 Oct 2021 11:53:38 +0100 Subject: [PATCH 06/32] Fix typos --- text/3127-trim-paths.md | 10 +++++----- 1 file changed, 5 insertions(+), 5 deletions(-) diff --git a/text/3127-trim-paths.md b/text/3127-trim-paths.md index 76fa3304b1d..19bc276a2a1 100644 --- a/text/3127-trim-paths.md +++ b/text/3127-trim-paths.md @@ -14,14 +14,14 @@ the separate debug symbols file (depending on `split-debuginfo` settings). it will retain the paths in separate debug symbols file, if one exists, to help debuggers and profilers locate the source files. To facilitate this, a new flag named `--remap-scope` should be added to `rustc` controlling the behaviour of `--remap-path-prefix`, allowing us to fine -tune the scope of remapping, speicifying paths under which context (in marco expansion, in debuginfo or in diagnostics) +tune the scope of remapping, speicifying paths under which context (in macro expansion, in debuginfo or in diagnostics) should or shouldn't be remapped. 
# Motivation [motivation]: #motivation ## Sanitising local paths that are currently embedded -Currently, executables and libraies built by Cargo have a lot of embedded absolute paths. They most frequently appear in debug information and +Currently, executables and libraries built by Cargo have a lot of embedded absolute paths. They most frequently appear in debug information and panic messages (pointing to the panic location source file). As an example, consider the following package: `Cargo.toml`: @@ -103,7 +103,7 @@ This flag accepts a comma-separated list of values and may be specified multiple - `1`: sanitise only the paths in emitted executable or library binaries. It always affects paths from macros such as panic messages, and in debug information only if they will be embedded together with the binary (the default on platforms with ELF binaries, such as Linux and windows-gnu), but will not touch them if they are in a separate symbols file (the default on Windows MSVC and macOS) -- `2` or `ture`: sanitise paths in all compilation outputs, including compiled executable/library, separate symbols file (if one exists), and compiler diagnostics. +- `2` or `true`: sanitise paths in all compilation outputs, including compiled executable/library, separate symbols file (if one exists), and compiler diagnostics. The default release profile uses option `1`. You can also manually override it by specifying this option in `Cargo.toml`: ```toml @@ -186,7 +186,7 @@ The user will not be able to `Ctrl+click` on any paths provided in panic message there shouldn't be any confusion as the combination of pacakge name and version can be used to pinpoint the file. As mentioned above, `trim-paths` may break code that relies on `std::file!()` to evaluate to an accessible path to the file. Hence enabling -it by default for release builds may be a technically breaking change. 
Occurances of such use should be extremely rare but should be investigated +it by default for release builds may be a technically breaking change. Occurrences of such use should be extremely rare but should be investigated via a Crater run. In case this breakage is unacceptable, `trim-paths` can be made an opt-in option rather than default in any build profile. # Rationale and alternatives @@ -200,7 +200,7 @@ Path to sysroot crates are specially handled by `rustc`. Due to this, the behavi Although good for privacy and reproducibility, some people find it a hinderance for debugging: https://github.com/rust-lang/rust/issues/85463. Hence the user should be given control on if they want the virtual or local path. -An alternative to `--remap-scope` is to have individual `--remap-path-prefxi`-like flags, one each for macro, debuginfo and diagnostics, requiring +An alternative to `--remap-scope` is to have individual `--remap-path-prefix`-like flags, one each for macro, debuginfo and diagnostics, requiring the full mapping to be given for each context. This is similar to what GCC and Clang does as described below, but we have added a third context for diagnostics. This technically enables for even finer grained control, allowing different paths in different contexts to be remapped differently. However it will cause the command line to be very verbose under most normal use cases. From 2bd279272f1f88cc055c2d7124188e4123bddd0a Mon Sep 17 00:00:00 2001 From: Andy Wang Date: Sat, 4 Dec 2021 22:37:31 +0000 Subject: [PATCH 07/32] Rename flag to --remap-path-scope and typo fixes --- text/3127-trim-paths.md | 44 +++++++++++++++++++++-------------------- 1 file changed, 23 insertions(+), 21 deletions(-) diff --git a/text/3127-trim-paths.md b/text/3127-trim-paths.md index 19bc276a2a1..55eddf79fcb 100644 --- a/text/3127-trim-paths.md +++ b/text/3127-trim-paths.md @@ -13,9 +13,9 @@ the separate debug symbols file (depending on `split-debuginfo` settings). 
`cargo build` with the default `release` profile should not produce any host filesystem dependent paths into binary executable or library. But it will retain the paths in separate debug symbols file, if one exists, to help debuggers and profilers locate the source files. -To facilitate this, a new flag named `--remap-scope` should be added to `rustc` controlling the behaviour of `--remap-path-prefix`, allowing us to fine -tune the scope of remapping, speicifying paths under which context (in macro expansion, in debuginfo or in diagnostics) -should or shouldn't be remapped. +To facilitate this, a new flag named `--remap-path-scope` should be added to `rustc` controlling the behaviour of `--remap-path-prefix`, allowing us to fine +tune the scope of remapping, specifying paths under which context (in macro expansion, in debuginfo or in diagnostics) +should or shouldn't be remapped. # Motivation [motivation]: #motivation @@ -65,10 +65,11 @@ This is undesirable for the following reasons: 1. **Privacy**. `release` binaries may be distributed, and anyone could then see the builder's local OS account username. Additionally, some CI (such as [GitLab CI](https://docs.gitlab.com/runner/best_practice/#build-directory)) checks out the repo under a path where - it may include things that really aren't meant to be public. Without sanitising the path by default, this may be inadvertently leaked. + non-public information is included. Without sanitising the path by default, this may be inadvertently leaked. 2. **Build reproducibility**. We would like to make it easier to reproduce binary equivalent builds. While it is not required to maintain - reproducibility across different environments, removing environment-sensitive information from the build will increase tolerance on the inevitable - environment differences when trying to verify builds. 
+ reproducibility across different environments, removing environment-sensitive information from the build will increase the tolerance on the + inevitable environment differences. This helps with build verification, as well as producing deterministic builds when using a distributed build + system. ## Handling sysroot paths At the moment, paths to the source files of standard and core libraries, even when they are present, always begin with a virtual prefix in the form @@ -86,10 +87,10 @@ should support finer grained control over paths in which contexts should be rema ## The rustc book: Command-line arguments -### `--remap-scope`: configure the scope of path remapping +### `--remap-path-scope`: configure the scope of path remapping When the `--remap-path-prefix` option is passed to rustc then source path prefixes in all output will be affected. -The `--remap-scope` argument can be used in conjunction with `--remap-path-prefix` to determine paths in which output context should be affected. +The `--remap-path-scope` argument can be used in conjunction with `--remap-path-prefix` to determine paths in which output context should be affected. This flag accepts a comma-separated list of values and may be specified multiple times. The valid scopes are: - `macro` - apply remappings to the expansion of `std::file!()` macro. This is where paths in embedded panic messages come from @@ -119,7 +120,7 @@ When a path is in scope for sanitisation, it is replaced with the following rule 1. Path to the source files of the standard and core library (sysroot) will begin with `/rustc/[rustc commit hash]`. E.g. `/home/username/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/core/src/result.rs` -> `/rustc/fe72845f7bb6a77b9e671e6a4f32fe714962cec4/library/core/src/result.rs` -2. Path to the working directory will be replaced with `.`. E.g. `/home/username/crate/src/lib.rs` -> `./src/lib.rs`. +2. Path to the working directory will be stripped. E.g. 
`/home/username/crate/src/lib.rs` -> `src/lib.rs`. 3. Path to packages outside of the working directory will be replaced with `[package name]-[version]`. E.g. `/home/username/deps/foo/src/lib.rs` -> `foo-0.1.0/src/lib.rs` When a path to the source files of the standard and core library is *not* in scope for sanitisation, the emitted path will depend on if `rust-src` component @@ -137,23 +138,23 @@ We only need to change the behaviour for `Test` and `Build` compile modes. If `trim-paths` is `0` (`false`), no extra flag is supplied to `rustc`. -If `trip-paths` is `1` or `2` (`true`), then two `--remap-path-prefix` arguments are supplied to `rustc`: +If `trim-paths` is `1` or `2` (`true`), then two `--remap-path-prefix` arguments are supplied to `rustc`: - From the path of the local sysroot to `/rustc/[commit hash]`. -- If the compilation unit is under the working directory, from the absolute path to the working directory to `.`. +- If the compilation unit is under the working directory, from the the working directory absolute path to empty string. If it's outside the working directory, from the absolute path of the package root to `[package name]-[package version]`. -A further `--remap-scope` is also supplied for options `1` and `2`: +A further `--remap-path-scope` is also supplied for options `1` and `2`: If `trim-path` is `1`, then it depends on the setting of `split-debuginfo` (whether the setting is explicitly supplied or from the default) -- If `split-debuginfo` is `off`, then `--remap-scope=macro,debuginfo`. -- If `split-debuginfo` is `packed` or `unpacked`, then `--remap-scope=macro` +- If `split-debuginfo` is `off`, then `--remap-path-scope=macro,debuginfo`. 
+- If `split-debuginfo` is `packed` or `unpacked`, then `--remap-path-scope=macro` This is because we always want to remap panic messages as they will always be embedded in executable/library, but we don't need to touch the separate symbols file -If `trim-path` is `2` (`true`), all paths will be affected, equivalent to `--remap-scope=macro,debuginfo,diagnostics` +If `trim-path` is `2` (`true`), all paths will be affected, equivalent to `--remap-path-scope=macro,debuginfo,diagnostics` -Some interactions with compiler-intrinstic macros need to be considered: +Some interactions with compiler-intrinsic macros need to be considered: 1. Path (of the current file) introduced by [`file!()`](https://doc.rust-lang.org/std/macro.file.html) *will* be remapped. **Things may break** if the code interacts with its own source file at runtime by using this macro. 2. Path introduced by [`include!()`](https://doc.rust-lang.org/std/macro.include.html) *will* be remapped, given that the included file is under @@ -172,8 +173,9 @@ only the file name of the .pdb file without the path to it. The virtualisation of sysroot files to `/rustc/[commit hash]/library/...` was done at compiler bootstraping, specifically when `remap-debuginfo = true` in `config.toml`. This is done for Rust distribution on all channels. -At `rustc` runtime (i.e. compiling some code), we try to correlate this virtual path to a real path pointing to the file on the local file system -Currently the result is represented internally as if the path was remapped by a `--remap-path-prefix`, from local `rust-src` path to the virtual path. +At `rustc` runtime (i.e. compiling some code), we try to correlate this virtual path to a real path pointing to the file on the local file system. +Currently the result is represented internally as if the path was remapped by a `--remap-path-prefix`, from local `rust-src` path to the virtual +path. Only the virtual name is ever emitted for metadata or codegen. 
We want to change this behaviour such that, when `rust-src` source files can be discovered, the virtual path is discarded and therefore the local path will be embedded, unless there is a `--remap-path-prefix` that causes this local path to be remapped in the usual way. @@ -183,7 +185,7 @@ local path to be remapped in the usual way. [drawbacks]: #drawbacks The user will not be able to `Ctrl+click` on any paths provided in panic messages or backtraces outside of the working directory. But -there shouldn't be any confusion as the combination of pacakge name and version can be used to pinpoint the file. +there shouldn't be any confusion as the combination of package name and version can be used to pinpoint the file. As mentioned above, `trim-paths` may break code that relies on `std::file!()` to evaluate to an accessible path to the file. Hence enabling it by default for release builds may be a technically breaking change. Occurrences of such use should be extremely rare but should be investigated @@ -200,7 +202,7 @@ Path to sysroot crates are specially handled by `rustc`. Due to this, the behavi Although good for privacy and reproducibility, some people find it a hinderance for debugging: https://github.com/rust-lang/rust/issues/85463. Hence the user should be given control on if they want the virtual or local path. -An alternative to `--remap-scope` is to have individual `--remap-path-prefix`-like flags, one each for macro, debuginfo and diagnostics, requiring +An alternative to `--remap-path-scope` is to have individual `--remap-path-prefix`-like flags, one each for macro, debuginfo and diagnostics, requiring the full mapping to be given for each context. This is similar to what GCC and Clang does as described below, but we have added a third context for diagnostics. This technically enables for even finer grained control, allowing different paths in different contexts to be remapped differently. 
However it will cause the command line to be very verbose under most normal use cases. @@ -215,7 +217,7 @@ Go does not enable this by default. Since Go does not differ between debug and r a hassle for debugging. However this is not an issue for Rust as we have separate debug build profile. GCC and Clang both have a flag equivalent to `--remap-path-prefix`, but they also both have two separate flags one for only macro expansion and -the other for only debuginfo: https://reproducible-builds.org/docs/build-path/. This is the origin of the `--remap-scope` idea. +the other for only debuginfo: https://reproducible-builds.org/docs/build-path/. This is the origin of the `--remap-path-scope` idea. # Unresolved questions [unresolved-questions]: #unresolved-questions From 998ecf4134f4cd719c705ff973e2e0ef1da3be71 Mon Sep 17 00:00:00 2001 From: Andy Wang Date: Sat, 4 Dec 2021 23:23:40 +0000 Subject: [PATCH 08/32] Add scoped mapping discussion --- text/3127-trim-paths.md | 18 +++++++++++++----- 1 file changed, 13 insertions(+), 5 deletions(-) diff --git a/text/3127-trim-paths.md b/text/3127-trim-paths.md index 55eddf79fcb..12daaced161 100644 --- a/text/3127-trim-paths.md +++ b/text/3127-trim-paths.md @@ -202,10 +202,9 @@ Path to sysroot crates are specially handled by `rustc`. Due to this, the behavi Although good for privacy and reproducibility, some people find it a hinderance for debugging: https://github.com/rust-lang/rust/issues/85463. Hence the user should be given control on if they want the virtual or local path. -An alternative to `--remap-path-scope` is to have individual `--remap-path-prefix`-like flags, one each for macro, debuginfo and diagnostics, requiring -the full mapping to be given for each context. This is similar to what GCC and Clang does as described below, but we have added a third context -for diagnostics. This technically enables for even finer grained control, allowing different paths in different -contexts to be remapped differently. 
However it will cause the command line to be very verbose under most normal use cases. +An alternative is to extend the syntax accepted by `--remap-path-prefix` or add a new option called `--remap-path-prefix-scoped` which allows +scoping rules to be explicitly applied to each remapping. This can co-exist with `--remap-path-scope` so it will be discussed further in +[Future possibilities](#future-possibilities) section. # Prior art [prior-art]: #prior-art @@ -233,4 +232,13 @@ the other for only debuginfo: https://reproducible-builds.org/docs/build-path/. # Future possibilities [future-possibilities]: #future-possibilities -N/A +If it turns out that we want to enable finer grained scoping control on each individual remapping, we could use a `scopes:from=to` syntax. +E.g. `debuginfo,diagnostics:/path/to/src=src` will remove all references to `/path/to/src` from compiler diagnostics and debug information, but +they are retained panic messages. This syntax can be used with either a brand new `--remap-path-prefix-scoped` option, or we could extend the +existing `--remap-path-prefix` option to take in this new syntax. + +If we were to extend the existing `--remap-path-prefix`, there may be an ambiguity to whether `:` means a separator between scope list and mapping, +or is it a part of the path; if the first `:` supplied belongs to the path then it would have to be escaped. This coudl be technically breaking. + +In any case, future inclusion of this new syntax will not affect `--remap-path-scope` introduced in this RFC. Scopes specified in `--remap-path-scope` +will be used as default for all mappings, and explicit scopes for an individual mapping will take precedence on that mapping. 
From ba6b2d8ba747fb2cde4cdd10a0096df6da572c84 Mon Sep 17 00:00:00 2001 From: Andy Wang Date: Sun, 5 Dec 2021 01:22:29 +0000 Subject: [PATCH 09/32] Elaborate on linkers for separate debuginfo --- text/3127-trim-paths.md | 29 +++++++++++++++++++++++------ 1 file changed, 23 insertions(+), 6 deletions(-) diff --git a/text/3127-trim-paths.md b/text/3127-trim-paths.md index 12daaced161..0a69d6d8707 100644 --- a/text/3127-trim-paths.md +++ b/text/3127-trim-paths.md @@ -164,10 +164,6 @@ If the user further supplies custom `--remap-path-prefix` arguments via `RUSTFLA or similar mechanisms, they will take precedence over the one supplied by `trim-paths`. This means that the user-defined remapping arguments must be supplied *after* Cargo's own remapping. - -Additionally, when using MSVC linker, Cargo should emit `/PDBALTPATH:%_PDB%` to the linker via `-C link-arg`. This makes the linker embed -only the file name of the .pdb file without the path to it. - ## Changing handling of sysroot path in `rustc` The virtualisation of sysroot files to `/rustc/[commit hash]/library/...` was done at compiler bootstraping, specifically when @@ -180,6 +176,20 @@ Only the virtual name is ever emitted for metadata or codegen. We want to change discovered, the virtual path is discarded and therefore the local path will be embedded, unless there is a `--remap-path-prefix` that causes this local path to be remapped in the usual way. +## Linker arguments + +If a separate debuginfo file is to be generated (which can be determined by `split-debuginfo` codegen option), the linker may include an absolute +path to the object into the binary. If the user wants debug information to be remapped, then the inclusion of this absolute path is +undesirable. `rustc` cannot exhaustively control the behaviour of an external program (the linker) specified by the user, but we should +supply appropriate linker options to mitigate this as much as we could. 
+ +The linker in use can be determined by the [`linker-flavor`](https://doc.rust-lang.org/rustc/codegen-options/index.html#linker-flavor) flag, itself +generally being inferred by `rustc`. If `debuginfo` is in `--remap-path-scope` and `split-debuginfo` is not `off`, the following linker-specific +options should be emitted: + +- When using MSVC linker, `/PDBALTPATH:%_PDB%` should be emitted. This makes the linker embed only the file name of the .pdb file without the path + to it. + # Drawbacks [drawbacks]: #drawbacks @@ -221,11 +231,18 @@ the other for only debuginfo: https://reproducible-builds.org/docs/build-path/. # Unresolved questions [unresolved-questions]: #unresolved-questions +- Should we use a slightly more complex remapping rule, like distinguishing packages from registry, git and path, as proposed in + [Issue #40552](https://github.com/rust-lang/rust/issues/40552)? +- With debug information in separate files, debuggers and Rust's own backtrace rely on the path embedded in the binary to find these files to display + source code lines, columns and symbols etc. If we sanitise these paths to relative paths, then debuggers and backtrace must be invoked + in specific directories for these paths to work. [For instance](https://github.com/rust-lang/rust/issues/87825#issuecomment-920693005), `cargo run` + invoked under crate root will fail to print meaningful backtrace symbols because the binary and `.pdb` file are under `target/release`, but + the backtrace library will attempt to find the `.pdb` file from the working directory (crate root), where it doesn't exist. +- At the time of writing, `rustc` recognises 10 [`linker-flavor`s](https://doc.rust-lang.org/rustc/codegen-options/index.html#linker-flavor). + We need to find the right option for each to change the embedded path to debug information. - Should we treat the current working directory the same as other packages? 
We could have one fewer remapping rule by remapping all package roots to `[package name]-[version]`. A minor downside to this is not being able to `Ctrl+click` on paths to files the user is working on from panic messages. -- Should we use a slightly more complex remapping rule, like distinguishing packages from registry, git and path, as mentioned in - https://github.com/rust-lang/rust/issues/40552? - Will these cover all potentially embedded paths? Have we missed anything? - Should we make this affect more `CompileMode`s, such as `Check`, where the emitted `rmeta` file will also contain absolute paths? From d7909488c9a1b6744aa79f941afb5b075c3a2195 Mon Sep 17 00:00:00 2001 From: Andy Wang Date: Sun, 5 Dec 2021 23:55:32 +0000 Subject: [PATCH 10/32] Add split-debuginfo-path scope --- text/3127-trim-paths.md | 53 ++++++++++++++++++----------------------- 1 file changed, 23 insertions(+), 30 deletions(-) diff --git a/text/3127-trim-paths.md b/text/3127-trim-paths.md index 0a69d6d8707..e0ccc6d542f 100644 --- a/text/3127-trim-paths.md +++ b/text/3127-trim-paths.md @@ -7,11 +7,10 @@ [summary]: #summary Cargo should have a [profile setting](https://doc.rust-lang.org/cargo/reference/profiles.html#profile-settings) named `trim-paths` -to sanitise absolute paths introduced during compilation that may be embedded in the compiled binary executable or library, and optionally in -the separate debug symbols file (depending on `split-debuginfo` settings). +to sanitise absolute paths introduced during compilation that may be embedded in the compiled binary executable or library. `cargo build` with the default `release` profile should not produce any host filesystem dependent paths into binary executable or library. But -it will retain the paths in separate debug symbols file, if one exists, to help debuggers and profilers locate the source files. +it will retain the paths inside separate debug symbols file, if one exists, to help debuggers and profilers locate the source files. 
To facilitate this, a new flag named `--remap-path-scope` should be added to `rustc` controlling the behaviour of `--remap-path-prefix`, allowing us to fine tune the scope of remapping, specifying paths under which context (in macro expansion, in debuginfo or in diagnostics) @@ -21,7 +20,7 @@ should or shouldn't be remapped. [motivation]: #motivation ## Sanitising local paths that are currently embedded -Currently, executables and libraries built by Cargo have a lot of embedded absolute paths. They most frequently appear in debug information and +Currently, executables and libraries built by Rust and Cargo have a lot of embedded absolute paths. They most frequently appear in debug information and panic messages (pointing to the panic location source file). As an example, consider the following package: `Cargo.toml`: @@ -75,7 +74,7 @@ This is undesirable for the following reasons: At the moment, paths to the source files of standard and core libraries, even when they are present, always begin with a virtual prefix in the form of `/rustc/[SHA1 hash]/library`. This is not an issue when the source files are not present (i.e. when `rust-src` component is not installed), but when a user installs `rust-src` they may want the path to their local copy of source files to be visible. Hence the default behaviour when `rust-src` -is installed should be to embed the local path. These local paths should be then affected by path remappings in the usual way. +is installed should be to use the local path. These local paths should be then affected by path remappings in the usual way. ## Preserving debuginfo to help debuggers At the moment, `--remap-path-prefix` will cause paths to source files in debuginfo to be remapped. On platforms where the debuginfo resides in a @@ -94,7 +93,8 @@ The `--remap-path-scope` argument can be used in conjunction with `--remap-path- This flag accepts a comma-separated list of values and may be specified multiple times. 
The valid scopes are: - `macro` - apply remappings to the expansion of `std::file!()` macro. This is where paths in embedded panic messages come from -- `debuginfo` - apply remappings to debug information +- `debuginfo` - apply remappings to debug information, wherever they may be written to +- `split-debuginfo-path` - when `split-debuginfo=packed` or `unpacked`, apply remappings to the paths pointing to these split debug information files - `diagnostics` - apply remappings to printed compiler diagnostics ## Cargo @@ -103,8 +103,8 @@ This flag accepts a comma-separated list of values and may be specified multiple - `0` or `false`: no sanitisation at all - `1`: sanitise only the paths in emitted executable or library binaries. It always affects paths from macros such as panic messages, and in debug information only if they will be embedded together with the binary (the default on platforms with ELF binaries, such as Linux and windows-gnu), - but will not touch them if they are in a separate symbols file (the default on Windows MSVC and macOS) -- `2` or `true`: sanitise paths in all compilation outputs, including compiled executable/library, separate symbols file (if one exists), and compiler diagnostics. + but will not touch them if they are in separate files (the default on Windows MSVC and macOS). But the path to these separate files are sanitised. +- `2` or `true`: sanitise paths in all compilation outputs, including compiled executable/library, debug information, and compiler diagnostics. The default release profile uses option `1`. You can also manually override it by specifying this option in `Cargo.toml`: ```toml @@ -115,7 +115,7 @@ trim-paths = 2 trim-paths = 0 ``` -When a path is in scope for sanitisation, it is replaced with the following rules: +When a path is in scope for sanitisation, it is handled by the following rules: 1. Path to the source files of the standard and core library (sysroot) will begin with `/rustc/[rustc commit hash]`. E.g. 
`/home/username/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/core/src/result.rs` -> @@ -147,11 +147,11 @@ A further `--remap-path-scope` is also supplied for options `1` and `2`: If `trim-path` is `1`, then it depends on the setting of `split-debuginfo` (whether the setting is explicitly supplied or from the default) - If `split-debuginfo` is `off`, then `--remap-path-scope=macro,debuginfo`. -- If `split-debuginfo` is `packed` or `unpacked`, then `--remap-path-scope=macro` -This is because we always want to remap panic messages as they will always be embedded in executable/library, but we don't need to touch the separate -symbols file +- If `split-debuginfo` is `packed` or `unpacked`, then `--remap-path-scope=macro,split-debuginfo-path` +This is because we always want to remap panic messages as they will always be embedded in executable/library. We need to sanitise debug information +if they are embedded, but don't need to touch them if they are split. However in case they are split we need to sanitise the paths to these split files -If `trim-path` is `2` (`true`), all paths will be affected, equivalent to `--remap-path-scope=macro,debuginfo,diagnostics` +If `trim-path` is `2` (`true`), all paths will be affected, equivalent to `--remap-path-scope=macro,debuginfo,diagnostics,split-debuginfo-path` Some interactions with compiler-intrinsic macros need to be considered: @@ -176,20 +176,15 @@ Only the virtual name is ever emitted for metadata or codegen. We want to change discovered, the virtual path is discarded and therefore the local path will be embedded, unless there is a `--remap-path-prefix` that causes this local path to be remapped in the usual way. -## Linker arguments +## Split Debuginfo -If a separate debuginfo file is to be generated (which can be determined by `split-debuginfo` codegen option), the linker may include an absolute -path to the object into the binary. 
If the user wants debug information to be remapped, then the inclusion of this absolute path is -undesirable. `rustc` cannot exhaustively control the behaviour of an external program (the linker) specified by the user, but we should -supply appropriate linker options to mitigate this as much as we could. - -The linker in use can be determined by the [`linker-flavor`](https://doc.rust-lang.org/rustc/codegen-options/index.html#linker-flavor) flag, itself -generally being inferred by `rustc`. If `debuginfo` is in `--remap-path-scope` and `split-debuginfo` is not `off`, the following linker-specific -options should be emitted: - -- When using MSVC linker, `/PDBALTPATH:%_PDB%` should be emitted. This makes the linker embed only the file name of the .pdb file without the path - to it. +When debug information are not embedded in the binary (i.e. `split-debuginfo` is not `off`), absolute paths to various files containing debug +information are embedded into the binary instead. Such as the absolute path to `.pdb` file (MSVC, `packed`), `.dwo` files (ELF, `unpacked`), +and `.o` files (ELF, `packed`). This can be undesirable. As such, `split-debuginfo-path` is made specifically for these embedded paths. +On macOS and ELF platforms, these paths are introduced by `rustc` during codegen. With MSVC, however, the path to `.pdb` fil is generated and +embedded into the binary by the linker `link.exe`. The linker has a `/PDBALTPATH` option allows us to change the embedded path written to the +binary, which could be supplied by `rustc` # Drawbacks [drawbacks]: #drawbacks @@ -235,11 +230,9 @@ the other for only debuginfo: https://reproducible-builds.org/docs/build-path/. [Issue #40552](https://github.com/rust-lang/rust/issues/40552)? - With debug information in separate files, debuggers and Rust's own backtrace rely on the path embedded in the binary to find these files to display source code lines, columns and symbols etc. 
If we sanitise these paths to relative paths, then debuggers and backtrace must be invoked - in specific directories for these paths to work. [For instance](https://github.com/rust-lang/rust/issues/87825#issuecomment-920693005), `cargo run` - invoked under crate root will fail to print meaningful backtrace symbols because the binary and `.pdb` file are under `target/release`, but - the backtrace library will attempt to find the `.pdb` file from the working directory (crate root), where it doesn't exist. -- At the time of writing, `rustc` recognises 10 [`linker-flavor`s](https://doc.rust-lang.org/rustc/codegen-options/index.html#linker-flavor). - We need to find the right option for each to change the embedded path to debug information. + in specific directories for these paths to work. [For instance](https://github.com/rust-lang/rust/issues/87825#issuecomment-920693005), if the + absolute path to the `.pdb` file is sanitised to the relative `target/release/foo.pdb`, then the binary must be invoked under the crate root as + `target/release/foo` to allow the correct backtrace to be displayed. - Should we treat the current working directory the same as other packages? We could have one fewer remapping rule by remapping all package roots to `[package name]-[version]`. A minor downside to this is not being able to `Ctrl+click` on paths to files the user is working on from panic messages. 
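The scope selection described in this patch (10/32) can be sketched as a small lookup. This is only an illustration of the rules as written at this point in the series, assuming the numeric `trim-paths` levels `0`, `1` and `2` and the `split-debuginfo-path` scope name used here. The enum and function names (`TrimPaths`, `SplitDebuginfo`, `remap_path_scope`) are hypothetical and are not taken from Cargo or `rustc`.

```rust
/// Hypothetical model of the `trim-paths` levels in this patch:
/// `0`/`false`, `1`, and `2`/`true`.
#[derive(Clone, Copy, Debug, PartialEq)]
enum TrimPaths {
    Off,    // `0` or `false`
    Binary, // `1`
    All,    // `2` or `true`
}

/// Hypothetical model of the `split-debuginfo` codegen setting.
#[derive(Clone, Copy, Debug, PartialEq)]
enum SplitDebuginfo {
    Off,
    Packed,
    Unpacked,
}

/// Returns the value Cargo would pass to `--remap-path-scope` under the
/// rules stated in this patch, or `None` when no extra flag is supplied.
fn remap_path_scope(trim: TrimPaths, split: SplitDebuginfo) -> Option<&'static str> {
    match (trim, split) {
        // No sanitisation: no extra flag is supplied to rustc.
        (TrimPaths::Off, _) => None,
        // Debuginfo is embedded in the binary, so it must be remapped too.
        (TrimPaths::Binary, SplitDebuginfo::Off) => Some("macro,debuginfo"),
        // Debuginfo is split out: leave its contents alone, but remap the
        // paths that point to the split files.
        (TrimPaths::Binary, SplitDebuginfo::Packed | SplitDebuginfo::Unpacked) => {
            Some("macro,split-debuginfo-path")
        }
        // Everything is in scope.
        (TrimPaths::All, _) => Some("macro,debuginfo,diagnostics,split-debuginfo-path"),
    }
}

fn main() {
    for trim in [TrimPaths::Off, TrimPaths::Binary, TrimPaths::All] {
        for split in [SplitDebuginfo::Off, SplitDebuginfo::Packed, SplitDebuginfo::Unpacked] {
            println!("{:?} + {:?} => {:?}", trim, split, remap_path_scope(trim, split));
        }
    }
}
```

Later patches in this series rename the scope and replace the numeric levels with named ones, so the table above should be read as a snapshot of this revision only.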
From d33e02957a49f01cd82c7e3ecb95aa7fdae95f66 Mon Sep 17 00:00:00 2001 From: Andy Wang Date: Sun, 24 Apr 2022 11:41:49 +0100 Subject: [PATCH 11/32] Rename split-debuginfo-path to split-debuginfo-file --- text/3127-trim-paths.md | 9 +++++---- 1 file changed, 5 insertions(+), 4 deletions(-) diff --git a/text/3127-trim-paths.md b/text/3127-trim-paths.md index e0ccc6d542f..17aaab4b775 100644 --- a/text/3127-trim-paths.md +++ b/text/3127-trim-paths.md @@ -94,7 +94,7 @@ This flag accepts a comma-separated list of values and may be specified multiple - `macro` - apply remappings to the expansion of `std::file!()` macro. This is where paths in embedded panic messages come from - `debuginfo` - apply remappings to debug information, wherever they may be written to -- `split-debuginfo-path` - when `split-debuginfo=packed` or `unpacked`, apply remappings to the paths pointing to these split debug information files +- `split-debuginfo-file` - when `split-debuginfo=packed` or `unpacked`, apply remappings to the paths pointing to these split debug information files - `diagnostics` - apply remappings to printed compiler diagnostics ## Cargo @@ -147,11 +147,12 @@ A further `--remap-path-scope` is also supplied for options `1` and `2`: If `trim-path` is `1`, then it depends on the setting of `split-debuginfo` (whether the setting is explicitly supplied or from the default) - If `split-debuginfo` is `off`, then `--remap-path-scope=macro,debuginfo`. -- If `split-debuginfo` is `packed` or `unpacked`, then `--remap-path-scope=macro,split-debuginfo-path` +- If `split-debuginfo` is `packed` or `unpacked`, then `--remap-path-scope=macro,split-debuginfo-file` + This is because we always want to remap panic messages as they will always be embedded in executable/library. We need to sanitise debug information if they are embedded, but don't need to touch them if they are split. 
However in case they are split we need to sanitise the paths to these split files -If `trim-path` is `2` (`true`), all paths will be affected, equivalent to `--remap-path-scope=macro,debuginfo,diagnostics,split-debuginfo-path` +If `trim-path` is `2` (`true`), all paths will be affected, equivalent to `--remap-path-scope=macro,debuginfo,diagnostics,split-debuginfo-file` Some interactions with compiler-intrinsic macros need to be considered: @@ -180,7 +181,7 @@ local path to be remapped in the usual way. When debug information are not embedded in the binary (i.e. `split-debuginfo` is not `off`), absolute paths to various files containing debug information are embedded into the binary instead. Such as the absolute path to `.pdb` file (MSVC, `packed`), `.dwo` files (ELF, `unpacked`), -and `.o` files (ELF, `packed`). This can be undesirable. As such, `split-debuginfo-path` is made specifically for these embedded paths. +and `.o` files (ELF, `packed`). This can be undesirable. As such, `split-debuginfo-file` is made specifically for these embedded paths. On macOS and ELF platforms, these paths are introduced by `rustc` during codegen. With MSVC, however, the path to `.pdb` fil is generated and embedded into the binary by the linker `link.exe`. 
The linker has a `/PDBALTPATH` option allows us to change the embedded path written to the From 068858023c4d778c2f6e7a51ea8182b27f5db80d Mon Sep 17 00:00:00 2001 From: Andy Wang Date: Mon, 25 Apr 2022 20:15:39 +0100 Subject: [PATCH 12/32] Typo fixes Co-authored-by: Josh Triplett --- text/3127-trim-paths.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/text/3127-trim-paths.md b/text/3127-trim-paths.md index 17aaab4b775..76e1e34d589 100644 --- a/text/3127-trim-paths.md +++ b/text/3127-trim-paths.md @@ -145,7 +145,7 @@ If `trim-paths` is `1` or `2` (`true`), then two `--remap-path-prefix` arguments A further `--remap-path-scope` is also supplied for options `1` and `2`: -If `trim-path` is `1`, then it depends on the setting of `split-debuginfo` (whether the setting is explicitly supplied or from the default) +If `trim-paths` is `1`, then it depends on the setting of `split-debuginfo` (whether the setting is explicitly supplied or from the default) - If `split-debuginfo` is `off`, then `--remap-path-scope=macro,debuginfo`. - If `split-debuginfo` is `packed` or `unpacked`, then `--remap-path-scope=macro,split-debuginfo-file` @@ -249,7 +249,7 @@ they are retained panic messages. This syntax can be used with either a brand ne existing `--remap-path-prefix` option to take in this new syntax. If we were to extend the existing `--remap-path-prefix`, there may be an ambiguity to whether `:` means a separator between scope list and mapping, -or is it a part of the path; if the first `:` supplied belongs to the path then it would have to be escaped. This coudl be technically breaking. +or is it a part of the path; if the first `:` supplied belongs to the path then it would have to be escaped. This could be technically breaking. In any case, future inclusion of this new syntax will not affect `--remap-path-scope` introduced in this RFC. 
Scopes specified in `--remap-path-scope` will be used as default for all mappings, and explicit scopes for an individual mapping will take precedence on that mapping. From aee42a6f62a6685750faf38b406c145a4d154bad Mon Sep 17 00:00:00 2001 From: Andy Wang Date: Mon, 25 Apr 2022 21:20:54 +0100 Subject: [PATCH 13/32] Add `unsplit-debuginfo` scope --- text/3127-trim-paths.md | 14 ++++++-------- 1 file changed, 6 insertions(+), 8 deletions(-) diff --git a/text/3127-trim-paths.md b/text/3127-trim-paths.md index 76e1e34d589..a67f1dac6ef 100644 --- a/text/3127-trim-paths.md +++ b/text/3127-trim-paths.md @@ -94,7 +94,8 @@ This flag accepts a comma-separated list of values and may be specified multiple - `macro` - apply remappings to the expansion of `std::file!()` macro. This is where paths in embedded panic messages come from - `debuginfo` - apply remappings to debug information, wherever they may be written to -- `split-debuginfo-file` - when `split-debuginfo=packed` or `unpacked`, apply remappings to the paths pointing to these split debug information files +- `unsplit-debuginfo` - apply to remappings to debug information only when they are written to compiled executables or libraries, but not when they are in split files +- `split-debuginfo-file` - apply remappings to the paths pointing to split debug information files when `split-debuginfo=packed` or `unpacked`. Does nothing when debuginfo is embedded with the compiled executables or libraries - `diagnostics` - apply remappings to printed compiler diagnostics ## Cargo @@ -145,12 +146,9 @@ If `trim-paths` is `1` or `2` (`true`), then two `--remap-path-prefix` arguments A further `--remap-path-scope` is also supplied for options `1` and `2`: -If `trim-paths` is `1`, then it depends on the setting of `split-debuginfo` (whether the setting is explicitly supplied or from the default) -- If `split-debuginfo` is `off`, then `--remap-path-scope=macro,debuginfo`. 
-- If `split-debuginfo` is `packed` or `unpacked`, then `--remap-path-scope=macro,split-debuginfo-file` +If `trim-path` is `1`, then `--remap-path-scope=macro,unsplit-debuginfo,split-debuginfo-file`. -This is because we always want to remap panic messages as they will always be embedded in executable/library. We need to sanitise debug information -if they are embedded, but don't need to touch them if they are split. However in case they are split we need to sanitise the paths to these split files +As a result, panic messages (which are always embedded) are sanitised. If debug information is embedded, then they are sanitised; if they are split then they are kept untouched, but the paths to these split files are sanitised. If `trim-path` is `2` (`true`), all paths will be affected, equivalent to `--remap-path-scope=macro,debuginfo,diagnostics,split-debuginfo-file` @@ -167,7 +165,7 @@ supplied *after* Cargo's own remapping. ## Changing handling of sysroot path in `rustc` -The virtualisation of sysroot files to `/rustc/[commit hash]/library/...` was done at compiler bootstraping, specifically when +The virtualisation of sysroot files to `/rustc/[commit hash]/library/...` was done at compiler bootstrapping, specifically when `remap-debuginfo = true` in `config.toml`. This is done for Rust distribution on all channels. At `rustc` runtime (i.e. compiling some code), we try to correlate this virtual path to a real path pointing to the file on the local file system. @@ -205,7 +203,7 @@ release builds. It has, over the past 4 years, gained a decent amount of popular implement. Path to sysroot crates are specially handled by `rustc`. Due to this, the behaviour we currently have is that all such paths are virtualised. -Although good for privacy and reproducibility, some people find it a hinderance for debugging: https://github.com/rust-lang/rust/issues/85463. 
+Although good for privacy and reproducibility, some people find it a hindrance for debugging: https://github.com/rust-lang/rust/issues/85463. Hence the user should be given control on if they want the virtual or local path. An alternative is to extend the syntax accepted by `--remap-path-prefix` or add a new option called `--remap-path-prefix-scoped` which allows From 8e33a46a11a76be27e4725ddb63e9800865a9dd9 Mon Sep 17 00:00:00 2001 From: Andy Wang Date: Mon, 25 Apr 2022 23:00:38 +0100 Subject: [PATCH 14/32] Use names instead of numbers for trim-paths options --- text/3127-trim-paths.md | 22 +++++++++++----------- 1 file changed, 11 insertions(+), 11 deletions(-) diff --git a/text/3127-trim-paths.md b/text/3127-trim-paths.md index a67f1dac6ef..15b5366a23b 100644 --- a/text/3127-trim-paths.md +++ b/text/3127-trim-paths.md @@ -101,19 +101,19 @@ This flag accepts a comma-separated list of values and may be specified multiple ## Cargo `trim-paths` is a profile setting which controls the sanitisation of file paths in compilation outputs. It has three valid options: -- `0` or `false`: no sanitisation at all -- `1`: sanitise only the paths in emitted executable or library binaries. It always affects paths from macros such as panic messages, and in debug information +- `none` or `false`: no sanitisation at all +- `object`: sanitise only the paths in emitted executable or library binaries. It always affects paths from macros such as panic messages, and in debug information only if they will be embedded together with the binary (the default on platforms with ELF binaries, such as Linux and windows-gnu), but will not touch them if they are in separate files (the default on Windows MSVC and macOS). But the path to these separate files are sanitised. -- `2` or `true`: sanitise paths in all compilation outputs, including compiled executable/library, debug information, and compiler diagnostics. 
+- `all` or `true`: sanitise paths in all compilation outputs, including compiled executable/library, debug information, and compiler diagnostics. -The default release profile uses option `1`. You can also manually override it by specifying this option in `Cargo.toml`: +The default release profile uses option `object`. You can also manually override it by specifying this option in `Cargo.toml`: ```toml [profile.dev] -trim-paths = 2 +trim-paths = all [profile.release] -trim-paths = 0 +trim-paths = none ``` When a path is in scope for sanitisation, it is handled by the following rules: @@ -137,20 +137,20 @@ This will not affect any hard-coded paths in the source code. We only need to change the behaviour for `Test` and `Build` compile modes. -If `trim-paths` is `0` (`false`), no extra flag is supplied to `rustc`. +If `trim-paths` is `none` (`false`), no extra flag is supplied to `rustc`. -If `trim-paths` is `1` or `2` (`true`), then two `--remap-path-prefix` arguments are supplied to `rustc`: +If `trim-paths` is `object` or `all` (`true`), then two `--remap-path-prefix` arguments are supplied to `rustc`: - From the path of the local sysroot to `/rustc/[commit hash]`. - If the compilation unit is under the working directory, from the the working directory absolute path to empty string. If it's outside the working directory, from the absolute path of the package root to `[package name]-[package version]`. -A further `--remap-path-scope` is also supplied for options `1` and `2`: +A further `--remap-path-scope` is also supplied for options `object` and `all`: -If `trim-path` is `1`, then `--remap-path-scope=macro,unsplit-debuginfo,split-debuginfo-file`. +If `trim-path` is `object`, then `--remap-path-scope=macro,unsplit-debuginfo,split-debuginfo-file`. As a result, panic messages (which are always embedded) are sanitised. 
If debug information is embedded, then they are sanitised; if they are split then they are kept untouched, but the paths to these split files are sanitised. -If `trim-path` is `2` (`true`), all paths will be affected, equivalent to `--remap-path-scope=macro,debuginfo,diagnostics,split-debuginfo-file` +If `trim-path` is `all` (`true`), all paths will be affected, equivalent to `--remap-path-scope=macro,debuginfo,diagnostics,split-debuginfo-file` Some interactions with compiler-intrinsic macros need to be considered: From 0b59e5ca63bb817ac83a58399a9441b415a59648 Mon Sep 17 00:00:00 2001 From: Andy Wang Date: Tue, 26 Apr 2022 00:05:03 +0100 Subject: [PATCH 15/32] Replace debuginfo with split-debuginfo option --- text/3127-trim-paths.md | 17 ++++++++++------- 1 file changed, 10 insertions(+), 7 deletions(-) diff --git a/text/3127-trim-paths.md b/text/3127-trim-paths.md index 15b5366a23b..d5a44f993c8 100644 --- a/text/3127-trim-paths.md +++ b/text/3127-trim-paths.md @@ -88,15 +88,17 @@ should support finer grained control over paths in which contexts should be rema ### `--remap-path-scope`: configure the scope of path remapping -When the `--remap-path-prefix` option is passed to rustc then source path prefixes in all output will be affected. +When the `--remap-path-prefix` option is passed to rustc, source path prefixes in all output will be affected by default. The `--remap-path-scope` argument can be used in conjunction with `--remap-path-prefix` to determine paths in which output context should be affected. This flag accepts a comma-separated list of values and may be specified multiple times. The valid scopes are: - `macro` - apply remappings to the expansion of `std::file!()` macro. 
This is where paths in embedded panic messages come from -- `debuginfo` - apply remappings to debug information, wherever they may be written to -- `unsplit-debuginfo` - apply to remappings to debug information only when they are written to compiled executables or libraries, but not when they are in split files -- `split-debuginfo-file` - apply remappings to the paths pointing to split debug information files when `split-debuginfo=packed` or `unpacked`. Does nothing when debuginfo is embedded with the compiled executables or libraries - `diagnostics` - apply remappings to printed compiler diagnostics +- `unsplit-debuginfo` - apply to remappings to debug information only when they are written to compiled executables or libraries, but not when they are in split files +- `split-debuginfo` - apply remappings to debug information only when they are written to split debug information files, but not in compiled executables or libraries +- `split-debuginfo-file` - apply remappings to the paths pointing to split debug information files. Does nothing when these files are not generated. + +Debug information are written to split files when the separate codegen option `-C split-debuginfo=packed` or `unpacked` (whether by default or explicitly set). ## Cargo @@ -150,7 +152,7 @@ If `trim-path` is `object`, then `--remap-path-scope=macro,unsplit-debuginfo,spl As a result, panic messages (which are always embedded) are sanitised. If debug information is embedded, then they are sanitised; if they are split then they are kept untouched, but the paths to these split files are sanitised. -If `trim-path` is `all` (`true`), all paths will be affected, equivalent to `--remap-path-scope=macro,debuginfo,diagnostics,split-debuginfo-file` +If `trim-path` is `all` (`true`), all paths will be affected, equivalent to `--remap-path-scope=macro,split-debuginfo,unsplit-debuginfo,diagnostics,split-debuginfo-file` (or not supplying `--remap-path-scope` at all). 
Some interactions with compiler-intrinsic macros need to be considered: @@ -241,9 +243,10 @@ the other for only debuginfo: https://reproducible-builds.org/docs/build-path/. # Future possibilities [future-possibilities]: #future-possibilities +## Per-mapping scope control If it turns out that we want to enable finer grained scoping control on each individual remapping, we could use a `scopes:from=to` syntax. -E.g. `debuginfo,diagnostics:/path/to/src=src` will remove all references to `/path/to/src` from compiler diagnostics and debug information, but -they are retained panic messages. This syntax can be used with either a brand new `--remap-path-prefix-scoped` option, or we could extend the +E.g. `split-debuginfo,unsplit-debuginfo,diagnostics:/path/to/src=src` will remove all references to `/path/to/src` from compiler diagnostics and debug information, but +they are retained in panic messages. This syntax can be used with either a brand new `--remap-path-prefix-scoped` option, or we could extend the existing `--remap-path-prefix` option to take in this new syntax. If we were to extend the existing `--remap-path-prefix`, there may be an ambiguity to whether `:` means a separator between scope list and mapping, From 604dcb0d7d76d86ff267e1cf5fa5ac705a78fbb4 Mon Sep 17 00:00:00 2001 From: Andy Wang Date: Tue, 26 Apr 2022 00:09:50 +0100 Subject: [PATCH 16/32] Add scope alias as a future possibility --- text/3127-trim-paths.md | 5 +++++ 1 file changed, 5 insertions(+) diff --git a/text/3127-trim-paths.md b/text/3127-trim-paths.md index d5a44f993c8..8013c567858 100644 --- a/text/3127-trim-paths.md +++ b/text/3127-trim-paths.md @@ -254,3 +254,8 @@ or is it a part of the path; if the first `:` supplied belongs to the path then In any case, future inclusion of this new syntax will not affect `--remap-path-scope` introduced in this RFC. 
Scopes specified in `--remap-path-scope` will be used as default for all mappings, and explicit scopes for an individual mapping will take precedence on that mapping. + +## Alias for scope options +`--remap-path-scope` can be made to accept additional options that act as aliases for one or more of the existing options. For instance, `--remap-path-scope=debuginfo` can be made equivalent to `--remap-path-scope=split-debuginfo,unsplit-debuginfo`. + +Additionally, `none`, `object` and `all` can be made aliases of what Cargo's `trim-paths` option is supposed to provide, such that Cargo's `trim-paths` option can be directly used as the value of `--remap-path-scope`. This allows the user to write `object,split-debuginfo` in `trim-paths` to remap paths in binaries/executables and split debuginfo files, but not in diagnostics. From 2d49c093b273db1dbd36306ee290c38ef67035b7 Mon Sep 17 00:00:00 2001 From: Andy Wang Date: Tue, 26 Apr 2022 21:17:14 +0100 Subject: [PATCH 17/32] Document the ambiguity of comma separated scopes --- text/3127-trim-paths.md | 7 ++++++- 1 file changed, 6 insertions(+), 1 deletion(-) diff --git a/text/3127-trim-paths.md b/text/3127-trim-paths.md index 8013c567858..4d1db321d61 100644 --- a/text/3127-trim-paths.md +++ b/text/3127-trim-paths.md @@ -246,7 +246,12 @@ the other for only debuginfo: https://reproducible-builds.org/docs/build-path/. ## Per-mapping scope control If it turns out that we want to enable finer grained scoping control on each individual remapping, we could use a `scopes:from=to` syntax. E.g. `split-debuginfo,unsplit-debuginfo,diagnostics:/path/to/src=src` will remove all references to `/path/to/src` from compiler diagnostics and debug information, but -they are retained in panic messages. This syntax can be used with either a brand new `--remap-path-prefix-scoped` option, or we could extend the +they are retained in panic messages. + +How exactly this new syntax will look like is, of course, up to further discussion. 
Using comma as a separator for scopes may look ambiguous as `macro,diagnostics:/path/from=to` could be interpreted as `macro` +and `diagnostics:/path/from=to`. + +This syntax can be used with either a brand new `--remap-path-prefix-scoped` option, or we could extend the existing `--remap-path-prefix` option to take in this new syntax. If we were to extend the existing `--remap-path-prefix`, there may be an ambiguity to whether `:` means a separator between scope list and mapping, From 23394fc69233842187d7b8f391b93f0e74b10793 Mon Sep 17 00:00:00 2001 From: Andy Wang Date: Tue, 26 Apr 2022 23:38:42 +0100 Subject: [PATCH 18/32] Add object and all as alias scopes --- text/3127-trim-paths.md | 35 ++++++++++++----------------------- 1 file changed, 12 insertions(+), 23 deletions(-) diff --git a/text/3127-trim-paths.md b/text/3127-trim-paths.md index 4d1db321d61..256fee9b136 100644 --- a/text/3127-trim-paths.md +++ b/text/3127-trim-paths.md @@ -97,19 +97,16 @@ This flag accepts a comma-separated list of values and may be specified multiple - `unsplit-debuginfo` - apply to remappings to debug information only when they are written to compiled executables or libraries, but not when they are in split files - `split-debuginfo` - apply remappings to debug information only when they are written to split debug information files, but not in compiled executables or libraries - `split-debuginfo-file` - apply remappings to the paths pointing to split debug information files. Does nothing when these files are not generated. +- `object` - an alias for `macro,unsplit-debuginfo,split-debuginfo-file`. This ensures all paths in compiled executables or libraries are remapped, but not elsewhere. +- `all` and `true` - an alias for all of the above, also equivalent to supplying `--remap-path-prefix` without this option. Debug information are written to split files when the separate codegen option `-C split-debuginfo=packed` or `unpacked` (whether by default or explicitly set). 
## Cargo -`trim-paths` is a profile setting which controls the sanitisation of file paths in compilation outputs. It has three valid options: -- `none` or `false`: no sanitisation at all -- `object`: sanitise only the paths in emitted executable or library binaries. It always affects paths from macros such as panic messages, and in debug information - only if they will be embedded together with the binary (the default on platforms with ELF binaries, such as Linux and windows-gnu), - but will not touch them if they are in separate files (the default on Windows MSVC and macOS). But the path to these separate files are sanitised. -- `all` or `true`: sanitise paths in all compilation outputs, including compiled executable/library, debug information, and compiler diagnostics. +`trim-paths` is a profile setting which enables and controls the sanitisation of file paths in compilation outputs. It corresponds to the `--remap-path-scope` flag of rustc and accepts all valid scope, or combination of scopes that `--remap-path-scope` accepts, in addition to the `none` or `false` option which disables path sanitisation completely. -The default release profile uses option `object`. You can also manually override it by specifying this option in `Cargo.toml`: +It is defaulted to `none` for debug profiles, and `object` for release profiles. You can manually override it by specifying this option in `Cargo.toml`: ```toml [profile.dev] trim-paths = all @@ -118,7 +115,11 @@ trim-paths = all trim-paths = none ``` -When a path is in scope for sanitisation, it is handled by the following rules: +The default release profile setting (`object`) sanitises only the paths in emitted executable or library files. 
It always affects paths from macros such as panic messages, and in debug information + only if they will be embedded together with the binary (the default on platforms with ELF binaries, such as Linux and windows-gnu), + but will not touch them if they are in separate files (the default on Windows MSVC and macOS). But the path to these separate files are sanitised. + +The following paths are sanitised, if they appear in a covered scope: 1. Path to the source files of the standard and core library (sysroot) will begin with `/rustc/[rustc commit hash]`. E.g. `/home/username/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/core/src/result.rs` -> @@ -130,7 +131,7 @@ When a path to the source files of the standard and core library is *not* in sco is present. If it is, then the real path pointing to a copy of the source files on your file system will be emitted; if it isn't, then they will show up as `/rustc/[rustc commit hash]/library/...` (just like when it is selected for sanitisation). Paths to all other source files will not be affected. -This will not affect any hard-coded paths in the source code. +This will not affect any hard-coded paths in the source code, such as in strings. # Reference-level explanation [reference-level-explanation]: #reference-level-explanation @@ -141,19 +142,12 @@ We only need to change the behaviour for `Test` and `Build` compile modes. If `trim-paths` is `none` (`false`), no extra flag is supplied to `rustc`. -If `trim-paths` is `object` or `all` (`true`), then two `--remap-path-prefix` arguments are supplied to `rustc`: +If `trim-paths` is anything else, then its value is supplied directly to `rustc`'s `--remap-path-scope` option, along with two `--remap-path-prefix` arguments: - From the path of the local sysroot to `/rustc/[commit hash]`. - If the compilation unit is under the working directory, from the the working directory absolute path to empty string. 
If it's outside the working directory, from the absolute path of the package root to `[package name]-[package version]`. -A further `--remap-path-scope` is also supplied for options `object` and `all`: - -If `trim-path` is `object`, then `--remap-path-scope=macro,unsplit-debuginfo,split-debuginfo-file`. - -As a result, panic messages (which are always embedded) are sanitised. If debug information is embedded, then they are sanitised; if they are split then they are kept untouched, but the paths to these split files are sanitised. - -If `trim-path` is `all` (`true`), all paths will be affected, equivalent to `--remap-path-scope=macro,split-debuginfo,unsplit-debuginfo,diagnostics,split-debuginfo-file` (or not supplying `--remap-path-scope` at all). - +The default value of `trim-paths` is `object` for release profile. As a result, panic messages (which are always embedded) are sanitised. If debug information is embedded, then they are sanitised; if they are split then they are kept untouched, but the paths to these split files are sanitised. Some interactions with compiler-intrinsic macros need to be considered: 1. Path (of the current file) introduced by [`file!()`](https://doc.rust-lang.org/std/macro.file.html) *will* be remapped. **Things may break** if @@ -259,8 +253,3 @@ or is it a part of the path; if the first `:` supplied belongs to the path then In any case, future inclusion of this new syntax will not affect `--remap-path-scope` introduced in this RFC. Scopes specified in `--remap-path-scope` will be used as default for all mappings, and explicit scopes for an individual mapping will take precedence on that mapping. - -## Alias for scope options -`--remap-path-scope` can be made to accept additional options that act as aliases for one or more of the existing options. For instance, `--remap-path-scope=debuginfo` can be made equivalent to `--remap-path-scope=split-debuginfo,unsplit-debuginfo`. 
- -Additionally, `none`, `object` and `all` can be made aliases of what Cargo's `trim-paths` option is supposed to provide, such that Cargo's `trim-paths` option can be directly used as the value of `--remap-path-scope`. This allows the user to write `object,split-debuginfo` in `trim-paths` to remap paths in binaries/executables and split debuginfo files, but not in diagnostics. From e15308cbee4026c2510414fd193ad7853cc8a27c Mon Sep 17 00:00:00 2001 From: Andy Wang Date: Tue, 26 Apr 2022 23:58:39 +0100 Subject: [PATCH 19/32] Clarify the effects of multiple --remap-path-scope --- text/3127-trim-paths.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/text/3127-trim-paths.md b/text/3127-trim-paths.md index 256fee9b136..d2a7a9cbc35 100644 --- a/text/3127-trim-paths.md +++ b/text/3127-trim-paths.md @@ -90,7 +90,7 @@ should support finer grained control over paths in which contexts should be rema When the `--remap-path-prefix` option is passed to rustc, source path prefixes in all output will be affected by default. The `--remap-path-scope` argument can be used in conjunction with `--remap-path-prefix` to determine paths in which output context should be affected. -This flag accepts a comma-separated list of values and may be specified multiple times. The valid scopes are: +This flag accepts a comma-separated list of values and may be specified multiple times, in which case the scopes are aggregated together. The valid scopes are: - `macro` - apply remappings to the expansion of `std::file!()` macro. 
This is where paths in embedded panic messages come from - `diagnostics` - apply remappings to printed compiler diagnostics From dec1901f60bbd577055f9a09ed98f4222826a9a8 Mon Sep 17 00:00:00 2001 From: Andy Wang Date: Fri, 29 Apr 2022 00:20:12 +0100 Subject: [PATCH 20/32] Improve wordings --- text/3127-trim-paths.md | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/text/3127-trim-paths.md b/text/3127-trim-paths.md index d2a7a9cbc35..11ce6d26e41 100644 --- a/text/3127-trim-paths.md +++ b/text/3127-trim-paths.md @@ -98,7 +98,7 @@ This flag accepts a comma-separated list of values and may be specified multiple - `split-debuginfo` - apply remappings to debug information only when they are written to split debug information files, but not in compiled executables or libraries - `split-debuginfo-file` - apply remappings to the paths pointing to split debug information files. Does nothing when these files are not generated. - `object` - an alias for `macro,unsplit-debuginfo,split-debuginfo-file`. This ensures all paths in compiled executables or libraries are remapped, but not elsewhere. -- `all` and `true` - an alias for all of the above, also equivalent to supplying `--remap-path-prefix` without this option. +- `all` and `true` - an alias for all of the above, also equivalent to supplying only `--remap-path-prefix` without `--remap-path-scope`. Debug information are written to split files when the separate codegen option `-C split-debuginfo=packed` or `unpacked` (whether by default or explicitly set). @@ -119,7 +119,7 @@ The default release profile setting (`object`) sanitises only the paths in emitt only if they will be embedded together with the binary (the default on platforms with ELF binaries, such as Linux and windows-gnu), but will not touch them if they are in separate files (the default on Windows MSVC and macOS). But the path to these separate files are sanitised. 
-The following paths are sanitised, if they appear in a covered scope: +If `trim-paths` is not `none` or `false`, then the following paths are sanitised if they appear in a selected scope: 1. Path to the source files of the standard and core library (sysroot) will begin with `/rustc/[rustc commit hash]`. E.g. `/home/username/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/core/src/result.rs` -> @@ -128,7 +128,7 @@ The following paths are sanitised, if they appear in a covered scope: 3. Path to packages outside of the working directory will be replaced with `[package name]-[version]`. E.g. `/home/username/deps/foo/src/lib.rs` -> `foo-0.1.0/src/lib.rs` When a path to the source files of the standard and core library is *not* in scope for sanitisation, the emitted path will depend on if `rust-src` component -is present. If it is, then the real path pointing to a copy of the source files on your file system will be emitted; if it isn't, then they will +is present. If it is, then the real path pointing to the copy of the source files on your file system will be emitted; if it isn't, then they will show up as `/rustc/[rustc commit hash]/library/...` (just like when it is selected for sanitisation). Paths to all other source files will not be affected. This will not affect any hard-coded paths in the source code, such as in strings. From bb079f4b65e0e1456e8f2396d09940755772f8d1 Mon Sep 17 00:00:00 2001 From: Andy Wang Date: Fri, 29 Apr 2022 00:42:55 +0100 Subject: [PATCH 21/32] List options specifically for cargo --- text/3127-trim-paths.md | 13 +++++++++++-- 1 file changed, 11 insertions(+), 2 deletions(-) diff --git a/text/3127-trim-paths.md b/text/3127-trim-paths.md index 11ce6d26e41..fb6eea49cac 100644 --- a/text/3127-trim-paths.md +++ b/text/3127-trim-paths.md @@ -94,7 +94,7 @@ This flag accepts a comma-separated list of values and may be specified multiple - `macro` - apply remappings to the expansion of `std::file!()` macro. 
This is where paths in embedded panic messages come from - `diagnostics` - apply remappings to printed compiler diagnostics -- `unsplit-debuginfo` - apply to remappings to debug information only when they are written to compiled executables or libraries, but not when they are in split files +- `unsplit-debuginfo` - apply remappings to debug information only when they are written to compiled executables or libraries, but not when they are in split debuginfo files - `split-debuginfo` - apply remappings to debug information only when they are written to split debug information files, but not in compiled executables or libraries - `split-debuginfo-file` - apply remappings to the paths pointing to split debug information files. Does nothing when these files are not generated. - `object` - an alias for `macro,unsplit-debuginfo,split-debuginfo-file`. This ensures all paths in compiled executables or libraries are remapped, but not elsewhere. @@ -104,7 +104,16 @@ Debug information are written to split files when the separate codegen option `- ## Cargo -`trim-paths` is a profile setting which enables and controls the sanitisation of file paths in compilation outputs. It corresponds to the `--remap-path-scope` flag of rustc and accepts all valid scope, or combination of scopes that `--remap-path-scope` accepts, in addition to the `none` or `false` option which disables path sanitisation completely. +`trim-paths` is a profile setting which enables and controls the sanitisation of file paths in compilation outputs. It corresponds to the `--remap-path-scope` flag of rustc and accepts all valid scope, or combination of scopes that `--remap-path-scope` accepts, in addition to the `none` or `false` option which disables path sanitisation completely. Possible values are: + +- `none` and `false` - disable path sanitisation +- `macro` - sanitise paths in the expansion of `std::file!()` macro. 
This is where paths in embedded panic messages come from +- `diagnostics` - sanitise paths in printed compiler diagnostics +- `unsplit-debuginfo` - sanitise paths in debug information in compiled executables or libraries. Does nothing if debug information are in split files +- `split-debuginfo` - sanitise paths in debug information in split debuginfo files. Does nothing if debug information are in compiled executables or libraries +- `split-debuginfo-file` - sanitise paths pointing to split debug information files. Does nothing if these files are not generated. +- `object` - an alias for `macro,unsplit-debuginfo,split-debuginfo-file`. This ensures all paths in compiled executables or libraries are sanitised, but not elsewhere. +- `all` and `true` - an alias for all of the above It is defaulted to `none` for debug profiles, and `object` for release profiles. You can manually override it by specifying this option in `Cargo.toml`: ```toml From 7c533d3602eb4023b99db38642a399d168c9e84d Mon Sep 17 00:00:00 2001 From: Andy Wang Date: Thu, 26 May 2022 19:25:11 +0100 Subject: [PATCH 22/32] Change it back to `split-debuginfo-path` --- text/3127-trim-paths.md | 12 ++++++------ 1 file changed, 6 insertions(+), 6 deletions(-) diff --git a/text/3127-trim-paths.md b/text/3127-trim-paths.md index fb6eea49cac..6cb0b727235 100644 --- a/text/3127-trim-paths.md +++ b/text/3127-trim-paths.md @@ -96,8 +96,8 @@ This flag accepts a comma-separated list of values and may be specified multiple - `diagnostics` - apply remappings to printed compiler diagnostics - `unsplit-debuginfo` - apply remappings to debug information only when they are written to compiled executables or libraries, but not when they are in split debuginfo files - `split-debuginfo` - apply remappings to debug information only when they are written to split debug information files, but not in compiled executables or libraries -- `split-debuginfo-file` - apply remappings to the paths pointing to split debug information 
files. Does nothing when these files are not generated. -- `object` - an alias for `macro,unsplit-debuginfo,split-debuginfo-file`. This ensures all paths in compiled executables or libraries are remapped, but not elsewhere. +- `split-debuginfo-path` - apply remappings to the paths pointing to split debug information files. Does nothing when these files are not generated. +- `object` - an alias for `macro,unsplit-debuginfo,split-debuginfo-path`. This ensures all paths in compiled executables or libraries are remapped, but not elsewhere. - `all` and `true` - an alias for all of the above, also equivalent to supplying only `--remap-path-prefix` without `--remap-path-scope`. Debug information are written to split files when the separate codegen option `-C split-debuginfo=packed` or `unpacked` (whether by default or explicitly set). @@ -111,8 +111,8 @@ Debug information are written to split files when the separate codegen option `- - `diagnostics` - sanitise paths in printed compiler diagnostics - `unsplit-debuginfo` - sanitise paths in debug information in compiled executables or libraries. Does nothing if debug information are in split files - `split-debuginfo` - sanitise paths in debug information in split debuginfo files. Does nothing if debug information are in compiled executables or libraries -- `split-debuginfo-file` - sanitise paths pointing to split debug information files. Does nothing if these files are not generated. -- `object` - an alias for `macro,unsplit-debuginfo,split-debuginfo-file`. This ensures all paths in compiled executables or libraries are sanitised, but not elsewhere. +- `split-debuginfo-path` - sanitise paths pointing to split debug information files. Does nothing if these files are not generated. +- `object` - an alias for `macro,unsplit-debuginfo,split-debuginfo-path`. This ensures all paths in compiled executables or libraries are sanitised, but not elsewhere. 
- `all` and `true` - an alias for all of the above It is defaulted to `none` for debug profiles, and `object` for release profiles. You can manually override it by specifying this option in `Cargo.toml`: @@ -184,9 +184,9 @@ local path to be remapped in the usual way. When debug information are not embedded in the binary (i.e. `split-debuginfo` is not `off`), absolute paths to various files containing debug information are embedded into the binary instead. Such as the absolute path to `.pdb` file (MSVC, `packed`), `.dwo` files (ELF, `unpacked`), -and `.o` files (ELF, `packed`). This can be undesirable. As such, `split-debuginfo-file` is made specifically for these embedded paths. +and `.o` files (ELF, `packed`). This can be undesirable. As such, `split-debuginfo-path` is made specifically for these embedded paths. -On macOS and ELF platforms, these paths are introduced by `rustc` during codegen. With MSVC, however, the path to `.pdb` fil is generated and +On macOS and ELF platforms, these paths are introduced by `rustc` during codegen. With MSVC, however, the path to `.pdb` file is generated and embedded into the binary by the linker `link.exe`. The linker has a `/PDBALTPATH` option allows us to change the embedded path written to the binary, which could be supplied by `rustc` From 6e45014b20aac5c29ae34a41ffce8c768814fd60 Mon Sep 17 00:00:00 2001 From: Andy Wang Date: Thu, 26 May 2022 19:32:25 +0100 Subject: [PATCH 23/32] Make the Cargo option effective to all compile modes --- text/3127-trim-paths.md | 3 --- 1 file changed, 3 deletions(-) diff --git a/text/3127-trim-paths.md b/text/3127-trim-paths.md index 6cb0b727235..7e93f016dd9 100644 --- a/text/3127-trim-paths.md +++ b/text/3127-trim-paths.md @@ -147,8 +147,6 @@ This will not affect any hard-coded paths in the source code, such as in strings ## `trim-paths` implementation in Cargo -We only need to change the behaviour for `Test` and `Build` compile modes. 
- If `trim-paths` is `none` (`false`), no extra flag is supplied to `rustc`. If `trim-paths` is anything else, then its value is supplied directly to `rustc`'s `--remap-path-scope` option, along with two `--remap-path-prefix` arguments: @@ -241,7 +239,6 @@ the other for only debuginfo: https://reproducible-builds.org/docs/build-path/. package roots to `[package name]-[version]`. A minor downside to this is not being able to `Ctrl+click` on paths to files the user is working on from panic messages. - Will these cover all potentially embedded paths? Have we missed anything? -- Should we make this affect more `CompileMode`s, such as `Check`, where the emitted `rmeta` file will also contain absolute paths? # Future possibilities [future-possibilities]: #future-possibilities From 34d43863cb9fdcb54a6adb4d552c7ca3174b812a Mon Sep 17 00:00:00 2001 From: Andy Wang Date: Sat, 9 Jul 2022 23:08:20 +0100 Subject: [PATCH 24/32] Clarify the sysroot situation --- text/3127-trim-paths.md | 28 +++++++++++++++++++++------- 1 file changed, 21 insertions(+), 7 deletions(-) diff --git a/text/3127-trim-paths.md b/text/3127-trim-paths.md index 7e93f016dd9..aeaeae04a3b 100644 --- a/text/3127-trim-paths.md +++ b/text/3127-trim-paths.md @@ -73,8 +73,8 @@ This is undesirable for the following reasons: ## Handling sysroot paths At the moment, paths to the source files of standard and core libraries, even when they are present, always begin with a virtual prefix in the form of `/rustc/[SHA1 hash]/library`. This is not an issue when the source files are not present (i.e. when `rust-src` component is not installed), but -when a user installs `rust-src` they may want the path to their local copy of source files to be visible. Hence the default behaviour when `rust-src` -is installed should be to use the local path. These local paths should be then affected by path remappings in the usual way. +when a user installs `rust-src` they may want the path to their local copy of source files to be visible. 
Sometimes this is simply impossible as the path originated from the pre-compiled std and core and outside of rustc's control, but the local path should be used where possible. +Hence the default behaviour when `rust-src` is installed should be to use the local path. These local paths should be then affected by path remappings in the usual way. ## Preserving debuginfo to help debuggers At the moment, `--remap-path-prefix` will cause paths to source files in debuginfo to be remapped. On platforms where the debuginfo resides in a @@ -168,13 +168,12 @@ supplied *after* Cargo's own remapping. ## Changing handling of sysroot path in `rustc` -The virtualisation of sysroot files to `/rustc/[commit hash]/library/...` was done at compiler bootstrapping, specifically when -`remap-debuginfo = true` in `config.toml`. This is done for Rust distribution on all channels. +The remapping of sysroot paths to `/rustc/[commit hash]/library/...` was done when std and core libraries are compiled by Rust's release CI. Unless [`build-std`](https://doc.rust-lang.org/cargo/reference/unstable.html#build-std) is specified, these pre-compiled artifacts are used. -At `rustc` runtime (i.e. compiling some code), we try to correlate this virtual path to a real path pointing to the file on the local file system. +Most of the time, these paths are never handled by `rustc`, since they are in the debuginfo of pre-compiled binaries to be directly copied by the linker. However, sometimes (such as when compiling monomorphised functions), `rustc` does pick up these metadata. When this happens, `rustc` tries to correlate this virtual path to a real path pointing to the file on the local file system. Currently the result is represented internally as if the path was remapped by a `--remap-path-prefix`, from local `rust-src` path to the virtual -path. -Only the virtual name is ever emitted for metadata or codegen. 
We want to change this behaviour such that, when `rust-src` source files can be +path `/rustc/[commit hash]/library/...`. +Only the virtual path is ever emitted for metadata or codegen. We want to change this behaviour such that, when `rust-src` source files can be discovered, the virtual path is discarded and therefore the local path will be embedded, unless there is a `--remap-path-prefix` that causes this local path to be remapped in the usual way. @@ -259,3 +258,18 @@ or is it a part of the path; if the first `:` supplied belongs to the path then In any case, future inclusion of this new syntax will not affect `--remap-path-scope` introduced in this RFC. Scopes specified in `--remap-path-scope` will be used as default for all mappings, and explicit scopes for an individual mapping will take precedence on that mapping. + +## Sysroot paths uniformity +Since some virtualised sysroot paths are hardcoded in the pre-compiled debuginfo, while the others can be resolved back to a local path with `rust-src`, the user may see them interleaved +``` + 0: rust_begin_unwind + at /rustc/881c1ac408d93bb7adaa3a51dabab9266e82eee8/library/std/src/panicking.rs:493:5 + 1: core::panicking::panic_fmt + at /rustc/881c1ac408d93bb7adaa3a51dabab9266e82eee8/library/core/src/panicking.rs:92:14 + 2: core::result::unwrap_failed + at /rustc/881c1ac408d93bb7adaa3a51dabab9266e82eee8/library/core/src/result.rs:1355:5 + 3: core::result::Result::unwrap + at /home/jonas/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/core/src/result.rs:1037:23 +``` + +This is not very nice. It is infeasible to fix up the pre-compiled debuginfo before linking to fully remove the virtual paths, so demapping needs to happen when it is displayed (in this case, when the backtrace is printed). This is out of scope of this RFC but it may be something we want to do separately in the future. 
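The display-time demapping mentioned above could be sketched roughly as follows. This is only an illustration under stated assumptions: `demap_sysroot_path` is a hypothetical helper (it is not part of `rustc` or `std`), and the caller is assumed to already know both the virtual prefix `/rustc/[commit hash]` and the local `rust-src` directory. How a backtrace printer would discover either of them is not specified by this RFC.

```rust
use std::path::{Path, PathBuf};

/// Hypothetical helper, for illustration only: rewrite a virtualised sysroot
/// path (`/rustc/[commit hash]/library/...`) to the matching file under a
/// local `rust-src` checkout.
///
/// Returns `None` when `path` does not live under `virtual_prefix`, so the
/// caller can fall back to printing the path unchanged.
fn demap_sysroot_path(
    path: &Path,
    virtual_prefix: &Path,
    local_rust_src: &Path,
) -> Option<PathBuf> {
    // `strip_prefix` compares whole path components, so a prefix of
    // `/rustc/abc` does not accidentally match `/rustc/abcdef/...`.
    let rest = path.strip_prefix(virtual_prefix).ok()?;
    Some(local_rust_src.join(rest))
}
```

A printer built on this would call the helper once per frame and print the original path whenever it returns `None`, which leaves frames that already carry a local path untouched.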
From 785c229df17ce939236b912e5bbeaa28b4a86f2a Mon Sep 17 00:00:00 2001 From: Andy Wang Date: Tue, 12 Jul 2022 19:37:21 +0100 Subject: [PATCH 25/32] Simplify possible scopes for Cargo --- text/3127-trim-paths.md | 13 +++++-------- 1 file changed, 5 insertions(+), 8 deletions(-) diff --git a/text/3127-trim-paths.md b/text/3127-trim-paths.md index aeaeae04a3b..69b5e0f4613 100644 --- a/text/3127-trim-paths.md +++ b/text/3127-trim-paths.md @@ -104,24 +104,21 @@ Debug information are written to split files when the separate codegen option `- ## Cargo -`trim-paths` is a profile setting which enables and controls the sanitisation of file paths in compilation outputs. It corresponds to the `--remap-path-scope` flag of rustc and accepts all valid scope, or combination of scopes that `--remap-path-scope` accepts, in addition to the `none` or `false` option which disables path sanitisation completely. Possible values are: +`trim-paths` is a profile setting which enables and controls the sanitisation of file paths in build outputs. It is a simplified version of rustc's `--remap-path-scope`. It takes a comma separated list of the following values: - `none` and `false` - disable path sanitisation - `macro` - sanitise paths in the expansion of `std::file!()` macro. This is where paths in embedded panic messages come from - `diagnostics` - sanitise paths in printed compiler diagnostics -- `unsplit-debuginfo` - sanitise paths in debug information in compiled executables or libraries. Does nothing if debug information are in split files -- `split-debuginfo` - sanitise paths in debug information in split debuginfo files. Does nothing if debug information are in compiled executables or libraries -- `split-debuginfo-path` - sanitise paths pointing to split debug information files. Does nothing if these files are not generated. -- `object` - an alias for `macro,unsplit-debuginfo,split-debuginfo-path`. 
This ensures all paths in compiled executables or libraries are sanitised, but not elsewhere. -- `all` and `true` - an alias for all of the above +- `object` - sanitise paths in compiled executables or libraries +- `all` and `true` - sanitise paths in all possible locations It is defaulted to `none` for debug profiles, and `object` for release profiles. You can manually override it by specifying this option in `Cargo.toml`: ```toml [profile.dev] -trim-paths = all +trim-paths = "all" [profile.release] -trim-paths = none +trim-paths = "none" ``` The default release profile setting (`object`) sanitises only the paths in emitted executable or library files. It always affects paths from macros such as panic messages, and in debug information From a357827e31b0eadcff437681b283be66ea3e3e2f Mon Sep 17 00:00:00 2001 From: Andy Wang Date: Sun, 17 Jul 2022 19:55:23 +0100 Subject: [PATCH 26/32] Add CARGO_TRIM_PATHS --- text/3127-trim-paths.md | 9 ++++++++- 1 file changed, 8 insertions(+), 1 deletion(-) diff --git a/text/3127-trim-paths.md b/text/3127-trim-paths.md index 69b5e0f4613..2a8af3f46c8 100644 --- a/text/3127-trim-paths.md +++ b/text/3127-trim-paths.md @@ -133,8 +133,12 @@ If `trim-paths` is not `none` or `false`, then the following paths are sanitised 2. Path to the working directory will be stripped. E.g. `/home/username/crate/src/lib.rs` -> `src/lib.rs`. 3. Path to packages outside of the working directory will be replaced with `[package name]-[version]`. E.g. `/home/username/deps/foo/src/lib.rs` -> `foo-0.1.0/src/lib.rs` +Paths requiring sanitisation can be retrieved by build scripts at their execution time from the environment variable `CARGO_TRIM_PATHS`, in comma-separated format. +If a build script does anything that may result in these, or any other absolute paths, to be included in compilation outputs, such as by invoking a C compiler, then the build script should make sure they are trimmed. 
+Cargo's mapping scheme (what Cargo will map these paths to) is not provided in `CARGO_TRIM_PATHS`, and build scripts are free to decide as long as they are reproducible and privacy preserving. + When a path to the source files of the standard and core library is *not* in scope for sanitisation, the emitted path will depend on if `rust-src` component -is present. If it is, then the real path pointing to the copy of the source files on your file system will be emitted; if it isn't, then they will +is present. If it is, then some paths will point to the copy of the source files on your file system; if it isn't, then they will show up as `/rustc/[rustc commit hash]/library/...` (just like when it is selected for sanitisation). Paths to all other source files will not be affected. This will not affect any hard-coded paths in the source code, such as in strings. @@ -151,6 +155,9 @@ If `trim-paths` is anything else, then its value is supplied directly to `rustc` - If the compilation unit is under the working directory, from the the working directory absolute path to empty string. If it's outside the working directory, from the absolute path of the package root to `[package name]-[package version]`. +If a package in the dependency tree has build scripts, then the absolute path to the package root is supplied by +the environment variable `CARGO_TRIM_PATHS` when executing build scripts. + The default value of `trim-paths` is `object` for release profile. As a result, panic messages (which are always embedded) are sanitised. If debug information is embedded, then they are sanitised; if they are split then they are kept untouched, but the paths to these split files are sanitised. 
Some interactions with compiler-intrinsic macros need to be considered: From 528600804c1847aaea24b3d05e28b238bdcb30d0 Mon Sep 17 00:00:00 2001 From: Andy Wang Date: Sun, 17 Jul 2022 22:16:11 +0100 Subject: [PATCH 27/32] Add example usages --- text/3127-trim-paths.md | 68 +++++++++++++++++++++++++++++++++++++++++ 1 file changed, 68 insertions(+) diff --git a/text/3127-trim-paths.md b/text/3127-trim-paths.md index 2a8af3f46c8..ea595337ad1 100644 --- a/text/3127-trim-paths.md +++ b/text/3127-trim-paths.md @@ -191,6 +191,74 @@ On macOS and ELF platforms, these paths are introduced by `rustc` during codegen embedded into the binary by the linker `link.exe`. The linker has a `/PDBALTPATH` option allows us to change the embedded path written to the binary, which could be supplied by `rustc` +# Usage examples + +## Alice wants to ship her binaries, but doesn't want others to see her username + +It works out of the box! + +```console +Alice$ cargo build --release +``` + +## Bob wants to profile his program and see the original function names in the report + +He needs the debug information emitted and preserved, so he changes his `Cargo.toml` file + +```toml +[profile.release] +trim-paths = "none" +debug = 1 +``` + +```console +Bob$ cargo build --release && perf record cargo run --release +``` + +## Eve wants to symbolicate her users' crash reports from binaries without debug information + +She needs to use the `split-debuginfo` feature to produce a separate file containing debug information + +```toml +[profile.release] +split-debuginfo = "packed" +debug = 1 +``` + +Again, the default works fine. + +```console +Eve$ cargo build --release +``` + +She can ship her binary like Alice, without worrying about leaking usernames. 
+ +## Hana needs to compile a C program in their build script + +They can consult `CARGO_TRIM_PATHS` in their build script to find out which paths need to be sanitised + +```rust +// in build.rs + +let mut gcc = std::process::Command::new("gcc"); + +if let Ok(paths) = std::env::var("CARGO_TRIM_PATHS") { + for to_trim in paths.split(','){ + gcc.arg(format!("-ffile-prefix-map={to_trim}=redacted")); + } +} + +gcc.args(["-std=c11", "-O2", "-c", "-o", "lib.o", "lib.c"]); + +let output = gcc.output(); + +//... do stuff +``` + +```console +Hana$ cargo build --release +``` + # Drawbacks [drawbacks]: #drawbacks From 8ba3510bc182d811918dabba9285795e267488f0 Mon Sep 17 00:00:00 2001 From: Andy Wang Date: Fri, 14 Oct 2022 16:27:34 +0200 Subject: [PATCH 28/32] Clarify that not all options are intended to be stabilized Co-authored-by: Josh Triplett --- text/3127-trim-paths.md | 4 ++++ 1 file changed, 4 insertions(+) diff --git a/text/3127-trim-paths.md b/text/3127-trim-paths.md index ea595337ad1..0da9250127b 100644 --- a/text/3127-trim-paths.md +++ b/text/3127-trim-paths.md @@ -102,6 +102,8 @@ This flag accepts a comma-separated list of values and may be specified multiple Debug information are written to split files when the separate codegen option `-C split-debuginfo=packed` or `unpacked` (whether by default or explicitly set). +Note: this RFC is not a commitment to stabilizing all of these options; stabilization will evaluate each option and see if that option carries enough value to stabilize. + ## Cargo `trim-paths` is a profile setting which enables and controls the sanitisation of file paths in build outputs. It is a simplified version of rustc's `--remap-path-scope`. 
It takes a comma separated list of the following values: @@ -112,6 +114,8 @@ Debug information are written to split files when the separate codegen option `- - `object` - sanitise paths in compiled executables or libraries - `all` and `true` - sanitise paths in all possible locations +Note: this RFC is not a commitment to stabilizing all of these options; stabilization will evaluate each option and see if that option carries enough value to stabilize. + It is defaulted to `none` for debug profiles, and `object` for release profiles. You can manually override it by specifying this option in `Cargo.toml`: ```toml [profile.dev] From 3f59b7b26b4a68e6bf31c3f3b212bed6017d1561 Mon Sep 17 00:00:00 2001 From: Andy Wang Date: Wed, 8 Feb 2023 21:14:06 +0100 Subject: [PATCH 29/32] Add option rationales --- text/3127-trim-paths.md | 16 ++++++++++++++++ 1 file changed, 16 insertions(+) diff --git a/text/3127-trim-paths.md b/text/3127-trim-paths.md index 0da9250127b..ebef0ea08be 100644 --- a/text/3127-trim-paths.md +++ b/text/3127-trim-paths.md @@ -288,6 +288,22 @@ An alternative is to extend the syntax accepted by `--remap-path-prefix` or add scoping rules to be explicitly applied to each remapping. This can co-exist with `--remap-path-scope` so it will be discussed further in [Future possibilities](#future-possibilities) section. +## Rationale for the `--remap-path-scope` options +There are quite a few options available for `--remap-path-scope`. Not all of them are expected to have meaningful use-cases in their own right. +Some are only added for completeness, that is, the behaviour of `--remap-path-prefix=all` (or the original `--remap-path-prefix` on its own) is +the same as specifying all individual scopes. In the future, we expect some of the scopes to be removed as independent options, while preserving +the behaviour of `--remap-path-prefix=all` and the stable `--remap-path-prefix`, which is "Remap source names in all output". 
+ +- `macro` is primarily meant for panic messages embedded in binaries. +- `diagnostics` is unlikely to be used on its own as it only affects console outputs, but is required for completeness. See [#87745](https://github.com/rust-lang/rust/issues/87745). +- `unsplit-debuginfo` is used to sanitise debuginfo embedded in binaries. +- `split-debuginfo` is used to sanitise debuginfo separate from binaries. This is may be used when debuginfo files are separate and the author +still wants to distribute them. +- `split-debuginfo-path` is used to sanitise the path embedded in binaries pointing to separate debuginfo files. This is likely needed in all +contexts where `unsplit-debuginfo` is used, but it's technically a separate piece of information inserted by the linker, not rustc. +- `object` is a shorthand for the most common use-case: sanitise everything in binaries, but nowhere else. +- `all` and `true` preserves the documented behaviour of `--remap-path-prefix`. + # Prior art [prior-art]: #prior-art From 16edfb2cfc5197a6d32f5af4b6045aa1b58b6666 Mon Sep 17 00:00:00 2001 From: Andy Wang Date: Wed, 8 Feb 2023 22:49:13 +0100 Subject: [PATCH 30/32] Change CARGO_TRIM_PATHS to the profile option --- text/3127-trim-paths.md | 34 ++++++++++++++++++++-------------- 1 file changed, 20 insertions(+), 14 deletions(-) diff --git a/text/3127-trim-paths.md b/text/3127-trim-paths.md index ebef0ea08be..57172d4989d 100644 --- a/text/3127-trim-paths.md +++ b/text/3127-trim-paths.md @@ -137,16 +137,17 @@ If `trim-paths` is not `none` or `false`, then the following paths are sanitised 2. Path to the working directory will be stripped. E.g. `/home/username/crate/src/lib.rs` -> `src/lib.rs`. 3. Path to packages outside of the working directory will be replaced with `[package name]-[version]`. E.g. 
`/home/username/deps/foo/src/lib.rs` -> `foo-0.1.0/src/lib.rs` -Paths requiring sanitisation can be retrieved by build scripts at their execution time from the environment variable `CARGO_TRIM_PATHS`, in comma-separated format. -If a build script does anything that may result in these, or any other absolute paths, to be included in compilation outputs, such as by invoking a C compiler, then the build script should make sure they are trimmed. -Cargo's mapping scheme (what Cargo will map these paths to) is not provided in `CARGO_TRIM_PATHS`, and build scripts are free to decide as long as they are reproducible and privacy preserving. - When a path to the source files of the standard and core library is *not* in scope for sanitisation, the emitted path will depend on if `rust-src` component is present. If it is, then some paths will point to the copy of the source files on your file system; if it isn't, then they will show up as `/rustc/[rustc commit hash]/library/...` (just like when it is selected for sanitisation). Paths to all other source files will not be affected. This will not affect any hard-coded paths in the source code, such as in strings. +### Environment variables Cargo sets for build scripts +* `CARGO_TRIM_PATHS` - The value of `trim-paths` profile option. If the build script introduces absolute paths to built artefacts (such as +by invoking a compiler), the user may request them to be sanitised in different types of artefacts. Common paths requiring sanitisation +include `OUT_DIR` and `CARGO_MANIFEST_DIR`, plus any other introduced by the build script, such as include directories. + # Reference-level explanation [reference-level-explanation]: #reference-level-explanation @@ -159,9 +160,6 @@ If `trim-paths` is anything else, then its value is supplied directly to `rustc` - If the compilation unit is under the working directory, from the the working directory absolute path to empty string. 
If it's outside the working directory, from the absolute path of the package root to `[package name]-[package version]`. -If a package in the dependency tree has build scripts, then the absolute path to the package root is supplied by -the environment variable `CARGO_TRIM_PATHS` when executing build scripts. - The default value of `trim-paths` is `object` for release profile. As a result, panic messages (which are always embedded) are sanitised. If debug information is embedded, then they are sanitised; if they are split then they are kept untouched, but the paths to these split files are sanitised. Some interactions with compiler-intrinsic macros need to be considered: @@ -239,20 +237,28 @@ She can ship her binary like Alice, without worrying about leaking usernames. ## Hana needs to compile a C program in their build script -They can consult `CARGO_TRIM_PATHS` in their build script to find out which paths need to be sanitised +They can consult `CARGO_TRIM_PATHS` in their build script to find out paths in what places the user wants sanitised ```rust // in build.rs +use std::env; +use std::process::Command; let mut gcc = Command::new("gcc"); - -if let Ok(paths) = std::env::var("CARGO_TRIM_PATHS") { - for to_trim in paths.split(','){ - gcc.arg(format!("-ffile-prefix-map={to_trim}=redacted")); - } +let out_dir = env::var("OUT_DIR").unwrap(); +let scope = env::var("CARGO_TRIM_PATHS").unwrap(); + +if scope != "none" && scope != "false" { + // Runtime working directory of the build script + let cwd = env::var("CARGO_MANIFEST_DIR").unwrap(); + let gcc_scope = match scope.as_str() { + "macro" => "-fmacro-prefix-map", + _ => "-ffile-prefix-map", + }; + gcc.args([&format!("{gcc_scope}={cwd}=redacted"), &format!("{gcc_scope}={out_dir}=redacted")]); } -gcc.args(["-std=c11", "-O2", "-o=lib.o", "lib.c"]); +gcc.args(["-std=c11", &format!("-o={out_dir}/lib.o"), "lib.c"]); let output = gcc.output(); From cbcb1dfc4ef4ff6d2a7df698436ad4d435f0aabf Mon Sep 17 00:00:00 2001 From: Andy 
Wang Date: Mon, 20 Feb 2023 23:11:04 +0100 Subject: [PATCH 31/32] Current working directory -> current package --- text/3127-trim-paths.md | 18 +++++++++--------- 1 file changed, 9 insertions(+), 9 deletions(-) diff --git a/text/3127-trim-paths.md b/text/3127-trim-paths.md index 57172d4989d..0e16094090f 100644 --- a/text/3127-trim-paths.md +++ b/text/3127-trim-paths.md @@ -127,15 +127,15 @@ trim-paths = "none" The default release profile setting (`object`) sanitises only the paths in emitted executable or library files. It always affects paths from macros such as panic messages, and in debug information only if they will be embedded together with the binary (the default on platforms with ELF binaries, such as Linux and windows-gnu), - but will not touch them if they are in separate files (the default on Windows MSVC and macOS). But the path to these separate files are sanitised. + but will not touch them if they are in separate files (the default on Windows MSVC and macOS). But the paths to these separate files are sanitised. If `trim-paths` is not `none` or `false`, then the following paths are sanitised if they appear in a selected scope: 1. Path to the source files of the standard and core library (sysroot) will begin with `/rustc/[rustc commit hash]`. E.g. `/home/username/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/core/src/result.rs` -> `/rustc/fe72845f7bb6a77b9e671e6a4f32fe714962cec4/library/core/src/result.rs` -2. Path to the working directory will be stripped. E.g. `/home/username/crate/src/lib.rs` -> `src/lib.rs`. -3. Path to packages outside of the working directory will be replaced with `[package name]-[version]`. E.g. `/home/username/deps/foo/src/lib.rs` -> `foo-0.1.0/src/lib.rs` +2. Path to the current package will be stripped. E.g. `/home/username/crate/src/lib.rs` -> `src/lib.rs`. +3. Path to dependency packages will be replaced with `[package name]-[version]`. E.g. 
`/home/username/deps/foo/src/lib.rs` -> `foo-0.1.0/src/lib.rs` When a path to the source files of the standard and core library is *not* in scope for sanitisation, the emitted path will depend on if `rust-src` component is present. If it is, then some paths will point to the copy of the source files on your file system; if it isn't, then they will @@ -157,8 +157,8 @@ If `trim-paths` is `none` (`false`), no extra flag is supplied to `rustc`. If `trim-paths` is anything else, then its value is supplied directly to `rustc`'s `--remap-path-scope` option, along with two `--remap-path-prefix` arguments: - From the path of the local sysroot to `/rustc/[commit hash]`. -- If the compilation unit is under the working directory, from the the working directory absolute path to empty string. - If it's outside the working directory, from the absolute path of the package root to `[package name]-[package version]`. +- For the the current package (where the current working directory is in), from the the absolute path of the package root to empty string. + For other packages, from the absolute path of the package root to `[package name]-[package version]`. The default value of `trim-paths` is `object` for release profile. As a result, panic messages (which are always embedded) are sanitised. If debug information is embedded, then they are sanitised; if they are split then they are kept untouched, but the paths to these split files are sanitised. @@ -166,7 +166,7 @@ Some interactions with compiler-intrinsic macros need to be considered: 1. Path (of the current file) introduced by [`file!()`](https://doc.rust-lang.org/std/macro.file.html) *will* be remapped. **Things may break** if the code interacts with its own source file at runtime by using this macro. 2. Path introduced by [`include!()`](https://doc.rust-lang.org/std/macro.include.html) *will* be remapped, given that the included file is under - the current working directory or a dependency package. 
+ the current package or a dependency package. If the user further supplies custom `--remap-path-prefix` arguments via `RUSTFLAGS` or similar mechanisms, they will take precedence over the one supplied by `trim-paths`. This means that the user-defined remapping arguments must be @@ -286,7 +286,7 @@ There has been an issue (https://github.com/rust-lang/rust/issues/40552) asking release builds. It has, over the past 4 years, gained a decent amount of popular support. The remapping rule proposed here is very simple to implement. -Path to sysroot crates are specially handled by `rustc`. Due to this, the behaviour we currently have is that all such paths are virtualised. +Paths to sysroot crates are specially handled by `rustc`. Due to this, the behaviour we currently have is that all such paths are virtualised. Although good for privacy and reproducibility, some people find it a hindrance for debugging: https://github.com/rust-lang/rust/issues/85463. Hence the user should be given control on if they want the virtual or local path. @@ -303,7 +303,7 @@ the behaviour of `--remap-path-prefix=all` and the stable `--remap-path-prefix`, - `macro` is primarily meant for panic messages embedded in binaries. - `diagnostics` is unlikely to be used on its own as it only affects console outputs, but is required for completeness. See [#87745](https://github.com/rust-lang/rust/issues/87745). - `unsplit-debuginfo` is used to sanitise debuginfo embedded in binaries. -- `split-debuginfo` is used to sanitise debuginfo separate from binaries. This is may be used when debuginfo files are separate and the author +- `split-debuginfo` is used to sanitise debuginfo separate from binaries. This may be used when debuginfo files are separate and the author still wants to distribute them. - `split-debuginfo-path` is used to sanitise the path embedded in binaries pointing to separate debuginfo files. 
This is likely needed in all contexts where `unsplit-debuginfo` is used, but it's technically a separate piece of information inserted by the linker, not rustc. @@ -332,7 +332,7 @@ the other for only debuginfo: https://reproducible-builds.org/docs/build-path/. in specific directories for these paths to work. [For instance](https://github.com/rust-lang/rust/issues/87825#issuecomment-920693005), if the absolute path to the `.pdb` file is sanitised to the relative `target/release/foo.pdb`, then the binary must be invoked under the crate root as `target/release/foo` to allow the correct backtrace to be displayed. -- Should we treat the current working directory the same as other packages? We could have one fewer remapping rule by remapping all +- Should we treat the current package the same as other packages? We could have one fewer remapping rule by remapping all package roots to `[package name]-[version]`. A minor downside to this is not being able to `Ctrl+click` on paths to files the user is working on from panic messages. - Will these cover all potentially embedded paths? Have we missed anything? From 451f16335acf41e3012ff8b104817fc05e4c41f9 Mon Sep 17 00:00:00 2001 From: Eric Huss Date: Sat, 13 May 2023 10:43:01 -0700 Subject: [PATCH 32/32] Update tracking issue --- text/3127-trim-paths.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/text/3127-trim-paths.md b/text/3127-trim-paths.md index 0e16094090f..8883a471349 100644 --- a/text/3127-trim-paths.md +++ b/text/3127-trim-paths.md @@ -1,7 +1,7 @@ - Feature Name: trim-paths - Start Date: 2021-05-24 - RFC PR: [rust-lang/rfcs#3127](https://github.com/rust-lang/rfcs/pull/3127) -- Rust Issue: N/A +- Rust Issue: [rust-lang/rust#111540](https://github.com/rust-lang/rust/issues/111540) # Summary [summary]: #summary