Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support configuring whether to capture backtraces at runtime #93101

Merged
merged 1 commit into from
Feb 3, 2022

Conversation

Mark-Simulacrum
Copy link
Member

@Mark-Simulacrum Mark-Simulacrum commented Jan 20, 2022

Tracking issue: #93346

This adds a new API to the std::panic module which configures whether and how the default panic hook will emit a backtrace when a panic occurs.

After discussion with @yaahc on Zulip, this PR chooses to avoid adjusting or seeking to provide a similar API for the (currently unstable) std::backtrace API. It seems likely that the users of that API may wish to expose more specific settings rather than just a global one (e.g., emulating the env_logger, tracing per-module configuration) to avoid the cost of capture in hot code. The API added here could plausibly be copied and/or re-exported directly from std::backtrace relatively easily, but I don't think that's the right call as of now.

mod panic {
    #[derive(Copy, Clone, Debug, PartialEq, Eq)]
    #[non_exhaustive]
    pub enum BacktraceStyle {
        Short,
        Full,
        Off,
    }
    fn set_backtrace_style(BacktraceStyle);
    fn get_backtrace_style() -> Option<BacktraceStyle>;
}

Several unresolved questions:

  • Do we need to move to a thread-local or otherwise more customizable strategy for whether to capture backtraces? See this comment for some potential use cases for this.
    • Proposed answer: no, leave this for third-party hooks.
  • Bikeshed on naming of all the options, as usual.
  • Should BacktraceStyle be moved into std::backtrace?
    • It's already somewhat annoying to import and/or re-type the std::panic:: prefix necessary to use these APIs, probably adding a second module to the mix isn't worth it.

Note that PR #79085 proposed a much simpler API, but particularly in light of the desire to fully replace setting environment variables via env::set_var to control the backtrace API, a more complete API seems preferable. This PR likely subsumes that one.

@rust-highfive rust-highfive added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Jan 20, 2022
@Mark-Simulacrum Mark-Simulacrum added T-libs-api Relevant to the library API team, which will review and decide on the PR/issue. S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Jan 20, 2022
@rust-log-analyzer

This comment has been minimized.

@yaahc
Copy link
Member

yaahc commented Jan 20, 2022

  • Should we expose knobs for configuring library backtraces separately from those printed on panic!?

In my opinion, yes. My expectation is that there would be one knob associated with std::backtrace::Backtrace and the other knob associated somehow with the default std panic hook. Intuitively I expect ppl will want an API that sets both to the same value as well, though at that point it feels like three knobs is maybe too many knobs, so maybe that yes is not 100% certain.

Part of me separately thinks that we should make it so the knob only exists on the lib backtrace type, and then expect ppl to replace the panic hook when they want to configure the panic backtrace behavior separately, and polish that experience to make it as easy as possible to change. Maybe going so far as exporting a std::panic::Hook type that has some configurability knobs that can then be installed to override the default configuration.

Should the API permit configuring full vs. short backtraces (and potentially other options in the future)?

Yes as well imo

@yaahc raised in February 2021 that an alternative to the current Backtrace interface may be introduced in core "soon" -- this has not materialized, and in any case the API added here would be trivial to move to core (either by itself or alongside other parts of the backtrace design).

That comment is out of date at this point. We did prototype and show how we could split the exterior API of Backtrace from the implementation across core and std to enable error in core, but my conclusion once we finished that was that the hacks necessary are not worth it, and so our current plan is to entirely separate Backtrace from the error trait by removing the backtrace function and replacing it with generic member access. Also, this work is ongoing with the FCP for merging the provider API, the core functionality for generic member access, being started just yesterday. At this point I don't think we should gate any changes to the Backtrace API on backtrace in core since its likely not a path we will go down.

  • @yaahc raised in November 2020 a desire for a more customizable hook, similar to that provided for panics. This is something we could do later, but it also seems likely to be rather complicated, and likely to target the panic hook specifically rather than backtraces in general (presuming we eventually stabilize the regular Backtrace type).

100% agreed, this kind of dynamic backtrace capture configurability should be handled in the panic hook / eyre reporter / error type constructor or equivalent mechanism as the caller of Backtrace::capture()/Backtrace::force_capture().

@Mark-Simulacrum
Copy link
Member Author

Replacing the panic hook feels like a rather heavy-weight operation from a user's perspective compared to adjusting a single setting, so I don't think we should optimize for users doing it over providing knobs to adjust specific behaviors (e.g., whether to print backtraces), as this PR does. Overriding the panic hook with any kind of custom code in a wide-spread way also means that it's harder for us in std to make the messaging/integration of the hook with the panic machinery nicer by default (e.g., trimming the backtrace in panics to avoid some of the default machinery). So I don't think it's a good idea to depend on that strategy long-term as opposed to the flag-based design proposed by this PR.

It seems to me that a single knob that matches the available environment variables is better suited at this point than panic hook replacement.

[...] At this point I don't think we should gate any changes to the Backtrace API on backtrace in core since its likely not a path we will go down.

I think we agree here on the important aspect, at least, that this PR is not related to its placement in core (or not).

@rust-log-analyzer

This comment has been minimized.

@yaahc
Copy link
Member

yaahc commented Jan 20, 2022

Replacing the panic hook feels like a rather heavy-weight operation from a user's perspective compared to adjusting a single setting, so I don't think we should optimize for users doing it over providing knobs to adjust specific behaviors (e.g., whether to print backtraces), as this PR does.

To be clear, I don't mean customizing the panic hook by re-creating one from scratch. I was more thinking about abstracting the logic in the default hook we provide and exporting that as well so that users can easily tweek the reporting format while preserving the rest of the logic that they still want to reuse. In fact, with the current state of our panic hook it is impossible for users to replicate all the functionality we provide, IIRC we use a special non public interface for capturing backtraces that is more resistant to oom or something along those lines.

I'm imagining something potentially as simple as:

std::panic::Hook::default()
    .set_capture_preference(my_runtime_capture_preference)
    .install();

I agree that right now it's heavy-weight operation but I think this is a weakness of our error handling story and not something we should be perpetuating in other adjacent designs. In my opinion tweeking the error reporting for panics and recoverable errors should be normalized and we should seek to make it as lightweight and convenient as possible.

Overriding the panic hook with any kind of custom code in a wide-spread way also means that it's harder for us in std to make the messaging/integration of the hook with the panic machinery nicer by default (e.g., trimming the backtrace in panics to avoid some of the default machinery). So I don't think it's a good idea to depend on that strategy long-term as opposed to the flag-based design proposed by this PR.

Similarly, I think this is an indicator that such messaging and integration support that we have in our panic hook should be exported as well as part of our API so it can be reused in customized hooks. I don't think its reasonable for us to expect or plan on the default hook being sufficent for all applications and we should actively plan for customization of it.

@rust-log-analyzer

This comment has been minimized.

@Mark-Simulacrum
Copy link
Member Author

With the functions provided by the new API (after the latest push), it's just a couple lines to set panics on:

std::backtrace::set_library(std::backtrace::CapturePreference::Full);
std::backtrace::set_panic(std::backtrace::CapturePreference::Full);

I agree that exposing the internals of the default hook in a way that users can customize seems like a good goal. However, I don't think it precludes providing the APIs proposed here. In fact, I think rather than "building up" the default hook by providing a series of options that it takes, we should strive to expose these as global flags that std provides read/write APIs for (this PR provides the write side).

That way a downstream library that wants to expose similar functionality to a panic hook (e.g., interfacing with Python or similar, and wanting the same behavior from exceptions thrown in Python as panics thrown in Rust) can read those options and respect the user's choices -- without needing to be separately configured. I suspect many of the configuration knobs we might want to provide will have similar applications outside just that of the std hook.

This is all to say that while I agree that we should plan for -- and expect -- that users will want to customize the panic hook and other error handling functionality, I am not convinced that the right mechanism for doing so is to leverage a builder-style API that builds up a panic hook implementation. And, it seems to me, that even if we had such a builder style API, the APIs added by this PR would not be an inconvenience, but rather complement it relatively well. At minimum, the set_library function is largely orthogonal from the panic hook, so would presumably want to exist regardless.

@yaahc
Copy link
Member

yaahc commented Jan 20, 2022

I agree that exposing the internals of the default hook in a way that users can customize seems like a good goal. However, I don't think it precludes providing the APIs proposed here. In fact, I think rather than "building up" the default hook by providing a series of options that it takes, we should strive to expose these as global flags that std provides read/write APIs for (this PR provides the write side).

That way a downstream library that wants to expose similar functionality to a panic hook (e.g., interfacing with Python or similar, and wanting the same behavior from exceptions thrown in Python as panics thrown in Rust) can read those options and respect the user's choices -- without needing to be separately configured. I suspect many of the configuration knobs we might want to provide will have similar applications outside just that of the std hook.

Fair enough, are you thinking of adding get_library and get_panic functions that return the set CapturePreference? Either way big +1 for the style of approach of exposing the building blocks rather than a Builder API.

This is all to say that while I agree that we should plan for -- and expect -- that users will want to customize the panic hook and other error handling functionality, I am not convinced that the right mechanism for doing so is to leverage a builder-style API that builds up a panic hook implementation. And, it seems to me, that even if we had such a builder style API, the APIs added by this PR would not be an inconvenience, but rather complement it relatively well. At minimum, the set_library function is largely orthogonal from the panic hook, so would presumably want to exist regardless.

Fair point fair point, I didn't really intend to push the builder API specifically, just that I want some uniformity between how we interface with Backtrace across both panics and recoverable errors and to make it easy to preserve this functionality when transitioning from the default panic hook to a customized one.


Regarding the specific split proposed here, one of the things that bothers me and feels inconsistent is that Backtrace:capture() is influenced by set_library, but there is no equivalent public API for capturing a backtrace for a panic that is influenced by set_panic. I'm imagining a situation where we have both the set and get functions for these settings, and it feels kinda strange to respect the set_library configuration by calling Backtrace::capture, but to respect the set_panic configuration by calling if get_panic() == CapturePreference::??? { Backtrace::force_capture() }, also, I'm not really sure how one would respect the Full/Short split in this case.

I'm thinking this implies that the Backtrace apis should really be something like this:

fn set_library(CapturePreference);
fn get_library() -> CapturePreference;
fn capture_library() -> Backtrace;
fn force_capture_library() -> Backtrace;

fn set_panic(CapturePreference);
fn get_panic() -> CapturePreference;
fn capture_panic() -> Backtrace;
fn force_capture_panic() -> Backtrace;

Then you can respect or override either configuration while respecting the format configuration of each knob. Though I guess this raises the question, do we expect users to ever enable both kinds of backtraces but have one be Full and the other Short?

If not we could make the format a separate knob and make force_capture generally usable for both sides. Something like this:

fn set_format(DisplayFormat);
fn get_format() -> DisplayFormat;

fn set_library(enabled: bool);
fn get_library() -> bool;
fn capture_library() -> Backtrace;

fn set_panic(enabled: bool);
fn get_panic() -> bool;
fn capture_panic() -> Backtrace;


fn force_capture() -> Backtrace;

And as a minor nit, and I know you already mentioned this in the top level comment so this isn't a blocker just adding support behind that unresolved question: I really don't like the set_library/set_panic names, though I guess we're kinda stuck with it based on the naming of the already stable RUST_BACKTRACE and RUST_LIB_BACKTRACE env variables... It just feels lame calling the backtrace that's being used for recoverable errors a library backtrace. "library" seems like an irrelevant distinction here. And set_panic makes it seem like its somehow related to the panic APIs, rather than just being intended for use in panic reporting implementations by convention. panic vs library feels like a poor dichotomy. I kinda wish the split was recoverable vs non-recoverable, or user-facing vs dev-facing, or really just anything more direct.

edit: wait actually no, RUST_LIB_BACKTRACE is not stabilized, we're not stuck here!

@bors
Copy link
Contributor

bors commented Jan 21, 2022

☔ The latest upstream changes (presumably #93119) made this pull request unmergeable. Please resolve the merge conflicts.

@rustbot rustbot added the T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. label Jan 26, 2022
@Mark-Simulacrum
Copy link
Member Author

@yaahc revised this PR per our discussion on Zulip to just focus on the std::panic API, leaving the std::backtrace::Backtrace::capture() behavior out of scope, likely to be addressed by a separate PR.

@rust-log-analyzer

This comment has been minimized.

@yaahc
Copy link
Member

yaahc commented Jan 26, 2022

LGTM, feel free to r+ once it's passing tests and has a tracking issue.

@Mark-Simulacrum
Copy link
Member Author

r? @yaahc updated per comments (and updated the tracking issue)

@rust-highfive rust-highfive assigned yaahc and unassigned m-ou-se Jan 27, 2022
@rust-log-analyzer

This comment has been minimized.

@yaahc
Copy link
Member

yaahc commented Jan 28, 2022

Looks good to me other than the clarifying question on the last comment I left unresolved.

@Mark-Simulacrum
Copy link
Member Author

CI failure is unrelated to this PR, so r? @yaahc for my response there

1 => BacktraceStyle::Short,
2 => BacktraceStyle::Full,
3 => BacktraceStyle::Off,
_ => unreachable!(),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why don't we return None here to avoid any possibility of panic.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess we can make it a debug_assert or so, but it seems like a None on other variants makes it a little too easy to just forget to change this function after adding a new variant.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think its fine for now. Like Mark said, the panic defends against errors being introduced in refactors. I'd rather leave it in until it becomes an issue.

library/std/src/panicking.rs Outdated Show resolved Hide resolved
library/std/src/panic.rs Show resolved Hide resolved
library/std/src/panic.rs Outdated Show resolved Hide resolved
library/std/src/panic.rs Show resolved Hide resolved
@yaahc
Copy link
Member

yaahc commented Jan 31, 2022

@bors r+

@bors
Copy link
Contributor

bors commented Jan 31, 2022

📌 Commit 368a83d has been approved by yaahc

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Jan 31, 2022
matthiaskrgr added a commit to matthiaskrgr/rust that referenced this pull request Feb 1, 2022
…r=yaahc

Support configuring whether to capture backtraces at runtime

Tracking issue: rust-lang#93346

This adds a new API to the `std::panic` module which configures whether and how the default panic hook will emit a backtrace when a panic occurs.

After discussion with `@yaahc` on [Zulip](https://rust-lang.zulipchat.com/#narrow/stream/219381-t-libs/topic/backtrace.20lib.20vs.2E.20panic), this PR chooses to avoid adjusting or seeking to provide a similar API for the (currently unstable) std::backtrace API. It seems likely that the users of that API may wish to expose more specific settings rather than just a global one (e.g., emulating the `env_logger`, `tracing` per-module configuration) to avoid the cost of capture in hot code. The API added here could plausibly be copied and/or re-exported directly from std::backtrace relatively easily, but I don't think that's the right call as of now.

```rust
mod panic {
    #[derive(Copy, Clone, Debug, PartialEq, Eq)]
    #[non_exhaustive]
    pub enum BacktraceStyle {
        Short,
        Full,
        Off,
    }
    fn set_backtrace_style(BacktraceStyle);
    fn get_backtrace_style() -> Option<BacktraceStyle>;
}
```

Several unresolved questions:

* Do we need to move to a thread-local or otherwise more customizable strategy for whether to capture backtraces? See [this comment](rust-lang#79085 (comment)) for some potential use cases for this.
   * Proposed answer: no, leave this for third-party hooks.
* Bikeshed on naming of all the options, as usual.
* Should BacktraceStyle be moved into `std::backtrace`?
   * It's already somewhat annoying to import and/or re-type the `std::panic::` prefix necessary to use these APIs, probably adding a second module to the mix isn't worth it.

Note that PR rust-lang#79085 proposed a much simpler API, but particularly in light of the desire to fully replace setting environment variables via `env::set_var` to control the backtrace API, a more complete API seems preferable. This PR likely subsumes that one.
@matthiaskrgr
Copy link
Member

Tests failed in a rollup #93541 (comment)
@bors r-

@bors bors added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. labels Feb 1, 2022
@Mark-Simulacrum
Copy link
Member Author

@bors r=yaahc rollup=iffy

@bors
Copy link
Contributor

bors commented Feb 2, 2022

📌 Commit 85930c8 has been approved by yaahc

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. labels Feb 2, 2022
@bors
Copy link
Contributor

bors commented Feb 2, 2022

⌛ Testing commit 85930c8 with merge b380086...

@bors
Copy link
Contributor

bors commented Feb 3, 2022

☀️ Test successful - checks-actions
Approved by: yaahc
Pushing b380086 to master...

@bors bors added the merged-by-bors This PR was explicitly merged by bors. label Feb 3, 2022
@bors bors merged commit b380086 into rust-lang:master Feb 3, 2022
@rustbot rustbot added this to the 1.60.0 milestone Feb 3, 2022
@rust-timer
Copy link
Collaborator

Finished benchmarking commit (b380086): comparison url.

Summary: This benchmark run did not return any relevant results.

If you disagree with this performance assessment, please file an issue in rust-lang/rustc-perf.

@rustbot label: -perf-regression

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
merged-by-bors This PR was explicitly merged by bors. S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. T-libs-api Relevant to the library API team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

10 participants