Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

GEN-192: Use Pin API for error-stack #5008

Closed
Bennett-Petzold opened this issue Sep 7, 2024 · 7 comments
Closed

GEN-192: Use Pin API for error-stack #5008

Bennett-Petzold opened this issue Sep 7, 2024 · 7 comments
Assignees
Labels
area/libs > error-stack Affects the `error-stack` crate (library) area/libs Relates to first-party libraries/crates/packages (area) category/enhancement New feature or request lang/rust Pull requests that update Rust code

Comments

@Bennett-Petzold
Copy link

Related Problem

Nesting enums is an effective strategy for providing errors from a library, allowing the downstream user to use arbitrarily granular handling. e.g.

struct Err1 {}

enum Err2 {
  Downstream(Err1),
  Originating
}

However, this does not seem possible in Report. I believe the way to emulate this for functions producing reports is as follows:

struct Err1 {}

enum Err2 {
  Downstream,
  Originating
}

fn foo() {
  let orig_report = Report::new(Err1);
  let top_report = orig_report.change_context(Err2::Downstream);
  
  match top_report.current_context() {
    Downstream => {
      let orig_err = top_report.downcast_ref::<Err1>().unwrap();
      ...
    },
    Originating => { ... }
  }
}

This requires the library author to document the downstream error and the user to read that documentation. Standard error types have no such limitation.

Proposed Solution

Frames are currently stored as Box<dyn FrameImpl> with mutable handle methods. Make this type Pin<Box<dyn FrameImpl>> and replace mutable access methods with https://doc.rust-lang.org/std/pin/struct.Pin.html#method.set wrappers. Report can then be self-referential with its errors, enabling Deref access through some const * wrapper (ReportCell) and a special construction method (change_context_ref) that provides a reference to the current context.

struct Err1 {}

enum Err2 {
  Downstream(ReportCell<Err1>),
  Originating
}

fn foo() {
  let orig_report = Report::new(Err1);
  let top_report = orig_report.change_context_ref(|err_ref| Err2::Downstream(err_ref));
  
  match top_report.current_context() {
    Downstream(orig_err_ref) => {
      let orig_err: &Err1 = *orig_err_ref;
      ...
    },
    Originating => { ... }
  }
}

For soundness when a Context uses borrowed data during drop, Report would need to be changed from Box<Vec<Frame>> to Box<Vec<ManuallyDrop<Frame>>> and given a custom Drop implementation that enforces top of stack to bottom of stack drop ordering.

This scheme would allow for both enum matching and mutation (via replacement) for purposes outlined in #4610 (comment). However, it would introduce unsafe into the codebase for its implementation.

Alternatives

It's possible that I'm simply wrong about enum handling (If that's the case, great!). I've also experimented with mitigating this by:

  1. Wrapping Report to create the guarantees and transforms I need -- ends up creating a complex error type exclusive to a single library.
  2. Writing docs to indicate next-on-stack error type for error enums.
  3. Avoiding returning Report<E> from most functions, wrapping it at the top. Returning plain E allows for pattern matching, but this puts all backtrace line numbers at the topmost error. It also avoids most the benefits of this library.

Additional context

If this seems viable, maybe as a breaking change or as an alternate form of Report under a feature flag, I'm happy to work on a PR.

@Bennett-Petzold Bennett-Petzold added area/libs Relates to first-party libraries/crates/packages (area) area/libs > error-stack Affects the `error-stack` crate (library) category/enhancement New feature or request lang/rust Pull requests that update Rust code labels Sep 7, 2024
@TimDiekmann
Copy link
Member

Hi @Bennett-Petzold and thanks for the issue.

I don't think I really can follow your issue. What is the issue around the first snippet?

struct Err1 {}

enum Err2 {
  Downstream(Err1),
  Originating
}

I don't see a reason why this should not work.

@vilkinsons vilkinsons changed the title Use Pin API for ErrorStack GEN-192: Use Pin API for error-stack Sep 9, 2024
@Bennett-Petzold
Copy link
Author

Bennett-Petzold commented Sep 9, 2024

I've written up an expanded example demonstrating why the first form is incompatible with full error stack output. The first code version is with regular enum nesting, and the second version is the changes to properly track lines with Report as it currently stands. I used thiserror to create my error types.

use error_stack::Report;
use thiserror::Error;

fn main() {
    foo().unwrap();
}

#[derive(Debug, Error)]
#[error("error 1!")]
struct Err1 {}

#[derive(Debug, Error)]
enum Err2 {
    #[error("error 2 caused by error 1!")]
    Downstream(#[source] Err1),
    #[error("error 2!")]
    Originating,
}

fn bar() -> Result<(), Err1> {
    Err(Err1 {})
}

fn foo() -> Result<(), Report<Err2>> {
    let orig_err = bar();
    match orig_err {
        Ok(()) => Ok(()),
        Err(e) => Err(Err2::Downstream(e).into()),
    }
}

This produces

thread 'main' panicked at src/main.rs:5:11:
called `Result::unwrap()` on an `Err` value: error 2 caused by error 1!
├╴at src/main.rs:28:43
│
╰─▶ error 1!
    ╰╴at src/main.rs:28:43
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace

The backtrace puts all error line numbers at line 28, where the Report is created. If I change my type to the second version, with this diff...

15c15
<     Downstream(#[source] Err1),
---
>     Downstream,
20,21c20,21
< fn bar() -> Result<(), Err1> {
<     Err(Err1 {})
---
> fn bar() -> Result<(), Report<Err1>> {
>     Err(Err1 {}.into())
28c28
<         Err(e) => Err(Err2::Downstream(e).into()),
---
>         Err(e) => Err(e.change_context(Err2::Downstream)),

It produces a backtrace with all line number tracking down to the actual source sites.

thread 'main' panicked at src/bin/alt.rs:5:11:
called `Result::unwrap()` on an `Err` value: error 2 caused by error 1!
├╴at src/bin/alt.rs:28:25
│
╰─▶ error 1!
    ╰╴at src/bin/alt.rs:21:17
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace

Here's the cargo project with both versions: report_diff.tgz

@TimDiekmann
Copy link
Member

TimDiekmann commented Sep 9, 2024

Hmm, I see your point, thanks for providing the example. I'm a bit hesitant about the amount of unsafe required around the Pin handling, but I wonder if we can get around this by Box::pin.
Personally, I have not done much with Pin (except when implementing Stream and Sink which is quite annoying 😅). Could you think of a solution which does not require unsafe? I tried quite hard to reduce the amount of unsafe code to a minimum (I even sacrificed a bit of performance and used Box<dyn FrameImpl> because I don't expect this to be the bottleneck in the usual case, but got a 100% sound implementation for that). There are good tools around unsafe (and we run miri in our CI), though, if we can avoid it it would be good.

Generally, I don't have a strong opinion around how we mutate the data. Direct mutation is nice, but a proxy object is good as well. The latter would even allow to create immutable reports. A breaking change here is not the end of the world as long as migrating is easy. We are still in pre-release because (a) a few features we rely on are not stabilized, yet, and (b) I expect changes around the generics and multi-sources, which will be a breaking change as well.

I have not tested around with your provided code, yet, and maybe I'm missing something, but what prevents us from creating ReportCell without Pin? Where exactly we need references?

@Bennett-Petzold
Copy link
Author

Bennett-Petzold commented Sep 9, 2024

I realized it can be done entirely safely with RefCell<Pin<...>> at the tail end of writing this, but I'm keeping the unsafe explanation anyways.

  • NOTE: Since I'm writing proof of concept code, I don't know if it all strictly compiles as-is.

Safe implementation

Use RefCell to allow for aliasing (this codebase never fails the runtime checks) and Pin to ensure validity.

... // (in frame/mod.rs)
  // mutation is now replace-only, via "frame.get_mut().set(VALUE)"
  // VALUE is whatever we're holding as "dyn FrameImpl".
  frame: RefCell<Pin<Box<dyn FrameImpl>>>,
...

struct ReportCell<T>(RefCell<Pin<Box<T>>>);

impl Deref<T> for ReportCell<T> {
  type Target = T;
  
  /// Desugared lifetimes for clarity.
  /// The pointer is bound to the lifetime of ReportCell.
  /// ReportCell is only constructed in Report, only borrows valid lower data data in Report,
  /// and cannot be copied out of Report.
  /// ReportCell is only obtained via a Report borrow, so the lifetimes bind as:
  ///    &'r Report -> &'a ReportCell -> &'a T (where 'r outlives 'a)
  /// Report then must be immutably borrowed for the lifetime of this return.
  fn deref<'a>(&'a self) -> &'a T {
    self.0.borrow().map(|cell| cell.as_ref())
  }
}

fn from_ref(existing: Report<Err1>) -> Report<Err2>> {
  existing.change_context(Err2::Downstream(existing.current_context_cell().clone()))
}

Justification for unsafe

Why use Pin?

ReportCell inside Report would turn Report into a self-referential datatype. Example structure:

Report
- Err2
-╰─> Err1

Report owns the pointers to Err2 and Err1, but Err2 also points to Err1 and relies on that pointer remaining valid. If Report has direct non-pin access to Box, it could invalidate Err2 by assigning a new pointer for Err1 and dropping the old one (via Box::new). The purpose of Pin is to block movement on Err1, so that Err2 is never invalidated by Err1 being moved or in a temporarily invalid state.

An immutable design with no unsafe

We could also make sure Err1 is always valid by putting all types in Arc instead of Box, so the memory layout would look like

Report
- Err2
-- Err1 (ref owned by Err2)
- Err1 (ref owned by Report)

And Err2 would look like

enum Err2 {
  Downstream(Arc<Err1>),
  Originating
}

Since Report now has to store all its values as Arc, they're immutable.

Making the safe design mutable, at API cost

It is possible to get around that in safe Rust, in turn, by transforming Arc<T> to Arc<RwLock<T>>. But the API would then need to return Arc<RwLock<T>> (or return the read and write handles directly) instead of &T and &mut T so users can lock the results. A library user could bungle their handling of locks and end up with a runtime deadlock the borrow checker can't catch for them.

An immutable unsafe design

By using one line of unsafe only mutable return types need to change, and no deadlock risk is introduced by the bare API. Errors are stored with Pin<Box<T>> to ensure they can't move, and up-stack errors can they rely on the address remaining valid. If we try to construct the nested error naively with references...

enum Err2<'a> {
  Downstream(&'a Err1),
  Originating
}

fn from_ref<'b>(existing: Report<Err1>) -> Report<Err2<'b>>> {
  existing.change_context(Err2::Downstream(existing.current_context()))
}

the compiler rejects it for returning a reference to data owned by the function.

We know that Err1 is valid for as long as Err2 needs it (as long as we guarantee that Err1 drops before Err2). The drop order documentation which guarantees "the elements of an array or owned slice are dropped from the first element to the last", which makes the originally proposed ManuallyDrop redundant. So we can escape the compiler's lifetime analysis and know that Err1 will outlive Err2, so the following code is coherent:

struct ReportCell<T>(*const T);

impl Deref<T> for ReportCell<T> {
  type Target = T;
  
  /// Desugared lifetimes for clarity.
  /// The pointer is bound to the lifetime of ReportCell.
  /// ReportCell is only constructed in Report, only borrows valid lower data data in Report,
  /// and cannot be copied out of Report.
  /// ReportCell is only obtained via a Report borrow, so the lifetimes bind as:
  ///    &'r Report -> &'a ReportCell -> &'a T (where 'r outlives 'a)
  /// Report then must be immutably borrowed for the lifetime of this return.
  fn deref<'a>(&'a self) -> &'a T {
    // Only line of `unsafe` necessary for this implementation
    unsafe {&*self.0}
  }
}

enum Err2 {
  Downstream(ReportCell<Err1>),
  Originating
}

fn from_ref(existing: Report<Err1>) -> Report<Err2>> {
  let existing_cell = ReportCell(*existing.current_context());
  existing.change_context(Err2::Downstream(existing_cell))
}

This does make the entire structure immutable, same as the Arc implementation.

The proposed mutable unsafe design

Rust assumes &T is non-aliasing by default. UnsafeCell allows us to opt out of that assumption so that if Err1 changes, Err2's reference is guaranteed to reflect that change. Without UnsafeCell, the compiler might optimize the reference to Err1 by re-using a stale value. So to make mutation possible, store things in UnsafeCell.

... // (in frame/mod.rs)
frame: UnsafeCell<Pin<Box<dyn FrameImpl>>>,
...
struct ReportCell<T>(UnsafeCell<T>);
...
  fn deref<'a>(&'a self) -> &'a T {
    // Only line of `unsafe` necessary for this implementation
    unsafe {&self.0.get()}
  }

What about Cell?

Cell doesn't allow returning an immutable reference, only cloned values. That would mean !Clone errors (which forms at least a plurality of Rust ecosystem errors) can't be stored in Report.

@Bennett-Petzold
Copy link
Author

On further reflection, the Pin inside RefCell might be extraneous. I think just RefCell by itself upholds requirements, since it'll update for a new indirect pointer. But RefCell actually does an internal clone for Clone, so it needs to be stored in an Arc and use borrow_mut for mutation so it can work on !Clone errors.

@TimDiekmann
Copy link
Member

Thank you for the detailed response!

That would all result in a lot of pointer indirection (both, the Pin approach and the RefCell approach). Generally, I think that Report should not become as complex as you described, i.e. it should not be self-referencing at all. Actually, it is already too complex and @indietyp is currently investigating this to reduce the indirection.
Instead, it can be modeled much easier by extending Context by returning something like report_source - similar to Error::source but returning an inner Report instead. The formatter implementation would need to take this into account and could render the expected output. But there is a catch: We want to remove Context in favor of core::error::Error (and bump the MSRV to 1.81), so adding more methods to Context is not really an option. Anyway, we have the provider API available (or at least what remains of it). Generally, we could provide something which looks like a Report (for example as described in #4693). It cannot be a Report<C> because C is not known, but something like a FrameStack certainly could be used.

I'd prefer to discover this route before we largely overhaul the internal representation of Report.

@Bennett-Petzold
Copy link
Author

I don't think the unsafe Pin compiles down to more pointer indirection (it's a complicated way of preventing the compiler from catching values), but one pointer deref a non-panic handling path isn't much cheaper than two pointer derefs anyways. You're right it comes at a perf hit.

I think I'm fighting some architectural differences with the idea of wrapping the Errors at all -- Report is throwing in a less direct ownership model than I want to work with. What I really want from this library is the output from fmt + error enums that own their children; I don't personally envision a use for attachments instead of additional nesting. Given this is permissively licensed, I think I'd like to use formatting logic from this library to create another library along the lines of https://github.com/Kijewski/pretty-error-debug, if that's alright. I'll make sure to include full credit + commit history, and link back for compatibility if possible.

Rework around core::error::Error looks like a solid move! Looking forward to see how that comes out.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/libs > error-stack Affects the `error-stack` crate (library) area/libs Relates to first-party libraries/crates/packages (area) category/enhancement New feature or request lang/rust Pull requests that update Rust code
Development

No branches or pull requests

2 participants