Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Propagate panics harder #14

Merged
merged 1 commit into from
Oct 4, 2016
Merged

Conversation

alexcrichton
Copy link
Contributor

Once a Connection has panicked in I/O it's effectively poisoned and we
shouldn't come back to it (due to a lack of UnwindSafe bound). Set a
flag on Connection and bail early before we access the stream.

@sfackler
Copy link
Collaborator

I don't think this is necessary. We rethrow the panic, so from the perspective of a user of this crate, it's as if the catch_unwind didn't exist. Is there something I'm not thinking of?

@alexcrichton
Copy link
Contributor Author

Oh, sorry, to clarify, this is fixing the SIGILL on travis

It looks like SecureTransport writes twice before returning back for the panic to get propagated? Somehow the runtime is detecting a double panic.

@sfackler
Copy link
Collaborator

sfackler commented Oct 1, 2016

The SIGILL is a thing that's weird, but it seems to me like some kind of issue with libstd. IIRC it popped up after some of those optimizations made to the panic runtime a couple weeks ago.

Specifically, it shouldn't matter if secure transport tries to write twice, since we catch the panic each time. If the panic runtime isn't decrementing the panic count when a panic is caught, that seems bad.

@alexcrichton
Copy link
Contributor Author

The change in question was likely rust-lang/rust#34866, but I disagree that this change isn't needed. With panic safety we shouldn't come back to call write again, but what's happening here appears to be:

  • The SslStream::handshake method is called
  • Transitively, the write_func callback is called
  • This panics, returning errSecIO
  • The panic is then propagated on the other side of the handshake
  • As part of the panic, the SslStream is destroyed
  • SSLClose is called by the destructor
  • Transitively write_func is called again

The ud2 instruction happens because this is legitimately a double panic after #34866. Catching a panic while panicking isn't allowed (which is what's happening here). Moreover, I think it's also invalid to go back to the stream and try to write again after it's panicked, the purpose of the catch_unwind + propagation is to avoid that, right?

@sfackler
Copy link
Collaborator

sfackler commented Oct 3, 2016

Ah, that does make sense. I think I'd kind of prefer to only have the Drop impl check the poison state rather than everything, under the rationale that if the user wants to call write after a panic, that's up to them. Does that make sense?

We could alternatively not call SSLClose in Drop at all. It won't work in a nonblocking context and it might make more sense to have people opt-into an orderly session shutdown rather than doing it by default. IIRC rust-openssl doesn't do a shutdown in its destructor, but schannel-rs does :(. We should figure out a consistent story here and make sure everything's doing the same thing.

@alexcrichton
Copy link
Contributor Author

Ah yeah I'd imagine that Drop would opportunistically do I/O if possible (like BufWriter) but ignore all errors (including would block)

@alexcrichton
Copy link
Contributor Author

Ok, updated a bit with an assert!(!self.panicked) in the C callbacks and a guard against calling those functions elsewhere.

Is that what you're thinking?

@sfackler
Copy link
Collaborator

sfackler commented Oct 3, 2016

Let's cut the change to get_ref, get_mut, read, write, and flush, since it seems fine to let the user explicitly poke at things after a panic if they want to. That would also require the asserts in the C callbacks to go away, which is probably a good idea anyway since we don't want to unwind through C.

@alexcrichton
Copy link
Contributor Author

Hm right yeah that's true, I think that with propagation it's ok to re-poke. The C side though was intended to abort quickly b/c it's a bug to reenter. Would you prefer though if they just immediately returned errSecIO?

@sfackler
Copy link
Collaborator

sfackler commented Oct 3, 2016

It's only a bug if we do it without the user's permission (i.e. the Drop use case). If the user wants to start calling read/write again, that seems like a thing that should work.

Once a `Connection` has panicked in I/O it's effectively poisoned and we
shouldn't come back to it in the destructor, so skip `SSLClose` in this
case.
@alexcrichton
Copy link
Contributor Author

Ok, updated. Slowly getting my head straight around this again...

@sfackler sfackler merged commit 18f0f3c into kornelski:master Oct 4, 2016
@sfackler
Copy link
Collaborator

sfackler commented Oct 4, 2016

Thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants