This repository has been archived by the owner on Aug 17, 2022. It is now read-only.

Exception object type as anyref or its subtype #10

Closed
aheejin opened this issue Dec 8, 2017 · 19 comments

Comments

@aheejin
Member

aheejin commented Dec 8, 2017

We have been discussing using anyref as an exception type (or its supertype) recently, as @eholk mentioned in #9 (comment).

The current exception handling proposal imposes several difficulties (WebAssembly/exception-handling#30 and WebAssembly/exception-handling#31), and allowing opaque exception objects to be stored in locals and dynamically type-tested can solve most of those problems. (They don't have to be stored in linear memory.) The proposed anyref as a WASM value type (#9) sounds like it satisfies many of the requirements.

One thing is, we use a 'tagged value' to represent an exception object. An exception object is a pair of a tag and a list of values. Definitions of the related terminologies are here. Tags can be used in many ways, possibly to denote types (int, MyException&, ...) or languages (C++, JavaScript, ...). In C++ exception support, we are using them to denote languages (we can't do types with C++ because of inheritance and such): for example, one tag can mean C++, another can mean JavaScript, etc.

Can we treat anyref as tagged values in general? It might be possible for most non-exception objects to have the same predefined tag, making them essentially tagless. With this approach, we would also need a match instruction that dynamically tells whether the current object (on the stack) has the specified tag.

Or, can we make the tagged value type a subtype of anyref? In that case we would need some instruction like isinstanceof as suggested in #4, as well as the match instruction to dynamically test the tag. This approach requires introducing a supertype/subtype hierarchy into the type system.
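A minimal sketch of the tagged-value idea described above, written in Python only as a model and not as proposed wasm semantics. The names `Tag`, `TaggedValue`, and `match` are hypothetical; the sketch assumes tags are compared by identity and that a successful match yields the value list.

```python
from dataclasses import dataclass
from typing import Any, Optional, Tuple


class Tag:
    """A hypothetical exception tag; identity, not the name, is compared."""

    def __init__(self, name: str) -> None:
        self.name = name


@dataclass(frozen=True)
class TaggedValue:
    """An exception object modeled as a {tag, value+} pair."""

    tag: Tag
    values: Tuple[Any, ...]


def match(obj: TaggedValue, tag: Tag) -> Optional[Tuple[Any, ...]]:
    """Return the payload if obj carries the given tag, otherwise None."""
    return obj.values if obj.tag is tag else None


# Tags used to denote languages, as in the C++ support described above.
CPP = Tag("C++")
JS = Tag("JavaScript")

# The payload here stands in for a C++ exception object pointer (made-up value).
exn = TaggedValue(CPP, (0x1000,))
```

With this model, `match(exn, CPP)` yields the payload tuple and `match(exn, JS)` yields `None`, which is the dynamic tag test the match instruction would perform.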

cc @eholk @dschuff @KarlSchimpf

@rossberg
Member

rossberg commented Dec 8, 2017

If we allowed tagged values as first-class values then their type would have to be something new. It could probably be made a subtype of anyref, depending on implementation details. You'd need to further distinguish the type of such tagged values from that of caught exception values (sometimes called exception packets), because those are more than just the exception that was thrown. They include context information, for example, an associated stack trace. With resumption, they would in fact include a continuation.

For that reason, I'd be hesitant to make exception packets first-class values that can escape their handler. That would have severe implications for the future. For example, any attempt to later extend exception handling with resumption might become difficult or very costly, because continuations would immediately become first-class as well. That's a scary thing to commit to prematurely.

@magcius

magcius commented Dec 8, 2017

This would then turn exceptions into GC'd objects, correct? If they can be local values, one can imagine they can set_elem to store them in a global table, which means they would need some form of reference management. I don't quite understand how this is supposed to fit together with C++ exceptions.

@aheejin
Member Author

aheejin commented Dec 8, 2017

@magcius The exception objects are host-created opaque objects and not C++ pointers. A C++ exception object pointer is a value in a {tag, value+} pair.

@eholk

eholk commented Dec 8, 2017

...any attempt to later extend exception handling with resumption might become difficult or very costly, because continuations would immediately become first-class as well.

While supporting resumption is something I'd like to see in the future, it is more important that we design an exception system now that meets the needs of C++. If we can do this and accommodate resumption in the future, great! But making resumption difficult should not stop us from solving problems that C++ has now.

It may be that exceptions with resumption, or more generalized effect handlers, are different enough from the exceptions we want to enable now that we should design them as separate features with completely separate types and instructions.

@lukewagner
Member

If we allowed tagged values as first-class values then their type would have to be something new. It
could probably be made a subtype of anyref, depending on implementation details.

Agreed

You'd need to further distinguish the type of such tagged values from that of caught exception
values (sometimes called exception packets), because those are more than just the exception
that was thrown.

What if we inverted the what-owns-what relationship so that the caught value was the tagged value and its contents were a list of wasm values (which would include anyref)? Additionally:

  • if a wasm exception propagated to JS, the tagged value would get boxed into a new WebAssembly.Exception JS object that didn't derive from Error but instead simply contained:
    • the JS API reification of the tag as a property (so JS could test tag via ===)
    • the array of wasm values as a property
  • if a JS exception propagated to wasm
    • if it was a WebAssembly.Exception, the tag/values would be unboxed and rethrown
    • otherwise, the JS exception would be converted to anyref and then put into a new tagged value whose tag was a pre-defined "JS exception" tag (that could be imported and used to extract the anyref from within wasm)

(This all feels rather symmetric to function import/export rules.) Thus, by default, a wasm throw wouldn't need to eagerly capture a stack or do a GC allocation. If wasm wanted a backtrace, in the EH MVP, it would need to call out to JS to throw new Error(). When we one day added first-class stack-walking to core wasm, presumably that could be used directly as long as there was an associated value type.
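The boxing and unboxing rules listed above can be sketched as a small Python model. This is only an illustration under stated assumptions: `BoxedWasmException` stands in for the proposed WebAssembly.Exception object, `JS_EXCEPTION_TAG` for the pre-defined "JS exception" tag, and the function names are hypothetical.

```python
from typing import Any, NamedTuple, Tuple


class Tag:
    """A hypothetical exception tag, compared by identity."""

    def __init__(self, name: str) -> None:
        self.name = name


class TaggedValue(NamedTuple):
    """Wasm-side exception: a tag plus a list of wasm values."""

    tag: Tag
    values: Tuple[Any, ...]


class BoxedWasmException:
    """Stand-in for the JS-side box; it exposes the tag and the values."""

    def __init__(self, tag: Tag, values: Tuple[Any, ...]) -> None:
        self.tag = tag
        self.values = list(values)


# Stand-in for the pre-defined tag that wraps foreign (JS) exceptions.
JS_EXCEPTION_TAG = Tag("JS exception")


def wasm_to_js(exn: TaggedValue) -> BoxedWasmException:
    """A wasm exception crossing into JS gets boxed."""
    return BoxedWasmException(exn.tag, exn.values)


def js_to_wasm(thrown: Any) -> TaggedValue:
    """A JS exception crossing into wasm is unboxed or wrapped."""
    if isinstance(thrown, BoxedWasmException):
        # Originally a wasm exception: restore the tag and values.
        return TaggedValue(thrown.tag, tuple(thrown.values))
    # Any other JS value becomes an opaque reference under the JS tag.
    return TaggedValue(JS_EXCEPTION_TAG, (thrown,))
```

In this model a wasm exception that passes through JS and back keeps its tag and values, while an ordinary JS exception arrives in wasm as a single opaque value under the "JS exception" tag.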

For that reason, I'd be hesitant to make exception packets first-class values that can escape their
handler. That would have severe implications for the future, for example, any attempt to later
extend exception handling with resumption might become difficult or very costly, because
continuations would immediately become first-class as well. That's a scary thing to commit to
prematurely.

I would've thought the necessary constraint here was that the resumption value (which could be a new value type whose values would be stored inside the tagged exception value) was invoked once. This could be a dynamic restriction. Dropping the syntactic restriction of resume would, just like rethrow, increase the expressivity of resumption as a feature. So I don't see a problem here; if anything, it looks like a win.
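A minimal sketch of what a dynamically enforced "invoked once" restriction could look like, modeled in Python. The class `OneShotResumption` and its `resume` method are hypothetical names, and the continuation is modeled as a plain callable rather than a real captured stack.

```python
from typing import Any, Callable


class OneShotResumption:
    """A resumption value that may be invoked at most once."""

    def __init__(self, resume_fn: Callable[[Any], Any]) -> None:
        self._resume_fn = resume_fn
        self._used = False

    def resume(self, value: Any = None) -> Any:
        # The restriction is checked at run time, not by syntax.
        if self._used:
            raise RuntimeError("resumption value invoked more than once")
        self._used = True
        return self._resume_fn(value)
```

The first call to `resume` runs the continuation; a second call fails with a runtime error instead of being ruled out syntactically.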

@rossberg
Member

rossberg commented Dec 14, 2017 via email

@aheejin
Member Author

aheejin commented Dec 14, 2017

@rossberg

That way, a handler can potentially be compiled
more cheaply. With only catch-all you perhaps could annotate the handler
differently, but if the exception can also escape then you'd need to track
this property further through the type system I think. What corner cases
could arise and does this impose extra cost because we'll need to assume
the worst case in more places?

But without the ability to rethrow from outside of a catch block, which is the reason I want to assign exceptions to locals, we may end up having to assume the worst case for most exceptions anyway. The point I was trying to make in WebAssembly/exception-handling#30 is that it is a very common pattern for a rethrow to be preceded by some code that is reachable from many catches, like below:

block $label0
  try
    ...
  catch i
    br $label0
  end
  ...
  try
    ...
  catch i
    br $label0
  end
end

some common code
rethrow

If we want to rethrow an exception that's caught by either of the catches and don't want to duplicate 'some common code' part, which can be arbitrarily long, there is no way to support this in the current scheme. One thing @eholk suggested offline is we might be able to use resumable exceptions everywhere, so that we can use it like a subroutine to run the common code and come back to a catch block:

try
  try
    ...
  catch i
    ...
    throw j          (1)
    rethrow          (4)
  end
catch                (2)
  some common code
  resume             (3)
end

(Execution order: (1) -> (2) -> (3) -> (4))

Here I showed only a single try-catch pair (the outer try-catch will be inserted by the compiler to make this resumable, so that control flow can come back inside the catch block again). If there are multiple catch blocks that share some common code, we need that many extra compiler-inserted try-catches. This scheme looks overly complicated given that it only supports a couple of very plain and simple try-catches, and it would end up using resumable exceptions everywhere just to return to a catch block and rethrow something, because we can't assign exceptions to locals.
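For comparison, here is a sketch of the pattern that exceptions in locals would allow, modeled in Python, where a caught exception is an ordinary value. The function `run_with_local` and its parameters are hypothetical; `ValueError` stands in for the tag `i`.

```python
from typing import Callable, List, Optional


def run_with_local(
    first: Callable[[], None],
    second: Callable[[], None],
    log: List[str],
) -> None:
    caught: Optional[BaseException] = None  # the exception held in a local

    try:
        first()
    except ValueError as e:
        caught = e  # leave the handler, keeping the exception

    if caught is None:
        try:
            second()
        except ValueError as e:
            caught = e

    if caught is not None:
        # Shared code, written once and reached from both handlers.
        log.append("some common code")
        raise caught  # rethrow from outside any catch block
```

Both handlers only store the exception and fall through, so the common code appears once and the rethrow happens after the handlers have exited.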

Do you suggest any alternatives that can make rethrow happen?

@lukewagner
Member

@rossberg

@lukewagner, yes, one vs multiple invocations of a continuation is probably gonna be a dynamic check. What you might want to know statically, though, is zero vs one

Building on the approach I outlined above, just like the exception-with-stack value could be an opaque anyref stored inside the tagged exception value, I imagine we would have a new opaque continuation value type, created by some new resumable_throw opcode, that would be optionally stored in the exception tagged value. I think we need the static throw vs. resumable_throw distinction anyway because of how differently they get compiled locally in the function. Then the catch site either has, or doesn't have, a continuation value just based on the matched exception tag's signature. So I think the zero vs. one would be sufficiently static for an efficient impl, or is there something else?

@flagxor
Member

flagxor commented Dec 16, 2017

@lukewagner @rossberg Pulling out the throw vs resumable_throw distinction seems useful (as does in general grounding our choices in what's required in the compiler). It also (hopefully?) lets us disentangle resumption.

Exception handling for C++ with good cross-language interaction seems like it's going to require some kind of mechanism for the exception to make its way outside the scope of the catch.
So far we've danced around this with:

  • an exception stack
  • wrapping the exception in a resumption

Having a direct representation of the exception on the stack + locals seems like the least convoluted variant so far.
With resumption handled by a different sort of throw, what are the concrete downsides of allowing exceptions in locals + stack?
If we want to syntactically avoid throwing more than once, I suppose we could make rethrow take a local and null it. But that seems a tad weird.

@rossberg
Member

rossberg commented Dec 16, 2017

@lukewagner, @flagxor, distinguishing the throw is one end, but I strongly suspect that you'll also want to be able to distinguish on the catch end for optimal code. That looks trickier with a tag-agnostic catch_all, but maybe it can be done.

@aheejin, @flagxor, the other option proposed earlier was a generalised rethrow, which works similar to a br, but for the exceptional path. Taking @aheejin's example from above:

try $label0
  try
    ...
  catch i
    rethrow $label0  // rethrow from target block
  end
  ...
  try
    ...
  catch i
    rethrow $label0
  end
catch
  some common code
  rethrow
end

In general, rethrow $l terminates the target block with the current exception. If that block happens to be a try body, then this is simply a jump to the respective handler and can be compiled as such. (Omitting here the source label to denote the "current" exception that we already have on rethrow, so in fact it would then have two labels, source and target.)

But you might all be right, and exceptions in locals may still be the nicer option. I agree with @lukewagner that we ultimately may want to have that anyway. I just fear that this might be harder to design and implement properly and that we end up cutting corners or prematurely pruning the design space for resumption. For example, there is the choice about shallow vs deep handlers, where the latter always resumes inside the handling try, which turns out to have certain advantages (and disadvantages). That option is lost when you allow escaping. I would at least suggest consulting with people who have experience implementing and using such mechanisms.

@lukewagner
Member

lukewagner commented Dec 18, 2017

@rossberg Ah, I see your point: when compiling a call from within a try block that can catch continuations, you need to start a new stack segment at the callsite so that the segment can be set aside when executing the handler. If there are already separate throw/resumable_throw and rethrow/resume opcodes, then it also seems natural to have separate catch/catch_continuation opcodes, which should give us all the static info. That still leaves some questions about how the continuation value gets created/passed, but it seems like there are viable options here.

But overall still agreed that the first-class exception value is probably our cleanest solution.

@aheejin
Member Author

aheejin commented Dec 18, 2017

@rossberg

I don't think that would work, because wrapping code parts with an outer enclosing try-catch can introduce other problems. Putting an outer enclosing try-catch around two arbitrary try-catches involves computing the nearest common dominator of the two try-catches in a CFG, which may very well be the entry node, resulting in the new try-catch wrapping the whole function. And there can be other function calls elsewhere that might throw, and whose exceptions should propagate to the caller. But now we have an enclosing catch that wraps all those calls, so their exceptions would get caught in the new catch, while they should just propagate to the caller. Referring to your example,

try $label0

  (1)

  try
    ...
  catch i
    rethrow $label0
  end

  (2)

  try
    ...
  catch i
    rethrow $label0
  end

  (3)

catch
  some common code
  rethrow
end

there can be other calls that might throw in (1), (2), or (3). Their semantics now have changed because in case they throw, they are going to be caught by the new catch. We can insert some more code to make those calls throw to the caller, like, before all those calls we set some local signalling that they are meant to be propagated up to the caller and not be caught by a new catch or something, and in the new catch block we insert a branch to check the value of that local and do different things based on the result. But this is clearly ugly and will contribute to code size as well.
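The flag-based workaround described above can be sketched in Python as follows. This is only a model under assumptions: `run_with_flag`, `region_call`, and `guarded_call` are hypothetical names, `ValueError` stands in for the tag `i`, and a bare `raise` stands in for both `rethrow $label0` and the final rethrow.

```python
from typing import Callable, List


def run_with_flag(
    region_call: Callable[[], None],
    guarded_call: Callable[[], None],
    log: List[str],
) -> None:
    # Extra local: True while running calls in regions such as (1),
    # whose exceptions should reach the caller untouched.
    propagate = True
    try:
        region_call()  # region (1): now wrapped by the new outer try
        try:
            guarded_call()
        except ValueError:
            propagate = False  # this exception came from an inner catch
            raise  # stands in for `rethrow $label0`
    except Exception:
        # Extra branch inserted into the new catch block.
        if not propagate:
            log.append("some common code")
        raise
```

The extra local and the extra branch in the outer handler are exactly the added code that the comment above calls ugly and costly in code size.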

And what is the difference between rethrow label and normal rethrow? In your example code, it doesn't look like they are semantically different. (If we replace the rethrow label with a normal rethrow, it would have the same semantics, I mean)

@eholk

eholk commented Dec 19, 2017

there can be other calls that might throw in (1), (2), or (3). Their semantics now have changed because in case they throw, they are going to be caught by the new catch.

For a second I thought you could work around this by throwing a new exception with a tag that is unused elsewhere, but then you would still need a way to be able to rethrow the original exception.

Still, I think we can make this work with another try block, like this:

try $label1
  try $label0
    (1)

    try
      ...
    catch i
      rethrow $label1
    end

    (2)

    try
      ...
    catch i
      rethrow $label1
    end

    (3)

  catch
    rethrow $label2
  end $label0
catch
  some common code
  rethrow
end $label1

This gives us a stack of try blocks, 2 1 0 (I'm using 2 as an implicit label that means "skip all the try blocks in this function"). We have common code in label 1's catch block that we want to run if any of the inner tries catch an exception. If the other code, such as (1), (2), or (3), throws, the exception is caught by the label 0 catch block. This block simply rethrows, but skips the common code. On the other hand, to run the common code, we rethrow $label1, which skips the catch at layer 0.

@aheejin
Member Author

aheejin commented Dec 20, 2017

@rossberg @eholk

Ah, now I understand what @rossberg meant. It was what you suggested in WebAssembly/exception-handling#29 (comment). Yeah, we actually might be able to use this to solve the problem. It can incur a code size increase, but I don't think that would be significant. I have to check if this can cover all the cases.

This is equivalent to adding a depth argument to rethrow (as in br or br_if). Does that mean we remove the original depth argument of rethrow, which specifies which exception object to rethrow? We don't seem to use it anywhere actually. Or do we keep both arguments?

@eholk

eholk commented Dec 20, 2017 via email

@rossberg
Member

@aheejin, @eholk, yes, that's the rough idea. Good point about other throws, though. Code that throws in (1), (2), (3) could either be handled the way @eholk suggests, or by also having a target label on throw itself to skip over the inner try -- which admittedly makes this proposal increasingly less attractive.

As said above, the proposal would imply having two labels on rethrow, because the motivation for having the source label is independent of this use case, it being a general composability argument (e.g. if you need to nest try into handlers).

But I'm also thinking through the first-class exceptions alternative some more. I'm positive that we could come up with an adequate semantics if we give up on deep handlers -- which might not fly anyway in a low-level language. (Unfortunately, I'm off into vacation now, but back in 2 weeks.)

@aheejin
Member Author

aheejin commented Dec 21, 2017

@rossberg

Code that throws in (1), (2), (3) could either be handled the way @eholk suggests, or by also having a target label on throw itself to skip over the inner try -- which admittedly makes this proposal increasingly less attractive

The bigger problem is that it is not even going to be a throw; in most cases it's gonna be a call that might throw. I'm not actually against attaching a depth to the throw instruction, but modifying the call instruction so it can have a depth, or creating a second call' instruction that can take a depth, sounds like a bigger and less attractive change.

@rossberg
Member

@aheejin, right, good point.

@aheejin
Member Author

aheejin commented Dec 2, 2018

Closing this, since we decided to make except_ref a subtype of anyref.

@aheejin aheejin closed this as completed Dec 2, 2018

6 participants