Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

The Defect exceptions are not consistently tracked #12862

Closed
zah opened this issue Dec 9, 2019 · 17 comments
Closed

The Defect exceptions are not consistently tracked #12862

zah opened this issue Dec 9, 2019 · 17 comments

Comments

@zah
Copy link
Member

zah commented Dec 9, 2019

In RFC 77, it was proposed that Defect exception types should not be tracked. Since this proposal was not accepted, we should assume that the currently specified behaviour is that the compiler should be tracking them.

This can be confirmed with the following simple program:

proc foo(x: seq[int], i: int): int {.raises: [].} =
  if i >= x.len:
    #  Error: can raise an unlisted exception: ref IndexError
    raise newException(IndexError, "index out of bounds") 
  x[i]

Unfortunately, there are many alternative situations where the compiler is failing to enforce a similar empty raises: [] list.

Indexing a sequence or array

proc foo(x: seq[int], i: int): int {.raises: [].} =
  x[i]

proc bar(x: array[10, int], i: int): int {.raises: [].} =
  x[i]

This program compiles without errors, even though it can raise IndexError.

Doing integer arithmetic

proc foo(x: int, y: int): int {.raises: [].} =
  x + y # This can fail with OverflowError

Assertions

proc foo(x: int, y: int): int {.raises: [].} =
  assert x < y

Please note that this example fails only because there is a deliberate hack in lib/system/assertions.nim trying to hide the AssertionError effect. It was introduced many years ago before we had the clear distinction between recoverable errors and defects.

Expected behaviour

My expectation is that all of the above programs should fail to compile until the user has specified raises: [Defect] in the signature. Alternatively, if the RFC is accepted, the very first program that raises IndexError manually should start compiling.

$ nim -v
Nim Compiler Version 1.0.4
@arnetheduck
Copy link
Contributor

One thing I've wondered about is why more of the exception-raising behaviour isn't defined in the std lib - ie it seems that one should be able to have a + operator that checks for overflow and calls sysFatal instead of generating that code in the backend - what are issues with this approach?

@Araq ?

this is something I've considered for wasm as well as nlvm, both of which would be more simple if some of these things were moved to the library instead - I imagine getting consistency would be helped as well because the "normal" tracking that the compiler already does would apply.

@Araq
Copy link
Member

Araq commented Mar 4, 2020

what are issues with this approach?

We're doing that slowly but it wasn't all that obvious how to write a + purely in the library that works at compile-time and at runtime and respects the --overflowChecks switch.

@Araq
Copy link
Member

Araq commented Mar 4, 2020

Let's analyse this snippet more closely, because I think the actual current implementation is better than the solutions the manual dreams about.

proc foo(x: seq[int], i: int): int {.raises: [].} =
  if i >= x.len:
    #  Error: can raise an unlisted exception: ref IndexError
    raise newException(IndexError, "index out of bounds") 
  x[i]

Imagine you read this as a new Nim programmer and that the compiler compiles this snippet. This is "obviously" a compiler bug! You do raise IndexError, claim you don't and the compiler swallows it. Yet, this is one proposed solution to the exception tracking problem. It doesn't feel right anymore. If the code raises an IndexError which is a bug indicator, it shouldn't have used raise, but instead sysFatal (though preferable under a better name). And sysFatal can either cause a stack unwind or a direct process termination, depending on the compilation model. (This is also what Rust/Go do btw.)

So instead of making the distinction between "uncatchable" exceptions and "catchable" exceptions we should have a distinction between raise and sysFatal. These are two separate things but encoding the difference in the exception hiearchy is the wrong design: It obfuscates what really happens, it never was implemented properly, and arguably it's harder to teach. All we really need to do is to expose sysFatal to libraries and document it properly. The implementation always had the right idea and there is little left to patch.

@arnetheduck
Copy link
Contributor

arnetheduck commented Mar 5, 2020

@zah had a hack/workaround in mind for that: {.errors: [XXX].} - basically something that would expand to {.raises: [Defect, XXX].}, and that users would use when they don't want to track defects (sometimes? often? most of the time? depends on the use case) - of course, this adds to the teaching burden because now you also have to remember the difference between the two.

With sysFatal:

  • 👍 the mechanisms are separate making a stronger distinction between the two modes - this is good because it's a lot easier to explain, and a lot harder to accidentally abuse - no longer do you run the risk of catching more than you intended with a naked catch - rust does a similar thing: normal errors go as return values, panics go as exceptions and require a special mechanism to contain
  • 👍 we can rethink what should happen for certain other tricky conditions, such as out-of-memory: when you have no more memory you cannot allocate a Defect either - C++ tends to reserve some memory for an exception instance for this reason - this is made easier if Defect cannot be inherited from because you can more easily bound the memory usage
  • 👎 we lose the ability to have the compiler guarantee Defect-free code - this is easier / more common than one would think - ie overflow addition is well-defined for all inputs, and I'd like to have a way to express / guarantee that there are no defects or exceptions at all - imagine writing a device driver, a real time or embedded system. This can perhaps be worked around by adding some other effect to track it?

In rust, panic-free libraries are generally considered higher-quality because it's easier to reason about them and the generated code can be made more efficient - no stack unwinding, no alternate code paths that mess up the logic or crash the application.

Simply put, a Defect-free library can be used everywhere a Defective one can be used, but also in some other situations - they're more powerful - but the only way to be sure is if the compiler checks it.

The other way to deal with sysFatal would indeed be to follow rust and add a cumbersome way to "catch" them - but as the rust manual notes, there are some fatal defects that can't be caught (I imagine certain hardware exceptions or platform-specific things, stack overflow etc) - key here is that the internal delivery mechanism / implementation could be the same, but the "keywords" that developers use to deal with them would have to be separate and water-tight.

@arnetheduck
Copy link
Contributor

one more issue with tracking raises as pragmas is that it's hard to express relations:

proc x(callback: proc () {.raises: [A])) {.raises: [...].}

how do I express x raises the same exceptions that the callback and nothing else? If raises were part of the "ordinary" type signature it seems like generics could be expanded to cover this scenario - it often happens in generic code in C++ for example, which is why they have a way to reason about it using nothrow(true) and nothrow(false) world - in particular, it allows them to write generic code that behaves differently if the callback can be guaranteed to not raise - there are many simplicity and efficiency gains to be had, specially when RAII is involved.

@Araq
Copy link
Member

Araq commented Mar 5, 2020

we lose the ability to have the compiler guarantee Defect-free code - this is easier / more common than one would think - ie overflow addition is well-defined for all inputs, and I'd like to have a way to express / guarantee that there are no defects or exceptions at all -

I think what people really want is total functions, they don't realize it though and it comes out as "I want to be able to catch programming bugs in my server". My "quirky exceptions" idea is a cheap way of enabling this -- instead of dying due to the error you set a global error flag and continue. If you break out after too many loop iterations/recursion levels automatically then also termination is guaranteed. So this is a way to make a function .total at runtime, it doesn't "guarantee" defect-free code, it uses brute force runtime mechanisms to ensure it. For example + would be mapped to a wrap around integer that can set the error flag.

Crazy idea ahead: On out of memory the allocator sets the error flag and returns a pointer to a single, preallocated fixed size block of memory (that is big enough, if not, we terminate the process), then we have hundred of spurious writes before the upper layers detect the OOM condition and reject the result of a computation. But it could work.

@arnetheduck
Copy link
Contributor

consider writing a bigint library (who would have thought, but we're up to about 6-7 now, turns out each cryptography library must have its own) - we really want to have defect-free addition for each limb and not quirky exceptions, because we a) want to pass the carry to the next limb and b) we want the compiler to tell us that we're not missing some weird edge condition as far as possible.

yes, "terminating" would be another nice guarantee, but since that's hard, "loop-free" could also be interesting since these tend to terminate as well. all these properties are orthogonal.

quirky, like float NaN is nice for side-effect-free stuff where you only care about some composite result being there or not, and you keep going as if nothing happened in the meantime - it doesn't always work/help though.

your OOM area would have to be per-thread I suspect. that's one of the things I remember as tricky from when I tried adding support for it in nlvm - the other being that the nim std lib api is broken/impossible to implement correctly when nested exceptions happen ;)

@Araq
Copy link
Member

Araq commented Mar 5, 2020

consider writing a bigint library (who would have thought, but we're up to about 6-7 now, turns out each cryptography library must have its own) - we really want to have defect-free addition for each limb and not quirky exceptions, because we a) want to pass the carry to the next limb and b) we want the compiler to tell us that we're not missing some weird edge condition as far as possible.

Ok, thanks, that clears things up. So when you say "Defect" you think about overflows and not so much about invalid array accesses.

quirky, like float NaN is nice for side-effect-free stuff where you only care about some composite result being there or not, and you keep going as if nothing happened in the meantime - it doesn't always work/help though.

Certainly.

@arnetheduck
Copy link
Contributor

not so much about invalid array accesses.

Well, I just highlighted an example of where I've wanted not to have a Defect, recently :) I'm not even sure I'd see out-of-bounds array accesses as defects always - I mean one way to view an array is a dense table indexed by integers - at that point, accessing out of bounds is just another KeyError or whatever it's called. That's what I find tricky with Defects - there's no logic behind the distinction that clearly picks a winner - it's all context. @zah likes to explain that you should document an intended API and make a defect out of anything that doesn't conform (pre-conditions etc) but that's when I'm usually thinking that I'd rather it was part of the API instead so the compiler would tell me and not the documentation.

The other reason is when you give up correctness for performance - this is usually the case for Defects - overflow is a good example - you kind of just hope that nobody will use big numbers and most of the time it works, enough that you as a programmer think that the user won't be too bothered by a crash, because although it's documented that integers have overflow checks sometimes, it's not like devs go off and actually write the code to avoid them. The eth2 specification had a bug like this - had it gone live, it would have been.. costly.

The last reason would be things that are outside of your control because by their nature, they're shared outside the scope of your application - memory, failing hardware etc etc - these are more clear cut, but then you wouldn't want to catch them, to begin with.

Nim has the advantage that it's a source-code-first language - that means that you can almost always fix the underlying defect if you want to. I guess that's also a reason to make defects more strict and crash.

@arnetheduck
Copy link
Contributor

actually, sysFatal and the panic mode is somewhat orthogonal to Defect tracking - the defects can be tracked as an effect regardless if they are raised or panicked.

In particular, even when the panic mode is on, the effect should not change, because if I type raises: [] I don't want that to change behaviour depending on the panic flag - I want the compiler to guarantee that there are no overflows or invalid array accesses or whatever else it might be in that particular piece of code.

@Araq
Copy link
Member

Araq commented May 11, 2020

Exceptions inheriting from Defect are not tracked anymore. For even more control DrNim will prove your array indexing correct and eventually also that no overflows can happen.

@Araq Araq closed this as completed May 11, 2020
@arnetheduck
Copy link
Contributor

that's a bit unsatisfactory in that there's no feature available today then, that allows me to track not only things that drnim in a rosy future maybe will understand, but also things I develop in libraries of my own - with my own defects. ie I might consider out-of-diskspace a defect because it's such a bother to code for - likewise for a number of other defects that that would be easy to manage with a tracking feature here. tracking defects allow me to know when the author of a library I depend on changes their mind.

in summary, this is a bit the opposite of what would be useful for delineating code that doesn't have failure modes (say.. a hash function) from stuff that does. in particular, drnim will remain fundamentally limited in what it can analyse - for example, with exception tracking I can say that I want a callback to not raise anything, not even defects, and get a useful best-effort attempt (yes, minus stack overflow, maybe - but I can avoid memory allocations, integer operations, indexing errors etc by simply improving the structure of my code, for all total problems) that gives me some security.

we want to write Defect-free code, but to do that at scale, we need tools that lets us see where the defects are coming from, generally - not prove that in some particular code path, defects don't happen and that there's a small optimization to be had..

@Araq
Copy link
Member

Araq commented May 11, 2020

I thought about adding .nodefects but there is not really an RFC for it. Please write one and then we can implement it. I cannot write it because I don't really understand what you need.

@iacore
Copy link
Contributor

iacore commented Feb 4, 2022

array.pop() is marked as .nosideeffect.... when the array is emtpy the program terminates. This feels wrong.

@Araq
Copy link
Member

Araq commented Feb 5, 2022

Read the manual, there is a definition for .noSideEffect and "can raise due to a Defect" is not part of the definition.

@iacore
Copy link
Contributor

iacore commented Feb 5, 2022

Coming from a functional programming background, the name is confusing. Divergent from regular execution path is considered "side effect" in academic or nonacademic literature (like blog posts about programming). I think it's confusing enough to change it, since most people don't read the whole manual before they use Nim.

@Araq
Copy link
Member

Araq commented Feb 8, 2022

Haskell doesn't consider defects to be "effects" either:

mydiv:: Integer -> Integer
mydiv b = (7 :: Integer) `div` b
-- can raise an exception but no monad required

main = putStrLn (show (mydiv 0))

I think it's confusing enough to change it, since most people don't read the whole manual before they use Nim.

Yet we don't punish the people who did read it by constantly renaming things.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants