Add --warn-no-return, and major binder refactoring #1748
@ecprice, I'm putting this up so you know where I'm planning to go with this in case you intend to work more on the binder in the near future. What I've done is have a Frame contain the information of, in addition to the types that various expressions have when a certain point in the code is reached, whether that point is reachable at all. This information is combined in In this version I've made the semantics of |
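To make the idea of a frame that records reachability alongside narrowed types concrete, here is a minimal sketch. It is not mypy's binder code: the `Frame` class shape, the `join_frames` helper, and the use of plain strings for types are all assumptions made for illustration.

```python
from typing import Dict, List, Optional


class Frame:
    """Hypothetical frame: narrowed types plus a reachability flag."""

    def __init__(self, types: Optional[Dict[str, str]] = None,
                 unreachable: bool = False) -> None:
        self.types = dict(types or {})
        self.unreachable = unreachable


def join_frames(frames: List[Frame]) -> Frame:
    """Combine the frames that flow into one program point (assumed rule)."""
    # Paths that cannot be reached contribute nothing to the result.
    live = [f for f in frames if not f.unreachable]
    if not live:
        return Frame(unreachable=True)
    # Keep only variables that are narrowed on every live path.
    common = set(live[0].types)
    for f in live[1:]:
        common &= set(f.types)
    merged: Dict[str, str] = {}
    for name in sorted(common):
        options = sorted({f.types[name] for f in live})
        if len(options) == 1:
            merged[name] = options[0]
        else:
            merged[name] = "Union[" + ", ".join(options) + "]"
    return Frame(merged)
```

With this rule, a branch that ends in `return` or `raise` is marked unreachable and no longer widens the types seen after the `if`.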
Yeah, adding can_skip back seems reasonable. By the way, your branch has a bug with try/finally; it doesn't detect this:
(I detected this by rebasing my error-on-fall branch that warns on unreachable code above this, then noticing mistakes on the mypy codebase.)
Oh, thanks for the example. I think it's the same issue as a general bug with try statements that I hadn't filed an issue for yet, namely that the initial state after entering the body isn't captured as a possible state by the surrounding "try frame". For example

    from typing import Union, List
    x = 1 # type: Union[int, List[int]]
    x = 1
    try:
        x = [1]
    finally:
        x[0]

should error on the last line but doesn't. (There could be an asynchronous exception, or you could add a function call that could raise an exception there and it makes no difference to the binder.)
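The runtime behaviour behind this can be checked directly. The sketch below is an illustration, not part of the PR: `may_raise` and `observe` are hypothetical names, and `may_raise` stands in for any call that can raise before the assignment in the try body runs.

```python
from typing import List, Union


def may_raise() -> None:
    """Stand-in for any call that can raise inside the try body."""
    raise RuntimeError("raised before the assignment")


def observe() -> str:
    x = 1  # type: Union[int, List[int]]
    seen = ""
    try:
        try:
            may_raise()
            x = [1]  # never reached on this run
        finally:
            # x still holds the int assigned before the try statement
            seen = type(x).__name__
    except RuntimeError:
        pass
    return seen
```

Here `observe()` records `int` in the finally block, which is why the state on entry to the try body has to count as a possible state for the finally clause.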
Oh, yeah, that's right. Hackishly adding Why do you have the |
I think I figured out something that had been confusing me here. In This inconsistency didn't cause any problems before |
Seems reasonable. I think the reason it worked before is that the times it wouldn't work are when the list is empty and you're unreachable, and there was separate logic for detecting unreachable states and breaking out.
force-pushed from a72d456 to 02de8f9
@ecprice, what do you think about just handling |
force-pushed from e8c39c9 to bc10eae
Don't you need to set |
Yes, I do. Since this pattern appears everywhere I'm going to make it into a method.
One issue with the |
force-pushed from b68f747 to 6c7ceb0
Yep... the fast parser does parse it correctly though so I used it in one of the tests. I think I'm done with this for now, so if you have a chance, could you try rebasing your |
OK, I've updated error-on-fall to give a note for missing return statements. The first commit is the real one; the later ones update the tests and mypy's own return statements, and add a hack for assert False, "message" so that there aren't too many false positives. I haven't rebased the warning for unreachable lines yet, because my old version wasn't quite right. A decent approximation isn't too hard, but when code paths are checked multiple times it gets tricky to warn only on the lines that are unreachable under all paths.
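For readers who have not hit this class of bug, the sketch below shows the two situations mentioned here: a function that can fall off the end, and one whose last statement is `assert False, "message"`. The function names are hypothetical, and no exact mypy output is claimed.

```python
def sign_label(n: int) -> str:
    if n > 0:
        return "positive"
    elif n < 0:
        return "negative"
    # n == 0 falls off the end, so the call implicitly returns None
    # even though the annotation promises str.


def parity(n: int) -> str:
    if n % 2 == 0:
        return "even"
    if n % 2 == 1:
        return "odd"
    # Control never gets past this line, so there is no fall-through;
    # a missing-return check that ignores this would be a false positive.
    assert False, "n % 2 is always 0 or 1 for ints"
```

`sign_label(0)` returns `None` at runtime, which is the kind of mistake a missing-return warning is meant to surface.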
Oh, and binder.ConditionalTypeBinder.unreachable should return None, not bool.
I've finished what I wanted to do on this branch. The last two commits above remove The code ended up a bit trickier than I would like, due to things like deleting exception handler variables and the fact that definitions that are assignments to a more specific type than the declared type don't set the initial state of the variable in the type binder. (Should this be fixed? Anyways, it would be too much for this PR.) It could probably use a future clean-up. Overall I'm a lot happier with the new state of the binder! Thanks again to @ecprice for helping out with this little project. Particularly helpful were all the test cases! |
What would it take (besides a rebase) to get this ready for merging? I would really like to use this option.
force-pushed from 2517710 to 734d357
I managed to rebase this on master, so if one of you has a chance to review it that would be great. Hopefully I can even remember how it works :)
I'm looking at this.
Looks good, and I especially like how this makes binder-related code clearer and more readable. Thanks for working on this and continuing to update it; finding missing returns has been a long-time wishlist item. When run against mypy, this seems to find some real bugs! There are some test cases that I'd like to see added, but they don't block this PR.
I suspect that this PR broke something. I am seeing some new errors for our internal "S" and "C" builds (both Python 2). Several are new errors mentioning basestring, e.g.

    def f(a):
        # type: (unicode) -> None
        if not isinstance(a, basestring):
            a = ""
        z = [] # type: List[unicode]
        z.append(a)

The error with the latest commit is:
There's also another error that appeared at the same time which is even more disturbing.
That message is old but I've never seen it before and here it is occurring inside an |
I found the cause of the first bug. The logic in For the second error, it would really help to have a sanitized/minimized reproducer; this stuff with deleting exception variables was tricky, and I'm not surprised if the behavior changed, but I don't remember the details well. |
Thanks for looking at the first bug. I agree that the reproducer looks odd, but these kinds of patterns really do happen in our code. I'm guessing the type annotation indicates what we hope for and the isinstance() check tries to deal with bad calls from unchecked code. I'm still trying to come up with an isolated reproducer of the second bug.
Update: I've had no success yet finding a small reproducer for the deleted variable error. The code that triggers this is remarkably complex. :-( I'll keep looking into it.
OK, here's a repro. It's still a doozie!
The errors are:
And yes, it does need two helper classes, and a try/except inside another try/except/else's else clause, and the forward reference |
Updates: (1) the with-statement wasn't necessary, just the forward ref (I updated the example in place). (2) this is likely due to the deferred_nodes queue. The forward reference |
Another update: the forward ref |
Much simpler repro:
It's clearly due to deferral, and reusing the caught variable. Since I've just gotten dug into the deferral queue I may or may not be able to come up with a quick fix.
Looks like deferral causes mypy to stop inferring types after some point; this is a problem for exception variables, which can change between incompatible types (as well as DeletedType) in different except: branches. The following patch seems to fix your examples:
I don't understand deferral well enough to know if there are other problematic interactions, though.
Oh, that was a helpful hack! The real issue is actually that when current_node_deferred is set, we never assign to var.type. So here's a better (and simpler!) fix:
I'll write a test and submit it as a PR.
No, that's what I tried to do at first, but it doesn't work. If you check, you'll find that it doesn't fix your first repro of the bug, only the second one. In your first repro, the first except: clause appears before the forward reference. Hence the type will be set to DeletedType (because current_node_deferred isn't set yet), and then not updated in the except: clause after current_node_deferred is set.
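As background for the DeletedType discussion, the sketch below shows the Python 3 runtime behaviour involved: the name bound by `except ... as e` is unbound when the clause ends, and the same name can be rebound to a different exception type in another clause. The function names are hypothetical, and this illustrates the language behaviour only, not mypy's internals.

```python
from typing import List


def handler_variable_state() -> str:
    try:
        raise ValueError("first")
    except ValueError as e:
        inside = str(e)
    try:
        e  # Python 3 unbinds the name when the except clause finishes
    except NameError:  # UnboundLocalError is a subclass of NameError
        return inside + ", then deleted"
    return inside + ", still bound"


def handler_kinds() -> List[str]:
    kinds = []
    for exc in (KeyError("k"), ValueError("v")):
        try:
            raise exc
        except KeyError as e:
            kinds.append(type(e).__name__)
        except ValueError as e:
            # The same name now holds an unrelated exception type.
            kinds.append(type(e).__name__)
    return kinds
```

Because one name moves between incompatible types and a deleted state, a checker that stops updating the variable's type partway through a function will report stale information in later clauses.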
Dang you're right.
I didn't look into this in detail, but here's another idea (that may not work). What if we created a separate |
Alternatively, I wonder if we should more closely copy what
Jukka's idea sounds good, however, I'm not sure I understand the motivation for the deferred execution logic in general. It has other issues, for example, the following code gives an error as is but not if the
Why not instead just evaluate all global statements (including function definitions) before checking all function bodies? It seems dangerous to typecheck functions in two different ways. Guido, that won't work either: the problem with except clauses isn't really about deletion. One can construct examples where, even if you rip out the deletion code entirely, it still gives the wrong answer because the exception type changes; for example,
OK, I'll have to try Jukka's idea of using a separate variable for each. Though I suspect it'll run into some other issue... (How would a use of the variable outside the handler clause even be processed?) I don't know how the design of the deferred nodes queue came about; hopefully Jukka remembers. Perhaps it has to do with assignments to instance variables being processed to set their types for use by other methods? We haven't seen a ton of problems due to it yet (though that may change because I'm looking to reuse it to handle import cycles as well, see #2264).
Type checking top-levels before function bodies doesn't resolve some of the more common issues we have, such as accessing an attribute before we've inferred a type for it from an assignment in a method. Example:
I'm not surprised that there are still bugs with deferred nodes, and they are somewhat of a hacky approach. I believe that we can get them to work correctly, though they clearly make it a little harder to reason about type checking. We often type check expressions multiple times anyway, so this mostly affects statements. To fix the union type example, a non-definition assignment to |
* Fix bug with exception variable reuse in deferred node. The fix is actually by @ecprice. See #1748 (comment)
Sorry for the interruption - is this just another incarnation of the coarse-grained vs. fine-grained incrementalism?
Deferred processing is not currently really related to incremental checking, but fine-grained dependency tracking and incrementalism might be an alternative to deferred checking in the future, though we don't have a design or plan for that yet. When we are type checking a function and it refers to something that we haven't processed yet (and that doesn't have an inferred type), we basically stop type checking the function and put it in a deferred list. After we've done a single pass on a strongly-connected set of modules, we do another pass and type check the functions in the deferred list again, and this time we generally have the types available from the first pass. (We don't actually stop processing the function when we defer it, but we stop inferring types for variables, since we are missing some type information and can't always infer the right types.)
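The two-pass scheme described here can be modelled with a toy example. This is only a sketch under simplifying assumptions, not mypy's implementation: `check_all`, the `Checker` callable type, the string type names, and the fallback to `"Any"` are all hypothetical.

```python
from typing import Callable, Dict, List, Optional

# Hypothetical model: checking a function yields an inferred type name,
# or None when it needs a type that has not been inferred yet.
Checker = Callable[[Dict[str, str]], Optional[str]]


def check_all(functions: Dict[str, Checker]) -> Dict[str, str]:
    known: Dict[str, str] = {}
    deferred: List[str] = []
    # First pass: check in definition order, deferring on missing types.
    for name, check in functions.items():
        result = check(known)
        if result is None:
            deferred.append(name)
        else:
            known[name] = result
    # Second pass: retry deferred functions with first-pass types available.
    for name in deferred:
        result = functions[name](known)
        known[name] = result if result is not None else "Any"
    return known


def check_caller(known: Dict[str, str]) -> Optional[str]:
    # Refers to 'helper', which is defined later in the file.
    return known.get("helper")


def check_helper(known: Dict[str, str]) -> Optional[str]:
    return "int"


types = check_all({"caller": check_caller, "helper": check_helper})
```

In this model `caller` is deferred on the first pass and resolved to `int` on the second, once `helper` has been processed.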