Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Reachability/exhaustiveness checking for match statements #12010

Closed
JukkaL opened this issue Jan 18, 2022 · 15 comments · Fixed by #12267
Closed

Reachability/exhaustiveness checking for match statements #12010

JukkaL opened this issue Jan 18, 2022 · 15 comments · Fixed by #12267
Labels
bug mypy got something wrong priority-0-high topic-match-statement Python 3.10's match statement

Comments

@JukkaL
Copy link
Collaborator

JukkaL commented Jan 18, 2022

Currently this generates a false positive, since mypy doesn't recognize that all the possible values are processed by the match statement:

# Error: Missing return statement
def f(x: int | str) -> int:
    match x:
        case str(v):
            return 0
        case int(v):
            return 1
    # We can't reach here, but mypy doesn't know it

This has the same issue as well:

# Error: Missing return statement
def f(x: int | str) -> int:
    match x:
        case str(v):
            return 0
        case v:
            return 1
@JukkaL JukkaL added bug mypy got something wrong priority-0-high topic-match-statement Python 3.10's match statement labels Jan 18, 2022
@JukkaL
Copy link
Collaborator Author

JukkaL commented Jan 18, 2022

@freundTech Did you have a work-in-progress implementation of exhaustiveness checking? If you have, it would be great if you could create a PR -- maybe I or somebody else can help finish it?

@freundTech
Copy link
Contributor

There is a working exhaustiveness checking implementation in https://github.com/freundTech/mypy (feature-match-stmt-exhaustiveness branch. I made it the default branch)

I could make a PR from that branch, but it would need major rebasing. It's probably better to cherry-pick the relevant commits to a new branch instead of rebasing all of it.

The implementation works, but I'm not really happy with it's design. It checks all patterns twice. First ignoring guard statements in order to infer rough types for capture patterns (capture patterns need to be inferred before a guard statement can use them) and then a second time with guard patterns to infer more accurate types and narrow the capture patterns.

I did this because I thought that mypy couldn't broaden a variable's type after it was inferred. Later I found out that it can do that, but didn't have enough time to redo my implementation.

An example to illustrate the problem:

x: object

match x:
    case int(c) if c < 5:
        pass
    case str(c):
        pass

c should be inferred as int | str. To know this we need to look at both cases. Before checking the second case we need to check the first case. Checking the first case includes checking the guard. Checking the guard required c to be inferred first.

@JukkaL
Copy link
Collaborator Author

JukkaL commented Jan 18, 2022

Thanks! Cherry-picking from your branch and potentially tweaking parts of the implementation sounds reasonable. The branch also suggests that we could split this into at least two separate PRs: first one adds basic support and second one adds better support for enums (and literal types).

@JukkaL
Copy link
Collaborator Author

JukkaL commented Jan 19, 2022

There's some relevant discussion starting from this comment: #10191 (comment)

Here's my summary of the ideas discussed above (these can be moved to follow-up issues):

  • case _: assert False could generate an error if it's reachable
  • We could have a per-module strictness flag that generates an error if mypy can't prove that a match statement covers all values exhaustively

@erictraut
Copy link

For what it's worth, here's another discussion on this topic: microsoft/pyright#2569.

I'm personally not a fan of using assert False for this purpose. There are other reasons why someone might want to use assert False in their code. A type checker should not generate errors if it see an assert False statement because it doesn't violate any type rules. I'd prefer to standardize on assert_never and add it to one of the stdlib stubs.

@ghost
Copy link

ghost commented Jan 23, 2022

I'm wondering if this is the same issue as what I am experiencing?

def testing_return(test: str) -> str:
    match test:
        case "test":
            return test
        case "testing":
            return test

if __name__ == "__main__":
    print(testing_return("test"))

results in:
test.py:1: error: Missing return statement

it seems even valid match case blocks in python 3.10 results in mypy giving the missing return statement error.

@JelleZijlstra
Copy link
Member

JelleZijlstra commented Jan 23, 2022

@Pavocracy that error is correct. The function does not return a value if test is any string other than "test" or "testing".

(It doesn't matter that you only call the function with "test" as an argument. Mypy essentially checks each function in isolation.)

@ghost
Copy link

ghost commented Jan 23, 2022

@JelleZijlstra so mypy will always complain at match blocks that dont contain case _: ?

I did not know that, thank you for the information!

Just to clarify incase this ever shows up in a search result. Simply adding a case _: didn't fix my error.
I now understand I was using the match case incorrectly. I need to made modifications in the match block and have the return statement after the match block.

@freundTech
Copy link
Contributor

@Pavocracy Once this issue is solved using a case _: with a return should also work.
Then you could also type test as Literal["test", "testing"] instead of adding another return. That let's mypy know that only "test" and "testing" are valid inputs for this function

@TomasPuverle
Copy link

TomasPuverle commented Jan 23, 2022

I have been following this discussion with great interest; I think "safe by default" should win. It also seems to jive better with PEP 20's "Zen of Python": Explicit is better than implicit, Special cases aren't special enough to break the rules and, most importantly, Errors should never pass silently.

IMO, not handling exhaustiveness is equivalent to the required break inside of a C/C++ style switch statement - generally considered to be bug-prone and a language design mistake.

The overwhelming majority of languages (all?) I am familiar with that support structural pattern matching have exhaustiveness checking - with good reasons. Unless there are strong technical issues, it may be reasonable to go with what's familiar/established. It will also make Python more attractive to domains for which other languages have typically "dominated", such as compilers.

case _: assert False or similar will lead to the same problems as missed breaks. Even if a pattern is exhaustive today, it may not be tomorrow and requiring discipline from programmers is likely not to work "when programming in the large".

Finally, case _: pass will be clearer and shorter to write.

JukkaL added a commit that referenced this issue Mar 7, 2022
Closes #12010.

Mypy can now detect if a match statement covers all the possible values.
Example:
```
def f(x: int | str) -> int:
    match x:
        case str():
            return 0
        case int():
            return 1
    # Mypy knows that we can't reach here
```

Most of the work was done by @freundTech. I did various minor updates
and changes to tests.

This doesn't handle some cases properly, including these:
1. We don't recognize that `match [*args]` fully covers a list type
2. Fake intersections don't work quite right (some tests are skipped)
3. We assume enums don't have custom `__eq__` methods

Co-authored-by: Adrian Freund <[email protected]>
cdce8p pushed a commit to cdce8p/mypy that referenced this issue Mar 10, 2022
Closes python#12010.

Mypy can now detect if a match statement covers all the possible values.
Example:
```
def f(x: int | str) -> int:
    match x:
        case str():
            return 0
        case int():
            return 1
    # Mypy knows that we can't reach here
```

Most of the work was done by @freundTech. I did various minor updates
and changes to tests.

This doesn't handle some cases properly, including these:
1. We don't recognize that `match [*args]` fully covers a list type
2. Fake intersections don't work quite right (some tests are skipped)
3. We assume enums don't have custom `__eq__` methods

Co-authored-by: Adrian Freund <[email protected]>
@ruohola
Copy link

ruohola commented Jul 30, 2023

Is this deficiency (from the commit message of 226661f) tracked somewhere?

This doesn't handle some cases properly, including these:
1. We don't recognize that match [*args] fully covers a list type

@alexandermalyga
Copy link

Currently the following code still fails with error: Missing return statement:

from dataclasses import dataclass

@dataclass
class Foo:
    value: int | str

def f(x: Foo) -> int:
    match x:
        case Foo(str(v)):
            return 0
        case Foo(int(v)):
            return 1

@erictraut
Copy link

@alexandermalyga, I wouldn't expect that example to work. Type exhaustion for match statements is implemented using type narrowing. The initial type of the subject expression x is Foo. This type cannot be further narrowed by either the first or second case statements. It remains Foo in both cases. The Python type system has no way to represent the type "an instance of Foo where the attribute value is not a str".

If you want type exhaustion to work with this code, you will need to make a slight modification:

def f(x: Foo) -> int:
    match x.value:
        case str(v):
            return 0
        case int(v):
            return 1

This works because the type of the subject expression x.value is initially str | int. This is narrowed to int if the first case statement doesn't apply. It is further narrowed to Never if the second case statement doesn't apply. The Never type (which is the equivalent of an empty union) indicates that the match statement has exhausted all possible types.

@aiven-anton
Copy link

@erictraut

I wouldn't expect that example to work.

Fair, but do you expect it to give an error?

The Python type system has no way to represent the type "an instance of Foo where the attribute value is not a str".

Would it be possible to expand on this a little? When you write that the Python type system has no way of representing, do you really mean expressing? It's not clear to me why there would need to be a way to express this in order for a type checker to implement an understanding of it. How does this differ from the situation with intersections for instance, where there is a representation in type checkers but no way to express them?

My understanding is that a Python runtime implementation is free to optimize the given pattern into this:

if isinstance(x, Foo):
    if isinstance(x.value, str):
        return 0
    elif isinstance(x.value, int):
        return 1

Is there some fundamental reason that a type checker couldn't do the same, and apply exhaustiveness checking to that?

I think it's completely fair for a type checker to dismiss this as too complex to implement, or on grounds of being too slow, but is there some inherently unsafe or other (perhaps type-theory-based) reason not to have this functionality in type checkers?

@erictraut
Copy link

Fair, but do you expect it to give an error?

Given how type exhaustion detection works, yes I'd expect it to not detect exhaustion in this case.

Is there some fundamental reason that a type checker couldn't do the same

The logic in a type checker works on types that are expressible within the type system. It's theoretically possible for a type checker to internally create constructs that extend the type system, but the logic throughout the entire type checker would need to understand these new constructs. There would also need to be some way of expressing these constructs to users in error messages and reveal_type output. So yes, it's theoretically possible to do this, but it's not something I would expect or recommend.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug mypy got something wrong priority-0-high topic-match-statement Python 3.10's match statement
Projects
None yet
Development

Successfully merging a pull request may close this issue.

8 participants