Reachability/exhaustiveness checking for match statements #12010

JukkaL · 2022-01-18T16:05:28Z

Currently this generates a false positive, since mypy doesn't recognize that all the possible values are processed by the match statement:

# Error: Missing return statement
def f(x: int | str) -> int:
    match x:
        case str(v):
            return 0
        case int(v):
            return 1
    # We can't reach here, but mypy doesn't know it

This has the same issue as well:

# Error: Missing return statement
def f(x: int | str) -> int:
    match x:
        case str(v):
            return 0
        case v:
            return 1

JukkaL · 2022-01-18T16:08:15Z

@freundTech Did you have a work-in-progress implementation of exhaustiveness checking? If you have, it would be great if you could create a PR -- maybe I or somebody else can help finish it?

freundTech · 2022-01-18T16:30:31Z

There is a working exhaustiveness checking implementation in https://github.com/freundTech/mypy (feature-match-stmt-exhaustiveness branch. I made it the default branch)

I could make a PR from that branch, but it would need major rebasing. It's probably better to cherry-pick the relevant commits to a new branch instead of rebasing all of it.

The implementation works, but I'm not really happy with it's design. It checks all patterns twice. First ignoring guard statements in order to infer rough types for capture patterns (capture patterns need to be inferred before a guard statement can use them) and then a second time with guard patterns to infer more accurate types and narrow the capture patterns.

I did this because I thought that mypy couldn't broaden a variable's type after it was inferred. Later I found out that it can do that, but didn't have enough time to redo my implementation.

An example to illustrate the problem:

x: object

match x:
    case int(c) if c < 5:
        pass
    case str(c):
        pass

c should be inferred as int | str. To know this we need to look at both cases. Before checking the second case we need to check the first case. Checking the first case includes checking the guard. Checking the guard required c to be inferred first.

JukkaL · 2022-01-18T16:52:30Z

Thanks! Cherry-picking from your branch and potentially tweaking parts of the implementation sounds reasonable. The branch also suggests that we could split this into at least two separate PRs: first one adds basic support and second one adds better support for enums (and literal types).

JukkaL · 2022-01-19T10:36:48Z

There's some relevant discussion starting from this comment: #10191 (comment)

Here's my summary of the ideas discussed above (these can be moved to follow-up issues):

case _: assert False could generate an error if it's reachable
We could have a per-module strictness flag that generates an error if mypy can't prove that a match statement covers all values exhaustively

erictraut · 2022-01-19T16:28:48Z

For what it's worth, here's another discussion on this topic: microsoft/pyright#2569.

I'm personally not a fan of using assert False for this purpose. There are other reasons why someone might want to use assert False in their code. A type checker should not generate errors if it see an assert False statement because it doesn't violate any type rules. I'd prefer to standardize on assert_never and add it to one of the stdlib stubs.

ghost · 2022-01-23T04:41:10Z

I'm wondering if this is the same issue as what I am experiencing?

def testing_return(test: str) -> str:
    match test:
        case "test":
            return test
        case "testing":
            return test

if __name__ == "__main__":
    print(testing_return("test"))

results in:
test.py:1: error: Missing return statement

it seems even valid match case blocks in python 3.10 results in mypy giving the missing return statement error.

JelleZijlstra · 2022-01-23T04:48:18Z

@Pavocracy that error is correct. The function does not return a value if test is any string other than "test" or "testing".

(It doesn't matter that you only call the function with "test" as an argument. Mypy essentially checks each function in isolation.)

ghost · 2022-01-23T06:01:56Z

@JelleZijlstra so mypy will always complain at match blocks that dont contain case _: ?

I did not know that, thank you for the information!

Just to clarify incase this ever shows up in a search result. Simply adding a case _: didn't fix my error.
I now understand I was using the match case incorrectly. I need to made modifications in the match block and have the return statement after the match block.

freundTech · 2022-01-23T09:21:16Z

@Pavocracy Once this issue is solved using a case _: with a return should also work.
Then you could also type test as Literal["test", "testing"] instead of adding another return. That let's mypy know that only "test" and "testing" are valid inputs for this function

TomasPuverle · 2022-01-23T19:35:19Z

I have been following this discussion with great interest; I think "safe by default" should win. It also seems to jive better with PEP 20's "Zen of Python": Explicit is better than implicit, Special cases aren't special enough to break the rules and, most importantly, Errors should never pass silently.

IMO, not handling exhaustiveness is equivalent to the required break inside of a C/C++ style switch statement - generally considered to be bug-prone and a language design mistake.

The overwhelming majority of languages (all?) I am familiar with that support structural pattern matching have exhaustiveness checking - with good reasons. Unless there are strong technical issues, it may be reasonable to go with what's familiar/established. It will also make Python more attractive to domains for which other languages have typically "dominated", such as compilers.

case _: assert False or similar will lead to the same problems as missed breaks. Even if a pattern is exhaustive today, it may not be tomorrow and requiring discipline from programmers is likely not to work "when programming in the large".

Finally, case _: pass will be clearer and shorter to write.

@freundTech

Closes #12010. Mypy can now detect if a match statement covers all the possible values. Example: ``` def f(x: int | str) -> int: match x: case str(): return 0 case int(): return 1 # Mypy knows that we can't reach here ``` Most of the work was done by @freundTech. I did various minor updates and changes to tests. This doesn't handle some cases properly, including these: 1. We don't recognize that `match [*args]` fully covers a list type 2. Fake intersections don't work quite right (some tests are skipped) 3. We assume enums don't have custom `__eq__` methods Co-authored-by: Adrian Freund <[email protected]>

@freundTech

Closes python#12010. Mypy can now detect if a match statement covers all the possible values. Example: ``` def f(x: int | str) -> int: match x: case str(): return 0 case int(): return 1 # Mypy knows that we can't reach here ``` Most of the work was done by @freundTech. I did various minor updates and changes to tests. This doesn't handle some cases properly, including these: 1. We don't recognize that `match [*args]` fully covers a list type 2. Fake intersections don't work quite right (some tests are skipped) 3. We assume enums don't have custom `__eq__` methods Co-authored-by: Adrian Freund <[email protected]>

ruohola · 2023-07-30T12:49:52Z

Is this deficiency (from the commit message of 226661f) tracked somewhere?

This doesn't handle some cases properly, including these:
1. We don't recognize that match [*args] fully covers a list type

alexandermalyga · 2023-08-27T15:53:38Z

Currently the following code still fails with error: Missing return statement:

from dataclasses import dataclass

@dataclass
class Foo:
    value: int | str

def f(x: Foo) -> int:
    match x:
        case Foo(str(v)):
            return 0
        case Foo(int(v)):
            return 1

erictraut · 2023-08-27T16:13:06Z

@alexandermalyga, I wouldn't expect that example to work. Type exhaustion for match statements is implemented using type narrowing. The initial type of the subject expression x is Foo. This type cannot be further narrowed by either the first or second case statements. It remains Foo in both cases. The Python type system has no way to represent the type "an instance of Foo where the attribute value is not a str".

If you want type exhaustion to work with this code, you will need to make a slight modification:

def f(x: Foo) -> int:
    match x.value:
        case str(v):
            return 0
        case int(v):
            return 1

This works because the type of the subject expression x.value is initially str | int. This is narrowed to int if the first case statement doesn't apply. It is further narrowed to Never if the second case statement doesn't apply. The Never type (which is the equivalent of an empty union) indicates that the match statement has exhausted all possible types.

aiven-anton · 2023-08-28T08:38:09Z

@erictraut

I wouldn't expect that example to work.

Fair, but do you expect it to give an error?

The Python type system has no way to represent the type "an instance of Foo where the attribute value is not a str".

Would it be possible to expand on this a little? When you write that the Python type system has no way of representing, do you really mean expressing? It's not clear to me why there would need to be a way to express this in order for a type checker to implement an understanding of it. How does this differ from the situation with intersections for instance, where there is a representation in type checkers but no way to express them?

My understanding is that a Python runtime implementation is free to optimize the given pattern into this:

if isinstance(x, Foo):
    if isinstance(x.value, str):
        return 0
    elif isinstance(x.value, int):
        return 1

Is there some fundamental reason that a type checker couldn't do the same, and apply exhaustiveness checking to that?

I think it's completely fair for a type checker to dismiss this as too complex to implement, or on grounds of being too slow, but is there some inherently unsafe or other (perhaps type-theory-based) reason not to have this functionality in type checkers?

erictraut · 2023-08-28T14:21:58Z

Fair, but do you expect it to give an error?

Given how type exhaustion detection works, yes I'd expect it to not detect exhaustion in this case.

Is there some fundamental reason that a type checker couldn't do the same

The logic in a type checker works on types that are expressible within the type system. It's theoretically possible for a type checker to internally create constructs that extend the type system, but the logic throughout the entire type checker would need to understand these new constructs. There would also need to be some way of expressing these constructs to users in error messages and reveal_type output. So yes, it's theoretically possible to do this, but it's not something I would expect or recommend.

JukkaL added bug mypy got something wrong priority-0-high topic-match-statement Python 3.10's match statement labels Jan 18, 2022

JukkaL mentioned this issue Jan 18, 2022

Support for python 3.10 match statement #10191

Merged

13 tasks

JukkaL mentioned this issue Jan 19, 2022

Release 0.940 planning #12021

Closed

freundTech mentioned this issue Jan 20, 2022

Python 3.10 tracking issue #10201

Closed

21 tasks

TomasPuverle mentioned this issue Jan 23, 2022

Exhaustiveness checks for patten matching microsoft/pyright#2569

Closed

JelleZijlstra mentioned this issue Jan 25, 2022

Mypy reports a missing return statement despite all cases of match statement return something #12071

Closed

JukkaL mentioned this issue Mar 1, 2022

Exhaustiveness checking for match statements #12267

Merged

JukkaL closed this as completed in #12267 Mar 7, 2022

michaeldiamant mentioned this issue Mar 17, 2022

Make ABI types fully specified algorand/pyteal#222

Merged

11 tasks

smheidrich mentioned this issue Sep 16, 2023

Feature request: Detect type mismatch between pattern-matched variable and cases #16128

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Reachability/exhaustiveness checking for match statements #12010

Reachability/exhaustiveness checking for match statements #12010

JukkaL commented Jan 18, 2022

JukkaL commented Jan 18, 2022

freundTech commented Jan 18, 2022

JukkaL commented Jan 18, 2022

JukkaL commented Jan 19, 2022

erictraut commented Jan 19, 2022

ghost commented Jan 23, 2022

JelleZijlstra commented Jan 23, 2022 •

edited

Loading

ghost commented Jan 23, 2022 •

edited by ghost

Loading

freundTech commented Jan 23, 2022

TomasPuverle commented Jan 23, 2022 •

edited

Loading

ruohola commented Jul 30, 2023 •

edited

Loading

alexandermalyga commented Aug 27, 2023

erictraut commented Aug 27, 2023

aiven-anton commented Aug 28, 2023

erictraut commented Aug 28, 2023

Reachability/exhaustiveness checking for match statements #12010

Reachability/exhaustiveness checking for match statements #12010

Comments

JukkaL commented Jan 18, 2022

JukkaL commented Jan 18, 2022

freundTech commented Jan 18, 2022

JukkaL commented Jan 18, 2022

JukkaL commented Jan 19, 2022

erictraut commented Jan 19, 2022

ghost commented Jan 23, 2022

JelleZijlstra commented Jan 23, 2022 • edited Loading

ghost commented Jan 23, 2022 • edited by ghost Loading

freundTech commented Jan 23, 2022

TomasPuverle commented Jan 23, 2022 • edited Loading

ruohola commented Jul 30, 2023 • edited Loading

alexandermalyga commented Aug 27, 2023

erictraut commented Aug 27, 2023

aiven-anton commented Aug 28, 2023

erictraut commented Aug 28, 2023

JelleZijlstra commented Jan 23, 2022 •

edited

Loading

ghost commented Jan 23, 2022 •

edited by ghost

Loading

TomasPuverle commented Jan 23, 2022 •

edited

Loading

ruohola commented Jul 30, 2023 •

edited

Loading