Union and Intersections #3

CarliJoy · 2023-07-24T16:48:55Z

Intersection A & B

Type: Intersection of A and B. If A and B are both concrete types and not protocols, this must be a subclass of both. If either is a protocol, the protocol must be satisfied. If A or B is already an intersection, the intersections are combined.
Interface: The interface of A and B are both available and must not conflict.

(A & B) & C is equivalent to A & B & C

Union A | B

Type: Union of A and B. If A or B are already Unions, The unions are combined

Interface: The attributes, methods, and properties shared between A and B which have compatible types. If A and B use an identifier in conflicting ways, the accessor of the object cannot know which way it is used without checking.

(A | B) | C is equivalent to A | B C

(The below may seem contrived, but I think including it can assure the definitions we end up on remain consistent)

The order of intersections and unions matters

Unions of Intersections (A & B) | (B & C)

Type: a union of the intersection A & B and the intersection B & C
Interface: The shared compatible interface of 1) the combined interfaces of A and of B which must not conflict with each other and 2) the combined interfaces of B and of C which must not conflict with each other

(A & B) | (B & C) is equivalent to (A | C) & B

Intersection of Unions (A | B) & (B | C)

Type: Intersection of the unions A | B and B | C
Interface: The interface of both the unions A | B and B | C, where each of those unions has the shared minimum nonconflicting interface, and no permutation of union elements results in a conflicting interface for the intersection*

(A | B) & (B | C) is not equivalent to B & (A | C), but to (A & C) | (B & (A | C))

* A contrived case, to be sure, but...

class A(Protocol):
    def this(self: Self) -> Self:
        ...

class B(Protocol):
    def that(self: Self, other: T) -> T:
       ...

class C(Protocol):
    def this(self: Self) -> None:
        ...

Problematic = (A | B) & (B | C)

There are two possibilities here

This reduces to: (A & B) | B | (B & C) due to the incompatibility between A & C
This is considered unsatisfiable as a set of types

I believe 2 to be the correct interpretation. 1 results in a different interface than may be expected.

Why this matters when considering Any

This is a set of ordering and satisfiability constraints that works with the proposed update to the definitions of Any and Never. Should these orderings and constraints be considered undesirable, we may be back to the drawing board on handling Any

Originally posted by @mikeshardmind in #1 (comment)

NeilGirdhar · 2023-07-24T17:09:24Z

This looks okay, but I don't think "The interface of A and B are both available and must not conflict." is going to be reasonable in general. You can always have two interfaces with functions that have different parameter lists, but an intersection is still possible because the parameter list is contravariant. So a derived type that inherits from the members of the intersection can simply resolve the problem by having overloads (for example) that accomodate all of its parent classes.

Also, it may be really expensive to test that entire interfaces are compatible. It would be a lot easier to wait until a call is performed before verifying the call.

I think the verification should be to consider only the set of any elements that support that call, and then use as a return value the intersection of return types (due to covariance).

CarliJoy · 2023-07-24T17:37:07Z

I would be very happy if someone could write the something in the specification
@tomasr8 already offered, so keep him in the loop.

mikeshardmind · 2023-07-24T18:16:57Z

So a derived type that inherits from the members of the intersection can simply resolve the problem by having overloads (for example) that accomodate all of its parent classes.

overloads do not solve this. A & B says that this must satisfy both A and B. This is not possible in some cases by strict incompatibility. artificially adding overloads cannot bandaid over this suitably, and even detecting where it could theoretically would mean that it is not enough to just be a subclass of both A and B, but that you then must reimplement any interfaces which conflicted to accept the overload. This is an unreasonable ask for type checkers for something that won't even work all of the time. Consider these two interfaces:

class A:
    foo: int

    @staticmethod
    def more_complicated() -> int:
        ...

class B:
    foo: str

    def more_complicated(self, something: int) -> str:
        ...

There is no case for overloads here, and this violates requesting compatible interfaces with both A and with B by the intersection. I can't safely do something.foo and know the type anymore, and I can't even safely attempt to call something.more_complicated to then check the resulting type.

Requesting an impossible intersection should be treated as an error.

CarliJoy · 2023-07-24T18:28:06Z

@mikeshardmind this is a different issue, lets discuss this in #5

NeilGirdhar · 2023-07-24T19:21:37Z

I can't even safely attempt to call something.more_complicated to then check the resulting type.

I was thinking of having a non-empty intersection type if the return types of the function are compatible—even if the parameter lists are totally different:

from typing import overload, Any

class A:
    def more_complicated(self, something: A, cat: float = 1.0, dog: int = 14) -> int:
        ...

class B:
    def more_complicated(self, something: B) -> int:
        ...

class C(A, B):
    @overload
    def more_complicated(self, something: A, cat: float, dog: int) -> int:
        ...

    @overload
    def more_complicated(self, something: B) -> int:
        ...

    def more_complicated(self, something: A | B, cat: float = 0.0, dog: int = 1) -> int:
        return 1

Regardless of how permissive the parameter list is taken to be, I don't think it should be an error to intersect two types that have interface conflicts. It's too easy for this to happen especially when you don't control other people's code. And the intersections can also happen in other people's code, so you would end up with type errors that you can't correct.

I suggest you at least wait for the call or attribute access. For functions, I suggest that you do no more than try the arguments on each function that was defined for an element of the intersection. Don't try to verify that the functions are "compatible" since you may not be able to do so. What if one of the functions is a catch-all like "def f(self, *args, **kwargs)`? That should be fine, in my opinion—the hypothetical derived type could do the same after all. For the return type, choose the intersection of all return types of the matching functions (covariance).

I think one of the things that may be missing from this discussion is that the intersection may not be visible to the code producing the intersection. So there may be no way to provide it. It's very important for valid, reasonable code to pass type checking.

mikeshardmind · 2023-07-24T19:32:24Z

This makes an intersection no more useful than what already exists. I'd rather manually write a protocol that describes the intended behavior where there is incompatibility than have a declaration that I expect compatibility between types to exist include the type system not informing me when there isn't, and then inserting an overload that I then need to handle wherever that is used.

Furthermore, in the case of concrete types, it's already invalid to change the signature of a method of a subclass, so how would A & B ever end up with this being needed without prior issues?

NeilGirdhar · 2023-07-24T19:38:59Z

. I'd rather manually write a protocol that describes the intended behavior where there is incompatibility t

You can't always do this. The intersections can occur in generic code, and you may not be able to engineer some protocol for an intersection that doesn't exist anywhere in your code. And it's extremely tedious to do this in general, and especially in generic code.

than have a declaration that I expect compatibility between types to exist include the type system not informing me when there isn't, and then inserting an overload that I then need to handle wherever that is used.

This is a fair point. I withdraw my most permissive suggestion of imagining broad parameter lists, and suggest the trial approach of trying the parameters on each function with the same name. And then intersecting the returned types. Thus, the above example would work for:

x: A & B
x.f(x)  # Matches A.f, which returns int, and B.f, which returns int.  Thus, it returns int & int = int.

DiscordLiz · 2023-07-24T19:46:17Z

I can't even safely attempt to call something.more_complicated to then check the resulting type.

I was thinking of having a non-empty intersection type if the return types of the function are compatible—even if the parameter lists are totally different:

from typing import overload, Any

class A:
    def more_complicated(self, something: A, cat: float = 1.0, dog: int = 14) -> int:
        ...

class B:
    def more_complicated(self, something: B) -> int:
        ...

class C(A, B):
    @overload
    def more_complicated(self, something: A, cat: float, dog: int) -> int:
        ...

    @overload
    def more_complicated(self, something: B) -> int:
        ...

    def more_complicated(self, something: A | B, cat: float = 0.0, dog: int = 1) -> int:
        return 1

This isn't a valid subclass of A or of B.

main.py:12: error: Signature of "more_complicated" incompatible with supertype "A" [override]

you can't override methods in subclasses like this. (@mikeshardmind also already told you this isn't valid)

The lack of being able to do this makes this impossible, and there are very good reasons not to remove that restriction as it breaks assumptions that sub classes are valid substitutes

It may be possible for protocols by constructing a virtual subclass, but I don't like that for other reasons as I brought up in the other thread

NeilGirdhar · 2023-07-25T01:34:12Z

This isn't a valid subclass of A or of B.

It is valid. Try it in PyRight; MyPy is wrong here. Apparently, MyPy wants defaults for some reason and then it passes in MyPy too.

Edit: It is valid with defaults only.

mikeshardmind · 2023-07-25T01:40:54Z

pyright is wrong about this being a valid subclass of both A and B. The fact that you wrote "MyPy wants defaults for some reason" without understanding the reason is part of the issue here.

Without the defaults, this fails the ability to substitute your subclass for the original and have it be a drop in replacement.

You also constructed a case where overloads can be made to bandage around it and did not address the case I provided where this isn't possible at all.

erictraut · 2023-07-25T01:47:57Z

I'm concerned about the complexity of what's being proposed here and where this discussion has led.

From a type theory standpoint, I think it is sufficient to simply document that unions distribute over intersections and intersections distribute over unions.

(A | B) & C <=> (A & C) | (B & C)
(A & B) | C <=> (A | C) & (B | C)

Other rules should follow from these. TypeScript follows these distributive rules, and I presume other languages that support intersections do as well. (If you haven't done a survey of other languages that support intersections, I'd encourage you to do so. I feel like some of the apparent debates could be avoided if such a survey existed.)

As I mentioned in another thread, I encourage the authors of the PEP to avoid calling out specific handling of protocols, concrete types, etc. The rules for intersections will be easier to understand, more consistent, and more composable with existing type features if the PEP simply describes what happens in terms of existing type rules.

Other rules that are relevant for unions and intersections:
Subtyping
C is a subtype of A | B if C is a subtype of A or B
C is a subtype of A & B if C is a subtype of A and B

Attribute type evaluation
If ab has the type A | B, the type of ab.foo is invalid if foo is missing from either A or B; if foo is present on both, then ab.foo is the union of the types of A.foo and B.foo.
If ab has the type A & B, the type of ab.foo is invalid if foo is missing from both A and B; if foo is present on one, then ab.foo is that type; if it's present on both, then it's the intersection of A.foo and B.foo.

Using these general rules, we can avoid bikeshedding about whether a particular overload should be considered a subtype of a particular function, etc. Those discussions may or may not be worth our time, but they should be separate from the discussion about intersections.

NeilGirdhar · 2023-07-25T02:02:15Z

pyright is wrong about this being a valid subclass of both A and B. The fact that you wrote "MyPy wants defaults for some reason" without understanding the reason is part of the issue here.

Ah, my mistake. Well, consider the overload to have defaults then.

You also constructed a case where overloads can be made to bandage around it and did not address the case I provided where this isn't possible at all.

Like I said, Python typing doesn't have good support for the descriptor protocol yet. You can provide theoretical subclasses that work around that too.

Using these general rules, we can avoid bikeshedding about whether a particular overload should be considered a subtype of a particular function, etc. Those discussions may or may not be worth our time, but they should be separate from the discussion about intersections.

I agree. My only point is that we shouldn't block intersection just because the parameter lists of various functions are different. Either leave this up to type checkers or specify the most permissive and correct thing.

CarliJoy · 2023-07-25T17:30:38Z

I agree. My only point is that we shouldn't block intersection just because the parameter lists of various functions are different. Either leave this up to type checkers or specify the most permissive and correct thing.

@NeilGirdhar what you mean by block? Block writing the PEP/introducing Intersections or block users from using them in this case?

NeilGirdhar · 2023-07-25T21:12:28Z

what you mean by block? B

I just mean that I don't want it to be an error to try to use A & B in a case like:

class A:
  def f(self, x: int) -> None: ...
class B:
  def f(self, x: str) -> None: ...

That's what I tried to explain here. Just because the methods are not "compatible", doesn't mean that a subclass of A and B might not be possible. I think forcing the user to use protocols is unreasonable.

mikeshardmind · 2023-07-26T00:43:06Z

what you mean by block? B

I just mean that I don't want it to be an error to try to use A & B in a case like:
class A:
  def f(self, x: int) -> None: ...
class B:
  def f(self, x: str) -> None: ...
That's what I tried to explain here. Just because the methods are not "compatible", doesn't mean that a subclass of A and B might not be possible. I think forcing the user to use protocols is unreasonable.

I think what was said here, namely:

Possibly the type of ab["a"] should actually be int & str, not int | str

solves this without mandating anything additional.

If (A & B).foo = A.foo & B.foo, then we may not mandate how something satisfies both, but it doesn't preclude that something could. (The rule needs to be slightly more verbose than this, but this is how it would express for this case)

randolf-scholz · 2023-07-27T22:43:36Z

@NeilGirdhar I'm not sure what you mean by the methods not being compatible. There is no incompatibility when specifying A&B in the example you gave. If you follow #3 (comment), then (A&B).foo would be any subtype of Callable[[int], None] & Callable[[str], None].

Now, function types Callable[[A], B] are contravariant in the input and covariant in the output, i.e. Callable[[C], D] is a subtype of Callable[[A], B] if and only if C≥A and D≤B. So, following the subtyping rules:

Callable[[X], Y] ≤ Callable[[int], None] & Callable[[str], None]
⟺ (X ≥ int and Y ≤ None) and (X ≥ str and Y ≤ None)
⟺ X ≥ int and X ≥ str and Y ≤ None

I.e. X must be a supertype of both int and str. Set-theoretically, we can assign the join, i.e. the least common supertype of int and str, i.e. X = ⋂_{Y: Y≥int and Y≥str} Y. In the actual class hierarchy, this is likely simply object, but note that structurally also int | str works, i.e.

class C(A, B):
    def f(self, x: int | str) -> None: ...

Would be a valid subclass of A & B.

NeilGirdhar · 2023-07-28T05:44:03Z

@NeilGirdhar I'm not sure what you mean by the methods not being compatible. There is no incompatibility when specifying A&B in the example you gave. If you follow #3 (comment), then (A&B).foo would be any subtype of Callable[[int], None] & Callable[[str], None].

I know that there's no compatability problem. That's what I was showing. If you look at some of the comments that I was responding to, you can see the implication that parameters lists might matter. What I said in #3 (comment) was that "because the parameter list is contravariant... a derived type that inherits from the members of the intersection can simply resolve the problem by having overloads (for example) that accomodate all of its parent classes." Which appears to be what you've repeated here?

NeilGirdhar · 2023-08-04T16:24:57Z

I'm trying to organize the issues. If this is a summary of #5, then this issue appears to be a special case of it? After all:

There are two possibilities here

This reduces to: (A & B) | B | (B & C) due to the incompatibility between A & C

This is considered unsatisfiable as a set of types

The first case here is equivalent to impossible intersection reducing to Never? The second is equivalent to it being some kind error?

mikeshardmind · 2023-08-04T19:03:36Z

@NeilGirdhar accurate, but the second option was ruled out for the same reason one of the others was in #5, can come up in a place the type checker may not be able to raise it, so eliminating a case in reduction is fine, erroring on it isn't.

Whether reduction needs to happen or not is still being worked through(?)

NeilGirdhar · 2023-08-04T19:20:30Z

so eliminating a case in reduction is fine, erroring on it isn't.

Yes, I agree with that.

I'm not trying to express an opinion though. I just want to know if we can close this issue as a special case of #5, and do all the discussion there?

mikeshardmind · 2023-08-04T19:21:05Z

I think so now that we got to that being the decision left here, closing this up

CarliJoy mentioned this issue Jul 24, 2023

Handle "impossible" intersections #5

Closed

This was referenced Jul 25, 2023

TypedDict rules are unsound #6

Closed

Handling of Any within an Intersection #1

Closed

CarliJoy mentioned this issue Jul 25, 2023

General Organisation / How to continue / Consensus Finding #8

Open

mikeshardmind closed this as completed Aug 4, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Union and Intersections #3

Union and Intersections #3

CarliJoy commented Jul 24, 2023 •

edited

Loading

NeilGirdhar commented Jul 24, 2023 •

edited

Loading

CarliJoy commented Jul 24, 2023

mikeshardmind commented Jul 24, 2023 •

edited

Loading

CarliJoy commented Jul 24, 2023

NeilGirdhar commented Jul 24, 2023 •

edited

Loading

mikeshardmind commented Jul 24, 2023

NeilGirdhar commented Jul 24, 2023 •

edited

Loading

DiscordLiz commented Jul 24, 2023 •

edited

Loading

NeilGirdhar commented Jul 25, 2023 •

edited

Loading

mikeshardmind commented Jul 25, 2023

erictraut commented Jul 25, 2023 •

edited

Loading

NeilGirdhar commented Jul 25, 2023 •

edited

Loading

CarliJoy commented Jul 25, 2023

NeilGirdhar commented Jul 25, 2023

mikeshardmind commented Jul 26, 2023 •

edited

Loading

randolf-scholz commented Jul 27, 2023 •

edited

Loading

NeilGirdhar commented Jul 28, 2023

NeilGirdhar commented Aug 4, 2023 •

edited

Loading

mikeshardmind commented Aug 4, 2023

NeilGirdhar commented Aug 4, 2023

mikeshardmind commented Aug 4, 2023

Union and Intersections #3

Union and Intersections #3

Comments

CarliJoy commented Jul 24, 2023 • edited Loading

Intersection A & B

Union A | B

Unions of Intersections (A & B) | (B & C)

Intersection of Unions (A | B) & (B | C)

Why this matters when considering Any

NeilGirdhar commented Jul 24, 2023 • edited Loading

CarliJoy commented Jul 24, 2023

mikeshardmind commented Jul 24, 2023 • edited Loading

CarliJoy commented Jul 24, 2023

NeilGirdhar commented Jul 24, 2023 • edited Loading

mikeshardmind commented Jul 24, 2023

NeilGirdhar commented Jul 24, 2023 • edited Loading

DiscordLiz commented Jul 24, 2023 • edited Loading

NeilGirdhar commented Jul 25, 2023 • edited Loading

mikeshardmind commented Jul 25, 2023

erictraut commented Jul 25, 2023 • edited Loading

NeilGirdhar commented Jul 25, 2023 • edited Loading

CarliJoy commented Jul 25, 2023

NeilGirdhar commented Jul 25, 2023

mikeshardmind commented Jul 26, 2023 • edited Loading

randolf-scholz commented Jul 27, 2023 • edited Loading

NeilGirdhar commented Jul 28, 2023

NeilGirdhar commented Aug 4, 2023 • edited Loading

mikeshardmind commented Aug 4, 2023

NeilGirdhar commented Aug 4, 2023

mikeshardmind commented Aug 4, 2023

CarliJoy commented Jul 24, 2023 •

edited

Loading

NeilGirdhar commented Jul 24, 2023 •

edited

Loading

mikeshardmind commented Jul 24, 2023 •

edited

Loading

NeilGirdhar commented Jul 24, 2023 •

edited

Loading

NeilGirdhar commented Jul 24, 2023 •

edited

Loading

DiscordLiz commented Jul 24, 2023 •

edited

Loading

NeilGirdhar commented Jul 25, 2023 •

edited

Loading

erictraut commented Jul 25, 2023 •

edited

Loading

NeilGirdhar commented Jul 25, 2023 •

edited

Loading

mikeshardmind commented Jul 26, 2023 •

edited

Loading

randolf-scholz commented Jul 27, 2023 •

edited

Loading

NeilGirdhar commented Aug 4, 2023 •

edited

Loading