Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Handling of Any within an Intersection #1

Closed
CarliJoy opened this issue Jul 23, 2023 · 295 comments
Closed

Handling of Any within an Intersection #1

CarliJoy opened this issue Jul 23, 2023 · 295 comments

Comments

@CarliJoy
Copy link
Owner

CarliJoy commented Jul 23, 2023

There is a great deal of confusion about handling the Any type within an Intersection.

In Python, Any is both a top type (a supertype of all types), and a bottom type (a subtype of all types). Python has a gradual typing system, meaning that it will never be required that everything is typed. Everything that is not typed is considered Any.

We examine five ways to handle intersections with Any:

  • Any is removed from intersections: T & Any = T.
  • Intersections containing Any become Any: T & Any = Any.
  • Any is forbidden in intersections: T & Any is an error.
  • Any is not reduced within intersections.
  • Any is only considered in an intersection in deference to non-gradual types.

Remove Any from Intersections

Arguments in favour

Arguments against

An Intersection containing Any becomes Any

Arguments in favour

Arguments against

  • @mikeshardmind argues that this reduces the effectiveness of gradual typing by discarding constraints.

Disallow Any in Intersection

Arguments in favour

Arguments against

  • @randolf-scholz argues that they can arise from sequential instance-checks.
  • @randolf-scholz argues that code could "synthesize an intersection type dynamically".
  • @erictraut argues that "intersections with Any will arise, often in the bowels of the type checker's logic where there's no good way to report an error to the user".

Treat T & Any as irreducible in general

Arguments in favour

Arguments against

  • at least @CarliJoy has no idea how that should work in the reference, how should a type checker handle it in detail?
  • @DiscordLiz and @mikeshardmind each argued that functionally this has all the same issues as treating it as Any

Any is only considered in an intersection in deference to non-gradual types.

Arguments for

Arguments against

  • @NeilGirdhar argues that it breaks logical consistency with unions, which are the dual of intersections.

⚠️ Rules for contribution to this Issue

  • Do not post, if your argument is already handled within the description
  • Posts should include include a proper header i.e. ## Arguments in favour to Disallow Any in Intersection
  • If you want to correct or specify things in the description write a comment in the format "Old" > "New"

The general idea is that I will update the description, allowing the discussion to be included in the PEP and prevent a discussion going in circles.

I will react with 🚀 once I included something in the description.

@mikeshardmind
Copy link
Collaborator

mikeshardmind commented Jul 23, 2023

Arguments in favor of Removing Any from Intersections

Any is essentially the universal set of types. The identity element of intersection on sets is the universal set for domains in which a universal set does exist. The Identity element for unions is the empty set (conceptually, typing.Never) The divergence in behavior is easily explainable as a consequence of logical ordering, and can be taught easily utilizing python builtins any and all and their corresponding behavior, as well as the corresponding set operations.

Edit: Argument retracted, Any's definitional mismatch between ? and Ω does matter here.

@mikeshardmind
Copy link
Collaborator

Arguments against An Intersection containing Any becomes Any

This reduces the effectiveness of gradual typing, as it causes a known constraint of the intersection to be thrown away, causing Any existence of Any to become contagion to code using intersections. This could further fragment the ecosystem between fully typed code and not.

@mikeshardmind
Copy link
Collaborator

mikeshardmind commented Jul 23, 2023

Arguments in favour to Disallow Any in Intersection

It is possible that Any & T is inherently unsafe.

Consider the case Any & AsyncCallback where AsyncCallback is a protocol or abstract base class describing the expected interface a coroutine will be scheduled with. It is possible that a type that satisfies Any conflicts with the required interface of AsyncCallback. (This can also be seen as a reason to remove Any and only treat Any as the universal set, rather than disallow)

@antonagestam
Copy link

I think it's worth mentioning that TypeScript considers T & any = any, though I find this surprising. Perhaps there have been previous discussions there that can be learned from?

I think it's also worth mentioning that the description of Any in the original post here is incomplete. It is both a subtype and a supertype of all other types.

@NeilGirdhar
Copy link
Collaborator

NeilGirdhar commented Jul 24, 2023

Dear everyone.

I'd like to propose a fourth option, which is that T&Any collapses to Any in some situations, collapses to T in others, and is irreducible in general. I organized this post in a few sections:

  • some background information that I think highlights some pitfalls,
  • a mathematical analysis with Python examples,
  • various consequences of the analysis,
  • an argument against universally collapsing intersection with Any, and
  • a very short conclusion.

I briefly discussed this with Jelle who suggested I post this to this thread. Please let me know if I've made any errors.

Background: Unions with Any

Before I get into the meat of it, here's an interesting discrepancy between PyRight and MyPy:

from typing import reveal_type, Any, TypeVar
def f(x: bool, t: str, a: Any):
    y = t if x else a
    reveal_type(y)  # Pyright: str | Any; MyPy: Any
    y.f()  # Pyright: Error; MyPy: Okay

As we can see, Pyright is not collapsing str | Any to Any. It treats it as irreducible.

I think MyPy is doing the wrong thing here. (I discussed this with Jelle who said it was due to "mypy's general behavior to use the join operator for ternary expressions. We'd like to fix that at some point but it's difficult for compatibility reasons.")

The reason this is incorrect is for the same reason it's an error to do this:

from typing import reveal_type, Any, TypeVar
def f(x: bool, t: str, a: int):
    y = t if x else a
    reveal_type(y)  # str | int
    y.strip()  # Cannot access member "strip" for type "int".

Thus, we can see that in general T | Any is not Any. Although, there are cases where it can be collapsed to Any.

Background: Definition of Any

It's also worth exploring what Any really means since that seems to be a source of confusion. Any is not:

  • the universal base class object,
  • nor is it the union of all types (the "universal set" as mentioned in a comment above) Omega = T_1 | T_2 | ... for all types T_i.

That's because its supertypes Sup(Any) = Any whereas Sup(Omega) = ∩_i Sup(T_i) = Sup(object) = object.
This is why we cannot reason from set theory.

Analysis

For any type T, consider its subtypes Sub(T) and supertypes Sup(T). For an intersection, A & B, we have:

  • Sub(A & B) = Sub(A) ∩ Sub(B)
    This means:
def f(y: A & B): ...
if isinstance(x, A) and isinstance(x, B):
  f(x)  # okay.
  • Sup(A & B) = Sup(A) ∪ Sup(B)
    This means:
def g(a: A): ...
def h(b: B): ...
x: A & B
g(x)  # okay
h(x)  # okay

So whereas it is true that each component of the intersection narrows the set of Sub(A&B), it increases the set Sup(A&B). Intuitively, this means that each element is an extra constraint on what you can pass to a function that accepts an intersection, but is a broader interface of what you can do with a variable of type A&B.

We can now deduce the rules:

  • Sub(A&C) = Sub(A) ∩ Sub(C)
  • Sup(A&C) = Sup(A) ∪ Sup(C)

For ordinary types A and C<A, then we get:

  • Sub(A&C) = Sub(C)
  • Sup(A&C) = Sup(C).
    And thus, the regular rule that A&C=C is true for ordinary types.

However, this is not the case when C is Any! Then we have

  • Sub(A&Any) = Sub(A) ∩ Sub(Any) = Sub(A)
  • Sup(A&Any) = Sup(A) ∪ Sup(Any) = Sup(Any) = Omega!

Thus, in general, A&Any is an irreducible form.
But as a parameter of a function (or similar subtype question), we have

class A: pass
class B: pass
def f(x: A&Any): ...
f(A())  # Okay, of course.  A is both a subtype of A and Any.
f(B())  # Error, of course.  

As a parameter to a function (or similar supertype question), we have

class A: pass
class B: pass
def f(b: B): ...
x: A & Any
f(x)  # No problem!  This is because intersection broadens the interface to everything.

Why does intersection broaden the way a variable can be used?

This seems paradoxical, but intersection broadens the way a variable can be used in exactly the same way that its dual, union, narrows the way a variable can be used.

Consider:

class A: ...
class B(A): pass

def g(a: A): pass
def h(b: B): pass

def f(x: bool):
    y = A() if x else B()
    reveal_type(y)  # A | B
    g(y)
    h(y)  # Argument of type A | B cannot be assigned to parameter of type B

Given a union, only those operations that can be done to every element of the union can be done to the union.

This is why any operation that can be done to any element of the intersection can be done to the whole intersection. We can test this in Pyright:

class A:
    def a(self): pass
class B:
    def b(self): pass

def f(x: A):
    assert isinstance(x, B)
    reveal_type(x)  # subclass of A and B; i.e., A & B
    x.a()  # Okay!
    x.b()  # Okay!

Thus, the intersection A&B can do anything that any of its components can do.

Consequences

In a function, the desired behavior of collapsing Any is correct:

def f(x: T & Any) -> None: ...
reveal_type(f)  # (T) -> None

(This is what everyone wants.)

If a variable ends up with type T & Any, for example as the output of a generic function, it should be usable everywhere:

def f[T](x: U) -> T & U: ...
x: Any
y = f(x)
reveal_type(y)  # T & Any

def g(s: str): ....
g(y)  # Always passes!

We need some kind of special case for isinstance(x, T) when x has type Any.

def f(x: T, y: Any):
  assert isinstance(x, U)
  reveal_type(x)  # Ideally, T & U; currently, PyRight can sometimes do this; MyPy gives U.
  assert isinstance(y, T)
  reveal_type(y)  # Must remain T;  Cannot be Any & T or type checking will be severely impacted.

I think the easiest thing to do would be for isinstance to intersect types unless one type is Any, in which case it replaces it. Some thought should go to this to ensure that no weird cases pop up. In the worst case, a new parameter to isinstance may be necessary to select intersection or replacement. This addresses the "contagion" effect mentioned above. It also means that T & Any will be extremely rare, which I think is what everyone wants as well 😄.

Along the same lines, there are other ways to narrow types, including singledispatch, match statements, and function overloads. I think the logical thing to do is to allow Any & T to leak into them and out of them. If this becomes problematic, a user may be able to use isinstance to eliminate Any.

Danger of unconditionally collapsing T & Any to T

Consider:

class A:
    def f(self):
        pass

class B:
    pass

T = TypeVar('T')
def g(x: T) -> T & B:
    assert isinstance(x, B)
    return x

def j() -> A:  # Case 1: return type annotated A; Case 2: Any
    return A()
def k() -> Any:
    return A()

y = g(j())
reveal_type(y)  # A & B
z = g(k())
reveal_type(z)  # Any & B

y.f()  # Allowed.
z.f()  # Only works if Any & B is not collapsed to B.

If intersection collapses types automatically, then z.f() fails even though all that's changed between y and z is that the argument to g has been changed from a type to Any. It would be extremely weird for broadening a type to Any to create type errors! These errors can be very hard to track down since they can be totally unlocalized (one library can change a type to Any, and the error can pop up miles away in user code that doesn't even use that function directly). Thus, these errors would be extremely pernicious.

Conclusion

T & Any is irreducible in general. Its subtypes are the subtypes of T, and its supertypes are the universe of all types.

@mikeshardmind
Copy link
Collaborator

mikeshardmind commented Jul 24, 2023

I entirely disagree with the above. Any is the universal set of python objects and object is not. Not only is this the actual definition of Any per pep 484, but object is clearly not the universal set of objects, as the type of types is type While having a problematic defined interface.

The current definition of Any itself creates contradictions in any attempt at a logically consistent type system

A special kind of type is Any. Every type is consistent with Any. It can be considered a type that has all values and all methods. Note that Any and builtin type object are completely different.

The attempt to treat Any as if it is a subtype or supertype is misguided at best and leads to inconsistent results.

While Any is implemented as a type in python, it isn't conceptually a type, but the set of all types when it comes to how it is treated in type checking and it's purpose in the type system.

Additionally, the claim that intersection should work like Union in the below is a complete misunderstanding of logical operators

class A: pass
class B: pass
def f(b: B): ...
x: A & Any
f(x)  # No problem!  This is because intersection broadens the interface to everything.

Intersection is "all" requirements, Union is "Any" requirements. Intersection should not be seen as broadening the allowable types, that is what Union does. Intersection is more restrictive on the types allowed the more types which are added to it.

Conflating the potential interface with the allowable types is problematic. In the Case of A & Any Things which are not A strictly are not part of the intersection.

The other problem comes in from the other part of the definition of Any per pep 484

A special kind of type is Any. Every type is consistent with Any. It can be considered a type that has all values and all methods. Note that Any and builtin type object are completely different.

This speaks to the potential interface of Any, not to the type of Any. Without changing the definition of Any, this part should make Any unsafe as an operand in an intersection, as by having all potential methods, it also has all potential conflicting method definitions with other operands of the intersection.

consider:

class A:

    def do_something(self):
        ...


class B:

    def do_something(self, x: int, /):
        ...

These are incompatible interfaces for an intersection and Any can be either.

@NeilGirdhar
Copy link
Collaborator

Any is the universal set of python objects

Where do you see in PEP 484 that Any is the universal set? Perhaps it would be best to clarify using mathematical language exactly what you mean by "universal set". It's not the union of all types Omega for the reason I described above. Do you mean something else?

and object is not.

I didn't say object was the universal set. I said that object is the universal base class. This is a technical term. I'll add a link to the wikipedia article.

Intersection should not be seen as broadening the allowable types,

Intersection broadens the allowable parameter types for which x can be passed for the exact same reason that union narrows it. I will add a section on a comparison to unions then to make all of the rules clearer and more obvious. Since unions are implemented today, you will be able to verify these rules yourself.

@mikeshardmind
Copy link
Collaborator

Intersection broadens the allowable parameter types for which x can be passed for the exact same reason that union narrows it. I will add a section on a comparison to unions then to make all of the rules clearer and more obvious. Since unions are implemented today, you will be able to verify these rules yourself.

This is entirely wrong, you are conflating types and their interfaces here, as was already explained to you in the main thread here: python/typing#213 (comment)

I didn't say object was the universal set. I said that object is the universal base class. This is a technical term. I'll add a link to the wikipedia article.

Except that it isn't, you can see this with the type of object being type, and the type of object() being object, I'm quite aware of how the type system works and don't need you to link me a Wikipedia article on the matter.

Where do you see in PEP 484 that Any is the universal set? Perhaps it would be best to clarify using mathematical language exactly what you mean by "universal set". It's not the union of all types Omega for the reason I described above. Do you mean something else?

I've edited in more detail. The term universal set of objects is not used, but definitions consistent with it are. However there's another part of the definition which is problematic by both saying Any is consistent with all objects, and also saying Any has all interfaces. This is quite literally self-contradictory the moment we consider that interfaces can have multiple conflicting definitions.

@NeilGirdhar
Copy link
Collaborator

NeilGirdhar commented Jul 24, 2023

This is entirely wrong, you are conflating types and their interfaces here,

I think the fundamental disagreement here is that you are identifying a type by the set of subtypes only Sub(X)? The inferface (Sup(X)) is also important, and it's affected by union and intersection. This is why X | Any is irreducible in Pyright today.

Except that it isn't, you can see this with the type of object being type, and the type of object() being object,

I think it's fairly well accepted that object is the universal base class since isinstance(x, object) is true for all x, which is the definition.

The term universal set of objects is not used, but definitions consistent with it are.

Normally, in mathematics, the universal set is the union of all elements. Any is not union of all types Omega, so I assume you mean something different.

there's another part of the definition which is problematic by both saying Any is consistent with all objects, and also saying Any has all interfaces.

I don't think this is problematic. This is why I prefer to consider to type expressions X and Y to be equivalent only when Sup(X) == Sup(Y) and Sub(X) == Sub(Y).

I think at heart, the main point of contention is that are thinking about intersection only in parameter lists or similar subtype questions. In those situations, you are right that you can reduce T & Any to T. However, those are not the only situations in which intersections pop up. For example, see the consequences example I gave. Similarly, unions T | Any do reduce to Any in parameter lists and other subtyping questions, and similarly, they are irreducible in other situations (see the example in the background section).

So, I'm not contradicting you in those parameter list situations. However, there are other situations to consider, and it is possible for a revealed to type to meaningfully have type T & Any, which would be irreducible.

@mikeshardmind
Copy link
Collaborator

mikeshardmind commented Jul 24, 2023

This is entirely wrong, you are conflating types and their interfaces here,

I added a section with an example so that you can see for yourself that intersection does indeed broaden what you can do with an object. Is it possible you misunderstood what I'm saying?

your edit adds "Why does intersection broaden the way a variable can be used?" this is the definition of an interface, not a type. You are still conflating the two, and the difference between types and their interfaces is actually why this is such a problem with the current definition of Any with intersections. Please review what other people have told you about this.

Unions are broadening of type, but only the shared interfaces may be safely used because we only know that it has at least one of these interfaces.

Intersections are narrowing of type, but because of the requirement of compatibility with all of an intersection's types, allow using the combined interface of the types.

In any interpretation of Any based on its current definition in accepted peps, Any presents a potential problem here.

Any having all potential interfaces makes Any means that all of the potential interfaces of Any are in conflict with the other operands as they can be defined as incompatible, it has all interfaces, compatible versions of them and not. For this reason, either the definition of Any needs changing or Any is unsafe as an intersection operand.

But even with this, the other type requirement of the intersection does not go away in the process.

The definition of Any is logically inconsistent and intersection and its logical consequences just shine a light on this pre-existing problem.

For what it is worth, mypy provides configuration options forbidding the use of Any specifically because of the problems of Any, and Ruff provides lints disallowing Any for the same reasons.

If it comes between implementing a typing feature in a way that is logically broken, and not allowing some code that already isn't typed to make use of a typing feature, the latter is preferable.

@NeilGirdhar
Copy link
Collaborator

NeilGirdhar commented Jul 24, 2023

this is the definition of an interface, not a type. Y

When we talk about types, they can occur in a variety of places. One of those places is that they can be associated with a variable as a "static type". These static types must encode all of the information that we have about the static type, including the interface (the way in which the variable can be used).

This discussion is about how type checkers should manipulate intersections with static types. Because the interface matters, for the static type of variables, you cannot collapse T&Any to T.

Ultimately, type checkers have to report the static types with reveal_type. And so the types returned by reveal-type will sometimes report T&Any, which is distinct from T.

Maybe for you a type checker's "static type" is not a "type". But this is one sense in which I am discussing types.

Another place types come up is in parameter lists. In that location, T&Any does collapse to T. But type checkers deal with types in a variety of places—not just parameter lists.

in conflict with the other operands as they can be defined as incompatible, and it has all interfaces, compatible versions of them and not.

That's not a conflict for the same reason that T | Any is not a "conflict". You can pass whatever you want to a function that accepts this type.

If it comes between implementing a typing feature in a way that is logically broken, and not allowing some code that already isn't typed to make use of a typing feature, the latter is preferable.

To be honest, I don't think anything is logically broken.

@mikeshardmind
Copy link
Collaborator

mikeshardmind commented Jul 24, 2023

in conflict with the other operands as they can be defined as incompatible, and it has all interfaces, compatible versions of them and not.

That's not a conflict for the same reason that T | Any is not a "conflict". You can simply do whatever you want with such an object of this type.

It is a conflict in the intersection case though. T & Any requires the interfaces of both T and Any. Any has all interfaces, those compatible with T and not.

You can't do whatever you want with T | Any

from typing import Any
from datetime import datetime


def function(x: datetime | Any) -> None:
    x.f()
main.py:6: error: Item "datetime" of "datetime | Any" has no attribute "f"  [union-attr]
Found 1 error in 1 file (checked 1 source file)

This also isn't unique to parameters, as the type also matters when it comes to potential reassignment, it happens even with a direct typing of a variable:

from typing import Any
from datetime import datetime


def function(x: datetime | Any) -> None:
    x.f()


def other_function() -> None:
    x: datetime | Any = datetime.utcnow()
    reveal_type(x)
    x.f()
main.py:6: error: Item "datetime" of "datetime | Any" has no attribute "f"  [union-attr]
main.py:11: note: Revealed type is "Union[datetime.datetime, Any]"
main.py:12: error: Item "datetime" of "datetime | Any" has no attribute "f"  [union-attr]
Found 2 errors in 1 file (checked 1 source file)

You can only use the interface provided by T safely, as you could receive T here, rather than Any. While Any has a compatible interface, you still need to care about the interface of T.

The difference between types and their interfaces matters. you cannot handwave them away with less rigorous definitions, and this is highly important here because the behavior of Union and Intersection are diametrically opposed in their effects of broadening vs narrowing of both interfaces and types

To be honest, I don't think anything is logically broken.

The logical contradiction has already been presented, and you have not given an explanation which satisfactorily shows it isn't in contradiction.

@NeilGirdhar
Copy link
Collaborator

NeilGirdhar commented Jul 24, 2023

You can't do whatever you want with T | Any

Sorry, I meant the dual of "pass whatever you want". You can do whatever you want with T & Any the same way that you can pass whatever you want to a function accepting T | Any.

The difference between types and their interfaces matters.

I think we're going in circles again. The static type includes all of the information known about the type. You seem to be dividing this information into something you're calling the "type" and the "interface". I think your definitions roughly correspond to Sub(X) and Sup(X), but it's hard to say. Nevertheless, when I use the word "type", I mean the static type (as returned by reveal type). The static type must encode what you call the "interface"—it is not distinct.

The logical contradiction has already been presented, and you have not given an explanation which satisfactorily shows it isn't in contradiction.

Just what I said above: "You can do whatever you want with T & Any the same way that you can pass whatever you want for T | Any." There's no contradiction. Why don't you show some code that is ambiguous?

@DiscordLiz
Copy link
Collaborator

There's no contradiction. Why don't you show some code that is ambiguous?

I don't know why you are still arguing this @NeilGirdhar but type theory is pretty clear here and you keep asking for things you've already been provided by people being far too patient with you.

@NeilGirdhar
Copy link
Collaborator

I don't know why you are still arguing this @NeilGirdhar but type theory is pretty clear here and you keep asking for things you've already been provided by people being far too patient with you.

I only posted because Jelle looked my post over and said it looked right. Let's try to keep the discussion civil.

@mikeshardmind
Copy link
Collaborator

mikeshardmind commented Jul 24, 2023

I don't know why you are still arguing this @NeilGirdhar but type theory is pretty clear here and you keep asking for things you've already been provided by people being far too patient with you.

My patience on this is my own, please don't presume that I should be more or less patient with people, thanks.

Why don't you show some code that is ambiguous?

Already have presented it, but I'm fine to explain one more time.

Any is defined as having all interfaces. Some interfaces are incompatible with others, as shown above, but here it is again.

class A:

    def do_something(self):
        ...


class B:

    def do_something(self, x: int, /):
        ...


A & B  # unsatisfyable condition

class AP(Protocol):

    def do_something(self):
        ...


class BP(Protocol):

    def do_something(self, x: int, /):
        ...

AP & BP  # unsatisfyable condition

An intersection must satisfy all of its requirements, conflicting interfaces create a situation where there are no types that could possibly satisfy the requested interface. Any is defined as having all interfaces. So while it may be safe to interact as if any has any version of that interface, it would be impossible to correctly say that Any has a non-conflicting interface with non-infinitely defined interfaces.

@mikeshardmind
Copy link
Collaborator

mikeshardmind commented Jul 24, 2023

the "type" and the "interface". I think your definitions roughly correspond to Sub(X) and Sup(X), but it's hard to say.

This would be a far easier discussion to have from a shared mathematical understanding of type or category theory, rather than the non-rigorous definitions that python currently has adopted, but if python had a mathematically rigorous type system, we wouldn't need to have this discussion currently. The moment you attempt to formalize all of the existing rules into concrete logic, there are some types which can be created at runtime, but are inexpressible by the typing, as well as logical contradictions that go away if you ignore the existence of Any

Sub(X) and Sup(X)

These don't have definitions agreed upon anywhere outside of type theory, and your use of them here does not match the use in type theory. I've seen the doc floating around that proposes using this, but the use isn't consistent with existing type theory, so I've avoided potentially muddying this conversation with the use of one definition where the other might be assumed.

Interface means the obvious thing here, namely the known methods, attributes, etc of an instance of a type, and their signatures and/or types.

Type also means the obvious thing here. (The type which an instance of an object is)

Interfaces and types are largely identical outside of LSP violations when talking about only one type. When discussing unions and intersections of types, this changes.

Treating Any as a singular type with an infinite interface is the mathematical problem here. Python does not have the means to have multiple methods of the same identifier with differing arity/signature like some other languages, so the infinite interface contradicts other interfaces in the case of an intersection where any shared interfaces must be compatible

@NeilGirdhar
Copy link
Collaborator

NeilGirdhar commented Jul 24, 2023

An intersection must satisfy all of its requirements, conflicting interfaces create a situation where there are no types that could possibly satisfy the requested interface. Any is defined as having all interfaces. So while it may be safe to interact as if any has any version of that interface, it would be impossible to correctly say that Any has a non-conflicting interface with non-infinitely defined interfaces.

Oh I see. That's a great point.

Let's see what PyRight does for this:

class A:
    def f(self, x: int): pass
class B:
    def f(self, x: str): pass

def f(x: A):
    assert isinstance(x, B)
    reveal_type(x)  # subclass of A and B
    x.f(2)  # Okay!
    x.f('s')  # Fails; expects int.

So it seems that PyRight is not really intersecting, but rather creating a virtual subclass:

class X(A, B): ...

What are the possible solutions?

  • Forbid intersecting types that have conflicts including __init__. That is going to be seriously problematic since derived classes very often conflict with their parents' __init__ method.
  • Forbid intersecting types that have conflicts other than __init__. That could be reasonable.
  • Allow intersecting types, but only allow the intersection to have an interface of compatible methods.

My preference is probably for case 3 as the most permissive. It would allow you to do something like this:

class A:
  def f(self): ...
  def g(self) -> str: pass

class B:
  def f(self): ...
  def g(self) -> int: pass  # g is different

def g(x):
  assert isinstance(x, A) and isinstance(x, B)  # No subclass like this exists yet, but we may yet create one.
  x.f()  # No problem
  x.g()  # Error!

class C(A, B): ...  # C.g is A.g
class D(B, A): ...  # D.g is B.g
g(C())
g(D())

Type also means the obvious thing here.

The way you've been using type doesn't match what is returned by reveal type in Python, which I think is the source of the confusion. I imagine, you would say that T | Any is not a type either? And yet it's returned by reveal type. So this is the way I've been using it.

Treating Any as a singular type with an infinite interface is the mathematical problem here. Python does not have the means to have multiple methods with differing arity like some other languages, so the infinite interface contradicts other interfaces in the case of an intersection where any shared interfaces must be compatible

Right, that's a fair point.

In cases 1 and 2 above, Any should be forbidden from intersection. In case 3, Sup(T & Any) = object. (This resolves your conflict by basically saying that you can't use the variable except as an object.)

My preference with Any though is to let it leak through so that the interface fo T & Any is Any for the reason described in my long post at the top under "Danger".

@mikeshardmind
Copy link
Collaborator

The way you've been using type doesn't match what is returned by reveal type in Python, which I think is the source of the confusion. I imagine, you would say that T | Any is not a type either? And yet it's returned by reveal type. So this is the way I've been using it.

Correct, I consider T | Any a set of types defined by the union of types which satisfy either T or Any, having the knowable interface shared between them.

I think the fact that python's terminology is so loose compared to more well-defined type theory creates more situations like this than I'd like.

Allow intersecting types, but only allow the intersection to have an interface of compatible methods.

I believe this should already be considered a requirement, but I disagree about it solving Any with Any's current definition.

I think it's possible to redefine Any and Never in terms that resolve this without needing to remove the usefulness of Any and without disrupting any current uses (because the problem only arises currently with the not yet implemented intersection, and other things that have not been accepted yet, such as Not[T])

A definition for Any which would work:

The set of all possible types. Type checkers should allow any operation on things which are only known to be a member Any, as this is the default state of objects. Type checkers may restrict the allowed interface or the known type as more information becomes known about the object through narrowing via instance checks, type checks, or intersections with other types.

And a corresponding definition of Never:

The empty set of types. Functions which always raise never return anything. Objects which have no valid types should be marked with this, and code which is statically typed and attempts to use objects with this type should be considered as an error as this code is either unreachable or there has been a logical error in the types given to the analysis tool

@NeilGirdhar
Copy link
Collaborator

I think the fact that python's terminology is so loose compared to more well-defined type theory creates more situations like this than I'd like.

Right, now that I understand how you're using terms, I can just use your vocabulary.

So my preference is for T & Any to have the interface of Any, and what you're calling the "type" of T. It should be reported as a static type as irreducible T & Any since we need to keep track of all of information.

A definition for Any which would work:

What are the consequences of this defintion for the examples above?

@mikeshardmind
Copy link
Collaborator

mikeshardmind commented Jul 24, 2023

What are the consequences of this defintion for the examples above?

For the case of Any, I believe it leaves all existing code exactly as is due to the restriction on type-checker behavior included. Type checkers already coerce Any to specific types via isinstance, so saying that this is desirable behavior doesn't change anything either. Leaving it as may leaves room for type checkers to not be considered in error if they have an opinion about not narrowing something for either performance, edge case, or any other reasons not considered at this time. This language could be restricted more in the future if python's type system is ever considered to be near-complete, but currently I think may leaves a good amount of freedom, while still having a strong definition of what behavior is allowed here.

The behavior on intersection would result in the type Any & T being reducible to T as rather than being all compatible types, it's merely the set of all types. I don't have a good answer for the impact on the edge case you presented near the start with Any, but I believe that this case is likely to be rare, and only happen in cases where the required interface of T is what is cared about, as users of Any have no benefit for using intersection to say it conforms with other interfaces as Any is already treated that way.

It may be something that type checkers should warn is happening by default or include a configuration setting to inform users is happening

The behavior of Never with these definitions would be entirely as it currently is. The current use of Never is already consistent with the set-based definition.

@NeilGirdhar
Copy link
Collaborator

Incidentally, we should at least then change the language in the top post of this thread to concord with the language you've been using especially because these differ from the existing PEPs. Specifically:

  • we should also provide definitions for "type" and "interface", and
  • most importantly, instead of asking "what type is produced by T & Any", we should ask what "type" and "interface" is produced by T & Any where T is a set of types and an interface, and Any has a set of types and an interface.

@mikeshardmind
Copy link
Collaborator

mikeshardmind commented Jul 24, 2023

Agreed. I didn't realize how far apart we were on definitions initially, despite that I was consciously avoiding certain terms I felt had some ambiguity, there were others which also had mixed meanings in play.

@NeilGirdhar
Copy link
Collaborator

Okay, productive and interesting discussion, I'm going to take a break, but looking forward to seeing the next draft of this proposal. Intersections solve so many type errors; it's a feature I'm really looking foward to.

@mikeshardmind
Copy link
Collaborator

mikeshardmind commented Jul 24, 2023

most importantly, instead of asking "what type is produced by T & Any", we should ask what "type" and "interface" is produced by T & Any where T is a set of types and an interface, and Any has a set of types and an interface.

It may also help to provide some examples, so starting that.

Intersection A & B

Type: Intersection of A and B. If A and B are both concrete types and not protocols, this must be a subclass of both. If either is a protocol, the protocol must be satisfied. If A or B is already an intersection, the intersections are combined.
Interface: The interface of A and B are both available and must not conflict.

(A & B) & C is equivalent to A & B & C

Union A | B

Type: Union of A and B. If A or B are already Unions, The unions are combined

Interface: The attributes, methods, and properties shared between A and B which have compatible types. If A and B use an identifier in conflicting ways, the accessor of the object cannot know which way it is used without checking.

(A | B) | C is equivalent to A | B C

(The below may seem contrived, but I think including it can assure the definitions we end up on remain consistent)

The order of intersections and unions matters

Unions of Intersections (A & B) | (B & C)

Type: a union of the intersection A & B and the intersection B & C
Interface: The shared compatible interface of 1) the combined interfaces of A and of B which must not conflict with each other and 2) the combined interfaces of B and of C which must not conflict with each other

(A & B) | (B & C) is equivalent to (A | C) & B

Intersection of Unions (A | B) & (B | C)

Type: Intersection of the unions A | B and B | C
Interface: The interface of both the unions A | B and B | C, where each of those unions has the shared minimum nonconflicting interface, and no permutation of union elements results in a conflicting interface for the intersection*

(A | B) & (B | C) is not equivalent to B & (A | C), but to (A & C) | (B & (A | C))

* A contrived case, to be sure, but...

class A(Protocol):
    def this(self: Self) -> Self:
        ...

class B(Protocol):
    def that(self: Self, other: T) -> T:
       ...

class C(Protocol):
    def this(self: Self) -> None:
        ...

Problematic = (A | B) & (B | C)

There are two possibilities here

  1. This reduces to: (A & B) | B | (B & C) due to the incompatibility between A & C
  2. This is considered unsatisfiable as a set of types

I believe 2 to be the correct interpretation. 1 results in a different interface than may be expected.

Why this matters when considering Any

This is a set of ordering and satisfiability constraints that works with the proposed update to the definitions of Any and Never. Should these orderings and constraints be considered undesirable, we may be back to the drawing board on handling Any

@tomasr8
Copy link
Collaborator

tomasr8 commented Jul 24, 2023

Thanks for the writeup @mikeshardmind! FWIW, I'm also working on trying to figure out how to combine intersections with unions and I think we mostly agree except for the last case - on the surface it seems like having the usual laws like distributivity satisfied is desirable, but let's maybe discuss it over at discord, so we don't pollute this issue :D

@NeilGirdhar
Copy link
Collaborator

NeilGirdhar commented Jul 24, 2023

I was thinking about chaing the language of "type" and "interface". Let's call this the "literature type".

At the end of the day though, you're going to have to translate the "literature type/interface" into things like:

  • a type returned by reveal_type,
  • the type of a parameter annotation, and
  • the type of a variable annotation.

These three types need to have a canonical representation that represents the literature type and interface—we can call this representation the canonical type. I'm not sure whether it will ultimately be easier to stick with the Python language for this reason.

Either way, there will need to be a canonical way to specify T & Any. If T & Any has a different interface than whatever "literature type" is chosen for T, then it will need a different canonical type than the canonical type of T. I guess we'll have to see what meaning T & Any takes first, but we may end up the irreducible T & Any that I suggested as the canonical type.

@mikeshardmind
Copy link
Collaborator

mikeshardmind commented Jul 24, 2023

I was thinking about chaing the language of "type" and "interface". Let's call this the "literature type".

I'll give some examples of why I don't know that more definitions are strictly helpful and that it's more about understanding that there are some things we should not consider a type, but a specification of what we know about possible/allowed types.

The moment many of these are thought of this way, the distinction between a concrete runtime type and the interface we can semantically and statically show to exist becomes automatic.

In many cases, we don't have a concrete exact type. This is true even without considering Unions or Intersections when it comes to protocols. An annotation of a protocol is not a type, it's an expectation of a required interface and is sometimes referred to as structural typing (which is distinct from typing based on composition or inheritance)

In many places, we are not specifying a type, but what we know about the possible types or interfaces an object might have.

A protocol as an annotation is a specification about a known interface but says nothing about the concrete type other than that instances of it which satisfy it must implement that interface.

This is a protocol taken from real library code. The intent is that a user could provide most choices for a numeric container, including those that I might not have knowledge of as the library author, such as a custom vector class. The library does abstract math and simplification of terms on objects for which these are well-defined operations. This includes symbolic simplifications and in the future higher-order operations on functions of functions.

class SupportsBasicArithmetic(Protocol):
    
    def __add__(self: Self, other: Self) -> Self:
        ...
    
    def __sub__(self: Self, other: Self) -> Self:
        ...
    
    def __mul__(self: Self, other: Self) -> Self:
        ...
    
    def __truediv__(self: Self, other: Self) -> Self:
        ...

    def __eq__(self: Self, other: Self) -> bool:
        ...

Code that accepts parameters annotated with this, has no knowledge of the type of the object they are accepting, only of a required portion of the interface.

Additionally, writing

x: int | Fraction = 1

The type of x at that point in time is int (or Literal[1], but the annotation states that this isn't the intent), but it is annotated to specify that it could also be assigned as a Fraction.

Saying that x has a type of int | Fraction there feels wrong to me in terms of terminology, that's only what types (plural) x is allowed to have.

As to specific things you wanted defined:

a type returned by reveal_type,

With this model of thinking about it, reveal_type is a request to the type checker for what it knows about the potential types of an object (and type(thing) is still the canonical way to get the actual type at runtime)

the type of a parameter annotation, and
the type of a variable annotation.

The annotation is a specification of what must be true about the type of the object for code relying on this annotation to be sound from the perspective of static analysis.... a set of specifications about the allowed type(s) and interface(s) required.


Edit: Actually, we probably still need more definitions accepted in some regard, and many definitions could stand to be made more clear, more rigorous, and/or more consistent, I just don't think we need multiple definitions for type. We just need to consider where we don't have a type, but a description of potential types.

We should still have interface as a defined term. I believe it was defined well in the discussions about protocol and some of the discussions about variance, but I don't think there's a good canonically accepted definition of interface for python currently

@NeilGirdhar
Copy link
Collaborator

NeilGirdhar commented Jul 24, 2023

I just don't think we need multiple definitions for type. We just need to consider where we don't have a type, but a description of potential types.

I understand, but the problem is that I'm not the person that needs to be swayed by the PEP. You're going to communicate with all of the Python people who are used to one set of definition.

For these people, Any is the "canonical type"; so, above, when you claimed that Any was the "universal set", this is incorrect since the universal set for them is Omega.

Now, I've figured what you mean by type ("literature type"); so, I know that for you the literature type LiteratureAny is the universal set. We can speak the same langauge.

But ultimately, you'll need to write a PEP and that PEP is going to have to relate to the other typing PEPs. When it uses symbols to represent types, those symbols have to make sense in the world of symbols that we already have. They have to be intelligible to people who are using those symbols.

If, as I think you're suggesting, the interface to X & Any is going to be different than the interface to X, the irreducible proposal that I had is probably the best way to communicate that—even though it may not be the type theory that you're used to. After all, this is the same choice that Pyright made with irreducible X | Any

Edit: Actually, we probably still need more definitions accepted in some regard, and many definitions could stand to be made more clear, more rigorous, and/or more consistent, I just don't think we need multiple definitions for type. We just need to consider where we don't have a type, but a description of potential types.

Yes, that's fair. Nevertheless, you're going to have to turn that "description" into some canonical representation for display and annotation.

@mikeshardmind
Copy link
Collaborator

The argument in favor of option 5 is that it produces a narrower type, which may or may not be more practical.

You're missing part of the argument for them. It remains consistent with existing specified behavior of subtypes of Any and another known type.

If these false negatives are so pernicious to a user, then my suggestion is for the type checker to optionally warn for them and then the user can either silence the warning or create a wrapper so that the intersection with Any isn't produced.

The problem with a false negative being an optional warning should be obvious and was explained already. There's no on-ramp for a user to know this is important or why, and as an option rather than as the subtyping behavior, it's possible for this to diverge across type checkers.

False positives can be fixed by providing more information as well, and it was shown that it is safer to handle it in that direction already because the method for doing so is to just provide the minimum amount of information if you can't fully type it

@NeilGirdhar
Copy link
Collaborator

It remains consistent with existing specified behavior of subtypes of Any and another known type.

I don't think that behavior is intentional since they cannot currently return an intersection even if they wanted to.

it's possible for this to diverge across type checkers.

If the type checkers think this is a problem, they can turn the warning on by default. MyPy already has similar options:

allow_untyped_defs = True
allow_untyped_calls = True

and it was shown that it is safer

Sorry, I don't agree that "it was shown", and "safety" is a matter of opinion here.

@mikeshardmind
Copy link
Collaborator

mikeshardmind commented Dec 19, 2023

It remains consistent with existing specified behavior of subtypes of Any and another known type.

I don't think that behavior is intentional since they cannot currently return an intersection even if they wanted to.

The behavior was intentional, this was discussed in depth in discord. The behavior was directly accepted to handle cases where things were only partially unknown, such as unittest.Mock, and a baseclass which was not type.

@mikeshardmind
Copy link
Collaborator

This makes it so you can pass a mock object to any function, regardless of
its annotation, which is good because that's usually how you use mock
objects. But it also means that the type checker knows what the signature
of the assert_has_calls method is, and can tell you if you get it wrong.

https://mail.python.org/archives/list/[email protected]/thread/GULRKYI7XOB3FLAEFC6OYSTBS5FIA5PU/

python/cpython#91154

@mark-todd
Copy link
Collaborator

mark-todd commented Dec 19, 2023

Let's examine the practicality of both options. Consider that you have T & Any and then you eventually change it to T & U where

class T:
def f(self, x: int) -> None: ...
class U:
@overload
def f(self, x: int) -> None: ...
@overload
def f(self, x: str) -> None: ...
Under option 4,

f("a") is okay with the annotation T & Any and the annotation T&U,
f(1j) is okay with the annotation T & Any (false negative), but errors with the annotation T & U.
Under option 5,

f("a") is an error with the annotation T & Any (false positive), but passes with the annotation T&U,
f(1j) is an error with the annotation T & Any and the annotation T & U.
If these false negatives are so pernicious to a user, then my suggestion is for the type checker to optionally warn for them and then the user can either silence the warning or create a wrapper so that the intersection with Any isn't produced.

I fully follow your point here - as much as I do really dislike false positives, I think option 5 is the lesser of the two evils. The odds of clashing signatures in T & U seems quite unlikely to me in the real world, and if there was a clash in the way you've described, I suspect one side or the other side you could tweak the name of the method.

@NeilGirdhar
Copy link
Collaborator

NeilGirdhar commented Dec 19, 2023

odds of clashing signatures in T & U seems quite unlikely to me in the real world,

In general, I agree with you that this is a rare case, but consider when U has something like def __call__(self, *args: Any, **kwargs: Any) and T is more strict. There's going to be that one odd case even if it's rare.

And the amount of disagreement I've seen on python-discuss from users who absolutely cannot stand type checkers telling them that correct code is wrong is immense. False positives are despised.

I suspect one side or the other side you could tweak the name of the method.

With something like __call__ you can't change the name of the method. These false positives would need to be ignored. And can they not happen deep in library code where you don't have control nor any way to ignore them?

The behavior was intentional, t

I read both threads and don't see how it is. Mock inheriting from Any is not an intersection between Mock and Any. That actually does give you the "option 5" behavior you want.

Actually, that's pretty interesting It seems like the option 5 behavior is something else entirely. I'm not proposing it, but consider a new operator SubClassOf used as follows:

def f(U: type) -> SubClassOf[T, U]:
  class T(U): ...
  return T

Now, it is true that SubClassOf[T, U] has the deferring behavior that you want even if U is Any, it will prefer T.f regardless of U's definition.

Edit: it's a bit more complicated than I have here (since I don't think the forward reference is allowed), but I think something like this can be made to work and fit with option 5.

@mikeshardmind
Copy link
Collaborator

mikeshardmind commented Dec 19, 2023

I read both threads and don't see how it is. Mock inheriting from Any is not an intersection between Mock and Any. That actually does give you the "option 5" behavior you want.

I'm not saying that Mock inheriting from Any creates an intersection, I'm saying we have a behavior for when Any is one of multiple supertypes and that intersection having differing behavior would be inconsistent with that even with it not being an intersection, and just considering "what happens when we have multiple known supertypes" Note that the order in which Any is present in bases does not change the behavior.

You claimed to want consistency from the rules, so I'm presenting to you the point that option 4 would be inconsistent in terms of subtyping relationships.

@mark-todd
Copy link
Collaborator

In general, I agree with you that this is a rare case, but consider when U has something like def call(self, *args: Any, **kwargs: Any) and T is more strict. There's going to be that one odd case even if it's rare.

Yeah magic methods is good point. Random thought - could we use something in the other half of the intersection to say "This method takes priority - forget the other one!" Then we could have this:

class T:
  @priority_method
  def f(self, x: int) -> None: ...
class U:
  @overload
  def f(self, x: int) -> None: ...
  @overload
  def f(self, x: str) -> None: ...

Actually, that's pretty interesting It seems like the option 5 behavior is something else entirely. I'm not proposing it, but consider a new operator SubClassOf used as follows:

def f(T: type) -> SubClassOf[T, U]:
class Z(T, U): ...
return Z
Now, it is true that SubClassOf[T, U] has the deferring behavior that you want even if U is Any, it will prefer T.f regardless of U's definition.

I actually proposed this on the call last week haha - we concluded it fulfils a different function though and would be a different PEP (but one I would support!)

@NeilGirdhar
Copy link
Collaborator

I actually proposed this on the call last week haha - we concluded it fulfils a different function though and would be a different PEP (but one I would support!)

I'm pretty sure it's just option 5. I support it too—just not as the definition of intersection—even if it is useful in many cases.

'm saying we have a behavior for when Any is one of multiple supertypes

Yes, but that behavior is, in my opinion, a consequence of intersections not existing yet. Type checkers cannot return T.f & Any even if they want to.

@mikeshardmind
Copy link
Collaborator

Yes, but that behavior is, in my opinion, a consequence of intersections not existing yet. Type checkers cannot return T.f & Any even if they want to.

It explicitly is not, please see above quote which was provided showing how it was not and related links.

@DiscordLiz
Copy link
Collaborator

Can we all actually ensure we're reading what's been said before responding? This is not only going to go nowhere, but likely get heated again if people just talk past eachother and I just want to wrap up and focus on any part where we don't agree to figure it out.

@NeilGirdhar
Copy link
Collaborator

It explicitly is not, please see above quote which was provided showing how it was not and related links.

You're talking about this: "This makes it so you can pass a mock object to any function"? This is not the subtyping of (T, Any). This is an object of type T with a base class of Any. And that is closer to SubClassOf than intersection since it necessarily defers to T.

@mark-todd
Copy link
Collaborator

mark-todd commented Dec 19, 2023

Just to summarise currently, I think there's quite a strong movement towards option 5 - if we've discounted all other options, I would suggest we move the conversation to more of a case of solving specific issues with it (like the one demonstrated above) in separate issues for each point. Also, can't speak for anyone else but for me scrolling 300 comments is quite annoying

image

https://docs.google.com/spreadsheets/d/1JLOF0d20olalPHZrR3AoElGY4gpSpl8uDyOafJVFaZ0/edit?usp=sharing

@NeilGirdhar
Copy link
Collaborator

I think you should add the comments of Jelle and Eric who both supported option 4—in fact, the definition of option 4 are from their comments. I can't speak to their feelings about 5 though.

@DiscordLiz
Copy link
Collaborator

You're talking about this: "This makes it so you can pass a mock object to any function"? This is not the subtyping of (T, Any). This is an object of type T with a base class of Any. And that is closer to SubClassOf than intersection since it necessarily defers to T.

@NeilGirdhar You said you read the linked threads, but you've missed something in them important. It was pointed out that even prior to this being accepted for the mock case, it already needed to be the behavior including with multiple inheritence. @mikeshardmind also gave you examples which included multiple inheritence, so can you address it in the full context of the discussion and consistency, or are you just trying to talk past what he said?

@mark-todd
Copy link
Collaborator

I think you should add the comments of Jelle and Eric who both supported option 4—in fact, the definition of option 4 are from their comments. I can't speak to their feelings about 5 though.

Yeah I think it would be helpful to hear from them about option 5 - wasn't sure what to write in the summary as it didn't exist back then, but I'll try and add them a row (I've added None as an option for no opinion known).

@mikeshardmind
Copy link
Collaborator

I'm gonna step away for a bit I may look at this later, but I'm getting tired of having to repeatedly state things and having portions of it quoted out of context, or important details that are inconvenient to someone else being left out in responses. I can understand that some people may not be happy with some of the options, but that's not a reason to completely ignore the arguments being made and ignore details that have directly presented, and this has been a recurring interaction over the course of months.

@mark-todd
Copy link
Collaborator

I'm gonna step away for a bit I may look at this later, but I'm getting tired of having to repeatedly state things and having portions of it quoted out of context, or important details that are inconvenient to someone else being left out in responses. I can understand that some people may not be happy with some of the options, but that's not a reason to completely ignore the arguments being made and ignore details that have directly presented, and this has been a recurring interaction over the course of months.

I do really feel like we're this close to reaching a consensus - there are really only a few edge cases remaining. There is a lot of discussion on this issue and holding all the points and counterpoints in your head at once is difficult, at least for me. I would suggest we try and conclude this issue - then fleshing out the details of the PEP we can figure out in the new year. We work under the assumption that we're going with option 5, but then try and deal with the specific edge cases that might generate false positives etc.

@NeilGirdhar
Copy link
Collaborator

I do really feel like we're this close to reaching a consensus -

I think you really need to get the buy-in of MyPy and Pyright at the least. After all, they are going to review the PEP and their feedback will make or break its passing. I don't think it makes sense to write it up only for it to get rejected. By all means, make the best case as to why 5 is better than 4 if that's what you feel strongly about. I suggest starting by adding links to positive arguments for 5 in this issue header to make it easy to follow.

@NeilGirdhar
Copy link
Collaborator

I'm saying we have a behavior for when Any is one of multiple supertypes and that intersection having differing behavior would be inconsistent with that even with it not being an intersection, and just considering "what happens when we have multiple known supertypes" Note that the order in which Any is present in bases does not change the behavior.

As I said before, I believe that behavior is a consequence of not having intersections already. Do you have a reference for an alternative explanation?

@DiscordLiz
Copy link
Collaborator

@NeilGirdhar Maybe try reading what you said you did.

@mark-todd
Copy link
Collaborator

I think you really need to get the buy-in of MyPy and Pyright at the least. After all, they are going to review the PEP and their feedback will make or break its passing. I don't think it makes sense to write it up only for it to get rejected. By all means, make the best case as to why 5 is better than 4 if that's what you feel strongly about. I suggest starting by adding links to positive arguments for 5 in this issue header to make it easy to follow.

Yeah I do agree - I think I'll try and prepare a document that compares them, although I'd appreciate @mikeshardmind 's input on that one. If not, I will try and gather what I can from the discussions. Then maybe @gvanrossum is there some way you could bring it to them for review?

@DiscordLiz
Copy link
Collaborator

Or here since you're so averse to that:

But it also means that the type checker knows what the signature
of the assert_has_calls method is, and can tell you if you get it wrong.

That part precludes an intersection, as that would no longer tell you "if you get it wrong" for the motivating case.

@NeilGirdhar
Copy link
Collaborator

Yeah I do agree - I think I'll try and prepare a document that compares them

Great idea. You can draw from the top level sections of this issue. I'll try to fill in some arguments for and against if I can.

@erictraut
Copy link
Collaborator

Could someone point me to a full description of options 4 and 5? I'm struggling to understand the nuanced differences between these two. I see some examples that try to demonstrate where the two differ, which is helpful, but I'm looking for a clear articulation of the rules — in particular, how they differ and whether they are fully symmetric with unions.

FWIW, here are my primary criteria for evaluating any proposal:

  1. Is intersection completely symmetric with union? If not, then I think this is a huge red flag.
  2. Can all the behaviors for intersection be derived from a few basic axioms that work with all types in the type system, or do the rules require us to define special cases for Any, Callable[..., Any], etc.? Ideally, there should be no special cases required, and the behavior should fall out of the composition of the basic rules.

@mark-todd
Copy link
Collaborator

mark-todd commented Dec 19, 2023

Could someone point me to a full description of options 4 and 5? I'm struggling to understand the nuanced differences between these two. I see some examples that try to demonstrate where the two differ, which is helpful, but I'm looking for a clear articulation of the rules — in particular, how they differ and whether they are fully symmetric with unions.

FWIW, here are my primary criteria for evaluating any proposal:

  1. Is intersection completely symmetric with union? If not, then I think this is a huge red flag.
  2. Can all the behaviors for intersection be derived from a few basic axioms that work with all types in the type system, or do the rules require us to define special cases for Any, Callable[..., Any], etc.? Ideally, there should be no special cases required, and the behavior should fall out of the composition of the basic rules.

Thanks for this. It took me a while to understand 4 vs 5, but this point helped:
#1 (comment)

I think the only special case is Any? We're currently working on a draft of something to compare them

@DiscordLiz
Copy link
Collaborator

DiscordLiz commented Dec 19, 2023

@erictraut

Could someone point me to a full description of options 4 and 5? I'm struggling to understand the nuanced differences between these two. I see some examples that try to demonstrate where the two differ, which is helpful, but I'm looking for a clear articulation of the rules — in particular, how they differ and whether they are fully symmetric with unions.

@mikeshardmind would probably be the best one to make the argument for 5 over 4, but he's stepped away for now, and reiterated in other places that he is not going to continue the line of discussion today.

Is intersection completely symmetric with union? If not, then I think this is a huge red flag.

Under proposal 5

  • Intersection is symmetric with union for all things which are true by universal property. He linked An issue with the accepted definition of Union #22 when pointing this out.
  • Intersection would be consistent with other examples of subtyping relations already accepted

Under proposal 4

  • Intersection is still symmetrical
  • Intersection would be incosnsitent with other examples of subtyping relations already accepted.

Can all the behaviors for intersection be derived from a few basic axioms that work with all types in the type system, or do the rules require us to define special cases for Any, Callable[..., Any], etc.? Ideally, there should be no special cases required, and the behavior should fall out of the composition of the basic rules.

That depends on if we consider the subtyping behavior we currently have with Any as a basic axiom. It's already a special case. If we allow the same behavior that already exists in the special case, option 5 is the only option which remains fully consistent with all existing prior decisions. If we do not, then option 4 is inconsistent with existing behavior, but follows entirely from basic axioms. "Nothing newly special" was one of the critera he had when designing option 5

As I understood him to mean it, the reasoning for option 5 follows here:

  • If Any & T is irreducible, then it is no longer just Any, and does not have the silent upcasting/downcasting compatability
  • We already have an example of how multiple supertypes with Any as a supertype behaves
  • We should remain consistent with that to reduce the potential surprises for users.
  • option 4 treats Any & T the same as Any for what operations are allowed on the intersection because it intersects every attribute, method, and property with Any as well.
  • option 5 only considers Any after considering the known types. This is consistent with the behavior in existing cases with subtyping where this was shown to be the desired effect. If a user needs to widen it further, they can safely and without a type ignore or cast.

The comments linked by @mark-todd help, one of mine may help as well: #31 (comment) , as well as his[@mikeshardmind ] response to that #31 (comment)

@CarliJoy
Copy link
Owner Author

Hey everybody. I will lock this discussion.
This issues has awy too many comments. Reading it is not easily possible anymore.
We are anyhow IMHO just discussing between Option 4 and 5 and for this #31 exists.

Repository owner locked as too heated and limited conversation to collaborators Dec 19, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

No branches or pull requests