Handling of Any within an Intersection #1
Comments
Arguments in favor of Removing Any from Intersections
Any is essentially the universal set of types. The identity element of intersection on sets is the universal set, for domains in which a universal set does exist. The identity element for unions is the empty set (conceptually, typing.Never). The divergence in behavior is easily explainable as a consequence of logical ordering, and can be taught easily utilizing Python builtins.
Edit: Argument retracted, Any's definitional mismatch between |
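The set identities mentioned above can be checked with Python's built-in frozenset. This is only a sketch of the analogy: the small `universe` set is a made-up stand-in for "all types", and nothing here says how a type checker treats Any.

```python
# Hypothetical finite "universe" standing in for the set of all types.
universe = frozenset({"int", "str", "bytes", "float"})
empty = frozenset()  # plays the role of the empty set (conceptually, typing.Never)

some_types = frozenset({"int", "str"})

# The universal set is the identity element for intersection.
intersection_with_universe = universe & some_types

# The empty set is the identity element for union.
union_with_empty = empty | some_types

# The roles swap for the absorbing elements.
intersection_with_empty = empty & some_types
union_with_universe = universe | some_types
```

Both identity results equal `some_types`, while the last two results collapse to the empty set and the universe respectively.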
Arguments against An Intersection containing Any becomes Any
This reduces the effectiveness of gradual typing, as it causes a known constraint of the intersection to be thrown away, making any occurrence of Any contagious to code using intersections. This could further fragment the ecosystem between fully typed code and code that is not. |
Arguments in favour to Disallow Any in Intersection
It is possible that Any & T is inherently unsafe. Consider the case |
I think it's worth mentioning that TypeScript considers I think it's also worth mentioning that the description of Any in the original post here is incomplete. It is both a subtype and a supertype of all other types. |
Dear everyone. I'd like to propose a fourth option, which is that
I briefly discussed this with Jelle who suggested I post this to this thread. Please let me know if I've made any errors. Background: Unions with
|
I entirely disagree with the above. The current definition of Any itself creates contradictions in any attempt at a logically consistent type system
The attempt to treat Any as if it is a subtype or supertype is misguided at best and leads to inconsistent results. While Any is implemented as a type in Python, it isn't conceptually a type, but the set of all types when it comes to how it is treated in type checking and its purpose in the type system. Additionally, the claim that intersection should work like Union in the below is a complete misunderstanding of logical operators
Intersection is "all" requirements, Union is "Any" requirements. Intersection should not be seen as broadening the allowable types, that is what Union does. Intersection is more restrictive on the types allowed the more types which are added to it. Conflating the potential interface with the allowable types is problematic. In the Case of The other problem comes in from the other part of the definition of Any per pep 484
This speaks to the potential interface of Any, not to the type of Any. Without changing the definition of Any, this part should make Any unsafe as an operand in an intersection, as by having all potential methods, it also has all potential conflicting method definitions with other operands of the intersection. Consider:

    class A:
        def do_something(self):
            ...

    class B:
        def do_something(self, x: int, /):
            ...

These are incompatible interfaces for an intersection and Any can be either. |
Where do you see in PEP 484 that
I didn't say
Intersection broadens the allowable parameter types for which |
This is entirely wrong, you are conflating types and their interfaces here, as was already explained to you in the main thread here: python/typing#213 (comment)
Except that it isn't, you can see this with the type of
I've edited in more detail. The term universal set of objects is not used, but definitions consistent with it are. However there's another part of the definition which is problematic by both saying |
I think the fundamental disagreement here is that you are identifying a type by the set of subtypes only
I think it's fairly well accepted that
Normally, in mathematics, the universal set is the union of all elements.
I don't think this is problematic. This is why I prefer to consider two type expressions X and Y to be equivalent only when I think at heart, the main point of contention is that you are thinking about intersection only in parameter lists or similar subtype questions. In those situations, you are right that you can reduce So, I'm not contradicting you in those parameter list situations. However, there are other situations to consider, and it is possible for a revealed type to meaningfully have type |
your edit adds "Why does intersection broaden the way a variable can be used?" this is the definition of an interface, not a type. You are still conflating the two, and the difference between types and their interfaces is actually why this is such a problem with the current definition of Any with intersections. Please review what other people have told you about this. Unions are broadening of type, but only the shared interfaces may be safely used because we only know that it has at least one of these interfaces. Intersections are narrowing of type, but because of the requirement of compatibility with all of an intersection's types, allow using the combined interface of the types. In any interpretation of Any based on its current definition in accepted peps, Any presents a potential problem here. Any having all potential interfaces makes But even with this, the other type requirement of the intersection does not go away in the process. The definition of Any is logically inconsistent and intersection and its logical consequences just shine a light on this pre-existing problem. For what it is worth, mypy provides configuration options forbidding the use of Any specifically because of the problems of Any, and Ruff provides lints disallowing Any for the same reasons. If it comes between implementing a typing feature in a way that is logically broken, and not allowing some code that already isn't typed to make use of a typing feature, the latter is preferable. |
When we talk about types, they can occur in a variety of places. One of those places is that they can be associated with a variable as a "static type". These static types must encode all of the information that we have about the static type, including the interface (the way in which the variable can be used). This discussion is about how type checkers should manipulate intersections with static types. Because the interface matters, for the static type of variables, you cannot collapse Ultimately, type checkers have to report the static types with Maybe for you a type checker's "static type" is not a "type". But this is one sense in which I am discussing types. Another place types come up is in parameter lists. In that location,
That's not a conflict for the same reason that
To be honest, I don't think anything is logically broken. |
It is a conflict in the intersection case though. You can't do whatever you want with T | Any

    from typing import Any
    from datetime import datetime

    def function(x: datetime | Any) -> None:
        x.f()

This also isn't unique to parameters, as the type also matters when it comes to potential reassignment. It happens even with a direct typing of a variable:

    from typing import Any
    from datetime import datetime

    def function(x: datetime | Any) -> None:
        x.f()

    def other_function() -> None:
        x: datetime | Any = datetime.utcnow()
        reveal_type(x)
        x.f()

You can only use the interface provided by T safely, as you could receive T here, rather than Any. While Any has a compatible interface, you still need to care about the interface of T. The difference between types and their interfaces matters. You cannot handwave them away with less rigorous definitions, and this is highly important here because the behavior of Union and Intersection are diametrically opposed in their effects of broadening vs narrowing of both interfaces and types
The logical contradiction has already been presented, and you have not given an explanation which satisfactorily shows it isn't in contradiction. |
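A runtime sketch of the point about datetime | Any: the annotation admits a real datetime, so a method that exists only on some other object is not safe to call. The `HasF` class, the `try_call` helper and the method name `f` are assumptions made purely for illustration.

```python
from datetime import datetime
from typing import Any, Union


class HasF:
    """Hypothetical class that happens to provide f()."""

    def f(self) -> str:
        return "called f"


def function(x: Union[datetime, Any]) -> str:
    # The annotation is datetime | Any; at runtime x may be either kind of object.
    return x.f()


def try_call(x: Union[datetime, Any]) -> str:
    # Hypothetical helper: report the failure instead of raising.
    try:
        return function(x)
    except AttributeError:
        return "AttributeError"


result_for_has_f = try_call(HasF())
result_for_datetime = try_call(datetime(2023, 1, 1))
```

The call succeeds for `HasF` and fails for a plain `datetime`, which is why only the interface of T can be relied on for T | Any.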
Sorry, I meant the dual of "pass whatever you want". You can do whatever you want with
I think we're going in circles again. The static type includes all of the information known about the type. You seem to be dividing this information into something you're calling the "type" and the "interface". I think your definitions roughly correspond to Sub(X) and Sup(X), but it's hard to say. Nevertheless, when I use the word "type", I mean the static type (as returned by reveal type). The static type must encode what you call the "interface"—it is not distinct.
Just what I said above: "You can do whatever you want with |
I don't know why you are still arguing this @NeilGirdhar but type theory is pretty clear here and you keep asking for things you've already been provided by people being far too patient with you. |
I only posted because Jelle looked my post over and said it looked right. Let's try to keep the discussion civil. |
My patience on this is my own, please don't presume that I should be more or less patient with people, thanks.
Already have presented it, but I'm fine to explain one more time. Any is defined as having all interfaces. Some interfaces are incompatible with others, as shown above, but here it is again.

    from typing import Protocol

    class A:
        def do_something(self):
            ...

    class B:
        def do_something(self, x: int, /):
            ...

    A & B  # unsatisfiable condition

    class AP(Protocol):
        def do_something(self):
            ...

    class BP(Protocol):
        def do_something(self, x: int, /):
            ...

    AP & BP  # unsatisfiable condition

An intersection must satisfy all of its requirements; conflicting interfaces create a situation where there are no types that could possibly satisfy the requested interface. Any is defined as having all interfaces. So while it may be safe to interact as if Any has any version of that interface, it would be impossible to correctly say that Any has a non-conflicting interface with non-infinitely defined interfaces. |
This would be a far easier discussion to have from a shared mathematical understanding of type or category theory, rather than the non-rigorous definitions that python currently has adopted, but if python had a mathematically rigorous type system, we wouldn't need to have this discussion currently. The moment you attempt to formalize all of the existing rules into concrete logic, there are some types which can be created at runtime, but are inexpressible by the typing, as well as logical contradictions that go away if you ignore the existence of
These don't have definitions agreed upon anywhere outside of type theory, and your use of them here does not match the use in type theory. I've seen the doc floating around that proposes using this, but the use isn't consistent with existing type theory, so I've avoided potentially muddying this conversation with the use of one definition where the other might be assumed. Interface means the obvious thing here, namely the known methods, attributes, etc of an instance of a type, and their signatures and/or types. Type also means the obvious thing here. (The type which an instance of an object is) Interfaces and types are largely identical outside of LSP violations when talking about only one type. When discussing unions and intersections of types, this changes. Treating Any as a singular type with an infinite interface is the mathematical problem here. Python does not have the means to have multiple methods of the same identifier with differing arity/signature like some other languages, so the infinite interface contradicts other interfaces in the case of an intersection where any shared interfaces must be compatible |
Oh I see. That's a great point. Let's see what Pyright does for this:

    class A:
        def f(self, x: int): pass

    class B:
        def f(self, x: str): pass

    def f(x: A):
        assert isinstance(x, B)
        reveal_type(x)  # subclass of A and B
        x.f(2)  # Okay!
        x.f('s')  # Fails; expects int.

So it seems that Pyright is not really intersecting, but rather creating a virtual subclass: class X(A, B): ... What are the possible solutions?
My preference is probably for case 3 as the most permissive. It would allow you to do something like this:

    class A:
        def f(self): ...
        def g(self) -> str: pass

    class B:
        def f(self): ...
        def g(self) -> int: pass  # g is different

    def g(x):
        assert isinstance(x, A) and isinstance(x, B)  # No subclass like this exists yet, but we may yet create one.
        x.f()  # No problem
        x.g()  # Error!

    class C(A, B): ...  # C.g is A.g
    class D(B, A): ...  # D.g is B.g

    g(C())
    g(D())

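The comments on C and D above can be checked at runtime. A minimal sketch, reusing the class names from the example (the return values are made up): which g a subclass of both A and B exposes depends on the order of its bases.

```python
class A:
    def f(self) -> None: ...

    def g(self) -> str:
        return "from A"


class B:
    def f(self) -> None: ...

    def g(self) -> int:
        return 0


class C(A, B): ...  # A comes first in the method resolution order
class D(B, A): ...  # B comes first in the method resolution order


# Both subclasses pass the same isinstance checks...
both_checks_pass = all(
    isinstance(obj, A) and isinstance(obj, B) for obj in (C(), D())
)

# ...but g resolves to a different method, with a different return type.
c_result = C().g()
d_result = D().g()
```

So passing both isinstance checks does not by itself tell the caller which signature of g applies.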
The way you've been using type doesn't match what is returned by reveal type in Python, which I think is the source of the confusion. I imagine, you would say that
Right, that's a fair point. In cases 1 and 2 above, My preference with |
Correct, I consider I think the fact that python's terminology is so loose compared to more well-defined type theory creates more situations like this than I'd like.
I believe this should already be considered a requirement, but I disagree about it solving I think it's possible to redefine Any and Never in terms that resolve this without needing to remove the usefulness of Any and without disrupting any current uses (because the problem only arises currently with the not yet implemented intersection, and other things that have not been accepted yet, such as A definition for Any which would work:
And a corresponding definition of Never:
|
Right, now that I understand how you're using terms, I can just use your vocabulary. So my preference is for
What are the consequences of this definition for the examples above? |
For the case of Any, I believe it leaves all existing code exactly as is due to the restriction on type-checker behavior included. Type checkers already coerce Any to specific types via The behavior on intersection would result in the type It may be something that type checkers should warn is happening by default or include a configuration setting to inform users is happening The behavior of Never with these definitions would be entirely as it currently is. The current use of Never is already consistent with the set-based definition. |
Incidentally, we should at least then change the language in the top post of this thread to concord with the language you've been using especially because these differ from the existing PEPs. Specifically:
|
Agreed. I didn't realize how far apart we were on definitions initially, despite that I was consciously avoiding certain terms I felt had some ambiguity, there were others which also had mixed meanings in play. |
Okay, productive and interesting discussion, I'm going to take a break, but looking forward to seeing the next draft of this proposal. Intersections solve so many type errors; it's a feature I'm really looking forward to. |
It may also help to provide some examples, so starting that.

Intersection A & B
Type: Intersection of A and B. If A and B are both concrete types and not protocols, this must be a subclass of both. If either is a protocol, the protocol must be satisfied. If A or B is already an intersection, the intersections are combined.
(A & B) & C is equivalent to A & B & C

Union A | B
Type: Union of A and B. If A or B are already unions, the unions are combined.
Interface: The attributes, methods, and properties shared between A and B which have compatible types. If A and B use an identifier in conflicting ways, the accessor of the object cannot know which way it is used without checking.
(A | B) | C is equivalent to A | B | C

(The below may seem contrived, but I think including it can assure the definitions we end up on remain consistent)

The order of intersections and unions matters

Unions of Intersections (A & B) | (B & C)
Type: a union of the intersection A & B and the intersection B & C
(A & B) | (B & C) is equivalent to (A | C) & B

Intersection of Unions (A | B) & (B | C)
Type: Intersection of the unions A | B and B | C
(A | B) & (B | C) is not equivalent to B & (A | C), but to (A & C) | (B & (A | C))
* A contrived case, to be sure, but...

    class A(Protocol):
        def this(self: Self) -> Self:
            ...

    class B(Protocol):
        def that(self: Self, other: T) -> T:
            ...

    class C(Protocol):
        def this(self: Self) -> None:
            ...

    Problematic = (A | B) & (B | C)

There are two possibilities here
I believe 2 to be the correct interpretation. 1 results in a different interface than may be expected.

Why this matters when considering Any
This is a set of ordering and satisfiability constraints that works with the proposed update to the definitions of Any and Never. Should these orderings and constraints be considered undesirable, we may be back to the drawing board on handling Any |
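As a rough cross-check of the equivalences listed above, the same expressions can be evaluated on plain Python sets. This is an assumption-laden model: each frozenset below is a hypothetical set of instances, and it ignores interfaces entirely, so it only shows what ordinary set algebra gives.

```python
from itertools import product

# Eight hypothetical "instances", one for every combination of belonging
# (or not) to A, B and C, so the identities are checked exhaustively.
instances = list(product([False, True], repeat=3))
A = frozenset(i for i in instances if i[0])
B = frozenset(i for i in instances if i[1])
C = frozenset(i for i in instances if i[2])

# Union of intersections: matches the equivalence stated above.
union_of_intersections_ok = ((A & B) | (B & C)) == ((A | C) & B)

# Intersection of unions: in plain set algebra this is B | (A & C).
intersection_of_unions = (A | B) & (B | C)
matches_distributive_form = intersection_of_unions == (B | (A & C))

# The expansion given in the text, (A & C) | (B & (A | C)), leaves out
# instances that belong to B only.
text_form = (A & C) | (B & (A | C))
only_in_b = (False, True, False)
```

In this model the two forms differ only for objects that belong to B and to neither A nor C. Whether a type checker should follow plain set algebra for that case is the open question raised in the text.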
Thanks for the writeup @mikeshardmind! FWIW, I'm also working on trying to figure out how to combine intersections with unions and I think we mostly agree except for the last case - on the surface it seems like having the usual laws like distributivity satisfied is desirable, but let's maybe discuss it over at discord, so we don't pollute this issue :D |
I was thinking about changing the language of "type" and "interface". Let's call this the "literature type". At the end of the day though, you're going to have to translate the "literature type/interface" into things like:
These three types need to have a canonical representation that represents the literature type and interface—we can call this representation the canonical type. I'm not sure whether it will ultimately be easier to stick with the Python language for this reason. Either way, there will need to be a canonical way to specify |
I'll give some examples of why I don't know that more definitions are strictly helpful and that it's more about understanding that there are some things we should not consider a type, but a specification of what we know about possible/allowed types. The moment many of these are thought of this way, the distinction between a concrete runtime type and the interface we can semantically and statically show to exist becomes automatic.

In many cases, we don't have a concrete exact type. This is true even without considering Unions or Intersections when it comes to protocols. An annotation of a protocol is not a type, it's an expectation of a required interface and is sometimes referred to as structural typing (which is distinct from typing based on composition or inheritance).

In many places, we are not specifying a type, but what we know about the possible types or interfaces an object might have. A protocol as an annotation is a specification about a known interface but says nothing about the concrete type other than that instances of it which satisfy it must implement that interface.

This is a protocol taken from real library code. The intent is that a user could provide most choices for a numeric container, including those that I might not have knowledge of as the library author, such as a custom vector class. The library does abstract math and simplification of terms on objects for which these are well-defined operations. This includes symbolic simplifications and in the future higher-order operations on functions of functions.

    class SupportsBasicArithmetic(Protocol):
        def __add__(self: Self, other: Self) -> Self:
            ...

        def __sub__(self: Self, other: Self) -> Self:
            ...

        def __mul__(self: Self, other: Self) -> Self:
            ...

        def __truediv__(self: Self, other: Self) -> Self:
            ...

        def __eq__(self: Self, other: Self) -> bool:
            ...

Code that accepts parameters annotated with this has no knowledge of the type of the object they are accepting, only of a required portion of the interface.

Additionally, writing

    x: int | Fraction = 1

The type of x at that point in time is
Saying that x has a type of

As to specific things you wanted defined:
With this model of thinking about it,
The annotation is a specification of what must be true about the type of the object for code relying on this annotation to be sound from the perspective of static analysis.... a set of specifications about the allowed type(s) and interface(s) required. Edit: Actually, we probably still need more definitions accepted in some regard, and many definitions could stand to be made more clear, more rigorous, and/or more consistent, I just don't think we need multiple definitions for type. We just need to consider where we don't have a type, but a description of potential types. We should still have interface as a defined term. I believe it was defined well in the discussions about protocol and some of the discussions about variance, but I don't think there's a good canonically accepted definition of interface for python currently |
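To make the point about structural typing concrete, here is a small sketch in the spirit of the protocol above. `SupportsAddMul` and `square_plus_self` are hypothetical names, not taken from the library mentioned; the function relies only on the required interface and never on a concrete type.

```python
from fractions import Fraction
from typing import Protocol, TypeVar

T = TypeVar("T", bound="SupportsAddMul")


class SupportsAddMul(Protocol):
    """Hypothetical, trimmed-down version of the protocol above."""

    def __add__(self: T, other: T) -> T: ...

    def __mul__(self: T, other: T) -> T: ...


def square_plus_self(x: T) -> T:
    # Uses only __mul__ and __add__; the concrete type of x is never inspected.
    return x * x + x


int_result = square_plus_self(3)
fraction_result = square_plus_self(Fraction(1, 2))
```

The same function body works for int and Fraction because both provide the operations the annotation asks for, which is all the annotation specifies.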
I understand, but the problem is that I'm not the person that needs to be swayed by the PEP. You're going to communicate with all of the Python people who are used to one set of definitions. For these people, Now, I've figured what you mean by type ("literature type"); so, I know that for you the literature type But ultimately, you'll need to write a PEP and that PEP is going to have to relate to the other typing PEPs. When it uses symbols to represent types, those symbols have to make sense in the world of symbols that we already have. They have to be intelligible to people who are using those symbols. If, as I think you're suggesting, the interface to
Yes, that's fair. Nevertheless, you're going to have to turn that "description" into some canonical representation for display and annotation. |
You're missing part of the argument for them. It remains consistent with existing specified behavior of subtypes of Any and another known type.
The problem with a false negative being an optional warning should be obvious and was explained already. There's no on-ramp for a user to know this is important or why, and as an option rather than as the subtyping behavior, it's possible for this to diverge across type checkers. False positives can be fixed by providing more information as well, and it was shown that it is safer to handle it in that direction already because the method for doing so is to just provide the minimum amount of information if you can't fully type it |
I don't think that behavior is intentional since they cannot currently return an intersection even if they wanted to.
If the type checkers think this is a problem, they can turn the warning on by default. MyPy already has similar options:
Sorry, I don't agree that "it was shown", and "safety" is a matter of opinion here. |
The behavior was intentional, this was discussed in depth in discord. The behavior was directly accepted to handle cases where things were only partially unknown, such as unittest.Mock, and a baseclass which was not type. |
https://mail.python.org/archives/list/[email protected]/thread/GULRKYI7XOB3FLAEFC6OYSTBS5FIA5PU/ |
I fully follow your point here - as much as I do really dislike false positives, I think option 5 is the lesser of the two evils. Clashing signatures in T & U seem quite unlikely to me in the real world, and if there were a clash in the way you've described, I suspect you could tweak the name of the method on one side or the other. |
In general, I agree with you that this is a rare case, but consider when And the amount of disagreement I've seen on python-discuss from users who absolutely cannot stand type checkers telling them that correct code is wrong is immense. False positives are despised.
With something like
I read both threads and don't see how it is. Mock inheriting from Actually, that's pretty interesting It seems like the option 5 behavior is something else entirely. I'm not proposing it, but consider a new operator

    def f(U: type) -> SubClassOf[T, U]:
        class T(U): ...
        return T

Now, it is true that Edit: it's a bit more complicated than I have here (since I don't think the forward reference is allowed), but I think something like this can be made to work and fit with option 5. |
I'm not saying that Mock inheriting from Any creates an intersection, I'm saying we have a behavior for when Any is one of multiple supertypes and that intersection having differing behavior would be inconsistent with that even with it not being an intersection, and just considering "what happens when we have multiple known supertypes" Note that the order in which Any is present in bases does not change the behavior. You claimed to want consistency from the rules, so I'm presenting to you the point that option 4 would be inconsistent in terms of subtyping relationships. |
Yeah, magic methods are a good point. Random thought - could we use something in the other half of the intersection to say "This method takes priority - forget the other one!" Then we could have this:

    class T:
        @priority_method
        def f(self, x: int) -> None: ...

    class U:
        @overload
        def f(self, x: int) -> None: ...
        @overload
        def f(self, x: str) -> None: ...

I actually proposed this on the call last week haha - we concluded it fulfils a different function though and would be a different PEP (but one I would support!) |
I'm pretty sure it's just option 5. I support it too—just not as the definition of intersection—even if it is useful in many cases.
Yes, but that behavior is, in my opinion, a consequence of intersections not existing yet. Type checkers cannot return |
It explicitly is not, please see above quote which was provided showing how it was not and related links. |
Can we all actually ensure we're reading what's been said before responding? This is not only going to go nowhere, but likely get heated again if people just talk past each other, and I just want to wrap up and focus on any part where we don't agree to figure it out. |
You're talking about this: "This makes it so you can pass a mock object to any function"? This is not the subtyping of |
Just to summarise currently, I think there's quite a strong movement towards option 5 - if we've discounted all other options, I would suggest we move the conversation to more of a case of solving specific issues with it (like the one demonstrated above) in separate issues for each point. Also, can't speak for anyone else but for me scrolling 300 comments is quite annoying https://docs.google.com/spreadsheets/d/1JLOF0d20olalPHZrR3AoElGY4gpSpl8uDyOafJVFaZ0/edit?usp=sharing |
I think you should add the comments of Jelle and Eric who both supported option 4—in fact, the definition of option 4 are from their comments. I can't speak to their feelings about 5 though. |
@NeilGirdhar You said you read the linked threads, but you've missed something important in them. It was pointed out that even prior to this being accepted for the mock case, it already needed to be the behavior, including with multiple inheritance. @mikeshardmind also gave you examples which included multiple inheritance, so can you address it in the full context of the discussion and consistency, or are you just trying to talk past what he said? |
Yeah I think it would be helpful to hear from them about option 5 - wasn't sure what to write in the summary as it didn't exist back then, but I'll try and add them a row (I've added None as an option for no opinion known). |
I'm gonna step away for a bit I may look at this later, but I'm getting tired of having to repeatedly state things and having portions of it quoted out of context, or important details that are inconvenient to someone else being left out in responses. I can understand that some people may not be happy with some of the options, but that's not a reason to completely ignore the arguments being made and ignore details that have directly presented, and this has been a recurring interaction over the course of months. |
I do really feel like we're this close to reaching a consensus - there are really only a few edge cases remaining. There is a lot of discussion on this issue and holding all the points and counterpoints in your head at once is difficult, at least for me. I would suggest we try and conclude this issue - then fleshing out the details of the PEP we can figure out in the new year. We work under the assumption that we're going with option 5, but then try and deal with the specific edge cases that might generate false positives etc. |
I think you really need to get the buy-in of MyPy and Pyright at the least. After all, they are going to review the PEP and their feedback will make or break its passing. I don't think it makes sense to write it up only for it to get rejected. By all means, make the best case as to why 5 is better than 4 if that's what you feel strongly about. I suggest starting by adding links to positive arguments for 5 in this issue header to make it easy to follow. |
As I said before, I believe that behavior is a consequence of not having intersections already. Do you have a reference for an alternative explanation? |
@NeilGirdhar Maybe try reading what you said you did. |
Yeah I do agree - I think I'll try and prepare a document that compares them, although I'd appreciate @mikeshardmind 's input on that one. If not, I will try and gather what I can from the discussions. Then maybe @gvanrossum is there some way you could bring it to them for review? |
Or here since you're so averse to that:
That part precludes an intersection, as that would no longer tell you "if you get it wrong" for the motivating case. |
Great idea. You can draw from the top level sections of this issue. I'll try to fill in some arguments for and against if I can. |
Could someone point me to a full description of options 4 and 5? I'm struggling to understand the nuanced differences between these two. I see some examples that try to demonstrate where the two differ, which is helpful, but I'm looking for a clear articulation of the rules — in particular, how they differ and whether they are fully symmetric with unions. FWIW, here are my primary criteria for evaluating any proposal:
|
Thanks for this. It took me a while to understand 4 vs 5, but this point helped: I think the only special case is Any? We're currently working on a draft of something to compare them |
@mikeshardmind would probably be the best one to make the argument for 5 over 4, but he's stepped away for now, and reiterated in other places that he is not going to continue the line of discussion today.
Under proposal 5
Under proposal 4
That depends on if we consider the subtyping behavior we currently have with Any as a basic axiom. It's already a special case. If we allow the same behavior that already exists in the special case, option 5 is the only option which remains fully consistent with all existing prior decisions. If we do not, then option 4 is inconsistent with existing behavior, but follows entirely from basic axioms. "Nothing newly special" was one of the criteria he had when designing option 5. As I understood him to mean it, the reasoning for option 5 follows here:
The comments linked by @mark-todd help, one of mine may help as well: #31 (comment) , as well as his[@mikeshardmind ] response to that #31 (comment) |
Hey everybody. I will lock this discussion. |
There is a great deal of confusion about handling the Any type within an Intersection.

In Python, Any is both a top type (a supertype of all types), and a bottom type (a subtype of all types). Python has a gradual typing system, meaning that it will never be required that everything is typed. Everything that is not typed is considered Any.

We examine five ways to handle intersections with Any:

1. Any is removed from intersections: T & Any = T.
2. An intersection containing Any becomes Any: T & Any = Any.
3. Any is forbidden in intersections: T & Any is an error.
4. Any is not reduced within intersections.
5. Any is only considered in an intersection in deference to non-gradual types.

Remove Any from Intersections
Arguments in favour
Arguments against

An Intersection containing Any becomes Any
Arguments in favour
Arguments against

Disallow Any in Intersection
Arguments in favour
Any & T is inherently unsafe".
Arguments against
Any will arise, often in the bowels of the type checker's logic where there's no good way to report an error to the user".

Treat T & Any as irreducible in general
Arguments in favour
T to Any should not cause type errors to appear, and by examining supertypes and interfaces.
Any.
Any as a wildcard in gradual typing.
Arguments against

Any is only considered in an intersection in deference to non-gradual types.
Arguments for
Arguments against
## Arguments in favour to Disallow Any in Intersection
The general idea is that I will update the description, allowing the discussion to be included in the PEP and preventing the discussion from going in circles.
I will react with 🚀 once I have included something in the description.