Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Change builtin method signatures to be compatible with Sequence, at the expense of accuracy. #1107

Closed
wants to merge 1 commit into from

Conversation

gwk
Copy link
Contributor

@gwk gwk commented Mar 27, 2017

This one is a bit weird, but it's the best I could do. Essentially, some of the builtin types implementing Sequence have start/end or start/stop optional parameters that accept None, and others do not. list and tuple raise errors suggesting they accept None but actually do not. range does not take start/end at all. Finally, slice accepts any type implementing __index__, suggesting that the typing module should have a SupportsIndex ABC.

Depending on how seriously you take this note in the object.__index__ documentation, SupportsIndex might require __int__ as well as __index__:

Note In order to have a coherent integer type class, when __index__() is defined __int__() should also be defined, and both should return the same value

@@ -31,7 +31,7 @@ class SRE_Pattern(object):
pattern = ... # type: str
flags = ... # type: int
groups = ... # type: int
groupindex = ... # type: Mapping[int, int]
groupindex = ... # type: Mapping[str, int]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This doesn't seem to belong in this PR.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Totally correct, it somehow bled in from #1106.

…he expense of accuracy.

This may not be acceptable, but it's the best I could do. Essentially, some of the builtin types implementing Sequence have start/end or start/stop optional parameters that accept None, and others do not. list and tuple raise errors suggesting they accept None but actually do not. range does not take start/end at all. The others appear to accept slice-compatible values. Finally, slice accepts any type implementing __index__, suggesting that the typing module should have a SupportsIndex ABC.

Depending on how seriously you take the note in the object.__index__ documentation, SupportsIndex might require __int__ as well as __index__.
@gvanrossum
Copy link
Member

Is there a bug report this is trying to fix? It's not clear to me from the barrage of changes with TODO comments what you're trying to fix. (The confusing thing is that you're touching a bunch of methods that aren't part of Sequence.)

@gwk
Copy link
Contributor Author

gwk commented Mar 27, 2017

Sorry for the confusion. This started as a fix to make str.count match the observed behavior of python3.6 regarding accepting None for start/end . It snowballed because mypy started complaining that count did not match the signature of Sequence.

The TODO: __index__ is matching earlier TODO notes which allude to the lack of proper support for types implementing __index__, which mypy will reject. This further complicates the question of how to properly type annotate this stuff.

There is no bug report, but there probably should be. I started with the PR because my commits represent everything I know of the problem. I think the ideal solution would be to add SupportsIndex to python`s typing module; should I open a bug on python.org?

@JelleZijlstra
Copy link
Member

I haven't reviewed for correctness, but we'd prefer writing start: Optional[int] = ... instead of start: int = None. All defaults should be ...

@gwk
Copy link
Contributor Author

gwk commented Mar 27, 2017

OK, I was following suite for other annotations in builtins.pyi.

@gwk
Copy link
Contributor Author

gwk commented Mar 27, 2017

@gvanrossum I see that some of the changes are not part of Sequence; I suppose could have come up with a better title for this. Basically this was an effort to correct a number of outstanding problems with these start/end parameters. It seems to me to be a rather sticky problem, because CPython is not actually implementing a uniform interface for count and index. As a first pass, it seemed better to me to make the Sequence definition a bit loose, meaning give it the most general optional parameter types, even though list, tuple, and range don't accept all of it. Maybe it's possible to express these variations on the ABC, but I don't know how.

@gwk
Copy link
Contributor Author

gwk commented Mar 27, 2017

If you'd prefer to take this to a bug report I'm fine with that, but it seems to be part typeshed, part cpython problem, because list and tuple raise errors saying they accept int, None, or index when they don't.

@gvanrossum
Copy link
Member

No, changing Sequence is not an option. Maybe @JelleZijlstra can guide you through this? (Note that I don't think the TODO comments are acceptable as they stand -- I'm not against TODOs, but these are repetitive and at the same time don't explain what there is to do.)

@gwk
Copy link
Contributor Author

gwk commented Mar 27, 2017

OK, thanks. Again I was just imitating what I saw in there. I'm happy to take out the TODOs, redo this commit, or whatever. I just wanted to bring the full scope of the problem to attention.

@JelleZijlstra What I don't know how to do is implement an annotation for a concrete class that varies from that declared in the ABC. Thanks!

@JelleZijlstra
Copy link
Member

Yes, I can give some guidance (don't have a lot of time right now, though).

@@ -221,15 +221,15 @@ class str(Sequence[str]):
def __init__(self, o: bytes, encoding: str = ..., errors: str = 'strict') -> None: ...
def capitalize(self) -> str: ...
def center(self, width: int, fillchar: str = ' ') -> str: ...
def count(self, x: str) -> int: ...
def count(self, x: str, start: int = None, stop: int = None) -> int: ... # TODO: __index__
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Optional[int] = .... In order to make future use of SupportsIndex or similar easier, you could for now define a type alias _Index = int, then declare these parameters as Optional[_Index]. If we gain support for SupportsIndex in the future, we can then easily change the alias.

@@ -525,7 +525,7 @@ class slice:
@overload
def __init__(self, stop: Optional[int]) -> None: ...
@overload
def __init__(self, start: Optional[int], stop: Optional[int], step: int = None) -> None: ...
def __init__(self, start: Optional[int], stop: int = None, step: int = None) -> None: ...
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wasn't the previous code correct? The first overload takes care of the case where only one argument is given.

@@ -546,7 +546,8 @@ class tuple(Sequence[_T_co], Generic[_T_co]):
def __rmul__(self, n: int) -> Tuple[_T_co, ...]: ...
def count(self, x: Any) -> int: ...
if sys.version_info >= (3, 5):
def index(self, x: Any, start: int = 0, end: int = 0) -> int: ...
def index(self, x: Any, start: int = None, end: int = None) -> int: ... # TODO: __index__
# NOTE: tuple does not actually support None start/end, but it should to be compatible with Sequence.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should describe how tuple actually behaves, not how it's supposed to.

However, in this case I think you found at least a documentation bug in CPython: when you pass None it says TypeError: slice indices must be integers or None or have an __index__ method, and that error shouldn't mention None if None is not an acceptable value. Feel free to report that at bugs.python.org.

Unrelated problems with the existing code (outside the scope of this PR):

  • The defaults should be ..., not 0.
  • I feel like it would be more helpful to type the argument as _T_co instead of Any.

@@ -195,7 +195,7 @@ class Sequence(_Collection[_T_co], Reversible[_T_co], Generic[_T_co]):
def __getitem__(self, s: slice) -> Sequence[_T_co]: ...
# Mixin methods
if sys.version_info >= (3, 5):
def index(self, x: Any, start: int = 0, end: int = 0) -> int: ...
def index(self, x: Any, start: int = None, end: int = None) -> int: ... # TODO: __index__
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Optional[int] = ... here too.

@JelleZijlstra
Copy link
Member

JelleZijlstra commented Mar 28, 2017

Left a few comments, but wanted to discuss two things in more detail:

  • You can propose adding SupportsIndex to typing, but I'm not sure it would be accepted. My personal preference would be to wait for PEP 544 to make it into typing, then locally define a small Protocol that defines __index__.
  • I think you hinted that you're getting type errors from mypy when you define the methods as they're actually used. I think it's fine to ignore those with # type: ignore (typeshed should describe the types as they actually exist, whatever Liskov thinks of us). But that runs into problems with pytype's parser, which chokes up on type ignore comments in certain places. Until Incorrect ParseError for type comment in unexpected location google/pytype#53 is fixed, maybe we can just keep the annotations on Sequence as they are and fix only the str/bytes ones.

Also, thanks for your contribution! It's great that we're getting more precise at describing these types.

@gvanrossum
Copy link
Member

I still think it would be helpful to have a good bug report (either a separate issue or a comment here). It should take the form:

  • Here's a small self-contained correct program (with sample output if any)
  • These are the errors that mypy reports
  • This is why I think mypy is wrong

@JelleZijlstra
Copy link
Member

I don't think anyone has described a mypy bug here; as I understand it the errors are because of inconsistent types between subclasses and superclasses, something like this:

class Sequence:
    def index(self, x: Optional[int]) -> None: ...
class tuple(Sequence):
     def index(self, x: int) -> None: ...  # E: Argument 1 of "index" incompatible with supertype "Sequence"

We'll have to ignore this error if it turns out tuple is actually inconsistent with the way Sequence wants it to behave.

@gvanrossum
Copy link
Member

This is like pulling teeth. That made up example doesn't correspond to any real code. Whether it's a bug in mypy or in typeshed, I would like to see a program that you claim is correct and where you'd like different output from mypy. And I need to be able to just copy-paste that program into a file and run it and reproduce the results you claim.

My guess is that there are several issues. Maybe some builtin classes don't fully conform to the Sequence protocol, in that there are calls that are legal for a sequence that are rejected on the basis of types alone. That would be unfortunate (and a Liskov violation) but it happens sometimes. Your made-up index() example would be one. The best we can do there is add a #type: ignore on the affected method definition for the subclass.

The other issue is that some subclasses of Sequence support extra parameters that aren't in the Sequence protocol. This is not a violation of Liskov! A subclass may take more optional parameters or a less strict type, and it may return a stricter type. You can just override the method in the subclass with the correct signature and mypy will accept it -- if not, let's discuss it here.

@gwk
Copy link
Contributor Author

gwk commented Mar 28, 2017

@gvanrossum issue #1108 provides a real example.

@gwk
Copy link
Contributor Author

gwk commented Mar 28, 2017

I'm exploring this further and will open a new PR when I have a better handle on the various issues. Thanks for all of your input, and patience!

@JelleZijlstra
Copy link
Member

Sorry, maybe I'm misunderstanding but I don't see why we need a reproduce script or a new PR. As far as I can see, this PR mostly just fixes some small omissions (e.g. missing Optionals) that we can already merge. We can just drop the problematic parts (changing tuple and Sequence). Having an argument be non-Optional in a parent class and Optional in a child is fine with mypy.

@gvanrossum
Copy link
Member

You two can decide but please wait for me before merging.

@gwk
Copy link
Contributor Author

gwk commented Mar 28, 2017

Superseded by #1109, #1110.

@gwk gwk closed this Mar 28, 2017
@gwk gwk deleted the pr-sequence branch February 5, 2018 17:54
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants