Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Reconsider making Mapping covariant in the key type #445

Closed
JukkaL opened this issue Jun 28, 2017 · 13 comments
Closed

Reconsider making Mapping covariant in the key type #445

JukkaL opened this issue Jun 28, 2017 · 13 comments
Labels
resolution: wontfix A valid issue that most likely won't be resolved for reasons described in the issue

Comments

@JukkaL
Copy link
Contributor

JukkaL commented Jun 28, 2017

Confusion with invariant container types is one of the more common questions mypy users have. Often these can be easily worked around by using a covariant type such as Sequence instead of List. However, there is no fully covariant dictionary-like type -- Mapping is invariant in the key type, since the key type is used as an argument type of __getitem__ and get. This is particularly problematic in Python 2, as mappings with unicode keys are common, and a mapping with str keys is not compatible with them, even though they are fine at runtime (at least if the keys are ascii only).

I can see a few things we could do.

1) Don't care about unsafety

We'd make Mapping covariant even though it's known to be unsafe (#273). The argument is that the potential unsafety is less of a problem than the confusing user experience. This wouldn't help with certain related things like using a unicode key to access Mapping[str, X], which is arguably okay.

Note that this was proposed and rejected earlier.

2) Use object in Mapping argument types

We'd make Mapping covariant and use object in the argument type of __getitem__ and get. This would result in less effective type checking, but I'd consider this to still be correct at least for get, since it can always fall back to the default. Also, __getitem__ would typically only generate a KeyError if given a key of an unrelated type, and this is arguably not a type error. Indexing operations on mappings can already fail with KeyError, and this isn't considered a type safety issue.

We could also recommend that type checkers special case type checking of Mapping.__getitem__ and Mapping.get (and also the corresponding methods in subclasses such as Dict). A reasonable rule would be to require that the actual key type in code like d[k] overlaps with the declared key type. This would allow things like using unicode keys for Mapping[str, int], which should perhaps be fine in Python 2. We could still reject things like d[1] if the declared key type is str, since int and str are not overlapping.

Subclasses of Mapping that only support specific runtime key types would now be rejected. For example, this wouldn't work any more:

class IntMap(Mapping[int, int]):
    def __getitem__(self, k: int) -> int: ...   # needs an object argument
    ...

However, it would be easy enough to refactor this to type check:

class IntMap(Mapping[int, int]):
    def __getitem__(self, k: object) -> int: ...
        assert isinstance(k, int)  # sorry, can't deal with non-int keys
        ...
    ...

I think that this is reasonable -- the required assert (or cast) makes it clear that this is potentially unsafe.

3) Just teach users to work around using unions or type variables

Maybe keeping Mapping is invariant in the key type is reasonable. Often the lack of covariance can be worked around by using other type system features. Here are some examples:

  • Use Union[Mapping[str, int], Mapping[unicode, int]] instead of Mapping[unicode, int].
  • Use Mapping[AnyStr, int] in Python 2 and Mapping[str, int] in Python 3 using a conditional type alias.
  • Define a type variable like AnyText = TypeVar('AnyText', str, Text) and use Mapping[AnyText, int]. This is a little awkward, since in Python 3 the values of the type variable would be str and str.

I don't like these for a few reasons:

  • These are not intuitive.
  • These are inconsistent with other containers such as Sequence.
  • These are more verbose than just Mapping[unicode, int].
@ilevkivskyi
Copy link
Member

We could also recommend that type checkers special case type checking of Mapping.__getitem__ and Mapping.get (and also the corresponding methods in subclasses such as Dict).

I think this is a crucial point. I don't see a good solution without adding some internal special-casing in type checkers.

@JukkaL
Copy link
Contributor Author

JukkaL commented Jun 28, 2017

Pinging @rowillia, since he had issues with invariance recently.

@ilevkivskyi
Copy link
Member

@JukkaL
There appeared an idea, see http://bugs.python.org/issue25988 to have a base class of Mapping that would be contravariant in the key type. It seems to me that sometimes contravariance rather than covariance is something that is wanted. For example (see b.p.o. issue):

def func(arg: BaseMap[str, int]) -> int: ...
x: Dict[Union[str, unicode], int]
func(x) # people want this to work, but it doesn't with invariant Mapping instead of BaseMap

@JukkaL
Copy link
Contributor Author

JukkaL commented Jul 21, 2017

It's unclear to me how useful BaseMap would be. For any new feature, I'd like to see how it would help with real-world code. For example, maybe actual use cases often iterate over the keys, which wouldn't be supported by BaseMap. If that's the case, using Mapping[AnyStr, int] or Union[Mapping[str, int], Mapping[unicode, int]] would be more reasonable, and both of these already work.

@ilevkivskyi
Copy link
Member

Actually the original proposal in http://bugs.python.org/issue25988 was to add an ABC with two methods __getitem__ and __iter__, but it would be invariant, so this will not help with this problem. TBH my main motivation here is to have a protocol in typing that has __getitem__ and is not "bulky". Now we have Mapping and Sequence but those have 7-8 methods each.

@JukkaL
Copy link
Contributor Author

JukkaL commented Jul 21, 2017

Well, it's probably only worth adding such a protocol if it's useful often enough -- defining one manually is easy in case it would only be used occasionally. I don't know whether this is the case, though.

@ilevkivskyi
Copy link
Member

OK, right, let us postpone this then. It will be easy to add one later, if we will see that it is actually needed.
Do we need to say something about this in the rejected/postponed section of PEP 544?

@JukkaL
Copy link
Contributor Author

JukkaL commented Jul 21, 2017

This is small enough that it probably won't need to be mentioned.

@LunarLanding
Copy link

LunarLanding commented Sep 4, 2020

Hi, this might add to the cases where the key not being covariant fails, not sure:
Python 3.8 using pylance for typechecking.

from __future__ import annotations
from typing import *
A=TypeVar('A',bound=Hashable)
B=TypeVar('B')
class C(MutableMapping[A, B], Generic[A, B]):
	def __init__(self):
		self.d = cast(Dict[A,B],dict())
	def test(self,a:A):
		return self.d.get(1) #should not typecheck

@srittau
Copy link
Collaborator

srittau commented Nov 4, 2021

I believe it's now too late to fix this. Mapping is also problematic as it is not a protocol, but a concrete class, as opposed to most other similar classes.

@srittau srittau closed this as completed Nov 4, 2021
@srittau srittau added the resolution: wontfix A valid issue that most likely won't be resolved for reasons described in the issue label Nov 4, 2021
@harahu
Copy link

harahu commented May 19, 2022

However, there is no fully covariant dictionary-like type -- Mapping is invariant in the key type, since the key type is used as an argument type of __getitem__ and get.

I encountered this issue when adding type annotations to Streamlit. I came across a function that looked like

def lower_clean_dict_keys(dict):
    return {k.lower().strip(): v for k, v in dict.items()}

Annotating this function, I just followed my instincts based on naming, and annotated it as

_Value = TypeVar("_Value")


def lower_clean_dict_keys(dict: Dict[str, _Value]) -> Dict[str, _Value]:
    return {k.lower().strip(): v for k, v in dict.items()}

Mypy quickly pointed out that Dict is not covariant with regards to key, upon which I thought:

Of course, I should use a Mapping for this, that would be the covariant type. Silly me.

My next attempt looked like this:

_Value = TypeVar("_Value")


def lower_clean_dict_keys(dict: Mapping[str, _Value]) -> Mapping[str, _Value]:
    return {k.lower().strip(): v for k, v in dict.items()}

Imagine my surprise when this did not work out for the same reason as with Dict. This led me to a search, both for a covariant mapping type and for answers as to why Mapping isn't covariant. I came across this issue, which answered the "why" part (__getitem__ and get), which, once I understand it, seems perfectly reasonable. The remaining question was just: what does this mean for me in practice?

I never found my covariant mapping type. Instead, I landed on this solution:

_Key = TypeVar("_Key", bound=str)
_Value = TypeVar("_Value")


def lower_clean_dict_keys(dict: Mapping[_Key, _Value]) -> Dict[str, _Value]:
    return {k.lower().strip(): v for k, v in dict.items()}

Here, I add a free type variable for the key, effectively making the mapping covariant. For the purposes of type checking, this covers what I need, but I don't think it is elegant. It forces me to introduce yet another type variable (which, confusingly, is only used once*).

The issue here, really, is that Mapping is a too specific type in light of what I actually do with it in the function body. My only interaction consists of dict.items(), which has covariant behavior. I could rewrite the function such that it takes an ItemsView instead, but that makes for an ugly API, IMO. Or, preferably, I really just need a handy little SupportsItems protocol. Which, it seems, someone at typeshed has seen the need for as well.

Given that Mapping was the type I, and many others, reached for in order to achieve covariance, I'd like the following:

  • Easily accessible, granular protocols, for the covariant methods on Mapping. In the simplest case, that could be all of the covariant methods grouped into a single Protocol. Perhaps something like:

    class CovariantMapping(Protocol[KT_co, VT_co]):
        def __eq__(self, other: object) -> bool: ...
        def __ne__(self, other: object) -> bool: ...
        def keys(self) -> KeysView[KT_co]: ...
        def items(self) -> ItemsView[KT_co, VT_co]: ...
        def values(self) -> ValuesView[VT_co]: ...

    My gut tells me this belongs in typing.

  • Improved documentation for the mapping type, at typing.Mapping, collections.abc.Mapping, or, preferably both. I would want a section that can be paraphrased as:

    You came here looking for a type that is covariant with regards to keys. This type is not it (because reasons), but here are the things you probably want to do instead...

    @srittau I think this is a solution that would fix most of the issue (for developers mature enough to realize why Mapping cannot be covariant), while still being something that can be done today, without changing behavior.

* The "normal" pattern is to use it at least twice, in order to indicate dependency between types.

TL;DR: Given my experience, I desire better documentation for the Mapping type, as well as covariant protocols for the APIs of Mapping that are covariant.

@srittau
Copy link
Collaborator

srittau commented May 19, 2022

Better documentation for typing would be desirable overall and there is some first ideas in the typing repository: https://github.com/python/typing/issues?q=is%3Aopen+is%3Aissue+label%3A%22topic%3A+documentation%22

We have some very tight protocols in _typeshed that can be used a bit awkwardly:

from typing import TYPE_CHECKING

if TYPE_CHECKING:
    from _typeshed import SupportsItems

We are currently working to bring some of these types into typing_extensions so they are more easily accessible at runtime (python/typing_extensions#6). But I agree that a "mapping-like" protocol like the one you suggested, that's a bit more comprehensive would also be a good idea. Would you like to open an issue in /python/typing for typing-extensions? We could discuss it in detail over there.

@BlueGlassBlock
Copy link

Not sure whether it belongs here, but it's quite reasonable to implement "type based callback map" in this way:

from typing import Any, TypeVar, Callable

class MyType:
    pass

_T = TypeVar("_T", bound=MyType)

class TypeA(MyType):
    content: str

class TypeB(MyType):
    content: int

class Case:
    def __init__(self, cb: dict[type[_T], Callable[[_T], Any]]):
        pass

Case({
    TypeA: lambda a: a.content + "!!!", # error: Operator "+" not supported for types "str | int" and "str"
    TypeB: lambda b: b.content + 5, # error: Operator "+" not supported for types "str | int" and "int"
})

Type checker assumed that the key type of cb passed into Case is type[TypeA | TypeB], which is acceptable.

However it tried to apply TypeA | TypeB to the callback values, resulting in errors.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
resolution: wontfix A valid issue that most likely won't be resolved for reasons described in the issue
Projects
None yet
Development

No branches or pull requests

6 participants