Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Register Hooks for Generic Classes #216

Open
raabf opened this issue Feb 2, 2022 · 4 comments
Open

Register Hooks for Generic Classes #216

raabf opened this issue Feb 2, 2022 · 4 comments

Comments

@raabf
Copy link
Contributor

raabf commented Feb 2, 2022

  • cattrs version: 1.10.0
  • Python version: 3.7
  • Operating System: Linux 5.16.1-1 OpenSUSE Thumbleweed

I want to register structure/unstructure hooks for classes which inherit typing.Generic and using a T = typing.TypeVar('T'), i.e. MyClass(typing.Generic[T]) – so that the structure/unstructure hooks are called when MyClass is used with a concrete type, for example MyClass[str].
In the code box below you can see an example in the function main1() about that. So the question is: Is there an intended correct way to register structure/unstructure hooks for Generic Classes?

I currently make use of register_structure_hook_func and check the correct condition myself. The statement MyClass[str] will generate an typing._GenericAlias instance which __origin__ arrtribute holds a reference to the class MyClass, which is the condition I check.

from typing import Generic, TypeVar

import attr
import cattr

T = TypeVar('T')

converter = cattr.GenConverter()


class MyClass(Generic[T]):

    @staticmethod
    def deserialize(value, cls):
        print(f"deserialize {value}, {cls}")
        return MyClass()

    def serialize(self):
        print(f"serialize {self}")
        return 'foobar'


@attr.define
class MyBase:
    a: MyClass[str]


def main1() -> None:
    # MyClass[str] is faulty behaviour here

    print("•••• direct ••••")
    converter.unstructure(MyClass())  # OK
    converter.structure("hello1", MyClass)  # OK

    print("•••• With T ••••")
    converter.unstructure(MyClass[T]())  # OK
    converter.structure("hello2", MyClass[T])  # OK

    print("•••• With str ••••")
    print(f"subclass: {isinstance(MyClass[str](), MyClass)}{isinstance(MyClass[str], MyClass)}")
    converter.unstructure(MyClass[str]()) # Probably unintended behaviour but `serialize()` is called because of above `issubclass` is True

    b = converter.unstructure(MyBase(a=MyClass[str]()))
    print("b:", b)  # unstructured is an instance of 'MyClass' instead of string 'foobar'

    try:  # cattr.errors.StructureHandlerNotFoundError: Unsupported type: __main__.MyClass[str]. Register a structure hook for it.
        converter.structure("hello3", MyClass[str])
    except Exception as err:
        print(err)


def main2() -> None:
    # Everything works fine here

    print("•••• direct ••••")
    converter.unstructure(MyClass())
    converter.structure("hello1", MyClass)

    print("•••• With T ••••")
    converter.unstructure(MyClass[T]())
    converter.structure("hello2", MyClass[T])

    print("•••• With str ••••")
    converter.unstructure(MyClass[str]())

    b = converter.unstructure(MyBase(a=MyClass[str]()))
    print("b:", b)  # correctly is 'foobar'

    converter.structure("hello3", MyClass[str])


if __name__ == '__main__':
    print("•••• main1 ••••")

    converter.register_structure_hook(MyClass, MyClass.deserialize)
    converter.register_unstructure_hook(MyClass, MyClass.serialize)

    converter.register_structure_hook(MyClass[T], MyClass.deserialize)
    converter.register_unstructure_hook(MyClass[T], MyClass.serialize)

    main1()


    print()
    print("•••• main2 ••••")

    converter.register_structure_hook_func(
        # the default value (here `bool`) must be something so that the expression is `False`
        # The object has a __origin__ attribute if it us used as `Class[TYPE]`, then __origin__ will point to `Class`. This
        # test is secure enough since it is not only tested that the class has the attribute, but that it is also
        # a subclass, which is the class we want to support with this converter.
        lambda cls: issubclass(getattr(cls, '__origin__', bool), MyClass),
        MyClass.deserialize,
    )
    converter.register_unstructure_hook_func(
        lambda cls: issubclass(getattr(cls, '__origin__', bool), MyClass),
        MyClass.serialize,
    )

    main2()

Output:

•••• main1 ••••
•••• direct ••••
serialize <__main__.MyClass object at 0x7f6bdb8515d0>
deserialize hello1, <class '__main__.MyClass'>
•••• With T ••••
serialize <__main__.MyClass object at 0x7f6bdb8515d0>
deserialize hello2, __main__.MyClass[~T]
•••• With str ••••
subclass: True – False
serialize <__main__.MyClass object at 0x7f6bdb8515d0>
b: {'a': <__main__.MyClass object at 0x7f6bdb8515d0>}
Unsupported type: __main__.MyClass[str]. Register a structure hook for it.

•••• main2 ••••
•••• direct ••••
serialize <__main__.MyClass object at 0x7f6bdbb97510>
deserialize hello1, <class '__main__.MyClass'>
•••• With T ••••
serialize <__main__.MyClass object at 0x7f6bdbb97510>
deserialize hello2, __main__.MyClass[~T]
•••• With str ••••
serialize <__main__.MyClass object at 0x7f6bdbb97510>
serialize <__main__.MyClass object at 0x7f6bdbb97510>
b: {'a': 'foobar'}
deserialize hello3, __main__.MyClass[str]

The reason why converter.register_structure_hook(MyClass[T], MyClass.deserialize) is probably how typing._GenericAlias defines __eq__ and __hash__, i.e two of them are only equal when both __origin__ and all args (above this is T vs. str) are equal :

class _GenericAlias(_Final, _root=True):

    def __eq__(self, other):
        if not isinstance(other, _GenericAlias):
            return NotImplemented
        if self.__origin__ != other.__origin__:
            return False
        if self.__origin__ is Union and other.__origin__ is Union:
            return frozenset(self.__args__) == frozenset(other.__args__)
        return self.__args__ == other.__args__

    def __hash__(self):
        if self.__origin__ is Union:
            return hash((Union, frozenset(self.__args__)))
        return hash((self.__origin__, self.__args__))
@Tinche
Copy link
Member

Tinche commented Feb 2, 2022

You got it right. register_structure_hook delegates the logic to a functools.singledispatch instance, and those don't work well with a lot of these more advanced use cases.

So then you use the register_unstructure_hook_func approach. These can be whatever you like; yours looks good.

@raabf
Copy link
Contributor Author

raabf commented Feb 4, 2022

OK very nice!

However, to make it easier for other users: Usually in cattrs, when you un-/structure a certain class and this class does not have an un-/structure hook itself, but its parent, cattrs will use the un-/structure of the parent.

cattrs could be designed in that way that it sees MyClass[WHATEVER] as a subclass of MyClass although it isn’t technically and automatically use un-/structure hook of MyClass for any MyClass[WHATEVER], i.e. that you only register MyClass and that the following work:

def main3() -> None:

    print("•••• direct ••••")
    converter.unstructure(MyClass())
    converter.structure("hello1", MyClass)

    print("•••• With str ••••")
    converter.unstructure(MyClass[str]()) # Do call MyClass.serialize()

    converter.unstructure(MyBase(a=MyClass[str]())) # Do call MyClass.serialize()

    converter.structure("hello3", MyClass[str]) # Do call MyClass.deserialize()

if __name__ == '__main__':

    converter.register_structure_hook(MyClass, MyClass.deserialize)
    converter.register_unstructure_hook(MyClass, MyClass.serialize)

    main3()

I do not know so much about functools.singledispatch but that sound something hard to tell it. The problem is that something like MyClass[str] and MyClass[int] are two different instances with no common ancestor – the only way I found is checking the __origin__ attribute.

@raabf
Copy link
Contributor Author

raabf commented Feb 8, 2022

Problem of this issue also described here #87

@phiresky
Copy link

I also just hit this, with an attribute of type Foo[Bar] not being serialized with the error:

cattrs.errors.StructureHandlerNotFoundError: Unsupported type: 'Foo[Bar]'. Register a structure hook for it.

The workaround was to do this (i already had the issubclass_safe function):

def issubclass_safe(sub: Any, sup: type) -> bool:
    if typing.get_origin(sub) is not None:
        # is an instantiation of a generic type, so compare also the unparameterized type
        if issubclass_safe(typing.get_origin(sub), sup):
            return True
    # check if sub is actually a class, otherwise issubclass throws
    return inspect.isclass(sub) and issubclass(sub, sup)

_converter.register_structure_hook_func(
        lambda t: issubclass_safe(t, Foo),

But in my opinion cattrs should use typing.get_origin(type) (same as type.__origin__) and check that against registered classes as well

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants