Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Subsequent calls of from_dict on dataclasses that contain Optional[type] and type | None breaks behaviour because of caching #236

Open
TomaszBorczyk opened this issue Jun 30, 2023 · 1 comment · May be fixed by #237
Labels
bug Something isn't working

Comments

@TomaszBorczyk
Copy link

TomaszBorczyk commented Jun 30, 2023

Describe the bug
Exception is thrown in dacite when using from_dict in certain scenarios on dataclass with <type> | None field, eg str | None

To Reproduce

import dataclasses
import dacite
import typing


@dataclasses.dataclass
class OptionalDataclass:
    text1: typing.Optional[str]


@dataclasses.dataclass
class UnionDataclass:
    text2: str | None


if __name__ == '__main__':
    d1 = {"text1": None}
    # In order to reproduce this bug:
    # 1. this needs to be called first
    dacite.from_dict(OptionalDataclass, d1)

    # 2. this field has to be non None
    d2 = {"text2": "abc"}

    # 3. config with cast needs to be provided
    # (content of cast array does not matter, as it only is necessary to trigger
    #   for cast_type in config.cast
    # in dacite/core.py)
    c = dacite.Config(cast=[int])

    # then below fails
    res = dacite.from_dict(UnionDataclass, d2, config=c)

    # without either of 1-3 from above steps, final from_dict will work as expected

Stack trace:

/usr/local/lib/python3.10/site-packages/dacite/core.py:64: in from_dict
    value = _build_value(type_=field_type, data=field_data, config=config)
/usr/local/lib/python3.10/site-packages/dacite/core.py:101: in _build_value
    if is_subclass(type_, cast_type):
/usr/local/lib/python3.10/site-packages/dacite/types.py:174: in is_subclass
    if is_generic_collection(sub_type):
/usr/local/lib/python3.10/site-packages/dacite/types.py:147: in is_generic_collection
    origin = extract_origin_collection(type_)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

collection = str | None

    def extract_origin_collection(collection: Type) -> Type:
        try:
            return collection.__extra__
        except AttributeError:
>           return collection.__origin__
E           AttributeError: 'types.UnionType' object has no attribute '__origin__'

Root cause is caching of dacite.types.is_generic_collection and dacite.types.is_generic - caching mechanism interprets both Optional[str] and str | None as the same value, so it returns wrong value when we first call it with Optional[str], and then call it with str | None. This means first call of from_dict with OptionalDataclass pollutes the cache and breaks the behaviour for from_dict with UnionDataclass. Note that running only dacite.from_dict(UnionDataclass, d2, config=c) (without preceding call) works as expected.

To prove this, below are even simpler reproduction steps that throw exception:

if __name__ == '__main__':
    is_generic(typing.Optional[str])
    is_generic_collection(str | None)

Below works just fine:

if __name__ == '__main__':
    is_generic_collection(str | None)

thus cache pollutes scope of dacite

Removing @cache from is_generic_collection, is_generic and extract_origin_collection fixes the issue in both cases

Expected behavior
Dacite transforms the data without error

Environment

  • Python version: 3.10
  • dacite version: 1.8.1

Workaround
Downgrade is an option - [email protected] works correctly

@TomaszBorczyk TomaszBorczyk added the bug Something isn't working label Jun 30, 2023
@TomaszBorczyk
Copy link
Author

I've opened the PR to fix the issue.

If I might, I would like to start a conversation about caching - as there seems to have been (and are, and might be more) some issues that appeared because of caching, have you considered reverting those changes completely and dropping caching altogether? Would the performance hit be significant enough that you would rather keep the caching?

Asking as I do not know the historical context (maybe there was a need for performance boost)

@mciszczon @konradhalas

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

Successfully merging a pull request may close this issue.

1 participant