Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement a disambiguator for discriminated unions #392

Merged
merged 10 commits into from
Oct 2, 2023
Merged

Implement a disambiguator for discriminated unions #392

merged 10 commits into from
Oct 2, 2023

Conversation

A5rocks
Copy link
Contributor

@A5rocks A5rocks commented Jul 9, 2023

Another way at "tagged unions" except backwards compatible! Also, a ton of services provide data in this fashion.

Look at the tests for some examples!

Another way at "tagged unions" except backwards compatible! Also, a ton
of services provide data in this fashion. Debatably cattrs already does
something similar where you don't need to register a structure hook for
a `Literal[1, 2, 3, 4]` or whatever.
@A5rocks
Copy link
Contributor Author

A5rocks commented Jul 18, 2023

Hi @Tinche, just pinging in case you missed this!

@Tinche
Copy link
Member

Tinche commented Jul 18, 2023

Hi there! So here are some of my thoughts.

I used to apply this pattern consistently at my $DAYJOB since, as you say, it's simple, powerful and widely used. Although my classes had defaults for the tag field so they could be instantiated directly, not just through cattrs.

But the main issue is it goes against some fundamental cattrs design principles. It has two major problems.

The first one is that you need to have the tag as a Literal field on the classes. In Python, this field is useless: the tag is the actual class. So it just sits there using memory and cpu cycles in __init__ et al. If you think about it, it should really be a ClassVar, and then if you think about it some more, that information is already present in the class object.

The second issue is it puts un/structuring data on the model itself. cattrs is fundamentally an exercize in decoupling serialization logic from models.

The tagged union strategy introduced in the last release solves both of these issues. It was the result of learnings I had using this Literal approach.

That said, maybe there's a way forward here. The one place where cattrs has historically strayed from its design principles was in the default union disambiguator. I'm good with going for convenience here.

So I would be open to you combining this work with the default disambiguator. Currently, the default disambiguator looks for unique field names; how about we add a pass before where it would look for Literal ClassVars?

@A5rocks
Copy link
Contributor Author

A5rocks commented Jul 19, 2023

Overall I think this makes sense! I do agree this does kinda suck but the reason I made this PR is cause I'm autogenerating a discriminated enum and it's quite annoying having to autogenerate cattrs converters too (even if I think I finally figured it out).

Just one question:

So I would be open to you combining this work with the default disambiguator. Currently, the default disambiguator looks for unique field names; how about we add a pass before where it would look for Literal ClassVars?

I'm not sure what this entails. I've placed this next to where the other disambiguator is used; would you rather I rename the other one and then stick this in that one? Provide a nicer way to signalling "this doesn't work" instead of raising exceptions and catching it? I just need more details!

@Tinche
Copy link
Member

Tinche commented Jul 26, 2023

Yeah, I was thinking you remove the disambiguator you created, and put the logic from it into the default disambiguator.

So the default disambiguator first looks for literal fields, and then does the unique field logic on the rest of the classes.

@Tinche
Copy link
Member

Tinche commented Aug 30, 2023

Howdy, how are we looking here? Trying to plan the next release.

@A5rocks
Copy link
Contributor Author

A5rocks commented Aug 30, 2023

Sorry, I think this works as you mentioned.

@Tinche Tinche added this to the 23.2 milestone Sep 3, 2023
@Tinche
Copy link
Member

Tinche commented Sep 4, 2023

Just a small note, I'll be away for the next 3 weeks on a family trip and I can continue reviewing this after I'm back!

@Tinche
Copy link
Member

Tinche commented Sep 23, 2023

Alright, I'm back so let's get this over the line and release the next version!

Looks like it needs a rebase and a linting error fixed.

@A5rocks
Copy link
Contributor Author

A5rocks commented Sep 26, 2023

I'm going to assume the linting errors are fixed because they're in files this PR doesn't touch xd

(I'll see whenever the workflows actually get approved...)

@Tinche
Copy link
Member

Tinche commented Sep 27, 2023

I dropped 3.7 support on main, that would explain these failures ;)

@A5rocks
Copy link
Contributor Author

A5rocks commented Sep 27, 2023

I'm a bit confused as to why isort and ruff are complaining about files I didn't modify :(

I've gone ahead and fixed the obvious isort issue but I'm not sure what else it doesn't like -- I'll try running it locally in a bit :^)

Oh. Ruff complains on main branch and I forgot I added isort to run in CI.

... I believe isort is complaining because the environment is different (it doesn't complain locally). I'll remove it; I don't have the CI expertise for this!

@A5rocks
Copy link
Contributor Author

A5rocks commented Sep 28, 2023

Well, after a lot of problems with linters it looks like that saga's over :D

@Tinche
Copy link
Member

Tinche commented Oct 2, 2023

Thanks, I'll merge this in now.

I might tweak the docs a little and work on a way to disable this by default; there's a change this might break someone and they need to have an escape hatch to the old behavior, but since I think the chance is low the escape hatch doesn't need to be super simple.

What do you think if we also change it to allow literal fields with defaults?

@Tinche Tinche merged commit 43f7d0f into python-attrs:main Oct 2, 2023
8 of 9 checks passed
@A5rocks
Copy link
Contributor Author

A5rocks commented Oct 2, 2023

I think as long as there's only one type with a default for a literal (or missing literal) then that's fine. Or if it's fine for the disambiguation function to raise if it's given a dictionary without that literal key.

@Tinche
Copy link
Member

Tinche commented Oct 3, 2023

What's the use case that's concerning you?

I think we can be more lax with literals since defining a field as field_x: Literal['x'] = 'x' is very uncommon, I think you'd only do it for serialization.

@A5rocks
Copy link
Contributor Author

A5rocks commented Oct 3, 2023

Hm, I'm mostly concerned about deserializing an object missing that literal key.

@Tinche
Copy link
Member

Tinche commented Oct 3, 2023

But this strategy is about serialization, not deserialization ;)

I want to understand so I don't mess up your use cases, since you're the original contributor. Can you sketch a small example for me?

@A5rocks
Copy link
Contributor Author

A5rocks commented Oct 3, 2023

Errrr, sorry I messed up my words. The counter example I'm thinking is:

import attrs
from typing import Literal
import cattrs

@attrs.define()
class A:
  x: Literal[4] = 4

@attrs.define()
class B:
  x: Literal[5] = 5

cattrs.structure({"x": 4}, A | B)  # this might still work
cattrs.structure({}, A | B)  # but what about this?

... I'm probably misunderstanding something. (I'm not sure what is "serialization" and what is "deserialization" with cattrs...)

@A5rocks A5rocks deleted the discriminated-union-support branch October 3, 2023 16:48
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants