Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add deriving for unions with a single constructor #134

Merged

Conversation

jmickelin
Copy link

There seems to be a hole in the handling of union types in the TemplateHaskell derivation of the Avro types. Namely, there is an unstated assumption that the size of the unions are at least 2.
Moreover, running deriveAvro on an .avsc schema that contains exactly one constructor results in the confusing error message

Unions with more than 5 elements are not yet supported.

Avro unions with a single constructor aren't explicitly mentioned in the specification, but it does say that

Unions [...] are represented [in schemata] using JSON arrays

which does not rule out the length 1 array case. This is corroborated by the reference implementation used by avro-tools, which does generate length 1 arrays from .avdl schemata containing union declarations with a single element.

This PR modifies deriveAvro to wrap such unions in the Identity functor, serving as the degenerate case of the EitherN collection of types. It also provides more helpful error messages for union types
with an unsupported number of constructors.

I wasn't sure if using Identity is the best way to deal with this, or if you would prefer an Either1 newtype, but this can be easily changed!

@AlexeyRaga
Copy link
Member

I am thinking, do we actually need an Identity or any other wrapper in this case?
If a union only contains one type, why don't we just generate the field of that type without wrapping it in anything?

@jmickelin
Copy link
Author

jmickelin commented Feb 17, 2020

I'm not sure that will work, since that would fail the serialization/parsing roundtrip test for unions. Correct me if I'm wrong, since I am not fully familiar with the library internals outside of the things I changed, but it seems that the Avro data is parsed into the Avro type based on the HasAvroSchema typeclass. So there would be no way for HasAvroSchema to tell the difference between a serialized integer (say 1 :: Int02) and a serialized single-constructor union with an integer (Identity 102 02). So we do need a newtype in the end.

@jmickelin jmickelin force-pushed the derive-avro-for-singleton-unions branch from bb5b11a to 6dbac82 Compare February 17, 2020 13:31
@AlexeyRaga
Copy link
Member

AlexeyRaga commented Feb 17, 2020

Ah, I see what you mean.
I think that we could control it since we generate FromAvro/ToAvro instances, but it is more effort and I agree that wrapping it with Identity is better :)

Copy link
Member

@AlexeyRaga AlexeyRaga left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks a lot!

@AlexeyRaga AlexeyRaga merged commit c8c46e5 into haskell-works:master Feb 17, 2020
@jmickelin jmickelin deleted the derive-avro-for-singleton-unions branch March 18, 2020 13:38
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants