UnionType Items in Array Type #39

Closed
mladerman opened this issue Feb 24, 2016 · 2 comments

@mladerman

Hi there! I've hit an issue where encoding an ArrayType that has a UnionType for its items fails.

{
  "namespace": "example.avro",
  "type": "record",
  "name": "Example",
  "fields": [
    {
        "name": "values",
        "type":
            {
                "type": "array",
                "items": ["int", "string"]
            }
    }
  ]
}

Here's a minimal example that reproduces the error:

var Avro = require('avsc');
var schema = Avro.parse('./example.avsc'); // The schema shown above.
var data = {values: ["test"]}; // Array item passed as a plain (unwrapped) string.
schema.toBuffer(data); // Throws:
TypeError: Object.keys called on non-object
    at Function.keys (native)
    at UnionType._write (avsc/lib/types.js:826:19)
    at ArrayType._write (avsc/lib/types.js:1367:19)
    at RecordType.writeExample [as _write] (eval at <anonymous> (avsc/lib/types.js:1678:10), <anonymous>:3:6)
    at RecordType.Type.toBuffer (avsc/lib/types.js:264:8)

Interestingly enough, this succeeds when passing in an empty array for values.

I've tested the same schema with the official Python Avro implementation to confirm that it is valid. Please let me know if there is any more information I should provide!

@mtth
Owner

mtth commented Feb 24, 2016

Hey! This is related to how decoded unions are represented. The default UnionType expects each value to be wrapped inside an object keyed by its branch name (similar to the union's JSON representation). So for your example, the value you are passing ({values: ['test']}) should be changed to {values: [{'string': 'test'}]}. An empty array succeeds because no union value is ever written.
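
To make the wrapping concrete, here is a minimal sketch for the ["int", "string"] union from the schema above. Note that `wrapIntOrString` is a hypothetical helper written for this illustration; it is not part of avsc and only handles these two branches.

```javascript
// Hypothetical helper (not part of avsc): wraps a plain JS value in the
// single-key object that the default UnionType expects for ["int", "string"].
function wrapIntOrString(value) {
  if (typeof value === 'string') {
    return {string: value};
  }
  // `(value | 0) === value` holds only for numbers that fit in 32 bits,
  // which matches the range of Avro's "int".
  if (typeof value === 'number' && (value | 0) === value) {
    return {int: value};
  }
  throw new TypeError('value matches neither "int" nor "string": ' + value);
}

// Wrap every array item before encoding.
var data = {values: ['test', 42].map(wrapIntOrString)};
// data is now {values: [{string: 'test'}, {int: 42}]}
```

The resulting `data` can then be passed to `schema.toBuffer(data)` in place of the unwrapped value from the original snippet.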

This is required to correctly serialize all union values. Here's an example of an ambiguous case otherwise:

var type = avro.parse(['int', 'float']);
var buf = type.toBuffer(2); // Should 2 be serialized as an integer or a float?

The Python implementation accepts unwrapped values and infers the branch, which can cause corruption in ambiguous cases like this one (it also requires a linear scan of branches during serialization, which is slow). If you're curious, you can take a look at #16 for more context.
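
As a rough illustration of why inferring the branch is fragile, the sketch below resolves a branch by scanning the union in order and taking the first match. This is a simplified stand-in, not the Python library's actual code; `checks` and `firstMatchingBranch` are hypothetical names, and the checks are deliberately naive.

```javascript
// Simplified illustration only: pick a union branch by scanning the branches
// in order and returning the first one whose check accepts the value.
var checks = {
  int: function (v) { return typeof v === 'number' && (v | 0) === v; },
  float: function (v) { return typeof v === 'number'; },
  string: function (v) { return typeof v === 'string'; }
};

function firstMatchingBranch(branches, value) {
  // Linear scan: the cost grows with the number of branches in the union.
  for (var i = 0; i < branches.length; i++) {
    if (checks[branches[i]](value)) {
      return branches[i];
    }
  }
  throw new TypeError('no matching branch');
}

// The same value resolves to a different branch depending on declaration order.
var asDeclared = firstMatchingBranch(['int', 'float'], 2); // 'int'
var reordered = firstMatchingBranch(['float', 'int'], 2);  // 'float'
```

With wrapped values such as {int: 2} or {float: 2}, the branch is named explicitly, so no scan or guess is needed.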

Not all unions lead to ambiguous cases, though, so I'm working on adding an option to represent decoded union values without the wrapping object when possible.

@mtth mtth closed this as completed Feb 24, 2016
@mladerman
Author

Ah, perfect, thank you!
