Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Handling unions from JSON #16

Closed
mattallty opened this issue Nov 28, 2015 · 9 comments
Closed

Handling unions from JSON #16

mattallty opened this issue Nov 28, 2015 · 9 comments
Labels

Comments

@mattallty
Copy link

Hi,

Thanks for this great library!

Quick question: is there a simple way to handle "unions" without having to wrap elements, maybe using Type Hooks or LogicalTypes or something else ?

For example:

My schema:

{"name":"foo", "type": ["null", "string"]}

I've to wrap element with the type:

{"foo": {"string": "my foo value"}}

What i'd like is to be able to simply encode my records without having to wrap them, like this:

{"foo": "my foo value"}

Thanks !

@mtth mtth added the question label Nov 28, 2015
@mtth
Copy link
Owner

mtth commented Nov 28, 2015

Hello,

I'll mention this first in case this is sufficient for your use-case: you can easily wrap values to make them valid by using clone's wrapUnions option (example here).

Now, if you would like unions to never be wrapped, you can also achieve that using a custom logical type and type hook (you had the right idea!):

var avsc = require('avsc'),
    util = require('util');

/**
 * Our new union, which will expose its values directly.
 *
 * This implementation is pretty minimal, we could optimize it by caching the
 * underlying union's type names for example.
 *
 */
function UnwrappedUnionType(attrs, opts) {
  avsc.types.LogicalType.call(this, attrs, opts);
}
util.inherits(UnwrappedUnionType, avsc.types.LogicalType);

UnwrappedUnionType.prototype._fromValue = function (val) {
  return val === null ? null : val[Object.keys(val)[0]];
};

UnwrappedUnionType.prototype._toValue = function (any) {
  return this.getUnderlyingType().clone(any, {wrapUnions: true});
};

/**
 * The type hook in charge of applying our new union type.
 *
 * We can't just add a `logicalType` attribute in the case of unions since
 * their schemas are vanilla arrays.
 *
 */
var visited = []; // To keep track of already unwrapped schemas.
function unwrappingTypeHook(attrs, opts) {
  if (attrs instanceof Array && !~visited.indexOf(attrs)) {
    visited.push(attrs);
    return new UnwrappedUnionType(attrs, opts);
  }
}

To check that it works:

var type = avsc.parse(['null', 'string'], {typeHook: unwrappingTypeHook});
var buf = type.toBuffer('hello');
var str = type.fromBuffer(buf); // 'hello'

Except perhaps when only dealing with "optional" unions (i.e. which contain 'null' and a single other type), I wouldn't recommending doing this though. For example, encoding becomes slower and harder to predict (if your union contains 'int' and 'double', the type of a value might change after a decode/encode round-trip). There are other reasons as well if you are curious.

@mtth mtth closed this as completed Nov 28, 2015
@mattallty
Copy link
Author

Thank you very much for the detailled explanation !

Unfortunetly, this does not seems to work for Record Types (or maybe I did something wrong).
(tried 1st and 2nd solution)

Here is an example:

      var ex_schema = ['null', {
        "type": "record",
        "namespace": "foo",
        "name": "bar",
        "fields": [
          {"name" : "firstname", "type": "string"},
          {"name" : "age", "type": ["null", "int"], "default": null}
        ]
      }]

      var type = avsc.parse(ex_schema, {typeHook: unwrappingTypeHook});
      var buf = type.toBuffer({"firstname" : "Matt"});

Error:

Error: invalid ["null","foo.bar"]: {"firstname":"Matt"}
    at throwInvalidError (./cli/node_modules/avsc/lib/schemas.js:2165:9)
    at UnionType._copy (./cli/node_modules/avsc/lib/schemas.js:898:3)
    at UnionType.Type.clone (./cli/node_modules/avsc/lib/schemas.js:278:15)
    at UnwrappedUnionType._toValue (./cli/lib/Cli/Command/tmp/test-avsc.js:166:26)
    at UnwrappedUnionType.LogicalType._write (./cli/node_modules/avsc/lib/schemas.js:1791:41)
    at UnwrappedUnionType.Type.toBuffer (./cli/node_modules/avsc/lib/schemas.js:247:8)

@mtth
Copy link
Owner

mtth commented Nov 28, 2015

You're right; this is tricky to do for nested unions currently. The above implementation won't support that. I'll see what I can do.

@mtth mtth reopened this Nov 28, 2015
@mtth
Copy link
Owner

mtth commented Nov 28, 2015

Just released a new version (3.1.0), starting from which the above example will work as expected!

I also added an example of how to implement an optional logical type which might be helpful to you. The use of general "unwrapped unions" is still discouraged (see above comment).

@mtth mtth closed this as completed Nov 28, 2015
@mattallty
Copy link
Author

Seems like it's working well, thank you very much for this ! 👍

@mtth
Copy link
Owner

mtth commented Nov 28, 2015

Great :).

@osi
Copy link

osi commented Nov 30, 2015

Thanks for this example! I was going to post the same question. I'm using avro for receipt-only in the browser right now, and eliminating the type information to look like "regular" JS objects is what I wanted to be compatible with JSON serialization. It would be awesome to have this built-in at least on the deserialization side.

@mtth mtth mentioned this issue Dec 3, 2015
@mtth
Copy link
Owner

mtth commented Dec 3, 2015

@osi - Thanks for the feedback. I agree that it'd be good to have a way to get rid of the wrapping object but I also believe this is currently best left to users to implement (so they can choose the best implementation for their use-case: e.g. if they only have "optional" unions, the simple example above would do, but other assumptions would require another).

On a related note: in the future, one of the things I would like to explore is to come up with a subset of Avro for which encoding of unions would always be well-defined without this wrapping object (e.g. with a single number type).

@osi
Copy link

osi commented Dec 4, 2015

One nice balance might be to expose the class and function from the example in the public API so it could be used w/o having to cut and paste it.

Otherwise, having the default be avro-JSON is totally sensible.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

3 participants