Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

WIP Py3 bindings #51

Open
wants to merge 30 commits into
base: master
Choose a base branch
from
Open

WIP Py3 bindings #51

wants to merge 30 commits into from

Conversation

eboto
Copy link
Contributor

@eboto eboto commented Mar 22, 2017

WORK IN PROGRESS DO NOT MERGE

Discussion here

Python3 Courier Data Binding Generator

Courier bindings for Python 3 (tested against Python 3.6)

Features missing in action

Mainline Courier features that are not yet supported, but will be by the time
this thing is ready for submit:

  • Coercers
  • Non-primitive keyed maps. Hashcode not yet implemented for
    non-primitives.
  • Non-JSON serialization. Though avro should be trivial with existing
    avro libraries.
  • Default values.
  • Flat-typed definitions.
  • Optional fields. Currently all fields in MyRecord.create are
    required. These should be easy to add.
  • Deprecation.
  • Enum properties. No Fruits.BANANA.property("color")
  • $UNKNOWN on enums.

Additional tasks before merging to courier master:

  • Fix the most heinous TODO(py3) items from the changelist.
    • Specifically the ones that alter the general, not-only-py3 code.
  • Hook into the top-level cli
  • Many additional test-cases
  • Populate this documentation
  • Import classes via the package's __init__.py rather than per-class files

Additional desirable tasks for later:

  • Ship courier.py as a package through pip instead of through the
    generator
  • Examine abstractions that unify code the large amount of overlap between TSLite and Python3 code generation. Hopefully this can lead to writing new bindings more easily down the road.

Pythonic API Questions

  • What is the least surprising approach to mutability of data? Is it pythonic to expect that setting cookie.fortune.message = 'Hello World' will change the json value of courier.dumps(cookie)?
  • What is the least surprising approach to validation? I'm currently thinking to validate on:
    • courier.loads(MyRecordType, json_string)
    • courier.validate(my_record) of course
    • But we could also validate immediately after setting any field. This would end up being really expensive if setting fields in a loop, but would be maximally useful during development since py lacks a type system.

Notes from interviews with a real python dev:

  • instead of dumps / loads use parse / serialize
  • It strikes a better balance for python users to put the natural constructor args as optional named parameters to init, rather than as required/optional parameters on create. Instead, just enforce the contract in the init function itself, throwing a courier.ValidationError
  • Is useful to throw validation errors when a bad value is set into a courier record
    • if you try to set, and throw a validation error, it should keep the old version of the data (before the set)
  • Is it possible to make the union actually behave like a union of the constituent types? e.g.
    my_union = /* get the union of Cookie | FortuneCookie */
    my_union.calories
  • Absence of a required field should throw a AttributeError or whatever the right one is

@eboto
Copy link
Contributor Author

eboto commented Jun 2, 2017

Identified and patched a bug in array handling here https://github.com/dfxmachina/vision/pull/103#pullrequestreview-41849394

Gotta incorporate that into this PR

@sim0nsays
Copy link

Currently, python bindings don't support required fields that have a default value.
Adding a required field should be a backward-compatible change, meaning json records with no field set should be parsed into a valid python object.
Right now, bindings will fail to validate such json against the generated AVRO schema.

@eboto
Copy link
Contributor Author

eboto commented Aug 17, 2017

Validation of unions is broken.

This is because avro deserializes unions into python dicts with a different expected shape than the shape that courier deserializes them into. For example: given union[string], avro would successfully validate "1", which is not a valid representation of a courier union. Likewise, avro would fail to validate {"string": "1"}, which is the valid representation of a courier union.

We will have to either not validate unions, or bring legit validation into the python bindings.

@sim0nsays
Copy link

Another feature request: would be really great for the objects to have copy method similar to scala bindings. It would accept **args and produce a copy of the target instance with fields replaced by the name fields in the arguments.

@roni-instr
Copy link

roni-instr commented Feb 17, 2018

Bug report: double quotes in the comment result in an invalid schema.

Description:

Request.courier had the comments like:

/** Expected to be in the "file://" format. */

After generating courier bindings Parameters.py:

# ipython
In [1]: from Request import Request
Exception: Error parsing schema from JSON: '{"type":"record",
....
Error message: JSONDecodeError("Expecting ',' delimiter: line 1 column 3096 (char 3095)",).

No error if"file://" is replaced with 'file://'.

@ryfeus
Copy link

ryfeus commented Jan 31, 2019

@eboto I was looking into python bindings for courier and found this PR thread. I would like to help with this PR to prepare it for master. Are there any tasks which could help this branch?

Thanks.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants