Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Rewrite code which tags complex objects #223

Merged
merged 5 commits into from
Oct 24, 2016
Merged

Rewrite code which tags complex objects #223

merged 5 commits into from
Oct 24, 2016

Conversation

bernie-simon
Copy link
Contributor

Part of the validation of an asdf tree against a schema is checking
tags in the schema against tags embedded in the data. Function
tagged.tag_objects has code supplies these tags, but orignally was
only coded to handle dctionaries, lists, and strings. This proved a
problem when checking the tags of times in the asdf tree, as times are
none of these types. I previously wrote a function specifically to
handle times, but his code had two problems. First, it created an
extra dependency on the astropy time code and second, the fix was
limited to time fields.

After becoming more familiar with the code I saw that the
functionality for handling more complex data objects was already in
asdf, in yamlutil.custom_tree_to_tagged_tree. So I removed the code I
had written to handle Time objects and replaced it with that
function. That code takes a new argument, ctx, which is an asdf
object, preferable the one associated with the asdf tree being
validated. It is needed to access any extensions that might be used
with the asdf tree. Since this is a new argument, the argument list to
tagged.tag_object was modified to include it as an optional argument
and code in asdf which invokes tagged.tag_object has been modified to
pass it where it is available. If ctx is not passed, the code creates
a new asdf object, which means that extensions cannot be used.

Part of the validation of an asdf tree against a schema is checking
tags in the schema against tags embedded in the data. Function
tagged.tag_objects has code supplies these tags, but orignally was
only coded to handle dctionaries, lists, and strings. This proved a
problem when checking the tags of times in the asdf tree, as times are
none of these types. I previously wrote a function specifically to
handle times, but his code had two problems. First, it created an
extra dependency on the astropy time code and second, the fix was
limited to time fields.

After becoming more familiar with the code I saw that the
functionality for handling more complex data objects was already in
asdf, in yamlutil.custom_tree_to_tagged_tree. So I removed the code I
had written to handle Time objects and replaced it with that
function. That code takes a new argument, ctx, which is an asdf
object, preferable the one associated with the asdf tree being
validated. It is needed to access any extensions that might be used
with the asdf tree. Since this is a new argument, the argument list to
tagged.tag_object was modified to include it as an optional argument
and code in asdf which invokes tagged.tag_object has been modified to
pass it where it is available. If ctx is not passed, the code creates
a new asdf object, which means that extensions cannot be used.
@nden
Copy link
Contributor

nden commented Oct 21, 2016

Just so I understand, is the current change applicable to Time objects?
Because custom tags are validated without this. Or am I mistaken?
For example, why does one need the change to domain?

It would help if there's a failing example (also as a test).
At a minimum this needs a changelog entry.

@coveralls
Copy link

Coverage Status

Coverage decreased (-0.08%) to 93.611% when pulling a2c3c08 on bernie-simon:redo_tagged_types into 1a2bf4a on spacetelescope:master.

@bernie-simon
Copy link
Contributor Author

After metadata is changed through datamodels, the change is validated using the asdf validation code. Part of this validation is making sure that the type of the data matches the type in the schema. These schema types are not (necessarily) the same types as in Python. The code handles this by wrapping the data in a new object with a _tag field containing a string which matches the schema type. The code in asdf/tagged.py is an interface between the tagging code in asdf and datamodels so that data is properly tagged before being validated.

Previously time fields failed the validation. This was worked around by changing the schema so that time fields were treated as strings. My previous change to asdf resolved the problem, but because I did not understand the code well enough, I wrote code that only handled time fields rather than the more general code that called the asdf tagging code and handles any tagged field in the schema.

The change to the calling sequence is regrettable, but the new argument is needed by yamlutil.custom_tree_to_tagged_tree. The new argument is used to access any non-standard extensions associated with a file, not a standard situation. Because of this and because it is an optional argument, it should not cause problems with the existing code. I have verfied this by running the tests for jwst against the change.

@nden
Copy link
Contributor

nden commented Oct 21, 2016

@bernie-simon Looks like you were editing an old version.

@@ -460,7 +460,7 @@ def to_tree_tagged(cls, node, ctx):
`to_tree`, allows types to customize how the result is tagged.
"""
obj = cls.to_tree(node, ctx)
return tagged.tag_object(cls.yaml_tag, obj)
return tagged.tag_object(cls.yaml_tag, obj, ctx=ctx)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't obj already containing ctx? Does it need to be passed separately and wouldn't this increase memory usage?

@bernie-simon
Copy link
Contributor Author

obj is the value being tagged. It may already be in the asdf tree, if the asdf tree is being validated, or it may not be in the tree, if a modification to the tree is being validated. In either case, it can't be accessed from ctx without knowing the path from the top of the tree.

@@ -87,7 +87,7 @@ def _to_tree_from_model_tree(cls, tree, ctx):
except KeyError:
raise ValueError("Unknown operator '{0}'".format(tree.value))

node = tagged.tag_object(cls.make_yaml_tag(tag_name), node)
node = tagged.tag_object(cls.make_yaml_tag(tag_name), node, ctx=ctx)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm. The CompoundType represents a compound model of type astropy.modeling.core.CompoundModel. In my experience it never fails serialization. Why does it need to be wrapped? I may be wrong but my understanding is that the intention was to tag only python primitive types.
It would really help if there's a failing example which illustrates the problem.

@bernie-simon
Copy link
Contributor Author

I'm speaking of the validation that happens after the asdf tree is modified in datamodels.properties. That code checks the schema for a tag and if it finds one, it calls the asdf validation code, which looks for a matching tag on the data being modified. Because data coming through datamodels.properties comes from the user and is not wrapped the way data is when walking the asdf tree, the tag is not there and validation fails. Probably you have not seen CompoundModel fail validation is because users do not try to add it through the datamodels.properties interface.

I added the test case I wrote for time tagged data to the asdf test
suite. The file is asdf/tests/test_tagged.py.
@bernie-simon
Copy link
Contributor Author

I've added a test case for time tagged data -- the test case is in asdf/tests/test_tagged.py

@coveralls
Copy link

Coverage Status

Coverage increased (+0.1%) to 93.831% when pulling 3336fd5 on bernie-simon:redo_tagged_types into 1a2bf4a on spacetelescope:master.

@bernie-simon bernie-simon merged commit c700408 into asdf-format:master Oct 24, 2016
@bernie-simon bernie-simon deleted the redo_tagged_types branch October 24, 2016 18:13
@nden nden added this to the 1.2.0 milestone Nov 6, 2016
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants