Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Reimplement support for ASDF-in-FITS in this package #110

Merged

Conversation

eslavich
Copy link
Collaborator

@eslavich eslavich commented Jan 8, 2023

This PR reimplements support for ASDF-in-FITS in this package. It requires asdf-format/asdf#1289, so we'll need to get that merged and released before merging this.

Checklist

  • added entry in CHANGES.rst (either in Bug Fixes or Changes to API)
  • updated relevant tests
  • updated relevant documentation
  • updated relevant milestone(s)
  • added relevant label(s)

@eslavich eslavich force-pushed the eslavich-implement-asdf-in-fits branch from af2d0d7 to f0dbf80 Compare January 8, 2023 07:30
@codecov
Copy link

codecov bot commented Jan 8, 2023

Codecov Report

Base: 97.96% // Head: 98.18% // Increases project coverage by +0.22% 🎉

Coverage data is based on head (1deeabb) compared to base (c434220).
Patch coverage: 100.00% of modified lines in pull request are covered.

❗ Current head 1deeabb differs from pull request most recent head 11be0d7. Consider uploading reports for the commit 11be0d7 to get more accurate results

Additional details and impacted files
@@            Coverage Diff             @@
##           master     #110      +/-   ##
==========================================
+ Coverage   97.96%   98.18%   +0.22%     
==========================================
  Files          14       15       +1     
  Lines        1275     1432     +157     
==========================================
+ Hits         1249     1406     +157     
  Misses         26       26              
Flag Coverage Δ
unit 98.18% <100.00%> (+0.22%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
tests/test_fits.py 99.03% <ø> (-0.01%) ⬇️
tests/test_embedded_asdf.py 100.00% <100.00%> (ø)

Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here.

☔ View full report at Codecov.
📢 Do you have feedback about the report comment? Let us know in this issue.

@braingram
Copy link
Collaborator

braingram commented Jan 13, 2023

While looking at increasing the coverage I ran into what looks like a reintroduction of asdf-format/asdf#1232

Adding a breakpoint and inspecting file_path here:

x.arr[2].data = array3
del x.arr[1]
assert len(x.arr) == 2
x.to_fits(file_path)

I see a file with

arr:
- data: !core/ndarray-1.0.0
    source: 0
    datatype: float64
    byteorder: little
    shape: [5, 5]
- data: !core/ndarray-1.0.0
    source: 1
    datatype: float64
    byteorder: little
    shape: [5, 5]
meta: {model_type: DataModel}
...

Which differs from what is seen without this PR:

arr:
- data: !core/ndarray-1.0.0
    source: fits:FOO,1
    datatype: float64
    byteorder: big
    shape: [5, 5]
- data: !core/ndarray-1.0.0
    source: fits:FOO,2
    datatype: float64
    byteorder: big
    shape: [5, 5]
meta: {model_type: DataModel}
...

Specifically, the hdus are not being linked to arrays resulting in duplicate data which would reintroduce asdf-format/asdf#1232

@eslavich eslavich force-pushed the eslavich-implement-asdf-in-fits branch from ebf473d to 94b4b88 Compare January 13, 2023 20:10
@eslavich
Copy link
Collaborator Author

I fixed the bug that @braingram caught and added a bunch of tests. The jwst test failure is due to these two reference files:

jwst_nircam_flat_0141.fits
jwst_nircam_flat_0142.fits

They are using an ancient version of the ASDF Standard:

#ASDF 1.0.0
#ASDF_STANDARD 1.1.0
%YAML 1.1
%TAG ! tag:stsci.edu:asdf/
--- !core/asdf-1.0.0

which means the top-level object is validated against this goofy schema:

https://github.com/asdf-format/asdf-standard/blob/4799a2e951057108525309b0294337c1ee7ec0a5/src/asdf_standard/resources/schemas/stsci.edu/asdf/core/asdf-1.0.0.yaml#L27-L46

The ndarray object that happens to be assigned to data fails because it is no longer in the tagged format. If we update those two references to a more modern version of ASDF then the tests should clear up.

@braingram
Copy link
Collaborator

Thanks @eslavich
I'm not seeing the jwst test failure(s) you're referring to. The ones I see in the CI are CRDS/cache related.
https://github.com/spacetelescope/stdatamodels/actions/runs/3917790580/jobs/6697799856#step:6:409
If I manually sync crds locally and run these tests I see no failures for jwst.datamodels.tests
Are the errors from other tests in jwst?

@eslavich
Copy link
Collaborator Author

When I run the jwst tests on my laptop (with this stdatamodels branch) I get failures like this:

FAILED jwst/datamodels/tests/test_schema_against_crds.py::test_crds_selectors_vs_datamodel[nircam] - jsonschema.exceptions.ValidationError:      array([[1., 1., 1., ..., 1., 1., 1.],

I checked the current nircam flat rmap and confirmed that it does include rules for those ancient reference files, so I think the problem must be real. If we replace those two files it should stop complaining.

@braingram
Copy link
Collaborator

Thanks for the details. I had the wrong version of stdatamodels, with the code in this PR I am able to reproduce the issue you described.

jsonschema.exceptions.ValidationError:      array([[1., 1., 1., ..., 1., 1., 1.],
[1., 1., 1., ..., 1., 1., 1.],
[1., 1., 1., ..., 1., 1., 1.],
...,
[1., 1., 1., ..., 1., 1., 1.],
[1., 1., 1., ..., 1., 1., 1.],
[1., 1., 1., ..., 1., 1., 1.]], dtype=float32) is not valid under any of the given schemas

     Failed validating 'anyOf' in schema['properties']['data']:
         {'$schema': 'http://stsci.edu/schemas/yaml-schema/draft-01',
...

Am I understanding this correctly that the issue is that _tagged_tree_transform is converting the tagged dict array object (with a source describing the fits hdu for the array data) into a ndarray (returning hdu.data). ASDF then attempts to validate what it expects is a tagged tree and fails when it encounters the ndarray?

@eslavich
Copy link
Collaborator Author

Am I understanding this correctly that the issue is that _tagged_tree_transform is converting the tagged dict array object (with a source describing the fits hdu for the array data) into a ndarray (returning hdu.data). ASDF then attempts to validate what it expects is a tagged tree and fails when it encounters the ndarray?

Yes, that's right. I don't think this will be a problem for jwst since the model schemas don't $ref in ndarray-1.0.0 for FITS arrays. The only reason it's failing is due to the weird old asdf-1.0.0 schema that is kicking in because our array happens to be named "data".

@braingram
Copy link
Collaborator

Thanks for the details.

I opened a PR against your branch with some changes that so far seem to pass all tests (including the jwst ones) and doesn't require the addition of _tagged_tree_transform to asdf.
https://github.com/eslavich/stdatamodels/pull/1/files

It does require reaching into the internal NDArrayType._source attribute. Also, I believe that opening without lazy load would result in an exception as the attempt to resolve the source (e.g. fits:HDU) would attempt to open an external file (as this appears like a url). However this option doesn't appear to be overridden anywhere in stdatamodels.

@eslavich what are your thoughts on the approach in the PR?

Copy link
Collaborator

@braingram braingram left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good to me.

I started a jwst regression test run with the source branch and will update this with results.
https://plwishmaster.stsci.edu:8081/job/RT/job/JWST/2184

I also took a look at jdaviz and hcipy to see how the uses of AsdfInFits might need to be updated. The general proposal would be to update those libraries to include stdatamodels as a dependency (and perhaps add some convenience functions here to help with the integration). However we can handle that in a different PR once we have a timeframe for dropping AsdfInFits in asdf.

eslavich and others added 8 commits January 24, 2023 11:47
@braingram braingram force-pushed the eslavich-implement-asdf-in-fits branch from 1deeabb to 11be0d7 Compare January 24, 2023 16:47
@braingram
Copy link
Collaborator

I had to restart the regression tests.
https://plwishmaster.stsci.edu:8081/job/RT/job/JWST/2185/
The new run is using my branch: https://github.com/braingram/stdatamodels/tree/feature/alternate_asdf_in_fits which has the same changes as the source branch for this PR but also has updated tags (to allow the version to exceed the requirement in jwst to prevent the regtests from installing stdatamodels from pypi).

@braingram
Copy link
Collaborator

Regtests 1285 showed 12 failures which are the same as the failures in 1284 (run against the pypi version of stdatamodels) and same as failures in other recent test runs.

So this PR looks good to merge to me.

@WilliamJamieson how do you want to time the merge of this PR with #98 and #112?

@WilliamJamieson
Copy link
Contributor

I think should wait #98 until this one is merged too.

That way it can just migrate all the changes to the pyproject.toml at once.

@braingram braingram requested a review from nden January 25, 2023 14:27
@braingram
Copy link
Collaborator

I think should wait #98 until this one is merged too.

That way it can just migrate all the changes to the pyproject.toml at once.

Sounds good. I added @nden for another review.

Copy link
Collaborator

@nden nden left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the comment on the dependency of these changes on ASDF is not valid any more it should be removed to not confuse later reviewers.

@braingram
Copy link
Collaborator

If the comment on the dependency of these changes on ASDF is not valid any more it should be removed to not confuse later reviewers.

Thanks! I put a strike through that part of the conversation. Let me know if any other clean up would be helpful.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants