Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update SpectrumList loader for JWST data #579

Merged
merged 7 commits into from
Feb 17, 2020

Conversation

jdavies-st
Copy link
Contributor

@jdavies-st jdavies-st commented Jan 31, 2020

  • Allow the reader to read either flux or surf_bright
  • Generalize the extraction of units to track column name, not column number
  • Read in the uncertainty array
  • Fix up some docstrings elsewhere
  • Write tests, tests, tests

Note that extract1d spectral products in the pipeline currently do not have populated uncertainty arrays. They are all zeros. That will be changing in the next month or so.

Copy link
Contributor

@nmearl nmearl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good @jdavies-st! I'd recommend adding a final test that is expected to fail to handle the case where the srctype is malformed (e.g. not point or extended), but otherwise, this is great.

@jdavies-st
Copy link
Contributor Author

Good point. I believe srctype can sometimes be "UNKNOWN" as well, so this would be a good thing to check.

Copy link
Member

@eteq eteq left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One minor (but potentially important) suggestion, but otherwise I think this looks good.

An additional suggestion, which might be better answered as "this is better done as a follow-on PR" - there's not really any part of the documentation that explains how to use this from the user perspective. There's https://specutils.readthedocs.io/en/stable/spectrum1d.html#reading-from-a-file but it doesn't have enough information for the user to really grasp anything about the subtelties of Spectrum1D vs SpectrumList and so on (or even how to select the appropriate reader if they want a particular one), while https://specutils.readthedocs.io/en/stable/custom_loading.html is geared towards making a loader rather than using one. Perhaps this loader should be mentioned in the docs as an example of when you might use SpectrumList instead of Spectrum1D?

return True
# This probably means we didn't have a FITS file
except Exception:
return False


@data_loader("JWST", identifier=identify_jwst_fits, dtype=SpectrumList,
@data_loader("JWST", identifier=identify_jwst_x1d_fits, dtype=SpectrumList,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
@data_loader("JWST", identifier=identify_jwst_x1d_fits, dtype=SpectrumList,
@data_loader("JWST_x1d", identifier=identify_jwst_x1d_fits, dtype=SpectrumList,

I think this is necessary if we want to have a whole family of JWST loaders, right? I.e., if we don't do this there's no way to use the name to specify which of the JWST loaders to use?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I think this is necessary as well. Would this be a backwards incompatible change though? Does that matter?

Perhaps this is where my limited knowledge of the infrastructure of the io registry and specifically in how it has been implemented in specutils works.

@eteq
Copy link
Member

eteq commented Feb 7, 2020

A related thought to the above (but somewhat independent): should there also be a Spectrum1D that can be used for JWST data sets that have just one spectrum in them? I don't know if this is actually an at all common use case, but it seems like it would be a trivial addition to this if there are any?

(This could also be a separate issue to not delay this particular PR)

  - Retain SpectrumList.read() on multi-object x1d functionality
  - Add Spectrum1D.read() on single spectrum x1d files
  - Return RuntimeError if trying to use Spectrum1D.read on multi-object x1d data
  - Update tabular-fits reader to ignore JWST data
@jdavies-st
Copy link
Contributor Author

I've added a reader for single spectrum x1d files for the Spectrum1D class. If passed multiple spectra, it raises RuntimeError. I've also expanded the testing. Hopefully I've responded to all review comments now.


error_units = u.Unit(hdu.columns["error"].unit)
uncertainty = StdDevUncertainty(hdu.data["error"] * error_units)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Having just sneaked into a discussion on the pros and cons of using the Table API for loading hdu data, I am wondering if this could be streamlined to

            data = Table.read(hdu)

            if srctype == "POINT":
                flux = Quantity(data["flux"])
                uncertainty = StdDevUncertainty(data["error"])
...

and so on for the other srctypes – see e.g. for comparison muscles_sed.py.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That looks good! I'll give it a try. Thanks for the tip.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That worked very well. Thanks. Added in a new commit. And I added some units checking to the tests, something I had overlooked before.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just remembered that parsing_utils has a function spectrum_from_column_mapping that in principle would even allow to simply specify this as dictionaries like

            if srctype == "POINT":
                column_mapping = {'wavelength': ('spectral_axis', None),
                                  'flux': ('flux', None),
                                  'error': ('uncertainty', None)}
            elif srctype == "EXTENDED":
                column_mapping = {'wavelength': ('spectral_axis', None),
                                  'surf_bright': ('flux', None),
                                  'sb_error': ('uncertainty', None)}

but this would require #573 to be merged to work without explicitly specifying the units - just for reference.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can add that simplification in a future PR. 👍

  - The 2 identifiers can distinguish between single and multi spec x1d
    files and load it into the appropriate Spectrum1D or SpectrumList
  - Add tests for this and for warning when error array is zeros
@jdavies-st
Copy link
Contributor Author

And finally I added some population of the Spectrum1D.meta attribute. Basically concat together the PRIMARY FITS header and the per-EXTRACT1D extension header into the meta dict as key:value pairs. This doesn't retain the header object comments, but perhaps this is preferable to having the whole FITS header as a value in a header dict key?

@nmearl
Copy link
Contributor

nmearl commented Feb 17, 2020

@jdavies-st I think the method of concatenating the header is fine, I'm not see any reason why it should cause issues now. But we should remind ourselves that we did it if we do run into programs. This looks great!

@nmearl nmearl merged commit 2da78bd into astropy:master Feb 17, 2020
@jdavies-st jdavies-st deleted the loaders-jwst branch February 21, 2020 15:15
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants