Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add "viirs_edr" reader #2531

Merged
merged 61 commits into from
Aug 18, 2023
Merged

Add "viirs_edr" reader #2531

merged 61 commits into from
Aug 18, 2023

Conversation

djhoese
Copy link
Member

@djhoese djhoese commented Jul 18, 2023

This is a replacement of #1972, #1483, and may be a replacement for the viirs_l2_cloud_mask_nc reader added in #2316. Basically I need to be able to read the NOAA Enterprise EDR files and all of these PRs share a little bit of logic but were never completed. My priority right now is the surface reflectance products and the vegetation indices. There are cloud top temp and height products that I may add here or in a separate PR.

The existing readers or PRs for readers all use the "L2" name. These are "EDRs". Functionally the same, but we should be consistent with the majority of NOAA documentation (which admittedly also occasionally refers to them as L2). There are also existing readers that I had added in the past called "viirs_edr_X". Given the nature of these products, I think this new reader can be called a generic viirs_edr name (no suffix). I haven't looked into most of the files I hope to support, but my hope is that the enterprise algorithms are relatively similar in file structure. Even if they aren't, they are NetCDF files and represent the same level of product for the same instrument. Anyone object?

Side note: Kathy Strabala and I thought maybe the surface reflectance products could be called "srf_i01" and so on. Reflecting that they are different than the SDR bands (TOA versus surface), but also being consistent with the "i01" style naming rather than the "i1" naming that the existing PRs are using. I could also see doing "I01" (uppercase letter) to be consistent with the viirs_sdr reader, but then I'd have to uppercase SRF which I don't really want to do.

I'll have to decide later whether or not it is appropriate to deprecate the viirs_l2_cloud_mask_nc reader in favor of this one.

Opinions? Please comment.

CC @pnuu @simonrp84 @nbearson

Edit: I see @simonrp84 used surf_refl_I01 for his surface reflectance products. Maybe I'll stick with that. Again, opinions welcome.

  • Closes #xxxx
  • Tests added
  • Fully documented

@djhoese djhoese added enhancement code enhancements, features, improvements component:readers labels Jul 18, 2023
@djhoese djhoese self-assigned this Jul 18, 2023
@codecov
Copy link

codecov bot commented Jul 18, 2023

Codecov Report

Merging #2531 (3b078da) into main (f74d4d6) will increase coverage by 0.00%.
Report is 14 commits behind head on main.
The diff coverage is 99.28%.

@@           Coverage Diff            @@
##             main    #2531    +/-   ##
========================================
  Coverage   94.87%   94.87%            
========================================
  Files         349      349            
  Lines       50855    51155   +300     
========================================
+ Hits        48248    48534   +286     
- Misses       2607     2621    +14     
Flag Coverage Δ
behaviourtests 4.29% <0.23%> (-0.03%) ⬇️
unittests 95.47% <99.28%> (-0.01%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

Files Changed Coverage Δ
satpy/readers/file_handlers.py 98.88% <ø> (ø)
satpy/readers/viirs_edr.py 98.08% <98.08%> (ø)
satpy/readers/__init__.py 98.29% <100.00%> (ø)
satpy/tests/reader_tests/test_viirs_edr.py 100.00% <100.00%> (ø)

... and 6 files with indirect coverage changes

@simonrp84
Copy link
Member

simonrp84 commented Jul 19, 2023

I prefer uppercase band names with leading zeros (I01 etc) for consistency with the SDR datasets.

Regarding the choice of name, I chose surf_refl_ simply for lack of another obvious option. My gut feeling is that srf_ could be confused with other variables, or even spectral response function, so to me spelling it out more clearly is the better option.

Seems fine to call the whole lot EDR rather than L2.

Copy link
Member

@pnuu pnuu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM as-is.

do I change all these hardcoded-in-YAML variables to be dynamically discovered so the reader can basically load anything?

The reader YAML is pretty long, so I'm not against it. The only downside I can think if is that to see what is supported (and by which name) one would need to first have the data and use available_dataset_names().

@djhoese
Copy link
Member Author

djhoese commented Aug 8, 2023

Yeah that is true, but I think that comes with the territory of a generic reader that reads generic files (NOAA's enterprise algorithm output). @simonrp84 thoughts?

Copy link
Member

@simonrp84 simonrp84 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A couple of minor comments. Personally, I'd like the masking of invalid data to be optional rather than hardcoded.
My comment about the number of columns is due to bitter experience of being hit by granules varying from the "typical" number of columns in the past!

satpy/etc/composites/viirs.yaml Show resolved Hide resolved
satpy/readers/viirs_edr.py Show resolved Hide resolved
def get_dataset(self, dataset_id: DataID, info: dict) -> xr.DataArray:
"""Get the dataset."""
data_arr = self.nc[info['file_key']]
data_arr = self._mask_invalid(data_arr, info)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this be made optional? Some users may wish to access the unmasked data.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For the NDVI/EVI special case with QF checks, I think it could be optional since that's very complex and opinionated. Otherwise, I'm not sure I agree. This is checking fill values and valid ranges. Why would a user need to have that? The L2 products don't encode anything with different fill values as far as I know...I think, right?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, I meant to comment on the other _mask_invalid, the one for the NDVI/EVI. If we could make that one optional then that'd be great. This one, that I actually comment on, is fine.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is done now. I had a very non-creative moment and went with filter_veg as the kwarg. Other ideas?

@djhoese
Copy link
Member Author

djhoese commented Aug 15, 2023

@pnuu I added your viirs_l2_cloud_mask_nc reader to the PENDING_OLD_READER_NAMES dictionary. This basically tells users "here's the name you used, but you should use this name. For now Satpy will automatically switch to the new reader". But in the case of our two readers aren't just a rename, the variable names are different. This is causing the tests for your reader to fail because they try to load variables that don't exist in the new reader. I feel like my only two options are:

  1. Remove the code and tests for your reader. This is something I forgot about this deprecation functionality I added many years ago. The idea is that the old reader doesn't exist anymore.
  2. Don't deprecate your reader.

What do you think?

@pnuu
Copy link
Member

pnuu commented Aug 16, 2023

Go with option 1.

I tried the reader, and currently I get this error:

glbl.load(['CloudMaskBinary'])
*** KeyError: "No dataset matching 'DataQuery(name='longitude_jrr_cloudmask', resolution=750)' found"

Looking at the files I have, the coordinate variables are Latitude and Longitude.

@djhoese
Copy link
Member Author

djhoese commented Aug 16, 2023

I did detect one bug late last night that I'll work on this morning. I'll try to find the same files you're using and figure out the issues.

@djhoese
Copy link
Member Author

djhoese commented Aug 16, 2023

Oh FYI @pnuu that error message still makes sense. The variables for CloudMaskBinary actually makes sense because I name the individual lon/lat datasets based on the file type they are coming from. I decided to change how Simon had done it in the original version of this reader where it assumed you were only ever loading one file type at a time and that all lon/lats were equivalent across files. So it was doing "oh we found longitude in this file type, use that for all longitude requests", but now it always loads lon/lat from the file for a particular dataset.

I'm still debugging things...

@djhoese
Copy link
Member Author

djhoese commented Aug 16, 2023

@pnuu Try now. Recent commits should fix a couple important issues and have removed your readers python and YAML.

@pnuu
Copy link
Member

pnuu commented Aug 17, 2023

Thanks, seems to be working as before (after reader and channel name adjustments, ofc), good job!

@djhoese
Copy link
Member Author

djhoese commented Aug 17, 2023

Great. I think after I fix a few more bugs I've encountered this should be ready to merge.

@djhoese djhoese merged commit 2958fe9 into pytroll:main Aug 18, 2023
19 checks passed
@djhoese djhoese deleted the viirs_edr_jrr branch August 18, 2023 02:28
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
component:readers enhancement code enhancements, features, improvements
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants