Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

VIIRS L2 Cloud Mask reader #2316

Merged
merged 19 commits into from
Jan 13, 2023
Merged

Conversation

pnuu
Copy link
Member

@pnuu pnuu commented Dec 8, 2022

This PR adds a reader for VIIRS L2 Cloud Mask products produced by CSPP.

So far there's only support for reading cloud mask (cloud_mask) and binary cloud mask (cloud_mask_binary).

  • Closes #xxxx
  • Tests added
  • Fully documented
  • Add your name to AUTHORS.md if not there already

@pnuu pnuu self-assigned this Dec 8, 2022
@codecov
Copy link

codecov bot commented Dec 8, 2022

Codecov Report

Merging #2316 (899e426) into main (85e629f) will increase coverage by 0.01%.
The diff coverage is 98.42%.

@@            Coverage Diff             @@
##             main    #2316      +/-   ##
==========================================
+ Coverage   94.59%   94.60%   +0.01%     
==========================================
  Files         314      316       +2     
  Lines       47547    47674     +127     
==========================================
+ Hits        44976    45101     +125     
- Misses       2571     2573       +2     
Flag Coverage Δ
behaviourtests 4.49% <0.78%> (-0.01%) ⬇️
unittests 95.24% <100.00%> (+0.01%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
satpy/composites/__init__.py 90.45% <ø> (ø)
satpy/utils.py 24.70% <33.33%> (+0.07%) ⬆️
satpy/readers/viirs_l2.py 100.00% <100.00%> (ø)
satpy/tests/reader_tests/test_viirs_l2.py 100.00% <100.00%> (ø)
satpy/tests/test_utils.py 100.00% <100.00%> (ø)

Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here.

@coveralls
Copy link

coveralls commented Dec 8, 2022

Coverage Status

Coverage: 95.2% (+0.01%) from 95.187% when pulling 899e426 on pnuu:feature-viirs-cspp-cloud-mask into 85e629f on pytroll:main.

@pnuu pnuu marked this pull request as ready for review December 14, 2022 10:28
Comment on lines 1 to 2
#!/usr/bin/env python
# -*- coding: utf-8 -*-
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Those lines are redundant and do not add anything useful.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed in c6c97a4

@@ -0,0 +1,155 @@
#!/usr/bin/env python
# -*- coding: utf-8 -*-
# Copyright (c) 2011-2019 Satpy developers
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The years seem wrong, should probably be 2022 or 2022-2023.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated in c6c97a4


from satpy.readers.netcdf_utils import NetCDF4FileHandler

LOG = logging.getLogger(__name__)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think lowercase, for example, logger, would fit better with style guides.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The logging wasn't even used, so removed it in 0720a1f

shape = self.get_shape(dataset_id, ds_info)
file_units = self._get_dataset_file_units(ds_info, var_path)

i = getattr(self[var_path], 'attrs', {})
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could this variable have a more descriptive name?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done in 732ad51

def _get_dataset_file_units(self, ds_info, var_path):
file_units = ds_info.get('file_units')
if file_units is None:
file_units = self.get(var_path + '/attr/units')
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If codecov is right, this could use a test.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, I think this bit of code isn't even needed, the units are coming from the reader YAML. I'll just remove this if block.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed in c32ec3d

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And a check that 'units' is in the dataset attributes added in 6e41d23

Comment on lines 150 to 151
yield is_avail, ds_info
continue
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't quite understand what's going on here, but it seems this could also use a test.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Neither do I. This was in the file handler I used as a template.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you define all the datasets in the yaml file, this function shouldn't be needed at all.

Comment on lines 1 to 2
#!/usr/bin/env python
# -*- coding: utf-8 -*-
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These lines are redundant and don't add anything.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed in 526b454


def __init__(self):
"""Initialize the class."""
self._dir = tempfile.mkdtemp()
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it be possible to use the pytest tmp_path fixture so you don't need to take care of the cleanup yourself?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In which way? Have the fixture passed to every test function and then pass the fixture to JRRCloudMaskFile.__init__()? When would the temp directory removed?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it'd be possible to use a class-scoped fixture. In this case the directory would be created before the first test in the class and deleted after the last one. There is also tmp_path_factory which is, I think, always session-scoped. Personally, I would just pass the (function scoped) fixture (or a fixture using the tmp_path fixture) explicitly to every test function needing it.

I don't feel strongly either way though; just a suggestion, no need to do anything as far as I'm concerned if you prefer to handle the mkdtemp yourself the way you've done.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

tmp_path has the advantage of taking care of removing the files after it, creating custom path for each tests, etc...

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

also, I would make a fixture that returns a created file using this class and tmp_path.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After running the tests three times:

$ ls -1 /tmp/pytest-of-lahtinep/
pytest-0
pytest-1
pytest-2
pytest-current
$ du -ch /tmp/pytest-of-lahtinep/
24M	/tmp/pytest-of-lahtinep/pytest-1/test_cloud_mask_read_cloud_mas0
24M	/tmp/pytest-of-lahtinep/pytest-1/test_cloud_mask_read_latitude0
24M	/tmp/pytest-of-lahtinep/pytest-1/test_cloud_mask_read_longitude0
24M	/tmp/pytest-of-lahtinep/pytest-1/test_cloud_mas_read_binary_clo0
94M	/tmp/pytest-of-lahtinep/pytest-1
24M	/tmp/pytest-of-lahtinep/pytest-0/test_cloud_mask_read_cloud_mas0
24M	/tmp/pytest-of-lahtinep/pytest-0/test_cloud_mask_read_latitude0
24M	/tmp/pytest-of-lahtinep/pytest-0/test_cloud_mask_read_longitude0
24M	/tmp/pytest-of-lahtinep/pytest-0/test_cloud_mas_read_binary_clo0
94M	/tmp/pytest-of-lahtinep/pytest-0
24M	/tmp/pytest-of-lahtinep/pytest-2/test_cloud_mask_read_cloud_mas0
24M	/tmp/pytest-of-lahtinep/pytest-2/test_cloud_mask_read_latitude0
24M	/tmp/pytest-of-lahtinep/pytest-2/test_cloud_mask_read_longitude0
24M	/tmp/pytest-of-lahtinep/pytest-2/test_cloud_mas_read_binary_clo0
94M	/tmp/pytest-of-lahtinep/pytest-2
282M	/tmp/pytest-of-lahtinep/
282M	total

No cleanup. At all, as far as I can see.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

try a couple times more.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

from https://docs.pytest.org/en/7.1.x/reference/reference.html#pytest.tmpdir.tmp_path :

By default, a new base temporary directory is created each test session, and old bases are removed after 3 sessions, to aid in debugging. If --basetemp is used then it is cleared each session. See The default base temporary directory.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Forgot to update. Indeed, on the next run the 0 index directories were deleted and new 3 indexed were created. This could potentially leave quite a lot of stuff behind, so maybe there should be a more strict cleaning. But that's another topic outside of this PR.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, and now the tests use tmp_path fixture. I also added a note at the top of the file listing the external fixtures that are used. I'll be creating a PR discussing this in the dev documentation, and possibly also adding the notes to other tests having this issue of seemingly undefined fixtures in them.

Comment on lines +30 to +31
NUM_COLUMNS = 3200
NUM_ROWS = 768
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does the test need to have such a large array? Does this make the test slower and use more resources than necessary?

Copy link
Member Author

@pnuu pnuu Jan 10, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the shape of a single array in the actual data, so I'd say "yes, this is the shape it should be". Running all the current tests for the reader takes 0.7 seconds, so it's not a huge burden.

Copy link
Member

@mraspaud mraspaud left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, some comments and questions

short_name: VIIRS CSPP Cloud Mask
long_name: VIIRS CSPP Cloud Mask data in NetCDF4 format
description: VIIRS CSPP Cloud Mask reader
status: alpha
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Beta maybe?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Upgraded (not pushed yet).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done in 23b0a41

satpy/etc/readers/viirs_l2_cloud_mask_nc.yaml Show resolved Hide resolved

def __init__(self):
"""Initialize the class."""
self._dir = tempfile.mkdtemp()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

tmp_path has the advantage of taking care of removing the files after it, creating custom path for each tests, etc...


def __init__(self):
"""Initialize the class."""
self._dir = tempfile.mkdtemp()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

also, I would make a fixture that returns a created file using this class and tmp_path.

Copy link
Member

@mraspaud mraspaud left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a few minor comments, but it looks good!

satpy/readers/viirs_l2.py Outdated Show resolved Hide resolved
Comment on lines 150 to 151
yield is_avail, ds_info
continue
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you define all the datasets in the yaml file, this function shouldn't be needed at all.

Comment on lines +631 to +632
elif isinstance(f, pathlib.Path):
local_files.append(f)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

codecov is complaining...

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's weird, because without this addition the tests weren't passing locally.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And adding a print inside that branch shows the filename.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Lets see if the complaint is still there after the tests complete for the latest changes.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added a separate test in 596aa59

@pnuu
Copy link
Member Author

pnuu commented Jan 12, 2023

If you define all the datasets in the yaml file, this function shouldn't be needed at all.

For some reason can't reply on this in a thread.. Do you mean the whole available_datasets() could be removed?

@pnuu
Copy link
Member Author

pnuu commented Jan 12, 2023

Ok, seems to work also without the available_datasets() method, so removed it in de48145.

@mraspaud
Copy link
Member

All good for me, feel free to merge when the tests pass.

@pnuu
Copy link
Member Author

pnuu commented Jan 12, 2023

And one final fix for ColormapCompositor docstring that caused the RtD build to fail.

@pnuu pnuu merged commit c990b07 into pytroll:main Jan 13, 2023
@pnuu pnuu deleted the feature-viirs-cspp-cloud-mask branch January 13, 2023 06:14
@djhoese djhoese mentioned this pull request Jul 18, 2023
3 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants