Add auxiliary data download API #1513
This seems like a really good idea, the |
Since our operational environment doesn't have access to the internet by default, I was wondering whether there is a location outside of the home directory where I can put the data and that Satpy will search automatically? That way I could install the static files there from e.g. an rpm package |
@mraspaud Yes, that's the whole point. Just combine example 2 and 3 from above. Or set the environment variable |
@mraspaud To clarify, there is no location outside of the home directory that is searched by default for data. The data directory needs to be a single directory (not a series of directories), and by default we assume the user doesn't have permission to write to a system directory (since Satpy will assume it has to write at some point). Depending on how your operational processing works (docker containers?), there may be a way to soft link or mount a directory to the default location where Satpy will look. |
Thanks for clarifying. I guess I'll use the SATPY_DATA_DIR then |
Codecov Report
@@ Coverage Diff @@
## master #1513 +/- ##
==========================================
+ Coverage 91.65% 92.08% +0.42%
==========================================
Files 248 251 +3
Lines 36167 36660 +493
==========================================
+ Hits 33150 33759 +609
+ Misses 3017 2901 -116
|
@mraspaud and others, I'm seeking re-review. I have solutions for everything I wanted to do, but I'm not overjoyed with how they all have to work. The main one is data downloads in a reader. To do this you have to:
In summary, the file handler ends up having to know the class name of the Reader to be able to get at the (downloaded or cached) file. Edit: The only remaining thing is a dedicated documentation page on this in the dev guide and a section in the custom reader page. |
Without looking at the code (late night, ...):
|
@pnuu here are a few points to clarify some of the design here and how things work in general and to address your points here and some you mentioned on slack.
|
This has now been updated based on the comments by @mraspaud on slack. It is now ready for re-review. |
@mraspaud after our conversation at the status meeting, I think I've implemented the big important things:
If you're OK with it I'd like to avoid the complications of the safe YAML loading and downloading files. This is a bit more complex and would probably require changes to a lot of lower-level functions (the ones that get the reader configs and writer configs and stuff). |
Looks good, I just have some minor review items inline.
Ok I think I've addressed all of your suggestions/concerns. |
Actually, I didn't rename the module. Maybe I will. One sec... |
LGTM
The Problem
There are a couple of cases in Satpy where additional files are needed for a component to function completely. Examples include the StaticImageCompositor, the "crefl" Rayleigh correction, which uses data files for elevation values, and the MiRS reader in #1511, which needs LUTs to properly perform some limb correction. It is a pain for users to have to download these files and set an environment variable (previously SATPY_ANCPATH) just to use something that is built in to Satpy. This should be automatic.
The Solution
This pull request utilizes the pooch library to allow Satpy components to download the files that they need to operate. Pooch allows us to cache the files and check hashes. It has some versioning support but we aren't using it. It also has some typical use cases that we aren't using because of the dynamic nature of Satpy's components (ex. a user customizing things in YAML).
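To make "cache the files and check hashes" concrete, here is a minimal standard-library sketch of that behaviour. It is not pooch's API and not the code in this PR; `fetch_cached`, `sha256_of`, and the `.part` temporary suffix are hypothetical names chosen only for illustration.

```python
import hashlib
import urllib.request
from pathlib import Path


def sha256_of(path):
    """Return the SHA256 hex digest of the file at ``path``."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def fetch_cached(url, known_hash, cache_dir, fname=None):
    """Return a local path for ``url``, downloading only when needed.

    Hypothetical helper: it illustrates the idea of a hash-checked cache,
    not how pooch or Satpy implement it.
    """
    cache_dir = Path(cache_dir)
    cache_dir.mkdir(parents=True, exist_ok=True)
    target = cache_dir / (fname or url.rsplit("/", 1)[-1])

    # Reuse the cached copy only if its hash still matches.
    if target.is_file() and sha256_of(target) == known_hash:
        return target

    # Download to a temporary name so a bad file never replaces a good one.
    tmp = target.with_name(target.name + ".part")
    urllib.request.urlretrieve(url, tmp)
    if sha256_of(tmp) != known_hash:
        tmp.unlink()
        raise ValueError(f"Hash mismatch for {url}")
    tmp.replace(target)
    return target
```

A component would call `fetch_cached(url, known_hash, data_dir)` once and then open the returned path; later calls return the cached file without touching the network as long as the hash matches.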
There are a lot of TODOs left for this PR, but the general idea is sketched out here and I'm looking for feedback.
Example 1 - Satpy Component retrieves data file
Example 2 - User wants to download all files for offline use
Example 3 - User wants to customize data location
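For the customization case, the thread above mentions the SATPY_DATA_DIR environment variable. The sketch below shows one way a single data directory could be resolved from it. `resolve_data_dir` and the fallback directory name are hypothetical and only illustrate the lookup order; Satpy's real default location is not spelled out here.

```python
import os
from pathlib import Path


def resolve_data_dir(default=None, environ=None):
    """Pick one data directory, preferring the SATPY_DATA_DIR variable.

    Hypothetical helper for illustration; not Satpy's implementation.
    """
    environ = os.environ if environ is None else environ
    configured = environ.get("SATPY_DATA_DIR")
    if configured:
        # An explicit setting wins, e.g. a directory populated by a package.
        return Path(configured).expanduser()
    if default:
        return Path(default)
    # Assumed fallback under the home directory (name is made up).
    return Path.home() / ".satpy_data_example"
```

In an offline environment, the files could be installed into a system directory ahead of time and SATPY_DATA_DIR pointed at it, so nothing needs to be fetched at run time.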