-
Notifications
You must be signed in to change notification settings - Fork 22
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Caching requests #68
Comments
Hi @mps9506! I get you and yet I'm a bit conflicted on this. One of my other packages, getCRUCLdata, offers caching for this very reason and was recently OTOH, I completely understand what you're saying. I do the same thing as well and having a cache would make it better and having it built into nasapower reduces friction for end-users. Is using something like targets an option or is it too heavy? If targets were an option, I'm happy to create documentation on how to use it to cache data in situations like this for nasapower. What sort of mechanism were you envisioning for a PR? |
targets is definitely a valid workflow. Probably overkill for the use case I am thinking (or what I was doing which triggered this issue), small projects or scripts that might get shared with newish users.
I would have proposed using hoardr as I found it really easy to implement in one of my own API packages. hoardr primarily provides a convenient R6 class to expose and use file paths provided by rapddirs. My naive understanding is that the user cache directory is safe to use. Alternatively, I think writing to a temp file is consistent with CRAN, it just would cache a temp file with every new session (I think). I understand the hesitation to trigger rejection/removal by CRAN. Caching files would be nice, but ultimately users (me) can choose to be more diligent about doing it for themselves. |
Yeah, I'm familiar with hoardr, I've used it and went directly with rappdirs to pare down the interface as it added more than what really was necessary for such a simple data set as the CRU CL 2.0. The whole cache dir seems to be part of that moving target and I've enough issues with an API client and the whims of CRAN so I'm reluctant to add to it. However, is memoise an option for a user to employ? https://memoise.r-lib.org |
I can't believe I've never run into memoise before! Yeah, that is easy for users to implement: > mem_get_power <- memoise(get_power)
> system.time(
+ mem_get_power(community = "ag",
+ pars = c("T2M_MAX"),
+ temporal_api = "daily",
+ lonlat = c(-95.358421, 29.749907),
+ date = c("2001-01-01", "2021-12-31"))
+ )
user system elapsed
0.33 0.03 3.14
> system.time(
+ mem_get_power(community = "ag",
+ pars = c("T2M_MAX"),
+ temporal_api = "daily",
+ lonlat = c(-95.358421, 29.749907),
+ date = c("2001-01-01", "2021-12-31"))
+ )
user system elapsed
0.01 0.00 0.02
> df <- mem_get_power(community = "ag",
+ pars = c("T2M_MAX"),
+ temporal_api = "daily",
+ lonlat = c(-95.358421, 29.749907),
+ date = c("2001-01-01", "2021-12-31"))
> df
NASA/POWER CERES/MERRA2 Native Resolution Daily Data
Dates (month/day/year): 01/01/2001 through 12/31/2021
Location: Latitude 29.7499 Longitude -95.3584
Elevation from MERRA-2: Average for 0.5 x 0.625 degree lat/lon region = 24.97 meters
Value for missing model data cannot be computed or out of model availability range: NA
Parameter(s):
Parameters:
T2M_MAX MERRA-2 Temperature at 2 Meters Maximum (C)
# A tibble: 7,670 x 8
LON LAT YEAR MM DD DOY YYYYMMDD T2M_MAX
<dbl> <dbl> <dbl> <int> <int> <int> <date> <dbl>
1 -95.4 29.7 2001 1 1 1 2001-01-01 7.31
2 -95.4 29.7 2001 1 2 2 2001-01-02 4.65
3 -95.4 29.7 2001 1 3 3 2001-01-03 4.95
4 -95.4 29.7 2001 1 4 4 2001-01-04 15.3
5 -95.4 29.7 2001 1 5 5 2001-01-05 18.5
6 -95.4 29.7 2001 1 6 6 2001-01-06 20.3
7 -95.4 29.7 2001 1 7 7 2001-01-07 20.2
8 -95.4 29.7 2001 1 8 8 2001-01-08 13.3
9 -95.4 29.7 2001 1 9 9 2001-01-09 15.1
10 -95.4 29.7 2001 1 10 10 2001-01-10 14.6
# ... with 7,660 more rows I don't see any reason to add my feature request in that case! I'll go ahead and close. Thanks for talking that through. |
Hi,
Thanks for this great package! I have a brief feature request.
Occasionally I get lazy and leave an API request in my script or rmarkdown document instead of running a data download script/function separately and storing the data locally. This can result in the same request being made to the server every time the function is re-run. If I just need one data snapshot, that request is totally redundant.
Instead of repeatedly making the same exact API request,
get_power()
could have an argument to locally cache the downloaded data and access it whenever the same exact request is made. I believe rnoaa and a few other packages do something similar.If this seems reasonable and something you are open to incorporating I can contribute a pull request at some point.
Thanks!
The text was updated successfully, but these errors were encountered: