Submission: ramlegacy #264
Comments
thanks for your submission @kshtzgupta1 We are discussing now and will get back to you soon |
sorry for delay on this @kshtzgupta1 - we have some conflicts of interest we're thinking through ... |
Editor checks:
Editor comments@kshtzgupta1: Thanks for submitting a very interesting package. This is the first time I'm serving as an editor for ROpenSci, so I will be learning some parts of the editorial process as I edit your submission. Please let me know if you have any questions throughout. I have a bit of feedback based on an initial look at the submission and some initial editorial checks. Submission summary:In your submission summary, could you please:
Checks from
|
@geanders Thanks for your response. I have updated the submission as per your suggestions. |
Thank you, @kshtzgupta1! Could I also ask you to add an rOpenSci review badge to the README file for your package? The full link for this should be:
|
@geanders I have added the badge. |
Package ReviewPlease check off boxes as applicable, and elaborate in comments below. Your review is not limited to these topics, as described in the reviewer guide
DocumentationThe package includes all the following forms of documentation:
Functionality
Final approval (post-review)
Estimated hours spent reviewing: 5
Review Comments
Summary
I think this is a great package that provides a clear set of instructions on how to download the ramlegacy database. The biggest issue in the package I see is the reliance of the
library(ramlegacy)
#> * Multiple versions found including the latest one: 4.3 . Loading the latest version.
#> * Loading version 4.3 ...
#> v Version 4.3 has been successfully loaded.
TC_with_names <- merge(
timeseries_values_views_v4.3[!is.na(timeseries_values_views_v4.3$TC), c("stockid", "TC")],
stock_v4.3[stock_v4.3$region == "West Africa",],
by = "stockid"
)
hist(TC_with_names$TC)
Created on 2019-01-08 by the reprex package (v0.2.1)
library(ramlegacy)
#> * Multiple versions found including the latest one: 4.3 . Loading the latest version.
#> * Loading version 4.3 ...
#> v Version 4.3 has been successfully loaded.
class(stock_v4.3)
#> [1] "tbl_df" "tbl" "data.frame"
head(stock_v4.3, 10)
#> stockid tsn scientificname commonname
#> 1 ACADRED2J3K 166774 Sebastes fasciatus Acadian redfish
#> 2 ACADRED3LNO-UT12 166774 Sebastes fasciatus Acadian redfish
#> 3 ACADREDGOMGB 166774 Sebastes fasciatus Acadian redfish
#> 4 ACADREDUT3 166774 Sebastes fasciatus Acadian redfish
#> 5 ACMACKSARG 172413 Scomber colias Argentine chub mackerel
#> 6 AFLONCH 166156 Beryx splendens Alfonsino
#> 7 ALBAIO 172419 Thunnus alalunga albacore tuna
#> 8 ALBAMED 172419 Thunnus alalunga albacore tuna
#> 9 ALBANATL 172419 Thunnus alalunga Albacore tuna
#> 10 ALBANPAC 172419 Thunnus alalunga Albacore tuna
#> areaid stocklong
#> 1 Canada-DFO-2J3K Acadian redfish NAFO-2J3K
#> 2 Canada-DFO-3LNO-UT12 Acadian redfish Units 1-2 and NAFO-3LNO
#> 3 USA-NMFS-5YZ Acadian redfish Gulf of Maine / Georges Bank
#> 4 Canada-DFO-UT3 Acadian redfish Unit 3
#> 5 Argentina-CFP-ARG-S Argentine chub mackerel Southern Argentina
#> 6 multinational-SPRFMO-CH Alfonsino Chile
#> 7 multinational-IOTC-IO Albacore tuna Indian Ocean
#> 8 multinational-ICCAT-MED Albacore tuna Mediterranean
#> 9 multinational-ICCAT-NATL Albacore tuna North Atlantic
#> 10 Multinational-ISC-NPAC Albacore tuna North Pacific
#> region inmyersdb myersstockid
#> 1 Canada East Coast 0 <NA>
#> 2 Canada East Coast 0 <NA>
#> 3 US East Coast 0 <NA>
#> 4 Canada East Coast 0 <NA>
#> 5 South America 0 <NA>
#> 6 South America 0 <NA>
#> 7 Indian Ocean 0 <NA>
#> 8 Mediterranean-Black Sea 0 <NA>
#> 9 Atlantic Ocean 0 <NA>
#> 10 US West Coast 0 <NA>
library(tibble)
stock_v4.3
#> # A tibble: 1,294 x 9
#> stockid tsn scientificname commonname areaid stocklong region
#> <chr> <dbl> <chr> <chr> <chr> <chr> <chr>
#> 1 ACADRE~ 166774 Sebastes fasc~ Acadian r~ Canad~ Acadian ~ Canad~
#> 2 ACADRE~ 166774 Sebastes fasc~ Acadian r~ Canad~ Acadian ~ Canad~
#> 3 ACADRE~ 166774 Sebastes fasc~ Acadian r~ USA-N~ Acadian ~ US Ea~
#> 4 ACADRE~ 166774 Sebastes fasc~ Acadian r~ Canad~ Acadian ~ Canad~
#> 5 ACMACK~ 172413 Scomber colias Argentine~ Argen~ Argentin~ South~
#> 6 AFLONCH 166156 Beryx splende~ Alfonsino multi~ Alfonsin~ South~
#> 7 ALBAIO 172419 Thunnus alalu~ albacore ~ multi~ Albacore~ India~
#> 8 ALBAMED 172419 Thunnus alalu~ albacore ~ multi~ Albacore~ Medit~
#> 9 ALBANA~ 172419 Thunnus alalu~ Albacore ~ multi~ Albacore~ Atlan~
#> 10 ALBANP~ 172419 Thunnus alalu~ Albacore ~ Multi~ Albacore~ US We~
#> # ... with 1,284 more rows, and 2 more variables: inmyersdb <dbl>,
#> # myersstockid <chr>
Created on 2019-01-07 by the reprex package (v0.2.1)
Possible future work
I wonder if it would be useful to engage Sean Anderson to see if he would be willing to submit a pull request with his code that converts the Access database into an sqlite database. I think there would be an appetite for the sqlite option, which could exist alongside the Excel version. His process requires an additional utility outside of R, but maybe an R-only solution is available. |
Thank you for the very thorough review, @boshek! @kshtzgupta1, there will be a second review coming, as well, so you may want to wait for that before you begin addressing reviewer comments. |
Package ReviewPlease check off boxes as applicable, and elaborate in comments below. Your review is not limited to these topics, as described in the reviewer guide
DocumentationThe package includes all the following forms of documentation:
Functionality
Final approval (post-review)
Estimated hours spent reviewing: 4
General Comments
There is only one version of the RAM database now available on ramlegacy.org, version 4.4. I believe the timing of this package going out for review and the recent website update were around the same time (December 20). That being said, there must be a reason the maintainers of the website no longer host the older versions. I encourage you to reach out to them to understand why. I know them and can make the connection if needed. As it is now, everyone using this package to download older versions will be retrieving the backups on GitHub. I believe this package should only allow retrieval of data that is also available on the site.
Sean Anderson has a package of the same name and same basic functionality at https://github.com/seananderson/ramlegacy. I have used this package for my own work in the past although in recent years I’ve been receiving updated RAM data directly via a DropBox link from the maintainers. I wholeheartedly support development of an R package to access the data such as this one and the one from Sean because the current data distribution methods are unreliable and opaque. That being said, I strongly encourage the developers of this package to work with Sean and the maintainers of the RAM legacy database to get this package functional. Again, I can help facilitate the connection as I’ve worked with Sean closely over the past few years.
As a frequent user of the RAM data, I know where to go to find information about what each of the loaded tables are. I understand the comment from the other reviewer about providing more information, but if this package is purely intended to just retrieve the RAM data from the website, I think these functions are good as is.
I can come up with a myriad of other functions I would like to see from this R package based on the common wrangling tasks I do with this database, but I believe those could be added through feature requests and aren’t a reason to hold this package up.
More specific comments
I first tried
> download_ramlegacy(version = "4.4")
Error: Invalid version number. Available versions are 1.0, 2.0, 2.5, 3.0, 4.3,
The maintainers of this package will have to be monitoring ramlegacy.org carefully to know when a new version is released to then update this package. I can foresee some user issues with the
download_ramlegacy(version = NULL, ram_path = NULL,
ram_url = "https://depts.washington.edu/ramlegac/wordpress/databaseVersions")
First, it looks like there is a typo ( Also the
> load_ramlegacy(version = 4.3, path = "~/github")
★ Loading version 4.3 ...
Error in readRDS(path) : error reading from connection
In addition: Warning message:
In readRDS(path) : error reading the file
I suggest changing the function so that if something is entered in as a path argument, it returns a message reminding the user to not change the path argument. Or, remove this as an argument entirely. |
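One way to picture the suggestion above is a small guard that checks the path before it ever reaches readRDS(), so the user sees a readable message rather than a low-level connection error. This is only a sketch: `read_cached_rds` is a hypothetical helper name, not a function in ramlegacy, and the messages are illustrative.

```r
# Hypothetical helper (not part of ramlegacy): fail early with a readable
# message instead of letting readRDS() raise a connection error.
read_cached_rds <- function(path) {
  if (!is.character(path) || length(path) != 1L || is.na(path)) {
    stop("`path` must be a single file path.", call. = FALSE)
  }
  # file.exists() is also TRUE for directories, so check for that case first.
  if (dir.exists(path)) {
    stop("`path` points to a directory, not an .rds file: ", path, call. = FALSE)
  }
  if (!file.exists(path)) {
    stop("No cached database found at: ", path, call. = FALSE)
  }
  readRDS(path)
}
```

Passing a directory such as "~/github" would then stop at the directory check, which seems to be the situation behind the readRDS() error reported above.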
I just wanted to follow up and it seems within the past week, the maintainers of ramlegacy.org have changed where the database is kept. It is now hosted on Zenodo, which will require some changes to this package for accessing the data. |
Thank you to @boshek and @jafflerbach for your very thoughtful reviews! @kshtzgupta1 : We now have both reviews in, so if you haven't already, you can begin responding to the reviewers' comments and suggestions. In particular, it sounds like some changes to ramlegacy.org, brought up by @jafflerbach, will require some package changes. Please let me know if you have any questions as you prepare your response. |
Also worth noting that the authors are now including (or maybe always had) an .RData file and an R script to load the database into a user's session, much like what this package does, albeit in a less refined way. |
Thanks everyone for great reviews and discussion, this is very helpful. @kshtzgupta1 and I will need a bit more time to really work through and address all of these properly, but meanwhile I will just make a few quick observations relevant to the immediate discussion. @jafflerbach Thanks! Yes, we've been in discussion with Mike Melnychuk since September about moving the data to Zenodo to avoid the problems created by down-time and re-organization of the ramlegacy.org website, so we were very happy to hear from them about this development and look forward to using those endpoints for the package. While they have plans to upload older versions to Zenodo as well, these aren't currently linked up as related versions (a la https://blog.zenodo.org/2017/05/30/doi-versioning-launched/); hopefully they will be able to do that to provide a source for the original versions. So hopefully soon the Zenodo hosting will provide a much more stable source for any versions of the data, but it may be a bit of a moving target for a little longer. (This switch to the versioning approach would also allow automated resolution of the most recent version). |
Nice work @kshtzgupta1. @geanders I won't be able to look at this until late next week but can prioritize it then. |
@kshtzgupta1 : Thanks so much for these revisions! Could you insert a link in your comment to the Issues where you give point-by-point reviews? That would help @boshek and @jafflerbach find them more easily. @boshek : Thanks for letting me know your timeline. That works fine. |
Actually, @kshtzgupta1, could I ask you to copy your point-by-point responses over to this thread? It would be great to keep the full conversation on a single thread. |
@boshek Thank you for a detailed review! We have made many changes following your suggestions. I will update the README and vignette to reflect those changes after they are approved.
Fixed that line!
Fixed!
Revised!
Added!
Agreed! We have removed the loading behavior of
Excellent suggestion! We have added a
Having
Added!
We would definitely be open to including more educational vignettes if the maintainers of the database wanted that, and we would certainly be willing to link more informative documentation in the vignette, but at the same time we want to avoid writing things that are out-of-date or not coming from the maintainers. Also, in our opinion, educating a new user about the database might be a bit outside the scope of the package, which was primarily to improve access to the data.
We believe providing access to older versions can be really useful for users trying to reproduce older research papers and studies.
Prof. Boettiger was in touch with the RAM maintainers regarding that. While the maintainers have moved the latest versions (4.40, 4.41, 4.44) to Zenodo, they still have to do the same for the older versions. Until that happens, the package will have to use the GitHub repo to make the older versions available to users.
Added to
It was intentional. We wanted to give the user the option to choose whether they wanted the tables as tibbles or dataframe. If for some reason a user prefers
We couldn't reproduce these on our machines. I think they might be specific to your machine and may be occurring because you don't have TeX installed.
Agreed! I will run it after all the changes have been approved.
We are definitely open to working with Sean Anderson. I think Prof. Boettiger has pinged him. |
@jafflerbach Thank you for a detailed review! We have made many changes following your suggestions. I will update the README and vignette to reflect those changes after they are approved.
Fixed!
We couldn't reproduce that warning on our machines. I think it might be specific to your machine and may be occurring because you don't have TeX installed.
I believe Prof. Boettiger has reached out to Sean regarding that.
> download_ramlegacy(version = "4.4")
Error: Invalid version number. Available versions are 1.0, 2.0, 2.5, 3.0, 4.3,
This should be resolved now!
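For readers following the version error quoted above, the check can be pictured roughly as below. This is a minimal sketch under stated assumptions: `check_version` and `known` are hypothetical names, the version list is copied from the quoted error message, and the package's actual validation may look quite different.

```r
# Hypothetical helper: `available` would come from wherever the package
# keeps its list of known versions.
check_version <- function(version, available) {
  # Note: a numeric 1.0 becomes "1" here, so passing versions as strings
  # avoids a spurious mismatch.
  version <- as.character(version)
  if (length(version) != 1L || !version %in% available) {
    stop(
      "Invalid version number. Available versions are ",
      paste(available, collapse = ", "), ".",
      call. = FALSE
    )
  }
  version
}

# Illustrative list only, taken from the error message quoted in the review.
known <- c("1.0", "2.0", "2.5", "3.0", "4.3")
check_version("4.3", known)
```

With a hard-coded list like this, a newly released version such as "4.4" is rejected until the list is updated, which is the maintenance burden the reviewer pointed out.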
download_ramlegacy(version = NULL, ram_path = NULL,
ram_url = "https://depts.washington.edu/ramlegac/wordpress/databaseVersions")
It was actually
> load_ramlegacy(version = 4.3, path = "~/github")
★ Loading version 4.3 ...
Error in readRDS(path) : error reading from connection
In addition: Warning message:
In readRDS(path) : error reading the file
Although the package downloads and caches the database in the user's rappdirs directory by default we have now decided that having |
Great job @kshtzgupta1! Thanks for such a great package and great job addressing issues raised in this review. You have sufficiently addressed any issues I raised. @geanders I recommend this package for acceptance. I do want to raise some final considerations:
I am really of two minds on this. I employ this same trick on a couple packages of mine but I also wonder if this introduces some weird ambiguity that relies on the side effect of having a package loaded. Though I don't think the package acceptance is contingent on this, I would in this case recommend imposing a tibble on the user.
I still don't love this solution. To me, this places too much of the package functionality at the mercy of that repo being available rather having a direct line to the data. I think that it is worth the sacrifice of having fewer versions available to drop this solution. That said, I similarly don't think this is sufficient to hold up acceptance. |
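On the tibble point raised in this comment, the ambiguity can be illustrated with base R alone. The sketch below assumes nothing about ramlegacy internals: `stock` is a made-up two-row object that merely carries the same class vector as the `stock_v4.3` table shown in the earlier reprex.

```r
# Illustrative object only: a plain data frame given the tibble class vector.
stock <- data.frame(stockid = c("ALBAIO", "ALBAMED"), tsn = c(172419, 172419))
class(stock) <- c("tbl_df", "tbl", "data.frame")

# Whether tibble's S3 methods (printing, subsetting) apply to `stock`
# depends on whether the tibble namespace happens to be loaded in the session.
tibble_loaded <- isNamespaceLoaded("tibble")

# Making the class explicit removes that dependence on session state.
# Here the extra classes are dropped to get an ordinary data frame.
plain <- as.data.frame(stock)
class(plain)
```

Going the other way, as the reviewer recommends, would mean converting explicitly with tibble::as_tibble(), which requires tibble to be installed.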
Hi Sam @boshek , thanks for the excellent review and the kind words! I don't mean to jump in here but just wanted to add a little context to the decision about accessing older versions. I agree that it would be much more desirable to have all versions on Zenodo, and based on our discussions with them, I think the RAM Legacy team will eventually post those, but it is hard to know exactly when. Meanwhile, I do believe access to the old versions is critical, all the more so for them not being available anywhere else now. Among other reasons, there are dozens of high profile papers based on these older versions; just today there is yet another appearing in Science that uses the version 3 data. |
Great context @cboettig . This rationale makes perfect sense. |
@jafflerbach : I wanted to check in with you to see if you had any thoughts on the response to your initial review for this package? |
Also just a note that we did just hear back from Sean Anderson who was very gracious and positive about the package. He gave us a few suggestions (data tables no longer append the version name on the table, which was no longer necessary anyway with new approach of explicitly calling Thanks Sam, Jamie & Brooke for your feedback so far! @kshtzgupta1 summarizes the edits in reply to Jamie above, these are waiting on a PR from an ropensci-review branch in the package meanwhile. |
Thanks @geanders, @kshtzgupta1 and @cboettig. Good to hear you've been in touch with Sean Anderson as well as the RAM maintainers. No further requests from me on this package now. |
Approved! @kshtzgupta1 : Thanks so much for these thoughtful revisions. Both reviewers agree that the package should be accepted. Further, based on their reviews, there are a few potential changes you might want to consider in future revisions (e.g., removing the reliance on the GitHub repos of older database versions once this is possible, thinking some more about whether you want different behavior based on whether the user has There are some things we'll need to do for the final processing of this package. I'm including the standard to-dos as a check list here. I'd like to add the caveat that this will be my first time doing the editor-side part of this process, so please bear with me as I figure out my end of things for the process! To-dos:
Should you want to acknowledge your reviewers in your package DESCRIPTION, you can do so by making them Welcome aboard! We'd also love a blog post about your package, either a short-form intro to it (https://ropensci.org/tech-notes/) or long-form post with more narrative about its development (https://ropensci.org/blog/). If you are interested, @stefaniebutland will be in touch about content and timing. We've started putting together a gitbook with our best practice and tips, this chapter starts the third section that's about guidance for after onboarding. Please tell us what could be improved, the corresponding repo is here. |
Also, since you are planning to submit to CRAN, you may find this list of CRAN gotchas helpful, and I'd be happy to provide support through that process. |
Hello @kshtzgupta1 I'm rOpenSci's Community Manager, here to say we would love to feature a post about This link will give you many examples of blog posts by authors of peer-reviewed packages so you can get an idea of the style and length you prefer: https://ropensci.org/tags/software-peer-review/. Shorter tech notes are here: https://ropensci.org/technotes/ Here are some technical and editorial guidelines for contributing a blog post: https://github.com/ropensci/roweb2#contributing-a-blog-post. Please let me know if you're interested and we can discuss a deadline. Happy to answer any questions. |
My apologies for such a late reply @geanders. I was ill and needed to have a surgery. I have finished all the to-dos. Thank you for being such a helpful editor! |
@boshek Thank you for being such a great reviewer! Is it fine if I add you as a "rev"-type contributor in the Authors@R field? If so, can I put in your email as [email protected] and your orcid id as https://orcid.org/0000-0002-9270-7884 ? |
@jafflerbach Thank you very much for reviewing the package! I would like to acknowledge you as a "rev"-type contributor in the Authors@R field if that's okay with you. Can I put in your email as [email protected] and orcid id as https://orcid.org/0000-0002-5215-9342 ? |
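For context, a "rev" entry in Authors@R is an ordinary person object, which can be built with base R's utils::person(). The sketch below uses the reviewer name and ORCID iD that appear in this thread; the email is left out because it is masked here, and the variable name `reviewer` is just for illustration.

```r
# Sketch of a reviewer entry as it might appear inside Authors@R.
# The email argument is omitted because the address is masked in this thread.
reviewer <- utils::person(
  given = "Jamie",
  family = "Afflerbach",
  role = "rev",
  comment = c(ORCID = "0000-0002-5215-9342")
)

# Show the entry with its role.
format(reviewer, include = c("given", "family", "role"))
```

In a DESCRIPTION file this entry would sit alongside the author and maintainer entries inside a single c(...) call in the Authors@R field.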
@stefaniebutland Thank you so much for reaching out. My sincere apologies for the late reply. I think Prof. Boettiger and I would definitely be interested in authoring a blog post about ramlegacy. I'll discuss the content of the post with Prof. Boettiger and let you know about a timeline very soon. |
Yep both of those are correct.
Jamie Afflerbach
Marine Data Scientist
National Center for Ecological Analysis and Synthesis (NCEAS
<https://www.nceas.ucsb.edu/>)
University of California, Santa Barbara
website <http://jamieafflerbach.com> - github
<https://github.com/jafflerbach>- twitter <https://twitter.com/jafflerbach>
|
@kshtzgupta1 : No worries at all! Alright, I will work through the next steps on my end. Again, since this is the first package I've edited to reach this point, it might take me a bit longer than usual to make sure I've gotten through any required steps. Congrats on a very nice package! |
@kshtzgupta1 Please look for an email from me to follow up about a blog post |
@kshtzgupta1 : I think we are all set with our post-approval steps, so I'm going to close this issue. Congrats on a really nice package! |
Summary
ramlegacy helps users access the Excel version of the RAM Legacy Stock Assessment Database from www.ramlegacy.org. Data is freely available from the database website, but downloading and reading in the data by hand can be time-consuming. ramlegacy is capable of downloading, reading and caching all the available versions of the database.
www.github.com/kshtzgupta1/ramlegacy
data retrieval: ramlegacy downloads and reads in multiple versions of the Excel version of the RAM Legacy Stock Assessment Database from the database's website.
Anyone who needs access to RAM Legacy Stock Assessment Database: Fisheries Biologists, Conservationists, Students, Teachers, etc.
yours differ or meet our criteria for best-in-category?
Sean Anderson has a namesake package not published to CRAN and it appears to be a stalled project on GitHub (last updated 9 months ago). However, unlike this package which supports downloading and reading in the Excel version of the database, Sean Anderson's project downloads the Microsoft Access copy of the database and converts it to a local sqlite3 database.
There is also RAMlegacyr, an older package last updated in 2015. Similar to Sean Anderson's project, the package seems to be an R interface only for the Microsoft Access version of the RAM Legacy Stock Assessment Database and provides a set of functions using RPostgreSQL to connect to the database.
Requirements
Confirm each of the following by checking the box. This package:
Publication options
paper.md matching JOSS's requirements with a high-level description in the package root or in inst/.
Detail
R CMD check (or devtools::check()) succeed? Paste and describe any errors or warnings:
Please note that running R CMD check on ramlegacy may result in the following warning if the user has not yet downloaded a version of the RAM Legacy Stock Assessment Database.
No version of the database has yet been downloaded. Use function download_ramlegacy() to download a version now.
Also note that R CMD check on this package will result in the following error in the testing suite if the user is not online when checking the package.
Error: Could not connect to the internet. Please check your connection settings and try again.
Does the package conform to rOpenSci packaging guidelines? Please describe any exceptions:
If this is a resubmission following rejection, please explain the change in circumstances:
If possible, please provide recommendations of reviewers - those with experience with similar packages and/or likely users of your package - and their GitHub user names: