Paola issues #93
, #90, progress on publish-procedure
Um, I think this is the right button to press but apologies if not - I approve this PR pending a few minor corrections.
@@ -3,8 +3,3 @@
Data, software and links to external data holdings (such as NCI) can be added to the [CSIRO Data Access Portal](https://data.csiro.au/) (DAP) by **CSIRO staff** only. More information on creating metadata records and uploading files to the CSIRO DAP can be found on the [DAP Help Guide](https://confluence.csiro.au/display/dap/Deposit+and+Manage+Data) (staff only access).

CSIRO-affiliated data can be published in the DAP and the lead creator does not have to be a CSIRO staff member.
I won't do any changes until the current PR is merged but happy to take this action (or Katie can) after merging.
I temporarily updated the links as for the other part just to be consistent
there's a newly created issue #88
Co-authored-by: Claire Trenham <[email protected]>
2. If the output is big, publish only a subset. If methods are well described and the software used is easily available, then publishing only the subset of data that underlies a publication is sufficient. For example, the post-processed output is sufficient for a model simulation. However, the model version and configuration used, the input data and the model source code should be documented.

3. It's essential to consider the end user's point of view. What would a user look for when considering using a dataset? What kind of information is essential for the data to be usable? Which additional information would make its use easier?
- Consider adding an example of what's essential to consider from an end-user point of view. For example, ACS aims to publish bias-corrected data on a 5km grid; this decision is mainly driven by users, who want ever higher-resolution data. But I am not sure whether that's an example of data creation rather than of the publish procedure.
The way files are organised in folders, their names and their sizes should take into account both how the files will be distributed and how they will be used. For example, the protocol used to download the files might have a maximum allowed size; likewise, having to list a lot of small files is inefficient when loading an HTML page. Names should also be descriptive enough that a file can easily be recognised as part of a specific dataset after it has been downloaded. We covered [files organisation and naming](../tech/drs-names) in the technical pages of this book; however, it is important to check the publisher's instructions in this regard or, if none are available online, to contact them about it as early as possible in the publishing process.

**Conventions**
It is important to use [conventions](../concepts/conventions) and [controlled vocabularies](../concepts/controlled-vocab) whenever possible, both official ones, like the CF conventions for file attributes, and others which are not a requirement but have become common practice in the climate community (e.g. CMIP variable names). As some of these conventions also apply to folder and file names, it is important to be consistent and use the same terms in the files, names and descriptions.
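
As a rough illustration of the file size and naming advice in the text above, a pre-publication check could look something like the sketch below. The size limit, the small sample of variable names and the `<variable>_<rest>.nc` layout are assumptions made for illustration only; the real values should come from the publisher's instructions and from the conventions the dataset follows.

```python
from pathlib import Path

# Hypothetical values: take the real limit and vocabulary from the
# publisher and from the conventions you follow.
MAX_BYTES = 2 * 1024**3                 # assumed per-file download limit (2 GiB)
KNOWN_VARIABLES = {"tas", "pr", "psl"}  # small sample of CMIP-style variable names


def check_file(name: str, size: int) -> list[str]:
    """Return the problems found for one file; an empty list means none."""
    problems = []
    if size > MAX_BYTES:
        problems.append("larger than the assumed download limit")
    # Assumed naming layout: <variable>_<rest>.nc, e.g. tas_day_2000.nc
    variable = name.split("_", 1)[0]
    if not name.endswith(".nc") or variable not in KNOWN_VARIABLES:
        problems.append("name does not start with a known variable or end in .nc")
    return problems


def check_tree(root: str) -> dict[str, list[str]]:
    """Check every file under root and keep only those with problems."""
    report = {}
    for path in Path(root).rglob("*"):
        if path.is_file():
            problems = check_file(path.name, path.stat().st_size)
            if problems:
                report[str(path)] = problems
    return report
```

Calling `check_tree("path/to/dataset")` (a placeholder path) returns a dictionary mapping each problematic file to the list of issues found, which can be reviewed before contacting the publisher.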
You could mention the CORDEX convention as an example, too. One thing I learned recently is that the CORDEX convention keeps changing (aka updating), and if one is not aware of the changing convention happening every couple of weeks, one might miss updating the data file and not be CORDEX aligned anymore. Currently, there are data description and meta-data differences between ACS-CCAM and QLD-CCAM published CMIP6 CORDEX data because of that.
I was trying to give some generic advice, hence I mentioned only CF. We should have a "publishing with ESGF" page where all these quirks can be added: CMOR, CORDEX etc. I will copy this comment to the relevant issue. Could it also be added to the other-conventions file? We should be covering CORDEX there.
see issue #28
Uhh, this is confusing. I think I answered some of Alicia's comments in the files list but can't see my answers here; hopefully they will appear. I think I covered all the reviews, but if I missed something let me know. I added the "abstract" page, now renamed "dataset web page", which was missing entirely, and also re-formatted the readme example in tech. There might be more pages like that which need reviewing. I will check tomorrow while we see if Chloe also wants to review or is happy with the changes.
Thanks, good to go from me :)
This is addressing a few issues:
#71 making sure we have correspondence between indexes of concepts and tech pages
#92 there was an empty dropdown in this page, I fixed that by adding the tar cheat sheet info
#90 removed a pin to an older version of Jupyter-book
#83 I covered the example mentioned
#73 added an introduction to this new section in the introduction page (please review)
And #53 , this has become a bit bigger, I reviewed the entire publishing section (opened new issues) and:
It also might help to review/merge to at least fix the other issues.
I'm trying to encourage our users to rely on it now that they need to manage their data on their own.