Add Glider DAC Format Version 2 files #66

Draft
wants to merge 1 commit into
base: main

mwengren
Member

@mwengren mwengren commented Jan 18, 2024

This PR moves the current Glider DAC File Format V 2.0 files referenced in the DAC documentation site to this repo as 'gold standard' files for IOOS glider data submission.

Future changes to this repo may include adding other classifications for existing moored/shore station example files under the /datasets directory to help users navigate.

Once this PR is merged, the Glider DAC documentation site will be updated to link to this repo to allow users to download the example files.

For reference, these files originated in the ioos/glider-dac repo at this location: https://github.com/ioos/glider-dac/tree/gh-pages/_nc/template.

cc @MathewBiddle @kbailey-noaa @ocefpaf @kerfoot

@MathewBiddle
Contributor

I like the organization here

datasets/gliders/

Looking at the Gold Standard Examples website, are there any examples there that we want to bring over here as well?

I know ATN has a trajectory template that would be good to add as well: ioos/ioos-atn-data#42. Note that they will also have a profile template, but that will be separate from this PR.

IMO this looks good!

@mwengren
Member Author

Hey @MathewBiddle the first three dataset examples on that page (Fixed Station - Morro Bay and Moored Buoy - C10 met and currents) are actually the source for the three datasets currently in this repo: https://github.com/ioos/erddap-gold-standard/tree/main/datasets.

We need to move that page over here, as previously mentioned in #63, and add additional datasets like ATN, definitely.

The current Gold Standard Datasets page was created before this repo was, and we never went back to update it or move it over to this repo. In addition to the RA portal links currently in that page, dataset links should go to the Gold Standard ERDDAP as well: https://standards.sensors.ioos.us/erddap/info/index.html. I'm going to add more notes about that to the issue above.

@mwengren
Member Author

@MathewBiddle @srstsavage I haven't updated the https://github.com/ioos/erddap-gold-standard/blob/main/erddap/content/datasets.xml file accordingly. Any help with that would be most appreciated.

I suppose it could either be committed directly to this PR, or added in a subsequent PR to make these new datasets actually deploy in ERDDAP/Docker, and on https://standards.sensors.ioos.us/erddap/. Let me know your thoughts.

Also within scope is re-categorizing the existing 3 datasets under an appropriate sub-folder similar to the /gliders one I used here, probably more easily done in a subsequent PR by one of you :).

@@ -0,0 +1,408 @@
netcdf IOOS_Glider_NetCDF_v2.0 {
dimensions:
time = UNLIMITED ; // (0 currently)
Contributor

This example doesn't have data values in it. In order to load it into ERDDAP, we need at least one value, I believe.

GenerateDatasetsXml.sh fails with:

java.lang.RuntimeException: ERROR in NDimensionalIndex constructor: shape=[0] has a value less than 1.
 at com.cohort.array.NDimensionalIndex.<init>(NDimensionalIndex.java:83)
 at gov.noaa.pfel.coastwatch.pointdata.Table.readNDNc(Table.java:6994)
 at gov.noaa.pfel.erddap.dataset.EDDTableFromNcFiles.generateDatasetsXml(EDDTableFromNcFiles.java:354)
 at gov.noaa.pfel.erddap.GenerateDatasetsXml.doIt(GenerateDatasetsXml.java:693)
 at gov.noaa.pfel.erddap.GenerateDatasetsXml.main(GenerateDatasetsXml.java:991)

Or

Table.readMultidimNc read /datasets/gliders/IOOS_Glider_NetCDF_v2.0.nc:
Returning an empty table because dim=time's length=0! time=67
java.lang.RuntimeException:
ERROR in Test.ensureTrue:
The file has no variables with dimensions:
 at com.cohort.util.Test.error(Test.java:43)
 at com.cohort.util.Test.ensureTrue(Test.java:71)
 at gov.noaa.pfel.erddap.dataset.EDDTableFromMultidimNcFiles.generateDatasetsXml(EDDTableFromMultidimNcFiles.java:243)
 at gov.noaa.pfel.erddap.GenerateDatasetsXml.doIt(GenerateDatasetsXml.java:666)
 at gov.noaa.pfel.erddap.GenerateDatasetsXml.main(GenerateDatasetsXml.java:991)

Everything with a dimension of time needs to have a value in the variable. We can make something up.
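As a rough pre-check before running GenerateDatasetsXml, the CDL text can be scanned for UNLIMITED dimensions that are still empty. This is only a sketch using the Python standard library: the helper names `empty_unlimited_dims` and `has_data_section` are hypothetical, and it assumes the dimension lines carry the ncdump-style `// (N currently)` comment seen in the template above.

```python
import re

# Matches ncdump-style dimension lines such as:
#   time = UNLIMITED ; // (0 currently)
_UNLIMITED = re.compile(
    r"^\s*(\w+)\s*=\s*UNLIMITED\s*;\s*//\s*\((\d+) currently\)",
    re.MULTILINE,
)


def empty_unlimited_dims(cdl_text):
    """Return the names of UNLIMITED dimensions whose current length is 0."""
    return [name for name, length in _UNLIMITED.findall(cdl_text) if int(length) == 0]


def has_data_section(cdl_text):
    """Return True if the CDL text contains a 'data:' section header."""
    return re.search(r"^\s*data:\s*$", cdl_text, re.MULTILINE) is not None


# Header lines copied from the template in this PR (truncated for illustration).
template = """netcdf IOOS_Glider_NetCDF_v2.0 {
dimensions:
	time = UNLIMITED ; // (0 currently)
}
"""

print(empty_unlimited_dims(template))
print(has_data_section(template))
```

A non-empty list from `empty_unlimited_dims` suggests the file will hit the same empty-dimension errors shown in the stack traces above.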

In the past, I have used ncgen to convert a CDL file to a .nc file:

#generate netCDF file from cdl text document
ncgen -o ../templates/atn_trajectory_template.nc ../templates/atn_trajectory_template.cdl
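If the same conversion needs to be repeated for several templates, the ncgen call above could be wrapped in a small script. This is a sketch, not part of the PR: `ncgen_command` and `convert` are hypothetical helper names, and it assumes the .nc file should be written next to the .cdl file with the same base name.

```python
import shutil
import subprocess
from pathlib import Path


def ncgen_command(cdl_path):
    """Build the ncgen argument list that writes <name>.nc next to <name>.cdl."""
    cdl_path = Path(cdl_path)
    return ["ncgen", "-o", str(cdl_path.with_suffix(".nc")), str(cdl_path)]


def convert(cdl_path):
    """Run ncgen on one CDL file, failing loudly if ncgen is not installed."""
    if shutil.which("ncgen") is None:
        raise RuntimeError("ncgen not found on PATH")
    subprocess.run(ncgen_command(cdl_path), check=True)


# Show the command that would be run for the glider template in this PR.
print(ncgen_command("datasets/gliders/IOOS_Glider_NetCDF_v2.0.cdl"))
```

Calling `convert(...)` on each .cdl file under datasets/ would then regenerate the matching .nc files, provided the netCDF command-line tools are installed.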

I'll work on drafting up a change.

Contributor

@MathewBiddle MathewBiddle Jan 30, 2024

Turns out it's hard to add data to this file 😢

@kerfoot do you have an example with data in the variables?

Member Author

@kerfoot Would it be possible to re-create this Glider DAC 2.0 template dataset to include some data? Per @MathewBiddle's comment above, ERDDAP GenerateDatasetsXml won't run on datasets that don't include data in certain dimensions.

Also, the preview/demo ERDDAP generated from this repo will be more useful if obs data is present in the files. See existing 'gold standard' datasets here: https://standards.sensors.ioos.us/erddap/info/index.html?page=1&itemsPerPage=1000.

It may be that we need two copies, one with dummy data and one without (so users could be pointed to the empty templates alongside the 'live' data files in ERDDAP in the Glider DAC docs: https://ioos.github.io/glider-dac/ngdac-netcdf-file-format-version-2.html).

I can merge this pull request as is, and then you could submit another one including datasets with actual data, and we can update the ERDDAP datasets.xml accordingly from there.

Perhaps these files should be renamed to include _Template at the end of the file name as well to differentiate? i.e.:

  • datasets/gliders/IOOS_Glider_NetCDF_v2.0_Template.cdl
  • datasets/gliders/IOOS_Glider_NetCDF_v2.0_Template.nc
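The proposed naming could be applied consistently with a small helper. This is only an illustrative sketch: `template_path` is a hypothetical function name, and it simply inserts the `_Template` tag before the final file extension.

```python
from pathlib import Path


def template_path(path, tag="_Template"):
    """Return the path with `tag` inserted before the final file extension."""
    path = Path(path)
    return path.with_name(path.stem + tag + path.suffix)


# The two files discussed in this PR.
for name in ("IOOS_Glider_NetCDF_v2.0.cdl", "IOOS_Glider_NetCDF_v2.0.nc"):
    print(template_path(Path("datasets/gliders") / name))
```

Because only the last extension is treated as the suffix, the `v2.0` part of the name stays intact.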

cc @kbailey-noaa

Contributor

FWIW here's how far I got with adding in data to the existing file. https://gist.github.com/MathewBiddle/44d3c41e3ee3f2325619ed5b4e0752dc

@mwengren @MathewBiddle Sure thing, but I'm finishing up a (I guess I'd call it v3.0) version that aligns more closely with the IOOS metadata profile standard. Can you hold off on merging this request and I will get you a CDL with data to review?

Member Author

Hi @kerfoot thanks for your comments. We can definitely hold off on merging any changes here for the time being.

Regarding the v3.0 NGDAC file format changes you're proposing, I feel we should have a review/comment step before any official changes are made, with either the templates in the glider-dac repo or by adding new template files here.

The easiest way for that to occur would be for you to propose changes in markdown form via PR against the ngdac-netcdf-file-format-version-2.md. This will give the DMAC team a chance to visualize the changes you're proposing and review and comment. Actual example files may be needed too if, for example, markdown can't convey changes in the data structure.

Part of the motivation for this is that I would like to do a broader assessment across all of our netCDF-based file formats to see if commonalities could be pulled out into what we might call an IOOS Metadata Profile v1.3, that downstream formats like NGDAC could either replicate, or, if possible, inherit from.

Whether this is feasible or not as an approach is TBD until we can compare across the different formats:

@mwengren mwengren mentioned this pull request May 1, 2024
3 tasks