Replies: 16 comments
-
Thank you for the input. Registration is not yet open, though: the repository is still under test and the CV needs to be polished. The opening of registration will be announced soon. In any case, this is a nice example that raises a problem for parsing the address automatically. We expected the last two comma-separated fields to be the city and the country. The optional but quite common specification of the region makes it difficult to extract these fields. Would it be very non-standard to specify the address with the region in parentheses, if present? For example:
Ouranos Consortium, Montreal (Quebec), Canada
National Center for Atmospheric Research, Boulder (CO), USA
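To make the proposal concrete, here is a minimal sketch of how such an address could be parsed. It is not the repository's actual parser; the pattern `ADDRESS_RE` and the function `parse_institution` are hypothetical names, and the sketch assumes the form "name, city (region), country" with the region optional.

```python
import re

# Hypothetical pattern for "<name>, <city> (<region>), <country>".
# The region in parentheses is optional; the name may itself contain commas.
ADDRESS_RE = re.compile(
    r"^(?P<name>.+),\s*"
    r"(?P<city>[^,()]+?)\s*"
    r"(?:\((?P<region>[^()]+)\))?\s*,\s*"
    r"(?P<country>[^,()]+)$"
)


def parse_institution(address: str) -> dict:
    """Split an institution string into name, city, region and country."""
    match = ADDRESS_RE.match(address.strip())
    if match is None:
        raise ValueError(f"Cannot parse institution address: {address!r}")
    # Unmatched optional groups (a missing region) come back as None.
    return {
        key: value.strip() if value else None
        for key, value in match.groupdict().items()
    }


if __name__ == "__main__":
    print(parse_institution("Ouranos Consortium, Montreal (Quebec), Canada"))
    print(parse_institution(
        "National Center for Atmospheric Research, Boulder (CO), USA"
    ))
```

With this pattern the last comma-separated field is always the country and the one before it is the city, so a region written as an extra comma-separated field (for example "Montreal, Quebec, Canada") would be read with "Quebec" as the city. That is exactly the ambiguity the parentheses are meant to avoid.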
-
Ah okay, thank you. I can resubmit once registration is open if that's better. Regarding the formatting, I think parentheses for the region are a reasonable way to do it and are easily parsed. I do think it is a good idea to allow regions here regardless of the format chosen (Springfield, USA can mean many different places, for example). Another question: are accents accepted in the 'institution' field? If so, we would use 'Ouranos Consortium, Montréal (Québec), Canada'
-
Thank you. Definitely, we need a way to include regions in the address. No need to resubmit your registration request. We can just leave this issue open until it is solved. Regarding accented characters. In the @larsbuntemeyer what do you think?
-
Hey, sorry for the late response! I'm not really an expert on encoding, but I think we should follow the obviously unwritten convention of CMIP6 (there is no convention on attribute data types) and avoid these special characters. I played around with it a little and, technically, I think it should be no problem (from what I read, default string attributes have type of
-
However, @mccrayc, I think you put a lot of effort into
-
Here is an example downstream application that stumbles on encoding: pangeo-forge/pangeo-forge-recipes#586
-
I didn't think about the accents in @durack1 and @taylor13 did you have any issues with the accents in CMIP6?
-
@gnikulin @larsbuntemeyer for CMIP6, I imposed a Unicode wash, so that accents etc. were dropped before being committed to the repository; see this code block for an example. There were major issues in mapping across to HTML if accents and other unusual characters crept in, so doing this wash early and keeping the repo UTF-8 was the simplest solution.
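For readers who cannot follow the link, a wash of this kind could look roughly like the sketch below. This is not the CMIP6 code, only one common way to drop accents with the Python standard library; the function name `wash` is a hypothetical choice.

```python
import unicodedata


def wash(text: str) -> str:
    """Drop accents by decomposing characters and removing combining marks."""
    # NFKD splits an accented letter into its base letter plus combining marks.
    decomposed = unicodedata.normalize("NFKD", text)
    # Keep everything except the combining marks (the accents themselves).
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


if __name__ == "__main__":
    print(wash("Ouranos Consortium, Montréal (Québec), Canada"))
```

Note that this approach only removes accents that Unicode represents as combining marks. Letters with no decomposition, such as "ø", pass through unchanged, so a real wash might need an explicit replacement table as well.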
-
Maybe we should mention in the registration guidance that this will be done?
-
No worries about the accents, we are fine leaving them out if that's the standard to be followed. I have also updated our institution_id to be uppercase (OURANOS) to be consistent with our CMIP5 ID.
-
@larsbuntemeyer this is an issue with the published data, rather than the controlled vocabularies/registered content. I do not believe the ESGF publisher currently does a Unicode wash before entries are wedged into the ESGF (currently SOLR) indexes, but this would be an interesting question to answer. @sashakames can you speak to publisher behaviours?
-
/register
-
@mccrayc is OURANOS an acronym or simply the name of a consortium?
-
There is no specific "wash" code in the publisher to reformat strings. To clarify, this would be for "full name" strings that get extracted from GA for inclusion in the record, rather than "id" strings. The latest ESGF proposal would be to drop such strings from the record published to the index, so adding such code may become irrelevant. I don't think the name is an acronym; I believe the Montreal institution is named after a Greek word.
-
@gnikulin Ouranos is the name of the consortium, not an acronym. The idea of the capital letters was to have an ID identical to the one used in CMIP5.
-
OURANOS has been registered. Can we close this issue, or do we need more discussion about the accents?
-
institution_id: OURANOS
institution: Ouranos Consortium, Montreal, Quebec, Canada