Implement fixes requested by ESMValTool team #38

Open
9 tasks done
agstephens opened this issue Sep 17, 2020 · 5 comments
@agstephens
Collaborator

agstephens commented Sep 17, 2020

Required steps to manually add fixes:

  • check that our data exhibits the error to be fixed
  • write the xarray fixing code in daops, with associated unit tests
  • add the fix class to dachar.fixes...., with associated unit tests if required
  • decide/agree on content of fix metadata:
    • URL to issue, or code, or both in ESMValTool repo
    • description of the fix
    • "source" property to identify that this came from ESMValTool, maybe with the release version or GitHub commit hash.
  • use inventory to identify all data sets that will be affected
  • generate (by hand?) fix proposals:
    1. create a fix proposal per dataset, OR
    2. create a fix proposal template and a file listing the datasets
  • do we need some kind of QC when the fixes are being proposed?
  • publish fix proposals:
    • Start with a one-off proposal: dachar propose-fixes -p cmip6 <json_file>
    • Adapt to: dachar propose-fixes -p cmip6 --file-list=datasets_files.txt <fix_template.json>
  • process fixes as usual - to accept them

Relevant to:

ESMValGroup/ESMValCore#755

ESMValGroup/ESMValCore#787

@agstephens agstephens changed the title Be prepared to implement fixes requested by ESMValTool team Implement fixes requested by ESMValTool team Sep 21, 2020
@agstephens
Collaborator Author

@ellesmith88 , I have assigned this to the two of us.

@agstephens
Collaborator Author

Suggested content for source:

"source": {
    "name": "esmvaltool...",
    "version": "3.4.5",
    "comment": "any useful info",
    "url": "https:.....issue/34"
}

@ellesmith88
Collaborator

Write-up of what I have done so far. Fixes can now be proposed in two ways:

  1. By providing a JSON file, e.g. https://github.com/roocs/dachar/blob/esmval_fixes/tests/test_fixes/esmval_test_fixes/o2.json
    This option suits a fix that only needs to be proposed for one dataset, or just a few datasets.
    The CLI usage for this is dachar propose-fixes -f <json_file>. More than one JSON file can be provided as a comma-separated list: dachar propose-fixes -f <json_file>,<json_file2>,<json_file3>
    When multiple files are provided, they may describe different fixes and different datasets.

  2. By providing a template (in JSON format) and a list of the datasets that the fix should be proposed for (in a .txt file)
    For example:
    Template: https://github.com/roocs/dachar/blob/esmval_fixes/tests/test_fixes/esmval_test_fixes/o2_template.json
    Dataset list: https://github.com/roocs/dachar/blob/esmval_fixes/tests/test_fixes/esmval_test_fixes/o2_fix_ds_list.txt
    The CLI usage for this is dachar propose-fixes -t <json_template> -d <dataset_list>

    The fixes are added to the fix proposal store from the cli via these functions:
    https://github.com/roocs/dachar/blob/esmval_fixes/dachar/fixes/generate_proposals.py#L38-L73

    More than one fix can be proposed in the complete json file or the json template.

    The command-line options are checked in cli.py so that incompatible options cannot be used together:
    https://github.com/roocs/dachar/blob/esmval_fixes/dachar/cli.py#L255
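To illustrate the template-plus-dataset-list option, here is a minimal sketch of how one template might be expanded into one proposal per dataset. It is not the code in generate_proposals.py: the function names, the "dataset_id" key, the "ExampleFix" fix ID and the dataset IDs are all hypothetical and chosen only for illustration.

```python
import copy
import json
from typing import Dict, List


def read_dataset_list(text: str) -> List[str]:
    """Parse a dataset list: one dataset ID per line, blank lines ignored."""
    return [line.strip() for line in text.splitlines() if line.strip()]


def expand_template(template: Dict, dataset_ids: List[str]) -> List[Dict]:
    """Return one proposal per dataset, each an independent copy of the template."""
    proposals = []
    for ds_id in dataset_ids:
        # Deep copy so that editing one proposal never changes another.
        proposal = copy.deepcopy(template)
        proposal["dataset_id"] = ds_id  # hypothetical key name
        proposals.append(proposal)
    return proposals


# Hypothetical template content and dataset IDs, for illustration only.
template = json.loads('{"fixes": [{"fix_id": "ExampleFix", "operands": {}}]}')
dataset_ids = read_dataset_list("cmip6.example.ds1\n\ncmip6.example.ds2\n")
proposals = expand_template(template, dataset_ids)

print(json.dumps(proposals, indent=2))
```

The deep copy matters because a template may contain more than one fix: a shallow copy would share the nested "fixes" list between all proposals.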

"source": {
    "name": "esmvaltool...",
    "version": "3.4.5",
    "comment": "any useful info",
    "url": "https:.....issue/34"
}

This records the origin of the fix.

For fixes proposed by dachar itself, the source is currently initialised by the check that identifies that a fix is needed, e.g.
https://github.com/roocs/dachar/blob/esmval_fixes/dachar/analyse/checks/coord_checks.py#L16
There may be a better place for this if it needs to be more specific.

  • The proposed fixes are recorded in the fix proposal store and can be processed and published as normal.

@agstephens
Collaborator Author

@ellesmith88 Thanks for the detailed write-up. This is looking really good.

One thought about the format of the fixes: at the moment the attribute name and value are combined in a single string: key,value. I can think of cases where we might need to preserve the data type of the value, and therefore I reckon it is safest to represent them as dictionaries of key/value pairs rather than as a comma-separated string.

By way of an example, this current fix:

    {"fixes": [
        "long_name,Dissolved Oxygen Concentration",
        "standard_name,mole_concentration_of_dissolved_molecular_oxygen_in_sea_water"
        ]
    },

would change to:

    {"fixes": [
        {"long_name": "Dissolved Oxygen Concentration"},
        {"standard_name": "mole_concentration_of_dissolved_molecular_oxygen_in_sea_water"}
        ]
    },

If we needed to use floats or ints we could then represent them in their native form, e.g.: {"quality_flag": 3}
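A short sketch of why the dictionary form is safer, assuming the fixes are read with Python's standard json module. The helper split_legacy_fix is hypothetical and only mimics how a "key,value" string would have to be split; it is not dachar code.

```python
import json


def split_legacy_fix(entry: str) -> dict:
    """Split a legacy 'key,value' string on the first comma only."""
    key, value = entry.split(",", 1)
    return {key: value}


# Legacy form: every value comes back as a string, so the type is lost.
legacy = ["long_name,Dissolved Oxygen Concentration", "quality_flag,3"]
converted = [split_legacy_fix(entry) for entry in legacy]
print(type(converted[1]["quality_flag"]).__name__)

# Dictionary form: JSON keeps numbers as numbers.
native = json.loads(
    '{"fixes": [{"long_name": "Dissolved Oxygen Concentration"},'
    ' {"quality_flag": 3}]}'
)
print(type(native["fixes"][1]["quality_flag"]).__name__)
```

The string form also needs care when a value itself contains a comma, which is why the sketch splits on the first comma only; the dictionary form avoids that question entirely.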

@ellesmith88
Collaborator

@agstephens Good point, I'll update that 🙂
