Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Feature #2550 compute_diagnostics #2728

Merged
merged 79 commits into from
Nov 17, 2023

Conversation

JohnHalleyGotway
Copy link
Collaborator

@JohnHalleyGotway JohnHalleyGotway commented Nov 6, 2023

Expected Differences

This PR provides major upgrades to the TC-Diag tool through a large number of modified files. While this is a significant amount of work, more changes to TC-Diag are needed in the beta3 development cycle and are described in issue #2729.

Here's an overview of the changes in this PR:

  • Incorporates CSU CIRA Python Diagnostics code:

    • scripts/python/tc_diag is code pulled from https://github.com/robertdemariacira/tc_diag_driver
    • scripts/python/tc_diag/compute_tc_diag.py is the Python script that MET actually calls to interface with the CIRA Python diagnostics code.
    • scripts/python/pyembed contains read_tmp_diag.py and write_tmp_diag.py scripts to facilitate running TC-Diag with MET_PYTHON_EXE pointing to a custom Python installation.
    • Changes to configure.ac, configure, and many Makefiles are caused by adding new directories.
    • Note that data/tc_data/v2023-04-07_gdland_table.dat is added to define the distance to land. I added a note to Further refine TC-Diag tool functionality #2729 about how this overlaps the TC-DLand tool.
  • Made a handful of changes to common library code for convenience during development.

  • Changes to the TC-Diag source code in src/tools/tc_utils/tc_diag.

    • Refine how/what data is passed from Python to the C++ code, support separate sections for specifying storm diagnostics, sounding diagnostics, and custom diagnostics. Also let the Python script define units, long names, and comments to be appended to the bottom of the ASCII output.
  • Do these changes introduce new tools, command line arguments, or configuration file options? [Yes]

    If yes, please describe:

    In TCDiagConfig_default:

    • Adds override_diags config option to specify which diagnostics from the nest should override the parents.
    • Remove output_prefix and replace it with output_base_format.
    • Rename nc_rng_azi_flag with nc_cyc_grid_flag.
    • By default, specify diag_script separately for the "parent" and "nest" domains passing in different yml configuration files.
  • Do these changes modify the structure of existing or add new output data types (e.g. statistic line types or NetCDF variables)? [Yes]

    If yes, please describe:

  • Adds new TC-Diag NetCDF output file.

  • Adds new TC-Diag ASCII output file.

Pull Request Testing

  • Describe testing already performed for these changes:

    Confirmed that the existing unit test in unit_tc_diag.xml runs and produces the two new output files.
    Carefully compared the newly generated ASCII output to exist CIRA Diagnostics output to ensure consistent formatting. I do note some differences in the actual computed diagnostics values, but those diffs will be investigated and fixed through Further refine TC-Diag tool functionality #2729.

  • Recommend testing for the reviewer(s) to perform, including the location of input datasets, and any additional instructions:

    First, read through Further refine TC-Diag tool functionality #2729 to note what additional changes are needed.
    Review these code changes, existing documentation changes, and note that the GHA runs pass with the only difference being the creation of new TC-Diag output files.

This feature branch is compiled and available for your use on seneca in:
/d1/projects/MET/MET_pull_requests/met-12.0.0/beta2/MET-feature_2550_compute_diagnostics

  • Do these changes include sufficient documentation updates, ensuring that no errors or warnings exist in the build of the documentation? [Yes]
    Updates to the TC-Diag chapter, Appendix F about Python embedding, and an unrelated change to the TC-Stat chapter. Please review these on the Files changed tab or in this PR 2728 RTD build.

Further needed updates to documentation are described in #2729.

  • Do these changes include sufficient testing updates? [Yes]
    The internal/test_unit/xml/unit_tc_diag.xml file continues running the same Ian case, but the expected output files are updated to include the newly generated diag.nc and diag.txt output file.

Note that the TC-Diag unit tests should really be expanded to include an example of nesting, as mentioned in #2729.

  • Will this PR result in changes to the test suite? [Yes]

    If yes, describe the new output and/or changes to the existing output:

    Creates 2 new TC-Diag output files.

  • Please complete this pull request review by [Fri 11/17/23].

Pull Request Checklist

See the METplus Workflow for details.

  • Review the source issue metadata (required labels, projects, and milestone).
  • Complete the PR definition above.
  • Ensure the PR title matches the feature or bugfix branch name.
  • Define the PR metadata, as permissions allow.
    Select: Reviewer(s) and Development issue
    Select: Milestone as the version that will include these changes
    Select: Coordinated METplus-X.Y Support project for bugfix releases or MET-X.Y.Z Development project for official releases
  • After submitting the PR, select the ⚙️ icon in the Development section of the right hand sidebar. Search for the issue that this PR will close and select it, if it is not already selected.
  • After the PR is approved, merge your changes. If permissions do not allow this, request that the reviewer do the merge.
  • Close the linked issue and delete your feature or bugfix branch from GitHub.

git-subtree-dir: scripts/python/tc_diag_driver
git-subtree-split: 2faf117e370906f082b9eff0340a4d2ebd67a0ee
…_diag_driver version 0.7.0. Still need to update the TC-Diag C++ code accordingly and decide how to address issues in diagnostics naming conventions and units.
…iTable objects since they are NOT aligned in CIRA diag output files.
…orting in both increasing and decreasing order.
… insertion order using a vector of map keys.
…ics from the python script and write them to NetCDF and ASCII output files.
@JohnHalleyGotway
Copy link
Collaborator Author

JohnHalleyGotway commented Nov 8, 2023

Testing notes:

  1. Test whether local build and GHA build produce the same python-embedding result:
  1. Test whether Python embedding still works when v3.0 is rebuilt:
  • On 11/8/23, rebuilt the existing v3.0 tag on my laptop:
cd /d1/projects/METbaseimage
git checkout v3.0
docker build -t dtcenter/met-base:v3.0_local .
  • The build failed because installing NumPy package version 1.26 results in an error. The fix is listing specific package version numbers, including NumPy version 1.24.2. I also had to remove the --force-reinstall option because even though we installed NumPy 1.24, XArray reinstalled using 1.26, which also errored out.
  1. Build functional v3.1 image locally, manually push to DockerHub, and retest with GHA for this PR.

@JohnHalleyGotway JohnHalleyGotway marked this pull request as ready for review November 9, 2023 23:43
@jvigh
Copy link
Contributor

jvigh commented Nov 17, 2023

@JohnHalleyGotway

Regarding differences of output:

Here is the screenshot of output from CIRA's TCDiag tool:
image

This file can be found in: seneca:/d1/projects/TCDiag/data_output/DIAGNOSTICS/TCDIAG/sample_diagnostics_data_for_IAN_for_HMON_GFS_HWRF/sal092022_avno_doper_2022092400_diag.dat

vs. MET's TCDiag tool:
image

  1. CIRA uses 'AVNO' in the header while we have "GFSO".
  2. I presume the reason that times beyond 24 h are mostly missing is just because the test data do not include the full set of GFS forecast data. Is that correct?
  3. I'm seeing modest but substantial differences in the values that are present (e.g., LAND, RMW).
  4. I note spacing differences between the first row and units for the sounding section.
  5. The units for 850VORT and 200DVRG should be (/S).
  6. The number of levels used in the sounding section differs (not sure if it's important, but it makes comparison more difficult).
  7. There is no TGRD data in the custom section to compare.

docs/Users_Guide/tc-diag.rst Outdated Show resolved Hide resolved
docs/Users_Guide/tc-diag.rst Outdated Show resolved Hide resolved
docs/Users_Guide/tc-diag.rst Outdated Show resolved Hide resolved
docs/Users_Guide/tc-diag.rst Outdated Show resolved Hide resolved
@JohnHalleyGotway
Copy link
Collaborator Author

@jvigh thanks for clearly describing the differences.

  1. CIRA uses 'AVNO' in the header while we have "GFSO".

I asked about this at the last project meeting and @musgrave-kate indicated we should use "GFSO".

  1. I presume the reason that times beyond 24 h are mostly missing is just because the test data do not include the full set of GFS forecast data. Is that correct?

Correct, the existing unit test only uses GFS data out to 24 h. The need to enhance the unit tests is noted in issue #2729.

  1. I'm seeing modest but substantial differences in the values that are present (e.g., LAND, RMW).

Agreed, the computed values do not match well enough yet. In particular, there is an issue with winds that @robertdemariacira has been tasked with investigating. These differences are noted in issue #2729.

  1. I note spacing differences between the first row and units for the sounding section.

Yes, I see that. This is the difference between fixed-width FORTRAN output versus putting 1 space between columns. Let's ask @musgrave-kate if this matters, and if so, I can define a minimum width for the first column in issue #2729.

  1. The units for 850VORT and 200DVRG should be (/S).

I assume that when the Python code returns all bad data it doesn't return a units string. Hopefully once the wind issues are resolved, so too will this units mismatch.

  1. The number of levels used in the sounding section differs (not sure if it's important, but it makes comparison more difficult).

Yes, I configured MET with the levels @musgrave-kate indicated we should use by default. The CIRA-computed diagnostics include more levels. Let's follow up with @musgrave-kate about this.

  1. There is no TGRD data in the custom section to compare.

Agreed, the CIRA Python code isn't returning it.

I'll update #2729 to further investigate and address these discrepancies.

@jvigh
Copy link
Contributor

jvigh commented Nov 17, 2023

I note that the NetCDF output file is not CF-compliant. Not sure what NOAA's requirements are, but guessing this is okay. If CF compliance was desired, here's the compliance report.
IOOS Compliance Checker.pdf

The main omission is lack of standard_name attributes, which I don't think exist for these diagnostics.

Some file-level metadata might be good to include though.

Minor point: storm_id and model could be provided as global attributes of the file rather than variables. But I'm not sure there's any convention. If they are going to remain as variables, they should have long_name and/or description attributes.

@JohnHalleyGotway
Copy link
Collaborator Author

Agreed. That's why I want to review/revise the output formats before documenting them ;)

Please note any specific requests on issue #2729.

@jvigh
Copy link
Contributor

jvigh commented Nov 17, 2023 via email

docs/Users_Guide/tc-diag.rst Outdated Show resolved Hide resolved
Copy link
Contributor

@jvigh jvigh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have left comments in the code review pane and the PR discussion. If these are all handled or identified as to-do items for later, I approve this PR. Great work!

@JohnHalleyGotway JohnHalleyGotway merged commit 36868c1 into develop Nov 17, 2023
6 of 9 checks passed
@JohnHalleyGotway JohnHalleyGotway deleted the feature_2550_compute_diagnostics branch November 17, 2023 22:50
JohnHalleyGotway pushed a commit that referenced this pull request Nov 17, 2023
Co-authored-by: jprestop <[email protected]>
Co-authored-by: Seth Linden <[email protected]>
Co-authored-by: John Halley Gotway <[email protected]>
Co-authored-by: Daniel Adriaansen <[email protected]>
Co-authored-by: John and Cindy <[email protected]>
Co-authored-by: rgbullock <[email protected]>
Co-authored-by: Randy Bullock <[email protected]>
Co-authored-by: Dave Albo <[email protected]>
Co-authored-by: Howard Soh <[email protected]>
Co-authored-by: George McCabe <[email protected]>
Co-authored-by: hsoh-u <[email protected]>
Co-authored-by: MET Tools Test Account <[email protected]>
Co-authored-by: Seth Linden <[email protected]>
Co-authored-by: lisagoodrich <[email protected]>
Co-authored-by: davidalbo <[email protected]>
Co-authored-by: Lisa Goodrich <[email protected]>
Co-authored-by: metplus-bot <[email protected]>
Co-authored-by: j-opatz <[email protected]>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: Jonathan Vigh <[email protected]>
Co-authored-by: Tracy Hertneky <[email protected]>
Co-authored-by: David Albo <[email protected]>
Co-authored-by: Dan Adriaansen <[email protected]>
Co-authored-by: Julie Prestopnik <[email protected]>
Co-authored-by: root <root@83062d57c5dd>
fix 2518 dtypes appf docs (#2519)
fix 2531 compilation errors (#2533)
fix #2531 compilation_errors_configure (#2535)
fix #2514 develop clang (#2563)
fix #2575 develop python_convert (#2576)
Fix Python environment issue (#2407)
fix definitions of G172 and G220 based on comments in NOAA-EMC/NCEPLIBS-w3emc#157. (#2406)
fix #2380 develop override (#2382)
fix #2408 develop empty config (#2410)
fix #2390 develop compile zlib (#2404)
fix #2412 develop climo (#2422)
fix #2437 develop convert (#2439)
fix for develop, for #2437, forgot one reference to the search_parent for a dictionary lookup.
fix #2452 develop airnow (#2454)
fix #2449 develop pdf (#2464)
fix #2402 develop sonarqube (#2468)
fix #2426 develop buoy (#2475)
fix 2596 main v11.1 rpath compilation (#2614)
fix #2514 main_v11.1 clang (#2628)
fix #2644 develop percentile (#2647)
fix #2730 develop SI_BCU (#2732)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Status: 🏁 Done
Development

Successfully merging this pull request may close these issues.

Enhance TC-Diag to actually compute and write diagnostics
2 participants