Port to Orion Rocky 9 #966

Merged 3 commits on Jun 26, 2024

@GeorgeGayno-NOAA
Collaborator

GeorgeGayno-NOAA commented Jun 24, 2024

DESCRIPTION OF CHANGES:

Orion was recently upgraded to Rocky 9.

  • Update the build module to use Rocky 9 spack-stack.
  • Replace hardcoding of grib utility programs in the regression test driver scripts with module loads.
  • Update the machine_setup.sh script, which did not recognize Orion after the upgrade.
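
The idea behind the second bullet can be sketched as follows. This is a minimal illustration in Python, not the shell code of the actual driver scripts. It assumes that loading a module places the utility on PATH; the helper name `resolve_tool` and the optional override variable are hypothetical and do not come from this PR.

```python
import os
import shutil


def resolve_tool(name, override_var=None):
    """Return the path to an executable without hardcoding its location.

    Hypothetical helper: it first honors an optional environment variable
    (override_var), then falls back to searching PATH, which is where a
    loaded module is assumed to place the program.
    """
    if override_var:
        override = os.environ.get(override_var)
        if override:
            return override
    found = shutil.which(name)
    if found is None:
        raise FileNotFoundError(
            f"{name} not found on PATH; was the corresponding module loaded?"
        )
    return found
```

Failing loudly when the program is missing makes an unloaded module obvious at the start of a test run rather than partway through it.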

TESTS CONDUCTED:

DEPENDENCIES:

NOAA-EMC/global-workflow#2694

DOCUMENTATION:

N/A

ISSUE:

Fixes #963.

@GeorgeGayno-NOAA
Collaborator Author

@DeniseWorthen - the cpld_gridgen and ocnice_prep tests are failing. This is not surprising since I am pointing to a new spack-stack on Orion. The differences from the baseline look small to me. But can you please check? I placed all the log files here: /work/noaa/stmp/ggayno/for.denise.

@DeniseWorthen
Contributor

I compared your runs against the Hercules baseline, since for UWM the baselines on Orion/R9 can reproduce those on Hercules/R9. Compared this way, cpld_gridgen appears to be bit-for-bit (b4b) identical, while ocnice_prep shows very small differences, and only in the downscaled velocity fields.

The differences I see in your Orion/R9 baselines are very small and, along with the comparison above, I think they're acceptable and expected with the OS upgrade. Thanks.

@GeorgeGayno-NOAA
Collaborator Author

Great. Thanks for checking.

@GeorgeGayno-NOAA
Collaborator Author

GeorgeGayno-NOAA commented Jun 24, 2024

The regression tests were run on Orion using 3923373. The results:

Some chgres_cube tests failed:

consistency.log01:<<< C96 FV3 RESTART TEST FAILED. >>>
consistency.log02:<<< C192 FV3 HISTORY TEST FAILED. >>>
consistency.log03:<<< C96 FV3 GAUSSIAN NETCDF TEST PASSED. >>>
consistency.log04:<<< C192 GFS GRIB2 TEST FAILED. >>>
consistency.log05:<<< 25-KM CONUS GFS GRIB2 TEST FAILED. >>>
consistency.log06:<<< 3-km CONUS HRRR W/ GFS PHYSICS GRIB2 TEST FAILED. >>>
consistency.log07:<<< 3-km CONUS HRRR W/ GSD PHYSICS AND SFC FROM FILE GRIB2 TEST FAILED. >>>
consistency.log08:<<< 13-KM CONUS NAM GRIB2 TEST FAILED. >>>
consistency.log09:<<< 13-km CONUS RAP W/ GSD PHYSICS AND SFC FROM FILE GRIB2 TEST FAILED. >>>
consistency.log10:<<< 13-KM NA GFS NCEI GRIB2 TEST FAILED. >>>
consistency.log11:<<< C96 FV3 GAUSSIAN NETCDF2WAM TEST FAILED. >>>
consistency.log12:<<< 25-KM CONUS GFS PGRIB2+BGRIB2 TEST FAILED. >>>
consistency.log13:<<< C96 GEFS GRIB2 TEST PASSED. >>>

Some cpld_gridgen tests failed:

025 failed
050 failed
100 failed

All global_cycle tests passed:

consistency.log01:<<< C768 GLOBAL CYCLE TEST PASSED. >>>
consistency.log02:<<< C192 GSI based LANDINC SOIL-NOAHMP CYCLE TEST PASSED. >>>
consistency.log03:<<< C768 LANDINC SNOW CYCLE TEST PASSED. >>>
consistency.log04:<<< C48 NOAHMP FRAC GRID TEST PASSED. >>>
consistency.log05:<<< C192 JEDI based LANDINC SOIL-NOAHMP CYCLE TEST PASSED. >>>

All grid_gen tests passed:

consistency.log01:<<< C96 UNIFORM TEST PASSED. >>>
consistency.log02:<<< C96 VIIRS BNU TEST PASSED. >>>
consistency.log03:<<< GFDL REGIONAL TEST PASSED. >>>
consistency.log04:<<< ESG REGIONAL TEST PASSED. >>>
consistency.log05:<<< ESG REGIONAL PERCENT CATEGORY TEST PASSED. >>>
consistency.log06:<<< REGIONAL 12 THREAD GSL GWD TEST PASSED. >>>
consistency.log07:<<< REGIONAL 24 THREAD GSL GWD TEST PASSED. >>>

The ice_blend test passed:

<<< ICE BLEND TEST PASSED. >>>

Some ocnice_prep tests failed:

050_ocean failed
100_ocean failed

The snow2mdl tests passed:

<<< SNOW2MDL OPS TEST PASSED. >>>
<<< SNOW2MDL GLOBAL TEST PASSED. >>>

The weight_gen test passed:

<<< WEIGHT_GEN TEST PASSED. >>>
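
The consistency-log lines above follow a regular `<<< NAME TEST PASSED. >>>` or `<<< NAME TEST FAILED. >>>` pattern, so a tally can be scripted. The following is a minimal Python sketch that assumes this pattern holds for every log. The function name `summarize` is hypothetical and is not part of the UFS_UTILS regression test drivers. The cpld_gridgen and ocnice_prep lines (for example "025 failed") use a different format and are deliberately not matched.

```python
import re
from collections import Counter

# Matches lines such as:
#   consistency.log01:<<< C96 FV3 RESTART TEST FAILED. >>>
RESULT_RE = re.compile(
    r"<<<\s*(?P<name>.+?)\s+TEST\s+(?P<status>PASSED|FAILED)\.\s*>>>"
)


def summarize(lines):
    """Map each test name to 'PASSED' or 'FAILED' for lines that match."""
    results = {}
    for line in lines:
        match = RESULT_RE.search(line)
        if match:
            results[match.group("name")] = match.group("status")
    return results


sample = [
    "consistency.log01:<<< C96 FV3 RESTART TEST FAILED. >>>",
    "consistency.log03:<<< C96 FV3 GAUSSIAN NETCDF TEST PASSED. >>>",
    "<<< ICE BLEND TEST PASSED. >>>",
    "025 failed",  # different format, ignored by this sketch
]
results = summarize(sample)
counts = Counter(results.values())
print(counts["PASSED"], "passed,", counts["FAILED"], "failed")
```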

@GeorgeGayno-NOAA
Collaborator Author

The chgres_cube tests showed only very small differences from the baseline, which is expected given just an OS update. An example of a surface file difference:

+ nccmp -dmfqS out.sfc.tile1.nc /work/noaa/nems/role-nems/ufs_utils/reg_tests/chgres_cube/baseline_data/c96_fv3_restart/out.sfc.tile1.nc
Variable Group Count          Sum      AbsSum          Min          Max       Range         Mean      StdDev
f10m     /         1  1.11022e-16 1.11022e-16  1.11022e-16  1.11022e-16           0  1.11022e-16           0
uustar   /         2 -5.55112e-17 1.66533e-16 -1.11022e-16  5.55112e-17 1.66533e-16 -2.77556e-17 1.17757e-16
ffmm     /         1  3.55271e-15 3.55271e-15  3.55271e-15  3.55271e-15           0  3.55271e-15           0
tprcp    /         2 -3.38813e-21 3.38813e-21  -2.5411e-21 -8.47033e-22 1.69407e-21 -1.69407e-21 1.19789e-21
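
For scale, the f10m difference of 1.11022e-16 equals 2^-53 to the printed precision, which is at the level of double-precision rounding. A quick way to screen a table like the one above is to extract the largest absolute difference per variable and compare it with a tolerance. The following is a minimal Python sketch that assumes the whitespace-separated column layout shown above; the function name `max_abs_diffs` and the tolerance value are illustrative assumptions, not an established UFS_UTILS acceptance criterion.

```python
def max_abs_diffs(table_text):
    """Map variable name -> largest absolute difference in an nccmp -S table.

    Assumes a header row containing 'Min' and 'Max' columns followed by one
    whitespace-separated row per variable, as in the output shown above.
    """
    rows = [ln.split() for ln in table_text.strip().splitlines() if ln.strip()]
    header = rows[0]
    i_min, i_max = header.index("Min"), header.index("Max")
    return {
        row[0]: max(abs(float(row[i_min])), abs(float(row[i_max])))
        for row in rows[1:]
    }


TABLE = """
Variable Group Count          Sum      AbsSum          Min          Max       Range         Mean      StdDev
f10m     /         1  1.11022e-16 1.11022e-16  1.11022e-16  1.11022e-16           0  1.11022e-16           0
uustar   /         2 -5.55112e-17 1.66533e-16 -1.11022e-16  5.55112e-17 1.66533e-16 -2.77556e-17 1.17757e-16
ffmm     /         1  3.55271e-15 3.55271e-15  3.55271e-15  3.55271e-15           0  3.55271e-15           0
tprcp    /         2 -3.38813e-21 3.38813e-21  -2.5411e-21 -8.47033e-22 1.69407e-21 -1.69407e-21 1.19789e-21
"""

TOLERANCE = 1e-12  # arbitrary illustrative threshold, not a project standard
diffs = max_abs_diffs(TABLE)
within_tolerance = all(value <= TOLERANCE for value in diffs.values())
```

These are absolute differences, so a fixed tolerance like this is only meaningful for fields whose values are of order one or smaller; a relative check would be the safer choice in general.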

@GeorgeGayno-NOAA
Collaborator Author

Denise checked the cpld_gridgen and ocnice_prep results, and they were consistent with an OS update:

#966 (comment)

@GeorgeGayno-NOAA GeorgeGayno-NOAA marked this pull request as ready for review June 24, 2024 19:31
@GeorgeGayno-NOAA
Collaborator Author

@KateFriedman-NOAA and @DavidHuber-NOAA - could someone please test in the workflow?

@DavidHuber-NOAA
Collaborator

@GeorgeGayno-NOAA The global workflow has not been ported to Orion-rocky9 yet. This may still be a couple of weeks off. I'm happy to test when it is ready and I'm OK with opening another issue/PR if issues are encountered at that time. So if you would like to merge now, the G-W team is fine with that. Tagging @aerorahul for awareness.

@GeorgeGayno-NOAA
Collaborator Author

Ok. I don't have a problem waiting. But if a UFS_UTILS user must have access to Orion, I will have to merge early.

Contributor

@aerorahul aerorahul left a comment

Looks fine to merge.
Just a note for other reviewers: grib-util should be grib_util if the WCOSS2 names are to be consistent.

@GeorgeGayno-NOAA GeorgeGayno-NOAA merged commit 3ef2e6b into ufs-community:develop Jun 26, 2024
4 checks passed
@GeorgeGayno-NOAA GeorgeGayno-NOAA deleted the rocky9_orion branch September 25, 2024 19:03
Successfully merging this pull request may close these issues.

Update for Rocky 9 on Orion