Convert staging job to python and yaml #2651

Merged


KateFriedman-NOAA
Member

@KateFriedman-NOAA KateFriedman-NOAA commented Jun 3, 2024

Description

This PR converts the staging job from shell to python and introduces the use of yaml.

Changes in this PR:

  1. Rename scripts/exglobal_stage_ic.sh to scripts/exglobal_stage_ic.py.
  2. Update jobs/JGLOBAL_STAGE_IC to call the ex-script with its new .py extension. Move the COM* variable declarations down into the yaml and the member loop down into python. Move the GDATE/gPDY/gcyc settings up from the ex-script to the JJOB and replace them with the newer cycle variables (as done in the forecast job).
  3. Create parm/stage folder to hold newly created stage.yaml.j2, which both mimics forecast-only functionality in existing scripts/exglobal_stage_ic.sh and adds functionality for cycled mode.
  4. Create ush/python/pygfs/task/stage.py to house the staging job python functions called from scripts/exglobal_stage_ic.py.
  5. Remove the stage_ic job rocoto dependencies from the xml. They are not needed, and removing them eliminates an area of duplicate maintenance.
  6. Add cycled staging jobs for gdas and enkf suites.
  7. Rename model_data to model in the COM templates (issue #2686).

There will now be distinct stage_ic jobs for each RUN: gdasstage_ic, gfsstage_ic, enkfgdasstage_ic, stage_ic (for gefs).
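
As a rough illustration of the yaml-driven approach in items 2 to 4, here is a minimal sketch of how a staging specification with `mkdir` and `copy` sections could be built per ensemble member and then executed. It is not the code in ush/python/pygfs/task/stage.py: the function names, the directory layout, and the `ic.nc` file name are hypothetical, and the spec is built as a plain Python dict here rather than rendered from stage.yaml.j2.

```python
import shutil
import tempfile
from pathlib import Path


def build_stage_spec(ics_dir: Path, com_dir: Path, members: int) -> dict:
    """Build a mkdir/copy spec for each ensemble member (hypothetical layout)."""
    spec = {"mkdir": [], "copy": []}
    for mem in range(1, members + 1):
        mem_name = f"mem{mem:03d}"  # mem001, mem002, ...
        # Assumed destination layout; the real COM templates may differ.
        dest_dir = com_dir / mem_name / "model" / "atmos" / "input"
        spec["mkdir"].append(dest_dir)
        # "ic.nc" is a placeholder, not one of the real initial-condition files.
        spec["copy"].append([ics_dir / mem_name / "ic.nc", dest_dir / "ic.nc"])
    return spec


def run_stage_spec(spec: dict) -> list:
    """Create every directory in spec['mkdir'], then copy each [src, dest] pair."""
    for directory in spec.get("mkdir", []):
        Path(directory).mkdir(parents=True, exist_ok=True)
    staged = []
    for src, dest in spec.get("copy", []):
        shutil.copy2(src, dest)
        staged.append(Path(dest))
    return staged


def demo() -> list:
    """Stage two fake members in a temporary directory and list what was copied."""
    with tempfile.TemporaryDirectory() as tmp:
        ics_dir, com_dir = Path(tmp) / "ics", Path(tmp) / "com"
        for mem_name in ("mem001", "mem002"):
            (ics_dir / mem_name).mkdir(parents=True)
            (ics_dir / mem_name / "ic.nc").write_text("placeholder")
        staged = run_stage_spec(build_stage_spec(ics_dir, com_dir, members=2))
        return [str(path.relative_to(com_dir)) for path in staged if path.exists()]
```

Keeping the file list in a declarative spec and the loop in python means that adding a file to the staging job only touches the spec, not the copy logic.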

Related work set up a new symlink folder structure under the ICSDIR folder on the supported platforms for use by the updated staging job.

Resolves #2475
Resolves #2650
Resolves #2686

Type of change

  • New feature (adds functionality)

Change characteristics

  • Is this a breaking change (a change in existing functionality)? NO
  • Does this change require a documentation update? NO

How has this been tested?

Ran all of the CI tests (either just staging jobs or end-to-end) on WCOSS2. Ran some of the CI tests on Hera and Hercules.

Ran some control CI tests from develop and compared their outputs against the same CI tests run from the staging branch. The control and test outputs matched.

KateFriedman-NOAA and others added 30 commits May 14, 2024 14:57
- Change the extension of the exglobal_stage ex-script
from "sh" to "py".

Refs NOAA-EMC#2475
- Update script extension for ex-script from "sh" to "py".
- Pull COM* variable declares up from ex-script.

Refs NOAA-EMC#2475
Remove the functions and calls to set up
symlinks to ICs in ROTDIR.

Refs NOAA-EMC#2475
- Add initial new yaml files for staging information
- Add new stage.py to python tasks.
- Add first draft pythonization of stage ex-script.

Much more work is still to be done.

Refs NOAA-EMC#2475
Revert changes to workflow/setup_expt.py; will do in later task

Refs NOAA-EMC#2475

* upstream/develop:
  Sea-ice analysis insertion (NOAA-EMC#2584)
  Refactored archiving (NOAA-EMC#2491)
  Add remove RUNDIRS step in CI before creating experements (NOAA-EMC#2607)
Also fix to set target and remove source

Refs NOAA-EMC#2475
Add target, remove source, and update file info

Refs NOAA-EMC#2475
Update to use mkdir and copy instead of target and required

Refs NOAA-EMC#2475
If RUN=gefs add keys_gefs to keys.

Refs NOAA-EMC#2475
Set to .false. by default; needed for staging job

Refs NOAA-EMC#2475
- remove master yaml, no longer using
- update fv3_cold, ice, ocean, and wave yamls

Refs NOAA-EMC#2475
- Add keys for GEFS
- Cleanup

Refs NOAA-EMC#2475
- General cleanup
- Rework determine_stage function
- Rework execute_stage function

Refs NOAA-EMC#2475
- Delete da.yaml.j2; will remake in follow-up work
- Update fv3_warm.yaml.j2 to not use src/head variables

Refs NOAA-EMC#2475

emcbot commented Aug 19, 2024

Experiment C96C48_ufs_hybatmDA_13dfad25 FAIL on Wcoss2 at 08/19/24 06:14:29 PM

Error logs:

/lfs/h2/emc/global/noscrub/globalworkflow.ci/GFS_CI_ROOT/PR/2651/RUNTESTS/COMROOT/C96C48_ufs_hybatmDA_13dfad25/logs/2024022318/enkfgdasfcst_mem001.log
/lfs/h2/emc/global/noscrub/globalworkflow.ci/GFS_CI_ROOT/PR/2651/RUNTESTS/COMROOT/C96C48_ufs_hybatmDA_13dfad25/logs/2024022318/enkfgdasfcst_mem002.log

Follow link here to view the contents of the above file(s): (link)

- Add ratminc.nc to the list of files in the analysis
staging yaml.
- Remove "gdas" condition for including analysis yaml.

Refs NOAA-EMC#2475
@KateFriedman-NOAA KateFriedman-NOAA added CI-Wcoss2-Ready **CM use only** PR is ready for CI testing on WCOSS and removed CI-Wcoss2-Failed **Bot use only** CI testing on WCOSS for this PR has failed labels Aug 19, 2024
@emcbot emcbot added CI-Wcoss2-Building **Bot use only** CI testing is cloning/building on WCOSS and removed CI-Wcoss2-Ready **CM use only** PR is ready for CI testing on WCOSS labels Aug 19, 2024

emcbot commented Aug 19, 2024

CI Update on Wcoss2 at 08/19/24 08:56:04 PM
============================================
Cloning and Building global-workflow PR: 2651
with PID: 91691 on host: clogin03

@emcbot emcbot added CI-Wcoss2-Running **Bot use only** CI testing on WCOSS for this PR is in-progress and removed CI-Wcoss2-Building **Bot use only** CI testing is cloning/building on WCOSS labels Aug 19, 2024

emcbot commented Aug 19, 2024

Automated global-workflow Testing Results:

Machine: Wcoss2
Start: Mon Aug 19 21:06:34 UTC 2024 on clogin03
---------------------------------------------------
Build: Completed at 08/19/24 09:46:15 PM
Case setup: Completed for experiment C48_ATM_ab35ffce
Case setup: Skipped for experiment C48mx500_3DVarAOWCDA_ab35ffce
Case setup: Skipped for experiment C48_S2SWA_gefs_ab35ffce
Case setup: Completed for experiment C48_S2SW_ab35ffce
Case setup: Completed for experiment C96_atm3DVar_extended_ab35ffce
Case setup: Skipped for experiment C96_atm3DVar_ab35ffce
Case setup: Completed for experiment C96_atmaerosnowDA_ab35ffce
Case setup: Completed for experiment C96C48_hybatmDA_ab35ffce
Case setup: Completed for experiment C96C48_ufs_hybatmDA_ab35ffce

@emcbot emcbot added CI-Wcoss2-Passed **Bot use only** CI testing on WCOSS for this PR has completed successfully and removed CI-Wcoss2-Running **Bot use only** CI testing on WCOSS for this PR is in-progress labels Aug 20, 2024

emcbot commented Aug 20, 2024

All CI Test Cases Passed on Wcoss2:

Experiment C48_ATM_ab35ffce *** SUCCESS *** at 08/19/24 11:00:15 PM
Experiment C48_S2SW_ab35ffce *** SUCCESS *** at 08/19/24 11:14:17 PM
Experiment C96C48_hybatmDA_ab35ffce *** SUCCESS *** at 08/19/24 11:56:21 PM
Experiment C96_atmaerosnowDA_ab35ffce *** SUCCESS *** at 08/20/24 12:49:24 AM
Experiment C96C48_ufs_hybatmDA_ab35ffce *** SUCCESS *** at 08/20/24 01:21:14 AM
Experiment C96_atm3DVar_extended_ab35ffce *** SUCCESS *** at 08/20/24 08:42:33 AM

Contributor

@DavidHuber-NOAA DavidHuber-NOAA left a comment

Looks good!

@KateFriedman-NOAA KateFriedman-NOAA added the CI-Hercules-Ready **CM use only** PR is ready for CI testing on Hercules label Aug 20, 2024
@emcbot emcbot added CI-Hercules-Building **Bot use only** CI testing is cloning/building on Hercules CI-Hercules-Running **Bot use only** CI testing on Hercules for this PR is in-progress CI-Hercules-Passed **Bot use only** CI testing on Hercules for this PR has completed successfully and removed CI-Hercules-Ready **CM use only** PR is ready for CI testing on Hercules CI-Hercules-Building **Bot use only** CI testing is cloning/building on Hercules CI-Hercules-Running **Bot use only** CI testing on Hercules for this PR is in-progress labels Aug 20, 2024

emcbot commented Aug 20, 2024

CI Passed on Hercules in Build# 1
Built and ran in directory /work2/noaa/stmp/CI/HERCULES/2651


Experiment C48_ATM_ab35ffce Completed 1 Cycles: *SUCCESS* at Tue Aug 20 10:48:23 CDT 2024
Experiment C96_atm3DVar_ab35ffce Completed 3 Cycles: *SUCCESS* at Tue Aug 20 11:55:15 CDT 2024
Experiment C96C48_hybatmDA_ab35ffce Completed 3 Cycles: *SUCCESS* at Tue Aug 20 11:55:17 CDT 2024
Experiment C48_S2SWA_gefs_ab35ffce Completed 1 Cycles: *SUCCESS* at Tue Aug 20 12:08:11 CDT 2024
Experiment C48_S2SW_ab35ffce Completed 1 Cycles: *SUCCESS* at Tue Aug 20 12:25:27 CDT 2024

@WalterKolczynski-NOAA WalterKolczynski-NOAA merged commit 659bcbe into NOAA-EMC:develop Aug 20, 2024
11 of 12 checks passed
@KateFriedman-NOAA KateFriedman-NOAA deleted the feature/issue_2475 branch August 20, 2024 17:41
@KateFriedman-NOAA
Member Author

@NeilBarton-NOAA This PR has just been merged into develop. Let me know how I can help update your PR branch to accommodate the staging job refactor.

@NeilBarton-NOAA
Contributor

Thanks @KateFriedman-NOAA, I'm currently testing, and I will let you know when I have questions.

DavidHuber-NOAA added a commit to DavidHuber-NOAA/global-workflow that referenced this pull request Aug 21, 2024
* origin/develop:
  support ATM forecast only on Azure (NOAA-EMC#2827)
  Convert staging job to python and yaml (NOAA-EMC#2651)
  Fixed test on UNAVAILBLE in python Rocoto check (NOAA-EMC#2842)
@guillaumevernieres
Contributor

guillaumevernieres commented Aug 27, 2024

@KateFriedman-NOAA, it looks like this PR broke the WCDA/gfsv17 tests. Was that tested? I see the Hera CI is green, but I don't see what was actually tested.

Edit: just seeing @RussTreadon-NOAA's issue ... I suspect that's my issue as well, so ignore for now

DavidHuber-NOAA added a commit to DavidHuber-NOAA/global-workflow that referenced this pull request Sep 9, 2024
* origin/develop:
  Create JEDI class (NOAA-EMC#2805)
  Restructure the bufr sounding job (NOAA-EMC#2853)
  Add an archive task to GEFS system to archive files locally (NOAA-EMC#2816)
  Reenable Orion Cycling Support (NOAA-EMC#2877)
  Eliminate race conditions and remove DATAROOT last in cleanup (NOAA-EMC#2893)
  Update aerosol climatology to 2013-2024 mean (NOAA-EMC#2888)
  Add ability to run CI test C96_atm3DVar.yaml to Gaea-C5 (NOAA-EMC#2885)
  Support global-workflow GEFS C48 on Google Cloud (NOAA-EMC#2861)
  Add 3 and 9 hr increment files to IC staging (NOAA-EMC#2876)
  Add diffusion/diag B for aerosol DA and some other needed changes (NOAA-EMC#2738)
  Correct ocean `MOM.res_#` stage copy (NOAA-EMC#2868)
  Support coupling on AWS (NOAA-EMC#2859)
  Add JEDI ATM lgetkf observer and solver jobs (NOAA-EMC#2833)
  Fix gdas build on Gaea and add Gaea to available CI list (NOAA-EMC#2857)
  Support ATM forecast only on Google (NOAA-EMC#2832)
  Add GEFS C48 support on AWS (NOAA-EMC#2818)
  Update omega calculation (NOAA-EMC#2751)
  Add snow DA update and recentering for the EnKF forecasts (NOAA-EMC#2690)
  support ATM forecast only on Azure (NOAA-EMC#2827)
  Convert staging job to python and yaml (NOAA-EMC#2651)
  Fixed test on UNAVAILBLE in python Rocoto check (NOAA-EMC#2842)
Labels
CI-Hera-Passed **Bot use only** CI testing on Hera for this PR has completed successfully CI-Hercules-Passed **Bot use only** CI testing on Hercules for this PR has completed successfully CI-Wcoss2-Passed **Bot use only** CI testing on WCOSS for this PR has completed successfully