Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Should "EXPT_SUBDIR" be a mandatory variable? #978

Closed
mkavulich opened this issue Nov 29, 2023 · 0 comments · Fixed by #997
Closed

Should "EXPT_SUBDIR" be a mandatory variable? #978

mkavulich opened this issue Nov 29, 2023 · 0 comments · Fixed by #997
Assignees
Labels
bug Something isn't working

Comments

@mkavulich
Copy link
Collaborator

mkavulich commented Nov 29, 2023

Description

Currently, the Users Guide (as well as in-line documentation in config_defaults.yaml) states that EXPT_SUBDIR is a mandatory variable: it must be specified by the user. This seems draconian to me, as the only thing it is used for is setting the subdirectory where an experiment is run. And what's more, if the user specifies their own custom full-path value for EXPTDIR, it doesn't get used at all!

As an additional headache, there is currently a bug in the workflow generation that fails in a very messy and unintuitive way if EXPT_SUBDIR is not set:

ln: target ‘}}/launch_FV3LAM_wflow.sh’ is not a directory
  File "/scratch1/BMC/hmtb/kavulich/UFS/workdir/issue_966/ufs-srweather-app/ush/./generate_FV3LAM_wflow.py", line 768, in <module>
    expt_dir = generate_FV3LAM_wflow(USHdir, wflow_logfile, pargs.debug)
  File "/scratch1/BMC/hmtb/kavulich/UFS/workdir/issue_966/ufs-srweather-app/ush/./generate_FV3LAM_wflow.py", line 143, in generate_FV3LAM_wflow
    create_symlink_to_file(
  File "/scratch1/BMC/hmtb/kavulich/UFS/workdir/issue_966/ufs-srweather-app/ush/python_utils/create_symlink_to_file.py", line 53, in create_symlink_to_file
    ln_vrfy(f"-sf {relative_flag} {target} {symlink}")
  File "/scratch1/BMC/hmtb/kavulich/UFS/workdir/issue_966/ufs-srweather-app/ush/python_utils/filesys_cmds_vrfy.py", line 41, in ln_vrfy
    return cmd_vrfy("ln", *args)
  File "/scratch1/BMC/hmtb/kavulich/UFS/workdir/issue_966/ufs-srweather-app/ush/python_utils/filesys_cmds_vrfy.py", line 20, in cmd_vrfy
    print_err_msg_exit(f"System call '{cmd}' failed.")
  File "/scratch1/BMC/hmtb/kavulich/UFS/workdir/issue_966/ufs-srweather-app/ush/python_utils/print_msg.py", line 20, in print_err_msg_exit
    traceback.print_stack(file=sys.stderr)
FATAL ERROR: System call 'ln -sf  /scratch1/BMC/hmtb/kavulich/UFS/workdir/issue_966/ufs-srweather-app/ush/launch_FV3LAM_wflow.sh /scratch1/BMC/hmtb/kavulich/UFS/workdir/issue_966/expt_dirs/{{ EXPT_SUBDIR }}/launch_FV3LAM_wflow.sh' failed.
Exiting with nonzero status.

*********************************************************************
FATAL ERROR:
Experiment generation failed. See the error message(s) printed below.
For more detailed information, check the log file from the workflow
generation script: /scratch1/BMC/hmtb/kavulich/UFS/workdir/issue_966/ufs-srweather-app/ush/log.generate_FV3LAM_wflow
*********************************************************************

Traceback (most recent call last):
  File "/scratch1/BMC/hmtb/kavulich/UFS/workdir/issue_966/ufs-srweather-app/ush/./generate_FV3LAM_wflow.py", line 768, in <module>
    expt_dir = generate_FV3LAM_wflow(USHdir, wflow_logfile, pargs.debug)
  File "/scratch1/BMC/hmtb/kavulich/UFS/workdir/issue_966/ufs-srweather-app/ush/./generate_FV3LAM_wflow.py", line 143, in generate_FV3LAM_wflow
    create_symlink_to_file(
  File "/scratch1/BMC/hmtb/kavulich/UFS/workdir/issue_966/ufs-srweather-app/ush/python_utils/create_symlink_to_file.py", line 53, in create_symlink_to_file
    ln_vrfy(f"-sf {relative_flag} {target} {symlink}")
  File "/scratch1/BMC/hmtb/kavulich/UFS/workdir/issue_966/ufs-srweather-app/ush/python_utils/filesys_cmds_vrfy.py", line 41, in ln_vrfy
    return cmd_vrfy("ln", *args)
  File "/scratch1/BMC/hmtb/kavulich/UFS/workdir/issue_966/ufs-srweather-app/ush/python_utils/filesys_cmds_vrfy.py", line 20, in cmd_vrfy
    print_err_msg_exit(f"System call '{cmd}' failed.")
  File "/scratch1/BMC/hmtb/kavulich/UFS/workdir/issue_966/ufs-srweather-app/ush/python_utils/print_msg.py", line 24, in print_err_msg_exit
    sys.exit(1)
SystemExit: 1

Solution

One of two things should be changed here:

  1. EXPT_SUBDIR should get a default value. The literal string "experiment" seems appropriate, but it could really be anything so long as it's documented appropriately.
  2. generate_FV3LAM_wflow() should fail with a verbose/explicit error message if EXPT_SUBDIR is not specified. This seems as if it was intended behavior, but was not implemented correctly: there is a check for empty EXPT_SUBDIR in setup.py, but it is not empty if not specified in config.yaml, but rather gets set to the literal string {{ EXPT_SUBDIR }}.

I prefer the first solution, but either would be acceptable.

Requirements**

Users should not receive strange errors if they do not set set EXPT_SUBDIR.

Acceptance Criteria (Definition of Done)

Implement either one of the suggested solutions above

@mkavulich mkavulich added the bug Something isn't working label Nov 29, 2023
@mkavulich mkavulich self-assigned this Nov 29, 2023
MichaelLueken pushed a commit that referenced this issue Jan 11, 2024
…d new winter weather verification test with staged data (#997)

New test
* The new test MET_ensemble_verification_winter_wx is added. This test will exercise a number of yet-untested capabilities in the workflow, including a 10-member ensemble, snowfall verification with staged data (so can be run on all platforms, not just Jet and Hera), and several SPP settings.
* As part of this new test, snowfall observations will now be staged on all tier-1 platforms, as well as netCDF GFS data and other observation types, all for the date 2022020300

Resolved issues
* Incorrect octal notation causing ensemble vx to fail #966 Resolved: In several locations, an explicit conversion is done to ENSMEM_INDX to ensure it is a base-10 integer, to avoid problems with bash interpreting numbers with a leading zero as octal.
* Should "EXPT_SUBDIR" be a mandatory variable? #978 resolved: per discussion in a recent SRW code management meeting, give EXPT_SUBDIR a default value "experiment" to avoid unnecessary complications and work for users. Additionally, the default behavior if an experiment directory already exists is changed to "quit" rather than "delete"
* Issue mentioned in this discussion; the setting fhzero=6 is removed from the weather model namelist for CCPP suite FV3_GFS_v17_p8, which allows precipitation and other accumulations to be made every hour rather than 6 hours (SRW output is always hourly, so this makes sense). Also, update diag_table.FV3_GFS_v17_p8 so that all output files will be hourly
* Per discussion in [release/public-v2.2.0] Fix crontab bug for Cheyenne and Derecho, update PR template for new platforms #939, remove an unnecessary special case in get_crontab_contents.py for Derecho

Other fixes
* Some old files for unsupported/removed CCPP suites are removed
* Add some missing task dependencies for retrieving verification obs

General improvements
* Many improvements to verification obs-pulling task
   * NDAS observations are now retrieved for forecast hour zero, and a better obs file is retrieved for major obs times (00z, 06z, 12z, 18z) per EMC guidance
   * Better in-line comments/documentation
   * Standardize order and messaging for file-on-disk checks across all observation types
* Added explanatory comments for reflectivity field in diag_table files
* Update diag_table.FV3_GFS_v17_p8 so that all output files will be hourly
* Simplify task dependencies that rely on staged verification observations; these "get_obs" tasks should always be run (they check that the data exists before trying to retrieve it), so no need to make the dependency conditional
* Add a check in monitor_jobs.py to ensure the yaml file does not contain duplicate experiment directories
* Make sure the key in the experiment dictionary used by is unique by appending the current date/time to the exptdir name; additionally, set this key as the WORKFLOW_ID variable (so that it could be used in the workflow if necessary).
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
1 participant