[develop] Update WM and UPP hashes #1083

Merged

MichaelLueken
Collaborator

DESCRIPTION OF CHANGES:

This PR brings the weather model hash to 26cb9e6 (May 2) and UPP to 5faac75 (April 9).

Type of change

  • New feature (non-breaking change which adds functionality)

TESTS CONDUCTED:

  • hera.intel
  • orion.intel
  • hercules.intel
  • derecho.intel
  • gaea.intel
  • jet.intel
  • wcoss2.intel
  • NOAA Cloud (indicate which platform)
  • Jenkins
  • fundamental test suite
  • comprehensive tests (specify which if a subset was used)

Fundamental tests were run on all machines. Comprehensive tests were run on Gaea, Hera, Hercules, and Orion.

DEPENDENCIES:

None

DOCUMENTATION:

No documentation updates required

ISSUE:

None

CHECKLIST

  • My code follows the style guidelines in the Contributor's Guide
  • I have performed a self-review of my own code using the Code Reviewer's Guide
  • My changes do not require updates to the documentation (explain).
  • My changes generate no new warnings
  • New and existing tests pass with my changes

@RatkoVasic-NOAA
Collaborator

With the new tags, the code compiled and a single test on Hera passed:

----------------------------------------------------------------------------------------------------
Experiment name                                                  | Status    | Core hours used
----------------------------------------------------------------------------------------------------
grid_RRFS_CONUS_25km_ics_NAM_lbcs_NAM_suite_GFS_v16_2024051020363  COMPLETE              20.63
----------------------------------------------------------------------------------------------------
Total                                                              COMPLETE              20.63

Approved.

@EdwardSnyder-NOAA
Collaborator

Ran fundamental tests on AWS and they all passed. Approving.

Calculating core-hour usage and printing final summary
----------------------------------------------------------------------------------------------------
Experiment name                                                  | Status    | Core hours used
----------------------------------------------------------------------------------------------------
grid_RRFS_CONUScompact_25km_ics_HRRR_lbcs_RAP_suite_RRFS_v1beta_2  COMPLETE             150.83
grid_RRFS_CONUS_25km_ics_FV3GFS_lbcs_FV3GFS_suite_GFS_v15p2_20240  COMPLETE              17.46
grid_RRFS_CONUS_25km_ics_FV3GFS_lbcs_FV3GFS_suite_GFS_v17_p8_plot  COMPLETE              68.85
grid_RRFS_CONUScompact_25km_ics_HRRR_lbcs_HRRR_suite_HRRR_2024051  COMPLETE             286.17
grid_SUBCONUS_Ind_3km_ics_HRRR_lbcs_RAP_suite_WoFS_v0_20240513134  COMPLETE             110.64
grid_RRFS_CONUS_25km_ics_NAM_lbcs_NAM_suite_GFS_v16_2024051313465  COMPLETE             109.92
----------------------------------------------------------------------------------------------------
Total                                                              COMPLETE             743.87
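As a quick cross-check of the summary above, the per-experiment core hours can be summed and compared against the reported total. The snippet below is only a sketch: the helper name `parse_core_hours` and the whitespace-splitting approach are assumptions made for illustration, not part of the SRW App's own reporting tooling. The rows are copied from the table as printed, including the truncated experiment names.

```python
# Rows copied verbatim from the AWS summary table above.
SUMMARY = """\
grid_RRFS_CONUScompact_25km_ics_HRRR_lbcs_RAP_suite_RRFS_v1beta_2  COMPLETE             150.83
grid_RRFS_CONUS_25km_ics_FV3GFS_lbcs_FV3GFS_suite_GFS_v15p2_20240  COMPLETE              17.46
grid_RRFS_CONUS_25km_ics_FV3GFS_lbcs_FV3GFS_suite_GFS_v17_p8_plot  COMPLETE              68.85
grid_RRFS_CONUScompact_25km_ics_HRRR_lbcs_HRRR_suite_HRRR_2024051  COMPLETE             286.17
grid_SUBCONUS_Ind_3km_ics_HRRR_lbcs_RAP_suite_WoFS_v0_20240513134  COMPLETE             110.64
grid_RRFS_CONUS_25km_ics_NAM_lbcs_NAM_suite_GFS_v16_2024051313465  COMPLETE             109.92
----------------------------------------------------------------------------------------------------
Total                                                              COMPLETE             743.87
"""


def parse_core_hours(summary):
    """Hypothetical helper: return ({experiment: core_hours}, reported_total)."""
    rows, total = {}, None
    for line in summary.splitlines():
        parts = line.split()
        # Data rows have exactly three fields: name, status, core hours.
        if len(parts) != 3:
            continue
        name, _status, hours = parts
        try:
            value = float(hours)
        except ValueError:
            continue  # skip anything that is not a numeric row
        if name == "Total":
            total = value
        else:
            rows[name] = value
    return rows, total


rows, reported_total = parse_core_hours(SUMMARY)
# Round to two decimals to match the precision printed in the table.
computed_total = round(sum(rows.values()), 2)
print(f"computed={computed_total} reported={reported_total}")
```

The six experiment rows add up to the reported total, so the table is internally consistent.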

…e grid_SUBCONUS_Ind_3km_ics_FV3GFS_lbcs_FV3GFS_suite_WoFS_v0 WE2E test configuration
@MichaelLueken
Collaborator Author

Hi @BruceKropp-Raytheon -

I just pushed a modification to the grid_SUBCONUS_Ind_3km_ics_FV3GFS_lbcs_FV3GFS_suite_WoFS_v0 WE2E test configuration that should correct the failure that is happening on Hercules. Specifically, depending on the node it lands on, the run_fcst job completes each time step in either 1 second or 5 seconds. If the node allows for 1 second time steps, the run_fcst job finishes within 1 hour. However, if the node is running 5 second time steps, the run_fcst job won't complete in 1 hour and fails for exceeding walltime. Increasing the walltime from 1 hour to 2 hours allows the run_fcst job to pass regardless of which node is used on Hercules.
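The arithmetic behind the walltime change can be sketched as follows. This is an idealized illustration only: the step count `N_STEPS` is a hypothetical value (the actual number of time steps in this test is not stated in the thread), the function names are made up for the example, and startup and I/O overhead are ignored.

```python
def fcst_wallclock_hours(n_steps, seconds_per_step):
    """Idealized wall-clock time in hours: steps times seconds per step."""
    return n_steps * seconds_per_step / 3600.0


def fits_walltime(n_steps, seconds_per_step, walltime_hours):
    """True if the idealized run finishes within the requested walltime."""
    return fcst_wallclock_hours(n_steps, seconds_per_step) <= walltime_hours


# Hypothetical step count, chosen only for illustration (not from the PR).
N_STEPS = 1000

for seconds_per_step in (1, 5):      # fast node vs. slow node, per the comment
    for walltime_hours in (1, 2):    # old vs. new walltime
        ok = fits_walltime(N_STEPS, seconds_per_step, walltime_hours)
        print(f"{seconds_per_step} s/step, {walltime_hours} h walltime: "
              f"{'fits' if ok else 'exceeds walltime'}")
```

Under these simplifying assumptions, a run that fails at 5 seconds per step with a 1 hour limit but passes with a 2 hour limit must have between 720 and 1440 time steps, which is why a modest walltime increase is enough to cover the slower nodes.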

@BruceKropp-Raytheon
Contributor

Very nice @MichaelLueken !
I wonder if this is also the case for Orion and Jet, as these occasionally time out before completing the single test.

@MichaelLueken
Collaborator Author

@BruceKropp-Raytheon -

I believe that it is very similar to the occasional issues on Orion.

Looking at the failure on Jet from last night's test (before maintenance began), the run_fcst job successfully ran to completion. There was a failure in verification. The node that the failed job landed on was bad and the job hung until the walltime was hit, at which time it failed.

@MichaelLueken added the run_we2e_coverage_tests label (Run the coverage set of SRW end-to-end tests) on May 15, 2024
@MichaelLueken
Collaborator Author

All Jenkins tests have successfully passed. Moving forward with merging this work now.

@MichaelLueken merged commit 6ddf61b into ufs-community:develop on May 15, 2024
4 of 5 checks passed
@MichaelLueken deleted the feature/hash_update branch on May 15, 2024 18:34