github actions utest #169

Merged

MinsukJi-NOAA
Contributor

This PR addresses issue #168

  • Include continuous integration related files (scripts, GitHub Actions YAML file, Dockerfile, etc.)
  • Include updates to the utest script

MinsukJi-NOAA marked this pull request as ready for review on August 31, 2020, 21:30
@MinsukJi-NOAA
Contributor Author

The regression test (RT) on Orion keeps failing because compile_13 fails with this error message:

/work/noaa/stmp/jminsuk/stmp/jminsuk/FV3_RT/rt_420215/compile_13/build_fv3_13/FV3/ccpp/physics/ccpp_static_api.F90(966): catastrophic error: **Internal compiler error: internal abort** Please report this error along with the circumstances in which it occurred in a Software Problem Report. Note: File and line given may not be explicit cause of this error.
compilation aborted for /work/noaa/stmp/jminsuk/stmp/jminsuk/FV3_RT/rt_420215/compile_13/build_fv3_13/FV3/ccpp/physics/ccpp_static_api.F90 (code 1)

@climbfuji
Collaborator

Hmm. I have seen these internal compiler errors in other projects in the past; they often came from an optimization task that was too complicated for the compiler. We need to figure out which COMPILE command build_13 corresponds to, and which line the compiler is complaining about. Which version of the Intel compiler are you using?
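
One way to look up which COMPILE command a given compile number corresponds to is sketched below. It assumes, and this is only an assumption, that the compile number is the 1-based position of the COMPILE line among all COMPILE lines in the regression test configuration file. The helper name `nth_compile_line` and the sample configuration text are hypothetical and only illustrate the idea.

```python
def nth_compile_line(conf_text, n):
    """Return the n-th (1-based) COMPILE line in conf_text, or None if absent.

    Assumption: lines are pipe-separated and the first field is the command
    name; blank lines and lines starting with '#' are ignored.
    """
    count = 0
    for raw in conf_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        if line.split("|")[0].strip() == "COMPILE":
            count += 1
            if count == n:
                return line
    return None


# Hypothetical sample, not the real configuration file.
sample = """\
# illustrative configuration
COMPILE | CCPP=Y |
RUN | test_a |
COMPILE | CCPP=Y DEBUG=Y |
RUN | test_b |
"""

print(nth_compile_line(sample, 2))
```

With the real configuration file, the same lookup would be called with n set to 13.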

@MinsukJi-NOAA
Contributor Author

The modulefiles/orion.intel/fv3 module loads intel/2018.

compile_13 appears to be the build with CCPP=Y and DEBUG=Y.

@climbfuji
Collaborator

Interesting. I recently made this change (and it worked on all machines, including orion) so that we have at least one compile command that tests compiling the code without providing any suites (i.e. compiling all available suites). What you can do is duplicate this line N times, where N is the number of machines (one line for hera.intel, one for cheyenne.intel, ...), and modify the line for orion to include just the suites that are required to run the tests that follow it.
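
The duplication described above can be sketched roughly as follows. This is a minimal illustration, not the actual configuration syntax: the pipe-separated layout with a machine column, the `SUITES=` option spelling, the helper name `per_machine_compile_lines`, and the suite names `SUITE_A` and `SUITE_B` are all assumptions made for the example. Only CCPP=Y, DEBUG=Y and the machine names come from this thread.

```python
def per_machine_compile_lines(options, machines, suites_by_machine):
    """Build one COMPILE line per machine.

    Machines listed in suites_by_machine get an extra SUITES=... option that
    restricts the build to those suites; all other machines keep the original
    options, i.e. they compile every available suite.
    """
    lines = []
    for machine in machines:
        opts = list(options)
        suites = suites_by_machine.get(machine)
        if suites:
            opts.append("SUITES=" + ",".join(suites))
        # Hypothetical layout: COMPILE | options | machine |
        lines.append("COMPILE | {} | {} |".format(" ".join(opts), machine))
    return lines


lines = per_machine_compile_lines(
    ["CCPP=Y", "DEBUG=Y"],
    ["hera.intel", "cheyenne.intel", "orion.intel"],
    {"orion.intel": ["SUITE_A", "SUITE_B"]},  # placeholder suite names
)
for line in lines:
    print(line)
```

Only the orion line carries the restricted suite list, so the other machines still exercise the build with all suites.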

@MinsukJi-NOAA
Contributor Author

That change allowed the RT to pass. Thanks.

@MinsukJi-NOAA
Contributor Author

Preliminary CI documentation can be found here: CI tests for UFS-weather-model

@MinsukJi-NOAA
Contributor Author

For an example of a CI test, see Add hera RT results

For an example of a skipped CI test, see Add orion RT results

modulefiles/linux.gnu/fv3 (resolved)
.github/workflows/main.yml (resolved)
tests/ci/json_helper.py (resolved)
tests/run_test.sh (resolved)
@junwang-noaa
Collaborator

junwang-noaa commented Sep 3, 2020 via email

@climbfuji
Collaborator

@DusanJovic-NOAA I remember Dom said the GitHub Actions running time for public repos is unlimited. Also, because of the nature of weather model forecast tests, we have to run jobs for longer than a few minutes. So what are the issues?

Please have a look at https://github.com/pricing and scroll down to "Compare features".

@MinsukJi-NOAA
Contributor Author

Updated utest documentation can be found here: Unit Test for UFS-weather-model

@MinsukJi-NOAA
Contributor Author

All regression tests passed on Hera, Orion, and Dell. All unit tests except the restart test passed on Hera, Orion, and Dell. This PR is ready for merge.

Collaborator

@climbfuji climbfuji left a comment

Looks OK to me, given my limited understanding of the CI design.

DusanJovic-NOAA merged commit 407df4e into ufs-community:develop on Sep 9, 2020
DavidHuber-NOAA added a commit to DavidHuber-NOAA/ufs-weather-model that referenced this pull request Sep 10, 2020
MinsukJi-NOAA deleted the feature/ContInteg branch on October 19, 2020, 21:22
5 participants