-
Notifications
You must be signed in to change notification settings - Fork 131
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add new unit tests sumchk and bcstchk and update tests #606
Conversation
- Set default debug_model_step=0 (was 99999999) - Add debug_model_[i,j,iblk,task] to define the debug_model diagnostic point in local grid index space. If this point is not set and debug_model is turned on, it will use lonpnt(1),latpnt(1). - Rename forcing_diag namelist/variable to debug_forcing to be more consistent with other "debug_" namelist variables - Rename the local variable forcing_debug in ice_forcing.F90 to local_debug to avoid confusion with global varaible debug_forcing. - Add namelist variable optics_file. Was hardwired in ice_forcing_bgc.F90 - Update optics file variable name to read, still hardwired in model. - Update setting of nbtrcr_sw and allocation of trcrn_sw. nbtrcr_sw was not set in icepack after it was computed and trcrn_sw was allocated before nbtrcr_sw was computed. This impacts the dedd_algae implementation which still isn't working. - move default distribution_wgt_file for gx1 to set_nml.gx1 - update test suite, add testing of debug_model_[i,j,iblk,task], add addtional testing of maskhalo - update documentation
- update ice_broadcast to sync up serial and mpi versions - add get_rank to ice_communicate.F90 - add global_[min/max]val_scalar_int_nodist method to ice_global_reductions.F90 - add tripole output in ice_blocks.F90 with debug_blocks - update set_nml.tx1 to set ns_boundary_type to 'tripole', was 'open'
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks like a big chunk of work, making the code more consistent and (hopefully) robust. Please add some documentation that explain what the unit tests are doing.
@@ -1683,6 +1685,56 @@ function global_maxval_scalar_int (scalar, dist) & | |||
|
|||
end function global_maxval_scalar_int | |||
|
|||
!*********************************************************************** | |||
|
|||
function global_maxval_scalar_int_nodist (scalar, communicator) & |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why are the new global_max/minval_scalar_int_nodist functions needed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good question. The answer is a little complicated. I added this because I needed it for testing. In the bcstchk (ice_broadcast check) unit test, I broadcast data from one pe to others using the ice_broadcast (mpi or serial) infrastructure. Then I have to validate that. To validate it, I have each pe check it's results and then I broadcast that result to all pes, so now there is a global understanding of whether the test passed. To do that, I need MPI when running on multiple pes but I can't have any MPI when running on a single pe testing serial. So, that broadcast (which is really implemented as an MPI_MAX global reduction) needed to live in the serial/mpi directories rather than in the unit test because we don't have a "NO_MPI" cpp yet. Because all of the global_reduction infrastructure requires a "distribution" (but the ice_broadcast does not and the bcstchk has no grid/decomposition defined), I decided to implement a global reduction with just a communicator, basically as a simple layer on MPI, but it had to be in the serial/mpi directories.
If this is undesirable, there are a few other things I could do
- I could leverage the existing global reduction infrastructure, but this would mean the bcstchk unit tester would have to initialize a grid/decomposition even though it's not needed for the unit test.
- We could add a cpp something like "NO_MPI" and I could then leverage that in the bcstchk itself and implement the MPI global reduction there.
I prefer not to do the first option. The second option is something that maybe we want to do eventually anyway, so I could start to implement it now. All the mpi and "no_mpi" code is segregated into the serial and mpi directories right now. Ultimately, we may want to merge those directories in which case we'd need a cpp.
! | ||
! author: Phil Jones, LANL | ||
! Oct. 2004: Adapted from POP version by William H. Lipscomb, LANL | ||
|
||
#ifndef SERIAL_REMOVE_MPI | ||
use mpi ! MPI Fortran module |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In the serial comm directory, why have anything "mpi"? Is this the first step in merging the comm directories?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Experience tells us that maintaining multiple "versions" of the same code (i.e. comm/mpi and comm/serial or io/io_binary, io/io_netcdf, and io/io_pio2) is not straight-forward. What often happens is a change is made in one version but not another. And that inconsistency can take a while to discover. Another option is to merge the code together, but then CPPs are needed. Ultimately, it's a tradeoff between having multiple sets of source code and using CPPs.
In this case, where possible, I'm trying to use the same code on the mpi and serial directories minus a CPP that is hardcoded into the serial side. ice_global_reductions.F90 and ice_broadcast.F90 are implemented this way. That means a simple "diff" can tell whether the codes are synced up. In practice, what I'm doing is introducing the CPPs while keeping two separate directories. So a bit of the worst of both scenarios. But until we decide to merge the directories and use CPPs, I feel it's a viable way to maintain the code in the two directories.
@@ -3,6 +3,7 @@ runtype = 'initial' | |||
ice_ic = 'default' | |||
grid_format = 'bin' | |||
grid_type = 'tripole' | |||
ns_boundary_type = 'tripole' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There probably should be a check during the initialization that if grid_type = 'tripole', then ns_boundary_type = 'tripole'. I haven't heard of any east-west tripole grids, and I don't think CICE's boundary conditions could handle that anyhow.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's a good idea, I'll add it.
@@ -2948,6 +2951,10 @@ subroutine init_zbgc | |||
endif | |||
if (.NOT. dEdd_algae) nbtrcr_sw = 1 | |||
|
|||
! tcraig, added 6/1/21, why is nbtrcr_sw set here? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I took a quick look at the code and it wasn't obvious. I suspect that there's 2 considerations which conflict with each other, making it confusing:
- The need the count up all of the tracer indices, to make sure we have the right number
- An attempt to keep the BGC code as modular as possible
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This fixes an issue in the current bgc sw implementation, but there are still others are pointed out in #437.
! fieldname='bcint_enh_mam_cice' | ||
fieldname='modalBCabsorptionParameter5band' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
do we want to keep this comment ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Probably not. I think ultimately we should have a namelist specify the fieldname. Let me implement that now and I'll test and push an update.
@@ -654,6 +658,7 @@ zbgc_nml | |||
"``mu_max_phaeo``", "real", "maximum growth rate phaeocystis per day", "0.851" | |||
"``mu_max_sp``", "real", "maximum growth rate small plankton per day", "0.851" | |||
"``nitratetype``", "real", "mobility type between stationary and mobile nitrate", "-1.0" | |||
"``optics_file``", "string", "optics file associated with modal aersols", "unknown_optics_file" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
s/aersols/aerosols/
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thank you.
use ice_constants, only: field_loc_center | ||
use ice_constants, only: field_loc_Nface |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure I understand why this is changed here...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sumchk is testing the ice_global_reduction infrastructure. If you look at that infrastructure, it relies heavily on the information contained in the grid and decomposition including the location and size of the halo, whether the grid is a tripole grid, and so forth. For non-tripole grids, the global reductions operate on the grids (sans halo areas) and the field_loc value makes no difference. However, for the tripole grids, there is special logic for field_loc_Nface and field_loc_NEface in order to remove redundant points on the tripole zipper.
I have added a test of sumchk with the tx1 grid and in order to trigger the tripole logic in the global reduction, I have to set the field location as Nface or NEface. I picked Nface arbitrarily. This has no impact on non-tripole grids. If I stick with "center" for tripole grids, I don't trigger the tripole logic. One could argue I should test center and Nface for tripole and non-tripole grids, but I decided to just do Nface for both. Maybe I'll revise that.
- add grid_type and ns_boundary_type tripole check - update sumchk unit test to check both Nface and center points. these are treated differently for tripole grids. - update documentation of unit tests
I updated the documentation, added optics_file_fieldname namelist, and extended the sumchk test to include center and Nface field locations. A brief description of each unit test is now at the top of each unit test .F90 file and in the user guide, https://cice-consortium-cice--606.org.readthedocs.build/en/606/user_guide/ug_testing.html#unit-testing. |
'forcing_diag' was renamed to 'debug_forcing' in d6eb125 (Add new unit tests sumchk and bcstchk and update tests (CICE-Consortium#606), 2021-06-09), but this instance was not renamed in the doc.
In d6eb125 (Add new unit tests sumchk and bcstchk and update tests (CICE-Consortium#606), 2021-06-09), the namelist variables 'debug_model_{i,j,iblk,task}' were added and 'debug_model' was changed to print diagnostics for the grid point specified by those variables, if they are defined, and fall back to (lonpnt(1), latpnt(1)) instead. The "Diagnostic files" subsection of the "Implementation" section of the User guide was updated, but not the "Debugging hints" subsection of the "Troubleshooting" section, where the various namelist flags and subroutines useful for debugging are mentioned. Adjust the description of 'debug_model' there, and also mention that in contrast to 'print_points', this option outputs diagnostics several times per time step.
In d6eb125 (Add new unit tests sumchk and bcstchk and update tests (CICE-Consortium#606), 2021-06-09), the namelist variables 'debug_model_{i,j,iblk,task}' were added and 'debug_model' was changed to print diagnostics for the grid point specified by those variables, if they are defined, and fall back to (lonpnt(1), latpnt(1)) instead. The "Diagnostic files" subsection of the "Implementation" section of the User guide was updated, but not the "Debugging hints" subsection of the "Troubleshooting" section, where the various namelist flags and subroutines useful for debugging are mentioned. Adjust the description of 'debug_model' there, and also mention that in contrast to 'print_points', this option outputs diagnostics several times per time step. While at it, mention that 'print_points' output is every 'diagfreq' time steps.
In d6eb125 (Add new unit tests sumchk and bcstchk and update tests (CICE-Consortium#606), 2021-06-09), the namelist variables 'debug_model_{i,j,iblk,task}' were added and 'debug_model' was changed to print diagnostics for the grid point specified by those variables, if they are defined, and fall back to (lonpnt(1), latpnt(1)) instead. The "Diagnostic files" subsection of the "Implementation" section of the User guide was updated, but not the "Debugging hints" subsection of the "Troubleshooting" section, where the various namelist flags and subroutines useful for debugging are mentioned. Adjust the description of 'debug_model' there, and also mention that in contrast to 'print_points', this option outputs diagnostics several times per time step. While at it, mention that 'print_points' output is every 'diagfreq' time steps.
In d6eb125 (Add new unit tests sumchk and bcstchk and update tests (CICE-Consortium#606), 2021-06-09), the namelist variables 'debug_model_{i,j,iblk,task}' were added and 'debug_model' was changed to print diagnostics for the grid point specified by those variables, if they are defined, and fall back to (lonpnt(1), latpnt(1)) instead. The "Diagnostic files" subsection of the "Implementation" section of the User guide was updated, but not the "Debugging hints" subsection of the "Troubleshooting" section, where the various namelist flags and subroutines useful for debugging are mentioned. Adjust the description of 'debug_model' there, and also mention that in contrast to 'print_points', this option outputs diagnostics several times per time step. While at it, mention that 'print_points' output is every 'diagfreq' time steps.
In d6eb125 (Add new unit tests sumchk and bcstchk and update tests (CICE-Consortium#606), 2021-06-09), the namelist variables 'debug_model_{i,j,iblk,task}' were added and 'debug_model' was changed to print diagnostics for the grid point specified by those variables, if they are defined, and fall back to (lonpnt(1), latpnt(1)) instead. The "Diagnostic files" subsection of the "Implementation" section of the User guide was updated, but not the "Debugging hints" subsection of the "Troubleshooting" section, where the various namelist flags and subroutines useful for debugging are mentioned. Adjust the description of 'debug_model' there, and also mention that in contrast to 'print_points', this option outputs diagnostics several times per time step. While at it, mention that 'print_points' output is every 'diagfreq' time steps.
In d6eb125 (Add new unit tests sumchk and bcstchk and update tests (CICE-Consortium#606), 2021-06-09), the namelist variables 'debug_model_{i,j,iblk,task}' were added and 'debug_model' was changed to print diagnostics for the grid point specified by those variables, if they are defined, and fall back to (lonpnt(1), latpnt(1)) instead. The "Diagnostic files" subsection of the "Implementation" section of the User guide was updated, but not the "Debugging hints" subsection of the "Troubleshooting" section, where the various namelist flags and subroutines useful for debugging are mentioned. Adjust the description of 'debug_model' there, and also mention that in contrast to 'print_points', this option outputs diagnostics several times per time step. While at it, mention that 'print_points' output is every 'diagfreq' time steps.
PR checklist
Update testing and add sumchk and bcstchk unit tests
apcraig
Full tests suites run on cheyenne on 3 compilers, https://github.com/CICE-Consortium/Test-Results/wiki/cice_by_hash_forks#d7d59a49665d432dfd06216d16915c3b52d4b98f. Note that test results for tx1 change because the ns_boundary_type was changed from 'open' to 'tripole'.
Update testing