Description of the issue:
A user was trying to run with ladjust_bury_coeff enabled in user_nl_marbl (which is not a very common configuration); he was also trying to get 100+ SYPD out of the gx3v7 grid (which is not a very common requirement), so he was running with 288 ocean tasks. gen_pop_decomp was giving a layout that created 290 blocks, and he reported the model crashing at ecosys_driver.F90:513.
It turns out the issue is that marbl_instances is size max_blocks_clinic (2, in his configuration) and we only want these loops running through nblocks_clinic (1 on most tasks), so ladjust_bury_coeff currently can't be true if any task has nblocks_clinic < max_blocks_clinic. Fixing that moved the error to ecosys_driver:640.
(I think the third dimension of land_mask and TAREA are both max_blocks_clinic while the allocate() statement for glo_avg_area_masked in line 638 shows it uses nblocks_clinic instead.)
As you can tell, I've started working on a fix for this... I think I changed that code to explicitly use 1:nblocks_clinic for the third dimension of land_mask on line 639 and TAREA on line 640, but got yet another error elsewhere.
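To make the size mismatch concrete, here is a minimal Python sketch, not POP or MARBL code. It assumes a task that owns one block while its arrays are dimensioned for two, and it uses a made-up masked-area sum (the hypothetical helper masked_area_per_block) as a stand-in for whatever ecosys_driver actually computes. The point is only that the loop and slice bounds should follow nblocks_clinic rather than max_blocks_clinic.

```python
# Illustrative model only: names mirror the identifiers in this issue,
# but the data and the masked-area computation are invented for the example.

def masked_area_per_block(land_mask, tarea, nblocks):
    """Sum the area of unmasked points for the first `nblocks` blocks.

    land_mask and tarea are lists with max_blocks_clinic entries; only the
    first nblocks_clinic entries hold valid data on a given task.
    """
    totals = []
    for iblock in range(nblocks):  # analogous to 1:nblocks_clinic in Fortran
        mask = land_mask[iblock]
        area = tarea[iblock]
        total = 0.0
        for mask_row, area_row in zip(mask, area):
            for is_ocean, cell_area in zip(mask_row, area_row):
                if is_ocean:
                    total += cell_area
        totals.append(total)
    return totals


max_blocks_clinic = 2  # array extent, the same on every task
nblocks_clinic = 1     # blocks this particular task actually owns

# The second slot exists but was never filled in on this task.
land_mask = [[[True, False], [True, True]], None]
tarea = [[[1.0, 2.0], [3.0, 4.0]], None]

# Correct bound: only touch the blocks the task owns.
glo_avg_area_masked = masked_area_per_block(land_mask, tarea, nblocks_clinic)
print(glo_avg_area_masked)

# Wrong bound: running through max_blocks_clinic reaches the unused slot.
try:
    masked_area_per_block(land_mask, tarea, max_blocks_clinic)
    wrong_bound_failed = False
except TypeError:
    wrong_bound_failed = True
print(wrong_bound_failed)
```

In Fortran the unused slot would hold uninitialized or mismatched-shape data rather than None, so the failure mode differs, but the bound to use is the same.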
The original user who reported the problem was happy to be given a 252 task layout that keeps max_blocks_clinic=1, so fixing this is not urgent. I'm putting all this detail in the issue ticket because I'm going to set it aside for a few weeks while I focus on more pressing issues, but it would probably be good to eventually come back and fix the bug.
I also think it would be useful to update the test suite to try to explicitly test cases where ladjust_bury_coeff = .true. and either some tasks have more blocks than others, or some tasks have no blocks. I expect both of those tests would fail currently.
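As a rough way to pick layouts for such tests, the sketch below classifies a decomposition as uneven or as having empty tasks. It assumes blocks are handed out round-robin, which may not match what gen_pop_decomp or POP's distribution code actually does; blocks_per_task and classify_layout are hypothetical helpers, and the 3-block, 4-task case is invented for illustration.

```python
def blocks_per_task(nblocks_total, ntasks):
    """Hypothetical round-robin assignment of blocks to tasks."""
    base, extra = divmod(nblocks_total, ntasks)
    return [base + 1 if task < extra else base for task in range(ntasks)]


def classify_layout(counts):
    """Report which of the two suggested test conditions a layout exercises."""
    max_blocks = max(counts)
    return {
        "max_blocks_clinic": max_blocks,
        # some task owns at least one block but fewer than the maximum
        "uneven": any(0 < n < max_blocks for n in counts),
        # number of tasks that own no blocks at all
        "empty_tasks": sum(1 for n in counts if n == 0),
    }


# The reported configuration: 290 blocks on 288 ocean tasks.
reported = classify_layout(blocks_per_task(290, 288))
print(reported)

# An invented layout with more tasks than blocks.
sparse = classify_layout(blocks_per_task(3, 4))
print(sparse)
```

Under the round-robin assumption, the reported configuration lands in the "some tasks have more blocks than others" case, which is consistent with max_blocks_clinic being 2 while most tasks have nblocks_clinic of 1.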
Version:
CESM: 2_3_beta09; I believe the first user was running CESM 2.1.x
POP2: cesm_pop_2_1_20220322
Machine/Environment Description:
The error was reported on cheyenne, and that's also where I reproduced the issue in the latest codebase.
Any xml/namelist changes or SourceMods: