USGS_GFmz grid fails to run #273

Closed
ekluzek opened this issue Mar 31, 2022 · 3 comments

@ekluzek
Collaborator

ekluzek commented Mar 31, 2022

The USGS_GFmz grid fails to run because of a known issue: the number of river segments differs from the number of HRUs. @nmizukami tried a simple fix that allowed this grid to work, but it affected other grids.

Three tests that fail are:

ERS.nldas2_rUSGS_mnldas2.I2000Clm50SpMizGs.cheyenne_gnu.mizuroute-default RUN
ERS.nldas2_rUSGS_mnldas2.I2000Clm50SpMizGs.cheyenne_intel.mizuroute-default RUN
SMS_D.nldas2_rUSGS_mnldas2.I2000Clm50SpMizGs.cheyenne_intel.mizuroute-default RUN

The intel and nag ERS tests are also expected to fail on izumi.

ERS.nldas2_rUSGS_mnldas2.I2000Clm50SpMizGs.izumi_intel.mizuroute-default RUN
ERS.nldas2_rUSGS_mnldas2.I2000Clm50SpMizGs.izumi_nag.mizuroute-default RUN

On cheyenne, the SMS_D test seems to fail in datm, with the following backtrace in the cesm.log file:

5:MPT: #1  0x00002b67a1552306 in mpi_sgi_system (
5:MPT: #2  MPI_SGI_stacktraceback (
5:MPT:     header=header@entry=0x7ffe0f65f3d0 "MPT ERROR: Rank 5(g:5) received signal SIGFPE(8).\n\tProcess ID: 63742, Host: r13i0n2, Program: /glade/scratch/erik/tests_ctsm51d86mizum/SMS_D.nldas2_rUSGS_mnldas2.I2000Clm50SpMizGs.cheyenne_intel.mizur"...) at sig.c:340
5:MPT: #3  0x00002b67a15524ff in first_arriver_handler (signo=signo@entry=8, 
5:MPT:     stack_trace_sem=stack_trace_sem@entry=0x2b67b0e60080) at sig.c:489
5:MPT: #4  0x00002b67a1552793 in slave_sig_handler (signo=8, siginfo=<optimized out>, 
5:MPT:     extra=<optimized out>) at sig.c:565
5:MPT: #5  <signal handler called>
5:MPT: #6  0x00000000009b00ec in datm_datamode_clmncep_mod::datm_datamode_clmncep_advance (mainproc=.FALSE., logunit=6, mpicom=20, rc=0)
5:MPT:     at /glade/work/erik/ctsm_mizuRoute/components/cdeps/datm/datm_datamode_clmncep_mod.F90:422
5:MPT: #7  0x00000000009a2bd6 in atm_comp_nuopc::datm_comp_run (importstate=..., 
5:MPT:     exportstate=..., target_ymd=20000101, target_tod=0, target_mon=1, 
5:MPT:     orbeccen=0.016703660392765603, orbmvelpp=4.9374577904881578, 
5:MPT:     orblambm0=-0.032472495661529328, orbobliqr=0.40910112257977893, 
5:MPT:     restart_write=.FALSE., rc=0)
5:MPT:     at /glade/work/erik/ctsm_mizuRoute/components/cdeps/datm/atm_comp_nuopc.F90:630
5:MPT: #8  0x00000000009a07ee in atm_comp_nuopc::initializerealize (gcomp=..., 
5:MPT:     importstate=..., exportstate=..., clock=..., rc=0)
5:MPT:     at /glade/work/erik/ctsm_mizuRoute/components/cdeps/datm/atm_comp_nuopc.F90:411
5:MPT: #9  0x00002b679aead489 in ESMCI::FTable::callVFuncPtr (this=0xc1f5e00, 
5:MPT:     name=0xc19a540 "InitializeIC07P", vm_pointer=0xc1f5c80, 
5:MPT:     userrc=0x7ffe0f663088)
5:MPT:     at /glade/p/cesmdata/cseg/PROGS/build/28560/esmf-8.2.0b23/src/Superstructure/Component/src/ESMCI_FTable.C:2167
5:MPT: #10 0x00002b679aea8c31 in ESMCI_FTableCallEntryPointVMHop (vm=0xc1f5c80, 
5:MPT:     cargoCast=0xc19a540)

The gnu test shows a more sensible error inside mizuRoute:

36:MPT: 	add-auto-load-safe-path /glade/u/apps/ch/opt/gnu/10.1.0/lib64/libstdc++.so.6.0.28-gdb.py
36:MPT: line to your configuration file "/glade/u/home/erik/.gdbinit".
36:MPT: To completely disable this security protection add
36:MPT: 	set auto-load safe-path /
36:MPT: line to your configuration file "/glade/u/home/erik/.gdbinit".
36:MPT: For more information about this security protection see the
36:MPT: "Auto-loading safe path" section in the GDB manual.  E.g., run from the shell:
36:MPT: 	info "(gdb)Auto-loading safe path"
36:MPT: Missing separate debuginfos, use: zypper install glibc-debuginfo-2.22-49.16.x86_64
36:MPT: (gdb) #0  0x00002ae3b05fb6da in waitpid ()
36:MPT:    from /glade/u/apps/ch/os/lib64/libpthread.so.0
36:MPT: #1  0x00002ae3b0941c66 in mpi_sgi_system (
36:MPT: #2  MPI_SGI_stacktraceback (
36:MPT:     header=header@entry=0x7ffced91f9d0 "MPT ERROR: Rank 36(g:36) received signal SIGABRT/SIGIOT(6).\n\tProcess ID: 7973, Host: r13i3n11, Program: /glade/scratch/erik/tests_ctsm51d86mizum/ERS.nldas2_rUSGS_mnldas2.I2000Clm50SpMizGs.cheyenne_gnu"...) at sig.c:340
36:MPT: #3  0x00002ae3b0941e66 in first_arriver_handler (signo=signo@entry=6, 
36:MPT:     stack_trace_sem=stack_trace_sem@entry=0x2ae3bdca0080) at sig.c:489
36:MPT: #4  0x00002ae3b09420f3 in slave_sig_handler (signo=6, siginfo=<optimized out>, 
36:MPT:     extra=<optimized out>) at sig.c:565
36:MPT: #5  <signal handler called>
36:MPT: #6  0x00002ae3b10d28d7 in raise () from /glade/u/apps/ch/os/lib64/libc.so.6
36:MPT: #7  0x00002ae3b10d3caa in abort () from /glade/u/apps/ch/os/lib64/libc.so.6
36:MPT: #8  0x00002ae3b11101b4 in __libc_message ()
36:MPT:    from /glade/u/apps/ch/os/lib64/libc.so.6
36:MPT: #9  0x00002ae3b11159d6 in malloc_printerr ()
36:MPT:    from /glade/u/apps/ch/os/lib64/libc.so.6
36:MPT: #10 0x0000000000aa8c15 in __mpi_process_MOD_comm_ntopo_data ()
36:MPT: #11 0x0000000000a7c95a in __init_model_data_MOD_init_ntopo_data ()
36:MPT: #12 0x0000000000a74a34 in __rtmmod_MOD_route_ini ()
36:MPT: #13 0x0000000000a65d6b in __rof_comp_nuopc_MOD_initializerealize ()

Similarly, the ERS intel test shows an error inside mizuRoute's MPI code:

36:MPT: #20 0x00000000014c3ac0 in for_alloc_allocatable ()
36:MPT: #21 0x0000000000eebdb1 in mpi_utils_mp_shr_mpi_allgatherlogicalv_ ()
36:MPT:     at /glade/work/erik/ctsm_mizuRoute/components/mizuRoute/route/build/src/mpi_utils.f90:521
36:MPT: #22 0x0000000000ec9f8e in mpi_process_mp_comm_ntopo_data_ ()
36:MPT:     at /glade/work/erik/ctsm_mizuRoute/components/mizuRoute/route/build/src/mpi_process.f90:410
36:MPT: #23 0x0000000000ec07ab in init_model_data_mp_init_ntopo_data_ ()
36:MPT:     at /glade/work/erik/ctsm_mizuRoute/components/mizuRoute/route/build/src/init_model_data.f90:228
36:MPT: #24 0x0000000000eb1e04 in rtmmod_mp_route_ini_ ()
36:MPT:     at /glade/work/erik/ctsm_mizuRoute/components/mizuRoute/route/build/cpl/RtmMod.F90:177
36:MPT: #25 0x0000000000ea0975 in rof_comp_nuopc_mp_initializerealize_ ()
36:MPT:     at /glade/work/erik/ctsm_mizuRoute/components/mizuRoute/route/build/cpl/nuopc/rof_comp_nuopc.F90:521
36:MPT: #26 0x00002ac1d95b3dee in ESMCI::FTable::callVFuncPtr(char const*, ESMCI::VM*, int*) ()
ekluzek added the bug label Mar 31, 2022
ekluzek added a commit to ekluzek/mizuRoute that referenced this issue Mar 31, 2022
nmizukami self-assigned this Apr 1, 2022
@nmizukami
Collaborator

The current data structure, rtmCTL, assumes that an HRU (a catchment, where ROF receives areal-mean runoff from LND) and a river reach have a one-to-one relationship, which allows both elements to share the same domain decomposition information. For example, in a gridded network one grid box is a catchment but is also treated as a river reach (strictly speaking, this concept does not exist in a gridded network).

mizuRoute can handle a many-HRUs-to-one-reach relationship. The USGS network has a two-HRUs-to-one-reach relationship, and HRUs and reaches have separate domain decompositions.

To accommodate this, rtmCTL%discharge and rtmCTL%volr need to be populated using the domain decomposition based on the reach, not the HRU.
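
As a rough illustration only (this is not mizuRoute code), the many-HRUs-to-one-reach relationship can be sketched in Python with a toy network. The HRU and reach IDs, areas, runoff values, and the function name `reach_inflow` are all hypothetical, and summing area times runoff is an assumed way of aggregating, chosen only to show that a reach-based field has one entry per reach rather than one per HRU.

```python
from collections import defaultdict

# Hypothetical toy network: every HRU drains into exactly one reach,
# and each reach receives two HRUs (the situation described for USGS).
hru_to_reach = {101: 1, 102: 1, 103: 2, 104: 2, 105: 3, 106: 3}

# Made-up HRU areas and areal-mean runoff values (arbitrary units).
hru_area = {101: 2.0, 102: 1.0, 103: 4.0, 104: 4.0, 105: 1.0, 106: 3.0}
hru_runoff = {101: 0.5, 102: 1.0, 103: 0.25, 104: 0.5, 105: 0.0, 106: 1.5}


def reach_inflow(hru_to_reach, hru_area, hru_runoff):
    """Sum area * runoff over the HRUs draining into each reach.

    The result is keyed by reach ID, so its size follows the number of
    reaches, not the number of HRUs.
    """
    inflow = defaultdict(float)
    for hru, reach in hru_to_reach.items():
        inflow[reach] += hru_area[hru] * hru_runoff[hru]
    return dict(inflow)


inflow = reach_inflow(hru_to_reach, hru_area, hru_runoff)
n_hru = len(hru_to_reach)
n_reach = len(inflow)
print(n_hru, n_reach)  # the two counts differ
print(inflow)
```

In this toy case there are six HRUs but only three reaches, so any field indexed like `inflow` cannot reuse an HRU-sized index space.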

For two-way coupling (a possible future feature), discharge, volume, and flood will need to be distributed somehow from one reach to its (possibly many) HRUs.
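
One conceivable way to do that distribution, shown purely as a sketch and not as a proposed design, is to split each reach value among its HRUs in proportion to HRU area. The area weighting, the IDs, the values, and the function name `distribute_to_hrus` are all assumptions made for illustration.

```python
# Hypothetical toy network: two reaches, each fed by two HRUs.
hru_to_reach = {101: 1, 102: 1, 103: 2, 104: 2}
hru_area = {101: 2.0, 102: 1.0, 103: 4.0, 104: 4.0}  # made-up areas

# Made-up per-reach quantity (for example a volume) to send back to HRUs.
reach_value = {1: 9.0, 2: 8.0}


def distribute_to_hrus(reach_value, hru_to_reach, hru_area):
    """Split each reach's value among its HRUs in proportion to HRU area."""
    # Total area of the HRUs that drain into each reach.
    reach_area = {}
    for hru, reach in hru_to_reach.items():
        reach_area[reach] = reach_area.get(reach, 0.0) + hru_area[hru]

    # Each HRU receives its area fraction of the reach value.
    return {
        hru: reach_value[reach] * hru_area[hru] / reach_area[reach]
        for hru, reach in hru_to_reach.items()
    }


hru_value = distribute_to_hrus(reach_value, hru_to_reach, hru_area)
print(hru_value)
```

With this weighting the HRU values belonging to a reach add back up to the reach value, so the distributed quantity is conserved. Other weightings would need a different rule.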

@ekluzek
Collaborator Author

ekluzek commented Apr 26, 2022

We think that @nmizukami fixed the underlying issues in mizuRoute with this. The overall problem remains, however, and we believe it is a CTSM or CDEPS/datm issue. See the CTSM issue about this:

ESCOMP/CTSM#1724

@ekluzek
Collaborator Author

ekluzek commented May 21, 2023

The underlying CTSM and CDEPS issues are resolved and this is now working.

ekluzek closed this as completed May 21, 2023