Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

32-bit physics with FV3_RAP #1215

Merged
merged 95 commits into from
Jul 19, 2022

Conversation

SamuelTrahanNOAA
Copy link
Collaborator

@SamuelTrahanNOAA SamuelTrahanNOAA commented May 12, 2022

  • This PR is up-to-date with the top of all sub-component repositories except for those sub-components which are the subject of this PR. Please consult the ufs-weather-model wiki if you are unsure how to do this.

  • This PR has been tested using a branch which is up-to-date with the top of all sub-component repositories except for those sub-components which are the subject of this PR

  • An Issue describing the work contained in this PR has been created either in the subcomponent(s) or in the ufs-weather-model. The Issue should be created in the repository that is most relevant to the changes in contained in the PR. The Issue and the dependent sub-component PR
    are specified below.

  • Results for one or more of the regression tests change and the reasons for the changes are understood and explained below. There are new regression tests, but no tests should change results. It is critical that all tests be run against their existing baselines, to ensure 64-bit physics does not change results on any platform.

  • New or updated input data is required by this PR. If checked, please work with the code managers to update input data sets on all platforms.

Description

This PR gets 32-bit physics working in FV3 within UFS. The ccpp portion of this is based on work done by NRL on the Neptune model, merged a few weeks ago. With much help from @DusanJovic-NOAA @grantfirl and @climbfuji, I've gotten it working in UFS as well. What works is:

  1. All suites compile with 32-bit physics and either 64-bit or 32-bit dynamics.
  2. The FV3_RAP and FV3_HRRR suites can run and produce output in either of those modes.
  3. There are regression tests that ensure those don't break.
  4. No existing tests change results.

For the purposes of testing, it is critical that all existing regression tests run against their current baselines, to ensure 64-bit physics do not change answers on any platforms. The only tests that need new baselines are the 32-bit physics tests, which I have copied to 32bit.conf and 32bit_gnu.conf for your convenience. (They are also in rt.conf and rt_gnu.conf.)

Testing

How were these changes tested? What compilers / HPCs was it tested with? Are the changes covered by regression tests? (If not, why? Do new tests need to be added?) Have regression tests and unit tests (utests) been run? On which platforms and with which compilers? (Note that unit tests can only be run on tier-1 platforms)

  • hera.intel
  • hera.gnu
  • orion.intel
  • cheyenne.intel
  • cheyenne.gnu
  • gaea.intel
  • jet.intel
  • wcoss2.intel
  • opnReqTest for newly added/changed feature
  • CI

Issues

Fixes #1288

Dependencies

NOAA-PSL/stochastic_physics#59
NOAA-EMC/fv3atm#534
NCAR/ccpp-physics#930
NOAA-GFDL/GFDL_atmos_cubed_sphere#193
earth-system-radiation/rte-rrtmgp#180 this was already merged

@DusanJovic-NOAA
Copy link
Collaborator

I can not clone this branch from Hera:

$ git clone --recursive --branch sing_prec_from_main https://github.com/SamuelTrahanNOAA/ufs-weather-model
Cloning into 'ufs-weather-model'...
remote: Enumerating objects: 19882, done.
remote: Counting objects: 100% (69/69), done.
remote: Compressing objects: 100% (60/60), done.
remote: Total 19882 (delta 17), reused 14 (delta 9), pack-reused 19813
Receiving objects: 100% (19882/19882), 61.24 MiB | 28.03 MiB/s, done.
Resolving deltas: 100% (16007/16007), done.

 . . .

Cloning into '/scratch2/NCEPDEV/fv3-cam/Dusan.Jovic/ufs/32bit/ufs-weather-model/FV3'...
Permission denied (publickey).
fatal: Could not read from remote repository.

Please make sure you have the correct access rights
and the repository exists.
fatal: clone of 'ssh://[email protected]/SamuelTrahanNOAA/fv3atm' into submodule path '/scratch2/NCEPDEV/fv3-cam/Dusan.Jovic/ufs/32bit/ufs-weather-model/FV3' failed
Failed to clone 'FV3' a second time, aborting

Please change submodule URL's to https.

@SamuelTrahanNOAA
Copy link
Collaborator Author

Please change submodule URL's to https

Done. The entire tree now uses https.

@DusanJovic-NOAA
Copy link
Collaborator

Please change submodule URL's to https

Done. The entire tree now uses https.

Thanks.

@climbfuji
Copy link
Collaborator

Please change submodule URL's to https

Done. The entire tree now uses https.

Thanks.

Thanks. I had the same problem and had to switch to https locally as well.

@DusanJovic-NOAA
Copy link
Collaborator

I made this changes in FV3:

diff --git a/ccpp/data/GFS_typedefs.F90 b/ccpp/data/GFS_typedefs.F90
index f125489..320f9cf 100644
--- a/ccpp/data/GFS_typedefs.F90
+++ b/ccpp/data/GFS_typedefs.F90
@@ -1,6 +1,6 @@
 module GFS_typedefs
 
-   use machine,                  only: kind_phys, kind_dbl_prec
+   use machine,                  only: kind_phys, kind_dbl_prec, kind_sngl_prec
    use physcons,                 only: con_cp, con_fvirt, con_g,                       &
                                        con_hvap, con_hfus, con_pi, con_rd, con_rv,     &
                                        con_t0c, con_cvap, con_cliq, con_eps, con_epsq, &
diff --git a/module_fcst_grid_comp.F90 b/module_fcst_grid_comp.F90
index 886f23a..a713fbe 100644
--- a/module_fcst_grid_comp.F90
+++ b/module_fcst_grid_comp.F90
@@ -38,6 +38,8 @@ if (rc /= ESMF_SUCCESS) write(0,*) 'rc=',rc,__FILE__,__LINE__; if(ESMF_LogFoundE
                                 atmos_model_exchange_phase_2,              &
                                 addLsmask2grid, atmos_model_get_nth_domain_info
 
+  use GFS_typedefs,       only: kind_phys, kind_sngl_prec
+
   use constants_mod,      only: constants_init
   use fms_mod,            only: error_mesg, fms_init, fms_end,             &
                                 write_version_number, uppercase
@@ -161,6 +163,7 @@ if (rc /= ESMF_SUCCESS) write(0,*) 'rc=',rc,__FILE__,__LINE__; if(ESMF_LogFoundE
     integer,dimension(2,6):: decomptile                  !define delayout for the 6 cubed-sphere tiles
     integer,dimension(2)  :: regdecomp                   !define delayout for the nest grid
     type(ESMF_Decomp_Flag):: decompflagPTile(2,6)
+    type(ESMF_TypeKind_Flag) :: grid_typekind
     character(3)          :: myGridStr
     type(ESMF_DistGrid)   :: distgrid
     type(ESMF_Array)      :: array
@@ -188,6 +191,12 @@ if (rc /= ESMF_SUCCESS) write(0,*) 'rc=',rc,__FILE__,__LINE__; if(ESMF_LogFoundE
     call ESMF_InfoGet(info, key="layout", values=layout, rc=rc); ESMF_ERR_ABORT(rc)
     if (ESMF_LogFoundError(rcToCheck=rc, msg=ESMF_LOGERR_PASSTHRU, line=__LINE__, file=__FILE__)) return
 
+    if (kind_phys == kind_sngl_prec) then
+      grid_typekind = ESMF_TYPEKIND_R4
+    else
+      grid_typekind = ESMF_TYPEKIND_R8
+    endif
+
     if (trim(name)=="global") then
       ! global domain
       call ESMF_InfoGet(info, key="tilesize", value=tilesize, rc=rc); ESMF_ERR_ABORT(rc)
@@ -200,6 +209,7 @@ if (rc /= ESMF_SUCCESS) write(0,*) 'rc=',rc,__FILE__,__LINE__; if(ESMF_LogFoundE
       enddo
       grid = ESMF_GridCreateCubedSphere(tileSize=tilesize, &
                                         coordSys=ESMF_COORDSYS_SPH_RAD, &
+                                        coordTypeKind=grid_typekind, &
                                         regDecompPTile=decomptile, &
                                         decompflagPTile=decompflagPTile, &
                                         name="fcst_grid", rc=rc)
@@ -215,6 +225,7 @@ if (rc /= ESMF_SUCCESS) write(0,*) 'rc=',rc,__FILE__,__LINE__; if(ESMF_LogFoundE
                                       maxIndex=(/nx,ny/), &
                                       gridAlign=(/-1,-1/), &
                                       coordSys=ESMF_COORDSYS_SPH_RAD, &
+                                      coordTypeKind=grid_typekind, &
                                       decompflag=(/ESMF_DECOMP_SYMMEDGEMAX,ESMF_DECOMP_SYMMEDGEMAX/), &
                                       name="fcst_grid", &
                                       indexflag=ESMF_INDEX_DELOCAL, &

When I run rap_control tests model passes the grid creation in fcst grid comp, but fails somewhere in physics, with this error:

  0:  -1431655766           0         255 -1431655766 -1431655766                                                      
  0:  lat/lon grid                                                                                                     
  0:  imax,jmax,ijmax,dlon,dlat,ijordr,wlon,rnlat=                                                                     
  0:          180          91       16380   2.00000000000000                                                           
  0:   -2.00000000000000      T  0.000000000000000E+000   90.0000076293945                                             
  0:   FATAL ERROR: la2ga called                                                                                       
  0:   with lmask=true. But bad rslmsk                                                                                 
  0:   or slmask given.                                                                                                

@DusanJovic-NOAA
Copy link
Collaborator

There are still type/kind mismatches in physics. For example, arrays in gcycle like xlon_d/xlat_d are declared as kind_phys (4 bytes now) while corresponding dummy arrays in sfccycle like rlo, rla are still declared as kind_io8 (8 bytes). There are many other mismatches.

My suggestions:

  • get rid of all kinds in ccpp physics except:
    kind_sngl_prec fixed to 4 bytes
    kind_dbl_prec fixed to 8 bytes
    kind_phys, working precision set to either kind_sngl_prec or kind_dbl_prec at build time depending on the build option
  • update (almost) all declarations to use kind_phys, except in very few cases where variable must be either 4 or 8 bytes all the time, like an I/O arrays in sfccycle or when interfacing with library routines that require specific kind (w3nco or sp).
  • convert all files to be modules, no standalone routines anymore, like sfccycle
  • get rid of compiler flags that ignore type mismatches

Time for some serious cleanup.

@DusanJovic-NOAA
Copy link
Collaborator

Quick and dirty changes just to make the rap_control test 'work'. By 'work' I mean it executes without crashing.

See run directory: /scratch1/NCEPDEV/stmp2/Dusan.Jovic/rt_rap/rap_control

diff --git a/physics/machine.F b/physics/machine.F
index 9b09d235..ac881bc6 100644
--- a/physics/machine.F
+++ b/physics/machine.F
@@ -23,7 +23,7 @@
      &,                     kind_INTEGER = 4                             ! -,,-
 
 #else
-      integer, parameter :: kind_io4  = 4, kind_io8  = 8 , kind_ior = 8 &
+      integer, parameter :: kind_io4  = 4, kind_io8  = 4 , kind_ior = 8 &
      &,                     kind_evod = 4, kind_dbl_prec = 8            &
      &,                     kind_sngl_prec = 4
 # ifdef __PGI
diff --git a/physics/sfcsub.F b/physics/sfcsub.F
index 78e5201b..fbd7f49a 100644
--- a/physics/sfcsub.F
+++ b/physics/sfcsub.F
@@ -2736,7 +2736,7 @@
 !>\ingroup mod_sfcsub
       subroutine fixrdg(lugb,idim,jdim,fngrib,                          &
      &                  kpds5,gdata,gaus,blno,blto,me)
-      use machine      , only : kind_io8,kind_io4
+      use machine      , only : kind_io8,kind_dbl_prec,kind_sngl_prec
       use sfccyc_module, only : mdata
       implicit none
       integer lgrib,n,lskip,jret,j,ndata,lugi,jdim,idim,lugb,
@@ -2747,8 +2747,8 @@
       real (kind=kind_io8) gdata(idim*jdim)
       logical gaus
       real (kind=kind_io8) blno,blto
-      real (kind=kind_io8), allocatable :: data8(:)
-      real (kind=kind_io4), allocatable :: data4(:)
+      real (kind=kind_dbl_prec), allocatable :: data8(:)
+      real (kind=kind_sngl_prec), allocatable :: data4(:)
 !
       logical*1, allocatable :: lbms(:)
 !
@@ -2815,7 +2815,7 @@
         allocate(data4(1:idim*jdim))
         call getgb(lugb,lugi,kdata,lskip,jpds,jgds,ndata,lskip,
      &             kpds,kgds,lbms,data4,jret)
-        data8 = real(data4, kind=kind_io8)
+        data8 = real(data4, kind=kind_dbl_prec)
         deallocate(data4)
       else
         write(0,*)' FATAL ERROR: Invalid w3kindreal'
@@ -6165,7 +6165,7 @@
       subroutine setrmsk(kpds5,slmask,igaul,jgaul,wlon,rnlat,           &
      &                   data,imax,jmax,rlnout,rltout,lmask,rslmsk      &
      &,                  gaus,blno, blto, kgds1, kpds4, lbms)
-      use machine , only : kind_io8,kind_io4
+      use machine , only : kind_io8,kind_io4,kind_dbl_prec
       use sfccyc_module
       implicit none
       real (kind=kind_io8) blno,blto,wlon,rnlat,crit,data_max
@@ -6177,7 +6177,8 @@
       real (kind=kind_io8)    slmask(igaul,jgaul)
       real (kind=kind_io8)    data(imax,jmax),rslmsk(imax,jmax)
      &,                       rlnout(imax), rltout(jmax)
-      real (kind=kind_io8)    a(jmax), w(jmax), radi, dlat, dlon
+      real (kind=kind_io8)    radi, dlat, dlon
+      real (kind=kind_dbl_prec) a(jmax), w(jmax)
       logical lmask, gaus
 !
 !     set the longitude and latitudes for the grib file
@@ -6650,7 +6651,7 @@
 !! This subroutine interpolates from lat/lon grid to other lat/lon grid.
       subroutine ga2la(gauin,imxin,jmxin,regout,imxout,jmxout,          &
      &                 wlon,rnlat,rlnout,rltout,gaus,blno, blto)
-      use machine , only : kind_io8,kind_io4
+      use machine , only : kind_io8,kind_io4,kind_dbl_prec
       use sfccyc_module , only : num_threads
       implicit none
       integer i1,i2,j2,ishft,i,jj,j1,jtem,jmxout,imxin,jmxin,imxout,    &
@@ -6673,7 +6674,7 @@
       data    jmxsav/0/
       save    jmxsav, gaul, dlati
       real (kind=kind_io8) radi
-      real (kind=kind_io8) a(jmxin), w(jmxin)
+      real (kind=kind_dbl_prec) a(jmxin), w(jmxin)
 !
 !
       logical first
@@ -6917,11 +6918,12 @@ cjfe
 !>\ingroup mod_sfcsub
       subroutine gaulat(gaul,k)
 !
-      use machine , only : kind_io8,kind_io4
+      use machine , only : kind_io8,kind_io4,kind_dbl_prec
       implicit none
       integer n,k
       real (kind=kind_io8) radi
-      real (kind=kind_io8) a(k), w(k), gaul(k)
+      real (kind=kind_io8) gaul(k)
+      real (kind=kind_dbl_prec) a(k), w(k)
 !
       call splat(4, k, a, w)
 !
@@ -8305,7 +8307,7 @@ cjfe
      &                 gdata,len,iret                                   &
      &,                imsk, jmsk, slmskh, gaus,blno, blto              &
      &,                outlat, outlon, me)
-      use machine ,      only : kind_io8,kind_io4
+      use machine ,      only : kind_io8,kind_dbl_prec,kind_sngl_prec
       use sfccyc_module, only : mdata
       implicit none
       integer imax,jmax,ijmax,i,j,n,jret,inttyp,iret,imsk,              &
@@ -8321,8 +8323,8 @@ cjfe
 !
       real (kind=kind_io8) gdata(len), slmask(len)
       real (kind=kind_io8), allocatable :: data(:,:), rslmsk(:,:)
-      real (kind=kind_io8), allocatable :: data8(:)
-      real (kind=kind_io4), allocatable :: data4(:)
+      real (kind=kind_dbl_prec), allocatable :: data8(:)
+      real (kind=kind_sngl_prec), allocatable :: data4(:)
       real (kind=kind_io8), allocatable :: rlngrb(:), rltgrb(:)
 !
       logical lmask, yr2kc, gaus, ijordr
@@ -8398,7 +8400,7 @@ cjfe
         allocate(data4(1:mdata))
         call getgb(lugb,lugi,mdata,lskip,jpds,jgds,ndata,lskip,
      &             kpds,kgds,lbms,data4,jret)
-        data8 = real(data4, kind=kind_io8)
+        data8 = real(data4, kind=kind_dbl_prec)
         deallocate(data4)
       endif
       if (me .eq. 0) write(6,*) ' input grib file dates=',
@@ -8479,7 +8481,7 @@ cjfe
      &                  iy,im,id,ih,fh,gdata,len,iret                   &
      &,                 imsk, jmsk, slmskh, gaus,blno, blto             &
      &,                 outlat, outlon, me)
-      use machine      , only : kind_io8,kind_io4
+      use machine      , only : kind_io8,kind_dbl_prec,kind_sngl_prec
       use sfccyc_module, only : mdata
       implicit none
       integer nrepmx,nvalid,imo,iyr,idy,jret,ihr,nrept,lskip,lugi,      &
@@ -8506,8 +8508,8 @@ cjfe
 !
       real (kind=kind_io8) gdata(len), slmask(len)
       real (kind=kind_io8), allocatable :: data(:,:),rslmsk(:,:)
-      real (kind=kind_io8), allocatable :: data8(:)
-      real (kind=kind_io4), allocatable :: data4(:)
+      real (kind=kind_dbl_prec), allocatable :: data8(:)
+      real (kind=kind_sngl_prec), allocatable :: data4(:)
       real (kind=kind_io8), allocatable :: rlngrb(:), rltgrb(:)
 !
       logical lmask, yr2kc, gaus, ijordr
@@ -8645,7 +8647,7 @@ cjfe
         allocate (data4(1:mdata))
         call getgb(lugb,lugi,mdata,lskip,jpds,jgds,ndata,lskip,
      &             kpds,kgds,lbms,data4,jret)
-        data8 = real(data4, kind=kind_io8)
+        data8 = real(data4, kind=kind_dbl_prec)
         deallocate(data4)
       endif
       if (me .eq. 0) write(6,*) ' input grib file dates=',

@SamuelTrahanNOAA
Copy link
Collaborator Author

I'm testing Dusan's changes against the regression tests. Wish me luck.

The next step is to get FV3_HRRR working so I can give this to the RRFS_dev1 people for testing.

@DusanJovic-NOAA
Copy link
Collaborator

There are still few inconsistencies in how real arrays are passed to w3nco subroutines. ccpp-physics depends on w3nco_d (double precision version). Now that most of the real variables are either 4 or 8 bytes wide, each w3 call that has real arguments must either pass 8-byte reals, or detect which version of the library it's been linked against and use the correct arguments, which is done at most places, but not everywhere. For example in GFS_time_vary_pre.scm.F90, GFS_time_vary_pre.fv3.F90 it's currently passing 4-byte real to w3nco_d, which is wrong.

Since most of the w3 routines used in ccpp are dealing either with date/time computations or reading grib1 data, the precision of real arguments is not critical, in my opinion the best/simplest solution is to always link with w3nco_4 version and hard-code all real arguments to 4-byte kind. The alternative approach currently used (detecting kind of real values in w3nco) works, but produces argument mismatch warnings with GNU v10+ and forces us to use compiler flag to ignore all mismatches, which is bad.

@SamuelTrahanNOAA
Copy link
Collaborator Author

This definitely doesn't work yet. When I run in debug mode, there are a ton of errors. Thus far, they have all been related to w3.

@DusanJovic-NOAA
Copy link
Collaborator

@SamuelTrahanNOAA
Copy link
Collaborator Author

I have merged the CCPP PR, but we need to wait until NOAA-GFDL/GFDL_atmos_cubed_sphere#193 is merged before we can merge the NOAA-EMC/fv3atm#534

@jkbk2004 You should post a message at NOAA-GFDL/GFDL_atmos_cubed_sphere#193 asking them to merge it.

@jkbk2004
Copy link
Collaborator

I have merged the CCPP PR, but we need to wait until NOAA-GFDL/GFDL_atmos_cubed_sphere#193 is merged before we can merge the NOAA-EMC/fv3atm#534

@jkbk2004 You should post a message at NOAA-GFDL/GFDL_atmos_cubed_sphere#193 asking them to merge it.

Sure! I will go to the PR there, stochastic pr59 as well.

@SamuelTrahanNOAA
Copy link
Collaborator Author

It appears the NOAA-GFDL/GFDL_atmos_cubed_sphere#193 PR is waiting for a review from @junwang-noaa and @DomHeinzeller, requested by the repository's code managers.

@DeniseWorthen
Copy link
Collaborator

@jkbk2004 The log for Hera.gnu was run against a local baseline generated by Sam. Do we know for sure that the additional baselines for GNU have been added to the current develop baseline on Hera?

@SamuelTrahanNOAA
Copy link
Collaborator Author

@jkbk2004 I'd like someone else to independently run all tests. If you haven't run the hera.gnu tests, could you please do so? Those tests don't take long since they're few in number and don't wait for Intel compiler licenses.

@jkbk2004
Copy link
Collaborator

@SamuelTrahanNOAA let me double check hera-gnu

@SamuelTrahanNOAA
Copy link
Collaborator Author

The chances of the hera.gnu tests failing are extraordinarily low since I've tested those many times myself. However, I would like independent testing since this PR has such broad implications.

@BrianCurtis-NOAA
Copy link
Collaborator

The chances of the hera.gnu tests failing are extraordinarily low since I've tested those many times myself. However, I would like independent testing since this PR has such broad implications.

Yes, they should always be run by EPIC separate from the owner of the PR. It's very helpful to know what machines the PR owners tests on as well.

@SamuelTrahanNOAA
Copy link
Collaborator Author

This branch now points to the authoritative repositories for FV3 and stochastic_physics. It is ready for final review and merge.

@SamuelTrahanNOAA
Copy link
Collaborator Author

Hold on. I need to delete two files first.

@SamuelTrahanNOAA
Copy link
Collaborator Author

Okay. NOW the branch is ready for final review and merge.

@SamuelTrahanNOAA
Copy link
Collaborator Author

I need one more review before this can be merged.

@jkbk2004
Copy link
Collaborator

@DusanJovic-NOAA @DeniseWorthen @ChunxiZhang-NOAA Finally, everything is ready to get final review and approval.

@DusanJovic-NOAA
Copy link
Collaborator

fv3atm submodule hash needs to be NOAA-EMC/fv3atm@9743346

@BrianCurtis-NOAA
Copy link
Collaborator

fv3atm submodule hash needs to be NOAA-EMC/fv3atm@9743346

@SamuelTrahanNOAA ^^

@SamuelTrahanNOAA
Copy link
Collaborator Author

@BrianCurtis-NOAA I just updated the FV3 hash.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
No Baseline Change No Baseline Change
Projects
None yet
Development

Successfully merging this pull request may close these issues.

32-bit physics