Advanced snow physics driver #621

Merged (47 commits), Aug 12, 2021

Conversation

eclare108213 (Contributor) commented Aug 4, 2021

  • Short (1 sentence) summary of your PR:
    CICE implementation of advanced snow physics in Icepack PR #360

  • Developer(s):
    E. Hunke, N. Jeffery, T. Craig

  • Suggest PR reviewers from list in the column to the right.

  • Please copy the PR test results link or provide a summary of testing completed below.

Preliminary tests on cheyenne with an earlier hash:
https://github.com/CICE-Consortium/Test-Results/wiki/cice_by_hash_forks#33070e484a80cb090fecb2229730d006c863f244

Final test results:
https://github.com/CICE-Consortium/Test-Results/wiki/cice_by_hash_forks#d25a3274a7aa1643d1160ad2d13eef6798c1a3cc

This PR requires QC testing due to non-BFB changes in the default configuration: QC passes.

  • How much do the PR code changes differ from the unmodified code?
    • bit for bit
    • different at roundoff level
    • more substantial
  • Does this PR create or have dependencies on Icepack or any other models?
    • Yes
    • No
  • Does this PR add any new test cases?
    • Yes
    • No
  • Is the documentation being updated? ("Documentation" includes information on the wiki or in the .rst files from doc/source/, which are used to create the online technical docs at https://readthedocs.org/projects/cice-consortium-cice/. A test build of the technical docs will be performed as part of the PR testing.)
    • Yes
    • No, does the documentation need to be updated at a later time?
      • Yes
      • No
  • Please provide any additional information or relevant details below:

This PR adds changes to the CICE driver needed to run with the new "advanced snow physics" capability in Icepack #360, including the netCDF lookup table derived from SNICAR. Setting tr_snow=.true. turns on the new snow code, with additional flags controlling the various options (snow redistribution by wind, snow grain metamorphism, and use of snow liquid content for melt ponds and radiation).
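
As a rough illustration, an ice_in fragment enabling the new physics might look like the sketch below. Only tr_snow, snwredist, snwlvlfac, and nslyr appear verbatim in this thread; the other names are placeholders and should be checked against the Icepack/CICE documentation.

    nslyr     = 5          ! the new snow tests use 5 snow layers (default is 1)
    tr_snow   = .true.     ! master switch for the advanced snow physics tracers
    snwredist = 'ITDrdg'   ! wind redistribution: 'none', 'bulk', or 'ITDrdg'
    snwlvlfac = 0.3        ! bulk redistribution parameter (value illustrative; 0 recovers BFB with 'bulk')
    snwgrain  = .true.     ! placeholder name for the grain metamorphism switch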

In addition to testing of Icepack itself, the new snow model has been tested in CICE as follows, comparing results with a control run with tr_snow=.false.:

  1. set tr_snow=.true. and all other snow options to .false. (BFB)
  2. set tr_snow=.true. with snwredist='bulk' and snwlvlfac=0 (BFB)
  3. set tr_snow=.true. with snwredist='ITDrdg', and manually set wind=0 and flost=0 in the snow code (not BFB even with -O0, but differences appear to be due to roundoff)
  4. would like to sanity-check the snow metamorphism code too (ideas welcome)

Note that the new snow tests set nslyr=5, while the default configuration still uses nslyr=1.

Icepack will need to be updated here for QC testing and before merging.

Remaining to do:

  • check fsloss description in the code - seems to be inconsistent (fraction vs flux)
  • update all drivers as in standalone/cice?
  • update Icepack
  • full regression testing
  • check history output from longer runs (> 1 year)
  • QC tests
  • add basic info to CICE documentation (primary info is in Icepack docs)

proteanplanet and others added 30 commits August 27, 2020 16:02
Fix minor problems on snow1 branch in io binary and pnetcdf implementation
apcraig (Contributor) commented Aug 6, 2021

I do think we should attempt to update the other drivers if it's reasonable. Of course, we cannot test them. I also think it's reasonable not to update them, as long as that's highlighted in the Recent Changes. Up to now we've updated the drivers whenever changes came into the standalone model, but at some point that may no longer be practical or reasonable. At that point, we probably need separate documentation that tracks changes for each standalone driver.

eclare108213 (Contributor, Author) commented:

The latest test results from @apcraig:

https://github.com/CICE-Consortium/Test-Results/wiki/cice_by_hash_forks#d25a3274a7aa1643d1160ad2d13eef6798c1a3cc

All tests pass; most fail regression against the previous baseline, as expected given the non-BFB changes.

The QC test passed as well:

./cice.t-test.py /glade/scratch/tcraig/CICE_RUNS/cheyenne_intel_smoke_gx1_36x1_medium_qc.qc_base /glade/scratch/tcraig/CICE_RUNS/cheyenne_intel_smoke_gx1_36x1_medium_qc.qc_test
INFO:__main__:Running QC test on the following directories:
INFO:__main__:  /glade/scratch/tcraig/CICE_RUNS/cheyenne_intel_smoke_gx1_36x1_medium_qc.qc_base
INFO:__main__:  /glade/scratch/tcraig/CICE_RUNS/cheyenne_intel_smoke_gx1_36x1_medium_qc.qc_test
INFO:__main__:Number of files: 1460
INFO:__main__:2 Stage Test Passed
INFO:__main__:Quadratic Skill Test Passed for Northern Hemisphere
INFO:__main__:Quadratic Skill Test Passed for Southern Hemisphere
WARNING:__main__:Error loading necessary Python modules in plot_data function
WARNING:__main__:Error loading necessary Python modules in plot_data function
WARNING:__main__:Error loading necessary Python modules in plot_data function
INFO:__main__:
INFO:__main__:Quality Control Test PASSED

The sandbox I was using is equivalent to what the trunk will be after both isnow1 and snow1 are merged.

eclare108213 marked this pull request as ready for review August 11, 2021 16:07
eclare108213 (Contributor, Author) commented:

This snow PR is ready for review. @apcraig @proteanplanet @njeffery or anyone else interested...

apcraig (Contributor) left a comment

This all looks reasonable to me.

proteanplanet (Contributor) left a comment

While we wait for the standard testing to complete: if possible, it would be good to see some model results with this driver as a plausibility test - say a 10-year stand-alone simulation average. The changes are quite extensive between this and the snow physics PR I approved last night. Is it possible to share some plots that demonstrate basic health of the model with and without these changes included?

eclare108213 (Contributor, Author) commented:

This PR is for the new snow driver implementation in CICE -- the Icepack submodule is being updated here and is now the same as in the Icepack PR you approved yesterday (thank you!).

There are some 1-year results from a run with both wind redistribution of snow and snow metamorphism turned on, here, checking order-of-magnitude reasonableness. @njeffery plans to test the Icepack changes in E3SM, so we'll be able to see how different this version is. I've not been able to get badger to cooperate since yesterday, so if someone else wants to launch some longer runs somewhere else, I'd appreciate it. Here are the tests I've been running during development:

default configuration with 1 snow layer:
cice.setup -m badger -s gx1prod,medium,icdefault -p 32x1 -g gx1 --case cntrl

default configuration but with 5 snow layers -- set nslyr=5 in ice_in
cice.setup -m badger -s gx1prod,medium,icdefault -p 32x1 -g gx1 --case cntrl5

new snow tests:
cice.setup -m badger -s gx1prod,medium,icdefault,snwgrain -p 32x1 -g gx1 --case snwgrain
cice.setup -m badger -s gx1prod,medium,icdefault,snwITDrdg -p 32x1 -g gx1 --case snwredistITD
cice.setup -m badger -s gx1prod,medium,icdefault,snw30percent -p 32x1 -g gx1 --case snwredist30
cice.setup -m badger -s gx1prod,medium,icdefault,snwITDrdg,snwgrain -p 32x1 -g gx1 --case snwITDgrain

eclare108213 (Contributor, Author) commented:

Note: The cice.setup commands above only run for 1 year. Both the run length in years and the wallclock time in hours would need to be adjusted for 5- or 10-year runs.
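
As a rough sketch, extending a run to 10 years might involve namelist settings like the following (npt is the standard CICE run-length setting; npt_unit assumes the updated calendar options and should be verified), plus a correspondingly larger wallclock request in the generated batch script:

    npt_unit = 'y'        ! interpret npt in years
    npt      = 10         ! run for 10 years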

apcraig (Contributor) commented Aug 11, 2021

@proteanplanet, are you interested in seeing results with the snow mods on or off? With the snow mods off, a QC test passes, which indicates the climate is the same. I have the 5-year QC runs if anyone wants to look at both the baseline (current master) and the snow version with the snow features off. Separately, I'd be happy to run one of the snow tests suggested by @eclare108213. Are there one or two in particular that would be valuable to have?

proteanplanet (Contributor) replied:

> (quoting @apcraig's comment above)

Thanks @apcraig, I know @eclare108213 has results extending beyond 5 years, and wondered if at least 10 years could be added, if available, since this falls into the category of a major addition to the code (Category IV: non-BFB and climate changing new model configuration requiring separate scientific assessment). It would be good to have a record of basic functionality to fall back on in case problems are encountered upon implementation in coupled forecast or climate models.

eclare108213 (Contributor, Author) commented:

I haven't done any 5-year runs -- I'm compute limited here at LANL at the moment and should look into using E3SM computing resources elsewhere. @apcraig, if you have time, a 10-year run of the snwITDgrain case would be most helpful, and either a 1- or 5-layer-snow default run to compare with (5 would be a cleaner comparison, but 1 is our default config). If anything looks suspicious, we can dig a little deeper with more refined configurations. The snwredist30 run uses an altogether different wind redistribution scheme but is less likely to be used, so lower priority in my opinion.

apcraig (Contributor) commented Aug 11, 2021

OK, I'll fire up a few runs on cheyenne.

if (aicen(i,j,1,iblk)> c0) then
if (tr_brine) phinS1(n) = trcrn(i,j,nt_fbri,1,iblk) &
* vicen(i,j,1,iblk)/aicen(i,j,1,iblk)
phbrn(n) = (c1 - rhosi/rhow)*vice(i,j,iblk)/aice(i,j,iblk) &
dabail10 (Contributor) commented on this code:

Should this account for variable snow density when the tracer is turned on?

eclare108213 (Contributor, Author) replied:

Possibly. The changes that I made to this section of the code were only refactoring for better performance and shouldn't affect the overall logic or physics/diagnostics as previously coded at all. There are quite a lot of places in the code where variable snow density could be used instead of rhos, beyond the new snow physics routines. That's a whole new science project.
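
As an illustration of what that might eventually look like, the sketch below computes a bulk snow density from snow mass tracers instead of the constant rhos; the tracer layout and names (smice, smliq holding per-layer ice and liquid mass per unit snow volume) are assumptions for this example, not the PR's code.

      function effective_rhos(tr_snow, nslyr, smice, smliq, rhos) result(rhos_eff)
         ! Sketch only: bulk snow density from the advanced snow mass tracers,
         ! falling back to the constant rhos when the tracers are inactive.
         logical,      intent(in) :: tr_snow        ! advanced snow tracers active?
         integer,      intent(in) :: nslyr          ! number of snow layers
         real(kind=8), intent(in) :: smice(nslyr)   ! assumed: ice mass per unit snow volume, by layer
         real(kind=8), intent(in) :: smliq(nslyr)   ! assumed: liquid mass per unit snow volume, by layer
         real(kind=8), intent(in) :: rhos           ! constant snow density
         real(kind=8)             :: rhos_eff
         if (tr_snow .and. nslyr > 0) then
            rhos_eff = sum(smice + smliq) / real(nslyr, kind=8)
         else
            rhos_eff = rhos
         endif
      end function effective_rhos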

apcraig (Contributor) commented Aug 11, 2021

I just noticed that both the QC tests failed after 4 years. They both failed on Dec 31, 2008 with errors like

istep1:     35040    idate:  20081231    sec:         0
 (JRA55_data) reading forcing file 1st ts =
 /glade/p/cesm/pcwg_dev/CICE_data/forcing/gx1/JRA55/8XDAILY/JRA55_03hr_forcing_2005.nc
 (JRA55_data) reading forcing file 2nd ts =
 /glade/p/cesm/pcwg_dev/CICE_data/forcing/gx1/JRA55/8XDAILY/JRA55_03hr_forcing_2005.nc
  (zap_snow_temperature)zap_snow_temperature: temperature out of bounds!
  (zap_snow_temperature)k:           1
  (zap_snow_temperature)zTsn:  -105.364029298144
  (zap_snow_temperature)Tmin:  -100.000000000000
  (zap_snow_temperature)Tmax:  7.944590571799608E-007
  (zap_snow_temperature)zqsn:  -183445893.081624

We probably need to update the QC validation to check that there is a complete set of output as part of the process. This looks a lot like the problems we have with the evp1d validation, which suggests the problem is not the 1d evp or the snow -- it's something in our basic setup these days. I will try to set up some longer runs and see if they also fail. At this point, I suggest we work to merge the snow mods and then identify why the basic solution is not stable right now. That should be high priority.

apcraig (Contributor) commented Aug 11, 2021

The QC test cycles 2005 data. Is that how we want to run these 10 year runs?

proteanplanet (Contributor) left a comment

I'm happy to proceed as required, and we can use Compy under the ESMD project label to run longer tests. In the interests of finalizing this before @eclare108213 leaves, I'm fine with accepting this PR, and then testing even 30 years cycled with 2005 forcing once the larger Consortium QC issues are fixed - which is, to my mind, the highest priority. I would like to suggest that the point @dabail10 has raised with regards to variable snow density be included in a new issue that also addresses 30-year plausibility runs of the new snow physics. At least, that's one way we can deal with today's time pressure.

dabail10 (Contributor) commented:

So, this is typical when the snow energy/enthalpy goes crazy. Note the threshold here is -100C, so it likely goes bad much earlier in the run.

Dave

> (quoting @apcraig's QC failure report above)

dabail10 (Contributor) commented Aug 11, 2021

This is fine. The diagnostic budgets might not quite add up, but this looks like a great first step.

> (quoting @proteanplanet's comment above)

apcraig mentioned this pull request Aug 11, 2021
apcraig (Contributor) commented Aug 11, 2021

I suspect the QC issue is a calendar issue. We have leap years on but are cycling 2005 data, and the failure is on Dec 31, 2008. See also #623. I'll follow up. We might want to think about turning leap years off if the default is cycling non-leap data. I'll do some debugging and see whether we can do something with the forcing too, like repeating the last day if the year runs out of data. We'll have to decide how we want this to work.

eclare108213 (Contributor, Author) commented Aug 11, 2021

@apcraig that's a good hypothesis. You could run a QC on the code immediately before the new calendar came in, then on the calendar PR to see if that's when it started, but I think you're probably right. And it's a relief that this might just be a calendar/forcing issue, not something more fundamental to the physics.

We don't need to rush this PR just because I'm leaving, but if everyone's fine with it even with these outstanding questions, I'm happy to merge. There's always more to do...

apcraig (Contributor) commented Aug 11, 2021

I think we should merge. @eclare108213, do you want to do the honors? If not, I'll take care of it tomorrow, unless we hear otherwise.

apcraig (Contributor) commented Aug 12, 2021

I reran the QC tests with use_leap_year=.false. and everything completed fine and now the tests pass correctly with 1825 files.

>./cice.t-test.py /glade/scratch/tcraig/CICE_RUNS/cheyenne_intel_smoke_gx1_36x1_medium_qc.qc_base2 /glade/scratch/tcraig/CICE_RUNS/cheyenne_intel_smoke_gx1_36x1_medium_qc.qc_test2
INFO:__main__:Running QC test on the following directories:
INFO:__main__:  /glade/scratch/tcraig/CICE_RUNS/cheyenne_intel_smoke_gx1_36x1_medium_qc.qc_base2
INFO:__main__:  /glade/scratch/tcraig/CICE_RUNS/cheyenne_intel_smoke_gx1_36x1_medium_qc.qc_test2
INFO:__main__:Number of files: 1825
INFO:__main__:2 Stage Test Passed
INFO:__main__:Quadratic Skill Test Passed for Northern Hemisphere
INFO:__main__:Quadratic Skill Test Passed for Southern Hemisphere
WARNING:__main__:Error loading necessary Python modules in plot_data function
WARNING:__main__:Error loading necessary Python modules in plot_data function
WARNING:__main__:Error loading necessary Python modules in plot_data function
INFO:__main__:
INFO:__main__:Quality Control Test PASSED

Separately, I have been debugging the QC abort problem and have a good handle on it. I will create a separate PR to fix that issue, so we can take up discussion of that problem there.

apcraig (Contributor) commented Aug 12, 2021

I have run the 5 layer control default case and the snwITDgrain case for 8 years each. Would you like me to share some data? The leap year sync issue is a problem in gx1prod as well. We start at 2005 and cycle the forcing data for 5 years. When we hit 2012, we're using the 2007 forcing and on Dec 31, 2012 the model fails because 2012 is a leap year in the model but not in the forcing. That's why I have only 8 years. We'll have to figure out how we want to setup the gx1prod configuration to cycle successfully. One way is to turn off the leap years. Another would be to cycle over 4 years instead of 5. Another would be to come up with a scheme where we use the last day of the forcing to fill missing days at the end of the year somehow. Pluses and minuses to all approaches.

Anyway, I have 8 years and can do something with it if you like. I can also rerun with leap years off for 10 years if that would be useful, for instance.

proteanplanet (Contributor) commented:

Hi Tony, via email, please point me to the data and I'll run a few plots.

eclare108213 (Contributor, Author) commented:

> We'll have to figure out how we want to setup the gx1prod configuration to cycle successfully.

In the options script for gx1prod, perhaps the best thing is to cycle through just 4 years. However, gx1prod implies a longer "production" style simulation, so really it should be set up to start with whatever the initial year of JRA-55 forcing is (mid-20th-century) and cycle several decades. If a user doesn't have all the data, then they can change the forcing configuration to match what they have. We could provide examples for how to do that in the docs.

@apcraig probably has a more elegant way to handle this in his upcoming PR, but to help users avoid leapyear/forcing inconsistencies, it would be a good idea to check for them at initialization and abort if e.g. leapyear is on, run length is more than one year, and the forcing year cycle is not an even multiple of 4. Restarts complicate matters. For shorter runs (1-3 years) it makes sense to me to recommend turning off leapyear.
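
A rough sketch of such a check at initialization (names like use_leap_year, ycycle, npt, and npt_unit follow the namelist settings mentioned in this thread but should be verified; abort_ice stands in for the model's abort routine):

      subroutine check_forcing_calendar(use_leap_year, ycycle, npt_unit, npt)
         ! Sketch only: abort if leap years are on, the run is longer than one
         ! year, and the forcing year cycle is not a multiple of 4, since model
         ! leap days would eventually fall outside the cycled forcing data.
         logical,          intent(in) :: use_leap_year
         integer,          intent(in) :: ycycle      ! number of forcing years cycled
         character(len=*), intent(in) :: npt_unit    ! run length unit ('y' = years)
         integer,          intent(in) :: npt         ! run length in npt_unit units
         if (use_leap_year .and. mod(ycycle,4) /= 0 .and. &
             npt_unit == 'y' .and. npt > 1) then
            call abort_ice('forcing cycle and leap-year calendar cannot stay in sync: '// &
                           'turn off use_leap_year or make ycycle a multiple of 4')
         endif
      end subroutine check_forcing_calendar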

> We start at 2005 and cycle the forcing data for 5 years.

Let's see what the output looks like after 8 years (thanks @proteanplanet). A good rule of thumb for gx1prod type runs is to run for at least 10 years, preferably longer, using reanalysis forcing that does not repeat (unless such forcing doesn't exist for the whole length of the run). I.e. run 2005-2015 using data for 2005-2015. Repeating one year of forcing over and over results in really strange looking output because of the persistent/repetitive weather patterns.
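
For a run of that kind, the forcing and run-length settings might look roughly like this (fyear_init and ycycle are the usual CICE forcing-cycle namelist settings, but the exact names and the available JRA55 years should be verified):

    fyear_init = 2005     ! first year of forcing data
    ycycle     = 11       ! cycle length in years (2005-2015, so no repeats within the run)
    npt_unit   = 'y'
    npt        = 11       ! run 11 years, one forcing year per model year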
