Restart reproducibility for S2SW (was Target improvements/changes for Prototype 5) #739
Comments
Additional details:
|
We have a NOAA-EMC fork of the CICE-Consortium which I have been keeping up to date w/ master: NOAA-EMC/CICE |
@AvichalMehra-NOAA @JessicaMeixner-NOAA may I ask some questions:
|
For (1), reproducible restarts will not be an initial focus for prototype 5. We will consider them once we have finalized the role of mediator with CMEPS for atm-ocn-ice-wav coupling. Yes on (2). |
@junwang-noaa Benchmark 5 will include four components, FV3-MOM6-CICE6-WW3, with the CMEPS mediator used for exchanges between FV3-MOM6-CICE6, and with WW3 connecting to the other models via connectors. |
|
Sorry, I think it was auto-closed when I merged. |
According to Avichal at the coupled model meeting on 6/3/2020, changing the ww3 connector to the CMEPS mediator is not a requirement for benchmark5. We will work on it if all the benchmark5 required components are finished before the deadline; otherwise it will be in benchmark 6. |
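Working in close collaboration w/ @mvertens at NCAR, I've been able to run a version of the cpld_control_wave_p7 through the CMEPS mediator using:
- a modified CESM WW3 nuopc cap using ifdef CESMCOUPLED to separate code specific to CESM
- a 2/3deg MOM6 tripole grid for WW3 derived from the current CESM configuration
- import of the following fields, all mapped with nearest-source-to-destination conservative mapping
- u10m,v10m and Tbot from the ATM
- u,v and SST from the OCN
- ice fraction from the ICE
- export of the wave roughness length to the ATM
When testing restarts, the Sw_z0 does not reproduce. I made an attempt to add the field to the ww3 restart file; the model restarts but the initial value of Sw_z0 is still not reproducing. I will continue working on this. |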
So we have the first coupled runs with components coupling through CMEPS,
nice work!
…On Mon, Nov 1, 2021 at 8:12 AM Denise Worthen ***@***.***> wrote:
I've been able to run a version of the cpld_control_wave_p7 through the
CMEPS mediator using :
- a modified CESM WW3 nuopc cap using ifdef CESMCOUPLED to separate
code specific to CESM
- a 2/3deg MOM6 tripole grid for WW3 derived from the current CESM
configuration
- import of the following fields, all mapped with
nearest-source-to-destination conservative mapping
- u10m,v10m and Tbot from the ATM
- u,v and SST from the OCN
- ice fraction from the ICE
- export of the wave roughness length to the ATM (shown below after 12
hours in the coupler history):
[image: Screen Shot 2021-10-31 at 10 13 38 AM]
<https://user-images.githubusercontent.com/40498404/139595392-f3eda68c-1bc9-4e1e-a3ea-c8a1fd04598c.png>
When testing restarts, the Sw_z0 does not reproduce. I made an attempt to
add the field to the ww3 restart file; the model restarts but the initial
value of Sw_z0 is still not reproducing. I will continue working on this.
|
I've been able to achieve restart reproducibility (6hr/6hr/12hr) in the current set up by adding conditional logic for restart to the The bug fix for deactivated sea points in I'll also note that for the 12h test I've been running, the current setup is giving a consistent wall clock time of ~340s, about 100s faster than the typical wall time for the current cpld_control_wave_p7, even though the ocean resolution is higher (2/3 deg vs 1deg). |
That's awesome @DeniseWorthen !!!! The difference in timing might be explained by the spectral resolution instead of the geographic resolution. Also, multi has a small overhead compared to shel, but given the timings you mention, I'd suspect its spectral resolution -- which needs to be decreased (#822) |
Nice Work, Denise! May I ask what resolutions are used in this CMEPS test
and the cpld_control_wave_p7?
|
The test I'm running is C96mx100 for the ATM-OCN-ICE but the WAV model is 2/3deg MOM6 grid instead of the 1deg rectilinear grid used in our standard test. |
Thanks, what about the spectral resolution used in the two tests?
|
I believe I am using a spectral resolution of 25 in the CMEPS test; the cpld_control_p7_wave is currently using 50. |
Great news @DeniseWorthen, did |
@aliabdolali I don't think WRST is necessarily used/needed due to the different NUOPC cap & mediator. @DeniseWorthen that will explain the difference. We are going to put together a PR with an updated 1 deg spectral resolution and a fix for the other bugs so that the new resolution will be available for everyone to use; we should not have 50 there. |
Speaking of new WW3 grids, @DeniseWorthen it's still unclear to me if you just needed the WW3 1 deg rectilinear grid to have fewer masked-out regions, or if you wanted a curvilinear 1deg WW3 grid that matched MOM6's tripole grid but stopped at 80 deg N or something like that. |
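@aliabdolali As Jessica notes, it appears that WRST is not required for restart reproducibility when you're coupling through a mediator. My change in the CalcRoughl SR did result in roughness lengths over ice>0.5 having a uniform value of 968.6. This does result in strange (large) values getting mapped to the ATM. I need to look also at how the ATM is actually applying the roughness length near the sea ice edge. @SMoorthi-emc Does the ATM apply some ice fraction cutoff to the values it receives? |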
From what I see, Atm only allows the zorl values <0.1 when the ocean frac
…0, it does not have ice fraction cutoff for zorl.
On Thu, Nov 4, 2021 at 10:00 AM Denise Worthen ***@***.***> wrote:
@aliabdolali <https://github.com/aliabdolali> As Jessica notes, it
appears that WRST is not required for restart reproducibility when you're
coupling through a mediator.
My change in the CalcRoughl SR did result in roughness lengths over
ice>0.5 having a uniform value of 968.6. This does result in strange
(large) values getting mapped to the ATM. I need to look also at how the
ATM is actually applying the roughness length near the sea ice edge. Does
the ATM apply some ice fraction cutoff to the values it receives?
|
@JessicaMeixner-NOAA As to the grids, I think it is up to how the coupled group wishes to validate WW3 coupling through the mediator vs connectors. Ultimately I think it makes the most sense to run waves on the same grid as the ocean and ice. But since we're currently using a rectilinear grid for waves (w/ the connectors), is that an acceptable difference for validation? Or would you prefer being able to do a "clean" validation where the only difference was connector vs mediator? |
@junwang-noaa Thanks for the explanation on zorl. I guess that explains why it didn't blow up w/ the large mapped values. |
Hi Denise,
This is a fantastic result! Great work.
Mariana
--
Dr. Mariana Vertenstein
CESM Software Engineering Group Head
National Center for Atmospheric Research
Boulder, Colorado
Office 303-497-1349
Email: ***@***.***
|
I've updated my test configuration to run on the MOM6 1-deg tripole grid and turned on export of the stokes drift partition fields to the MOM6. The 6h/6h/12h restart still passes, except for 6 points where the roughness length passed to the ATM does not reproduce. Those 6 points are however "out-of-range" for what the ATM uses (they are ~1000). Since they are not used by ATM, restart reproducibility is maintained. Using the same 1-deg MOM6 Tripole grid, I ran the current connector version also after enabling the export field writing. The following figures show the export fields from WAV at the end of 6 hours (not all exported fields are shown): |
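The "differences only at out-of-range points" check described above could be sketched roughly as follows. This is only an illustration, not the actual comparison used in the regression tests: the function name, the sample values and the cutoff are assumptions (the 0.1 cutoff is borrowed from the earlier comment that the ATM only allows zorl values <0.1, and units are not considered here).

```python
def mismatches_in_range(control, restart, cutoff):
    """Return indices where the two fields differ AND at least one of the
    two values is below the cutoff, i.e. could actually be used by the ATM.
    Points where both values are at or above the cutoff are ignored."""
    bad = []
    for i, (c, r) in enumerate(zip(control, restart)):
        if c != r and (c < cutoff or r < cutoff):
            bad.append(i)
    return bad


# Illustrative values only (not taken from any model output file).
control_z0 = [0.001, 0.02, 968.6, 1000.0]
restart_z0 = [0.001, 0.02, 968.6, 1003.0]

# Hypothetical cutoff, see the note above.
ZORL_CUTOFF = 0.1

all_diffs = [i for i, (c, r) in enumerate(zip(control_z0, restart_z0)) if c != r]
used_diffs = mismatches_in_range(control_z0, restart_z0, ZORL_CUTOFF)

print("points that differ:", all_diffs)
print("points that differ and are in range:", used_diffs)
```

With values like these, the fields differ at one point, but that point is out of range in both runs, so it would not affect what the ATM sees.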
@aliabdolali @JessicaMeixner-NOAA I noticed some behaviour when testing my latest code updates. I then went and ran the cpld_control_p7 from the current develop and noticed the same thing (I turned on the field dumping in wmesmf). Basically, the stokes3 (eg, y3pstk) components are now very very small (~10^-13) everywhere. Is there an explanation for this? |
Did your latest code updates update WW3? If so what were the before/after differences and code versions? I have not noticed this but haven't printed out the values in a while either. |
My code is behaving the same as the current develop branch, using the cpld_control_p7 test. A run directory is here: /scratch1/NCEPDEV/stmp2/Denise.Worthen/FV3_RT/rt_9619/cpld_control_p7 |
@DeniseWorthen Okay, I double checked what I suspected and confirmed it's just the lack of ICs. I checked this by running the benchmark reg test twice. Without wave ICs: /scratch1/NCEPDEV/stmp2/Jessica.Meixner/FV3_RT/rt_239203/cpld_bmark_p7 The run with ICs has much higher values of the partitioned Stokes drift, so I think everything you are seeing is normal. If you want me to create an IC for your testing, just send me the ww3_grind.inp file and IC date of choice and it'll take me about a day or so. |
@DeniseWorthen wind seas spin up much quicker than swells. Also, I actually think this is more of an issue of us having reduced the frequency space for speed of the tests, and the three bands are based on frequency. If I increase the number of frequencies, we get more in the 3rd partition. See my test here: /scratch1/NCEPDEV/stmp2/Jessica.Meixner/FV3_RT/rt_5412/cpld_control_p7 which uses more frequency and direction points. |
Thanks, that makes more sense. I understand about reducing the number of frequencies to speed up the tests. I want to understand more what is happening though. If I read the "option 2" description in CALC_U3STOKES and look at the code, I see that the partitions don't need to match the spectral freq grid of WW3; it will essentially "bin" the contributions into the three partitions given. So when you reduce the number of frequencies, does it truncate the spectrum rather than just making each "band" in the spectrum wider? In other words, in the glo_1deg grid.inp, I see where it sets "freq increment factor, first freq and number of frequencies". If I look in the older glo_1deg inp file in the input data, I see 1.07 0.035 50 36 0.5, and in the newer one I see 1.07 0.035 25 24 0.5. So the "increment" is the same, but the spectrum only extends out half as far (25 vs 50), so the third partition ends up being empty. Is that how it works? I'm sure I'm getting some of the terminology wrong; hopefully I'm making sense. |
Yes, you can see the description of the variables here: so the first frequency is 0.035 and then the second frequency is 0.035*1.07. The Stokes frequency bands were chosen based on the standard frequencies, so they're not perfect for this, but I don't think we should change them right now either, especially as we do not have a consistent way to deal with the various values across the different places. |
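To make the truncation concrete, here is a small sketch of the frequency grid implied by the two grid.inp settings quoted above (increment factor 1.07, first frequency 0.035, and 50 vs 25 frequencies), following the "first frequency, then first frequency times the increment" rule. The function names are made up for illustration, and the band limits used at the end are purely hypothetical; they are not the Stokes partition limits used in WW3.

```python
def frequency_grid(xfr, f1, nk):
    """Return nk frequencies: f1, f1*xfr, f1*xfr**2, ..."""
    return [f1 * xfr**i for i in range(nk)]


def count_in_band(freqs, lo, hi):
    """Count grid frequencies f with lo <= f < hi."""
    return sum(1 for f in freqs if lo <= f < hi)


old_grid = frequency_grid(1.07, 0.035, 50)  # older glo_1deg inp values
new_grid = frequency_grid(1.07, 0.035, 25)  # newer glo_1deg inp values

print("old grid: %d freqs, max = %.4f" % (len(old_grid), max(old_grid)))
print("new grid: %d freqs, max = %.4f" % (len(new_grid), max(new_grid)))

# The new grid is the old grid truncated; the bands are not made wider.
print("new grid equals first 25 of old:", new_grid == old_grid[:25])

# Hypothetical high-frequency band, for illustration only.
BAND_LO, BAND_HI = 0.2, 0.5
print("old grid freqs in band:", count_in_band(old_grid, BAND_LO, BAND_HI))
print("new grid freqs in band:", count_in_band(new_grid, BAND_LO, BAND_HI))
```

Under these assumptions the shorter grid stops well below the hypothetical band, so nothing is binned into it, which is consistent with the near-empty third partition discussed above.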
I have built wave-on versions of the c96, c192 and c384 regression control/restart tests where in each case waves are running on the MOM6 tripole grid (1 deg, 1/2deg and 1/4 deg). All control/restart tests pass if the ww3 restart file itself is not used in the file comparison. However, for the mx050 and mx025 cases, the wave restart file itself does not reproduce in the restart run (but all other restart files do reproduce). I believe something similar is noted in this issue thread. My branch is up-to-date with the current dev/ufs-weather-model branch of WW3. I also did a memory profiling test for the c96mx100 case. This test can be compared to the one in Discussion 779, though it is not a strictly 'apples-to-apples' comparison. |
Remove the "WRST" line https://github.com/NOAA-EMC/WW3/blob/develop/model/esmf/switch#L9 in the switch file WW3/model/esmf/switch and you should be able to also compare WW3 restarts successfully. That's a known issue; I will expand on why later. The variable/feature that WRST brings should not be needed when coupled via CMEPS. |
I am not using the WRST switch. |
Is it just the first restart file during the second restart run or all of the wave restart files? |
Right now, wave restarts are being written every 3 hours for my test setup. For the c192 setup, the restart test produces wave restarts at 20210322.18,22.21,23.0 etc. All are different compared to the same file in the control run. So the first restart that the restart run writes at startup (20210322.180000.restart.ww3), which should be identical to the restart.ww3 file it is using from the control, is actually different. |
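For reference, a file-by-file check of this kind could be sketched as below. This is not the regression-test comparison script; the function names and the directory layout (one directory per run, restart files named like 20210322.180000.restart.ww3) are assumptions for illustration. It simply compares checksums of same-named wave restart files from the control and restart runs.

```python
import hashlib
from pathlib import Path


def file_digest(path):
    """Return the SHA-256 hex digest of a file, read in 1 MiB chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()


def compare_restarts(control_dir, restart_dir, pattern="*.restart.ww3"):
    """Compare same-named restart files in two run directories.

    Returns {filename: True/False}, where True means bit-identical.
    Files that exist only in the control directory are skipped.
    """
    control_dir, restart_dir = Path(control_dir), Path(restart_dir)
    results = {}
    for ctl in sorted(control_dir.glob(pattern)):
        rst = restart_dir / ctl.name
        if rst.exists():
            results[ctl.name] = file_digest(ctl) == file_digest(rst)
    return results
```

Calling `compare_restarts` with the control and restart run directories would return one entry per timestamped file, which makes it easy to see whether only the first file written at startup differs or all of them do.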
The first restart (i.e. 20210322.180000.restart.ww3) being different from restart.ww3 wouldn't surprise me too much, although I thought removing the WRST switch was sufficient. The other ones being different does surprise me a little, but we've recently had a similar issue, NOAA-EMC/WW3#452, which this reminds me of, where the restart files are not reproducing but the answers are. I'm still looking into this issue, but haven't figured anything out yet. |
To be completed by 08/15/20.