forked from mom-ocean/MOM6
-
Notifications
You must be signed in to change notification settings - Fork 0
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Updates on 04_21 #2
Merged
Merged
Conversation
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Merge in latest dev/gfdl updates
Merge in latest MOM6 dev/gfdl updates
Merge in latest commits
Merge in latest MOM dev/gfdl updates
Merge in latest MOM6 updates
Merge in latest dev/gfdl updates
Merge in latest dev/gfdl updates
merge latest updates into dev/gfdl
Merge in latest dev/gfdl commits
Merge in dev/gfdl updates
Merge in latest dev/gfdl updates
Merge in latest dev/gfdl updates
merge in latest dev/gfdl changes
merge in latest dev/gfdl updates
Merge in latest MOM6 updates
Merge in latest dev/gfdl updates
Merge in latest round of dev/gfdl updates
merge in latest dev/gfdl updates
Merge in latest dev/gfdl updates
Merge in latest dev/gfdl updates
Merge in latest dev/gfdl updates
Merge in latest updates from dev/gfdl
Merge in latest dev/gfdl updates
Merge in updates to remap_all_state_vars
Merge in latest updates
Merge in dev/gfdl updates
merge dev/emc
* TC4 integration into test suite This patch renames the tc4 test to activate it in the test suite. It also modifies the Makefile to build the input field test scripts. It also modifies the Python build scripts to be PEP8-conformant. We temporarily disable tc4 in the restart tests, since they currently fail. This needs to be addressed before we can merge this into the main branch. The patch does not enable the necessary Python modules for running on Travis, that will also be addressed later. * Travis python support; tc4 Makefile The custom TC4 Makefile has been added (oops), and the presumed Python Ubuntu packages have been added for Travis. * Verify ENABLE_THERMODYNAMICS is True before posting C_p diagnostic * Make tc4 faster * remove trailing whitespace * add unit scaling * fix restart fail for tc4 and some cleanup * remove trailiny ws * Enable tc4.restart test * +Pass timeesteps to tracer diagnostics in [T] Pass timeesteps to the tracer diagnistics routines post_tracer_diagnostics and postALE_tracer_diagnostics and to adiabatic in units of [T}. All answers are bitwise identical. * +Rescaled tracer advective flux diagnostics Rescaled the internal units of the tracer advective flux diagnostics to units of [conc H L2 T-1] for code simplicity and dimensional consistency testing. Also corrected the units of some tracer fluxes as documented in comments and commented out unused elements of the tracer_type. All answers are bitwise identical. * +Pass timesteps to ALE_main in [T] Pass the timesteps to ALE_main, ALE_main_offline, and ALE_main_accelerated in units of [T] for code simplicity and dimensional consistency testing. This also includes the rescaling of remapping-driven tracer tendencies. All answers and diagnostics are bitwise identical. * +Pass timesteps to tracer column_physics in [T] Pass timesteps to the various tracer column_physics routines in [T] for dimensional consistency testing. Also added a new unit_scale_type argument to these routines. All answers are bitwise identical, but there are minor interface changes to 13 subroutines. * +Pass timesteps to applyTracerBoundaryFluxesInOut in [T] Pass timesteps to applyTracerBoundaryFluxesInOut in [T], and use units of [T-1] for internal source and decay rates for the oil tracer and in fluxes of CFCs. Also modified extract_offline_main to return timesteps as real values with units of [T]. Also there is a new unit_scale_type argument to register_oil_tracer. All answers in the MOM6_examples test cases and regression tests are bitwise identical. * Simplified expressions in MOM_PointAccel Simplified expressions inside of MOM_PointAccel, taking into account that all velocities use the same units of [L T-1]. All answers are bitwise identical. * Corrected dimensional epsilons in downscaling Added distinct negligible volumes, face areas, horizonal areas and lengths with proper dimensional rescaling in the downsample field routines. With these changes, downscaled diagnostics should now pass the dimensional rescaling tests, whereas previously there would have been a problem when the numbers used to represent lengths are smaller than about 1e-8 times their MKS values. All answers are bitwise identical without dimensional rescaling. * Simplified expressions in MOM_offline_aux Simplified expressions in distribute_residual_uh_barotropic. All answers are bitwise identical. * Revised wave_speed to return speed in [L T-1] Revised wave_speed to return the internal wave speed in units of [L T-1] and to use mono_N2_depth in units of [Z] for code simplification and expanded dimensional consistency testing. Also revised the internal units of some related diagnostics in calculate_diagnostic_fields. All answers and diagnostics are bitwise identical. * Rescaled internal variables in wave_speed Rescale internal calculations in wave_speed and wave_speeds for greater robustness and dimensional consistency testing. All answers are bitwise identical and pass dimensional scaling tests. * +Changed the units of minimum_forcing_depth to [H] Changed the units of minimum_forcing_depth passed to applyBoundaryFluxesInOut and applyTracerBoundaryFluxesInOut to [H]. All answers are bitwise identical. * Correction of documented units in comments Corrected some units in comments and eliminated some unused variables. All answers are bitwise identical. * Adiabatic clock ID bugfix This patch fixes an initialization bug of the diabatic timer, which was being used to measure adiabatic time but was never initialized if the experiment was configured as adiabatic. We fix this by introducing a separate timer for the adiabatic solver. Although we could have reused the diabatic timer, the addition of a new variable should not add any overhead on modern compilers. * Corrected an OMP declaration Added a variable to an OMP declaration. All answers are bitwise identical, and a recently added compile-time error with openMP was fixed. * Update MOM.F90 Fixed Alistair's embarrassing error. * Dimensional rescaling in MOM_open_boundary.F90 Added rescaling for dimensional consistency testing in MOM_open_boundary.F90, including splitting variables with different units that had previously shared the same variable and adding more extensive documentation of variables. Also changed the dimensions of the timesteps passed to radiation_open_bdry_conds and update_segment_tracer_reservoirs to [T] and added vertical_grid_type and unit_scale_type arguments to open_boundary_init and open_boundary_test_extern_h. All answers are bitwise identical, although some probably bugs have been noted in comments and there are new or altered arguments to several routines. * (*)Fixed invariance bugs in MOM_open_boundary.F90 Corrected dimensional consistency bugs in update_segment_tracer_reservoirs and horizontal indexing and related bugs in gradient_at_q_points with oblique_grad OBCs. These will both change answers in test cases that use some open boundary condition options, but not in any of the MOM6-examples test cases.
* TC4 integration into test suite This patch renames the tc4 test to activate it in the test suite. It also modifies the Makefile to build the input field test scripts. It also modifies the Python build scripts to be PEP8-conformant. We temporarily disable tc4 in the restart tests, since they currently fail. This needs to be addressed before we can merge this into the main branch. The patch does not enable the necessary Python modules for running on Travis, that will also be addressed later. * Travis python support; tc4 Makefile The custom TC4 Makefile has been added (oops), and the presumed Python Ubuntu packages have been added for Travis. * Verify ENABLE_THERMODYNAMICS is True before posting C_p diagnostic * Make tc4 faster * remove trailing whitespace * add unit scaling * fix restart fail for tc4 and some cleanup * remove trailiny ws * Enable tc4.restart test * +Pass timeesteps to tracer diagnostics in [T] Pass timeesteps to the tracer diagnistics routines post_tracer_diagnostics and postALE_tracer_diagnostics and to adiabatic in units of [T}. All answers are bitwise identical. * +Rescaled tracer advective flux diagnostics Rescaled the internal units of the tracer advective flux diagnostics to units of [conc H L2 T-1] for code simplicity and dimensional consistency testing. Also corrected the units of some tracer fluxes as documented in comments and commented out unused elements of the tracer_type. All answers are bitwise identical. * +Pass timesteps to ALE_main in [T] Pass the timesteps to ALE_main, ALE_main_offline, and ALE_main_accelerated in units of [T] for code simplicity and dimensional consistency testing. This also includes the rescaling of remapping-driven tracer tendencies. All answers and diagnostics are bitwise identical. * +Pass timesteps to tracer column_physics in [T] Pass timesteps to the various tracer column_physics routines in [T] for dimensional consistency testing. Also added a new unit_scale_type argument to these routines. All answers are bitwise identical, but there are minor interface changes to 13 subroutines. * +Pass timesteps to applyTracerBoundaryFluxesInOut in [T] Pass timesteps to applyTracerBoundaryFluxesInOut in [T], and use units of [T-1] for internal source and decay rates for the oil tracer and in fluxes of CFCs. Also modified extract_offline_main to return timesteps as real values with units of [T]. Also there is a new unit_scale_type argument to register_oil_tracer. All answers in the MOM6_examples test cases and regression tests are bitwise identical. * Simplified expressions in MOM_PointAccel Simplified expressions inside of MOM_PointAccel, taking into account that all velocities use the same units of [L T-1]. All answers are bitwise identical. * Corrected dimensional epsilons in downscaling Added distinct negligible volumes, face areas, horizonal areas and lengths with proper dimensional rescaling in the downsample field routines. With these changes, downscaled diagnostics should now pass the dimensional rescaling tests, whereas previously there would have been a problem when the numbers used to represent lengths are smaller than about 1e-8 times their MKS values. All answers are bitwise identical without dimensional rescaling. * Simplified expressions in MOM_offline_aux Simplified expressions in distribute_residual_uh_barotropic. All answers are bitwise identical. * Revised wave_speed to return speed in [L T-1] Revised wave_speed to return the internal wave speed in units of [L T-1] and to use mono_N2_depth in units of [Z] for code simplification and expanded dimensional consistency testing. Also revised the internal units of some related diagnostics in calculate_diagnostic_fields. All answers and diagnostics are bitwise identical. * Rescaled internal variables in wave_speed Rescale internal calculations in wave_speed and wave_speeds for greater robustness and dimensional consistency testing. All answers are bitwise identical and pass dimensional scaling tests. * +Changed the units of minimum_forcing_depth to [H] Changed the units of minimum_forcing_depth passed to applyBoundaryFluxesInOut and applyTracerBoundaryFluxesInOut to [H]. All answers are bitwise identical. * Correction of documented units in comments Corrected some units in comments and eliminated some unused variables. All answers are bitwise identical. * Adiabatic clock ID bugfix This patch fixes an initialization bug of the diabatic timer, which was being used to measure adiabatic time but was never initialized if the experiment was configured as adiabatic. We fix this by introducing a separate timer for the adiabatic solver. Although we could have reused the diabatic timer, the addition of a new variable should not add any overhead on modern compilers. * Corrected an OMP declaration Added a variable to an OMP declaration. All answers are bitwise identical, and a recently added compile-time error with openMP was fixed. * Update MOM.F90 Fixed Alistair's embarrassing error. * Dimensional rescaling in MOM_open_boundary.F90 Added rescaling for dimensional consistency testing in MOM_open_boundary.F90, including splitting variables with different units that had previously shared the same variable and adding more extensive documentation of variables. Also changed the dimensions of the timesteps passed to radiation_open_bdry_conds and update_segment_tracer_reservoirs to [T] and added vertical_grid_type and unit_scale_type arguments to open_boundary_init and open_boundary_test_extern_h. All answers are bitwise identical, although some probably bugs have been noted in comments and there are new or altered arguments to several routines. * (*)Fixed invariance bugs in MOM_open_boundary.F90 Corrected dimensional consistency bugs in update_segment_tracer_reservoirs and horizontal indexing and related bugs in gradient_at_q_points with oblique_grad OBCs. These will both change answers in test cases that use some open boundary condition options, but not in any of the MOM6-examples test cases.
This reverts commit 855d706.
Added additional domain routine interfaces that are needed by SIS2, including the new function same_domain, which tests whether two domains use the same layout and conforming computational domain sizes, and a new 4d-array variant of redistribute_array because SIS2 uses thickness categories as a 4th dimension. All answers are bitwise identical.
Remove inappropriate timelevel arguments
- previous code had averaging instead of summation for SPP (x:sum,y:point,z:point) diagnostics - corrects an issue where these diagnostics were incorrect by approximately a factor of 2. - Orginially found when analyzing the depth-integrated temperature advection diagnostic (T_ady_2d)
MOM_hor_visc: Variables moved to stack
This patch restructures the CorAdCalc function so that the loops are more easily vectorized on a broader range of systems. The number of memory access has also been slightly reduced. We observed a 1.75x speedup on a modern consumer AMD processor (Ryzen 5 2600) and a 1.24x speedup on Gaea's Intel Xeons (E5-2697 v4). Description There are two major changes: - An if-block testing for `Area_q` was removed, and the `h_neglect * Area_q` term was replaced with a new `vol_neglect` term. This term is intended to prevent division by zero when the hArea_q is zero. Otherwise, it is meant to be below roundoff and have no impact on the calculation. Previously, a zero value of Area_q would force a division by zero. Using vol_neglect ensures that the denominator is always nonzero. The value is set to use `H_subroundoff` times an area of 0.1 mm2, suggested by Robert Hallberg as a hypothetical Kolmogorov scale. Numerical results are intended to be independent of this choice. - Two separated loops associated with the bounded Coriolis term were combined into a single loop, which reduced both the number of internal if-blocks and avoided redundant memory load/stores. Other if-blocks inside of do-loops were moved outside of the loops. I can provide two potential explanations for the difference in Intel and AMD performance: * Masking instructions have a lower latency on Intel CPUs, which permit limited vectorization of if-blocks. Similar vectorization is not possible on AMD CPUs. So Intel is less likely to benefit from if-block re-ordering. * Our Intel nodes on Gaea have a lower RAM bandwidth, and see a smaller benefit from vectorization, which must required greater bandwidth. This speedup may be greater on a more modern Intel platform. Although the code has been vectorized on both Intel and AMD platforms, there are still many memory accesses per operation, which is limiting performance. The changes below are not expected to change any answers, and none were detected. But since we are changing a core component, I'd suggest reviewing this carefully. Sample timings are provided below. Runtime measurements -------------------- AMD Before: (Ocean Coriolis & mom advection) 1.091571 (Ocean Coriolis & mom advection) 1.086183 (Ocean Coriolis & mom advection) 1.091197 (Ocean Coriolis & mom advection) 1.087709 (Ocean Coriolis & mom advection) 1.086990 AMD After: (Ocean Coriolis & mom advection) 0.619346 (Ocean Coriolis & mom advection) 0.624106 (Ocean Coriolis & mom advection) 0.625438 (Ocean Coriolis & mom advection) 0.630169 (Ocean Coriolis & mom advection) 0.621736 ---- Intel Before: (Ocean Coriolis & mom advection) 0.981367 (Ocean Coriolis & mom advection) 0.982316 (Ocean Coriolis & mom advection) 0.986633 (Ocean Coriolis & mom advection) 0.981260 (Ocean Coriolis & mom advection) 0.982810 Intel After: (Ocean Coriolis & mom advection) 0.788747 (Ocean Coriolis & mom advection) 0.797684 (Ocean Coriolis & mom advection) 0.786874 (Ocean Coriolis & mom advection) 0.792120 (Ocean Coriolis & mom advection) 0.795373
…03-26 dev/gfdl main candidate 2021-03-26
- Fixes for SSP (x:sum;y:sum,z:point) and PSP (x:point,y:sum,z:point) diagnostics - Removed unused `total_weight` arrays in these cases
Corrected comments describing the various CT_copy_data routines, following suggestions in a review by Keith Lindsay. All answers are bitwise identical.
…to SIS2_infra_support
+Add infrastructure interfaces needed by SIS2
Fixed downsampling for x:sum y:point z:point diags
Coriolis: Improved coradcalc vectorization
Added support for new IO capabilities that are needed by SIS2 to use the MOM6 framework and infrastructure code, but should also be useful within MOM6 itself. These new capabilities include writing global attributes to files, using create_file named axes that are not derived from a MOM6 grid type, and new options and elements in the vardesc type to support a wider array of axes and to provide the position of the grid staggering via an integer position variable instead of the short character strings that had been used. As a part of this commit, there are the new opaques type axis_info and attribute_info, and the new routines set_axis_info, delete_axis_info, set_attribute_info and delete_attribute_info to facilitate these new capabilities, as well as the publicly visible function position_from_horgrid to translate the vardesc%hor_grid character strings into the integer position flag used elsewhere in the MOM6 and FMS codes. Within the MOM_io_infra, there is a new variant of the overloaded interface write_meta to handle writing global attributes. There are also two new optional arguments to create_file and reopen_file, and two new optional arguments to var_desc, modify_vardesc, and query_vardesc. All answers and output are bitwise identical.
Restructured the code slightly so that the output files that are generated when input_filename = 'F' is exactly the same as if it is 'n' if there are no restart files in the restart_input_dir, or as if it is 'r' if the restart files are there. Previously, the solutions with 'F' worked this way, but no ocean_geometry.nc or Vertical_grid.nc files were written when WRITE_GEOM=1, regardless of the presence or absence of the restart files, and the MOM_parameter_doc.all files differed slightly between the 'n' and 'F' or 'r' cases. As a part of these changes, the determination of whether this is a new run is moved earlier in the algorithm, and now sits outside of MOM_initialize_state. All solutions are bitwise identical, but there are changes in the position of the PARALLEL_RESTARTFILES and REFERENCE_HEIGHT entries in most MOM_parameter_doc.all files.
+Enhanced support for novel axes in MOM_io
Change the types returned from the 5 rotated_field_chksum from integer to integer(kind=int64), so that the full 64-byte checksums are returned. Without this change, the checksums that are written to MOM6 restart files or interpreted from them are truncated to the latter half of their length. This changes the checksums that are written to the restart files, but both before and after this change the values that are written are the same as those that are generated after reading the restart with the same version of the code. The code can run across this change by setting RESTART_CHECKSUMS_REQUIRED = False for the run segment where the transition occurs. The solutions themselves are bitwise identical.
Changed the declarations of the vardesc and fields arrays to allocatable in write_ocean_geometry_files, primarily to get one of the TC test cases to run properly with the gcc compiler by shifting the memory for these arrays from stack to heap. The reason why this change works is not clear. Some comments describing these variables were also added. All answers are bitwise identical.
+More consistent treatment of input_filename = 'F'
(*)Write full checksums to restarts
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
No description provided.