From 4291733909552f33bfbb25a9ba61469f424e8e30 Mon Sep 17 00:00:00 2001 From: ashjbarnes Date: Fri, 12 Apr 2024 19:41:59 +0200 Subject: [PATCH 01/24] Include section on file structure in docs --- docs/file-structure.md | 57 ++++++++++++++++++++++++++++++++++++++++++ docs/index.rst | 1 + 2 files changed, 58 insertions(+) create mode 100644 docs/file-structure.md diff --git a/docs/file-structure.md b/docs/file-structure.md new file mode 100644 index 00000000..8213ed4c --- /dev/null +++ b/docs/file-structure.md @@ -0,0 +1,57 @@ +MOM6 file structure +============ + +This section describes all of the various files that `MOM6_regional` produces, and explains how they fit in. A better understanding of what these files do will help with troubleshooting and more advanced customisation + +## The run directory + +This folder, specified by the `mom_run_dir` path that's given to the `experiment` class, contains only text files that configure MOM6 and are read at model initialisation. You can see examples of these files in the pre-made run directories folder. In no particular order, these files are + +`input.nml` +This contains high level information to be passed directly to each component of your MOM6 setup. Here you'll the paths to the `SIS` and `MOM` input directories, and outputs. Importantly, the `coupler` section turns on or off different model components, and specifies how long to run the experiment for. + +`diag_table` +The diagnostics to save from your model. You can't keep everything! Consider the things that are most important for your experiment - you can fill up disk space very fast if you save too much. Different lines in the diagnostic table either specify a new output *file* and its associated characteristics, or a new output *variable* and the matching file that it should go in (which needs to already have been specified!). If you're not sure which diagnostics to pick, you can run the model for 1 hour and look in the output folder. Here there'll be a file called `available_diags` which lists every possible diagnostic for your model configuration. Here it will also tell you which grids you're allowed to output them on. Aside from the native model grid, you can create your own set of vertical coordinates. To output on your custom vertical coordinate, create a netcdf containing all of the vertical points (be they depths or densities) go to the `MOM_input` file and specify additional diagnostic coordinates there. Then, you can pick these coordinates in the diag table. + +Documentation as to how to format the file can be found [here](https://mom6.readthedocs.io/en/dev-gfdl/api/generated/pages/Diagnostics.html). + +`data_table` +The data table is read by the coupler to provide different model components with inputs. For example, for our ocean only model runs, atmospheric forcing data ought to be provided. This can either be a constant value, or a dataset as in the reanalysis-forced demo. As more model components are included, the data table has less of a role to play. However, unless you want risk freezing or boiling the ocean you'll usually need to provide solar fluxes at a minimum! + +Documentation as to how to format the file can be found [here](https://mom6.readthedocs.io/en/dev-gfdl/forcing.html). + +`MOM_input / SIS_input` +These files provide the basic settings for the core MOM and SIS code. The settings themselves are reasonably well documented. 
After running the experiment for a short amount of time, you can find a `MOM_parameter_doc.all` file which lists every possible setting your can modify for your experiment. The MOM_regional package can copy and modify a default set of input files to work with your experiment. There's too much in these files to explain here. The aforementioned vertical diagnostic coordinates are specified here, as are all of the different parameterisation schemes and hyperparameters used by the model. Some really important ones are the timesteps which will likely need to be fiddled with to get your model running quickly but stably. However, it can be more helpful to specify these in the `MOM_override` file instead. + +Another important part section for regional modelling is the specification of open boundary segments. You need to include a separate line for each boundary in your domain, and specify any additional tracers that need be included. + +`MOM_override` +This file serves to override settings chosen in other input files. This is helpful for making a distinction between the thing you're fiddling with and 95% of the settings that you'll always be leaving alone. For instance, you might need to temporarily change your baroclinic (`DT`), thermal (`DT_THERM`) or baroclinic (`DT_BT`) timesteps, or are doing perturbation experiments that requires you to switch between different bathymetry files. + +`config file` +This file is machine dependent and environment dependent. For instance, if you're using Australia's National Computational Infrastructure (NCI), then you're likely using the `payu` framework, and you'll have a `config.yml` file. Regardless of what it looks like, this file should contain information that points to the executable, your input directory (aka the `mom_input_dir` you specified), the computational resources you'd like to request and other various settings. + +The package does come with a premade `config.yml` file for payu users which is automatically copied and modified when the appropriate flag is passed to the `setup_rundir` method. If you find this package useful and you use a different machine, I'd encourage you to provide an example config file for your institution! Then this could be copied into. + + +## The run directory +This is the folder referred to by the `mom_input_dir` path. Here we have mostly NetCDF files that are read by MOM6 at runtime. These files can be big, so it's usually helpful to store them somewhere where disk space isn't an issue. + +`hgrid.nc` +This is the horizontal grid that the model runs on. Known as the 'supergrid', it contains twice as many x and y points as you might expect. This is because *all* points on the Arakawa C grid are included. Since you're running a regional experiment, you'll be using the 'symmetric memory' configuration of the MOM6 executable. This means that the horizontal grids boundary must be entirely composed of cell edge points (like those used by velocities). So, if you have a model with 300 x cells, the `nx` dimension will be 601 wide. + +The `nx` and `ny` points are where data is stored, whereas `nxp` and `nyp` here define the spaces between points used to compute area. The x and y variables in `hgrid` refer to the longitude and latitude. Importantly, x and y both depend on `nyx` and `nyp` meaning that the grid doesn't have to follow lines of constant latitude or longitude. If you make your own custom horizontal and vertical grids, you can simply set `read_existing_grid` to `True` when creating the experiment object. 
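A quick way to convince yourself of the supergrid layout is to open `hgrid.nc` and count points. Below is a minimal sketch using `xarray` (not part of this package); the file path and the 300-cell example are illustrative, and the 2N + 1 relation is the one described above.

```python
import xarray as xr

# Open the supergrid written to the input directory (path is illustrative).
hgrid = xr.open_dataset("hgrid.nc")

# Every Arakawa C-grid location is included, so a domain with N tracer cells
# in x should have 2*N + 1 supergrid points along that axis
# (e.g., 300 cells -> 601 points, as described above).
n_cells_x = 300             # illustrative cell count
print(2 * n_cells_x + 1)    # expected point count across the domain in x

print(hgrid.sizes)          # lengths of the nx/nxp and ny/nyp dimensions
print(hgrid["x"].dims)      # longitude is two-dimensional: it varies in both y and x
```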
+ +`vcoord.nc` +This specifies the values of the vertical coordinate. By default this package sets up a `z*` vertical coordinate but others can be provided and the `MOM_input` file adjusted appropriately. If you want to customise the vertical coordinate, you can initialise an `experiment` object to begin with, then modify and re-save the `vcoord.nc`. You can provide more vertical coordinates (giving them a different name of course) for diagnostic purposes. These allow your diagnostics to be remapped and output on this coordinate at runtime. + +`bathymetry.nc` +Fairly self explanatory, but can be the source of some difficulty. The package automatically attempts to remove "non advective cells". These are small enclosed lakes at the boundary that can cause numerical problems whereby water might flow in but have no way to flow out. Likewise, there can be issues with very shallow (only 1 or 2 layers) or very narrow (1 cell wide) channels. If your model runs for a while but then gets extreme sea surface height values, it could be caused by an unlucky combination of boundary and bathymetry. + +Another thing to note is that the bathymetry interpolation can be computationally intensive. If using a high resolution dataset like GEBCO and a large domain, you might not be able to execute the `.setup_bathymetry` method in a Jupyter notebook if such notebooks have restricted computational capacity. Instructions for running the interpolation via `mpirun` are printed on execution of the `.setup_bathymetry` method in case this is an issue. + +`forcing/init_*.nc` +These are the initial conditions bunched into velocities, tracers and the free surface height (`eta`). + +`forcing/forcing_segment*` +These are the boundary forcing segments, numbered the same way as in MOM_input. The dimensions and coordinates are fairly confusing, and getting them wrong can likewise cause some cryptic error messages! These boundaries don't have to follow lines of constant longitude and latitude, but it is much easier to set things up if they do. For an example of a curved boundary, see this [Northwest Atlantic experiment](https://github.com/jsimkins2/nwa25/tree/main). \ No newline at end of file diff --git a/docs/index.rst b/docs/index.rst index 2c0144eb..9588a97d 100644 --- a/docs/index.rst +++ b/docs/index.rst @@ -12,6 +12,7 @@ configurations for the `Modular Ocean Model 6`_. installation demos + file-structure api contributing From 87cfc589f7aa7beb51a3fd971a4adb31d2d545b9 Mon Sep 17 00:00:00 2001 From: ashjbarnes Date: Fri, 12 Apr 2024 20:06:54 +0200 Subject: [PATCH 02/24] update readme --- README.md | 19 ++++++++++--------- 1 file changed, 10 insertions(+), 9 deletions(-) diff --git a/README.md b/README.md index 75b85152..4fc69c50 100644 --- a/README.md +++ b/README.md @@ -8,7 +8,15 @@ Users just need to provide some information about where, when, and how big their The idea behind this package is that it should the user sidestep some of the tricky issues with getting the model to run in the first place. This removes some of the steep learning curve for people new to working with the model. Note that the resultant model configuration might still need some tweaking (e.g., fiddling with timestep to avoid CFL-related numerical stability issues or fiddling with bathymetry to deal with very narrow fjords or channels that may exist). -Limitations: Currently the package supports only one horizontal grid type (that is equally spaced in longitude); there are plans to add more grid options. 
We have designed the package in a way that it is modular so, for example, one needs to implement just another method for a different type of grid and the rest should be good to go. +**Features** +- Automatic grid generation at your chosen vertical and horizontal grid spacing +- Finds and removes non-advective cells from your bathymetry that cause the model to crash +- Handles the slicing across 'seams' in your input datasets (eg. at 0,360 or -180,180) +- Handles metadata encoding +- Modifies pre-made configuration files to match your experiment +- Handles interpolation and interpretation of input data. Limited pre-processing of your forcing data required! + +Limitations: Currently the package only comes with one function for generating a horizontal grid, namely one that's equally spaced in longitude and latitude. However, users can BYO a grid, or ideally open a PR with their desired grid generation function and we'll include it as an option! Further, only boundary segments parallel to longitude or latitude lines are currently supported. If you find this package useful and have any suggestions please feel free to open an [issue](https://github.com/COSIMA/regional-mom6/issues) or a [discussion](https://github.com/COSIMA/regional-mom6/discussions). We'd love to have [new contributors](https://regional-mom6.readthedocs.io/en/latest/contributing.html) and we are very keen to help you out along the way! @@ -60,8 +68,7 @@ pip install git+https://github.com/COSIMA/regional-mom6.git@061b0ef80c7cbc04de05 ## MOM6 Configuration and Version Requirements -The package and demos assume a coupled SIS2-MOM6 configuration. -The examples could work for an ocean-only MOM6 run but this has not been tested. +The package and demos assume a coupled SIS2-MOM6 configuration, but also works for a MOM6 ocean-only case too given appropriate changes to the `input.nml` and `MOM_input` files. For regional models, the executable must always be compiled with *symmetric* memory. The current release of this package assumes the latest source code of all components needed to run MOM6 as of January 2024. A forked version of the [`setup-mom6-nci`](https://github.com/ashjbarnes/setup-mom6-nci) repository @@ -80,9 +87,3 @@ Please ensure that you can get at least one of these working on your setup with You can download the notebooks [from Github](https://github.com/COSIMA/regional-mom6/tree/ncc/installation/demos) or by clicking on the download download button, e.g., at the top-right of the [regional tasmania forced by ERA5 example](https://regional-mom6.readthedocs.io/en/latest/demo_notebooks/reanalysis-forced.html). -**Note** - -The `xesmf` the package attempts to regrid in parallel, and if it's not able to do so, it throws a warning and -runs in serial. You also get a print out of the relevant `mpirun` command which you could use as a backup. -Depending on your setup of your machine, you may need to write scripts that implement the package to access more -computational resources than might be available, e.g., on the HPC machine of you are working on. From 8af104b2c6f494627e1a9da358d4ee0cec487825 Mon Sep 17 00:00:00 2001 From: Ashley Barnes <53282288+ashjbarnes@users.noreply.github.com> Date: Sun, 14 Apr 2024 12:02:44 +0200 Subject: [PATCH 03/24] Update README.md Co-authored-by: Navid C. 
Constantinou --- README.md | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/README.md b/README.md index affcd4d3..0809c663 100644 --- a/README.md +++ b/README.md @@ -68,7 +68,9 @@ pip install git+https://github.com/COSIMA/regional-mom6.git@061b0ef80c7cbc04de05 ## MOM6 Configuration and Version Requirements -The package and demos assume a coupled SIS2-MOM6 configuration, but also works for a MOM6 ocean-only case too given appropriate changes to the `input.nml` and `MOM_input` files. For regional models, the executable must always be compiled with *symmetric* memory. +The package and demos assume a coupled MOM6-SIS2 configuration, but also work for MOM6 ocean-only configuration after appropriate changes in the `input.nml` and `MOM_input` files. + +Additionally, regional configurations require that the MOM6 executable _must_ be compiled with **symmetric memory**. The current release of this package assumes the latest source code of all components needed to run MOM6 as of January 2024. A forked version of the [`setup-mom6-nci`](https://github.com/ashjbarnes/setup-mom6-nci) repository From f587aafa4b812512668ba7eed1e21a8c3a531d78 Mon Sep 17 00:00:00 2001 From: Ashley Barnes <53282288+ashjbarnes@users.noreply.github.com> Date: Sun, 14 Apr 2024 12:03:26 +0200 Subject: [PATCH 04/24] Update README.md Co-authored-by: Navid C. Constantinou --- README.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/README.md b/README.md index 0809c663..c7ed0147 100644 --- a/README.md +++ b/README.md @@ -9,7 +9,7 @@ Users just need to provide some information about where, when, and how big their The idea behind this package is that it should let the user sidestep some of the tricky issues with getting the model to run in the first place. This removes some of the steep learning curve for people new to working with MOM6. Note that the resultant model configuration might still need some tweaking (e.g., fiddling with timestep to avoid CFL-related numerical stability issues or fiddling with bathymetry to deal with very narrow fjords or channels that may exist). **Features** -- Automatic grid generation at your chosen vertical and horizontal grid spacing +- Automatic grid generation at chosen vertical and horizontal grid spacing. - Finds and removes non-advective cells from your bathymetry that cause the model to crash - Handles the slicing across 'seams' in your input datasets (eg. at 0,360 or -180,180) - Handles metadata encoding From 27c4750c665a95403b4c7a1fda4bb3d303223227 Mon Sep 17 00:00:00 2001 From: Ashley Barnes <53282288+ashjbarnes@users.noreply.github.com> Date: Sun, 14 Apr 2024 12:03:34 +0200 Subject: [PATCH 05/24] Update README.md Co-authored-by: Navid C. Constantinou --- README.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/README.md b/README.md index c7ed0147..6758c9ca 100644 --- a/README.md +++ b/README.md @@ -10,7 +10,7 @@ The idea behind this package is that it should let the user sidestep some of the **Features** - Automatic grid generation at chosen vertical and horizontal grid spacing. -- Finds and removes non-advective cells from your bathymetry that cause the model to crash +- Automatic removal of non-advective cells from the bathymetry that cause the model to crash. - Handles the slicing across 'seams' in your input datasets (eg. 
at 0,360 or -180,180) - Handles metadata encoding - Modifies pre-made configuration files to match your experiment From 9d8473a068be2fd915ebbcb605d860c93df84032 Mon Sep 17 00:00:00 2001 From: Ashley Barnes <53282288+ashjbarnes@users.noreply.github.com> Date: Sun, 14 Apr 2024 12:03:51 +0200 Subject: [PATCH 06/24] Update README.md Co-authored-by: Navid C. Constantinou --- README.md | 1 + 1 file changed, 1 insertion(+) diff --git a/README.md b/README.md index 6758c9ca..2fd0e2ce 100644 --- a/README.md +++ b/README.md @@ -9,6 +9,7 @@ Users just need to provide some information about where, when, and how big their The idea behind this package is that it should let the user sidestep some of the tricky issues with getting the model to run in the first place. This removes some of the steep learning curve for people new to working with MOM6. Note that the resultant model configuration might still need some tweaking (e.g., fiddling with timestep to avoid CFL-related numerical stability issues or fiddling with bathymetry to deal with very narrow fjords or channels that may exist). **Features** + - Automatic grid generation at chosen vertical and horizontal grid spacing. - Automatic removal of non-advective cells from the bathymetry that cause the model to crash. - Handles the slicing across 'seams' in your input datasets (eg. at 0,360 or -180,180) From bb7ee1550dc3cb989d45126eb9d4a8e1ac0192ba Mon Sep 17 00:00:00 2001 From: Ashley Barnes <53282288+ashjbarnes@users.noreply.github.com> Date: Sun, 14 Apr 2024 12:04:16 +0200 Subject: [PATCH 07/24] Update README.md Co-authored-by: Navid C. Constantinou --- README.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/README.md b/README.md index 2fd0e2ce..cb0d5f7b 100644 --- a/README.md +++ b/README.md @@ -12,7 +12,7 @@ The idea behind this package is that it should let the user sidestep some of the - Automatic grid generation at chosen vertical and horizontal grid spacing. - Automatic removal of non-advective cells from the bathymetry that cause the model to crash. -- Handles the slicing across 'seams' in your input datasets (eg. at 0,360 or -180,180) +- Handle slicing across 'seams' in of the forcing input datasets (e.g., when the regional configuration spans the longitude 180 of a global dataset that spans [-180, 180]). - Handles metadata encoding - Modifies pre-made configuration files to match your experiment - Handles interpolation and interpretation of input data. Limited pre-processing of your forcing data required! From bf1568238be9cfd8e71aaca596c4dfa38c88ee57 Mon Sep 17 00:00:00 2001 From: Ashley Barnes <53282288+ashjbarnes@users.noreply.github.com> Date: Sun, 14 Apr 2024 12:04:27 +0200 Subject: [PATCH 08/24] Update README.md Co-authored-by: Navid C. Constantinou --- README.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/README.md b/README.md index cb0d5f7b..388a9cc5 100644 --- a/README.md +++ b/README.md @@ -13,7 +13,7 @@ The idea behind this package is that it should let the user sidestep some of the - Automatic grid generation at chosen vertical and horizontal grid spacing. - Automatic removal of non-advective cells from the bathymetry that cause the model to crash. - Handle slicing across 'seams' in of the forcing input datasets (e.g., when the regional configuration spans the longitude 180 of a global dataset that spans [-180, 180]). -- Handles metadata encoding +- Handles metadata encoding. 
- Modifies pre-made configuration files to match your experiment - Handles interpolation and interpretation of input data. Limited pre-processing of your forcing data required! From bcac87b44042e5965de1ad4e80d10485347c55ce Mon Sep 17 00:00:00 2001 From: "Navid C. Constantinou" Date: Sun, 14 Apr 2024 14:52:22 +0300 Subject: [PATCH 09/24] Update README.md Co-authored-by: Ashley Barnes <53282288+ashjbarnes@users.noreply.github.com> --- README.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/README.md b/README.md index 388a9cc5..730a0dbb 100644 --- a/README.md +++ b/README.md @@ -17,7 +17,7 @@ The idea behind this package is that it should let the user sidestep some of the - Modifies pre-made configuration files to match your experiment - Handles interpolation and interpretation of input data. Limited pre-processing of your forcing data required! -Limitations: Currently the package only comes with one function for generating a horizontal grid, namely one that's equally spaced in longitude and latitude. However, users can BYO a grid, or ideally open a PR with their desired grid generation function and we'll include it as an option! Further, only boundary segments parallel to longitude or latitude lines are currently supported. +Limitations: Currently the package only comes with one function for generating a horizontal grid, namely one that's equally spaced in longitude and latitude. However, users can provide their own grid, or ideally open a PR with their desired grid generation function and we'll include it as an option! Further, only boundary segments parallel to longitude or latitude lines are currently supported. If you find this package useful and have any suggestions please feel free to open an [issue](https://github.com/COSIMA/regional-mom6/issues) or a [discussion](https://github.com/COSIMA/regional-mom6/discussions). We'd love to have [new contributors](https://regional-mom6.readthedocs.io/en/latest/contributing.html) and we are very keen to help you out along the way! From 12e4de77e435b32efb26551e49573205b5d7952a Mon Sep 17 00:00:00 2001 From: "Navid C. Constantinou" Date: Tue, 16 Apr 2024 10:32:13 +0300 Subject: [PATCH 10/24] more specific features --- README.md | 12 ++++++++---- 1 file changed, 8 insertions(+), 4 deletions(-) diff --git a/README.md b/README.md index 730a0dbb..8d8c2247 100644 --- a/README.md +++ b/README.md @@ -8,20 +8,23 @@ Users just need to provide some information about where, when, and how big their The idea behind this package is that it should let the user sidestep some of the tricky issues with getting the model to run in the first place. This removes some of the steep learning curve for people new to working with MOM6. Note that the resultant model configuration might still need some tweaking (e.g., fiddling with timestep to avoid CFL-related numerical stability issues or fiddling with bathymetry to deal with very narrow fjords or channels that may exist). -**Features** + +## Features - Automatic grid generation at chosen vertical and horizontal grid spacing. - Automatic removal of non-advective cells from the bathymetry that cause the model to crash. - Handle slicing across 'seams' in of the forcing input datasets (e.g., when the regional configuration spans the longitude 180 of a global dataset that spans [-180, 180]). - Handles metadata encoding. -- Modifies pre-made configuration files to match your experiment -- Handles interpolation and interpretation of input data. Limited pre-processing of your forcing data required! 
+- Creates directory structure with the configuration files as expected by MOM6. +- Handles interpolation and interpretation of input data. No pre-processing of forcing datasets is required. (In some cases, slicing the forcing dataset before helps with hitting limitations related to the machine's available memory.) Limitations: Currently the package only comes with one function for generating a horizontal grid, namely one that's equally spaced in longitude and latitude. However, users can provide their own grid, or ideally open a PR with their desired grid generation function and we'll include it as an option! Further, only boundary segments parallel to longitude or latitude lines are currently supported. If you find this package useful and have any suggestions please feel free to open an [issue](https://github.com/COSIMA/regional-mom6/issues) or a [discussion](https://github.com/COSIMA/regional-mom6/discussions). We'd love to have [new contributors](https://regional-mom6.readthedocs.io/en/latest/contributing.html) and we are very keen to help you out along the way! + ## What you need to get started: + 1. a cool idea for a new regional MOM6 domain, 2. a working MOM6 executable on a machine of your choice, 3. a bathymetry file that at least covers your domain, @@ -31,6 +34,7 @@ If you find this package useful and have any suggestions please feel free to ope Check out the [documentation](https://regional-mom6.readthedocs.io/en/latest/) and browse through the [demos](https://regional-mom6.readthedocs.io/en/latest/demos.html). + ## Installation We can install `regional_mom6` via `pip` from GitHub. A prerequisite is the binary `esmpy` @@ -81,8 +85,8 @@ directory lists the particular commits that were used to compile MOM6 and its su Note that the commits used for MOM6 submodules (e.g., Flexible Modelling System (FMS), coupler, SIS2) are _not_ necessarily those used by the GFDL's [`MOM6_examples`](https://github.com/NOAA-GFDL/MOM6-examples) repository. -## Getting started +## Getting started The [example notebooks](https://regional-mom6.readthedocs.io/en/latest/demos.html) walk you through how to use the package using two different sets of input datasets. From d4e4f2b0800a4ded0f351fc3fe9de88d6af978e3 Mon Sep 17 00:00:00 2001 From: "Navid C. Constantinou" Date: Tue, 16 Apr 2024 10:50:51 +0300 Subject: [PATCH 11/24] bit of cleanup and formatting --- docs/file-structure.md | 19 ++++++++++++------- 1 file changed, 12 insertions(+), 7 deletions(-) diff --git a/docs/file-structure.md b/docs/file-structure.md index 8213ed4c..8c0d6be6 100644 --- a/docs/file-structure.md +++ b/docs/file-structure.md @@ -1,17 +1,22 @@ MOM6 file structure ============ -This section describes all of the various files that `MOM6_regional` produces, and explains how they fit in. A better understanding of what these files do will help with troubleshooting and more advanced customisation +This section describes the various directories and files that `regional-mom6` package produces. +A better understanding of what these files do will help with troubleshooting and more advanced customisations. -## The run directory +## The `run` directory -This folder, specified by the `mom_run_dir` path that's given to the `experiment` class, contains only text files that configure MOM6 and are read at model initialisation. You can see examples of these files in the pre-made run directories folder. 
In no particular order, these files are +The directory, specified by the `mom_run_dir` path keyword argument in the `experiment` class, contains only text files that configure MOM6 and are used at model initialisation. +You can see examples of these files in the `premade_run_directories`. +These files are: -`input.nml` -This contains high level information to be passed directly to each component of your MOM6 setup. Here you'll the paths to the `SIS` and `MOM` input directories, and outputs. Importantly, the `coupler` section turns on or off different model components, and specifies how long to run the experiment for. +* `input.nml`: + High-level information that is passed directly to each component of your MOM6 setup. + The paths of to the `SIS` and `MOM` input directories and outputs are included. + The `coupler` section turns on or off different model components, and specifies how long to run the experiment for. -`diag_table` -The diagnostics to save from your model. You can't keep everything! Consider the things that are most important for your experiment - you can fill up disk space very fast if you save too much. Different lines in the diagnostic table either specify a new output *file* and its associated characteristics, or a new output *variable* and the matching file that it should go in (which needs to already have been specified!). If you're not sure which diagnostics to pick, you can run the model for 1 hour and look in the output folder. Here there'll be a file called `available_diags` which lists every possible diagnostic for your model configuration. Here it will also tell you which grids you're allowed to output them on. Aside from the native model grid, you can create your own set of vertical coordinates. To output on your custom vertical coordinate, create a netcdf containing all of the vertical points (be they depths or densities) go to the `MOM_input` file and specify additional diagnostic coordinates there. Then, you can pick these coordinates in the diag table. +* `diag_table`: + The diagnostics to save from your model. You can't keep everything! Consider the things that are most important for your experiment - you can fill up disk space very fast if you save too much. Different lines in the diagnostic table either specify a new output *file* and its associated characteristics, or a new output *variable* and the matching file that it should go in (which needs to already have been specified!). If you're not sure which diagnostics to pick, you can run the model for 1 hour and look in the output folder. Here there'll be a file called `available_diags` which lists every possible diagnostic for your model configuration. Here it will also tell you which grids you're allowed to output them on. Aside from the native model grid, you can create your own set of vertical coordinates. To output on your custom vertical coordinate, create a netcdf containing all of the vertical points (be they depths or densities) go to the `MOM_input` file and specify additional diagnostic coordinates there. Then, you can pick these coordinates in the diag table. Documentation as to how to format the file can be found [here](https://mom6.readthedocs.io/en/dev-gfdl/api/generated/pages/Diagnostics.html). From 38882123ab5ef427a1dc17fba019b331efad219a Mon Sep 17 00:00:00 2001 From: "Navid C. 
Constantinou" Date: Tue, 16 Apr 2024 10:59:28 +0300 Subject: [PATCH 12/24] bit of cleanup and formatting --- docs/file-structure.md | 22 +++++++++++++++------- 1 file changed, 15 insertions(+), 7 deletions(-) diff --git a/docs/file-structure.md b/docs/file-structure.md index 8c0d6be6..594c96ff 100644 --- a/docs/file-structure.md +++ b/docs/file-structure.md @@ -16,17 +16,25 @@ These files are: The `coupler` section turns on or off different model components, and specifies how long to run the experiment for. * `diag_table`: - The diagnostics to save from your model. You can't keep everything! Consider the things that are most important for your experiment - you can fill up disk space very fast if you save too much. Different lines in the diagnostic table either specify a new output *file* and its associated characteristics, or a new output *variable* and the matching file that it should go in (which needs to already have been specified!). If you're not sure which diagnostics to pick, you can run the model for 1 hour and look in the output folder. Here there'll be a file called `available_diags` which lists every possible diagnostic for your model configuration. Here it will also tell you which grids you're allowed to output them on. Aside from the native model grid, you can create your own set of vertical coordinates. To output on your custom vertical coordinate, create a netcdf containing all of the vertical points (be they depths or densities) go to the `MOM_input` file and specify additional diagnostic coordinates there. Then, you can pick these coordinates in the diag table. + The diagnostics to save from your model run. + Choose wisely the quantities that are relevant to your experiment and the analysis you plan to do otherwise you can fill up disk space very fast. + Different lines in the * `diag_table` either specify a new output *file* and its associated characteristics, or a new output *variable* and the matching file that it should go in (which needs to already have been specified). + If uncertain regarding which diagnostics to pick, try running the model for a short period (e.g., 1 hour) and look in the output folder. + There, you'll find a file `available_diags` that lists every available diagnostic for your model configuration also mentioning which grids the quantity can be output on. + Aside from the native model grid, we can create our own custom vertical coordinates to output on. + To output on a custom vertical coordinate, create a netCDF that contains all of the vertical points (in the coordinate of your choice) and then edit the `MOM_input` file to specify additional diagnostic coordinates. + After that, we are able to select the custom vertical coordinate in the `diag_table`. -Documentation as to how to format the file can be found [here](https://mom6.readthedocs.io/en/dev-gfdl/api/generated/pages/Diagnostics.html). + Instructions for how to format the `diag_table` are included in the [MOM6 documentation](https://mom6.readthedocs.io/en/dev-gfdl/api/generated/pages/Diagnostics.html). -`data_table` -The data table is read by the coupler to provide different model components with inputs. For example, for our ocean only model runs, atmospheric forcing data ought to be provided. This can either be a constant value, or a dataset as in the reanalysis-forced demo. As more model components are included, the data table has less of a role to play. However, unless you want risk freezing or boiling the ocean you'll usually need to provide solar fluxes at a minimum! 
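To make the layout concrete: each `data_table` entry is one comma-separated line giving the component, the field name the model expects, the field name in the file, the file path, an interpolation method, and a scaling factor, with a blank file path being the usual way to supply a constant instead of a dataset. The sketch below writes a few illustrative entries from Python; the field names, paths, and values are placeholders, so check the MOM6 forcing documentation linked nearby before reusing them.

```python
from pathlib import Path

# Illustrative data_table entries only -- the field names, file paths, and
# values are placeholders, not a recommended configuration.
# Column layout (hedged; see the linked forcing docs for the authoritative format):
#   "component", "field_in_model", "field_in_file", "file_name", "interp_method", factor
example_entries = '''\
"ATM", "t_bot",  "t2m", "./forcing/2t_ERA5.nc",  "bilinear", 1.0
"ATM", "p_surf", "msl", "./forcing/msl_ERA5.nc", "bilinear", 1.0
"ATM", "sw_flux_vis_dir", "", "", "none", 150.0
'''

# Write the example table next to the other run-directory text files.
Path("data_table").write_text(example_entries)
```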
+* `data_table` + The data table is read by the coupler to provide different model components with inputs. + With more model components we need more inputs. -Documentation as to how to format the file can be found [here](https://mom6.readthedocs.io/en/dev-gfdl/forcing.html). + Instructions for how to format the `data_table` are included in the [MOM6 documentation](https://mom6.readthedocs.io/en/dev-gfdl/forcing.html). -`MOM_input / SIS_input` -These files provide the basic settings for the core MOM and SIS code. The settings themselves are reasonably well documented. After running the experiment for a short amount of time, you can find a `MOM_parameter_doc.all` file which lists every possible setting your can modify for your experiment. The MOM_regional package can copy and modify a default set of input files to work with your experiment. There's too much in these files to explain here. The aforementioned vertical diagnostic coordinates are specified here, as are all of the different parameterisation schemes and hyperparameters used by the model. Some really important ones are the timesteps which will likely need to be fiddled with to get your model running quickly but stably. However, it can be more helpful to specify these in the `MOM_override` file instead. +* `MOM_input / SIS_input` + These files provide the basic settings for the core MOM and SIS code. The settings themselves are reasonably well documented. After running the experiment for a short amount of time, you can find a `MOM_parameter_doc.all` file which lists every possible setting your can modify for your experiment. The MOM_regional package can copy and modify a default set of input files to work with your experiment. There's too much in these files to explain here. The aforementioned vertical diagnostic coordinates are specified here, as are all of the different parameterisation schemes and hyperparameters used by the model. Some really important ones are the timesteps which will likely need to be fiddled with to get your model running quickly but stably. However, it can be more helpful to specify these in the `MOM_override` file instead. Another important part section for regional modelling is the specification of open boundary segments. You need to include a separate line for each boundary in your domain, and specify any additional tracers that need be included. From 5e0eb6d560f6ffc56fb369b20848d1c2db2142c9 Mon Sep 17 00:00:00 2001 From: "Navid C. Constantinou" Date: Tue, 16 Apr 2024 11:04:11 +0300 Subject: [PATCH 13/24] bit of cleanup and formatting --- docs/file-structure.md | 45 ++++++++++++++++++++++++------------------ 1 file changed, 26 insertions(+), 19 deletions(-) diff --git a/docs/file-structure.md b/docs/file-structure.md index 594c96ff..bd114a77 100644 --- a/docs/file-structure.md +++ b/docs/file-structure.md @@ -34,37 +34,44 @@ These files are: Instructions for how to format the `data_table` are included in the [MOM6 documentation](https://mom6.readthedocs.io/en/dev-gfdl/forcing.html). * `MOM_input / SIS_input` - These files provide the basic settings for the core MOM and SIS code. The settings themselves are reasonably well documented. After running the experiment for a short amount of time, you can find a `MOM_parameter_doc.all` file which lists every possible setting your can modify for your experiment. The MOM_regional package can copy and modify a default set of input files to work with your experiment. There's too much in these files to explain here. 
The aforementioned vertical diagnostic coordinates are specified here, as are all of the different parameterisation schemes and hyperparameters used by the model. Some really important ones are the timesteps which will likely need to be fiddled with to get your model running quickly but stably. However, it can be more helpful to specify these in the `MOM_override` file instead. + Basic settings for the core MOM and SIS code with reasonably-well documentation. + After running the experiment for a short amount of time, you can find a `MOM_parameter_doc.all` file which lists every possible setting your can modify for your experiment. + The `regional-mom6` package can copy and modify a default set of input files to work with your experiment. + There's too much in these files to explain here. + The aforementioned vertical diagnostic coordinates are specified here, as are all of the different parameterisation schemes and hyperparameters used by the model. + Some of these parameters are important, e.g., the timesteps, which will likely need to be fiddled with to get your model running quickly but stably. + However, it can be more helpful to specify these in the `MOM_override` file instead. -Another important part section for regional modelling is the specification of open boundary segments. You need to include a separate line for each boundary in your domain, and specify any additional tracers that need be included. + Another important part section for regional modelling is the specification of open boundary segments. + A separate line for each boundary in our domain is included and also any additional tracers need to be specified here. -`MOM_override` -This file serves to override settings chosen in other input files. This is helpful for making a distinction between the thing you're fiddling with and 95% of the settings that you'll always be leaving alone. For instance, you might need to temporarily change your baroclinic (`DT`), thermal (`DT_THERM`) or baroclinic (`DT_BT`) timesteps, or are doing perturbation experiments that requires you to switch between different bathymetry files. +* `MOM_override` + This file serves to override settings chosen in other input files. This is helpful for making a distinction between the thing you're fiddling with and 95% of the settings that you'll always be leaving alone. For instance, you might need to temporarily change your baroclinic (`DT`), thermal (`DT_THERM`) or baroclinic (`DT_BT`) timesteps, or are doing perturbation experiments that requires you to switch between different bathymetry files. -`config file` -This file is machine dependent and environment dependent. For instance, if you're using Australia's National Computational Infrastructure (NCI), then you're likely using the `payu` framework, and you'll have a `config.yml` file. Regardless of what it looks like, this file should contain information that points to the executable, your input directory (aka the `mom_input_dir` you specified), the computational resources you'd like to request and other various settings. +* `config file` + This file is machine dependent and environment dependent. For instance, if you're using Australia's National Computational Infrastructure (NCI), then you're likely using the `payu` framework, and you'll have a `config.yml` file. Regardless of what it looks like, this file should contain information that points to the executable, your input directory (aka the `mom_input_dir` you specified), the computational resources you'd like to request and other various settings. 
-The package does come with a premade `config.yml` file for payu users which is automatically copied and modified when the appropriate flag is passed to the `setup_rundir` method. If you find this package useful and you use a different machine, I'd encourage you to provide an example config file for your institution! Then this could be copied into. + The package does come with a premade `config.yml` file for payu users which is automatically copied and modified when the appropriate flag is passed to the `setup_rundir` method. If you find this package useful and you use a different machine, I'd encourage you to provide an example config file for your institution! Then this could be copied into. ## The run directory This is the folder referred to by the `mom_input_dir` path. Here we have mostly NetCDF files that are read by MOM6 at runtime. These files can be big, so it's usually helpful to store them somewhere where disk space isn't an issue. -`hgrid.nc` -This is the horizontal grid that the model runs on. Known as the 'supergrid', it contains twice as many x and y points as you might expect. This is because *all* points on the Arakawa C grid are included. Since you're running a regional experiment, you'll be using the 'symmetric memory' configuration of the MOM6 executable. This means that the horizontal grids boundary must be entirely composed of cell edge points (like those used by velocities). So, if you have a model with 300 x cells, the `nx` dimension will be 601 wide. +* `hgrid.nc` + This is the horizontal grid that the model runs on. Known as the 'supergrid', it contains twice as many x and y points as you might expect. This is because *all* points on the Arakawa C grid are included. Since you're running a regional experiment, you'll be using the 'symmetric memory' configuration of the MOM6 executable. This means that the horizontal grids boundary must be entirely composed of cell edge points (like those used by velocities). So, if you have a model with 300 x cells, the `nx` dimension will be 601 wide. -The `nx` and `ny` points are where data is stored, whereas `nxp` and `nyp` here define the spaces between points used to compute area. The x and y variables in `hgrid` refer to the longitude and latitude. Importantly, x and y both depend on `nyx` and `nyp` meaning that the grid doesn't have to follow lines of constant latitude or longitude. If you make your own custom horizontal and vertical grids, you can simply set `read_existing_grid` to `True` when creating the experiment object. + The `nx` and `ny` points are where data is stored, whereas `nxp` and `nyp` here define the spaces between points used to compute area. The x and y variables in `hgrid` refer to the longitude and latitude. Importantly, x and y both depend on `nyx` and `nyp` meaning that the grid doesn't have to follow lines of constant latitude or longitude. If you make your own custom horizontal and vertical grids, you can simply set `read_existing_grid` to `True` when creating the experiment object. -`vcoord.nc` -This specifies the values of the vertical coordinate. By default this package sets up a `z*` vertical coordinate but others can be provided and the `MOM_input` file adjusted appropriately. If you want to customise the vertical coordinate, you can initialise an `experiment` object to begin with, then modify and re-save the `vcoord.nc`. You can provide more vertical coordinates (giving them a different name of course) for diagnostic purposes. 
These allow your diagnostics to be remapped and output on this coordinate at runtime. +* `vcoord.nc` + This specifies the values of the vertical coordinate. By default this package sets up a `z*` vertical coordinate but others can be provided and the `MOM_input` file adjusted appropriately. If you want to customise the vertical coordinate, you can initialise an `experiment` object to begin with, then modify and re-save the `vcoord.nc`. You can provide more vertical coordinates (giving them a different name of course) for diagnostic purposes. These allow your diagnostics to be remapped and output on this coordinate at runtime. -`bathymetry.nc` -Fairly self explanatory, but can be the source of some difficulty. The package automatically attempts to remove "non advective cells". These are small enclosed lakes at the boundary that can cause numerical problems whereby water might flow in but have no way to flow out. Likewise, there can be issues with very shallow (only 1 or 2 layers) or very narrow (1 cell wide) channels. If your model runs for a while but then gets extreme sea surface height values, it could be caused by an unlucky combination of boundary and bathymetry. +* `bathymetry.nc` + Fairly self explanatory, but can be the source of some difficulty. The package automatically attempts to remove "non advective cells". These are small enclosed lakes at the boundary that can cause numerical problems whereby water might flow in but have no way to flow out. Likewise, there can be issues with very shallow (only 1 or 2 layers) or very narrow (1 cell wide) channels. If your model runs for a while but then gets extreme sea surface height values, it could be caused by an unlucky combination of boundary and bathymetry. -Another thing to note is that the bathymetry interpolation can be computationally intensive. If using a high resolution dataset like GEBCO and a large domain, you might not be able to execute the `.setup_bathymetry` method in a Jupyter notebook if such notebooks have restricted computational capacity. Instructions for running the interpolation via `mpirun` are printed on execution of the `.setup_bathymetry` method in case this is an issue. + Another thing to note is that the bathymetry interpolation can be computationally intensive. If using a high resolution dataset like GEBCO and a large domain, you might not be able to execute the `.setup_bathymetry` method in a Jupyter notebook if such notebooks have restricted computational capacity. Instructions for running the interpolation via `mpirun` are printed on execution of the `.setup_bathymetry` method in case this is an issue. -`forcing/init_*.nc` -These are the initial conditions bunched into velocities, tracers and the free surface height (`eta`). +* `forcing/init_*.nc` + These are the initial conditions bunched into velocities, tracers and the free surface height (`eta`). -`forcing/forcing_segment*` -These are the boundary forcing segments, numbered the same way as in MOM_input. The dimensions and coordinates are fairly confusing, and getting them wrong can likewise cause some cryptic error messages! These boundaries don't have to follow lines of constant longitude and latitude, but it is much easier to set things up if they do. For an example of a curved boundary, see this [Northwest Atlantic experiment](https://github.com/jsimkins2/nwa25/tree/main). \ No newline at end of file +* `forcing/forcing_segment*` + These are the boundary forcing segments, numbered the same way as in MOM_input. 
The dimensions and coordinates are fairly confusing, and getting them wrong can likewise cause some cryptic error messages! These boundaries don't have to follow lines of constant longitude and latitude, but it is much easier to set things up if they do. For an example of a curved boundary, see this [Northwest Atlantic experiment](https://github.com/jsimkins2/nwa25/tree/main). From 3c6fd09f51f9fc719ab6cd217612c8599e4d75c4 Mon Sep 17 00:00:00 2001 From: "Navid C. Constantinou" Date: Tue, 16 Apr 2024 11:35:49 +0300 Subject: [PATCH 14/24] Update README.md --- README.md | 8 ++++++-- 1 file changed, 6 insertions(+), 2 deletions(-) diff --git a/README.md b/README.md index 09355f4a..1fa70909 100644 --- a/README.md +++ b/README.md @@ -24,9 +24,13 @@ The idea behind this package is that it should let the user sidestep some of the - Creates directory structure with the configuration files as expected by MOM6. - Handles interpolation and interpretation of input data. No pre-processing of forcing datasets is required. (In some cases, slicing the forcing dataset before helps with hitting limitations related to the machine's available memory.) -Limitations: Currently the package only comes with one function for generating a horizontal grid, namely one that's equally spaced in longitude and latitude. However, users can provide their own grid, or ideally open a PR with their desired grid generation function and we'll include it as an option! Further, only boundary segments parallel to longitude or latitude lines are currently supported. +## Limitations -If you find this package useful and have any suggestions please feel free to open an [issue](https://github.com/COSIMA/regional-mom6/issues) or a [discussion](https://github.com/COSIMA/regional-mom6/discussions). We'd love to have [new contributors](https://regional-mom6.readthedocs.io/en/latest/contributing.html) and we are very keen to help you out along the way! +Currently the package only supports one type of regional horizontal grid, namely one that's equally spaced in longitude and latitude. Users can provide their own grid, or ideally [open a PR](https://github.com/COSIMA/regional-mom6/pulls) with a method that implements another type of horizontal grid! Furthermore, currently only boundary segments that are parallel to either lines of constant longitude or constant latitude lines are supported. + +## We want to hear from you + +If you have any suggestions please feel free to open an [issue](https://github.com/COSIMA/regional-mom6/issues) or start a [discussion](https://github.com/COSIMA/regional-mom6/discussions). We welcome any [new contributors](https://regional-mom6.readthedocs.io/en/latest/contributing.html) and we are very keen to help you out along the way! ## What you need to get started: From c40e26acfd827ddca090ea120c3f97e2c5af7eaa Mon Sep 17 00:00:00 2001 From: "Navid C. 
Constantinou" Date: Tue, 16 Apr 2024 14:14:07 +0300 Subject: [PATCH 15/24] regional_mom6 -> regional-mom6 --- README.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/README.md b/README.md index 1fa70909..dbd512eb 100644 --- a/README.md +++ b/README.md @@ -1,4 +1,4 @@ -# regional_mom6 +# regional-mom6 *Python package for automatic generation of regional configurations for the [Modular Ocean Model 6](https://github.com/mom-ocean/MOM6).* @@ -72,7 +72,7 @@ conda install -c conda-forge esmpy Alternatively, to install `esmpy` in a Conda-free way, follow the instructions for [installing ESMPy from source](https://earthsystemmodeling.org/esmpy_doc/release/latest/html/install.html#installing-esmpy-from-source). -With `esmpy` available, we can then install `regional_mom6` via pip. (If we don't have have pip, then +With `esmpy` available, we can then install `regional-mom6` via pip. (If we don't have have pip, then `conda install pip` should do the job.) With `esmpy` installed we can now install `regional-mom6` via [`pip`](https://pypi.org/project/regional-mom6/): From e405dee76e0250560635487a0e4480b4606f840e Mon Sep 17 00:00:00 2001 From: "Navid C. Constantinou" Date: Tue, 16 Apr 2024 14:14:24 +0300 Subject: [PATCH 16/24] add some intro in Docs --- docs/index.rst | 57 ++++++++++++++++++++++++++++++++++++++++++++++++-- 1 file changed, 55 insertions(+), 2 deletions(-) diff --git a/docs/index.rst b/docs/index.rst index 9588a97d..ca93d536 100644 --- a/docs/index.rst +++ b/docs/index.rst @@ -1,10 +1,63 @@ Regional MOM6 Documentation =========================== -``regional_mom6`` is a Python package for automatic generation of regional -configurations for the `Modular Ocean Model 6`_. +*Python package for automatic generation of regional configurations for the `Modular Ocean Model 6`_.* + + +In brief... +----------- + +Users just need to provide some information about where, when, and how big their domain +is and also where raw input forcing files are. The package sorts out all the boring details +and creates a set of MOM6-friendly input files along with setup directories ready to go! + +The idea behind this package is that it should let the user sidestep some of the tricky +issues with getting the model to run in the first place. This removes some of the steep +learning curve for people new to working with MOM6. Note that the resultant model configuration +might still need some tweaking (e.g., fiddling with timestep to avoid CFL-related numerical +stability issues or fiddling with bathymetry to deal with very narrow fjords or channels that may exist). + + +Features +-------- + +- Automatic grid generation at chosen vertical and horizontal grid spacing. +- Automatic removal of non-advective cells from the bathymetry that cause the model to crash. +- Handle slicing across 'seams' in of the forcing input datasets (e.g., when the regional + configuration spans the longitude 180 of a global dataset that spans [-180, 180]). +- Handles metadata encoding. +- Creates directory structure with the configuration files as expected by MOM6. +- Handles interpolation and interpretation of input data. No pre-processing of forcing datasets + is required. (In some cases, slicing the forcing dataset before helps with hitting limitations + related to the machine's available memory.) + + +Limitations +------------ + +- Only supports one type of regional horizontal grid, namely one that's equally spaced in longitude + and latitude. 
Users can provide their own grid, or ideally `open a pull request`_ with a method + that implements another type of horizontal grid! +- Only boundary segments that are parallel to either lines of constant longitude or constant latitude + lines are supported. + + +What you need to get started +---------------------------- + +1. a cool idea for a new regional MOM6 domain, +2. a working MOM6 executable on a machine of your choice, +3. a bathymetry file that at least covers your domain, +4. 3D ocean forcing files *of any resolution* on your choice of A, B, or C Arakawa grid, +5. surface forcing files (e.g., from ERA or JRA reanalysis), and +6. `GFDL's FRE tools `_ be downloaded and compiled on the machine you are using. + +Browse through the `demos `_. + .. _Modular Ocean Model 6: https://github.com/mom-ocean/MOM6 +.. _open a pull request: https://github.com/COSIMA/regional-mom6/pulls + .. toctree:: :maxdepth: 1 From e802b5479b2dd6e70bccc50b7b524bd0d51e2070 Mon Sep 17 00:00:00 2001 From: "Navid C. Constantinou" Date: Tue, 16 Apr 2024 14:15:22 +0300 Subject: [PATCH 17/24] a primer on mom6 files --- docs/index.rst | 2 +- ...cture.md => mom6-file-structure-primer.md} | 53 +++++++++++++------ 2 files changed, 38 insertions(+), 17 deletions(-) rename docs/{file-structure.md => mom6-file-structure-primer.md} (57%) diff --git a/docs/index.rst b/docs/index.rst index ca93d536..06322066 100644 --- a/docs/index.rst +++ b/docs/index.rst @@ -65,7 +65,7 @@ Browse through the `demos `_. installation demos - file-structure + mom6-file-structure-primer api contributing diff --git a/docs/file-structure.md b/docs/mom6-file-structure-primer.md similarity index 57% rename from docs/file-structure.md rename to docs/mom6-file-structure-primer.md index bd114a77..a4dbb920 100644 --- a/docs/file-structure.md +++ b/docs/mom6-file-structure-primer.md @@ -1,10 +1,10 @@ -MOM6 file structure -============ +A primer on MOM6 file structure +=============================== -This section describes the various directories and files that `regional-mom6` package produces. -A better understanding of what these files do will help with troubleshooting and more advanced customisations. +Here we describe the various directories and files that `regional-mom6` package produces alongside with +userful insights that, hopefully, will help users deal with troubleshooting and more advanced customisations. -## The `run` directory +## `run` directory The directory, specified by the `mom_run_dir` path keyword argument in the `experiment` class, contains only text files that configure MOM6 and are used at model initialisation. You can see examples of these files in the `premade_run_directories`. @@ -18,9 +18,9 @@ These files are: * `diag_table`: The diagnostics to save from your model run. Choose wisely the quantities that are relevant to your experiment and the analysis you plan to do otherwise you can fill up disk space very fast. - Different lines in the * `diag_table` either specify a new output *file* and its associated characteristics, or a new output *variable* and the matching file that it should go in (which needs to already have been specified). + Different lines in the `diag_table` either specify a new output *file* and its associated characteristics, or a new output *variable* and the matching file that it should go in (which needs to already have been specified). If uncertain regarding which diagnostics to pick, try running the model for a short period (e.g., 1 hour) and look in the output folder. 
- There, you'll find a file `available_diags` that lists every available diagnostic for your model configuration also mentioning which grids the quantity can be output on. + There, we find a file `available_diags` that lists every available diagnostic for your model configuration also mentioning which grids the quantity can be output on. Aside from the native model grid, we can create our own custom vertical coordinates to output on. To output on a custom vertical coordinate, create a netCDF that contains all of the vertical points (in the coordinate of your choice) and then edit the `MOM_input` file to specify additional diagnostic coordinates. After that, we are able to select the custom vertical coordinate in the `diag_table`. @@ -54,24 +54,45 @@ These files are: The package does come with a premade `config.yml` file for payu users which is automatically copied and modified when the appropriate flag is passed to the `setup_rundir` method. If you find this package useful and you use a different machine, I'd encourage you to provide an example config file for your institution! Then this could be copied into. -## The run directory -This is the folder referred to by the `mom_input_dir` path. Here we have mostly NetCDF files that are read by MOM6 at runtime. These files can be big, so it's usually helpful to store them somewhere where disk space isn't an issue. +## `input` directory -* `hgrid.nc` - This is the horizontal grid that the model runs on. Known as the 'supergrid', it contains twice as many x and y points as you might expect. This is because *all* points on the Arakawa C grid are included. Since you're running a regional experiment, you'll be using the 'symmetric memory' configuration of the MOM6 executable. This means that the horizontal grids boundary must be entirely composed of cell edge points (like those used by velocities). So, if you have a model with 300 x cells, the `nx` dimension will be 601 wide. +The directory referred to by as `mom_input_dir` path that hosts mostly netCDF files that are read by MOM6 at runtime. +These files can be big, so it is usually helpful to store them somewhere without any disk limitations. - The `nx` and `ny` points are where data is stored, whereas `nxp` and `nyp` here define the spaces between points used to compute area. The x and y variables in `hgrid` refer to the longitude and latitude. Importantly, x and y both depend on `nyx` and `nyp` meaning that the grid doesn't have to follow lines of constant latitude or longitude. If you make your own custom horizontal and vertical grids, you can simply set `read_existing_grid` to `True` when creating the experiment object. +* `hgrid.nc` + The horizontal grid that the model runs on. Known as the 'supergrid', it contains twice as many points in each + horizontal dimension as one would expect from the domain extent and the chosen resolution. This is because *all* + points on the Arakawa C grid are included: both velocity and tracer points live in the 'supergrid'. For a regional + configuration, we need to use the 'symmetric memory' configuration of the MOM6 executable. This implies that the + horizontal grids boundary must be entirely composed of cell edge points (like those used by velocities). Therefore, + for example, a model configuration that is 20-degrees wide in longitude and has 0.5 degrees longitudinal resolution,would imply that it has 40 cells in the `x` dimension and thus a supergrid with `nx = 41`. 
+ + The `nx` and `ny` points are where data is stored, whereas `nxp` and `nyp` here define the spaces between points + used to compute area. The `x` and `y` variables in `hgrid` refer to the longitude and latitude. Importantly, `x` + and `y` are both two-dimensional (they both depend on `nxp` and `nyp`), meaning that the grid does not have + to follow lines of constant latitude or longitude. Users that create their own custom horizontal and vertical + grids can set `read_existing_grid` to `True` when creating an experiment. * `vcoord.nc` - This specifies the values of the vertical coordinate. By default this package sets up a `z*` vertical coordinate but others can be provided and the `MOM_input` file adjusted appropriately. If you want to customise the vertical coordinate, you can initialise an `experiment` object to begin with, then modify and re-save the `vcoord.nc`. You can provide more vertical coordinates (giving them a different name of course) for diagnostic purposes. These allow your diagnostics to be remapped and output on this coordinate at runtime. + The values of the vertical coordinate. By default, `regional-mom6` sets up a `z*` vertical coordinate but other + coordinates may be provided after appropriate adjustments in the `MOM_input` file. Users that would like to + customise the vertical coordinate can initialise an `experiment` object to begin with, then modify the `vcoord.nc` + file and save it. Users can provide additional vertical coordinates (under different names) for diagnostic purposes. + These additional vertical coordinates allow diagnostics to be remapped and output during runtime. * `bathymetry.nc` Fairly self explanatory, but can be the source of some difficulty. The package automatically attempts to remove "non advective cells". These are small enclosed lakes at the boundary that can cause numerical problems whereby water might flow in but have no way to flow out. Likewise, there can be issues with very shallow (only 1 or 2 layers) or very narrow (1 cell wide) channels. If your model runs for a while but then gets extreme sea surface height values, it could be caused by an unlucky combination of boundary and bathymetry. - Another thing to note is that the bathymetry interpolation can be computationally intensive. If using a high resolution dataset like GEBCO and a large domain, you might not be able to execute the `.setup_bathymetry` method in a Jupyter notebook if such notebooks have restricted computational capacity. Instructions for running the interpolation via `mpirun` are printed on execution of the `.setup_bathymetry` method in case this is an issue. + Another thing to note is that the bathymetry interpolation can be computationally intensive. For a high-resolution + dataset like GEBCO and a large domain, one might not be able to execute the `.setup_bathymetry` method within + a Jupyter notebook. In that case, instructions for running the interpolation via `mpirun` will be printed upon + executing the `setup_bathymetry` method. * `forcing/init_*.nc` - These are the initial conditions bunched into velocities, tracers and the free surface height (`eta`). + The initial conditions bunched into velocities, tracers, and the free surface height. * `forcing/forcing_segment*` - These are the boundary forcing segments, numbered the same way as in MOM_input. The dimensions and coordinates are fairly confusing, and getting them wrong can likewise cause some cryptic error messages!
These boundaries don't have to follow lines of constant longitude and latitude, but it is much easier to set things up if they do. For an example of a curved boundary, see this [Northwest Atlantic experiment](https://github.com/jsimkins2/nwa25/tree/main). + The boundary forcing segments, numbered the same way as in `MOM_input`. The dimensions and coordinates are fairly + confusing, and getting them wrong can likewise cause some cryptic error messages! These boundaries do not have to + follow lines of constant longitude and latitude, but it is much easier to set things up if they do. For an example + of a curved boundary, see this [Northwest Atlantic experiment](https://github.com/jsimkins2/nwa25/tree/main). From 6e128aab7bed99f0df516d0bd46d9cb076cd6ae7 Mon Sep 17 00:00:00 2001 From: "Navid C. Constantinou" Date: Tue, 16 Apr 2024 14:36:05 +0300 Subject: [PATCH 18/24] update README --- README.md | 6 +++++- 1 file changed, 5 insertions(+), 1 deletion(-) diff --git a/README.md b/README.md index dbd512eb..90911bb9 100644 --- a/README.md +++ b/README.md @@ -26,7 +26,11 @@ The idea behind this package is that it should let the user sidestep some of the ## Limitations -Currently the package only supports one type of regional horizontal grid, namely one that's equally spaced in longitude and latitude. Users can provide their own grid, or ideally [open a PR](https://github.com/COSIMA/regional-mom6/pulls) with a method that implements another type of horizontal grid! Furthermore, currently only boundary segments that are parallel to either lines of constant longitude or constant latitude lines are supported. +- Only supports one type of regional horizontal grid, namely one that's equally spaced in longitude + and latitude. Users can provide their own grid, or ideally [open a pull request](https://github.com/COSIMA/regional-mom6/pulls) with a method that implements another type of horizontal grid! +- Only boundary segments that are parallel to lines of constant longitude or constant latitude + are supported. + ## We want to hear from you From a16bfcd0ff2e5bf7c817a6a3afb6c791a4f85c9a Mon Sep 17 00:00:00 2001 From: ashjbarnes Date: Tue, 16 Apr 2024 14:57:33 +0200 Subject: [PATCH 19/24] more verbose kwargs for segment class --- regional_mom6/regional_mom6.py | 83 +++++++++++++++++----------------- 1 file changed, 42 insertions(+), 41 deletions(-) diff --git a/regional_mom6/regional_mom6.py b/regional_mom6/regional_mom6.py index aa2113bf..79cece3e 100644 --- a/regional_mom6/regional_mom6.py +++ b/regional_mom6/regional_mom6.py @@ -878,13 +878,13 @@ def rectangular_boundary( print("Processing {} boundary...".format(orientation), end="") seg = segment( - self.hgrid, - path_to_bc, # location of raw boundary - self.mom_input_dir, - varnames, - "segment_{:03d}".format(segment_number), - orientation, # orienataion - self.date_range[0], + hgrid = self.hgrid, + infile = path_to_bc, # location of raw boundary + outfolder = self.mom_input_dir, + varnames = varnames, + segment_name = "segment_{:03d}".format(segment_number), orientation=orientation, # orientation startdate=self.date_range[0], gridtype=arakawa_grid, repeat_year_forcing=self.repeat_year_forcing, ) @@ -1647,7 +1647,7 @@ class segment: standard naming convension of this pipeline, e.g., ``{"xq": "longitude, "yh": "latitude", "salt": "salinity", ...}``. Key "tracers" points to nested dictionary of tracers to include in boundary. - seg_name (str): Name of the segment, e.g., ``'segment_001'``.
+ segment_name (str): Name of the segment, e.g., ``'segment_001'``. orientation (str): Cardinal direction (lowercase) of the boundary segment. startdate (str): The starting date to use in the segment calendar. gridtype (Optional[str]): Arakawa staggering of input grid, one of ``'A'``, ``'B'``, @@ -1665,11 +1665,12 @@ class segment: def __init__( self, + *, hgrid, infile, outfolder, varnames, - seg_name, + segment_name, orientation, startdate, gridtype="A", @@ -1706,7 +1707,7 @@ def __init__( self.orientation = orientation.lower() ## might not be needed? NSEW self.grid = gridtype self.hgrid = hgrid - self.seg_name = seg_name + self.segment_name = segment_name self.tidal_constituents = tidal_constituents self.repeat_year_forcing = repeat_year_forcing @@ -1746,11 +1747,11 @@ def rectangular_brushcut(self): self.interp_grid = xr.Dataset( { "lat": ( - [f"{self.parallel}_{self.seg_name}"], + [f"{self.parallel}_{self.segment_name}"], self.hgrid_seg.y.squeeze().data, ), "lon": ( - [f"{self.parallel}_{self.seg_name}"], + [f"{self.parallel}_{self.segment_name}"], self.hgrid_seg.x.squeeze().data, ), } @@ -1876,9 +1877,9 @@ def rectangular_brushcut(self): # fill in NaNs segment_out = ( segment_out.ffill(self.z) - .interpolate_na(f"{self.parallel}_{self.seg_name}") - .ffill(f"{self.parallel}_{self.seg_name}") - .bfill(f"{self.parallel}_{self.seg_name}") + .interpolate_na(f"{self.parallel}_{self.segment_name}") + .ffill(f"{self.parallel}_{self.segment_name}") + .bfill(f"{self.parallel}_{self.segment_name}") ) time = np.arange( @@ -1899,10 +1900,10 @@ def rectangular_brushcut(self): "time": { "dtype": "double", }, - f"nx_{self.seg_name}": { + f"nx_{self.segment_name}": { "dtype": "int32", }, - f"ny_{self.seg_name}": { + f"ny_{self.segment_name}": { "dtype": "int32", }, } @@ -1925,28 +1926,28 @@ def rectangular_brushcut(self): ) in ( allfields ): ## Replace with more generic list of tracer variables that might be included? - v = f"{var}_{self.seg_name}" + v = f"{var}_{self.segment_name}" ## Rename each variable in dataset segment_out = segment_out.rename({allfields[var]: v}) ## Rename vertical coordinate for this variable - segment_out[f"{var}_{self.seg_name}"] = segment_out[ - f"{var}_{self.seg_name}" - ].rename({self.z: f"nz_{self.seg_name}_{var}"}) + segment_out[f"{var}_{self.segment_name}"] = segment_out[ + f"{var}_{self.segment_name}" + ].rename({self.z: f"nz_{self.segment_name}_{var}"}) ## Replace the old depth coordinates with incremental integers - segment_out[f"nz_{self.seg_name}_{var}"] = np.arange( - segment_out[f"nz_{self.seg_name}_{var}"].size + segment_out[f"nz_{self.segment_name}_{var}"] = np.arange( + segment_out[f"nz_{self.segment_name}_{var}"].size ) ## Re-add the secondary dimension (even though it represents one value..) segment_out[v] = segment_out[v].expand_dims( - f"{self.perpendicular}_{self.seg_name}", axis=self.axis_to_expand + f"{self.perpendicular}_{self.segment_name}", axis=self.axis_to_expand ) ## Add the layer thicknesses segment_out[f"dz_{v}"] = ( - ["time", f"nz_{v}", f"ny_{self.seg_name}", f"nx_{self.seg_name}"], + ["time", f"nz_{v}", f"ny_{self.segment_name}", f"nx_{self.segment_name}"], da.broadcast_to( dz.data[None, :, None, None], segment_out[v].shape, @@ -1971,40 +1972,40 @@ def rectangular_brushcut(self): } ## appears to be another variable just with integers?? - encoding_dict[f"nz_{self.seg_name}_{var}"] = {"dtype": "int32"} + encoding_dict[f"nz_{self.segment_name}_{var}"] = {"dtype": "int32"} ## Treat eta separately since it has no vertical coordinate. 
Do the same things as for the surface variables above - segment_out = segment_out.rename({self.eta: f"eta_{self.seg_name}"}) - encoding_dict[f"eta_{self.seg_name}"] = { + segment_out = segment_out.rename({self.eta: f"eta_{self.segment_name}"}) + encoding_dict[f"eta_{self.segment_name}"] = { "_FillValue": netCDF4.default_fillvals["f8"], } - segment_out[f"eta_{self.seg_name}"] = segment_out[ - f"eta_{self.seg_name}" + segment_out[f"eta_{self.segment_name}"] = segment_out[ + f"eta_{self.segment_name}" ].expand_dims( - f"{self.perpendicular}_{self.seg_name}", axis=self.axis_to_expand - 1 + f"{self.perpendicular}_{self.segment_name}", axis=self.axis_to_expand - 1 ) # Overwrite the actual lat/lon values in the dimensions, replace with incrementing integers - segment_out[f"{self.parallel}_{self.seg_name}"] = np.arange( - segment_out[f"{self.parallel}_{self.seg_name}"].size + segment_out[f"{self.parallel}_{self.segment_name}"] = np.arange( + segment_out[f"{self.parallel}_{self.segment_name}"].size ) - segment_out[f"{self.perpendicular}_{self.seg_name}"] = [0] + segment_out[f"{self.perpendicular}_{self.segment_name}"] = [0] # Store actual lat/lon values here as variables rather than coordinates - segment_out[f"lon_{self.seg_name}"] = ( - [f"ny_{self.seg_name}", f"nx_{self.seg_name}"], + segment_out[f"lon_{self.segment_name}"] = ( + [f"ny_{self.segment_name}", f"nx_{self.segment_name}"], self.hgrid_seg.x.data, ) - segment_out[f"lat_{self.seg_name}"] = ( - [f"ny_{self.seg_name}", f"nx_{self.seg_name}"], + segment_out[f"lat_{self.segment_name}"] = ( + [f"ny_{self.segment_name}", f"nx_{self.segment_name}"], self.hgrid_seg.y.data, ) # Add units to the lat / lon to keep the `categorize_axis_from_units` checker happy - segment_out[f"lat_{self.seg_name}"].attrs = { + segment_out[f"lat_{self.segment_name}"].attrs = { "units": "degrees_north", } - segment_out[f"lon_{self.seg_name}"].attrs = { + segment_out[f"lon_{self.segment_name}"].attrs = { "units": "degrees_east", } @@ -2014,7 +2015,7 @@ def rectangular_brushcut(self): with ProgressBar(): segment_out.load().to_netcdf( - self.outfolder / f"forcing/forcing_obc_{self.seg_name}.nc", + self.outfolder / f"forcing/forcing_obc_{self.segment_name}.nc", encoding=encoding_dict, unlimited_dims="time", ) From 42ccc5c68fb91029c85ec4d67c4c8b6d11e969e8 Mon Sep 17 00:00:00 2001 From: "Navid C. 
Constantinou" Date: Tue, 16 Apr 2024 16:50:56 +0300 Subject: [PATCH 20/24] format black --- regional_mom6/regional_mom6.py | 17 +++++++++++------ 1 file changed, 11 insertions(+), 6 deletions(-) diff --git a/regional_mom6/regional_mom6.py b/regional_mom6/regional_mom6.py index 79cece3e..91757925 100644 --- a/regional_mom6/regional_mom6.py +++ b/regional_mom6/regional_mom6.py @@ -878,11 +878,11 @@ def rectangular_boundary( print("Processing {} boundary...".format(orientation), end="") seg = segment( - hgrid = self.hgrid, - infile = path_to_bc, # location of raw boundary - outfolder = self.mom_input_dir, - varnames = varnames, - segment_name = "segment_{:03d}".format(segment_number), + hgrid=self.hgrid, + infile=path_to_bc, # location of raw boundary + outfolder=self.mom_input_dir, + varnames=varnames, + segment_name="segment_{:03d}".format(segment_number), orientation=orientation, # orientation startdate=self.date_range[0], gridtype=arakawa_grid, repeat_year_forcing=self.repeat_year_forcing, ) @@ -1947,7 +1947,12 @@ def rectangular_brushcut(self): ## Add the layer thicknesses segment_out[f"dz_{v}"] = ( - ["time", f"nz_{v}", f"ny_{self.segment_name}", f"nx_{self.segment_name}"], + [ + "time", + f"nz_{v}", + f"ny_{self.segment_name}", + f"nx_{self.segment_name}", + ], da.broadcast_to( dz.data[None, :, None, None], segment_out[v].shape, From fd7d4d2019481161bcb057dc5907fb04d0d5077b Mon Sep 17 00:00:00 2001 From: "Navid C. Constantinou" Date: Tue, 16 Apr 2024 16:53:29 +0300 Subject: [PATCH 21/24] fix typo Co-authored-by: Ashley Barnes <53282288+ashjbarnes@users.noreply.github.com> --- docs/mom6-file-structure-primer.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/mom6-file-structure-primer.md b/docs/mom6-file-structure-primer.md index a4dbb920..ee0781d0 100644 --- a/docs/mom6-file-structure-primer.md +++ b/docs/mom6-file-structure-primer.md @@ -34,7 +34,7 @@ These files are: * `MOM_input / SIS_input` - Basic settings for the core MOM and SIS code with reasonably-well documentation. + Basic settings for the core MOM and SIS code with reasonably good documentation. After running the experiment for a short amount of time, you can find a `MOM_parameter_doc.all` file which lists every possible setting your can modify for your experiment. The `regional-mom6` package can copy and modify a default set of input files to work with your experiment. There's too much in these files to explain here. From c417426251772625c1f6b2ea853ae4ba87ed2597 Mon Sep 17 00:00:00 2001 From: "Navid C. Constantinou" Date: Tue, 16 Apr 2024 16:56:35 +0300 Subject: [PATCH 22/24] slight rephrase --- README.md | 3 ++- docs/index.rst | 2 +- 2 files changed, 3 insertions(+), 2 deletions(-) diff --git a/README.md b/README.md index 90911bb9..52112d49 100644 --- a/README.md +++ b/README.md @@ -19,7 +19,8 @@ The idea behind this package is that it should let the user sidestep some of the - Automatic grid generation at chosen vertical and horizontal grid spacing. - Automatic removal of non-advective cells from the bathymetry that cause the model to crash. -- Handle slicing across 'seams' in of the forcing input datasets (e.g., when the regional configuration spans the longitude 180 of a global dataset that spans [-180, 180]). +- Handles slicing across 'seams' of the forcing input datasets (e.g., when the regional + configuration includes longitude 180 and the forcing longitude is defined in [-180, 180]).
- Handles metadata encoding. - Creates directory structure with the configuration files as expected by MOM6. - Handles interpolation and interpretation of input data. No pre-processing of forcing datasets is required. (In some cases, slicing the forcing dataset before helps with hitting limitations related to the machine's available memory.) diff --git a/docs/index.rst b/docs/index.rst index 06322066..f1534b06 100644 --- a/docs/index.rst +++ b/docs/index.rst @@ -24,7 +24,7 @@ Features - Automatic grid generation at chosen vertical and horizontal grid spacing. - Automatic removal of non-advective cells from the bathymetry that cause the model to crash. - Handles slicing across 'seams' of the forcing input datasets (e.g., when the regional - configuration spans the longitude 180 of a global dataset that spans [-180, 180]). + configuration includes longitude 180 and the forcing longitude is defined in [-180, 180]). From 1702d8be9fe4bbb2e38c85b500eb98faa42eb829 Mon Sep 17 00:00:00 2001 From: "Navid C. Constantinou" Date: Tue, 16 Apr 2024 17:01:06 +0300 Subject: [PATCH 23/24] rephrase --- docs/mom6-file-structure-primer.md | 14 +++++++++----- 1 file changed, 9 insertions(+), 5 deletions(-) diff --git a/docs/mom6-file-structure-primer.md b/docs/mom6-file-structure-primer.md index ee0781d0..9663155a 100644 --- a/docs/mom6-file-structure-primer.md +++ b/docs/mom6-file-structure-primer.md @@ -16,11 +16,15 @@ These files are: The `coupler` section turns on or off different model components, and specifies how long to run the experiment for. * `diag_table`: - The diagnostics to save from your model run. - Choose wisely the quantities that are relevant to your experiment and the analysis you plan to do otherwise you can fill up disk space very fast. - Different lines in the `diag_table` either specify a new output *file* and its associated characteristics, or a new output *variable* and the matching file that it should go in (which needs to already have been specified). - If uncertain regarding which diagnostics to pick, try running the model for a short period (e.g., 1 hour) and look in the output folder. + The diagnostics to output at model runtime. + We need to choose wisely the quantities and output frequencies that are relevant to our experiment and the + analysis we plan to do; otherwise, the size of the output can grow very quickly. + Each line in the `diag_table` either specifies a new output *file* and its associated characteristics, + or a new output *variable* and the matching file that it should go in (which needs to already have been + specified). + If uncertain of the available diagnostics, we can run the model for a short period (e.g., 1 hour) and then + look in the output directory for `available_diags` that lists every available diagnostic for our + model configuration also mentioning which grids the quantity can be output on. Aside from the native model grid, we can create our own custom vertical coordinates to output on.
To output on a custom vertical coordinate, create a netCDF that contains all of the vertical points (in the coordinate of your choice) and then edit the `MOM_input` file to specify additional diagnostic coordinates. After that, we are able to select the custom vertical coordinate in the `diag_table`. From 2f13ff6bac8d31df331e98cbfef760f61ba7f699 Mon Sep 17 00:00:00 2001 From: "Navid C. Constantinou" Date: Tue, 16 Apr 2024 17:03:04 +0300 Subject: [PATCH 24/24] rephrase/clarify --- docs/mom6-file-structure-primer.md | 9 +++++---- 1 file changed, 5 insertions(+), 4 deletions(-) diff --git a/docs/mom6-file-structure-primer.md b/docs/mom6-file-structure-primer.md index 9663155a..503eb6c0 100644 --- a/docs/mom6-file-structure-primer.md +++ b/docs/mom6-file-structure-primer.md @@ -26,14 +26,15 @@ These files are: look in the output directory for `available_diags` that lists every available diagnostic for our model configuration also mentioning which grids the quantity can be output on. Aside from the native model grid, we can create our own custom vertical coordinates to output on. - To output on a custom vertical coordinate, create a netCDF that contains all of the vertical points (in the coordinate of your choice) and then edit the `MOM_input` file to specify additional diagnostic coordinates. - After that, we are able to select the custom vertical coordinate in the `diag_table`. + To output on a custom vertical coordinate, create a netCDF that contains all of the vertical points + (in the coordinate of your choice) and then edit the `MOM_input` file to specify additional diagnostic + coordinates. + After that, we can use this custom vertical coordinate in the `diag_table`. Instructions for how to format the `diag_table` are included in the [MOM6 documentation](https://mom6.readthedocs.io/en/dev-gfdl/api/generated/pages/Diagnostics.html). * `data_table` - The data table is read by the coupler to provide different model components with inputs. - With more model components we need more inputs. + The data table is read by the coupler to provide the different model components with inputs. Instructions for how to format the `data_table` are included in the [MOM6 documentation](https://mom6.readthedocs.io/en/dev-gfdl/forcing.html).
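To make the custom diagnostic vertical coordinate mentioned above a little more concrete, here is a minimal xarray sketch of creating the netCDF of vertical points. The depths, the `zi`/`zl` variable names, and the `diag_vcoord.nc` filename are all illustrative assumptions; they simply need to match whatever the additional diagnostic-coordinate settings in `MOM_input` end up pointing to.

```python
import numpy as np
import xarray as xr

# Illustrative choice: 35 uniformly spaced z layers down to 4500 m.
interfaces = np.linspace(0.0, 4500.0, 36)          # layer interfaces (m)
layers = 0.5 * (interfaces[:-1] + interfaces[1:])  # layer centres (m)

# Bundle the vertical points into a small dataset; the names are placeholders.
diag_vcoord = xr.Dataset(coords={"zi": interfaces, "zl": layers})

# Write the file that the additional diagnostic coordinate in MOM_input refers to.
diag_vcoord.to_netcdf("diag_vcoord.nc")
```

With the file available to the model and the corresponding entry added to `MOM_input`, diagnostics can then be requested on this coordinate from the `diag_table` exactly as described above.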