Updated data interpolation for Antarctica mesh generation case #750

Merged
14 commits merged into MPAS-Dev:main on Feb 29, 2024

matthewhoffman
Member

@matthewhoffman matthewhoffman commented Dec 22, 2023

This PR automates interpolation of observational data from gridded datasets to an Antarctic mesh within COMPASS. It handles the peculiarities of the current gridded compilation dataset (antarctica_8km_2020_10_20.nc) and uses conservative remapping directly from the high-resolution BedMachineAntarctica and MEaSUREs velocity datasets. The BedMachine and MEaSUREs datasets require fairly heavy pre-processing before they can be used here: renaming variables, setting reasonable _FillValue and missing_value attributes, extrapolating fields to avoid interpolation ramps at ice margins, updating mask values, and raising the bed topography at Lake Vostok to ensure a flat ice surface. Those data files and processing scripts currently live on Chicoma at /usr/projects/climate/trhille/data. Eventually that pre-processing could be integrated into a new step in COMPASS, or the processed data files could be added to the server on Anvil and downloaded as needed. Until then, this test case provides a reproducible workflow for setting up Antarctic meshes at varying resolutions.

Checklist

  • User's Guide has been updated
  • Developer's Guide has been updated
  • API documentation in the Developer's Guide (api.rst) has any new or modified class, method and/or functions listed
  • Documentation has been built locally and changes look as expected
  • Document (in a comment titled Testing in this PR) any testing that was used to verify the changes

@matthewhoffman
Member Author

This PR replaces #408, which is out of date due to refactoring of the landice mesh generation workflow.

@matthewhoffman
Member Author

matthewhoffman commented Dec 22, 2023

Testing

Running the default options generates a 4-20 km resolution whole-AIS mesh with 384370 cells and a 50 km gutter. On one node of Chicoma this took 41 minutes. Viewing the mesh in Paraview along with our existing production 4-20 km AIS mesh, the two meshes do not have identical cell locations but clearly have the same cell spacing.

By modifying the min and max spacing, I was able to create a consistent 2-10 km resolution mesh with 1263402 cells in a little under 2 hours. A 1-8 km mesh with 4743133 cells took 10:30 hours.

Note that a few steps are redundant, and many of the steps could be separated into a pre-processing test case; the unique operations could probably be done in about half the total time.

@matthewhoffman
Member Author

Note that this PR requires MPAS-Dev/MPAS-Tools#544.

The steps to update MPAS-Tools in compass to test this PR are:

  1. Create your compass environment as normal and source the load script
  2. Go to a separate MPAS-Tools repository and check out MPAS-Dev/MPAS-Tools#544 (Fix bugs in ESMF interpolation method)
  3. From the MPAS-Tools directory, run: python -m pip install -e conda_package
  4. Proceed with setting up the test case as normal

@matthewhoffman matthewhoffman added the land ice and in progress labels Dec 22, 2023
@xylar
Collaborator

xylar commented Dec 22, 2023

Because of the MPAS-Tools changes, we will need to:

  1. merge MPAS-Dev/MPAS-Tools#544 (Fix bugs in ESMF interpolation method)
  2. make a release of MPAS-Tools
  3. make a conda-forge package of mpas_tools
  4. update the compass alpha version and the mpas_tools version here

That sounds like a lot but it's pretty routine. I'll do steps 2 and 3 once 1 happens.

@matthewhoffman
Member Author

@trhille , this was originally your PR, so it seems weird to have you review it, but look it over and test it if you'd like. I somehow lost your authorship on the main commits (maybe I ended up manually re-creating them for some reason?), but I learned how to give you "co-author" credit in the commit message so Github counts you as an author, FWIW. I rebased on main to force the stuck CI checks to run and resolved conflicts with the recent Anvil file renaming we merged last week, so it should be an easy merge.

Collaborator

@trhille trhille left a comment

@matthewhoffman, thanks for tackling this! I think it's looking really good. I had a few stylistic changes, and updated some comments. There's one change to the name of the gridded dataset that has to be implemented for the code to run.

This workflow is very expensive and takes a long time to run. I'm wondering if we should either put the interpolation in a separate step, or include a config option that determines whether to run the full interpolation workflow. I can imagine that we'd want to turn interpolation off when experimenting with different cell spacing functions, for instance.
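
A config switch of the kind suggested here might look roughly like the sketch below. The section name `antarctica` and the option `interpolate_data` are hypothetical, chosen only for illustration; they are not taken from the compass config files. The default keeps interpolation on, so existing setups would behave as before.

```python
from configparser import ConfigParser


def should_interpolate(config, section="antarctica", option="interpolate_data"):
    """Return True unless the (hypothetical) option explicitly disables interpolation."""
    # fallback=True covers both a missing section and a missing option
    return config.getboolean(section, option, fallback=True)


config = ConfigParser()
config.read_string("""
[antarctica]
# hypothetical option; not an existing compass setting
interpolate_data = False
""")

if should_interpolate(config):
    print("running full interpolation workflow")
else:
    print("skipping interpolation; generating mesh only")
```

With a switch like this, someone iterating on cell spacing functions could build meshes quickly and enable the expensive interpolation only for the final run.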

docs/developers_guide/landice/api.rst (outdated, resolved)
a flat ice surface there.

Those data files and processing scripts currently live here on Chicoma:
``/usr/projects/climate/trhille/data``.
Collaborator

Should we move this to the Fanssie project space on Perlmutter?

Member Author

Thanks for copying this dir over to Perlmutter. I've updated the cfg and docs accordingly

compass/landice/tests/antarctica/mesh.py (outdated, resolved)
compass/landice/tests/antarctica/mesh.py (outdated, resolved)
compass/landice/mesh.py (outdated, resolved)
compass/landice/mesh.py (outdated, resolved)
compass/landice/mesh.py (outdated, resolved)
compass/landice/mesh.py (outdated, resolved)
compass/landice/mesh.py (outdated, resolved)
compass/landice/mesh.py (outdated, resolved)
Collaborator

@trhille trhille left a comment

A few more comments after finding that my 1 hr job on Chicoma only made it to the step of creating the scrip file for BedMachine.


logger = self.logger

logger.info('creating scrip file for BedMachine dataset')
Collaborator

This step is really slow, so we should check for an existing scrip file before proceeding. It should also write to the compass working directory instead of to the data directory if it does create a new scrip file. I'd rather have the data directory untouched.

Member Author

Can you check if your compass env contains the changes to MPAS-Tools here? MPAS-Dev/MPAS-Tools#544

This step got like 100x faster after that change. I'm wondering if you'd be comfortable skipping the check for an existing file if this step only takes a minute.

Member Author

Actually I think I know the answer - the conda env version for this branch would not have the MPAS-Tools update in it, because those were made at the same time as this PR. But I believe that MPAS-Tools PR is part of the current compass env. So try redoing your test after merging this branch locally into main (or rebasing) and creating an updated compass env.
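
One quick way to see which copy of a package an environment actually imports is sketched below. It uses only the standard library. Passing `"mpas_tools"` as the name is an assumption about how the package is registered in the environment, and the reported version alone may not reveal whether a specific PR is included.

```python
import importlib.util
from importlib import metadata


def describe_package(name):
    """Return the installed version and import location of a package, or None."""
    spec = importlib.util.find_spec(name)
    if spec is None:
        return None  # package is not importable in this environment
    try:
        version = metadata.version(name)
    except metadata.PackageNotFoundError:
        version = "unknown"  # importable, but no distribution metadata found
    return {"version": version, "location": spec.origin}


# "mpas_tools" is assumed to be the importable package name
print(describe_package("mpas_tools"))
```

An editable install made with `python -m pip install -e conda_package` would typically show a location inside the MPAS-Tools checkout rather than inside the conda environment.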

Member Author

I've modified to write the scrip file to the workdir.


logger = self.logger

logger.info('creating scrip file for velocity dataset')
Collaborator

As with the BedMachine scrip file, let's check for an existing file in the data directory before writing a huge new one. And it should write any new file to the work directory, leaving the data directory untouched. If the scrip file exists in the data directory, let's just put a symlink to it in the work directory.

Member Author

I've modified to write the scrip file to the workdir.
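
The behavior requested in this thread (reuse an existing scrip file via a symlink, otherwise create a new one in the work directory and leave the data directory untouched) could be sketched as follows. This is not the code merged in the PR; `stage_scrip_file` and the `create_scrip` callback are hypothetical names standing in for whatever routine actually builds the scrip file.

```python
import os
from pathlib import Path


def stage_scrip_file(data_dir, work_dir, scrip_name, create_scrip):
    """Make a scrip file available in work_dir without writing to data_dir.

    create_scrip is a hypothetical callback that writes a new scrip file
    to the path it is given.
    """
    data_path = Path(data_dir) / scrip_name
    work_path = Path(work_dir) / scrip_name

    # Remove a dangling symlink left over from an earlier run
    if work_path.is_symlink() and not work_path.exists():
        work_path.unlink()

    if work_path.exists():
        return work_path, "reused"

    if data_path.exists():
        # Link to the existing file instead of regenerating it
        os.symlink(data_path.resolve(), work_path)
        return work_path, "linked"

    # Nothing to reuse: create the file in the work directory only
    create_scrip(work_path)
    return work_path, "created"
```

Returning a status string alongside the path makes it easy to log which branch was taken, which helps when diagnosing unexpectedly slow runs.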

@matthewhoffman matthewhoffman removed the in progress label Feb 28, 2024
Collaborator

@trhille trhille left a comment

One more change!

compass/landice/mesh.py (outdated, resolved)
matthewhoffman and others added 11 commits February 28, 2024 13:35
This commit interpolates data from gridded datasets to an Antarctic mesh.
This takes care of the peculiarities of the current gridded compilation dataset
(antarctica_8km_2020_10_20.nc), as well as using conservative remapping directly
from the high-resolution BedMachineAntarctica and MEaSUREs velocity datasets.
There is a fairly heavy degree of pre-processing done to get the BedMachine and
MEaSUREs datasets ready to be used here. The pre-processing includes renaming
variables, setting reasonable _FillValue and missing_value attributes, extrapolating
fields to avoid interpolation ramps at ice margins, updating mask values, and
raising the bed topography at Lake Vostok to ensure a flat ice surface. Those
data files and processing scripts currently live here on Chicoma:
/usr/projects/climate/trhille/data
Eventually that pre-processing could be integrated into a new step in COMPASS,
or the processed data files could be added to the server on Anvil and downloaded
as needed.

The conservative remapping step using ESMF_RegridWeightGen requires multiple
nodes in order to not run out of memory.
This should probably be updated to be a separate step in the future in order to save on cost.

Co-authored-by: Trevor Hillebrand <[email protected]>
This is the configuration used to create the 4km AIS mesh used in
ISMIP6-2300.

Co-authored-by: Trevor Hillebrand <[email protected]>
And set default cull_distance to not be too large

Co-authored-by: Trevor Hillebrand <[email protected]>
corrected

Co-authored-by: Trevor Hillebrand <[email protected]>
The reference gridded dataset has a funky thickness field that includes
non-ice-sheet thin ice along the Antarctic Peninsula and is inconsistent
with the thickness field we ultimately interpolate from using the
BedMachine dataset.  This leads to unusual gutter sizes in parts of the
mesh and may be contributing to some strange culling artifacts.

This commit adds a preprocessing function that replaces the reference
gridded dataset's thickness field with one bilinearly interpolated from
BedMachine.  This is only used for the flood fill and mesh culling steps.

Co-authored-by: Trevor Hillebrand <[email protected]>
adjusted from previous commit

Co-authored-by: Trevor Hillebrand <[email protected]>
Minor corrections from Trevor

Co-authored-by: Trevor Hillebrand <[email protected]>
Collaborator

@trhille trhille left a comment

I was able to run it on Perlmutter (total runtime 29:55) after transferring the data over from Chicoma.

Most fields look good:
[5 images of interpolated fields, not reproduced]

However, observedSurfaceVelocityUncertainty is 1.0 everywhere in Antarctica.nc. I checked the Antarctica_backup.nc file, and it's 0.0 everywhere, so the issue is in the gridded dataset or the interpolation rather than the cleanup step. I checked the gridded dataset, and vErr is present, so it must be an issue with the interpolation. And indeed, the log file and the netCDF history both show that the interpolation is never performed from the MEaSUREs dataset, although no error is thrown. Ah, it turns out that there is a missing check_call at the end of interp_ais_measures.
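
A field that holds one value everywhere is a useful symptom of a silently skipped interpolation. A minimal sketch of an automated check is given below; `find_constant_fields` is a hypothetical helper, not part of compass, and it assumes each field has already been read into a flat sequence of numbers.

```python
import math


def find_constant_fields(fields):
    """Map each field name to its value if all finite entries are identical."""
    constant = {}
    for name, values in fields.items():
        finite = [float(v) for v in values if math.isfinite(v)]
        if finite and all(v == finite[0] for v in finite):
            constant[name] = finite[0]
    return constant


# Toy stand-ins for fields read from the output mesh file
fields = {
    "observedSurfaceVelocityUncertainty": [1.0, 1.0, 1.0, 1.0],
    "thickness": [0.0, 1200.0, 2500.0, 3100.0],
}
print(find_constant_fields(fields))
```

With netCDF data loaded through xarray, something like `ds[name].values.ravel()` could supply the flat sequence for each variable.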

"""

# Create a backup in case clean-up goes awry
backup_name = f"{fname.split('.')[:-1][0]}_backup,nc"
Collaborator

Suggested change
- backup_name = f"{fname.split('.')[:-1][0]}_backup,nc"
+ backup_name = f"{fname.split('.')[:-1][0]}_backup.nc"
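
As an aside, `fname.split('.')[:-1][0]` keeps only the text before the first dot, so a name such as `AIS.4km.nc` would lose its middle component. An alternative that is not part of this PR, sketched here with a hypothetical helper name, is to split off only the final extension with `os.path.splitext`:

```python
import os


def backup_filename(fname, suffix="_backup"):
    """Insert a suffix before the final extension of a file name."""
    root, ext = os.path.splitext(fname)
    return f"{root}{suffix}{ext}"


print(backup_filename("Antarctica.nc"))
print(backup_filename("AIS.4km.nc"))
```

This also keeps the extension out of the format string, which avoids the kind of `,nc` versus `.nc` typo fixed above.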

'-v', 'observedSurfaceVelocityX',
'observedSurfaceVelocityY',
'observedSurfaceVelocityUncertainty']

Collaborator

This interpolation is never performed! Needs another check_call(args, logger=logger)
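
One way to make this class of bug harder to introduce is to collect every argument list and run them all in a single loop, so that no command can be defined without also being executed. The sketch below uses `subprocess.check_call` from the standard library rather than the `check_call(args, logger=logger)` helper named above, and the commands are harmless stand-ins for the real interpolation calls.

```python
import subprocess
import sys


def run_all(commands):
    """Run every (label, args) pair in order; raise if any command fails."""
    finished = []
    for label, args in commands:
        # check_call raises CalledProcessError on a non-zero exit status
        subprocess.check_call(args)
        finished.append(label)
    return finished


# Stand-in commands; the real steps would invoke the interpolation script
commands = [
    ("bedmachine", [sys.executable, "-c", "pass"]),
    ("measures", [sys.executable, "-c", "pass"]),
]
print(run_all(commands))
```

Comparing the returned labels against the expected list of steps gives a cheap confirmation that nothing was skipped.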

Add missing check_call for velocity interpolation from MEaSUREs
dataset. Also fix small typo.
@trhille
Collaborator

trhille commented Feb 29, 2024

After commit 8a8bb4a, velocities and velocity uncertainty are being interpolated and the results look correct. These are from a 2–10km mesh, which ran on Perlmutter in 97:33:
[6 images of interpolated fields, not reproduced]

Collaborator

@trhille trhille left a comment

@matthewhoffman, this looks ready to go to me!

@trhille trhille merged commit 6c2ac36 into MPAS-Dev:main Feb 29, 2024
4 checks passed
3 participants