
Atmospheric forcing #41

Merged
merged 40 commits into from
Jul 9, 2024

Conversation

NoraLoose
Collaborator

Completes step 4 in #1.

This adds three classes: AtmosphericForcing, SWRCorrection, and ForcingDataset, where the first two are the ones the user will interact with. I added an example notebook to the documentation that illustrates how the classes can be used in practice.

ToDo

There are two things left to do, and I will open two separate issues to address these:

  • Test performance and profile memory on different machines and clusters, and make recommendations to the user about what kind of dask cluster to use: (1) thread-based vs. process-based, and (2) how many workers for a given total amount of memory (making sure every worker has enough memory).
  • Change the interpolation scheme for zonal wind, meridional wind, and the SWR correction to mimic the MATLAB results more closely; see below.
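The cluster-choice trade-off in the first item can be sketched with dask's built-in schedulers. This is a minimal illustration on a hypothetical random array (not the real forcing data); `scheduler` and `num_workers` are the knobs that the profiling would tune:

```python
import dask
import dask.array as da

# Hypothetical array standing in for a forcing field; real ERA5 data is far larger.
x = da.random.random((1_000, 1_000), chunks=(250, 250))

# Thread-based scheduling: low overhead, shared memory; works well when the
# underlying work (numpy, netCDF I/O) releases the GIL.
with dask.config.set(scheduler="threads", num_workers=4):
    total_threads = x.sum().compute()

# Process-based scheduling: sidesteps the GIL but pays serialization costs;
# budget memory so that (total memory / num_workers) comfortably exceeds
# the peak working set of a single task.
with dask.config.set(scheduler="processes", num_workers=2):
    total_processes = x.sum().compute()
```

Recomputing the same dask graph is deterministic, so both schedulers return the same sum; only the execution strategy differs.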

Python vs. MATLAB-based results

Both roms-tools and the MATLAB-based scripts create the 7 atmospheric forcing fields that are necessary to run a ROMS simulation. To create these fields, both follow the same two-step procedure:

  1. Fill in NaNs over land by spreading values from the ocean into land through diffusion.
  2. Interpolate from the ERA5 grid to the ROMS grid.
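Step 1 corresponds to the lateral_fill routine imported in the code below. As a rough illustration of the idea only (a Jacobi-style sketch on a hypothetical 3×3 field; the actual roms-tools implementation may differ, e.g. by solving a sparse system directly):

```python
import numpy as np

def diffusive_fill(field, mask, n_iter=200):
    """Fill NaNs (land) by iteratively diffusing ocean values inland.

    field: 2D array with NaNs over land; mask: True over ocean.
    Ocean points stay fixed; land points relax toward the average
    of their four neighbours until the field is smooth over land.
    """
    filled = np.where(mask, field, 0.0)
    for _ in range(n_iter):
        # neighbour average with zero-gradient (edge-replicated) boundaries
        padded = np.pad(filled, 1, mode="edge")
        nbr = 0.25 * (padded[:-2, 1:-1] + padded[2:, 1:-1]
                      + padded[1:-1, :-2] + padded[1:-1, 2:])
        filled = np.where(mask, field, nbr)
    return filled

# hypothetical tiny field: NaNs mark land
ocean = np.array([[1.0, 1.0, np.nan],
                  [1.0, np.nan, np.nan],
                  [1.0, 1.0, 2.0]])
mask = ~np.isnan(ocean)
filled = diffusive_fill(ocean, mask)
```

After convergence the land values lie between the minimum and maximum of the surrounding ocean values, while the ocean values are untouched.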

Here is a comparison for all 7 fields for a randomly chosen time step. For each atmospheric field, there are 4 panels.

  • The lower-left panel shows the difference between Python and MATLAB over both land and ocean. There are significant differences over land because the Python and MATLAB implementations use different boundary conditions for their diffusion-based NaN filling.
  • Since we only care about differences over the ocean, the lower-right panel masks out land. This is the important panel to look at.

We make the following observations:

  • For Tair, qair, rain, and lwrad, roms-tools and the MATLAB-based scripts produce very similar results over the ocean, because both use linear interpolation for these fields.
  • There are significant differences for uwnd, vwnd, and swrad because, for the time being, roms-tools uses linear interpolation for these fields while the MATLAB-based scripts use a modified Akima scheme.
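A minimal 1-D sketch of why the two schemes diverge near sharp gradients, on hypothetical data (SciPy's Akima1DInterpolator implements the original Akima scheme, a close cousin of MATLAB's modified "makima" variant):

```python
import numpy as np
from scipy.interpolate import Akima1DInterpolator

# hypothetical step-like signal, e.g. a wind jet edge
x = np.arange(10, dtype=float)
y = np.array([0., 0., 0., 0., 1., 1., 1., 0., 0., 0.])
xi = np.linspace(0.0, 9.0, 91)

linear = np.interp(xi, x, y)           # what roms-tools currently does
akima = Akima1DInterpolator(x, y)(xi)  # piecewise-cubic Akima scheme

# Both pass through the data points exactly, but between points the cubic
# Akima interpolant flattens into the jump while linear ramps straight
# through it, so the two fields differ wherever gradients are sharp.
```

The interpolants agree at the nodes and differ in between, which is exactly the pattern seen in the uwnd, vwnd, and swrad comparison panels.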

[Comparison figures for Tair, qair, rain, lwrad, swrad, uwnd, and vwnd]

This was referenced Jun 18, 2024
@NoraLoose
Collaborator Author

@TomNicholas pinging you in case you have any thoughts on this PR. But no rush if you are busy with other stuff.

@TomNicholas (Member) left a comment

Some minor comments for now

roms_tools/_version.py
Comment on lines 1 to 13
import xarray as xr
import dask
from dataclasses import dataclass, field
from roms_tools.setup.grid import Grid
from datetime import datetime, timedelta
import glob
import numpy as np
from typing import Optional, Dict, Union
from scipy.sparse import spdiags, coo_matrix
from scipy.sparse.linalg import spsolve
from roms_tools.setup.fill import lateral_fill
import warnings
import calendar
Member

We want to be running the linting tool isort on this repo, which would automatically organize these according to which are builtins vs external dependencies vs internal imports.

Collaborator Author

Would it be useful to use pre-commit to run isort and other linting tools? https://pycqa.github.io/isort/docs/configuration/pre-commit.html

Member

Definitely! ruff does a lot of the formatting as one tool now though, e.g.

https://github.com/zarr-developers/VirtualiZarr/blob/main/.pre-commit-config.yaml
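A minimal .pre-commit-config.yaml along the lines discussed here might look like the following sketch (the `rev` below is a placeholder; pin it to the latest ruff-pre-commit release):

```yaml
repos:
  - repo: https://github.com/astral-sh/ruff-pre-commit
    rev: v0.4.10  # placeholder; pin to the current release
    hooks:
      - id: ruff          # linting, including import sorting (isort-style "I" rules)
        args: [--fix]
      - id: ruff-format   # formatting, replacing black
```

With this in place, `pre-commit run --all-files` applies both hooks locally, and the same checks can run in CI.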

Collaborator Author

Ok, I have run some linting tools on the main branch (PR #46) as well as this branch (commit 5b558e3). Pre-commit now also runs some linting tools as part of the test suite (see checks below). Thanks for this suggestion - this repo really needed a clean-up!

Member

Excellent! @dafyddstephenson you might want to steal what Nora has done here for the C-Star repo.

# Load the dataset
with dask.config.set(**{'array.slicing.split_large_chunks': False}):
    # initially, we want a time chunk size of 1 to enable quick .nan_check() and .plot() methods for AtmosphericForcing
    ds = xr.open_mfdataset(self.filename, combine='nested', concat_dim=self.dim_names["time"], chunks={self.dim_names["time"]: 1})
Member

You probably want coords='minimal', compat='override' here - see pydata/xarray#8778

Collaborator Author

Thanks! Done via commit 0556bf1.
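With those arguments, the call might look like the following sketch (hypothetical tiny stand-in files written with the scipy backend; the real inputs are monthly ERA5 files):

```python
import os
import tempfile

import numpy as np
import xarray as xr

# Write two tiny stand-in "ERA5" files (hypothetical data, not real forcing).
tmpdir = tempfile.mkdtemp()
paths = []
for i in range(2):
    ds = xr.Dataset(
        {"Tair": (("time", "lat"), np.random.rand(3, 4))},
        coords={"time": np.arange(3, dtype="float64") + 3 * i,
                "lat": np.linspace(-1.0, 1.0, 4)},
    )
    path = os.path.join(tmpdir, f"era5_{i}.nc")
    ds.to_netcdf(path, engine="scipy")
    paths.append(path)

# coords="minimal" and compat="override" skip the eager comparison of
# coordinate values across all files that the defaults perform, so opening
# many files stays cheap and lazy.
combined = xr.open_mfdataset(
    paths,
    engine="scipy",
    combine="nested",
    concat_dim="time",
    coords="minimal",
    compat="override",
    chunks={"time": 1},  # time chunk size of 1, as in the snippet under review
)
```

The result is a single lazy dataset concatenated along time, with each time step in its own dask chunk.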

@NoraLoose
Collaborator Author

@TomNicholas do we want to wait until the dask deployment issue is fixed to merge this PR? Maybe we should defer this to another PR?

@TomNicholas
Member

@TomNicholas do we want to wait until the dask deployment issue is fixed to merge this PR? Maybe we should defer this to another PR?

As long as this code runs successfully for some dask setup somewhere, we can separate out the problem of deploying dask on other machines. So if this runs on Casper/your laptop that's good enough to merge this IMO.

Generally merging something which works first, then creating separate issues to track getting it to run in other setups or improving performance is a good idea.

@NoraLoose
Collaborator Author

Sounds good! In my latest test, it actually ran on Perlmutter as well (with the default LocalCluster()), but there are still some weird things happening if other dask deployment strategies are used. I will try to document those in a separate issue.

@NoraLoose NoraLoose merged commit fbbe244 into CWorthy-ocean:main Jul 9, 2024
8 checks passed