Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Replace ESMF subprocess calls with new parallel implementation of xesmf (0.8.x) directly in Python #73

Merged
merged 43 commits into from
Sep 28, 2023

Conversation

ashjbarnes
Copy link
Collaborator

@ashjbarnes ashjbarnes commented Sep 13, 2023

This is a big PR that grew as a few threads where pulled. Namely, there are several problems with the main branch:

  1. It relies on running the ESMF regridder as a subprocess with mpirun, since it's way faster than doing xe.Regridder(). However this is extremely janky, and it turns out that it doesn't work at all with the default installation instructions in the README. This means that the code simply doesn't work for people trying to use it who aren't on NCI and are able to rely on analysis3 (somehow their implementation of xESMF works with mpirun don't ask me why)

  2. In writing my test function, I found some general bugs in my code which made some unhelpful assumptions. Where you see things being renamed in the ocean_forcing method , this is why. Basically the code worked by accident because the dims happened to fall in the right order, and I've changed it to ensure that it relies on the coordinate names rather than the axis.

  3. There was still some legacy stuff in the demo notebooks

This PR fixes these three things, as well as adding tests for the bathymetry and ocean forcing, updating the demo notebooks and readme to reflect changes. It goes together with #70 and means that a new user could download and install the package in a new environment and run the demo notebooks.

This PR resolves #72

@review-notebook-app
Copy link

Check out this pull request on  ReviewNB

See visual diffs & provide feedback on Jupyter Notebooks.


Powered by ReviewNB

@ashjbarnes
Copy link
Collaborator Author

Ok so looks like there are still some installation issues that cause the tests to fail. Namely the numpy installation isn't right, leading to the "module not found: bottleneck" issue

@codecov
Copy link

codecov bot commented Sep 14, 2023

Codecov Report

Attention: 5 lines in your changes are missing coverage. Please review.

Comparison is base (3bc1f54) 15.96% compared to head (a413b7c) 72.78%.

Additional details and impacted files
@@             Coverage Diff             @@
##             main      #73       +/-   ##
===========================================
+ Coverage   15.96%   72.78%   +56.82%     
===========================================
  Files           3        3               
  Lines         426      452       +26     
===========================================
+ Hits           68      329      +261     
+ Misses        358      123      -235     
Files Coverage Δ
regional_mom6/regional_mom6.py 72.74% <88.37%> (+57.91%) ⬆️

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@navidcy navidcy self-requested a review September 14, 2023 06:38
environment-ci.yml Outdated Show resolved Hide resolved
@ashjbarnes
Copy link
Collaborator Author

ashjbarnes commented Sep 27, 2023

I think I lost track of what's going on here. Which environment are you referring to, and what is the bottleneck issue?

Ok here's a rough summary in chronological order

  1. This PR was made because when users follow the installation instructions currently in main branch, the package doesn't work, and never has because of issues with the XESMF install being janky.
  2. In patching this, I found that using the latest version of XESMF allows for a parallel implementation in Python, bypassing the need for the janky subprocess calls I'd ben using for regridding the topography. This fixed the original issue
  3. BUT, there was a further issue with the environment surrounding the bottlneck package required by .bfill. This may have been caused by my changing the environment-ci.yml file or may have been an underlying issue that never arose because it used to break at the regrid step earlier.
  4. Current state: Functions I wrote that test the whole regridding pipeline work, but Angus indicated that he'd like to check that I did the environment stuff right, which is more than fair enough since this was mostly trial end error on my part

Some other minor issues have been raised which should be resolved in future, smaller PRs, while this one should focus on getting the package in a working state for new users

@navidcy navidcy added the bug 🐞 Something isn't working label Sep 27, 2023
@navidcy
Copy link
Contributor

navidcy commented Sep 27, 2023

@ashjbarnes what "extra 8" is this comment referring to?

) ## The extra 8 is a buffer region

@ashjbarnes
Copy link
Collaborator Author

@ashjbarnes what "extra 8" is this comment referring to?

) ## The extra 8 is a buffer region

Idk probably a typo. This part of the code hasn't been changed in a long time

@ashjbarnes
Copy link
Collaborator Author

With the change of the dependency files you made @angus-g it seems like the tests still go through ok. Is there anything else that needs to be checked to make sure the installation is working as expected?

@angus-g
Copy link
Collaborator

angus-g commented Sep 28, 2023

it seems like the tests still go through ok. Is there anything else that needs to be checked to make sure the installation is working as expected?

Does it still work for your environment creation on Gadi, etc.?

@ashjbarnes
Copy link
Collaborator Author

it seems like the tests still go through ok. Is there anything else that needs to be checked to make sure the installation is working as expected?

Does it still work for your environment creation on Gadi, etc.?

I haven't been able to figure out how to get my Gadi-Jupyter notebook kernel to use recognise custom environments. It's very hit and miss! I'll run the test scripts on Gadi though even if not the example notebooks

@angus-g
Copy link
Collaborator

angus-g commented Sep 28, 2023

Have you used the IPython instructions for letting the notebooks know about your different environments?

@ashjbarnes
Copy link
Collaborator Author

ashjbarnes commented Sep 28, 2023

Have you used the IPython instructions for letting the notebooks know about your different environments?

Ok I got it figured out and all working on a notebook on Gadi.

I confirm that the example notebooks work on this environment after following install instructions except that you also need to manually install Matplotlib. This isn't strictly a dependency of the package but everyone using it will want matplotlib to see their domain. Should we include it in the dependencies file?

Likewise since Bokeh is missing you get this message when trying to view the Dask cluster. Should we include that too?

image

@angus-g
Copy link
Collaborator

angus-g commented Sep 28, 2023

This isn't strictly a dependency of the package but everyone using it will want matplotlib to see their domain. Should we include it in the dependencies file?

No, it should remain optional. Suppose somebody wanted to run in batch mode without plotting, they wouldn't want/need matplotlib, and it could complicate headless installs.

Likewise since Bokeh is missing you get this message when trying to view the Dask cluster. Should we include that too?

Definitely not, that's much more dependent on one's environment setup. The way to do this is to provide a conda environment with all the correct packages installed, but that can be difficult to generalise. I don't think providing a working dask cluster setup is in the scope of this project.

@ashjbarnes
Copy link
Collaborator Author

This isn't strictly a dependency of the package but everyone using it will want matplotlib to see their domain. Should we include it in the dependencies file?

No, it should remain optional. Suppose somebody wanted to run in batch mode without plotting, they wouldn't want/need matplotlib, and it could complicate headless installs.

Likewise since Bokeh is missing you get this message when trying to view the Dask cluster. Should we include that too?

Definitely not, that's much more dependent on one's environment setup. The way to do this is to provide a conda environment with all the correct packages installed, but that can be difficult to generalise. I don't think providing a working dask cluster setup is in the scope of this project.

Yes this all makes sense! In that case what are the remaining actionable things in this PR?

@navidcy
Copy link
Contributor

navidcy commented Sep 28, 2023

Yes this all makes sense! In that case what are the remaining actionable things in this PR?

This: #73 (comment) ?

and this: https://github.com/COSIMA/regional-mom6/pull/73/files#r1338061831 ?

@navidcy
Copy link
Contributor

navidcy commented Sep 28, 2023

I pushed my suggestion for the check whether bathymetry goes all along the domain. Let me know if you have objection still on this @ashjbarnes.

Let's see if tests pass. I'm happy at this point. I'll approve.

@navidcy navidcy self-requested a review September 28, 2023 09:50
@navidcy
Copy link
Contributor

navidcy commented Sep 28, 2023

I think @angus-g also need to re-review/approve because of the requested changes.

@navidcy
Copy link
Contributor

navidcy commented Sep 28, 2023

oh yeah, I had forgotten the temp dir fixtures issue! thanks @angus-g!

Copy link
Collaborator

@angus-g angus-g left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's good to go now.

@angus-g angus-g merged commit 7c338db into main Sep 28, 2023
5 checks passed
@angus-g angus-g deleted the test-domain-generation branch September 28, 2023 10:27
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug 🐞 Something isn't working
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Installation doesn't (appear to) correctly set up XESMF
3 participants