-
Notifications
You must be signed in to change notification settings - Fork 124
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Refactor TropCyclone for external use in other modules #749
Conversation
I very much like the changes. In particular tackling the memory and the physical units points is great and very much needed. Thanks a lot for the hard work @tovogt! From my first quick look it looks good to me, but I've not done a proper review. If you want me to, let me know and I'll see what I can do. If you have enough other people to review then I'm also ok not to dig deeper. |
@bguillod Thanks a lot for your feedback, Benoit! @ThomasRoosli will probably have a look, but your review would be very much appreciated if you can afford the time. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Review part 1, I hope I have more time early next week to go into the details of the compute_windfields-family of functions...
- Most files just contain the same change of a sparse command. The main changes in the PR are in trop_cyclone.py and test_trop_cyclone.py.
- In test_trop_cyclone.py a lot of the previous test (especially
test_windfield_models
remain unchanged, which ensures that the refactoring did not introduce a different behaviour. - I advise to work on the docstrings of the functions that now accept a single track instead of several items). (see specific comment in the code).
- The main refactoring happens within the method
from_single_track
and its substructure - I find the names
compute_windfield
,_compute_windfields_sparse
,_compute_windfields_sparse_chunked
and_compute_windfields_chunked
confusingly similar. Especially since the they seems to call each other in a circular fashion. Maybe I have a suggestion after a more thorough review. - As a general comment: I wonder if it would make sense to introduce a config entry for
max_memory_gb
so that the memory usage could be configured without changing code. - As an even more general comment: I wonder if we should introduce a "peak_max_memory_gb" configuration value in the climada config, so that
max_memory_gb
could be derived from that and also other memory intensive functions could be configured the same way. This would of course only trigger an additional issue for now, and does not need to be implemented.
Thanks a lot for your feedback, @ThomasRoosli !
Good point! I modified the docstrings accordingly.
This comment made me reconsider the structure. Now I'm back to only three functions:
I think that the names are fine, but if you have a different suggestion, I'm happy to discuss this.
In my case, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @tovogt for the adapted changes.
- The restructuring of the
_compute_windfields...
family (and your explanation) helped me alot. Also with the new changes the loop into_compute_windfields_sparse_chunked
is only done once (and it is more directly visible that it does so), which helps the overall understandability alot. - The doc-strings of functions changeing the track-Dataset now contain very useful information about the elements used and added.
- I am happy with leaving the
max_memory_gb
as a parameter to the function without linking it to a global configuration variable - In terms of code quality, understandability and tests and pylint I have nothing else to add.
Thanks for this contribution.
Thanks for your review, @ThomasRoosli ! @emanuel-schmid The failing Jenkins "Post Actions" are nothing to worry about, right? It seems to me like they are not related to this PR. |
For testing, I computed all IBTrACS tracks for 1950-2021, at 300 arc-seconds and 0.5-hourly resolution, and with limited memory. I didn't truncate for high latitudes (the default setting for the wind field computations is truncation at 60° latitude). The chunking worked fine, except from Hurricane FAITH (1966): Note, however, that most people won't do the TC wind field computation north of 60° latitude because that is usually after extratropical transition anyway. And most "reachable" centroids come from that part of the track since the km-distance between centroids becomes very small close to the pole when using lat/lon grids. |
Changes proposed in this PR:
TCRain
in climada_petals (see Implement a new tropical cyclone rain model climada_petals#85).TCRain
code (see Implement a new tropical cyclone rain model climada_petals#85).sparse.csr.csr_matrix
withsparse.csr_matrix
to avoid warning messages.PR Author Checklist
develop
)PR Reviewer Checklist