Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add benchmark for climatology #1552
Add benchmark for climatology #1552
Changes from 3 commits
4821a63
b01a554
4cff7b5
98ce10e
9fc12f5
2830a91
2d64d13
b826999
File filter
Filter by extension
Conversations
Jump to
There are no files selected for viewing
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder if people typically compute climatology with complex-enough calculations that warrant the use of
map_blocks
or typically use calculations that leverage Xarray's higher-level API.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
People use rechunk+map_blocks because it usually would not work otherwise. It's basically "manual fusion of tasks" to not overwhelm the scheduler. And convenient because you can use Xarray API in there.
I strongly recommend running a version of this without those steps. Ideally, these tricks should not be necessary for this calculation.
Notice that the workload rechunks back to pancakes... The chunking to pencils is purely to get this to work with dask+distributed
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just to make sure, what I'm hearing you say is: The computations people generally execute for calculating climatology, be it
mean
,std
,quantile
(orSEEP
?) can be expressed in Xarray's high-level API and do not fundamentally require a custom function mapped over all blocks.In that case, I fully agree that we should avoid
map_blocks
but express these calculations in Xarray. We should probably benchmark both a computation based on a decomposable aggregation likemean
and one based on a holistic aggregation likequantile
.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes. generally. The trick is to look at the function being mapped: in this case it's all Xarray ops and no for loops. That indicates the map_blocks is a workaround for some Xarray and/or dask/distributed inefficiency.
Yes totally agree with adding quantile. Dask will force a rechunk :)
PS: What is
SEEP
?PPS: I notice this is based on a Xarray-beam workload, it may be that rechunk+map_blocks is the only way to express this in that framework.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the clarification!
All I can offer is https://github.com/google-research/weatherbench2/blob/47d72575cf5e99383a09bed19ba989b718d5fe30/scripts/compute_climatology.py#L147-L175. Maybe @shoyer can elaborate on
SEEP
and how common similar calculations are? Just looking at the code, this can probably be expressed in Xarray's high-level API as well.Yes, that's what it's based on, but not what we should be limited by.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, rechunk+map_blocks is definitely not the obvious solution. It would be quite nice if we could automatically optimize high-level Xarray code to use a rechunk + map_blocks style implementation when it will be more efficient.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree that we won't beat hand-optimized code. However, a major goal of these benchmarks is to understand how we can improve the performance of code that relies on high-level APIs to improve the end-user experience. We all seem to agree that it would be great if users wouldn't have to worry about rewriting their high-level code in a
rechunk+map_blocks
-style code if it can be avoided. So, wherever possible, I think we should try and benchmark code without optimization and analyze where and how this falls apart.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll also add that we want
rechunk+map_blocks
to always succeed, even if slow, so that it can be a dependable fallback, particularly when the algorithms get more complicated. This would be a major improvement over the dask.array experience today.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@dcherian: Could you elaborate on that last comment? In what situations does that combination not succeed today?
I can imagine this failing if the block sizes or the temporary memory footprint of the mapped function end up being too large. However, this seems like a situation that is beyond the control of higher-level APIs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It will fail whenever rechunking fails... agree that after that, the blockwise is pretty stable, which is why people use this solution in combination with some kind of batching to avoid the rechunking failure. I don't have an example at hand. This is anecdotal experience from looking at lot of non-expert code at NCAR.