Performance Shift(s): bbe75acd
#4845
Note to self: this memory shift is within one Dask chunk size; run/write a scalability benchmark to work out whether it's a problem at larger scales.
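As an illustration, a minimal ASV-style sketch of such a scalability benchmark, assuming tracemalloc is used for the measurement; the array and the reduction are stand-ins, not the real Iris mesh workload:

```python
# Hedged sketch: an ASV-style benchmark that tracks peak memory at several
# problem sizes, to see whether cost stays bounded by the chunk size or
# grows with the data. The data and the operation are placeholders.
import tracemalloc

import dask.array as da


class ScalabilityPeakMemory:
    # Problem sizes as multiples of an arbitrary baseline shape.
    params = [1, 2, 4, 8]
    param_names = ["size_factor"]

    def setup(self, size_factor):
        # Stand-in data; the real benchmark would build a mesh cube here.
        self.data = da.random.random(
            (size_factor * 1000, 2000), chunks=(250, 2000)
        )

    def track_peak_memory_mib(self, size_factor):
        # ASV "track_" methods return the number they record.
        tracemalloc.start()
        self.data.sum().compute()  # stand-in for the operation under test
        _, peak = tracemalloc.get_traced_memory()
        tracemalloc.stop()
        return peak / 2**20


if __name__ == "__main__":
    bench = ScalabilityPeakMemory()
    for factor in bench.params:
        bench.setup(factor)
        print(factor, round(bench.track_peak_memory_mib(factor), 1), "MiB")
```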
Bad news! The regression still exists at scale. Will need to investigate the Dask and NumPy change logs.
I guess the key observation here is that the Dask chunksize is 200 MB, and in my previous experience I think it typically uses about 3 * chunksize even when chunked operation is working correctly. It could well be useful (again, in my experience!) to test with a lower chunksize; it's usually then pretty clear whether it is generally blowing memory or not.
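A sketch of what testing with a lower chunksize could look like, assuming the chunking is driven by Dask's `array.chunk-size` option (whether this particular Iris operation honours that option is an assumption; the workload is a placeholder):

```python
import dask
import dask.array as da

# Temporarily lower Dask's automatic chunk target, so any memory blow-up is
# obvious relative to the (now much smaller) chunk size.
with dask.config.set({"array.chunk-size": "20MiB"}):
    data = da.random.random((20_000, 20_000))  # "auto" chunks follow the config
    print(data.chunksize)
    # ... run the operation under test here and watch the memory profile ...
```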
@pp-mo what difference would we be anticipating?
The Dask changes seem very unlikely to be the cause, as they are the kind of things you would expect in a bugfix release. Unless anyone else thinks differently?
Well, hopefully, that the total memory cost would reduce, as a multiple of the chunksize. Within the "N" factor is also how many workers it can run in parallel (we expect threads in this case).
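For illustration, the back-of-the-envelope version of that "N * chunksize" reasoning, with purely hypothetical numbers:

```python
# Rough peak-memory estimate under the "N * chunksize" model above.
# All numbers are illustrative, not measured.
chunk_size_mb = 200     # Dask chunk size in use
per_chunk_factor = 3    # ~3 copies per chunk even when chunking works well
n_threads = 4           # chunks processed concurrently by the threaded scheduler

expected_peak_mb = per_chunk_factor * n_threads * chunk_size_mb
print(f"expected peak ≈ {expected_peak_mb} MB")  # 2400 MB with these numbers
```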
Updates + observations...
When initially run, this claimed that no memory at all was used (!), so I replaced the measurement with one based on tracemalloc (a sketch of that approach follows the results) and got these results:
AFTER the lockfile change: [results not captured here]
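For reference, a minimal sketch of a tracemalloc-based peak-memory measurement of the kind described above; the workload inside the block is a placeholder for the real benchmark operation:

```python
import tracemalloc
from contextlib import contextmanager


@contextmanager
def measure_peak():
    """Record the peak traced allocation (in MiB) of the wrapped block."""
    tracemalloc.start()
    result = {}
    try:
        yield result
    finally:
        _, peak = tracemalloc.get_traced_memory()
        tracemalloc.stop()
        result["peak_mib"] = peak / 2**20


with measure_peak() as mem:
    # Placeholder workload; the real benchmark runs the region-combine
    # operation here instead.
    blocks = [bytearray(2**20) for _ in range(50)]

print(f"peak ≈ {mem['peak_mib']:.1f} MiB")
```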
Conclusion - won't fix
This regression is a moderate hindrance at worst, and is therefore only worth a limited amount of investigation. More detail: we continue to discover new possible avenues of investigation, without any promising end in sight. The worst expected memory demand is about 3 * the full data array size (see the detail below). I'm writing a cautionary What's New entry to notify users of the increased memory demand.
A bit more detail: The region data can be chunked, but the main data chunks must have a full mesh dimension, or indexing errors will occur (which needs fixing in the code - see below). We've now shown that smaller chunks can reduce the cost due to region-data handling (but not the main data). Hence, we believe the total cost is limited to about 3 * full-data-array-size.
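A sketch of that chunking constraint in plain Dask terms (shapes and axis numbers are illustrative, not the real mesh sizes): the main data keeps the mesh dimension as one whole chunk, while the region data can use smaller chunks.

```python
import dask.array as da

# Illustrative shapes: (time, mesh) for the main data, (mesh,) for one region.
main = da.random.random((240, 100_000), chunks=(240, 100_000))
region = da.random.random((100_000,), chunks=(100_000,))

# Main data: chunk the non-mesh dimension, but keep the mesh dimension whole
# (-1 means "a single chunk spanning the full axis"), to avoid the indexing
# errors mentioned above.
main = main.rechunk({0: 24, 1: -1})

# Region data: free to use smaller chunks - this is where the saving was seen.
region = region.rechunk({0: 10_000})

print(main.chunks, region.chunks)
```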
Raising on behalf of the GHA workflow, which was having troubles at the time this originally came up!
Benchmark comparison has identified performance shifts at commit bbe75acd.
Please review the report below and take corrective/congratulatory action as appropriate 🙂
Performance shift report