Use dask whenever possible in preprocessor to keep memory intake low #32

valeriupredoi · 2019-04-24T12:48:08Z

This is a follow-up to many suggestions and much work (mainly by @bouweandela and @jvegasbsc). Examples of high memory use can be seen in issues such as #810 and #922; work is also underway in PRs such as #1001, or was initiated by issues such as #949. There are also issues related to changes in iris itself and its handling of lazy data, e.g. #887. So far, the actual work is as follows:

Active work as PR

Reduce memory use of area preprocessors ESMValTool#1001 on _area.py
Make MEAN with weights lazy SciTools/iris#3299 for lazy weights on iris aggregators
Preprocessor memory usage is excessive and error messages unclear. #51

Let's add pull requests here that address the use of dask in other preprocessor modules, and gradually close all the issues and PRs listed above once they are accepted and the code behaves well with respect to memory 🍺
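For readers unfamiliar with the lazy-weights idea behind SciTools/iris#3299, here is a minimal sketch of a weighted mean that stays lazy in dask. It is not the iris implementation: the array shapes, the cos(latitude) weights and all variable names are assumptions chosen purely for illustration.

```python
import dask.array as da
import numpy as np

# Hypothetical (time, lat, lon) field, chunked along time only.
data = da.random.random((12, 18, 36), chunks=(4, 18, 36))

# Hypothetical area weights proportional to cos(latitude), kept as a dask array
# so that multiplying by them does not realize the data.
lat = np.linspace(-85.0, 85.0, 18)
weights_np = np.cos(np.deg2rad(lat))[:, np.newaxis] * np.ones((1, 36))
weights = da.from_array(weights_np, chunks=(18, 36))

# Weighted mean over lat/lon; this only builds a task graph.
weighted_mean = (data * weights).sum(axis=(1, 2)) / weights.sum()

# The data are read and reduced chunk by chunk only when compute() is called.
result = weighted_mean.compute()
print(result.shape)
```

Because the weights are themselves a dask array, the full weighted field is never held in memory at once; only one chunk of `data` per worker is needed at a time.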


bjlittle · 2019-07-29T14:32:36Z

@valeriupredoi and @bouweandela You guys should be aware of SciTools/iris#3357

In a nutshell, if a netCDF variable has an UNLIMITED dimension, netCDF automatically applies netCDF-level chunking to the file, which in most cases is detrimental to the performance of dask within iris: the chunks specified by netCDF are really small, almost too small for dask, so dask incurs a massive overhead dealing with files that have tiny chunks.

A fix on our side will resolve this... just sayin 😉
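The chunking overhead described above can be illustrated with plain dask, independently of iris or netCDF. This is a rough sketch under assumed numbers: the array shape and both chunk sizes are hypothetical, with the first axis standing in for an UNLIMITED time dimension that ends up chunked one step at a time.

```python
import dask.array as da

# Hypothetical (time, lat, lon) variable; nothing is allocated until compute().
shape = (1200, 180, 360)

# Tiny chunks: one time step per chunk, mimicking very small file-level chunks.
tiny = da.zeros(shape, chunks=(1, 180, 360), dtype="float32")

# The same array with larger chunks along time.
merged = tiny.rechunk((120, 180, 360))

# Each chunk becomes at least one task per operation, so the number of chunks
# is a rough proxy for scheduling overhead.
print("chunks with tiny chunking:  ", tiny.npartitions)
print("chunks after rechunking:    ", merged.npartitions)
```

With these assumed sizes, every operation on the tiny-chunked array schedules over a hundred times more tasks than on the rechunked one, which is the kind of overhead the comment refers to.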

bouweandela · 2020-06-12T14:35:14Z

Created a new overview issue: #674

valeriupredoi assigned zklaus, bouweandela and jvegreg Apr 24, 2019

mattiarighi transferred this issue from ESMValGroup/ESMValTool Jun 11, 2019

mattiarighi added the enhancement (New feature or request), preprocessor (Related to the preprocessor) and paper labels Jun 11, 2019

mattiarighi mentioned this issue Jun 11, 2019

Preprocessor memory usage is excessive and error messages unclear. #51

Closed

mattiarighi removed the paper label Jan 7, 2020

bouweandela closed this as completed Jun 12, 2020
