Making the multi-model statistics preprocessor lazy #781

Peter9192 · 2020-09-15T12:51:09Z

The multi-model statistics preprocessor is currently sub-optimal. It operates on NumPy arrays rather than on cubes, because Iris is quite strict about inconsistencies between the input cubes. This workaround impedes lazy evaluation and has introduced some vulnerabilities (e.g. #267, #665) and a trail of ad-hoc fixes.

Inconsistent time coordinates were probably the biggest obstacle. However, starting from #677, and continuing in #685, we have already drastically reduced the complexity of time handling. Additionally, #744 proposes to move everything related to 'homogenizing' the time coordinate to the regrid_time preprocessor altogether. Moreover, Iris will soon support lazy regridding (SciTools/iris#3701), so in that case we could really benefit from lazy evaluation in multi-model statistics as well.

In #673, we have implemented a new function that computes 'multi-cube' statistics using Iris built-in functions. This already works for ensemble statistics, because in that case the cubes are already very homogeneous.

It would be really nice if we could use this Iris 'engine' for the regular multi-model statistics preprocessor as well.

Would you be able to help out?
I might have some time, but it would be good to wait for #685, #744 and #673. In the meantime, this is a placeholder to keep the helicopter view of everything that's happening with the multi-model statistics preprocessor.

stefsmeets · 2020-09-15T13:09:23Z

Also related to #674

valeriupredoi · 2020-09-16T10:07:52Z

good call! we need to liaise with iris folk - @bjlittle - because a basic multimodel statistic (and lazy altogether) would best sit in iris. We can do it here first but migration to iris is something both us and them would benefit from 🍺

Peter9192 · 2020-09-16T11:31:07Z

So what we did in #673 is basically: iris.cube.CubeList(...).merge_cube().collapsed(...) with some fiddling with auxcoords and equalizing attributes. I'm not sure if that warrants a dedicated function in iris or whether it's just a good application of existing functionality. So indeed, I'm curious what @bjlittle has to say about that.

valeriupredoi · 2020-09-16T11:34:53Z

no, I'm talking about a dedicated iris.analysis.multimodel_statistics(args=(cubes, statistic, ...), **kwargs) function

Peter9192 · 2020-09-16T12:50:52Z

I know 🍺. But what I'm saying is that you could also see it as a use case that is already supported by existing functionality in iris. So I'm not sure if it really needs to be a dedicated function in iris. But of course I would be happy to see it migrate there eventually.

Btw, I would prefer to name it multi-cube statistics (like I did in #673), because iris is all about cubes, and different cubes are not necessarily different models.

valeriupredoi · 2020-09-16T17:09:37Z

I need to have a look at #673 👍

stefsmeets · 2020-11-05T12:22:27Z

We are working on better tests for the multimodel statistics here: #856
Might want to keep an eye on that, as I'm sure this will help with developing the code for this issue.

Peter9192 · 2021-01-14T10:05:58Z

#685 is merged!

schlunma · 2022-02-04T10:28:40Z

Moving this to v2.6 to follow the corresponding PR.

Peter9192 added the enhancement New feature or request label Sep 15, 2020

stefsmeets mentioned this issue Sep 15, 2020

Make preprocessor lazy #674

Open

62 tasks

bouweandela mentioned this issue Sep 30, 2020

Simplify time handling in multi-model statistics preprocessor #685

Merged

6 tasks

stefsmeets mentioned this issue Nov 5, 2020

Add multimodel tests using real data #856

Merged

27 tasks

This was referenced Jan 18, 2021

Refactor multi-model statistics code to facilitate ensemble stats and lazy evaluation #949

Merged

Add lazy 'engine' for multicube statistics #950

Closed

Peter9192 mentioned this issue Jan 28, 2021

Lazy implementation of multi_model_statistics and ensemble_statistics preprocessors #968

Merged

9 tasks

Peter9192 mentioned this issue May 28, 2021

Use native iris functions in multi-model statistics #1150

Merged

9 tasks

Peter9192 mentioned this issue Jun 30, 2021

New multimodel module can be slow and buggy for certain recipes #1201

Closed

zklaus mentioned this issue Jul 2, 2021

How should multi-model statistics handle daily data on different calendars? #1210

Open

zklaus added this to the v2.4.0 milestone Jul 5, 2021

zklaus modified the milestones: v2.4.0, v2.5.0 Oct 8, 2021

schlunma modified the milestones: v2.5.0, v2.6.0 Feb 4, 2022

bouweandela removed this from the v2.6.0 milestone May 30, 2022

bouweandela closed this as completed in #968 Jun 6, 2023
