
Update tests for multimodel statistics preprocessor #1023

Merged
merged 5 commits into master from mmstats_update_tests on Mar 1, 2021

Conversation

Contributor

@stefsmeets commented Feb 25, 2021

Description

Hi everyone, we are developing a new implementation of the multi-model statistics preprocessor in #968. The goal of this PR is to better chart out the requirements for the preprocessor, and to have a more complete description of the functionality through the unit tests, so that no functionality or features are lost.

The current set of tests does not cover all the functionality and is difficult to extend as it still makes use of the old unittest framework. This PR re-implements the functionality of the old tests using pytest.

  • Closes #issue_number
  • Link to documentation:


@stefsmeets added the preprocessor and testing labels on Feb 25, 2021
Contributor Author

@stefsmeets commented Feb 25, 2021

One test that is failing (marked with xfail) is related to passing cubes with non-overlapping timespans. This returns the same cubes in the current implementation, but raises an error in the new implementation.

Another issue that came up is related to the standard deviation and is described in #1024.
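For context, a minimal sketch of what such an xfail-marked test could look like (everything here is illustrative: the helper `_cube_with_times`, the test name, the assumed `ValueError`, and the import path are not taken from the PR itself):

```python
import numpy as np
import pytest
from cf_units import Unit
from iris.coords import DimCoord
from iris.cube import Cube

from esmvalcore.preprocessor import multi_model_statistics


def _cube_with_times(points):
    """Build a minimal 1-D cube with a time coordinate at the given points."""
    time = DimCoord(
        np.array(points, dtype=float),
        standard_name='time',
        units=Unit('days since 1850-01-01', calendar='standard'),
    )
    return Cube(np.ones(len(points)), var_name='tas',
                dim_coords_and_dims=[(time, 0)])


@pytest.mark.xfail(reason='Current implementation returns the input cubes.')
def test_no_time_overlap_raises():
    """span='overlap' with disjoint time ranges should raise an error."""
    cube1 = _cube_with_times([0, 30])       # early period
    cube2 = _cube_with_times([3650, 3680])  # roughly ten years later, no overlap
    with pytest.raises(ValueError):
        multi_model_statistics([cube1, cube2], span='overlap',
                               statistics=['mean'])
```

Because the current implementation returns the input cubes instead of raising, `pytest.raises` does not trigger and the test fails, which is exactly what the xfail marker records.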

Contributor

@Peter9192 left a comment

Nice job @stefsmeets! I think these modified tests make the requirements for a new implementation much more explicit. As far as I can see, the new tests cover all of the old tests except for two of them. Could you have a look at how we could add those within the new test framework?

Comment on lines +239 to +244
@pytest.mark.xfail(reason='Multimodel statistics returns the original cubes.')
def test_edge_case_time_no_overlap_fail():
"""Test case when time coords do not overlap using span='overlap'.

Expected behaviour: `multi_model_statistics` should fail if time
points are not overlapping.
Contributor
I'm not sure if the desired behaviour here is to fail. I would say so if I was designing it from scratch, but maybe this is actually considered a 'feature' by some users/recipes?? Any thoughts on this @valeriupredoi @LisaBock

Contributor

@schlunma what do you think?

Contributor

My first intuition would also be to let it fail when there is no overlap, but I guess returning an empty cube (or no cube) could also be a reasonable option. I've never used that "feature", and I don't really know if anyone uses it. I would be fine with changing this behavior if it is not used anywhere.

Contributor

Okay, let's leave it like this then 👍

Contributor

remember the purpose of the test in general is to verify the integrity of the functional behavior, not to assume things on its own, so if the function fails when there is no overlap, the test should check for that 👍

Contributor Author

Good point @valeriupredoi, the point of this PR is to pin down the definition so that we have a better description of the functionality that we want. Currently there is no test that explicitly checks for this scenario, so the behaviour is not defined.

Therefore we want to clarify the desired behaviour with this PR. Since this return statement is inconsistent with the normal output (a list is returned rather than a dict with statistics), I marked this with XFAIL to indicate a known problem (which will be fixed with #968 😁).

If this is behaviour that we need from this function, that is also OK for me; in that case I will adjust the test to ensure that a list of cubes is given back if there is no overlap, and update #968 accordingly.
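If the pass-through behaviour turns out to be the one we want to keep, the adjusted test could be sketched roughly like this (reusing the hypothetical `_cube_with_times` helper from the sketch further up; the test name and assertion are assumptions, not the PR's code):

```python
def test_no_time_overlap_returns_input_cubes():
    """Alternative expectation: non-overlapping input is passed through."""
    cube1 = _cube_with_times([0, 30])
    cube2 = _cube_with_times([3650, 3680])
    result = multi_model_statistics([cube1, cube2], span='overlap',
                                    statistics=['mean'])
    # Current behaviour: the original cubes come back as a list,
    # not as a dict of statistics keyed by statistic name.
    assert result == [cube1, cube2]
```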

Contributor

@valeriupredoi Mar 1, 2021

sure, good call! I am perfectly fine with this as it is here, and discussing further in the other PR is cool. The only bit that I think needs addressing is that array equality func below; other than that, good work here 🍺

@schlunma
Contributor

Hi guys, great job you're doing here!

I have a question regarding this PR: should it already include tests for all the functionality that we want the future multimodel_statistics preprocessor to have (and mark those that fail right now with an xfail)? If not, you can stop reading 😄

If yes, do we want to include tests that ensure that the preprocessor works with cubes that have arbitrary coordinates? I think that's a really desired feature (#1018, #891, #890). Right now, I think all the tests include the time coordinate somehow (please correct me if I'm wrong). I can add some tests to this branch directly if you agree!
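To make that concrete, a test for cubes whose only dimension is something other than time could be sketched like this (purely illustrative: the coordinate choice, names, and expected values are assumptions, and such a test would presumably need an xfail marker against the current implementation):

```python
import numpy as np
from iris.coords import DimCoord
from iris.cube import Cube

from esmvalcore.preprocessor import multi_model_statistics


def test_arbitrary_dimension_only():
    """Multi-model mean over cubes whose only dimension is air_pressure."""
    plev = DimCoord([1000.0, 850.0, 500.0],
                    standard_name='air_pressure', units='hPa')
    cube1 = Cube(np.array([1.0, 2.0, 3.0]), var_name='ta',
                 dim_coords_and_dims=[(plev, 0)])
    cube2 = Cube(np.array([3.0, 4.0, 5.0]), var_name='ta',
                 dim_coords_and_dims=[(plev, 0)])
    result = multi_model_statistics([cube1, cube2], span='full',
                                    statistics=['mean'])
    np.testing.assert_allclose(result['mean'].data, [2.0, 3.0, 4.0])
```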

@Peter9192
Contributor

Hi @schlunma yes I think this test-driven development makes a lot of sense! It helps to map all the requirements for a new implementation.

@schlunma
Contributor

Cool, sounds great! I will add some tests later today.

Contributor Author

@stefsmeets commented Feb 26, 2021

Hi @schlunma, we already have tests for #890 and #891:

def test_multimodel_no_horizontal_dimension(timeseries_cubes_month):

def test_multimodel_no_time_dimension(timeseries_cubes_month):
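For illustration, one of these tests might look roughly like the sketch below. The fixture name `timeseries_cubes_month` and the test name come from the PR, but the body shown here is an assumption (the actual test may build its input differently and may carry an xfail marker until #968 lands):

```python
from esmvalcore.preprocessor import multi_model_statistics


def test_multimodel_no_time_dimension(timeseries_cubes_month):
    """Statistics over cubes whose time dimension has been removed (#891)."""
    # Take a single time slice so the cubes no longer have a time dimension
    # (assuming the fixture yields an iterable of monthly cubes with time
    # as the leading dimension).
    cubes = [cube[0] for cube in timeseries_cubes_month]
    result = multi_model_statistics(cubes, span='full', statistics=['mean'])
    assert set(result) == {'mean'}
```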

@Peter9192
Contributor

What's maybe even more important than tests for new desired features is tests for 'hidden' requirements. We recently saw an example of this in #975. Are there any other 'features' or 'requirements' you can think of that are currently not covered by the tests?

@schlunma
Contributor

> Hi @schlunma, we already have tests for #890 and #891:

Ahh, nice, so that should be covered then.

> What's maybe even more important than tests for new desired features is tests for 'hidden' requirements. We recently saw an example of this in #975. Are there any other 'features' or 'requirements' you can think of that are currently not covered by the tests?

Not really. One thing that recently came up is #1009 (handling of altitude coordinate), but once the preprocessor can handle arbitrary coordinates this should not be an issue anymore.

@schlunma
Contributor

One suggestion: since we want to support cubes with no time coordinate, I think that the span argument should be optional in the future. It really doesn't make sense to specify this when there is no time dimension.

@Peter9192
Contributor

> One suggestion: since we want to support cubes with no time coordinate, I think that the span argument should be optional in the future. It really doesn't make sense to specify this when there is no time dimension.

I like the idea, but I'm not sure we'll address this in #968 already. I think it sounds more like a follow-up. But we can certainly open an issue about it and maybe already start thinking about a test that belongs to it.

@stefsmeets
Contributor Author

Hi @ESMValGroup/esmvaltool-coreteam, I think this PR is ready to be merged!

Contributor

@valeriupredoi left a comment

@stefsmeets good job, tests are looking much more comprehensive now! Just a wee change related to the array equality checker func and a comment related to the overlap check from me 🍺


if np.ma.isMaskedArray(this) or np.ma.isMaskedArray(other):
    np.testing.assert_array_equal(this.mask, other.mask)

np.testing.assert_allclose(this, other)
Contributor

any reason why you add the same function as the one from the tests' init file here? well, you are adding the allclose check to it, but why do you need it? please use the imported function and if there is a need for an almost equal check, add it in the test that needs it

Contributor Author

@stefsmeets Mar 1, 2021

Hi @valeriupredoi , I'm not entirely sure how to import this file without messing with sys.path. The default way for defining imports in pytest is in conftest.py, which we do not use atm.

Contributor

no need to import it per se, it's already imported at module run level; just use it from there without duplicating it here

Contributor Author

Like you said, the functions are different (array_allclose vs array_equal), so the other one cannot be used directly. Instead, I changed the function name to better reflect its function. Does that work for you?
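For reference, the renamed helper reads roughly like this (a sketch assembled from the snippet quoted above; the name `assert_array_allclose` is inferred from the discussion and may differ slightly in the PR):

```python
import numpy as np


def assert_array_allclose(this, other):
    """Assert that two (possibly masked) arrays match almost exactly."""
    if np.ma.isMaskedArray(this) or np.ma.isMaskedArray(other):
        # The masks must be identical before the values are compared.
        np.testing.assert_array_equal(this.mask, other.mask)
    np.testing.assert_allclose(this, other)
```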

Contributor

OK, we should move this one upstream into the init at some point and just use it from there.

@valeriupredoi
Contributor

oh and another comment: could you please tell me what the coverage percentage is now compared to the previous module?

@valeriupredoi
Contributor

> One suggestion: since we want to support cubes with no time coordinate, I think that the span argument should be optional in the future. It really doesn't make sense to specify this when there is no time dimension.

we should account for a no-span or no-time case, but making the argument fully optional may lead to user errors when they forget to set it, and we'd have to make the function use a default span. There are major differences between 'overlap' and 'full', and even if you throw a warning that complains the span parameter was not set, defaulting to blah, users don't read warnings

@Peter9192
Contributor

> One suggestion: since we want to support cubes with no time coordinate, I think that the span argument should be optional in the future. It really doesn't make sense to specify this when there is no time dimension.
>
> we should account for a no-span or no-time case, but making the argument fully optional may lead to user errors when they forget to set it, and we'd have to make the function use a default span. There are major differences between 'overlap' and 'full', and even if you throw a warning that complains the span parameter was not set, defaulting to blah, users don't read warnings

One way to do it is to add an additional (not the default) option that says span=None if you don't want it. But again, let's discuss that in another issue.

@valeriupredoi
Contributor

> One suggestion: since we want to support cubes with no time coordinate, I think that the span argument should be optional in the future. It really doesn't make sense to specify this when there is no time dimension.
>
> we should account for a no-span or no-time case, but making the argument fully optional may lead to user errors when they forget to set it, and we'd have to make the function use a default span. There are major differences between 'overlap' and 'full', and even if you throw a warning that complains the span parameter was not set, defaulting to blah, users don't read warnings
>
> One way to do it is to add an additional (not the default) option that says span=None if you don't want it. But again, let's discuss that in another issue.

yip, agreed on both counts!
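If that follow-up issue gets picked up, one hypothetical way to combine both points is a small validation step along these lines (a sketch only; the helper name, messages, and behaviour are assumptions, not part of #968 or the current API):

```python
import warnings


def _check_span(cubes, span):
    """Validate the span argument against the presence of a time coordinate."""
    has_time = any(cube.coords('time') for cube in cubes)
    if has_time and span not in ('overlap', 'full'):
        # Refuse to guess: 'overlap' and 'full' give very different results,
        # and users tend to miss warnings about silently chosen defaults.
        raise ValueError(
            "span must be 'overlap' or 'full' when the input cubes have a "
            f"time coordinate, got span={span!r}")
    if not has_time and span is not None:
        warnings.warn(
            'span is ignored because the input cubes have no time coordinate')
```

With a default of `span=None`, cubes without a time coordinate would pass through silently, while time-dependent input would still force an explicit choice between 'overlap' and 'full'.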

@stefsmeets
Contributor Author

> oh and another comment: could you please tell me what the coverage percentage is now compared to the previous module?

For _multimodel.py:
this PR: statements: 196, missing: 13, coverage: 93%
master: statements: 196, missing: 17, coverage: 91%

Note that the coverage says very little about the quality of the tests.

@valeriupredoi
Contributor

shweet! Yes, it's a number that measures brute coverage; I just wanted to check there are no omissions in statement checks that were done before. This PR introduces a better-quality test module 👍

@valeriupredoi merged commit 2966f54 into master on Mar 1, 2021
@valeriupredoi deleted the mmstats_update_tests branch on Mar 1, 2021 at 12:51
@valeriupredoi
Contributor

cheers for the good work @stefsmeets 🍺
