-
Notifications
You must be signed in to change notification settings - Fork 4
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Use pytest and temporary directories #54
Conversation
When I run E OSError: System environment variable 'MARBL_ROOT' points toa github repository whose remote:
E ''
E does not match that expected by C-Star:
E https://github.com/marbl-ecosys/MARBL.git.Your environment may be misconfigured.
cstar/base/basemodel.py:221: OSError |
This is most likely because you deleted your |
That gets me further, thanks! |
What's the status on this? Let me know if I can help @TomNicholas |
@pytest.fixture | ||
def roms_marbl_blueprint_remote_datasets(tmp_path) -> Path: | ||
"""Creates blueprint yaml file in temporary directory""" | ||
|
||
with open(ROMS_MARBL_BASE_BLUEPRINT_PATH, 'r') as file: | ||
blueprint_dict = yaml.load(file, Loader=yaml.Loader) | ||
|
||
blueprint_filepath = tmp_path / 'blueprint.yaml' | ||
|
||
with open(blueprint_filepath, 'w') as file: | ||
yaml.dump(blueprint_dict, file) | ||
|
||
return blueprint_filepath | ||
|
||
|
||
# TODO similar fixture but the blueprint file contains paths to local files instead | ||
# TODO should these local files be in a given temporary directory? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not entirely sure what the best way to handle blueprints is, but I think we should change the current approach of modifying a blueprint to use machine-specific local filepaths.
The point of this fixture was supposed to be so that we could start with one blueprint and modify it in-memory, then write the modified blueprint back out to a temporary directory, then pass the (temporary) path to that file back into the test.
However it would be better to just have a set of pre-defined blueprint files to read from. If paths in blueprints can be relative paths, then perhaps we can avoid having to make the blueprint specific to a machine?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
However it would be better to just have a set of pre-defined blueprint files to read from.
I'm not strictly against having two separate blueprints (one local, one remote) if that's what you mean here. However, I deliberately avoided this as these two files would differ only by a single substring throughout. Any change to the remote blueprint thus needs to be also tracked in the local version (this could be a lot if we switch all the input datasets, for example) whereas currently we only need to track the two substrings (base URL, base filepath).
I'm not entirely sure what the best way to handle blueprints is, but I think we should change the current approach of modifying a blueprint to use machine-specific local filepaths.
This is definitely fine for testing, and any user is free to make blueprints with relative paths, but I would maintain the behaviour that, within C-Star, and in any blueprints it writes out, absolute paths are used (to maintain the most exact information available about the location of a file). C-Star-produced blueprints should not break when moved from one directory to another because of the use of relative paths. Thus if we want to use relative paths in a blueprint describing local files produced by C-Star, we should have to manually modify it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I just stumbled on this discussion #46 which is similar. I forgot another reason for recycling the same blueprint was to test C-Star's ability to create a case from a C-Star-created blueprint. This is something we should definitely test if we aim to prepare separate local/remote blueprints. It may also be worth planning around that this will soon double to local/remote x netcdf/yaml ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not strictly against having two separate blueprints (one local, one remote) if that's what you mean here.
That was my second suggestion yes.
However, I deliberately avoided this as these two files would differ only by a single substring throughout. Any change to the remote blueprint thus needs to be also tracked in the local version (this could be a lot if we switch all the input datasets, for example)
Yeah, that's why I made this yaml fixture that reads the blueprint into memory. We could alter the yaml structure at this point, then parametrize that and auto-generate the different blueprints we need as we need them. The result would be similar to what you're already doing but hopefully less error-prone because the base blueprint is unchanging and the per-test blueprints are created (and deleted) as needed.
C-Star-produced blueprints should not break when moved from one directory to another because of the use of relative paths.
That's fair.
I forgot another reason for recycling the same blueprint was to test C-Star's ability to create a case from a C-Star-created blueprint.
We should test this, but in a dedicated test. I'm trying to think how to structure all this so that we don't just end up with a small number of mega-tests.
It may also be worth planning around that this will soon double to local/remote x netcdf/yaml ?
This is why I want to be able to test one thing per test with the blueprints etc. auto-generated as needed. Then we can just add more @pytest.mark.parametrize
decorators to tests and we will cover all the possibilities.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for explaining, this makes a lot more sense to me now. Keeping a single yaml dict in memory then modifying it between tests is a very clean way to handle it, nice.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This generation and modification of the yaml as a dict in-memory is now implemented.
…for saving to a filepath
# TODO define `make_local`function | ||
if make_local: | ||
# blueprint_dict = make_local(blueprint_dict) | ||
raise NotImplementedError() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@dafyddstephenson do you think this approach will work? Is the difference between your two tests representable like this? I want the make_local
function to make the same changes that you do here:
C-Star/tests/test_roms_marbl_example.py
Line 58 in c83b1a1
# Modify the blueprint file to point to local paths whenever we have the files: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also how am I supposed to know what the value of these needs to be for any given test?
id_url_prefix = "https://github.com/CWorthy-ocean/input_datasets_roms_marbl_example/raw/main/"
ac_url = "https://github.com/CWorthy-ocean/cstar_blueprint_roms_marbl_example.git"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not really sure what's going on here I don't really see why not represent that change as a function, it's just a string replacement (url -> path).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
they're in the blueprint, those are just the strings to replace. If you're only working with one core blueprint, then these won't change.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why not represent that change as a function
I was just wondering if that was sufficient.
If you're only working with one core blueprint, then these won't change.
Okay, sounds like I can hard code these values for now and revisit this in a follow-up to generalize it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've generalized these by putting them all in a top-level global EXAMPLE_BLUEPRINTS
dict.
The CI failure is #66 (separate from this PR). |
conftest.py
Outdated
"""Given the name of a pre-defined blueprint, returns it as a (temporary) path to an on-disk file.""" | ||
|
||
def _blueprint_as_path(name: str, make_local: bool = False) -> Path: | ||
# TODO add options to edit in-memory blueprints here? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
make_local
is an example of one such option.
roms_marbl_case.setup() | ||
|
||
# TODO why are we persisting this blueprint file then not using it again in the test? | ||
roms_marbl_case.persist(tmpdir / "test_blueprint.yaml") | ||
|
||
roms_marbl_case.build() | ||
roms_marbl_case.pre_run() | ||
roms_marbl_case.run() | ||
roms_marbl_case.post_run() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As an aside, I would like us to have tests for each of these steps individually.
We decided that we really need a way to generate local test data on demand in order to perform tests with local data. Now that roms-tools is integrated with C-Star, we can do that using roms-tools. This could take the form of a monster fixture that creates input data ready for C-Star to use. Given the size of that effort, it should happen in a different PR to this one. But we also don't want to leave this PR dangling. But if we merge it in we don't want to delete existing tests, as currently this PR has smaller coverage. So @dafyddstephenson and I decided to merge basically what is here but without altering the existing tests. There will be an awkward crossover period with two independent sets of tests existing in |
.github/workflows/tests.yaml
Outdated
coverage run --rcfile=coverage.toml -m pytest --verbose cstar/tests/* | ||
python -u -m coverage run --rcfile=coverage.toml tests/test_roms_marbl_example.py |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This PR now runs both the old and the new tests.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Any chance we could separate these into two separate steps in the workflow? Hard to tell where one ends and the next begins
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fair point. I actually should probably just make them separate CI jobs.
@dafyddstephenson I think this might be ready to merge now? It runs both sets of tests in the CI successfully, and you can run either one independently locally. #99 can be addressed in a follow-up PR. |
Sorry I should have taken a better look before your last commit - I think having |
Moved to |
WIP to use pytest for the tests.
EDIT: I'm trying to change our tests from a run-once python script to an actual extensible testing framework. In particular that means tests should fully clean up after themselves automatically, and example blueprint generation should be extensible and robust.