Clean new config #269
Conversation
The failing tests are due to a mistake on my end. I did an automated rename of something and I thought it applied only to a file, when it applied to the whole repo... will fix this.
Pull Request Test Coverage Report for Build 4409663436
💛 - Coveralls
So a few questions to guide the review:
If we do not want to raise errors, could we return an object that describes whether the simulation failed (and how)? This would be good information to pass along via WPS process / logging.
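For discussion, a minimal sketch of what such a return object could look like; the class and field names below are purely illustrative, not an existing ravenpy API:

```python
from dataclasses import dataclass
from pathlib import Path
from typing import Optional


@dataclass
class SimulationOutcome:
    """Hypothetical summary of how a Raven run ended."""

    success: bool                  # True if Raven exited cleanly
    output_path: Optional[Path]    # directory holding the outputs, if any were written
    message: Optional[str] = None  # diagnostics to forward to the WPS process / logging
```

Returning something like this instead of raising would let the WPS process log the failure and keep going.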
I like
Good questions.
Trying to set up my env to help evaluate, debug and contribute to this!
I agree with Trevor and Richard's comments on the way we should structure the code. Also, shouldn't we remove the Ostrich part in base.py? I never installed the binary for it since we were supposed to rip it out. So right now, if I don't comment out the Ostrich code, 60 tests fail, and if I do comment it out I still get errors for the Ostrich unit tests (16 failed tests) and 2 others which could be related but don't give Ostrich errors. So what do we do with this? Do we include the removal in this PR or in another PR?
If we can get those changes underway, that removes one of the blockers for #264 (the other is being worked on currently).
This PR is only about the new config, so yes, Ostrich is still there. I'll open a new PR to plug SPOTPY into the new config once this one is merged, and then we'll be able to remove Ostrich. The "build" step shown in the PR description configures the model. We can return True if the simulation has completed, or the path of the output directory.
I think what Richard meant is that we should have a dedicated class to do the Raven job, something like this: model = GR4JCN(...). It might be impossible to refactor this way now, but I think it's the way he was talking about.
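For reference, a rough sketch of the kind of interface this suggests; the method names and arguments below are assumptions for discussion, not the current code:

```python
# Hypothetical emulator-centric workflow, for discussion only.
model = GR4JCN(
    params=[...],              # model parameters (placeholder)
    start_date="2000-01-01",
    end_date="2002-01-01",
)
model.write("/tmp/gr4jcn")     # write the RV files to a working directory
model.run()                    # launch Raven on that configuration
outputs = model.parse()        # read the outputs back as Python objects
```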
That's very easy, I can whip something up for discussions. |
See the latest commit for a new Emulator class with methods:
|
I've tested the new push from yesterday and the only errors I have (non-related to Ostrich) is in test_read_from_netcdf and test_open_dataset_false_cache. In test_read_from_netcdf, we assert to have a message different than None, but we receive None. In test_open_dataset_false_cache, I get an error for Permission denied on accessing a file. |
Weird, no clear idea where this is coming from. Bugs notwithstanding, what I'm interested in is feedback on the API: the naming of functions and arguments, the workflow, etc. |
I'm currently trying to use the new_config in the spotpy test. I have 2 comments so far. First, I think the import of the model should be as clear as before: from ravenpy.models import GR4JCN, and not from ravenpy.new_config.emulators.gr4jcn import GR4JCN. So we could import all models from .emulators, or choose to import a specific one by naming it (as for GR4JCN). Second, I don't understand why in Emulator.build() we have overwrite set to True by default, but in Config.build() overwrite is set to False by default. I'll let you know if I have more comments as soon as I make it work with spotpy.
Agree with your first point, but this poses the problem of how to deal with the old and new versions. Maybe when this PR is completed we wipe out the old versions and new_config becomes config. Good catch, changed overwrite to False in Emulator.build. |
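A common way to get the shorter import path is to re-export the emulators from the package's __init__.py; a sketch, assuming the other emulator modules follow the same layout as gr4jcn:

```python
# ravenpy/new_config/emulators/__init__.py (sketch)
from .gr4jcn import GR4JCN
# Other emulators would be re-exported the same way once they are ported,
# e.g. from .hmets import HMETS (module and class names assumed here).

__all__ = ["GR4JCN"]
```

That would allow from ravenpy.new_config.emulators import GR4JCN now, and eventually from ravenpy.models import GR4JCN once the old config is retired.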
test_spotpy_calibration should now run the calibration for gr4j. To run the other models that have a new config, we only need to add the low and high bounds. |
Ah yes, I understand the issue. I'll try spotpy with your new push. I'll let you know how it goes!
I don't see any forcing file in the new config. Is it the "path" params in the Emulator constructor?
Check tests/new_config/emulators. There is a pytest fixture called
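For context, the fixture mechanism is just standard pytest; a generic sketch with placeholder names and paths, not the actual fixture from the suite:

```python
import pytest


@pytest.fixture
def forcing_file():
    """Placeholder: in the real suite this points at the cached test NetCDF."""
    return "/path/to/era5_forcing.nc"


def test_emulator_runs(forcing_file):
    # pytest injects the fixture value by matching the argument name,
    # which is how the forcing file reaches the emulator tests.
    assert forcing_file.endswith(".nc")
```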
Maybe I misunderstood the question, but calibration.py has a SpotSetup class using the new config.
I was talking about the netCDF that we pass in for the training, but I think I found how you were passing it with the fixtures. I'll try to make it work without fixtures, but it seems good. It was a misinterpretation on my end!
I've managed to make spotpy work without fixtures, but it doesn't rerun Raven for each evaluation. It keeps returning the same Nash value of -0.117301, so it doesn't try to adjust the params. I'm looking for the issue right now.
So I found out in debugging that the params are changing, but the Nash is always the same (-0.117301), as stated in the other comment. I'll let you know if I find other info that could help you track down the source of the issue!
I think I know what's wrong. We need to instantiate the model at each iteration, not update the parameters.
Shouldn't this be the job of the build function? We do: self.config.params = list(x), while we were doing this before: self.model.config.update("params", np.array(x)). The only thing I see different is in the execute function, where we do self.setup() before calling run(). I'll try adding it in the simulation too and see what we get after!
But the config object underneath is completely different.
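In SpotSetup terms, that would mean rebuilding the configuration inside simulation() rather than mutating it. A sketch of the idea; Emulator, workdir and the run/parse details are taken loosely from this thread and the exact signatures are assumptions:

```python
class SpotSetup:
    """Sketch: build a fresh model for every spotpy iteration.

    The other methods spotpy expects (parameters(), evaluation(),
    objectivefunction()) are omitted here for brevity.
    """

    def __init__(self, model_cls, base_kwargs, workdir):
        self.model_cls = model_cls      # e.g. GR4JCN
        self.base_kwargs = base_kwargs  # everything except `params`
        self.workdir = workdir

    def simulation(self, x):
        # New config object per parameter vector, so nothing stale is reused
        # and Raven actually reruns with the new params.
        conf = self.model_cls(params=list(x), **self.base_kwargs)
        out = Emulator(config=conf, workdir=self.workdir).run(overwrite=True)
        return extract_hydrograph(out)  # hypothetical helper returning the simulated series
```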
I've been testing and trying to run this new_config, and I have hit a few snags. I've been able to overcome some of them, but others are blocking.

1. Gauge.from_nc: It seems it can only handle a single file with all variables merged. However, users of the ERA5 extraction tool (or anyone with different files for different variables) would have those separated. Is it possible to pass a list of files, as we could before with the ts forcing? Or maybe I just failed and it's already possible? I got errors saying it expects a str or path object, but not a set or tuple, etc. (A possible workaround is sketched after this comment.)

2. Gauge.from_nc (again): When I take the spatial mean of the ERA5 weather data, I average over latitude and longitude, so those dimensions disappear. The code then fails because it expects a gauge lat/lon. Is this necessary? I think it was not required before. Can we have a system where, if there is only 1 station, it just takes an arbitrary value (since it will not impact the results)? I can force it in the extras, but it seems like a redundant step if it's not useful in the actual code.

3. GR4JCN has a parameter defined as AVG_ANNUAL_RUNOFF, but I don't think this is what GR4JCN actually needs. The parameter is G50 (unless it is something else entirely, in which case I am at a loss as to what it is) and represents the median annual snowpack depth.

4. We need to expose more clearly how to run a model from rv / nc files in a folder that a user would upload. Try as I might, I was unsuccessful; I was only able to run models that I had previously built.

5. One notebook (02_Extract_geographical_watershed_properties.ipynb) fails for obscure reasons, to investigate.

I'll add more as I progress!
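On point 1, if Gauge.from_nc really does take a single path only, a possible workaround until it accepts a list is to merge the per-variable ERA5 files first with xarray (file names below are placeholders):

```python
import xarray as xr

# Combine the per-variable ERA5 files into one dataset along their shared
# coordinates, then write a single NetCDF that can be passed as one path.
ds = xr.open_mfdataset(
    ["era5_pr.nc", "era5_tasmin.nc", "era5_tasmax.nc"],
    combine="by_coords",
)
ds.to_netcdf("era5_merged.nc")
```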
@richardarsenault If you want to maintain a specific set of sections in the imports of a notebook: https://pycqa.github.io/isort/docs/configuration/action_comments.html#isort-split |
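For example, an isort: split action comment (documented at the link above) keeps everything above and below it in separate sections when isort reformats the cell:

```python
import os
import sys

# isort: split

# Imports below the split stay in their own block instead of being merged
# and re-sorted with the ones above.
import ravenpy
```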
This is a big refactor of the way models are configured and run. This PR does not remove any of the original code, but creates a new_config directory with the new architecture. It also drastically simplifies the user interface; see ravenpy/ravenpy.py for the two main functions users will use:
- run, to run the model on an existing configuration
- parse, to read the outputs and create Python objects

See tests.conftest.gr4jcn_config for an example of the configuration with the refactor. Writing the config files to disk is then done with m.write(path). The configuration also supports symbolic expressions.
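Put together, the intended workflow looks roughly like this; the exact arguments of run and parse are not spelled out above, so treat them as assumptions:

```python
from ravenpy.ravenpy import run, parse  # the two user-facing entry points

workdir = "/tmp/raven_run"   # placeholder working directory
m = gr4jcn_config            # a configuration object, e.g. built as in tests.conftest.gr4jcn_config
m.write(workdir)             # write the RV files to disk
run(workdir)                 # run the model on the existing configuration (arguments assumed)
outputs = parse(workdir)     # read the outputs and create Python objects (arguments assumed)
```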
I think the objective now would be to pinpoint non-intuitive stuff in the new configuration and fix those. Once we're happy, let's merge this PR and then create others adding the other emulators and functionalities we want to port.
Ping @Mayetea
This PR fixes #272