Add dependency in restart test case #95
TestingThe |
This will need to be rebased and conflicts fixed after #96 goes in.
Because we don't know the filename of the restart file at setup time, we need to instead make the `full_run` step an explicit dependency of the `restart_run` step.
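For readers following the thread, here is a minimal sketch of the idea of declaring one step as an explicit dependency of another instead of listing a file as an input. The `Step` class and its `add_dependency`, `missing_dependencies` and `run` methods are hypothetical stand-ins written for this illustration. They are not the Polaris implementation, and the error wording is invented.

```python
class Step:
    """Toy stand-in for a test-case step (hypothetical, not the Polaris class)."""

    def __init__(self, name):
        self.name = name
        self.dependencies = {}  # step name -> Step that must run first
        self.has_run = False

    def add_dependency(self, step):
        # Record the step itself rather than a filename: the restart file's
        # name is not known at setup time, so it cannot be listed as an input.
        self.dependencies[step.name] = step

    def missing_dependencies(self):
        # Names of dependency steps that have not completed yet.
        return [name for name, step in self.dependencies.items()
                if not step.has_run]

    def run(self):
        missing = self.missing_dependencies()
        if missing:
            # Invented wording, for illustration only.
            raise OSError(
                f'dependencies of step {self.name} have not run: {missing}')
        self.has_run = True


full_run = Step('full_run')
restart_run = Step('restart_run')
restart_run.add_dependency(full_run)

try:
    restart_run.run()  # fails: full_run has not been run yet
except OSError as err:
    print(err)

full_run.run()
restart_run.run()  # succeeds once the dependency has run
```

The point of the sketch is only the ordering constraint: the restart step fails early and explicitly when the full run is skipped, rather than failing later on whatever file happens to be absent.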
c88f20e to 4f53281
I gave this another test with the PR suite against a baseline, and everything passed. I then ran the baroclinic channel restart test by manually running the init step, then the restart step, skipping the full run, and it crashed as expected:
Traceback (most recent call last):
  File "/home/ac.althea/miniconda3/envs/polaris-test-2/bin/polaris", line 33, in <module>
    sys.exit(load_entry_point('polaris', 'console_scripts', 'polaris')())
  File "/gpfs/fs1/home/ac.althea/code/polaris/fix-restart-test-inputs-outputs/polaris/__main__.py", line 62, in main
    commands[args.command]()
  File "/gpfs/fs1/home/ac.althea/code/polaris/fix-restart-test-inputs-outputs/polaris/run/serial.py", line 176, in main
    run_single_step(args.step_is_subprocess)
  File "/gpfs/fs1/home/ac.althea/code/polaris/fix-restart-test-inputs-outputs/polaris/run/serial.py", line 134, in run_single_step
    _run_test(test_case, available_resources)
  File "/gpfs/fs1/home/ac.althea/code/polaris/fix-restart-test-inputs-outputs/polaris/run/serial.py", line 409, in _run_test
    _run_step(test_case, step, test_case.new_step_log_file,
  File "/gpfs/fs1/home/ac.althea/code/polaris/fix-restart-test-inputs-outputs/polaris/run/serial.py", line 499, in _run_step
    raise OSError(
OSError: output file(s) missing in step restart_run of ocean/baroclinic_channel/10km/restart: ['/home/ac.althea/ac.althea/polaris_tests/baroclinic/fix-restart-test-inputs-outputs/ocean/baroclinic_channel/10km/restart/restart_run/output.nc']
Everything looks as it should, as far as I can tell.
@altheaden, that's odd. The error you see isn't what I expected or what I see when I try the same. I see:
That's what I was expecting to see -- it's complaining about an input rather than an output file.
I'm going to go ahead and merge but it would be good to know what the workflow was that produced the results you saw.
I can recreate it today and see what the results are. As far as I can tell, I did the same process that you did, but let me see if my results are different this time around.
@xylar Here is a longer version of the error message I get (not sure how much is useful for you to see), still ending in the same error. Not sure what I'm doing differently.
@altheaden, is this in a directory where you already ran the command successfully once? Even if so, it's weird that it doesn't just run successfully and instead has errors. We would probably need to look at But it seems like you're seeing a rather different and more unexpected behavior than I was seeing. Maybe let's let it be for now. If we see this again, we can investigate further.
@xylar I actually just made sure to update the submodules and re-make before setting up the test again. Every time, I have been setting up a new directory and just doing the workflow I showed (cd init, polaris serial, cd restart_run, polaris serial). I just did it again and got the same results. Then, I went and manually ran the full run step before running the restart run step and they were both successful.
I just checked the error files from my restart_run test, and they all just say that the files in the
@altheaden, those all look like errors I would have expected to see before this branch. Any chance you were accidentally testing from a different branch (e.g. an earlier version of But also, like I said, it's not critical to figure this out if you'd rather let it go.
@xylar I just did as you asked and now I'm getting the same error you were, a missing input file. No idea why it was different for me before...
Checklist
`Testing` comment in the PR documents testing used to verify the changes