let preempted and resumed runs start their subsequent runs #1414

Merged
merged 3 commits into from
Sep 20, 2023

Conversation

orichters
Contributor

@orichters orichters commented Sep 18, 2023

Purpose of this PR

  • See issue "runs restarted after preempt don't start subsequent runs" #1413.
  • start.R now asks whether a restarted run should start its subsequent runs as well.
  • The automatic detection based on the existence of full.gms is therefore misleading: it always yields FALSE for preempted and resumed runs.
  • So always start subsequent runs here, except for coupled runs or if the user chose not to in start.R.
  • Make sure the old log is also saved in case of preemption (I don't think this works the way I implemented it, as the file has probably already been overwritten at this point).
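
The decision rule described in the list above can be sketched as follows. This is only an illustration in Python, not the R code changed in this PR; the function name and both parameter names are hypothetical.

```python
def should_start_subsequent_runs(is_coupled: bool, user_declined: bool) -> bool:
    """Decide whether a run should start its subsequent runs.

    Hypothetical helper: the decision no longer depends on whether
    full.gms exists, so preempted and resumed runs are treated like
    fresh runs.
    """
    # Coupled runs are the stated exception and never start successors here.
    if is_coupled:
        return False
    # Otherwise follow the choice made when the run was (re)started.
    return not user_declined
```

With this rule, a preempted and resumed standalone run starts its successors unless the user explicitly declined.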

Type of change

  • Bug fix

Checklist:

  • My code follows the coding etiquette
  • I performed a self-review of my own code
  • I explained my changes within the PR, particularly in hard-to-understand areas
  • All automated model tests pass (FAIL 0 in the output of make test)
  • The changelog CHANGELOG.md has been updated correctly

@orichters orichters linked an issue Sep 18, 2023 that may be closed by this pull request
@orichters
Contributor Author

orichters commented Sep 19, 2023

I did it as suggested, and this seems to work. While requeued, rs2 shows:

Folder           Runtime      inSlurm   RunType      RunStatus          Iter              Conv                   modelstat              Mif  AppResults
testOneRegi      pending      priority  quick EUR    Run in progress    1/1               NA                     NA                     no   no

Showing pending together with Run in progress is not very intuitive, but it is not wrong either; we can adapt it if needed.

The log (see less /p/tmp/oliverr/remind-smallfix/output/testOneRegi/log.txt) states:

Starting REMIND...
GAMS will provide logging in full.log.
slurmstepd: error: *** JOB 27909012 ON cs-f14c02b07 CANCELLED AT 2023-09-19T17:38:06 DUE TO JOB REQUEUE ***
Global .Rprofile loaded! (R version 4.1.2 (2021-11-01))

@dklein-pik
Contributor

When I look into the file with less /p/tmp/oliverr/remind-smallfix/output/testOneRegi/log.txt, I can't find the slurmstepd line. Is this a bad sign?

@orichters
Contributor Author

I added the log appending to the coupling scripts as well. I hope I caught all the places that need it.
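
As an illustration of why appending matters here: if the log is opened in write mode on every (re)start, anything written before a requeue is lost, whereas append mode keeps it. A minimal Python sketch of that difference follows; it is not the R code from this PR, and `write_log` is a hypothetical helper.

```python
import tempfile
from pathlib import Path


def write_log(log_path: Path, text: str, append: bool = True) -> None:
    """Write text to the log file.

    Hypothetical helper: append mode keeps what earlier attempts wrote,
    write mode truncates the file first.
    """
    mode = "a" if append else "w"
    with log_path.open(mode, encoding="utf-8") as log:
        log.write(text)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        log_path = Path(tmp) / "log.txt"
        write_log(log_path, "first attempt\n")
        write_log(log_path, "after requeue\n")
        # Both lines are present because the second call appended.
        print(log_path.read_text(encoding="utf-8"))
```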

@orichters
Contributor Author

When I look into the file with less /p/tmp/oliverr/remind-smallfix/output/testOneRegi/log.txt, I can't find the slurmstepd line. Is this a bad sign?

Can you look again? I restarted the run and therefore it went away, but now it should be back.

@orichters
Contributor Author

I did a bit more cosmetic cleanup and removed the deepEL config, which has been deleted anyway, from the list of configs skipped in the tests. make test passed, so let's merge.

@orichters orichters merged commit 420238a into remindmodel:develop Sep 20, 2023
2 checks passed
@orichters orichters deleted the fixpreempt branch January 31, 2024 11:39