batch submission of many workflows #493

Open
lukasheinrich opened this issue Apr 16, 2021 · 3 comments

Comments

@lukasheinrich
Member

lukasheinrich commented Apr 16, 2021

For RECAST scans we often want to submit N workflows (N=100 or so) in one go. While we can submit them with a plain bash loop, it would be nice to be able to submit and manipulate a group of workflows together (submit/download/status).

e.g.

analysis1/reana.yml,analysis1/pars1.json
analysis1/reana.yml,analysis1/pars2.json
analysis1/reana.yml,analysis1/pars3.json
analysis2/reana.yml,analysis2/pars1.json
analysis2/reana.yml,analysis2/pars2.json
analysis2/reana.yml,analysis2/pars3.json
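
A minimal sketch of how such a list could be read, assuming one `spec,parameters` pair per line as in the example above. The function `parse_manifest` and the generated workflow names are hypothetical and not part of reana-client; the result could be handed to whatever submission mechanism is used.

```python
from pathlib import PurePosixPath


def parse_manifest(text):
    """Turn 'spec,params' lines into (name, spec, params) tuples.

    The naming scheme (<analysis dir>-<params file stem>) is an
    assumption made for illustration only.
    """
    jobs = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        spec, params = (part.strip() for part in line.split(",", 1))
        name = f"{PurePosixPath(spec).parent.name}-{PurePosixPath(params).stem}"
        jobs.append((name, spec, params))
    return jobs


manifest = """\
analysis1/reana.yml,analysis1/pars1.json
analysis2/reana.yml,analysis2/pars3.json
"""

for name, spec, params in parse_manifest(manifest):
    print(name, spec, params)
```
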
@tiborsimko
Member

tiborsimko commented May 6, 2021

WRT submit, if you try to run 100 of them in parallel, most would go into the incoming queue anyway, waiting for runtime slots to free up. As a typical example, out of 100 submitted workflows there would be 10 running and 90 queued. So submission could still be done via a tiny outer shell loop, I guess.

WRT getting status, this is indeed useful, and currently not easily possible without an outer shell loop either. We were musing about adding a filtering option to many reana-client commands that would allow listing only the workflows you are interested in, for example:

$ reana-client list --filter name=bsm --include-progress
NAME              RUN_NUMBER   CREATED               STARTED               ENDED                 STATUS      PROGRESS
bsm09             2            2021-04-05T18:12:31   2021-04-05T18:12:32   -                     running         6/12
bsm08             2            2021-04-05T18:12:29   2021-04-05T18:12:30   2021-04-05T18:22:56   finished       12/12
bsm07             2            2021-04-05T18:12:22   2021-04-05T18:12:24   -                     running         8/12
...

would display the progress statuses only for those workflows that are named *bsm*. See #510. (CC @ParthS007)
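
The intended semantics of a `name=bsm` filter can be sketched as a plain substring match. This is only an illustration of the idea: the function `filter_by_name`, the dictionary layout, and the `recast01` entry are made up here, and the actual implementation discussed in #510 may differ.

```python
def filter_by_name(workflows, needle):
    """Keep workflows whose name contains `needle` (plain substring match)."""
    return [wf for wf in workflows if needle in wf["name"]]


# First two rows follow the listing above; "recast01" is an invented
# non-matching entry added purely for illustration.
workflows = [
    {"name": "bsm09", "status": "running", "progress": "6/12"},
    {"name": "bsm08", "status": "finished", "progress": "12/12"},
    {"name": "recast01", "status": "queued", "progress": "0/12"},
]

for wf in filter_by_name(workflows, "bsm"):
    print(wf["name"], wf["status"], wf["progress"])
```
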

WRT download, how are you picturing it? Imagine you have BSM workflow run 1, run 2 and run 3. Would you download some file into subdirectories 1 and 2 in this case? I guess using a tiny outer shell loop may be the easiest solution here...
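
One possible layout, sketched under the assumption that each run gets a subdirectory named after its run number. The helper `download_targets` and the file name `plot.png` are hypothetical; nothing here calls reana-client, it only computes where files would land.

```python
from pathlib import Path


def download_targets(run_numbers, filename, root="."):
    """Map each run number to <root>/<run number>/<filename>."""
    return {run: Path(root) / str(run) / filename for run in run_numbers}


# Hypothetical file name; runs 1 and 2 as in the question above.
for run, target in download_targets([1, 2], "plot.png", root="bsm").items():
    print(run, target)
```
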

@lukasheinrich
Member Author

A shell loop works of course, but it could result in many repeated API calls on the server. E.g. Condor has a similar concept of queue N instead of looping condor_submit .. the former is much faster than the latter.

With a batch submission it could all be wrapped in a single API call.
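
The difference in request count can be illustrated with a toy stand-in for the server. `FakeServer` and both of its methods are hypothetical and do not mirror the REANA API; the sketch only counts how many requests each approach would issue for 100 parameter files.

```python
class FakeServer:
    """Stand-in for a workflow service that only counts incoming requests.

    Both endpoints are hypothetical; they do not mirror the REANA API.
    """

    def __init__(self):
        self.requests = 0
        self.queued = []

    def submit_one(self, job):
        # One request per workflow, as in an outer shell loop.
        self.requests += 1
        self.queued.append(job)

    def submit_batch(self, jobs):
        # One request carrying the whole group of workflows.
        self.requests += 1
        self.queued.extend(jobs)


jobs = [f"pars{i}.json" for i in range(1, 101)]

looped = FakeServer()
for job in jobs:
    looped.submit_one(job)

batched = FakeServer()
batched.submit_batch(jobs)

print(looped.requests, batched.requests)  # 100 1
```

Both servers end up with the same 100 queued jobs; only the number of round trips differs.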

@tiborsimko
Member

Yes, one API call vs many API calls could make a difference if you are submitting, say, hundreds of workflows... What would be a typical number?
