Write a specification for unsatisfied task inputs. #359

rhpvorderman · 2020-03-06T07:51:58Z

This PR is a result of the discussion in #347 . This alternate PR was made because the scope shifted during the discussion.

These were some of the points made in the discussion:

Users should be able to know what inputs they should provide by looking at the input section of the top-level workflow. This should not be hindered by implicit optional inputs bubbling up.
For clinically validated workflows only a select number of inputs may ever be changed.
Allowing engines to pursue different models for computing inputs will lead to compatibility problems.
Requiring all task optional inputs to be filled in the top-level workflow input section is infeasible and an enormous hassle.

This PR attempts to address all these points by:

Requiring all required task/workflow inputs to be filled by the calling workflow. This way the top-level workflow inputs section will always contain all the inputs necessary for running the workflow. (Solves point 1)
Requiring workflows to only accept inputs for the top-level workflow by default (solves 2 and 3)
Give engines the flexibility to allow for bubbled-up optional inputs if a configuration option is provided. This gives users the flexibility needed to solve issue 4.

Supporting bubbled-up inputs optionally is feasible in Cromwell thanks to how @cjllanwarne implemented the input calculation. I don't know about the other engines, but since it is optional, it should not be an extra burden.

Pinging @cjllanwarne , @mlin , @orodeh and @DavyCats as they brought up these points in the discussion.

EDIT: I also updated the example to reflect changes in the WDL spec so far.

Checklist

Pull request details were added to CHANGELOG.md

…rkinputs

versions/development/SPEC.md

DavyCats · 2020-03-06T08:27:18Z

versions/development/SPEC.md

+* There is currently no way to supply a bubbled-up input in an outer workflow's 
+  call block.
+    * Example: one cannot say call my_workflow as subworkflow 
+      `{ inputs: my_task.my_task_input=... }`


Maybe this should be made allowed?

Yes maybe. But we should do that in another PR I think.

versions/development/SPEC.md

DavyCats · 2020-03-06T09:06:48Z

versions/development/SPEC.md

+* By default an engine only allows inputs that are specified in the input
+  section of the top-level workflow.
+* An engine may optionally support supplying bubbled-up optional inputs, but
+  this has to be explicitly enabled on the engine (via configuration, command
+  line flags or otherwise).


I still feel it might be better to add some form of syntax in the WDL file itself. That way one would be able to see that there may be interesting nested inputs in the WDL files and, thus, that it would be good to use an engine which supports them.

In addition, a workflow might be written without the intention for the bubble-up behavior to be used. If someone then uses such a flag or configuration to allow for the bubble-up, they might be able to set optional inputs which will mess up the workflow.

An alternative to the last two points in the list:

By default only inputs defined in the top-level workflow are made available to the user.

The keyword etc must be added to a workflow's input section to enable the bubble-up behavior described above. Only if this etc keyword is present in the workflow's input section may an engine allow for this bubble-up behavior and an engine may choose not to support to feature at all.

The etc keyword makes sense to me. However, why not just allow it in the workflow meta section? This way, we don't need new syntax.

@DavyCats good points! @orodeh good suggestion! Setting it in the workflow seems indeed better than setting it in the engine, and using the meta section for this seems the most logical thing to do.

Something like:

meta { bubbled-up-inputs: true }

Except that I don't like the bubbled-up word. It is in the PR, but more on "a lack of a better term" basis.

How about nested inputs?

meta { allow-nested-inputs: true }

Or some other suggestions:

meta { allow-hidden-inputs: true allow-subworkflow-inputs: true allow-extended-inputs: true }

I'm not sure that the meta section is the appropriate place for this. The meta section is just a set of key-value pairs that hold no relevance to the execution of a workflow. It is intended to be a place where you might store contact information or a link to the documentation. This was also discussed (briefly) in #351.

This feels more like something that would fit in the runtime section, except that there is no runtime section for workflows right now.

allow-nested-inputs seems fine to me, though I wonder if using -s might cause issues with the grammars.

Pinging @patmagee for the grammar question.

Well we could consider adding a runtime section for workflows, but that requires another spec/language change. (Which is why I wanted to set it engine-side) . Adding a runtime section could be useful, but there need to be other relevant things stored there. Having an entire runtime section for allow-nested-inputs seems a bit ludicrous to me.

I think the meta section is quite appropriate: It states that this workflow has nice nested inputs. A user does not have to use them. I think it is more documentation than runtime information, since an engine does not have to support nested-inputs per se.

These are actually really great examples of something that can be added to the hints section, and we leave it up to the engine to decide what to do. We tried to really trim down the runtime lock and make it the bare minimum requirements needed to run a workflow.

We already have the idea of "semi reserved" kw used in the hints section. These are hints that have similar meanings across all engines but do not need to be implemented.

I'd suggest doing something like:

hints { allowNestedInputs: true }

@patmagee, yes, that seems to be the best section but in the PR for the hints section I only see it at the task level. Will there also be a hints section at the the workflow level?

This doesn't feel like it quite matches the semantics of a "hint" to me to be honest.

A hint to me should mean "if an engine completely ignores this, it might miss out on some runtime optimization but the logical evaluation of the workflow should still be possible".

What we're talking about here is:

a warning to users that their workflow might not work on all engines.

a requirement to supporting engines that they should allow bubbled up inputs.

a signal for engines which don't support this feature to quickly stop processing the workflow/task

I'd almost like to add a way to flag plugins or non-standard-features or engine-requirements. I know Ohad has a number of other ideas for features which would be really cool but tricky for Cromwell. A way to flag these high up in the context of a WDL file (perhaps even above the workflow scope) would be really nice.

eg something like:

version 2.0 plugins { bubbled-up-inputs, streamable-input-files, ??? }

it's not 100% ideal because it allows engines to diverge, but at least it'd be obvious right at the top of the file whether your workflow will work with any particular engine (which would presumably be able to publish or otherwise indicate which features they support). It would also allow engines to "get out quickly" if they see required features which they didn't have.

orodeh · 2020-03-09T17:12:51Z

I have joined @patmagee in trying to build a grammar for WDL with ANTRL4. My part is writing scala code that builds an AST from a WDL program (https://github.com/dnanexus-rnd/wdlTools). I now have a much better appreciation for the amount of work it takes to modify and maintain the grammar. Hence my reluctance to add additional syntax, unless we really need it.

rhpvorderman · 2020-03-10T06:52:08Z

Given the feedback I coin the following proposal:

If there is no additional use for a hints or runtime section for workflows, then we should settle on the meta block for harboring a key called allowNestedInputs.

If in due time the need arises for a runtime or hints section before WDL2.0 is released, we can always change this. If the need arises after WDL2.0 is released, we can deprecate (but still support) the use of allowNestedInputs in the meta section to provide a smooth transition.

What do you think? If we are all in agreement that the meta section is (for now, in a pragmatic sense) the way to go I will go ahead and update the proposal.

orodeh · 2020-03-10T14:11:20Z

Sounds good to me, thanks.

cjllanwarne · 2020-03-11T03:41:28Z

@rhpvorderman I should have read further down before replying in thread!

Is the idea that an engine could choose either to allow bubbling up only if this flag was set, or else to fail the workflow when it sees that flag?

Your answer seems fine to me for now. The more general "how do we allow for diverging engine features" can be part of a future discussion.

rhpvorderman · 2020-03-11T06:30:33Z

Is the idea that an engine could choose either to allow bubbling up only if this flag was set, or else to fail the workflow when it sees that flag?

I think the engine should allow bubbling up. If it does not support bubbling up I think it should fail with a message, but only if bubbled-up namespaced inputs are actually used. Since all required inputs should be filled by the top-level workflow, this way all workflows can be run by all engines, regardless whether the flag is present or not.

Co-Authored-By: DavyCats <[email protected]>

rhpvorderman · 2020-03-11T06:52:33Z

I added the flag to the PR. I did not add any stuff about engines optionally supporting this feature yet. Do we want to specify this diverging behavior now, or can this wait for later since WDL2.0 is not released yet?
As it is in the PR currently, all engines should support it, as no exceptions are made.

DavyCats · 2020-03-11T08:41:37Z

Regarding diverging behavior, I see two viable stances that might be taken:

The spec never declares a feature as optional, but an engine may simply choose not to implement a certain feature. The engine wouldn't be fully spec-compliant, but if that's properly documented it is probably okay.
In cases where a feature might interfere with the possibility for a workflow to be run on certain infrastructure, such a feature may be declared in the spec as an optional feature that an engine may choose not to implement. Only features who's absence would not interfere with the runnability of a workflow may be optional, ie. even if the feature is not available in an engine the engine should still be able to run a workflow which would otherwise have made use of it, simply with less functionality/convenience/configurability.

rhpvorderman · 2020-03-18T07:27:49Z

I propose we follow @cjllanwarne's suggestion and discuss diverging engine behavior in a future discussion.

I think the PR is ready. Are there any points that need to be addressed on this PR before it gets into voting?

DavyCats · 2020-04-14T10:15:14Z

Should voting be opened on this? There hasn't been any further discussion in a while.

rhpvorderman · 2020-04-24T15:23:56Z

@patmagee Could you initiate the vote on this? It might be accepted or rejected, but it's getting a bit stale at the moment. I would do it myself if I had the label powers 😇. Thanks for your effort!

patmagee · 2020-04-29T12:44:51Z

@rhpvorderman can you please update the changelog.md :)? Once you do that, I can opn this up for voting

…nputs

…ionalinputs

rhpvorderman · 2020-05-04T06:44:38Z

@patmagee Done!

orodeh · 2020-05-06T16:30:47Z

👍

DavyCats · 2020-05-07T07:18:39Z

👍

pieterlukasse · 2020-05-11T14:04:35Z

versions/development/SPEC.md

+  again with a further-qualified name.
+    * Example: my_outer_workflow.my_workflow.my_task.my_task_input.
+* There is currently no way to supply a bubbled-up input in an outer workflow's 
+  call block.


Not sure I understand this point. Does it contradict the previous one? I.e. the previous one seems to introduce an option, but this point seems to state it is not possible. Can you please clarify?

The first point is about the input json (or whatever other input format)

The outer workflow's call block is in the WDL code itself. When you call a workflow from inside a workflow. In that area there is currently no way to supply nested inputs.

pieterlukasse · 2020-05-11T14:08:54Z

thanks @rhpvorderman . Looks good to me 👍
I had only 1 question regarding part of the documentation. Maybe you can take a look at that?

rhpvorderman · 2020-05-26T13:51:43Z

Cromwell implementation pending in broadinstitute/cromwell#5523.

I know this spec change does not generate a lot of buzz. However, it is important that it gets merged as a lot of the behavior it described was previously undefined. Cromwell, for example, allowed for calling tasks without supplying any inputs to the call block. These inputs could then be supplied by the inputs json of the workflow. This carries with it a lot of implicitness that makes workflows very hard to read.
In order to get the implementation to be similar across engines a clear spec is needed.

patmagee · 2020-06-24T14:48:43Z

This has passed with a 3-0 vote. @rhpvorderman would you be able to rebase the changelog so that this can be merged

Currently blocked by: broadinstitute/cromwell#5523

rhpvorderman · 2020-06-25T11:52:01Z

@patmagee done. Thanks

This implements part of the spec change that handles inputs openwdl/wdl#359

versions/development/SPEC.md

Co-authored-by: Miguel Covarrubias <[email protected]>

DavyCats · 2020-08-06T08:31:58Z

@rhpvorderman This was implemented in cromwell 52, wasn't it?

rhpvorderman · 2020-08-11T05:41:39Z

@DavyCats partially.
The optional inputs are allowed now. The very strict checking on whether all required options are satisfied by the top-level workflow was not.

jdidion · 2020-11-03T15:35:20Z

versions/development/SPEC.md

+  In this case, the inputs bubble up to become a nested input to the workflow 
+  instead.
+    * Example: an unsupplied input might have the fully-qualified name 
+      `my_workflow.my_task.my_task_input.`


Period should go outside the backtick, I think

Thanks for spotting this!

jdidion · 2020-11-13T14:05:13Z

versions/development/SPEC.md

+* By default an engine only allows inputs that are specified in the input
+  section of the top-level workflow.
+* If a workflow is suitable for use with nested inputs it should be explicitly
+  stated. This is done by setting `allowNestedInputs` to `true` in the meta


My problem with this is that things like allowNestedInputs don't really belong in meta - there's just no where else to put them. I am going to open a proposal to add a hints section to workflow with the same semantics as it has in task. I think things like this more appropriately belong in hints than in meta.

I think when designing the hints section, I ultimately planned on having one in the workflow as well. however it felt like a more natural transition to start with it in the task. In the lont term this is probably a good move. If we add the hints in 1.1, I feel like 2.0 can propogate the hints to the workflow

I agree! Meta was always a bit of a "let's put it where it doesn't hurt" manoeuver.

Not having this feature proved to be a major hindrance in getting actual work done. But at least it is implemented in cromwell now, so it is at least possible to get work done until this feature lands.

If the hints proposal is landed I will rewrite this PR.

jdidion · 2021-02-01T18:42:28Z

This is in v1.1 and will be propagated to development.

rhpvorderman added 9 commits December 11, 2019 08:19

lift requirement for all workflow inputs to be filled

10b055e

address @DavyCats review comments

dcbfafa

encourage filling in required inputs for tasks

ecd270d

Merge branch 'reworkinputs' of github.com:/rhpvorderman/wdl into rewo…

6ba8da5

…rkinputs

Integrate cjllanwarne's proposal

60fcf59

remove redundant new line

134f090

better wording

549eed6

require required inputs

2d19af2

update example

583e0a8

rhpvorderman mentioned this pull request Mar 6, 2020

Do not require for all task inputs to be filled by the calling workflow #347

Closed

DavyCats reviewed Mar 6, 2020

View reviewed changes

rhpvorderman mentioned this pull request Mar 10, 2020

Add support for namespaced inputs again. broadinstitute/cromwell#5317

Closed

illusional mentioned this pull request Mar 10, 2020

Support parameter_meta section PMCC-BioinformaticsCore/python-wdlgen#1

Closed

rhpvorderman and others added 2 commits March 11, 2020 07:33

Apply suggestions from code review

9bd28ac

Co-Authored-By: DavyCats <[email protected]>

add allowNestedInputs

1e28d58

patmagee added this to the v2.0 milestone Apr 3, 2020

patmagee added the Spec Change label Apr 3, 2020

rhpvorderman added 2 commits May 4, 2020 08:41

Merge remote-tracking branch 'remotes/upstream/master' into optionali…

cd69299

…nputs

update changelog

53daafa

Merge branch 'optionalinputs' of github.com:rhpvorderman/wdl into opt…

0e95a27

…ionalinputs

patmagee added the Voting Active label May 6, 2020

pieterlukasse reviewed May 11, 2020

View reviewed changes

Merge branch 'master' into optionalinputs

addd64f

rhpvorderman mentioned this pull request May 26, 2020

Implement input specification change broadinstitute/cromwell#5523

Closed

patmagee added Waiting for implementation and removed Voting Active labels Jun 24, 2020

Merge branch 'master' into optionalinputs

cd672b7

jdidion changed the base branch from master to main June 29, 2020 14:05

rhpvorderman added a commit to broadinstitute/cromwell that referenced this pull request Jul 1, 2020

Allow nested inputs when corresponding meta setting is set to true

aded97f

This implements part of the spec change that handles inputs openwdl/wdl#359

rhpvorderman mentioned this pull request Jul 1, 2020

Allow nested inputs when corresponding meta setting is set to true broadinstitute/cromwell#5562

Merged

mcovarr reviewed Jul 8, 2020

View reviewed changes

versions/development/SPEC.md Outdated Show resolved Hide resolved

Typo

653efbe

Co-authored-by: Miguel Covarrubias <[email protected]>

jdidion mentioned this pull request Nov 3, 2020

Review unsatisfied task inputs spec change, see if any dxCompiler changes are required dnanexus/dxCompiler#8

Closed

jdidion approved these changes Nov 3, 2020

View reviewed changes

Better period placement.

138b469

jdidion reviewed Nov 13, 2020

View reviewed changes

jdidion closed this Feb 1, 2021

Write a specification for unsatisfied task inputs. #359

Write a specification for unsatisfied task inputs. #359

Conversation

rhpvorderman commented Mar 6, 2020 • edited Loading

Checklist

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

rhpvorderman Mar 9, 2020 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

cjllanwarne Mar 11, 2020 • edited Loading

Choose a reason for hiding this comment

cjllanwarne Mar 11, 2020 • edited Loading

Choose a reason for hiding this comment

orodeh commented Mar 9, 2020

rhpvorderman commented Mar 10, 2020 • edited Loading

orodeh commented Mar 10, 2020

cjllanwarne commented Mar 11, 2020

rhpvorderman commented Mar 11, 2020

rhpvorderman commented Mar 11, 2020

DavyCats commented Mar 11, 2020

rhpvorderman commented Mar 18, 2020

DavyCats commented Apr 14, 2020

rhpvorderman commented Apr 24, 2020 • edited Loading

patmagee commented Apr 29, 2020

rhpvorderman commented May 4, 2020

orodeh commented May 6, 2020

DavyCats commented May 7, 2020

pieterlukasse May 11, 2020 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

pieterlukasse commented May 11, 2020

rhpvorderman commented May 26, 2020

patmagee commented Jun 24, 2020 • edited Loading

rhpvorderman commented Jun 25, 2020

DavyCats commented Aug 6, 2020

rhpvorderman commented Aug 11, 2020

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

jdidion commented Feb 1, 2021

rhpvorderman commented Mar 6, 2020 •

edited

Loading

rhpvorderman Mar 9, 2020 •

edited

Loading

cjllanwarne Mar 11, 2020 •

edited

Loading

cjllanwarne Mar 11, 2020 •

edited

Loading

rhpvorderman commented Mar 10, 2020 •

edited

Loading

rhpvorderman commented Apr 24, 2020 •

edited

Loading

pieterlukasse May 11, 2020 •

edited

Loading

patmagee commented Jun 24, 2020 •

edited

Loading