Support Defining Environment Variables Within a Task #504

patmagee · 2022-05-13T20:36:42Z

This PR is the result of a few discussions that have been ongoing over the past while (#458 and #443) of different ways to make improve the security and usability of WDL.

Declare Environment Variables within a Task

This PR adds the ability to declare environment variables within the task scope, as well as to pass values for those variables within a call block. Most WDL engines rely on templating in values to the command section prior to command execution. While this is convenient and allows us to define our own expression language within a command block, it does mean that any WDL command is subject injection attacks or poor formatting during execution.

Using environment variables circumvents this issue by not relying on templating at all. Instead the execution engine safely escapes strings prior to setting them within the container (or other execution) environment at runtime. Environment variables are not usable within other WDL expressions and once defined are accessed using the normal shell semantics associated with the given interpreter.

Environment variables are all implicitly strings, and therefore can be set to any expression which either evaluates to a string, or is coercible to one.

task foo {
  input {
   String my_inp
  }
  env {
   MY_ENV_WITH_DEFAULT = my_inp
   MY_ENV
  }

  command <<<
    echo $MY_ENV $MY_ENV_WITH_DEFAULT
  >>>
}

workflow bar {
  
  call foo {
    input : my_inp = "foo"
    env: MY_ENV = "bar"
  }
}

Checklist

Pull request details were added to CHANGELOG.md

mlin · 2022-05-14T08:22:52Z

Thanks @patmagee for moving this along! As you know, I strongly support the general idea of enabling environment variables as an alternative (often a superior one) to ~{} text interpolations. As to the fine details of how to express it:

I'd prefer not to need the separate env: section of the calls. I think of the env entries as simply additional string inputs, which the engine will know to set in the container environment. If we make input & env more-separate concepts as proposed, then WDL implementations need to be extended to model them separately (for example- WDL.Task.available_inputs and WDL.Task.required_inputs need to mitosis into inputs & envs, and then all code consuming them needs to consider both). It'll be easier if we think of envs as just more inputs, that the runtime does something a bit different with.

So the call would just look like

  call foo {
    input : my_inp = "foo",
            MY_ENV = "bar"
  }

(Aside- I think we should get rid of the useless input: keyword in calls anyway, but that'll be another thread!)

Another idea worth considering was to use type specializations instead of a separate section, keeping the current declaration syntax, e.g.,

task foo {
  input {
   SomeStruct my_inp
   EnvString MY_ENV
  }

  EnvString STRUCT_JSON_FILE = write_json(my_inp)

  command <<<
    echo $MY_ENV $STRUCT_JSON_FILE
  >>>
}

workflow bar {
  
  call foo {
    input : my_inp = SomeStruct { ... },
            MY_ENV = "bar"
  }
}

This allows the task author to control which environment variables the caller can and can't override.

My feeling is that either of the above approaches do the job well and are somewhat simpler for both WDL authors and engine implementers.

mlin · 2022-05-14T08:46:54Z

Another angle on the prior comment: the container environment variables are an implementation detail that should be encapsulated inside the task. The caller doesn't really need to know whether a given input will turn into a container environment variable or not. The task author might want to change how they consume an input from interpolation to environment variable, and such a change needn't trigger required changes in calls too.

patmagee · 2022-05-14T13:18:43Z

@mlin that is a very valid point. The one thing I was trying to avoid was the automatically setting environment variables in the task based on input strings. Without a special prefix that would likely be error prone and potentially break many different tasks.

I must admit, I was initially torn between a type and a an env block. Both felt natural, the block felt like it encapsulated the concept of environment variables well. But having them buble up to the level of the workflow does feel odd to be honest. I guess that leaves us with a few approaches

Option 1: Keep the block in a task but require values to be set

This option would be a good way of being very declarative. We can keep the env block within a task that I created. The difference though is that every item must be assigned, either by a literal value or from an input value / private declaration.

The benefit to this approach is that it proves fine grained control over what gets exposed in the environment and it groups everything together.

task foo {
  input {
    String myInp
    Int maxNum
  }
  
  env {
   MYINP = myInp
   OTHER = "someValue"
   MAXNUM = maxNum
  }
}

The drawback to this approach is that it adds potentially more boilerplate to a task. The declared environment variables would still only be usable within the command block, but the workflow has no concept of environment variables

Option 2: Use a declared type

This would be following your suggestion and use a type like EnvString. It makes sense and I rather like it, especially if that string is treated like a normal string in any other wdl expression.

task foo {
  input {
   EnvString FOO
  }
}

The drawback of this approch is it adds a new type handled the same as a string but with special meaning

Option 3: add some sort of qualifier to the type

We could also go the route of adding an additional qualifier to the type or the declaration some ideas would be.

task foo {
  input {
    # explicitly state where to expose the string. this would be my favourite probably
    String FOO in env
    # use a type qualifier preceding the type. This one could also be my favorite
    env String FOO
    # Use some sort of qualifier coming after the type declaration
    String@env FOO
    String:env FOO
    String{env}

Type qualifiers have the drawback (or benefit) of being applicable to any type. Therefore any type could potentially be set in the env and it's the engines role to figure that out. Ie serialization of a struct or other complex type

I am definitely with you that we do not want to expand the concept of what constitutes an input, at least at this point.

mlin · 2022-05-15T03:54:41Z

# use a type qualifier preceding the type. This one could also be my favorite
env String FOO

I like this too!

One further question would be whether to allow env to prefix any type (Int, SomeStruct, etc.) and let the engine string-ify it (perhaps as JSON for compound types).

I think I'd say yes, but could be convinced otherwise

patmagee · 2022-05-15T16:47:36Z

@mlin I do not see a reason to limit this to only strings. I think allowing any type would gain a lot of support from the community. You could put it in a struct then easily use something like jq to manipulate the contents of the env variable

env Map[String,Array[String]] COMPLEX_TYPE
env MyStruct FOO

One question then is whether the env qualifier can be used outside of the task block. And whether it changes the type signature at all. For example would String and env String be equivalent? I think the answer is yes in the context of wdl expressions. Which if that is the case, the env keyword is less a qualifier and more a directive, telling the engine how to handle that variable

patmagee · 2022-05-15T16:57:31Z

Adding something like directives as a concept can generalize as well. For example in a task you could define readonly File myFile or remote File foo to indicate that a file should not be localized.

rhpvorderman · 2022-05-16T05:43:34Z

Using environment variables circumvents this issue ..."

"Environment variables are not usable within other WDL expressions"

Due to environment variables not being usable in WDL expressions, that limits their power quite a bit. For some input variables being able to derive other variables from them is quite essential for the good functioning of a task. Therefore environment variables are never a complete solution. As such there is not one way to define inputs, but two equally valid ways when this becomes part of the spec.

and once defined are accessed using the normal shell semantics associated with the given interpreter.

I find this a bit shaky ground, as this means that the engine and the shell will be more tightly coupled. I know the current spec defines bash as the standard shell, but some containers do not have bash. As such, on biowdl/tasks we try to avoid bashisms to make sure any task can run on other shells as well.

My main problem with adding environment variables is that instead of one, we create two ways of defining inputs. This is quite confusing, as it is quite arbitrary which method is to be preferred. If I may quote the Zen of Python: "There should be one-- and preferably only one --obvious way to do it." This change violates that principle and therefore it makes WDL less usable.

Given that the actual problem that we are trying to solve here is " injection attacks or poor formatting during execution" I think the solution should be more tailored towards that end. Another solution is to quote all WDL variables by default while templating and to provide a function that disables this (conveniently called unsafe(...)). That is a much smaller and more targeted solution to the problem at hand.

mlin · 2022-05-16T06:57:49Z

Environment variables are a universal mechanism that every shell/language interpreter will have an idiomatic way to consume, and will also be more familiar to authors learning WDL who do have other scripting experience.

I think shakier ground is the suggestion that engines could escape each value in a way that would be so universally safe and compatible. That suggestion works better if, on the contrary, we double-down on specifying that bash is the interpreter -- but even then, we'd face the common pattern of the bash script running an embedded heredoc in another language.

Re "Environment variables are not usable within other WDL expressions" -- I doubt that restriction is really needed. I view them as additional input declarations, so of course, they should be usable in other WDL expressions.

Re creating two ways of defining inputs -- it's an important point but my personal opinion is that the interpolation of strings directly into the command script was a minor design flaw from the beginning, and we should gradually deprecate them in favor of environment variables; but I don't expect everyone to share that opinion exactly...

rhpvorderman · 2022-05-16T07:12:59Z

@mlin. Agreed. Redesigning WDL to only use environment variables and deprecate the templating is the better option.

Re "Environment variables are not usable within other WDL expressions" -- I doubt that restriction is really needed. I view them as additional input declarations, so of course, they should be usable in other WDL expressions.

If I add up things correctly I come to the following conclusions:

There is no need for an additional syntax. Input variables can be declared in their current way.
Input variables can be used in expressions in any part of the WDL. Except the command block.
Inside the command block, only variable names can be used not expressions (i.e. ~{my_var} not {my_var * 5}. These variables are passed as environment variables.

This solves the templating/injection issue. While restricting the language instead of expanding it. That seems preferable to me.

cjllanwarne · 2022-05-16T18:55:22Z

Redesigning WDL to only use environment variables and deprecate the templating is the better option.

That's a really interesting concept. It would let us remove the (command) interpolation syntax entirely. And allow things like arrays to be supplied natively. Are there any use cases which would get worse if this were the case?

rhpvorderman · 2022-05-17T05:09:37Z

Are there any use cases which would get worse if this were the case?

Supplying optional variables comes to mind. Where you can do ~{"--optional-flag " + optional_value} now. That is a very common use case.

I guess that could still be made possible for optional values without templating? So that it either evaluates to "" or --optional-flag $OPTIONAL_VALUE.

In retrospect my statement "Input variables can be used in expressions in any part of the WDL. Except the command block." was a bit harsh. I guess you can still allow any sort of expression, as long as it evaluates to a singular variable that can be saved as an environment variable and does not depend on being templated to text.

rhpvorderman · 2022-05-27T12:50:24Z

And another thing that is hairy: How to work with tools that use the same flag multipe times to specify output files. For instance -o output1.fastq.gz -o output2.fastq.gz -o output3.fastq.gz.
Since you can have either R1, R1,R2, or R1,R2,R3 it makes sense to have the tool use a flexibel output flag like that. How to use that when disallowing templating? (I think templating is not too bad in this case, since the developer specifies the sep, the env vars can still be created as ENV_ONE, ENV_TWO etc.)

illusional · 2022-05-27T23:10:50Z

CWL has a pretty nice mechanism for constructing command lines, I think there's value in being separate from that though, for that reason I do like the command block in WDL, and generally making information available via env variables

patmagee · 2022-06-14T12:56:52Z

So, how about this for a concrete proposal:

Instead of the env { .... } block introduced in this PR, we add a new concept to WDL called Directives or Decorators (lets fight over what name seems better here). These are symbols that immediately preceed the type declaration and provide additional handling information for a specific Wdl Declaration.

In relation to environment variables, we can use the env direction/decorator/annotation to signal to a task that an environment variable should be generated from the given WDL declaration.

task foo {
  input {
    env String HELLO
  }

  # Also works
  String HELLO_FROM_BOB = HELLO + " from bob"

  command <<<
    echo ${HELLO}
    # Still works
    echo ~{HELLO}
  >>> 

}

mlin · 2022-06-17T08:23:20Z

@patmagee while I like the concrete proposal env String etc., as well as some of the other ideas mentioned, I'm a little wary of calling it an instance of a more-general feature (Directive/Decorator) without thinking through what else we might have that feature do, which might delay us.

As an example of the can of worms I think that could open, a long time ago @orodeh and I sketched out adding a metadata section for each input declaration, e.g.

input {
  File reads { help: "BAM file", patterns: ["*.bam"] }
}

To be clear: I'm NOT trying to advance that now, just mentioning the kind of thing that might bog us down if we get into use cases for a general concept like Directive/Decorator.

patmagee · 2022-06-17T14:26:55Z

@mlin good point, I am happy to keep this restricted to just the env keyword for now and see where we naturally extend this too

patmagee · 2022-08-10T20:09:59Z

I modified the PR to remove the explicit env {...} section to allow ANY task input or declaration to be set as an environment variable using the env modifier

patmagee · 2022-09-21T16:34:28Z

@mlin @illusional @rhpvorderman @cjllanwarne do you have any issues with the current approach?

illusional

Generally happy with this approach, except I don't think you should be able to mark an environment variable from a workflow call

I'm still not sure how you would achieve something like this with the new approach.

--param1 value1 --param2 value2

illusional · 2022-09-21T22:32:12Z

versions/development/SPEC.md

@@ -2791,7 +2868,7 @@ workflow main_workflow {

 Any required workflow inputs (i.e. those that are not initialized with a default value or expression) must have their values provided when invoking the workflow. Inputs may be specified for a workflow invocation using any mechanism supported by the execution engine, including the [standard JSON format](#json-input-format). 

-By default, all calls to subworkflows and tasks must have values provided for all required inputs by the caller. However, the execution engine may allow the workflow to leave some subworkflow/task inputs undefined - to be specified by the user at runtime - by setting the `allowNestedInputs` flag to `true` in the `meta` section of the top-level workflow. For example:
+By default, all calls to subworkflows and tasks must have values provided for all required inputs and environment variables by the caller. However, the execution engine may allow the workflow to leave some subworkflow/task inputs/environment variables undefined - to be specified by the user at runtime - by setting the `allowNestedInputs` flag to `true` in the `meta` section of the top-level workflow. For example:


Is this still relevant? I think it's more clear to say "any input marked with the env directive is required, or not too.

illusional · 2022-09-21T22:34:33Z

versions/development/SPEC.md

@@ -2576,14 +2652,15 @@ workflow wf {

  # Calls my_task with one required input - it is okay to not
  # specify a value for my_task.opt_string since it is optional.
-  call my_task { input: num = i }
+  call my_task { input: num = i env: NUM = i }


I don't super love this. I think all tool interactions should be handled by the tool wrapper and not be the caller.

I second this.

That should definitely be removed, I think its outdated from the previous iteration. I will update the PR

rhpvorderman · 2022-09-23T06:35:36Z

GOVERNANCE.md

 | Patrick Magee      | DNAstack                        | [patmagee](https://github.com/patmagee) |
 | Brian O'Connor     | University of California, Santa Cruz | [briandoconnor](https://github.com/briandoconnor) |
 | Geraldine Van der Auwera | Broad Institute | [vdauwera](https://github.com/vdauwera) |
 | Christopher Llanwarne | Broad Institute | [cjllanwarne](https://github.com/cjllanwarne) |
 | John Didion | DNAnexus | [jdidion](https://github.com/jdidion) |
+| Michael Franklin | Centre for Population Genomics | [illusional](https://github.com/illusional) |
+| Amy Paguirigan | Fred Hutch | [vortexing](https://github.com/vortexing) |
+| Ruben Vorderman | Leiden University Medical Center | [rhpvorderman](https://github.com/rhpvorderman) |


There is some PR cross-contamination here.

oops, will fix!

rhpvorderman · 2022-09-23T06:48:02Z

@patmagee, my primary comment is that I think this is a inverse approach by requiring the maintainer to put env in front off everything to make sure no shell code can be injected. Forget the env keyword one time? That is a security-hole.
If security is a concern, environment variables should be the default, and in cases where that leads to problems a template keyword should be used to distinguish variables that rely on the old behavior. EDIT: (This would of course be a change for 2.0 then, as it is not fully backwards-compatible.)

Speaking from a BioWDL perspective: we don't have any pressing security concerns since all our tasks are containerized, and only user inputs are mounted in the containers. The containers are run as the UID exclusively and not as root. Furthermore our pipelines are not run as a service, and run by individual users on our compute cluster. They can only break things they have the right to break in the first place so it makes no sense adding lots of guards against this. In practice security is a non-issue for us. I say this not to invalidate the PR, but to give the proper weight to my own opinion: very low. We have no stake in this.

rhpvorderman · 2022-09-23T07:23:12Z

I have been thinking somewhat more about this and I start to wonder: what does this feature protect you from that containerization does not already? If users can submit their own workflows to the system, then this addition is pointless. So this protects specifically those instances where an entity provides WDL as a service, but did not containerize the backend properly? (Instead opting to provide all bioinformatics tools under the sun in a single environment without dependency conflicts). That seems a very unlikely use case.
There is a clear trade-off at work here. The language will be more complex, but also more secure. But how much more secure? Without a detailed assessment of the security implications it is impossible to make a proper judgement. On the other hand, with the changes I proposed above (environment variables are default) the complexity cost is going to be so low that it is worth it, in my opinion.

patmagee · 2022-09-23T13:08:17Z

@rhpvorderman you make some good points. To start, containerization gets you pretty far down the road of security assuming that the system is using a secure container technology. Unfortunately even in the most restricted environments where users can only pick from a list of pre-approved workflows there is the opportunity for abuse. For example, If you happen to know the execution environment, you could construct a nefarious string that would run a specific attack within the container (not so bad), but more importantly within the network the container is running in potentially under any authority assigned to that container/vm. Google tools are great examples of this, where there is a metadata api allowing you to run gsutil within a container and actually retrieve a service account dynamically simply by running gsutil on a VM.

So all that is to say, I think in some circumstances the security concerns are warranted, but I agree that the overall impact to the user should be as minimal as possible. To be honest, the original motivator was to not override any existing variable within the container by automagically converting inputs into env variables. This actually may not be a big concern, especially not if this is a well documented feature. If all values were accessible either through the WDL semantics (~{foo}) AND their shell semantics ('${foo}`) it would probably be the most flexibly and lowest impact to the user.

mlin · 2022-11-25T10:03:58Z

Included this feature in miniwdl v1.8.0 version development; def invite anyone interested to hammer on it, can still change.

patmagee · 2022-12-22T15:38:28Z

@mlin I tried out this feature, and I have to say, really love it as defined. Honestly I am 100% happy with it :D

patmagee · 2022-12-22T16:17:29Z

I think this is ready for final review, and if everything looks good we can get it voted on

jdidion

LGTM - just a few small suggestions/questions

jdidion · 2022-12-22T17:08:48Z

versions/development/SPEC.md

+### Environment Variables
+
+Any input to a task may be converted into an environment variable that will be accessible from within the task's 
+shell environment during command execution. Unlike inputs, environment variables are not templated into the command when


I might say "interpolated" or "replaced in" rather than "templated into"

jdidion · 2022-12-22T17:09:19Z

versions/development/SPEC.md

+Any input to a task may be converted into an environment variable that will be accessible from within the task's 
+shell environment during command execution. Unlike inputs, environment variables are not templated into the command when
+preparing the execution script, but are actually available at runtime directly from the environment. They may be accessed using
+normal shell variable access semantics (ie `$FOO` or `${FOO}`).


jdidion · 2022-12-22T17:11:15Z

versions/development/SPEC.md

+environment, it does not change its access to normal WDL expressions within a command. That means you can refer to a declaration
+annotated with `env` either through the shell semantics (`${FOO}`) or through normal WDL semantics (`~{FOO}`).
+
+When an input is annotated with `env` it is the engines responsibility to serialize the value appropriately into a string.


jdidion · 2022-12-22T17:12:08Z

versions/development/SPEC.md

+annotated with `env` either through the shell semantics (`${FOO}`) or through normal WDL semantics (`~{FOO}`).
+
+When an input is annotated with `env` it is the engines responsibility to serialize the value appropriately into a string.
+Serializing complex values (`Map`, `Pair`, `Struct`, `Array`) should be equivalent to calling `write_json` on the given


Add a link [WDL Value Serialization and Deserialization](#appendix-a-wdl-value-serialization-and-deserialization)

Also, write_json writes the value to a file. Is that the desired behavior - that the environment variable will have a path to the file with the serialized value?

Great point to discuss:

On the one hand, if the developer definitely wants the JSON in a file then they can easily add a private declaration env File x_json = write_json(x). Given that's easy, maybe the environment variable for a compound value should contain the JSON directly.

On the other hand, JSON directly in the environment variable isn't very natural to work with in shell; often you'd end up just dumping it into a file to be parsed anyway. Also, developers should be discouraged from from cramming large compound values directly into environment variables, because there will usually be some system-imposed limit on environment size.

TBH, env File x_json = write_json(x) makes the intent much clearer to somebody reading the code, so whichever way we go on this question, I think we should actually discourage using env with compound values at all.

@mlin JSON directly as environment variables is actually not as cumbersome as you may think. We regularly use it in our own CI/CD infrastructure and pass values to and from tools like jq which are great at parsing json. We rarely ever need to use temporary files and just pipe input from one command to the other, storing intermediate entries in a environment variable.

I think pointing to the example of the write_json is actually not an accurate representation of what is happening. we probably want to just say that complex types will be serialized to JSON prior to being set as an environment variable.

There is of course the limitation on size of the environment variables. So Maybe we can provide guidance here: "Engines may impose limits on the total length a single environment variable is allowed to occupy as well as the number of environment variables that are allowed to be passed into a single task. If such limitations exist, engines should clearly document them for the end user."

patmagee · 2023-01-09T20:45:22Z

@mlin @jdidion I added a new sentence on limitations that may be imposed by engines and clarified the serialization, Ready for another look!

patmagee · 2023-02-13T19:19:01Z

I think with the recent updates, this PR is ready for voting and inclusion in the 1.2 Spec. If there is no one opposed, I will happily mark this as open for voting

mlin · 2023-03-26T23:31:19Z

@patmagee For your consideration...I asked GPT-4 to rewrite a few of the prose paragraphs more concisely. (Turning into a standard practice!)

Task inputs can be converted into environment variables accessible during command execution. Environment variables are accessed using shell variable syntax, like $FOO or ${FOO}. To access an input value as an environment variable, add the env modifier before a declaration within the task, including input sections and private declarations.

Despite the env modifier making declarations available in the environment, it does not alter access to WDL expressions within a command. Declarations annotated with env can be referred to using shell syntax (${FOO}) or WDL syntax (~{FOO}).

It is the engine's responsibility to serialize input values annotated with env into strings (see Serialization of WDL values) and set them as environment variables. Engines may impose limits on environment variable length and quantity, and should provide clear documentation on any limitations.

(129 words, down from 245)

mlin · 2023-03-26T23:54:31Z

I have some doubts about this:

When a value is declared with the env modifier, it becomes
the execution engine's responsibility to escape the string thus preventing any sort of interpolation. Functionally,
using an environment variable is the same as first escaping the string of any special characters and then wrapping it single quotes.
single_quote(escape(${variable}))

I think it's quite unusual to bind the environment variable to an escaped value. It means our environment variables have a peculiar "format", which seems to counteract the goal of using less-WDL-specific practices. I think the environment variables should be bound to the exact (stringified) value, so that task commands would inherit shell scripting's best practices and pitfalls for handling potentially-malicious environment variables securely. Chiefly, they should be used inside double-quotes, or if necessary explicitly escaped using printf "%q". Implicitly escaping the environment variable values would be a more-secure default, but the goal of the env feature IMHO should be to "let shell scripts be shell scripts."

Automatically escaping the stringified value feels to me like something I wish we'd done when we introduced the WDL-specific ~{} interpolation syntax. Maybe for 2.0?

patmagee · 2023-03-31T14:36:29Z

@mlin I think you are right, and the stanza you highlighted is really a bit of a misguided attempt and enforcing how env variables are set.

Fundamentally what I was trying to achieve with that statement is the ability to set environment variables without breaking scripts. This is probabyl more perscriptive then it needs to be, as all I really am trying to state is the engine should set the literal value similar to how the docker .env files works

jdidion · 2023-12-17T23:18:04Z

@patmagee please change target to wdl-1.2

patmagee · 2023-12-21T19:25:18Z

@jdidion I have update the spec against 1.2

markjschreiber · 2024-01-24T15:40:48Z

That being said, it is possible that behaviour could also become desirable. For example if you were using gcloud in a command, you could simply set an input called File GOOGLE_APPLICATION_CREDENTIALS and your gcloud would automatically authenticate with the right credentials.

This is probably not a use case that we would want to advertise, at least for containers on remote machines. Putting credentials into environment variables for containers is not a great idea. It is better to have the underlying host machine use an appropriate IAM role. For local execution it is probably OK (as long as you know the workflow is not going to echo them to logs or someone else's machines).

markjschreiber · 2024-01-24T15:47:16Z

When a task is run outside of a container then presumably the environment is applied to the shell executing that tasks command. Do we need to specify where the environment variable is applied for a containerized task? Would the environment variable be applied to the container (for example as part of some docker command) or would the variable be applied to the command shell that will run in the container?

patmagee · 2024-04-23T19:31:40Z

@markjschreiber is the distinction that the environment variable may change depending on whether it is injected into the task shell script vs into the docker image?

markjschreiber · 2024-05-17T15:16:32Z

@markjschreiber is the distinction that the environment variable may change depending on whether it is injected into the task shell script vs into the docker image?

Pretty much. I think it's fine to clarify with something like - if the task is run in a container the env var is "injected" into the container and not applied to the shell on the host that runs the container.

patmagee · 2024-05-22T05:46:22Z

@markjschreiber would saying something like the following work

The environment variable should be evaluated by the engine prior to injecting it into the execution environment. if the task is run in a container the env var is "injected" into the container and not applied to the shell on the host that runs the container.

markjschreiber · 2024-05-22T16:47:02Z

@markjschreiber would saying something like the following work

The environment variable should be evaluated by the engine prior to injecting it into the execution environment. if the task is run in a container the env var is "injected" into the container and not applied to the shell on the host that runs the container.

Perfect!

patmagee added enhancement RFC labels May 13, 2022

patmagee requested review from jdidion, mlin, illusional and rhpvorderman and removed request for jdidion September 21, 2022 16:34

illusional reviewed Sep 21, 2022

View reviewed changes

rhpvorderman reviewed Sep 23, 2022

View reviewed changes

patmagee force-pushed the env branch from ebe4e52 to 4f3f705 Compare December 22, 2022 16:12

patmagee requested review from rhpvorderman, illusional and jdidion December 22, 2022 16:17

jdidion approved these changes Dec 22, 2022

View reviewed changes

patmagee added this to the 1.2 milestone Feb 13, 2023

mlin mentioned this pull request Feb 18, 2023

prototype runtime.autoEnv chanzuckerberg/miniwdl#503

Closed

jdidion assigned patmagee Mar 23, 2023

patmagee changed the base branch from main to wdl-1.2 December 21, 2023 18:09

Rebased PR to wdl verison 1.2

a2cecdb

patmagee force-pushed the env branch from 8748908 to a2cecdb Compare December 21, 2023 19:23

Added comment on where evaluation of env vars should happen

f5901e7

markjschreiber approved these changes May 22, 2024

View reviewed changes

patmagee merged commit 32cb760 into openwdl:wdl-1.2 May 22, 2024

Support Defining Environment Variables Within a Task #504

Support Defining Environment Variables Within a Task #504

Conversation

patmagee commented May 13, 2022

Declare Environment Variables within a Task

Checklist

mlin commented May 14, 2022 • edited Loading

mlin commented May 14, 2022 • edited Loading

patmagee commented May 14, 2022 • edited Loading

Option 1: Keep the block in a task but require values to be set

Option 2: Use a declared type

Option 3: add some sort of qualifier to the type

mlin commented May 15, 2022

patmagee commented May 15, 2022

patmagee commented May 15, 2022

rhpvorderman commented May 16, 2022

mlin commented May 16, 2022

rhpvorderman commented May 16, 2022

cjllanwarne commented May 16, 2022

rhpvorderman commented May 17, 2022

rhpvorderman commented May 27, 2022

illusional commented May 27, 2022

patmagee commented Jun 14, 2022

mlin commented Jun 17, 2022

patmagee commented Jun 17, 2022

patmagee commented Aug 10, 2022

patmagee commented Sep 21, 2022

illusional left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

rhpvorderman commented Sep 23, 2022 • edited Loading

rhpvorderman commented Sep 23, 2022

patmagee commented Sep 23, 2022

mlin commented Nov 25, 2022

patmagee commented Dec 22, 2022

patmagee commented Dec 22, 2022

jdidion left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

patmagee commented Jan 9, 2023

patmagee commented Feb 13, 2023

mlin commented Mar 26, 2023

mlin commented Mar 26, 2023

patmagee commented Mar 31, 2023

jdidion commented Dec 17, 2023

patmagee commented Dec 21, 2023

markjschreiber commented Jan 24, 2024

markjschreiber commented Jan 24, 2024

patmagee commented Apr 23, 2024

markjschreiber commented May 17, 2024

patmagee commented May 22, 2024 • edited Loading

markjschreiber commented May 22, 2024

mlin commented May 14, 2022 •

edited

Loading

mlin commented May 14, 2022 •

edited

Loading

patmagee commented May 14, 2022 •

edited

Loading

rhpvorderman commented Sep 23, 2022 •

edited

Loading

patmagee commented May 22, 2024 •

edited

Loading