Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Request: Add Optional() function for Objects and Maps. #174

Closed
rhpvorderman opened this issue Jan 2, 2018 · 8 comments
Closed

Request: Add Optional() function for Objects and Maps. #174

rhpvorderman opened this issue Jan 2, 2018 · 8 comments

Comments

@rhpvorderman
Copy link
Contributor

rhpvorderman commented Jan 2, 2018

Hi there.

I would like to request an Optional() function for map-like objects. If a key is not present in an object, normally a keyError returns. Optional(object.key) will return None when the key is not present. I could not find such a thing in the current WDL implementation (but I may have overlooked).

That would be useful in these types of situations:

task exampleTask {
  String requiredVariable
  String? optionalString = "sensible default here"
  Int? optionalInt = 3
  Boolean? optionalBool = false
  command {
  # insertcommandhere
  }
}
workflow exampleWf {
  Array[Object] objectList
  scatter(exampleObject in objectList) {
    call exampleTask {
      input: 
        requiredVariable=exampleObject.requiredVariable
        optionalString=Optional(exampleObject.optionalString)
        optionalInt=Optional(exampleObject.optionalInt)
        optionalBool=Optional(exampleObject.optionalBool)
     }
  }
}

Originally (without optional) your inputs.json would look like this:

{
  "exampleWf.objectList":
  [
    {
      "requiredVariable": "Sample1",
      "optionalString": "someOption",
      "optionalInt": 3,
      "optionalBool": false
    },
    {
      "requiredVariable": "Sample2",
      "optionalString":  "sensible default here",
      "optionalInt": 5,
      "optionalBool": false
    }
  ]
}

But with Optional() it can look like this:

{
  "exampleWf.objectList":
  [
    {
      "requiredVariable": "Sample1",
      "optionalString": "someOption"
    },
    {
      "requiredVariable": "Sample2",
      "optionalInt": 5
    }
  ]
}

When there are 10 optional values, this save a lot of typing and unnecessary duplication in the JSON. Why specify all these values if they have sensible defaults? There are quite some command line tools with many optional parameters that this feature seems very useful to me. Real word example:

task centrifugeDownload {
    String centrifugeOutput
    Array[String] domain
    String? seqTaxMap
    String? database = "refseq"
    String? centrifugeDownloadExecutable = "centrifuge-download"
    String? assemblyLevel = "Complete Genome"
    String? refseqCategory =  "any"
    Array[String]? taxIds = ["any"]
    Boolean? filterUnplaced = false
    Boolean? maskLowComplexRegions = false
    Boolean? downloadRnaSeqs = false
    Boolean? modifyHeader = false
    Boolean? downloadGiMap = false

    command {
        ${centrifugeDownloadExecutable} \
        -o ${centrifugeOutput} \
        -d ${sep=',' domain} \
        -a "${assemblyLevel}" \
        -c ${refseqCategory} \
        -t ${sep=',' taxIds} \
        ${true='-r' false='' downloadRnaSeqs} \
        ${true='-u' false='' filterUnplaced} \
        ${true='-m' false='' maskLowComplexRegions} \
        ${true='-l' false='' modifyHeader} \
        ${true='-g' false='' downloadGiMap} \
        ${database} \
        ${">> " + seqTaxMap}
    }
 }

I happen to have some experience with programming in Scala, so if you give me some pointers, I could possibly implement this myself.

@patmagee
Copy link
Member

By convention map values are optional, as there are no restrictions on what the key value can be. The key error is really an access error which is what you would expect to receive in the event of a missing value.

What I think would make a better approach would be a single engine function or modifying an existing one, to test for a key in a map.

task {
  Map[String,Int] my_values
  Int code = if contains_key(my_values,"key") then my_values["key"] else 0
  Int? code_2 = if contains_key(my_values,"key_2") then my_values["key"] else null
}

@geoffjentry @cjllanwarne has there ever been any talk of including a check like this?

@rhpvorderman
Copy link
Contributor Author

I just discovered that when using a Map[String,String] missing keys evaluate to null by default. That solves my issue!

@patmagee
Copy link
Member

@rhpvorderman is this still an issue?

@rhpvorderman
Copy link
Contributor Author

@patmagee
Sorry, for my late reply.
Solved the issue partly by using Map[String, String?]. I was able to get working pipelines this way.
But still I had to set null values in the JSON.

I think this issue will be solved as soon as Structs are implemented. Which allow such things.
#160

@rhpvorderman
Copy link
Contributor Author

Structs are implemented now, so away we go.

@freeseek
Copy link
Contributor

The following hack allows to solve the issue without having to set null values in the JSON:

scatter (key in keys(map)) { Boolean? is_key = if key == "xxx" then true else None }
Boolean is_key_in_map = select_first(flatten([is_key, [false]]))
if (is_key_in_map) { ... }

It is a little elaborate, but it basically works as a contains_key(map, "xxx") function

@freeseek
Copy link
Contributor

The following works a bit better:

scatter (key in keys(map)) { Boolean? is_key = if key == "xxx" then true else None }
if (length(select_all(is_key))>0) { ... }

@freeseek
Copy link
Contributor

There is actually a better way to imitate an is_in(X, Array[X]) function without having to use a scatter:

version development
workflow main {
  input {
    Array[String] array = ["a", "b", "c"]
    String element = "d"
  }
  output {
    Boolean is_in = length(collect_by_key(zip(flatten([array,[element]]),range(length(array)+1)))[element])>1
  }
}

To imitate a map_contains_key(X, Map[X,Y]) function, you can use the function keys() to extract an Array from a Map.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants