RFC: The WDL Extended Library #488
Replies: 5 comments 19 replies
-
As a member of "team simplicity", I think the cognitive load of an extended official library is much higher than that of adding a few functions. As for the functions, I see that I indeed mentioned
-
I disagree on the cognitive load of an extended library. Since it is completely optional, a user can choose to ignore it and simply code their own tasks if they wish to do so.
-
If the tasks are executed by the server, this could have a negative impact on scalability and separation of concerns. Ideally the server/head process should only be concerned with interpreting a workflow and distributing work to task executors. Executing UDFs on the server or head process is going to place unpredictable requirements on the size and capabilities of that node, which, unlike a task, cannot resize or provision resources on demand. This would also reduce portability.
-
I think this is a nice compromise on the tension over how to optimize common tasks. Provided the standard library sticks to, well, common tasks, this would provide a set of WDL tasks that stay static enough that one implementation could choose to do something clever with them, while other implementations could treat them as ordinary tasks. I picture this as an evolution of hints.
-
I feel this is a slippery slope toward what BioWDL is. I don't know what the user experience for referencing tasks from a repo is, but maybe that could be documented explicitly, with BioWDL then becoming an official "bioinformatics" channel of common tools. This would then be an explicit library of generic common operations, and there could be one for other domains too.
-
Problem
There has always been (and probably will always be) a tension in the WDL community between those who emphasize simplicity and those who emphasize ease of use. This tension is most commonly observed in discussions around the standard library: team simplicity prefers to avoid adding functions that aren't absolutely necessary while team usability wants to add functions that implement common operations, even if those could otherwise be implemented as tasks. There have been many proposals that attempt to address this, such as various approaches to user-defined functions, but all of these have met with at least some opposition from among the governance team.
Arguments for simplicity
Arguments for usability
values function that would take a Map and return the values of the Map as an Array. To implement this as a task, one would need to:
Map to JSON.
Proposal
I believe that both sides can be satisfied, and the benefits of both approaches can largely be achieved, by providing an "official" library of WDL tasks independently of the WDL specification.
In practice this would be a GitHub repository (perhaps this one, but probably better if it's separate) that contains a set of WDL tasks. I propose that all tasks (and any related structs) be contained in a single WDL file, but there could also be an argument made for each task being self-contained in its own WDL file.
Tasks would be added to the repository by the community. There would be development practices and standards that would have to be followed.
Ideally, the tasks will be importable using a short, easy to remember URL. I propose that we alias https://openwdl.org/<version> to the repository, so a user could import "https://openwdl.org/1.1/lib.wdl" as lib.
Importantly, once a task is added to the official library, it cannot be renamed or removed (though it can be deprecated). This enables a runtime to optionally provide a built-in version of the task as an optimization. For example, when the runtime sees the import above, it can choose to replace any call to a task in the lib namespace with a call to a built-in function.
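To make the substitution idea concrete, here is a rough Python sketch of how an engine might decide whether to short-circuit a call. Everything in it is an assumption for illustration: the names OFFICIAL_LIB_PREFIX, BUILTINS, resolve_call and run_call are hypothetical, the imports mapping (namespace alias to import URI) stands in for whatever the engine's parser produces, and the values entry simply reuses the Map-to-Array example from the problem statement rather than describing a task the library is known to contain.

```python
from typing import Any, Callable, Dict, Optional

# Assumed URL prefix, following the alias proposed in this RFC.
OFFICIAL_LIB_PREFIX = "https://openwdl.org/"

# Hypothetical registry of native implementations, keyed by library task name.
# Because library tasks can never be renamed or removed, these keys stay valid.
BUILTINS: Dict[str, Callable[..., Any]] = {
    # Illustrative only: returns the values of a map as a list.
    "values": lambda m: list(m.values()),
}


def resolve_call(
    imports: Dict[str, str], namespace: str, task_name: str
) -> Optional[Callable[..., Any]]:
    """Return a built-in for this call, or None to run it as an ordinary task."""
    uri = imports.get(namespace)
    # Key on the import URI rather than the alias, since the alias is user-chosen.
    if uri is None or not uri.startswith(OFFICIAL_LIB_PREFIX):
        return None
    return BUILTINS.get(task_name)


def run_call(
    imports: Dict[str, str],
    namespace: str,
    task_name: str,
    inputs: Dict[str, Any],
    run_task: Callable[[str, str, Dict[str, Any]], Any],
) -> Any:
    """Use the built-in when one exists; otherwise fall back to normal task execution."""
    builtin = resolve_call(imports, namespace, task_name)
    if builtin is not None:
        return builtin(**inputs)
    return run_task(namespace, task_name, inputs)
```

An engine that has no built-ins would just leave the registry empty and every call would fall through to run_task, which is what keeps the optimization optional. Matching on the import URI instead of the lib alias also means a user's own namespace that happens to be called lib is never substituted.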