WDL (Workflow Description Language) is a standardized language for bioinformatics workflows, designed to be portable and reproducible.
You can use the Terra web service to run WDL workflows on GCP. You can also run them locally, in the cloud, or on an HPC system using Cromwell or miniwdl.
These instructions assume you already have an account on Terra and a billing project set up.
- Go to the Dockstore entry of the workflow (as an example, here's myco's Dockstore entry)
- On the right-hand side of the Dockstore entry, select Terra under the heading "Launch with"
- Select which Terra workspace you wish to import into, or create a new one -- you'll then be taken to Terra
- In Terra, go to the workflows tab (it's on the top below the bright green header bar), and select your workflow to run it
If you use Dockstore, and the author of the workflow has set up the repo to automatically sync with Dockstore, the workflow will be automatically updated on Terra when the author pushes a change to their workflow on GitHub/Dockstore. Additionally, you can select any branches or tagged versions of the workflow within Terra's workflow setup UI without needing to re-import the workflow.
- On Terra, go to the workflows tab (it's on the top below the bright green header bar), and select "Find a workflow"
- In the popup, under "Find additional workflows," select "Broad Methods Repository"
- Press the blue button in the top right corner that says "Create new method..."
- Fill out the namespace and workflow name (they do not need to match anything in your workspace), then copy-paste the WDL into the large text box
- Click the blue upload button
- Select which Terra workspace you wish to import into, or create a new one -- you'll then be taken to Terra
- In Terra, go to the workflows tab, and select your workflow to run it
Using the Broad Methods Repository (BMR) will not transfer over git versioning, nor will your copy of the workflow keep up-to-date automatically. If you want a new version of your workflow, you will need to copy-paste it into the BMR again.
You will need:
- miniwdl or the Dockstore CLI or Cromwell
- Python 3 if you're using miniwdl, or Java 11 (OpenJDK recommended) if using Dockstore CLI/Cromwell
- Docker Engine or Docker Desktop
- if you are on a Linux machine, it is advised not to use Docker Desktop -- use Docker Engine instead
- The WDL file (optional if using the Dockstore CLI and the WDL is on Dockstore)
- A JSON file that describes your inputs -- if any of them are files, use relative (to the workdir you will be running from) paths
- (Macs only) An overridden TMPDIR environment variable (e.g. export TMPDIR=/whatever) to prevent Docker shenanigans
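The last few requirements can be sketched as a short setup script. This is only an illustration under assumptions: the directory wdl_run_example, the file inputs/sample.txt, the input name your_workflow.some_file, and the $HOME/wdl_tmp location are all made up, and the JSON keys must match whatever inputs your own WDL declares.

```shell
# Hypothetical working directory; you would run your workflow from inside it.
workdir="wdl_run_example"
mkdir -p "$workdir/inputs"

# Stand-in input file; in practice this is your real data.
printf 'ACGT\n' > "$workdir/inputs/sample.txt"

# Inputs JSON, keyed as <workflow name>.<input name>. The file path is
# relative to the workdir, not absolute. "your_workflow.some_file" is a
# made-up input name.
cat > "$workdir/your_inputs.json" <<'EOF'
{
  "your_workflow.some_file": "inputs/sample.txt"
}
EOF

# (Macs only) Override TMPDIR. $HOME/wdl_tmp is an arbitrary choice.
if [ "$(uname -s)" = "Darwin" ]; then
  mkdir -p "$HOME/wdl_tmp"
  export TMPDIR="$HOME/wdl_tmp"
fi
```

After this, you would cd into the workdir before launching the workflow so the relative path in the JSON resolves.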
miniwdl run your_workflow.wdl -i your_inputs.json
miniwdl, unlike Cromwell, does not copy input files by default. If your WDL modifies input files, for example by trying to mv them, you must use the --copy-input-files option, or else you will get "device or resource busy" errors.
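For reference, here is a minimal sketch of the same miniwdl command with that option added. The file names are the placeholders used above, and the guard around the call is only there so the snippet does nothing harmful when miniwdl or the files are missing.

```shell
wdl="your_workflow.wdl"
inputs="your_inputs.json"

# The full command, kept in a variable so it can be printed for inspection.
miniwdl_cmd="miniwdl run $wdl -i $inputs --copy-input-files"

if command -v miniwdl >/dev/null 2>&1 && [ -f "$wdl" ] && [ -f "$inputs" ]; then
  # Inputs are copied into the run directory, so tasks may mv or edit them.
  miniwdl run "$wdl" -i "$inputs" --copy-input-files
else
  echo "would run: $miniwdl_cmd"
fi
```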
java -jar /Applications/cromwell-xx.jar run your_workflow.wdl -i your_inputs.json
Cromwell, unlike miniwdl, does not handle resources on local backends very well by default. Cromwell's default behavior is to attempt to run multiple tasks, or multiple instances of scattered tasks, at the same time. This tends to cause tasks to get sigkilled, or the Docker daemon to stop responding entirely. If you are running a WDL that uses scattered tasks, it is highly recommended to follow these instructions to make Cromwell/the Dockstore CLI only do one thing at a time.
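One way to limit Cromwell to a single job at a time is a small config file that sets concurrent-job-limit on the Local backend, passed to Java with -Dconfig.file. The sketch below assumes that approach. The file name cromwell_local.conf is made up, and the instructions referenced above may do it differently.

```shell
# Write a minimal Cromwell config that caps the Local backend at one job.
cat > cromwell_local.conf <<'EOF'
include required(classpath("application"))

backend {
  providers {
    Local {
      config {
        concurrent-job-limit = 1
      }
    }
  }
}
EOF

# cromwell-xx.jar keeps the version placeholder from the command above.
cromwell_jar="/Applications/cromwell-xx.jar"

if [ -f "$cromwell_jar" ] && [ -f your_workflow.wdl ]; then
  java -Dconfig.file=cromwell_local.conf -jar "$cromwell_jar" \
    run your_workflow.wdl -i your_inputs.json
else
  echo "would run: java -Dconfig.file=cromwell_local.conf -jar $cromwell_jar run your_workflow.wdl -i your_inputs.json"
fi
```

With a limit of 1, scattered tasks run one after another, which is slower but avoids overloading Docker.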
The Dockstore CLI wraps Cromwell, so most Cromwell caveats and instructions apply to it too. However, the Dockstore CLI does add the ability to localize input files from a Google bucket using gs:// URIs, and can run WDLs directly from Dockstore.
Running a local WDL:
dockstore workflow launch --local-entry your_workflow.wdl --json your_inputs.json
Running a WDL from Dockstore:
dockstore workflow launch --entry full_dockstore_entry_name/your_workflow.wdl --json your_inputs.json
Dockstore entry names are sometimes GitHub URLs, for example:
dockstore workflow launch --entry github.com/aofarrel/myco/myco_sra:2.0.1 --json your_inputs.json
You can use --wdl-output-target to put your workflow outputs into a remote path, such as an S3 bucket.
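As a rough sketch, the option can be appended to either launch command. The bucket and prefix below are hypothetical, and writing to S3 may also need credentials and extra Dockstore CLI setup that is not covered here.

```shell
# Hypothetical bucket and prefix; replace with a location you can write to.
output_target="s3://your-bucket/your-prefix"

launch_to_remote() {
  dockstore workflow launch --local-entry your_workflow.wdl \
    --json your_inputs.json --wdl-output-target "$output_target"
}

# Only launch when the Dockstore CLI and the workflow file are present.
if command -v dockstore >/dev/null 2>&1 && [ -f your_workflow.wdl ]; then
  launch_to_remote
else
  echo "dockstore CLI or your_workflow.wdl not found; skipping launch to $output_target"
fi
```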
Many (not all) institutes do not allow Docker to run on their HPC systems for security reasons. Strictly speaking, there are ways to get around this limitation, but if you can run your WDL on a system that supports Docker, you have a greater chance of things working correctly.