Import and run CWL workflows on DNAnexus
THIS IS AN ALPHA-PHASE PROJECT. Please use at your own risk or contact DNAnexus if you are interested.
We have tested this implementation on a few practical workflows of varying complexity and are working towards more complete support of the specification. More tests, documentation, and improvements to the user experience to come shortly.
The motivation behind dx-cwl is to compile a CWL workflow definition to a DNAnexus workflow. This approach lets the user execute a CWL workflow on DNAnexus and take advantage of the platform's features, including secure execution across multiple regions and clouds. We use a reference CWL implementation and its data structures where possible to adhere as closely as we can to the standard. CWL types are mapped directly to DNAnexus types when possible; when they cannot be, the structures are stored as general JSON data within the platform.
Coming soon.
- Ensure you have a recent version of dx-toolkit
- Install cwltool
- Install PyYAML
- Clone this repository and run ./get-cwltool.sh to obtain the appropriate cwltool for DNAnexus applications
- Create an API token and select the ID of the project in which you would like to compile the workflow
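The prerequisites above can be checked before compiling. The following is a minimal sketch, not part of dx-cwl: the helper name missing_prerequisites is hypothetical, and it assumes that dx-toolkit provides a dx executable on the PATH and that cwltool and PyYAML are importable as the Python modules cwltool and yaml.

```python
import importlib.util
import shutil


def missing_prerequisites(commands=("dx",), modules=("cwltool", "yaml")):
    """Return human-readable names of prerequisites that cannot be found.

    Hypothetical helper for illustration; the default names are assumptions
    about how dx-toolkit, cwltool, and PyYAML are exposed once installed.
    """
    missing = []
    for command in commands:
        # shutil.which returns None when the executable is not on PATH.
        if shutil.which(command) is None:
            missing.append(f"command: {command}")
    for module in modules:
        # find_spec returns None when a top-level module is not importable.
        if importlib.util.find_spec(module) is None:
            missing.append(f"module: {module}")
    return missing


if __name__ == "__main__":
    problems = missing_prerequisites()
    print("All prerequisites found." if not problems else "\n".join(problems))
```

An empty list means every listed command and module was found; it does not verify versions or that the API token and project ID are valid.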
To compile a workflow, point dx-cwl at a workflow on your local machine and provide your authentication token and project name.
The example below compiles a test CWL definition of a bcbio workflow.
python dx-cwl compile-workflow examples/test_bcbio_cwl/somatic/somatic-workflow/main-somatic.cwl --token $TOKEN --project $PROJECT
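When scripting the compile step, it can help to fail early if the credentials are not set. The sketch below only assembles the same command line shown above; compile_command is a hypothetical helper, and the TOKEN and PROJECT values used here are placeholders rather than real credentials.

```python
import os
import shlex


def compile_command(cwl_path, env=os.environ):
    """Build the argv for `dx-cwl compile-workflow` from TOKEN and PROJECT.

    Hypothetical helper for illustration; it mirrors the command above and
    adds no flags beyond --token and --project.
    """
    missing = [name for name in ("TOKEN", "PROJECT") if not env.get(name)]
    if missing:
        raise ValueError("unset environment variables: " + ", ".join(missing))
    return [
        "python", "dx-cwl", "compile-workflow", cwl_path,
        "--token", env["TOKEN"],
        "--project", env["PROJECT"],
    ]


# Placeholder values for illustration only; not real credentials.
example_env = {"TOKEN": "my-token", "PROJECT": "my-project"}
argv = compile_command(
    "examples/test_bcbio_cwl/somatic/somatic-workflow/main-somatic.cwl",
    env=example_env,
)
# Print the command in a form that can be pasted into a shell.
print(shlex.join(argv))
```

To actually run the compilation, the list can be passed to subprocess.run(argv, check=True) from the root of the cloned repository, with the real token and project taken from the environment.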
To execute a workflow much as you would with the reference implementation, upload the data files and the CWL input file to the platform, then run this command from your local installation of dx-cwl.
python dx-cwl run-workflow main-somatic/main-somatic test_bcbio_cwl/somatic/somatic-workflow/main-somatic-samples.json
Here main-somatic is the workflow that was compiled to DNAnexus; it is contained in the main-somatic/ directory on the platform, along with the other applications and resources the workflow requires. test_bcbio_cwl/ is a direct copy, on the DNAnexus platform, of the files in that repository.
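The upload-then-run sequence can be sketched as a short command plan. This is an illustration under stated assumptions, not a documented dx-cwl feature: run_plan is a hypothetical helper, and the upload step assumes that dx upload from dx-toolkit accepts -r for recursive upload and --destination for the target folder (check dx upload --help for your version). The local directory path is taken from the compile example above.

```python
import shlex


def run_plan(local_dir, workflow, input_json, destination="/"):
    """Return the commands to upload inputs and then launch the workflow.

    Hypothetical helper for illustration; it only builds argv lists and
    does not contact the platform.
    """
    return [
        # Assumption: `dx upload -r` copies the directory tree to the project.
        ["dx", "upload", "-r", local_dir, "--destination", destination],
        # Same run-workflow invocation as shown above.
        ["python", "dx-cwl", "run-workflow", workflow, input_json],
    ]


plan = run_plan(
    "examples/test_bcbio_cwl",
    "main-somatic/main-somatic",
    "test_bcbio_cwl/somatic/somatic-workflow/main-somatic-samples.json",
)
for command in plan:
    # Dry run: print each command instead of executing it.
    print(shlex.join(command))
```

Printing the plan first makes it easy to confirm that the platform paths match before anything is uploaded; each entry can then be executed with subprocess.run(command, check=True).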
Note that the compiled workflow can be used directly as a typical workflow on DNAnexus as well.
Please see the ENCODE example for a more detailed walk-through.
- Influenced by dxWDL
- Brad Chapman, Bioinformatics Core at Harvard School of Public Health