Workflows and Package Management

Reproducibility and package management techniques: workflow languages (CWL and Snakemake) and package managers (Conda). This course introduces approaches to package management and shows how to create reproducible workflows, or pipelines.

Competencies

This session seeks to impart the following competencies:

  1. Knowledge and skills: Bioinformatics tools and their usage.
  2. Knowledge and skills: Command-line and scripting-based computing skills appropriate to the discipline.

Learning Outcomes

By the end of this session, and the projects that follow, the learner should be able to:

  1. Select the most suitable workflow manager and package manager for the task at hand
  2. Implement a genomic pipeline in at least one workflow manager
  3. Set up a reproducible analysis environment
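
As a rough illustration of the third outcome, the sketch below renders a conda-style environment file with exact version pins, so that the same tool versions can be recreated on another machine. The helper `render_environment`, the environment name, and the pinned versions are all hypothetical and chosen only for illustration; this is not taken from the course materials.

```python
def render_environment(name, channels, packages):
    """Render a conda-style environment file with exact version pins."""
    lines = [f"name: {name}", "channels:"]
    lines += [f"  - {channel}" for channel in channels]
    lines.append("dependencies:")
    # Sort so the same inputs always produce byte-identical text.
    lines += [f"  - {pkg}={version}" for pkg, version in sorted(packages.items())]
    return "\n".join(lines) + "\n"


# Illustrative values only: the environment name and version pins are assumptions.
spec = render_environment(
    name="genomics-demo",
    channels=["conda-forge", "bioconda"],
    packages={"snakemake-minimal": "7.32.4", "samtools": "1.19", "bwa": "0.7.17"},
)
print(spec)
```

If the printed text is saved as `environment.yml`, an environment could then be created from it with `conda env create -f environment.yml`. Pinning exact versions is what lets collaborators rebuild the same environment later.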

Outline

  • Introduce the high-level concept of workflows and high-throughput data analysis
  • Hands-on activities for setting up the packages
  • Introduce package management and how conda can make workflows more reproducible
  • Introduce the theory of workflows, with emphasis on one language (for example, Snakemake)
  • Hands-on activities of developing workflows
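
To make the workflow theory in the outline more concrete, here is a minimal sketch of the core idea behind workflow managers: each rule declares the inputs its output depends on, the rules form a dependency graph, and only missing or out-of-date outputs are rebuilt. The file names, the `RULES` table, and both helper functions are hypothetical; real tools such as Snakemake do considerably more than this.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical rules: each output file maps to the input files it depends on.
RULES = {
    "reads.trimmed.fastq": ["reads.fastq"],
    "aligned.bam": ["reads.trimmed.fastq", "reference.fasta"],
    "variants.vcf": ["aligned.bam", "reference.fasta"],
}


def execution_order(rules, target):
    """Return the rule outputs needed to build `target`, dependencies first."""
    graph = {}
    stack = [target]
    while stack:
        output = stack.pop()
        if output in graph or output not in rules:
            continue  # already visited, or a raw input that no rule produces
        graph[output] = [dep for dep in rules[output] if dep in rules]
        stack.extend(graph[output])
    return list(TopologicalSorter(graph).static_order())


def stale_outputs(order, rules, mtimes):
    """Return outputs that are missing, older than an input, or downstream of a stale output."""
    stale = []
    for output in order:
        inputs = rules[output]
        missing = output not in mtimes
        outdated = not missing and any(mtimes.get(i, 0) > mtimes[output] for i in inputs)
        upstream_stale = any(i in stale for i in inputs)
        if missing or outdated or upstream_stale:
            stale.append(output)
    return stale


order = execution_order(RULES, "variants.vcf")
# Made-up modification times: reads.fastq changed after everything was built.
mtimes = {
    "reads.fastq": 5,
    "reference.fasta": 1,
    "reads.trimmed.fastq": 2,
    "aligned.bam": 3,
    "variants.vcf": 4,
}
print(order)
print(stale_outputs(order, RULES, mtimes))
```

Declaring inputs and outputs instead of writing a fixed script is the design choice that allows a workflow manager to skip work that is already up to date and to rerun only the affected steps when an input changes.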

Slides

  1. Using Bioconda to streamline software installation for bioinformatics
  2. Workflows and Pipelines
  3. Package management | resource management | reproducibility

Tutorials

  1. Package Management with conda
  2. Workflow with Snakemake provides a quick introduction; we will then dive deeper using the Reproducible Research tutorial. See also this tutorial.
  3. Nextflow and Singularity tutorial
  4. Docker Tutorial
  5. Common Workflow Language tutorial. We will not cover this in the session, but we provide links to useful tutorials so you can explore further. Also see this and this (https://andrewjesaitis.com/2017/02/common-workflow-language---a-tutorial-on-making-bioinformatics-repeatable/) walkthroughs.
  6. Resource management on HPC

Reading resources

Some resources and articles you can make use of in this course:

  1. Awesome pipelines: A curated list of pipelines and workflow languages

  2. Existing Workflow systems: Computational Data Analysis Workflow Systems

  3. Papers: